1 Introduction

Over the past two decades, owing to their remarkable classification capability, the support vector machine (SVM) [1] and its variants [2,3,4] have been used extensively in classification applications. SVM has two main learning features: (1) the training data are first mapped into a higher-dimensional feature space through a nonlinear feature mapping function \(\phi \left( x \right) \), and (2) a standard optimization method is then used to find the solution that maximizes the separating margin between the two classes in this feature space while minimizing the training errors. With the introduction of the epsilon-insensitive loss function, the support vector method has also been extended to regression problems [5].

As the training of an SVM involves a quadratic programming problem, the computational complexity of SVM training algorithms is usually high, at least quadratic in the number of training examples. It is therefore difficult to handle large problems with a single traditional SVM [6]; instead, mixtures of SVMs can be used in large applications [7].

Information and knowledge have great potential value, yet the speed at which information is updated is staggering, and the generalization ability of traditional learning algorithms is no longer adequate: such algorithms are prone to local minima, and as the number of training samples grows they face the curse of dimensionality, or training becomes infeasible due to memory limitations. Incremental learning and data classification techniques have therefore gradually become key technologies of computational intelligence [8].

Compared with ordinary data classification techniques, incremental learning classification techniques have significant advantages, manifested mainly in two aspects: (1) they eliminate the need to preserve historical data, thereby reducing the storage space occupied; (2) because each new round of training makes full use of the historical training results, the time of subsequent training is significantly reduced [9]. Some problems remain, however, since the amount of information provided by new samples differs from that provided by historical samples [10]. Most conventional incremental learning algorithms are implemented with decision tree or neural network algorithms, and their shortcomings appear to varying degrees: (1) because they lack control of the expected risk over the entire dataset, they easily overfit the training data; (2) because they lack a selective forgetting and elimination mechanism for the training data, the classification accuracy is affected. SVM, by contrast, is based on structural risk minimization theory and is one of the few learning algorithms that can successfully solve the first problem of traditional learning techniques. Its advantage is that its generalization performance does not depend on all of the training data: training yields a support vector set that represents the entire dataset, and this support vector set accounts for only a small part of the training dataset while carrying the important classification boundary information [11]. SVM can therefore discard useless samples to reduce the training dataset, reducing the storage space while reusing historical training results in new learning. Extending SVM to incremental learning is thus an effective approach, especially for big datasets.

Under normal circumstances, incremental learning algorithms are mainly used in the following situations: (1) the samples are generated in real time, such as stock trading data and time series data; (2) the samples arrive in blocks, such as scientific data; (3) the dataset is too large to be stored in a personal computer's memory, such as web logging data.

Incremental learning raises different problems for different types of learning, and the following two settings must be addressed: (1) samples are added one at a time, which is called online incremental learning; (2) sample datasets of appropriate size are added as increments [12]. The literature [3] first proposed incremental learning for support vector machines. It gave an incremental learning strategy but did not improve the solution algorithm, still using the standard support vector machine. It gave only an approximate incremental procedure: at each learning step, a small number of training samples that a general quadratic programming algorithm can handle are selected, only the support vectors are retained, and the other samples are discarded; new training samples are then processed in the same way until all samples have been trained [13]. Such approximate incremental learning algorithms lose part of the information when they discard some non-support vectors, although for large training datasets they improve on some shortcomings of traditional algorithms. The literature [4] gave an online incremental learning algorithm whose resulting solution is exact. The literature [5] gave a local online incremental learning algorithm based on radial basis functions: when new samples enter the training dataset, the entire training dataset need not be reconsidered, because the RBF kernel function is localized, saving computational time in the support vector machine procedure [14].

2 Variable support vector machine

The variable support vector machine (VSVM) is a fast and simple classification algorithm based on an improved support vector machine. It is obtained from the standard linear SVM quadratic programming problem by a simple deformation, yielding a convex minimization problem with non-negativity constraints in m-dimensional space, where m is the number of samples in the n-dimensional input data space. The optimality conditions of this non-negatively constrained minimization problem are converted into a symmetric positive definite complementarity problem, which is then solved by a simple, linearly convergent iterative algorithm. Solving VSVM requires an m-order matrix inversion, which the Sherman–Morrison–Woodbury (SMW) identity reduces to a smaller \(\left( {n+1} \right) \)-order matrix inversion [15] \((n\ll m)\). The VSVM algorithm can therefore process big datasets containing millions of sample points; for nonlinear VSVM, however, the matrix inversion cannot be reduced by the SMW identity, so big-scale problems are difficult to handle.

VSVM is a variant of the standard linear SVM classifier obtained by two simple changes: the interval between the parallel bounding planes is maximized with respect to \(\left( {w,b} \right) \), i.e., in \((n+1)\)-dimensional rather than n-dimensional space, and the 1-norm of the error \(\xi \) is replaced by the 2-norm, which eliminates the need for non-negativity constraints. The deformed classification problem is strongly convex, and its solution is obtained with the SVM classification method [16]. The deformed linear SVM classification problem is as follows:

$$\begin{aligned}&\min \frac{1}{2}\left( {\left\| w \right\| ^{2}+b^{2}} \right) +\frac{C}{2}\xi ^{T}\xi \nonumber \\&s.t. \qquad y_i \left( {\left( {w\cdot x_i } \right) +b} \right) +\xi _i \ge 1 ,i=1,...,m \end{aligned}$$
(1)

The Lagrangian function of problem (1) is shown in formula (2).

$$\begin{aligned} L= & {} \frac{1}{2}\left( {\left\| w \right\| ^{2}+b^{2}} \right) +\frac{C}{2}\xi ^{T}\xi \nonumber \\&-\sum _{i=1}^m {\alpha _i \left( {y_i \left( {\left( {w\cdot x_i } \right) +b} \right) +\xi _i -1} \right) } \end{aligned}$$
(2)

In the VSVM algorithm, the iterative formula is given in formula (3).

$$\begin{aligned}&\alpha ^{i+1}=Q^{-1}\left( {e+\left( {\left( {Q\alpha ^{i}-e} \right) -\lambda \alpha ^{i}} \right) _+ } \right) , \nonumber \\&\quad i=0,1,...,\lambda >0 \end{aligned}$$
(3)

When the condition \(0<\lambda <\frac{2}{C}\) is satisfied, the VSVM algorithm has global linear convergence for an arbitrary initial point. Using the SMW identity, the m-order inversion of the matrix Q is reduced to an \((n+1)\)-order \((n\ll m)\) matrix inversion, which makes the problem tractable and reduces the computation for big datasets. The SMW identity is as follows:

$$\begin{aligned} \left( {\frac{I}{v}+AA^{T}} \right) ^{-1}=v\left( {I-A\left( {\frac{I}{v}+A^{T}A} \right) ^{-1}A^{T}} \right) \end{aligned}$$

where \(v>0\) and A is an arbitrary \(m\times n\) matrix.
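As a sanity check, the identity can be verified numerically. The following NumPy sketch compares the two sides; the sizes m, n and the value v are arbitrary illustrative choices, not values from the paper:

```python
import numpy as np

# Numerical check of the SMW identity:
# (I/v + A A^T)^{-1} = v (I - A (I/v + A^T A)^{-1} A^T).
# The left side inverts an m x m matrix, the right side only an n x n one,
# which is the point of the identity when n << m.
rng = np.random.default_rng(0)
m, n, v = 500, 5, 2.0
A = rng.standard_normal((m, n))

lhs = np.linalg.inv(np.eye(m) / v + A @ A.T)            # m-order inverse
rhs = v * (np.eye(m) - A @ np.linalg.inv(np.eye(n) / v + A.T @ A) @ A.T)

print(np.allclose(lhs, rhs))  # True up to floating-point error
```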

For the nonlinear case, the kernel function is \(K\left( {x,y} \right) =\phi \left( x \right) ^{T}\phi \left( y \right) \), and the nonlinear classification function is given in formula (4).

$$\begin{aligned} f\left( x \right) =\alpha ^{T}DK\left( {A,x} \right) +b \end{aligned}$$
(4)

Let \(G=\left[ {A\;-e} \right] \) and \(Q=\frac{I}{C}+DK\left( {G,G^{T}} \right) D\). The problem \(\mathop {\min }\limits _{0\le \alpha \in R^{m}} \frac{1}{2}\alpha ^{T}Q\alpha -e^{T}\alpha \) then becomes formula (5).

$$\begin{aligned} \mathop {\min }\limits _{0\le \alpha \in R^{m}} \frac{1}{2}\alpha ^{T}\left( {\frac{I}{C}+DK\left( {G,G^{T}} \right) D} \right) \alpha -e^{T}\alpha \end{aligned}$$
(5)

The iteration formula for the nonlinear case has the same form as in the linear case, formula (3).
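To make the iteration concrete, here is a minimal Python sketch of formula (3). The helper names (`build_Q_linear`, `vsvm_iterate`) and the dense `np.linalg.inv` call are our own illustrative choices; the paper's point is precisely that this inverse should instead be obtained through the SMW identity or the update formulas of the following sections.

```python
import numpy as np

def build_Q_linear(A, y, C):
    """Linear case: Q = I/C + H H^T with H = D [A, -e] (formula (1)'s dual)."""
    m = A.shape[0]
    H = y[:, None] * np.hstack([A, -np.ones((m, 1))])
    return np.eye(m) / C + H @ H.T

def vsvm_iterate(Q, Q_inv, lam, alpha0, eps=1e-5, max_iter=10_000):
    """Fixed-point iteration of formula (3):
    alpha^{i+1} = Q^{-1}(e + ((Q alpha^i - e) - lam alpha^i)_+),
    globally linearly convergent when 0 < lam < 2/C."""
    e = np.ones(Q.shape[0])
    alpha = alpha0.copy()
    for _ in range(max_iter):
        alpha_next = Q_inv @ (e + np.maximum(Q @ alpha - e - lam * alpha, 0.0))
        if np.linalg.norm(alpha_next - alpha) <= eps:
            break
        alpha = alpha_next
    return alpha_next
```

In the nonlinear case, Q would instead be built as \(\frac{I}{C}+DK\left( {G,G^{T}} \right) D\) per formula (5), and the same iteration applies.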

3 Online incremental learning algorithm based on VSVM

This section builds on VSVM, which solves the conventional SVM problem quickly and effectively, to give a suitable online incremental learning algorithm. By updating the matrix inverse, the algorithm reuses the results computed on the original dataset before the increment: the inverse matrix can be updated without re-optimizing the new QP problem corresponding to the new training dataset, which greatly reduces the scale of computation.

3.1 Online incremental learning algorithm

The analysis in Sect. 2 shows that a sample point is either a support vector or a non-support vector, so both cases must be considered when a new sample \(x_c \) is added to the training dataset. When a new sample is added to the training dataset T, the incremental learning algorithm must update the SVM classifier. When the new sample \(x_c \) joins the training dataset, we first initialize \(\alpha _c =0\) and judge it with the decision function: if \(x_c \) satisfies \(y_c f\left( {x_c } \right) \ge 1\), then \(\alpha _c =0\) satisfies the KKT conditions, so \(x_c \) is not a support vector and has no influence on the constructed classifier. If \(x_c \) satisfies \(y_c f\left( {x_c } \right) <1\), then \(\alpha _c =0\) clearly violates the KKT conditions; the sample affects the construction of the classifier, and the dual problem must be re-optimized.
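In code, this screening step is a one-line test; a minimal sketch (the function name is ours):

```python
def kkt_satisfied(f_xc, y_c):
    """A new sample initialized with alpha_c = 0 satisfies the KKT
    conditions iff y_c * f(x_c) >= 1; only violators trigger
    re-optimization of the dual problem."""
    return y_c * f_xc >= 1.0
```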

Suppose the training dataset T has m samples and the newly added sample is \(\left( {x_{m+1} ,y_{m+1} } \right) \), with \(\alpha _{m+1} =0\). After adding the new sample \(\left( {x_{m+1} ,y_{m+1} } \right) \), the corresponding dual problem becomes formula (6).

$$\begin{aligned}&\min \frac{1}{2}\left( {\alpha ^{T}\;\alpha _{m+1} } \right) Q_{new} \left( {\begin{array}{l} \alpha \\ \alpha _{m+1} \\ \end{array}} \right) -e^{T}\left( {\begin{array}{l} \alpha \\ \alpha _{m+1} \\ \end{array}} \right) \nonumber \\&s.t. \qquad \left( {\alpha ^{T}\;\alpha _{m+1} } \right) ^{T}\ge 0 \end{aligned}$$
(6)

Let \(h_{m+1} =y_{m+1} \left( {x_{m+1}^T ,-1} \right) \) and \(H=D\left[ {A\;-e} \right] \); then \(H_{new} =\left( {\begin{array}{l} H \\ h_{m+1} \\ \end{array}} \right) \), abbreviated \(H_{new} =\left( {\begin{array}{l} H \\ h \\ \end{array}} \right) \), and \(Q_{new} =\frac{I}{C}+\left( {\begin{array}{l} H \\ h \\ \end{array}} \right) \left( {H^{T}\;h^{T}} \right) \), which gives formula (7).

$$\begin{aligned} Q_{new} =\left( \begin{array}{ll} Q &{} Hh^{T} \\ hH^{T} &{} \frac{1}{C}+hh^{T} \\ \end{array} \right) \end{aligned}$$
(7)

Applying the VSVM iteration formula (3) to the new problem, \(\alpha _{new} \) is computed as in formula (8).

$$\begin{aligned}&\alpha _{new}^{i+1} =Q_{new}^{-1} \left( {\left( {\left( {Q_{new} \alpha _{new}^i -e} \right) -\lambda \alpha _{new}^i } \right) _+ +e} \right) , \nonumber \\&\quad i=0,1,2,\ldots ,\lambda >0 \end{aligned}$$
(8)

where \(\alpha _{new}^0 =\left( {\begin{array}{l} \alpha \\ 0 \\ \end{array}} \right) \) and \(\alpha \) is the optimal solution on the training dataset T before the increment.

(1) The Linear Case

To obtain the new solution \(\alpha _{new} \), the key quantity in the VSVM iteration formula (8) is \(Q_{new}^{-1} \). Applying the SMW identity gives:

$$\begin{aligned} Q_{new}^{-1} =C\left( {I-\left( {\begin{array}{l} H \\ h \\ \end{array}} \right) \left( {\frac{I}{C}+H^{T}H+h^{T}h} \right) ^{-1}\left( {H^{T}\;h^{T}} \right) } \right) \end{aligned}$$

Let \(B=\left( {\frac{I}{C}+H^{T}H} \right) ^{-1}\); then we obtain:

$$\begin{aligned}&\left( {\frac{I}{C}+H^{T}H+h^{T}h} \right) ^{-1}=\left( {B^{-1}+h^{T}h} \right) ^{-1} \\&\quad =\left( {B-\frac{Bh^{T}hB}{1+hBh^{T}}} \right) \end{aligned}$$

Thus

$$\begin{aligned} Q_{new}^{-1} =C\left( {I-\left( {\begin{array}{l} H \\ h \\ \end{array}} \right) \left( {B-\frac{Bh^{T}hB}{1+hBh^{T}}} \right) \left( {H^{T}h^{T}} \right) } \right) \end{aligned}$$
(9)

where B was already obtained in the previous (pre-increment) computation. Using the VSVM iterative formula (8) after adding the new sample, the new solution \(\alpha _{new} \) is obtained from the previous solution \(\alpha \).
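A sketch of this update in NumPy, assuming h is stored as a 1-D array of length n+1 and B is the (n+1)-order inverse carried over from the previous step (the function name is ours):

```python
import numpy as np

def incremental_inverse_linear(B, H, h, C):
    """Formula (9): after appending the row h to H, refresh
    B = (I/C + H^T H)^{-1} with a rank-1 Sherman-Morrison step and
    assemble Q_new^{-1} = C (I - H_new B_new H_new^T); nothing larger
    than the (n+1)-order matrix B is ever inverted."""
    Bh = B @ h
    B_new = B - np.outer(Bh, Bh) / (1.0 + h @ Bh)    # (B^{-1} + h^T h)^{-1}
    H_new = np.vstack([H, h])
    Q_new_inv = C * (np.eye(H_new.shape[0]) - H_new @ B_new @ H_new.T)
    return Q_new_inv, B_new, H_new
```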

(2) The Nonlinear Case

Let \(g=\left( {x_{m+1}^T ,-1} \right) \) and \(G=\left[ {A\;-e} \right] \); then:

$$\begin{aligned} G_{new} =\left[ {\begin{array}{l} G \\ g \\ \end{array}} \right] , D_{new} =\left[ \begin{array}{ll} D &{} 0 \\ 0 &{} y_{m+1} \\ \end{array} \right] \end{aligned}$$

At this point, Q becomes \(Q_{new} \) as follows:

$$\begin{aligned}&Q_{new} =\frac{I}{C}+D_{new} K\left( {G_{new} ,G_{new}^T } \right) D_{new} \\&\quad =\left( \begin{array}{ll} Q &{} DK\left( {G,g^{T}} \right) y_{m+1} \\ y_{m+1} K\left( {g,G^{T}} \right) D &{} \frac{1}{C}+K\left( {g,g^{T}} \right) \\ \end{array} \right) \end{aligned}$$

Let \(DK\left( {G,g^{T}} \right) y_{m+1} =b\), \(y_{m+1} K\left( {g,G^{T}} \right) D=b^{T}\), and \(\frac{1}{C}+K\left( {g,g^{T}} \right) =d\); then \(Q_{new} \) can be expressed as follows:

$$\begin{aligned} Q_{new} =\left( \begin{array}{ll} Q &{} b \\ b^{T} &{} d \\ \end{array} \right) \end{aligned}$$

Its inverse is obtained as formula (10).

$$\begin{aligned} Q_{new}^{-1}{=}\left( \begin{array}{ll} \left( {Q-bb^{T}/d} \right) ^{-1} &{} -\left( {Q-bb^{T}/d} \right) ^{-1}b/d \\ -b^{T}\left( {Q-bb^{T}/d} \right) ^{-1}/d &{} b^{T}\left( {Q-bb^{T}/d} \right) ^{-1}b/d^{2}+1/d \\ \end{array} \right) \nonumber \\ \end{aligned}$$
(10)

where \(\left( {Q-bb^{T}/d} \right) ^{-1}\) exists.
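A sketch of formula (10) in NumPy, reusing the stored \(Q^{-1}\). Expressing the Schur-complement inverse by a rank-1 Sherman–Morrison correction is an implementation choice of ours, not spelled out in the paper:

```python
import numpy as np

def incremental_inverse_nonlinear(Q_inv, b, d):
    """Formula (10): block inverse of Q_new = [[Q, b], [b^T, d]].
    S = Q - b b^T / d is the Schur complement of d; its inverse is
    recovered from the stored Q^{-1} by a rank-1 correction:
    S^{-1} = Q^{-1} + Q^{-1} b b^T Q^{-1} / (d - b^T Q^{-1} b)."""
    Qb = Q_inv @ b
    S_inv = Q_inv + np.outer(Qb, Qb) / (d - b @ Qb)
    tr = -S_inv @ b / d                       # top-right block, shape (m,)
    corner = b @ S_inv @ b / d**2 + 1.0 / d
    return np.block([[S_inv, tr[:, None]],
                     [tr[None, :], np.array([[corner]])]])
```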

3.2 Online incremental learning algorithm based on VSVM

In summary, the online incremental learning algorithm based on VSVM is given as follows (an illustrative code sketch follows the steps).

First, the original sample dataset is partitioned as follows:

$$\begin{aligned}&\left\{ {T=\left\{ {\left( {x_1 ,y_1 } \right) ,\left( {x_2 ,y_2 } \right) ,\ldots ,\left( {x_m ,y_m } \right) } \right\} } \right\} \\&\quad \cup \left\{ {\left( {x_{m+1} ,y_{m+1} } \right) } \right\} \cup \ldots \cup \left\{ {\left( {x_N ,y_N } \right) } \right\} . \end{aligned}$$
  1. Step 1:

    Given the parameter C, select the parameter \(\lambda \) to satisfy \(0<\lambda <2/C\) and the accuracy requirement \(\varepsilon >0\); train the classifier \(C_1 \) on T; set \(k=1\).

  2. Step 2:

    Get the new sample \(\left( {x_{m+k} ,y_{m+k} } \right) \); set \(\alpha _{m+k} =0\) and \(i=0\).

  3. Step 3:

    Judge whether the new sample \(x_{m+k} \) satisfies the KKT conditions of classifier \(C_k \).

  4. Step 3.1:

    If the KKT conditions are satisfied, the classifier \(C_k \) is unchanged; set \(k=k+1\) and go to Step 2, until all training samples have been processed;

  5. Step 3.2:

    If the KKT conditions are not satisfied, \(x_{m+k} \) and the training dataset form a new training dataset; compute the inverse of the new matrix Q;

  6. Step 3.2.1:

    If the classification problem is linearly separable, the inverse of the new matrix Q is calculated using formula (9);

  7. Step 3.2.2:

    If the classification problem is nonlinear, use formula (10) to calculate the inverse of the new matrix Q;

  8. Step 4:

    Use the VSVM iteration formula (8) to calculate \(\alpha _{new}^{i+1} \).

  9. Step 5:

    If \(\left\| {\alpha _{new}^{i+1} -\alpha _{new}^i } \right\| \le \varepsilon \), construct the new classifier \(C_{k+1} \), set \(k=k+1\), and go to Step 2, until all samples have been trained; otherwise let \(i=i+1\) and go to Step 4.
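The following Python sketch strings these steps together for the linear case, reusing the illustrative helpers `vsvm_iterate` and `incremental_inverse_linear` sketched above; the streaming interface and variable names are assumptions made for illustration:

```python
import numpy as np

def oi_vsvm_linear(A, y, stream, C, eps=1e-5):
    """Sketch of the OI-VSVM steps above (linear case). `stream`
    yields arriving samples (x_c, y_c) one at a time."""
    lam = 1.9 / C                                    # Step 1: 0 < lam < 2/C
    m, n = A.shape
    H = y[:, None] * np.hstack([A, -np.ones((m, 1))])
    B = np.linalg.inv(np.eye(n + 1) / C + H.T @ H)   # only (n+1)-order inverse
    Q = np.eye(m) / C + H @ H.T
    Q_inv = C * (np.eye(m) - H @ B @ H.T)            # SMW form of Q^{-1}
    alpha = vsvm_iterate(Q, Q_inv, lam, np.zeros(m), eps)

    for x_c, y_c in stream:                          # Step 2: new sample
        f_xc = (H.T @ alpha) @ np.append(x_c, -1.0)  # f(x_c) = w.x_c + b
        if y_c * f_xc >= 1.0:                        # Step 3.1: KKT holds,
            continue                                 # classifier unchanged
        h = y_c * np.append(x_c, -1.0)               # Step 3.2.1: formula (9)
        Q_inv, B, H = incremental_inverse_linear(B, H, h, C)
        Q = np.eye(H.shape[0]) / C + H @ H.T
        alpha = vsvm_iterate(Q, Q_inv, lam, np.append(alpha, 0.0), eps)  # Steps 4-5
    return alpha, H
```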

4 Online decremental learning algorithm based on VSVM

4.1 Online decremental learning algorithm

As online incremental training proceeds, the size of the sample dataset becomes larger and larger: the amount of storage grows, the number of multipliers and kernel function evaluations to be computed increases, training slows down, and the processor load becomes heavier, causing ever longer running times. To effectively reduce the load on the processor, the size of the learning sample dataset must be reduced, so an algorithm for removing redundant samples must be designed; when one or more samples are removed from the training dataset T, a decremental learning algorithm is needed. In this context we put forward a decremental learning algorithm based on VSVM, the so-called decremental VSVM learning. Following certain rules, some samples are discarded to reduce the size of the sample dataset; losing only one sample at a time is called online decremental learning.

Losing multiple samples at a time is called bulk decremental learning. The literature [4] presented an online decremental learning algorithm, defining the decremental learning procedure as the reverse of incremental learning and using it to evaluate leave-one-out generalization ability. The literature [17] gave incremental and decremental learning algorithms for linear classification based on the Proximal Support Vector Machine (PSVM), with relatively fast running times; subsequently, the literature [7] gave a decremental learning algorithm based on PSVM that uses a weighted decay coefficient instead of the existing window method, improving the running speed of the algorithm.

4.2 Online decremental learning algorithm based on VSVM

This section gives the decremental learning algorithm based on VSVM. First, we need some basic conventions and notation.

(1) The classification training dataset is \(T=\left\{ {\left( {x_i ,y_i } \right) |x_i \in R^{n},y_i =\pm 1,i=1,\ldots ,m} \right\} \), wherein each \(x_i \) is a point in n-dimensional space. The sample points form \(A^{T}=\left[ {x_1 ,\ldots ,x_m } \right] \); \(y_i \in \left\{ {\pm 1} \right\} \) marks \(x_i \) as belonging to the positive or negative class, \(i=1,...,m\); and \(D=diag\left( {y_1 ,\ldots ,y_m } \right) \).

(2) For an arbitrary vector \(x\in R^{n}\) and an index subset K of its components, \(x\backslash K\) denotes the vector formed from x by removing the components whose indices are in K.

(3) I denotes the identity matrix. \(P\left( {i,j} \right) \) denotes the elementary matrix obtained from I by interchanging the i-th and j-th rows (columns), with \(P\left( {i,j} \right) ^{-1}=P\left( {i,j} \right) \).

In online decremental learning, the training dataset shrinks over time by one sample at each moment. Our study and analysis of the online decremental algorithm based on VSVM first considers removing one sample from the training dataset. Suppose the original training dataset T has m samples and the sample \(\left( {x_k ,y_k } \right) \) is to be removed. For the m-order positive definite matrix Q and its inverse \(Q^{-1}\), we apply elementary row and column transformations to Q: the k-th row and column are moved to the first row and column. Let \(P\left( {1,k} \right) QP\left( {1,k} \right) =K\); then \(K^{-1}=P\left( {1,k} \right) Q^{-1}P\left( {1,k} \right) =U\). Partition K and U into blocks:

$$\begin{aligned} K=\left( \begin{array}{ll} k_{11} &{} {k^{T}} \\ k &{} {Q_{m-1} } \\ \end{array} \right) , \quad U=\left( \begin{array}{ll} u_{11} &{} {u^{T}} \\ u &{} U_{m-1} \\ \end{array} \right) \end{aligned}$$

wherein \(k_{11} \), \(u_{11} \in R^{1}\) and \(k,u\in R^{m-1}\). Then \(K^{-1}\) is written as follows:

$$\begin{aligned} K^{-1}=\left( \begin{array}{ll} \left( {k_{11} -k^{T}Q_{m-1}^{-1} k} \right) ^{-1} &{} -\left( {k_{11} -k^{T}Q_{m-1}^{-1} k} \right) ^{-1}k^{T}Q_{m-1}^{-1} \\ -Q_{m-1}^{-1} k\left( {k_{11} -k^{T}Q_{m-1}^{-1} k} \right) ^{-1} &{} Q_{m-1}^{-1} k\left( {k_{11} -k^{T}Q_{m-1}^{-1} k} \right) ^{-1}k^{T}Q_{m-1}^{-1} +Q_{m-1}^{-1} \end{array} \right) \end{aligned}$$

From \(K^{-1}=U\), formula (11) is obtained.

$$\begin{aligned}&\left( {k_{11} -k^{T}Q_{m-1}^{-1} k} \right) ^{-1}=u_{11} \nonumber \\&\quad -\,\left( {k_{11} -k^{T}Q_{m-1}^{-1} k} \right) ^{-1}k^{T}Q_{m-1}^{-1} =u^{T} \nonumber \\&\quad -\,Q_{m-1}^{-1} k\left( {k_{11} -k^{T}Q_{m-1}^{-1} k} \right) ^{-1}=u \nonumber \\&\quad Q_{m-1}^{-1} k\left( {k_{11} -k^{T}Q_{m-1}^{-1} k} \right) ^{-1}k^{T}Q_{m-1}^{-1} \nonumber \\&\quad +\,Q_{m-1}^{-1} =U_{m-1} \end{aligned}$$
(11)

From formula (11), we obtain formula (12):

$$\begin{aligned} Q_{m-1}^{-1} =U_{m-1} -\frac{uu^{T}}{u_{11} } \end{aligned}$$
(12)

Here \(Q_{m-1} \) denotes the matrix Q after the sample \(\left( {x_k ,y_k } \right) \) is removed. The preceding analysis shows that \(Q_{m-1}^{-1} \) can be expressed in terms of the stored \(Q^{-1}\) via formula (12), so the inverse of \(Q_{m-1} \) does not have to be recalculated, reducing the amount of computation.
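A sketch of this downdate in NumPy (0-based index k; the permutation is realized by fancy indexing rather than an explicit \(P\left( {1,k} \right) \) matrix, and the function name is ours):

```python
import numpy as np

def decremental_inverse(Q_inv, k):
    """Formula (12): the inverse of Q with its k-th row and column
    removed, recovered from the stored Q^{-1}. Moving row/column k to
    the front gives U = P(1,k) Q^{-1} P(1,k); then
    Q_{m-1}^{-1} = U_{m-1} - u u^T / u_{11}."""
    idx = [k] + [i for i in range(Q_inv.shape[0]) if i != k]
    U = Q_inv[np.ix_(idx, idx)]              # symmetric permutation
    u11, u, U_rest = U[0, 0], U[1:, 0], U[1:, 1:]
    return U_rest - np.outer(u, u) / u11
```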

Table 1 Linear case of OI-VSVM experimental results

After the decrement, the solution \(\alpha _{new} \) can be obtained from the VSVM iterative formula (8), as in formula (13).

$$\begin{aligned}&\alpha _{new}^{i+1} =Q_{m-1}^{-1} \left( {\left( {\left( {Q_{m-1} \alpha _{new}^i -e} \right) -\lambda \alpha _{new}^i } \right) _+ +e} \right) , \nonumber \\&\quad i=0,1,2\ldots ,\lambda >0 \end{aligned}$$
(13)

where \(\alpha _{new}^0 =\alpha \backslash \left\{ k \right\} \) and \(\alpha \) is the optimal solution before the decrement.

The details of the online decremental learning algorithm based on VSVM are as follows (an illustrative code sketch follows the steps):

  1. Step 1:

    Given the parameter C, select the parameter \(\lambda \) to satisfy \(0<\lambda <2/C\) and the accuracy requirement \(\varepsilon >0\); obtain the sample to be removed \(\left( {x_k ,y_k } \right) \);

  2. Step 2:

    Let \(\alpha _{new}^0 =\alpha \backslash \left\{ k \right\} \), where \(\alpha \) is the optimal solution before the decrement; set \(i=0\);

  3. Step 3:

    Use formula (12) to calculate the inverse of the new matrix \(Q_{m-1} \);

  4. Step 4:

    Use the VSVM iteration formula (13) to calculate \(\alpha _{new}^{i+1} \);

  5. Step 5:

    If \(\left\| {\alpha _{new}^{i+1} -\alpha _{new}^i } \right\| \le \varepsilon \), the new solution is obtained; otherwise, let \(i=i+1\) and go to Step 4.
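Combining the pieces, here is a minimal sketch of one decremental step, again reusing the hypothetical helpers `decremental_inverse` and `vsvm_iterate` sketched earlier:

```python
import numpy as np

def od_vsvm_step(Q, Q_inv, alpha, k, C, eps=1e-5):
    """One pass of Steps 1-5: drop sample k, update the inverse with
    formula (12), and re-solve with iteration (13)."""
    lam = 1.9 / C                                # satisfies 0 < lam < 2/C
    keep = [i for i in range(Q.shape[0]) if i != k]
    Q_small = Q[np.ix_(keep, keep)]              # Q_{m-1}
    Q_small_inv = decremental_inverse(Q_inv, k)  # formula (12), no inversion
    alpha0 = np.delete(alpha, k)                 # alpha \ {k}
    return vsvm_iterate(Q_small, Q_small_inv, lam, alpha0, eps)
```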

5 The experimental results analysis and discussion

5.1 The numerical experiments of online incremental learning algorithm

For convenience of description, this section abbreviates the VSVM-based online incremental learning algorithm as OI-VSVM. The online incremental learning algorithm that does not use previously calculated results to compute the matrix inverse is abbreviated OVSVM.

In order to verify the effectiveness of the OI-VSVM algorithm of Sect. 3, nine standard pattern-classification datasets [18] were selected for the numerical experiments and comparison. Matlab 2014a was used as the programming environment, and the numerical experiments were run on a personal computer.

We compare the OVSVM algorithm, the online incremental learning algorithm proposed in the literature [4] (abbreviated On-line), and the OI-VSVM algorithm on the selected datasets in terms of training and test accuracy and CPU running time. In all of the following tests we take \(\lambda ={1.9}/C\), where C is the penalty parameter, and the accuracy requirement \(\varepsilon \) is \(10^{-5}\). The numerical experiments are divided into the linear and nonlinear cases.

The results for the linear case are shown in Table 1, in which m is the number of sample points in the initial training dataset and C is the penalty parameter. The value of C is tuned by selecting, on a randomly drawn subset whose size is 10% of the training dataset, the value that yields the highest SVM accuracy; the same randomly selected split of the training data is used in the following experiments.

For the nonlinear case, the Gaussian radial basis kernel function \(K\left( {x,y} \right) =\exp \big ( -\left\| {x-y} \right\| ^{2}/\delta ^{2} \big )\) is selected, where \(\delta \) is the kernel parameter of the RBF kernel; the parameter selections and results are shown in Table 2.
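For reference, the kernel matrix used here can be computed as follows (a plain NumPy sketch; the vectorization style is our choice):

```python
import numpy as np

def rbf_kernel(X, Y, delta):
    """Gaussian RBF kernel K(x, y) = exp(-||x - y||^2 / delta^2),
    the form used in the nonlinear experiments of Table 2."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / delta**2)
```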

The experimental results in both the linear and nonlinear cases show that the OI-VSVM algorithm greatly reduces the CPU running time while keeping the training and test accuracy from dropping. This is mainly because OI-VSVM reuses the results of previous calculation steps and applies the SMW identity to compute the matrix inverse after a sample is added, thus saving computation time.

Table 2 Nonlinear case of OI-VSVM experimental results

5.2 The numerical experiments of online decremental learning algorithm

The effectiveness of the decremental VSVM learning algorithm is demonstrated by numerical experiments in this section. For convenience, the online decremental learning algorithm based on VSVM given in this section is abbreviated OD-VSVM; OVSVM denotes the VSVM online decremental learning algorithm that uses the direct inversion method.

To verify the performance of OD-VSVM, datasets were selected from the UCI machine learning repository [19] for the numerical experiments. Because the algorithms differ only in how the matrix inverse is computed (directly versus by update), their training and test accuracies are identical, so the selected datasets are compared only in CPU running time in our experiments. For the nonlinear case, the RBF kernel function \(K\left( {x,y} \right) =\exp \left( {-\left\| {x-y} \right\| ^{2}/2\delta ^{2}} \right) \) is selected, with \(\delta =2\). In VSVM we take \(\lambda ={1.9}/C\), where C is the penalty parameter, and \(C=1/m\) (m is the number of sample points in the training dataset). The accuracy is \(\varepsilon =10^{-5}\).

Table 3 The linear case of OD-VSVM experimental results

The comparison of CPU running times of OD-VSVM and OVSVM on the same datasets is shown in Table 3. When the training dataset is not too big, the CPU running time of OD-VSVM is less than that of OVSVM, but the difference is not obvious. For big-scale training datasets, however, the difference in CPU time between OD-VSVM and OVSVM is quite obvious, as on the Image1 and Image2 datasets [19].

Table 4 shows the experimental results of the different algorithms for VSVM online decremental learning in the nonlinear case. As can be seen from Table 4, the gap in CPU running time between OD-VSVM and OVSVM becomes more apparent as the size of the training dataset grows, and the larger the value of k, the more time OD-VSVM saves, where k is the number of samples lost at each step.

Table 4 The nonlinear case of OD-VSVM experimental results

The ability of online adaptive learning gives the support vector machine decremental learning that can change over time. Sect. 4.2 gave the online decremental learning algorithm based on VSVM using matrix blocking, which speeds up the computation of the inverse matrix. Because it takes full advantage of earlier learning results, it avoids re-learning when some samples are removed, effectively reducing the computation time of OVSVM and OD-VSVM. The numerical experimental results show that the VSVM online and bulk decremental learning algorithms given in this section maintain the original training and test accuracy while reducing the running time, so the decremental learning algorithm proposed in the paper is effective on big-scale datasets.

5.3 Calculating visual saliency map experiment

This part uses the video data published in Itti's datasets [20] to compute the visual saliency of video image sequences. The datasets include day and night video, indoor and outdoor video, news video, and various other videos. For convenient comparison, salient regions were first marked in the input images, which makes it easy to judge the pros and cons of the visual saliency maps produced by each model.

The image in Fig. 1a is the original image from the video. The image in Fig. 1b is the visual saliency map produced by Itti's model [21]. The image in Fig. 1c is the visual saliency map produced by the GBVS model [22]. The image in Fig. 1d is the visual saliency map produced by the IS model [23]. The image in Fig. 1e is the visual saliency map produced by the algorithm proposed in this paper.

According to the saliency map results, the visual saliency map incorporating the VSVM method better reflects the characteristics of the original image. At the same time, the visual saliency map serves as an important auxiliary in the subsequent target detection process.

5.4 The testing experiments of object tracking algorithm

In order to verify the accuracy of the proposed algorithm in the target detection field, a video-based face tracking test was performed; this section selected the Stan Birchfield datasets [24] for the target detection experiments. The test computer environment was an Intel E8400 2.6 GHz CPU with 8 GB DRAM, and target detection was implemented in Matlab 2014a. We focus on the robustness of the algorithm, including the detection results under changes of light intensity, changes of target shape, and occlusion, as well as comparisons of individual characteristics and performance.

The first set of experiments performs face detection on a video sequence with occlusion (\(128\times 96\); video file name: movie_cubicle; 95 frames in total). Three algorithms are compared: a target detection algorithm that considers the color feature of the target, a detection algorithm that applies visual saliency, and the method using OI-VSVM and OD-VSVM with global visual saliency features. Figure 2 shows the target detection results of the three algorithms under occlusion, for frames 1, 16, 21, 34, 51, 62, 70, and 88 of the video image sequence. In the sequence, the target is occluded during its movement, and the occluding object has a color similar to that of the target to be detected. The detection method that considers only the color feature performs poorly, because the background color distribution approximates the object color. In contrast, the method presented in this section (Fig. 2c), which fuses OI-VSVM and OD-VSVM with visual saliency features, can accurately locate the target even in the presence of occlusion, demonstrating the robustness of the algorithm.

Fig. 1 Different visual saliency maps

Fig. 2 Detection results under occlusion. a Results of the tracking algorithm using OI-VSVM features. b Results of the tracking algorithm using OD-VSVM features. c Results of the tracking algorithm combining OI-VSVM and OD-VSVM features

6 Conclusions

Firstly, this paper proposed an online incremental learning algorithm based on VSVM. The proposed algorithm takes full advantage of the pre-increment calculation results: it does not need to re-learn the entire training dataset, does not reduce the training or test accuracy, and reduces the amount of computation for the inverse matrix after an increment. The experimental results show that the algorithm effectively improves training speed while ensuring classification accuracy; however, as incremental training proceeds, the training set keeps growing, which slows training. To solve this problem, the paper also gave an online decremental learning algorithm based on VSVM, which speeds up decremental learning by updating the inverse matrix from the previously computed inverse. Because it takes full advantage of earlier learning results and avoids the re-learning caused by discarding samples, it effectively reduces the complexity of the computation time, and the numerical experiments prove the effectiveness of the algorithm.