
1 Introduction

Dimension reduction is a prerequisite for many tasks in data mining, pattern recognition and machine learning. Accordingly, developing effective methods for dimension reduction has been a hot research topic in recent years.

Specifically, manifold learning, a type of effective nonlinear dimension reduction method, aims to discover the low-dimensional smooth manifold structure of observed high-dimensional data. In the past decade, a number of manifold learning algorithms have been developed, including Isometric feature mapping (Isomap) [1], the Laplacian eigenmaps algorithm [2], locally linear embedding (LLE) [3], Local Tangent Space Alignment (LTSA) [4] and so on. All these algorithms operate in a “batch” mode, i.e., the whole set of data must be collected before running the algorithm, which makes them ineffective for data collected sequentially (data streams). In reality, however, many applications need to handle real-time data streams in which data arrive one after another. Therefore, incremental manifold learning methods are required that can continuously and efficiently update the manifold built from existing and newly arriving data without repeatedly recomputing over the whole data set. Meanwhile, incremental learning algorithms can be used to visualize how the data manifold changes over time, which greatly helps in understanding the structure of the data.

In this paper we propose a new incremental manifold learning algorithm called LITSA, which finds the local principal components of the sample points by incremental PCA. LITSA is not constrained by the size of the sample points' covariance matrix, thus avoiding the large-scale matrix eigen-decomposition problem of LTSA. In this method, in order to obtain the coordinates in the low-dimensional space, we construct the tangent space of each data sample in the high-dimensional space incrementally. To avoid reconstructing the eigenspace repeatedly, we build the local tangent space matrix incrementally from the sample points' covariance matrix. Thus, the proposed algorithm can update the manifold structure directly by computing the eigenvectors of the local tangent space matrix constructed from the existing samples and the newly arrived data samples. Experimental results on both synthetic and real-world datasets show that the proposed algorithm can efficiently achieve a more accurate low-dimensional representation of the original data.

2 Related Work

Existing incremental manifold learning algorithms can be roughly divided into two categories.

The first category is based on the hypothesis that the existing embedding results are entirely correct, so the new samples can be dealt with efficiently. However, the existing data often cannot accurately reflect the intrinsic manifold structure; especially in the case of non-uniform sampling, these algorithms may not provide a correct low-dimensional embedding of the high-dimensional data.

Algorithms in the second category also update the training dataset's embedding coordinates when embedding new samples, so they better reflect the dataset's characteristics and give more reliable results.

Existing typical incremental manifold learning algorithms include the incremental Isomap algorithm (IIsomap) [5], the incremental version of Laplacian eigenmaps (ILE) [6], the incremental LLE algorithm (ILLE) [7], and the incremental LTSA algorithm (ILTSA) [8]. In addition, Liu et al. [9] proposed another incremental manifold learning algorithm via LTSA, which modifies LTSA and solves an incremental eigen-decomposition problem with increasing matrix size.

The core idea of the IIsomap algorithm is to efficiently update the geodesic distances and re-estimate the eigenvectors using the previous computation results. However, disconnected neighborhood graphs can arise when the data are undersampled or unevenly distributed: new points may change the current neighborhoods and the local distribution of the manifold, so the continuity of the neighborhood matrix is not guaranteed. The addition of a new sample can delete critical edges in the graph and subsequently change the geodesic distances dramatically; in this case a short-circuit or cavitation phenomenon occurs. Laplacian eigenmaps is traditionally performed in batch mode; ILE introduces a locally linear reconstruction mechanism to add new adjacency information and revise the existing samples' low-dimensional embedding results. The sub-manifold method involved needs to solve a \((k + 1) \times (k + 1)\) eigenvector problem, and the overall time complexity of solving it is \(O({(k + 1)^3})\). The ILLE algorithm evaluates the mapping results of the new samples and re-calculates the projections of the original samples. It assumes that the eigenvalues of the cost matrix remain the same when a new data point arrives, and the incremental learning problem of LLE is tackled by solving a \(d \times d\) minimization problem, where d is the dimensionality of the low-dimensional embedding space. Previous incremental algorithms such as ILTSA mostly adopt iterative methods over the whole process, which require abundant repeated computation because one does not know which part of an iterative optimization problem has to be recomputed. The method of [9] suffers from a similar problem: the alignment matrix has to be reconstructed to include the new point, which is not practical for large datasets. Another limitation of existing incremental methods is that there is no guarantee on the approximation error, so these approaches suffer from unpredictable approximation error [10]. A similar problem also exists in other incremental methods such as ILLE: the error in the dimension reduction process grows because the eigenvalues of the new cost matrix \({M_{new}}\) are not updated. As the number of new samples increases, the deviation from the first d smallest eigenvalues of the original cost matrix M becomes larger, and hence so does the error [11].

Given a set of N data points {\(x_{i}|i=1, 2, \cdots , N\)}, Local Tangent Space Alignment (LTSA) assumes that the data points are sampled from a smooth manifold embedded in the m-dimensional space \({\mathbb {R}^m}\). It maps each \(x_{i}\) to a d-dimensional representation \(\tau _{i}\) and preserves the local geometry of \(x_{i}\) as much as possible, where \(d < m\). The local geometry of \(x_{i}\) is captured by the tangent space generated from its k nearest neighbor points. The data points in all tangent spaces are then aligned to give their global coordinates on the low-dimensional manifold. Although LTSA can effectively evaluate a dataset's global mapping coordinates, which reflect the dataset's low-dimensional manifold structure, it has two shortcomings: on the one hand, the size of the covariance matrix used for eigen decomposition in LTSA is equal to the number of samples, so it is inefficient for large datasets; on the other hand, its high time complexity prevents it from handling new sample points effectively, which makes incremental learning difficult.

Min et al. [12] proved that the local tangent space of a sample can be represented by the eigenvectors of the covariance matrix constructed from the samples in its neighborhood. These eigenvectors can be computed by local principal component analysis. Therefore, the problem of computing the sample points' projection coordinates in the low-dimensional space can be transformed into a local principal component analysis problem.
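To make this local step concrete, the following is a minimal numpy sketch of estimating a sample's tangent space from the PCA of its neighborhood, as described above; the function and variable names are illustrative and the nearest-neighbor search is done by brute force.

```python
import numpy as np

def local_tangent_space(X, i, k, d):
    """Estimate the d-dimensional tangent space at sample X[:, i] from the
    PCA of its k nearest neighbors (the batch local step of LTSA / [12]).

    X is an (m, N) array with one m-dimensional sample per column."""
    dists = np.linalg.norm(X - X[:, [i]], axis=0)
    idx = np.argsort(dists)[:k]              # k nearest neighbors (incl. x_i)
    Xi = X[:, idx]
    mean = Xi.mean(axis=1, keepdims=True)
    Xc = Xi - mean                           # center the neighborhood
    # the leading left singular vectors of the centered neighborhood are the
    # eigenvectors of its covariance matrix and span the tangent space
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    return idx, mean, U[:, :d]
```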

Recently, a number of semi-supervised feature extraction algorithms [13–16] have been proposed, which combine semi-supervised techniques with local discriminant analysis approaches. However, they do not consider incremental learning on dynamic manifolds, which has become an important topic for today's big data streams.

The algorithm proposed in this paper overcomes these shortcomings. It finds the local principal components of the sample points by incremental PCA [17, 18] and is therefore not constrained by the size of the sample points' covariance matrix, thus avoiding the large-scale matrix eigen-decomposition problem of LTSA. In addition, by taking an adaptive factor into account, our algorithm is able to deal with newly arrived sample points effectively.

3 Manifold Learning Algorithm Based on Incremental Tangent Space Alignment

3.1 Update New Covariance Matrix After Inserting New Point

Firstly, we construct new local tangent spaces for the existing sample points and the newly arrived one. Given a set of data points \(X = [{x_1},\cdots ,{x_N}]\) sampled from a nonlinear manifold in \({\mathbb {R}^m}\), they are to be mapped from the m-dimensional space to a d-dimensional space. Let \(X_i = [{x_{i_1}},\cdots ,{x_{i_k}}]\) be the matrix containing the k nearest neighbors of \(x_{i}\) (in Euclidean distance). Traditional PCA searches for a vector c and matrices U and T that project \(X_{i}\) onto the low-dimensional manifold \({\mathbb {R}^d}\). To obtain the optimal solution, we minimize the reconstruction error \(\min ||E|| = \mathop {\min }\limits _{c,U,T} ||X - (c{e^T} + UT)|{|_F}\), in which c is the mean value of X, i.e. Xe/N. The matrix \(U \in {\mathbb {R}^{m \times d}}\) is a set of orthonormal basis vectors of the affine subspace; in the linear case this problem can be solved by singular value decomposition. In the nonlinear case, and especially in a real-time environment, the situation is more complicated, so local linear approximations must be used to solve the incremental nonlinear mapping problem.

When a new point \(x_{new}\) arrives, we prepare to update its local information from the eigenvectors of the existing points. Suppose the eigenspace of the existing N samples has already been constructed, with the k nearest neighbors of each point. The new point is projected into the existing eigenspace as \({w_i} = {u_{i}^T}(x_{new} - \overline{X}_{N}), i=1,2,\cdots ,N\), where \(u_{i}\) is the \(i^{th}\) eigenvector of the eigenspace built from the N sample points, \(\overline{X}_{N}\) is the mean of the N samples, and \(w_{i}\) is the \(i^{th}\) coefficient of the new point in the eigenspace.

Then we can reconstruct the new point's coordinates from these coefficients: \(x_{new} = w_{i}u_{i} + \overline{X}_{N}\). To solve the incremental nonlinear mapping problem, the first step is to estimate the mean value and construct the sample points' covariance matrix, and thereby determine the sample points' eigenvectors in the eigenspace.
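As a sketch, the projection and reconstruction step can be written as follows, assuming the existing eigenvectors are stored as the columns of `U` and `mean_N` is the mean of the N samples (both names are illustrative).

```python
import numpy as np

def project_and_reconstruct(x_new, U, mean_N):
    """Project a new sample onto the existing eigenspace and reconstruct it
    from the resulting coefficients w_i = u_i^T (x_new - mean_N)."""
    w = U.T @ (x_new - mean_N)      # coefficients of x_new in the eigenspace
    x_rec = U @ w + mean_N          # reconstruction from the coefficients
    return w, x_rec
```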

Let \(M_{N}\) be the mean value of the existing N samples in m dimension space. In this paper we propose a novel formulation \(M_{new}\) in order to describe the geometric information of the existing sample points’ covariance matrix after inserting the new point:

$$\begin{aligned} M_{new} = \alpha {M_N}+(1-\alpha ){x_{new}}{e^T}, \end{aligned}$$
(1)

where \(\alpha \) is an adaptive factor controlling how the existing samples influence the estimate of \(M_{new}\), and e denotes the vector of all ones. The choice of \(\alpha \) depends on the incremental environment: if the samples in the current environment update quickly, \(\alpha \) is set to a small value; if slowly, \(\alpha \) takes a larger value. We regard \((1-\alpha )\) as a learning rate which, in real-time practice, depends on the nature of the data set and normally requires a trial-and-error procedure to determine, which is impractical for “on-line” applications [19]. This process depends on the measurement of the observed data samples \( x_{new} \). From [20] we know that the sample mean is an efficient estimate of the distribution mean. The use of the samples' mean value \(M_{N}\) therefore motivates us to set the learning rate to \(\frac{1}{N}\), i.e. \((1- \alpha )=\frac{1}{N}\), so that the update behaves like a sample mean. Although the factor \(\alpha =\frac{N-1}{N}\) applied to \(M_N\) is close to 1 when N is large, this choice is important for fast convergence on the early samples: if the estimate does not converge well at the beginning, it is hard to pull it back later when N is large. The experiments in the subsequent section also show that the estimate of \(M_{new}\) has a fairly low error variance and high statistical efficiency.
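For illustration, the mean update of Eq. (1) with \(\alpha =\frac{N-1}{N}\) reduces to the following one-line sketch (written for a single mean vector, dropping the \(e^T\) replication of the matrix form; names are illustrative).

```python
def update_mean(mean_N, x_new, N):
    """Incremental mean update of Eq. (1) with alpha = (N-1)/N,
    i.e. a learning rate of 1/N for the new sample."""
    alpha = (N - 1) / N
    return alpha * mean_N + (1 - alpha) * x_new
```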

Traditional PCA (Principal Component Analysis) computes the eigen-solutions via the SVD (Singular Value Decomposition) of the covariance matrix: \(C = \frac{1}{N}\sum \limits _{i = 1}^N ({x_i} - \bar{x})^T({x_i} - \bar{x})\). The corresponding matrix form can be written as \(C = ({X_i} - \bar{x}_i{e^T})^T({X_i} - \bar{x}_i{e^T})\). We compute C's eigenvectors corresponding to its first d eigenvalues, which span a d-dimensional subspace. Only \(M_{new}\) and the existing covariance matrix \(C_n\) need to be stored, because the existing samples can be discarded during the recursive updating process; this greatly decreases the algorithm's complexity. Similarly we obtain the iterative estimate of the covariance matrix given below:

$$\begin{aligned} C_{n} = \alpha {C_{n - 1}} + (1 - \alpha )(X_{n} - {\overline{X}_n})^T{({X_n} - {\overline{X}_n})} \end{aligned}$$
(2)

After substituting \(\alpha \) with \(\frac{N-1}{N}\) and inserting the new sample \(x_{new}\), we obtain the update \({C_{new}}\) of C as follows, where \(X_{new}\) denotes the matrix \({x_{new}}{e^T}\):

$$\begin{aligned} {C_{new}} = \frac{{N - 1}}{N}C + \frac{1}{N}{({X_{new}} - {M_{new}})^T}({X_{new}} - {M_{new}}) \end{aligned}$$
(3)
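A minimal sketch of the covariance update in Eqs. (2)-(3) is given below; samples are treated as row vectors so that the centered term becomes an outer product, and only the stored covariance and the updated mean are needed (all names are illustrative).

```python
import numpy as np

def update_covariance(C, x_new, mean_new, N):
    """Incremental covariance update following Eq. (3): old samples are
    never revisited, only C and the updated mean are kept in memory."""
    diff = np.asarray(x_new - mean_new).reshape(1, -1)    # centered row vector
    return (N - 1) / N * C + (1.0 / N) * (diff.T @ diff)  # rank-one update
```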

According to PCA, the optimal solution of \(C = ({X_i} - \bar{x}_i{e^T})^T({X_i} - \bar{x}_i{e^T})\) is given by the SVD of \(({X_i} - \bar{x}_i{e^T})\). Let \(\lambda _{i}\ (i=1,\cdots ,d)\) be the first d eigenvalues of C and \(v_i\) the corresponding eigenvectors, so that \(C \approx {U_{Nd}}{\varLambda _{dd}}U_{Nd}^T = \sum \limits _{i = 1}^d {{\lambda _i}} {v_i}v_i^T\), where the column vectors of \({U_{Nd}}\) are the eigenvectors \(v_i\) of C and the diagonal matrix \({\varLambda _{dd}}\) consists of C's eigenvalues \({\lambda _i}\).

Now we can express the incremental \(C_{new}\) in matrix form:

$$\begin{aligned} C_{new} = \frac{{N - 1}}{N}U\varLambda {U^T} + \frac{1}{N}{({X_{new}} - {M_{new}})^T}({X_{new}} - {M_{new}}) \end{aligned}$$
(4)

To facilitate the calculation, the above formula can be written as:

$$\begin{aligned} {C_{new}} = \frac{{N - 1}}{N}\sum \limits _{i = 1}^d {{\lambda _i}} {v_i}v_i^T + \frac{1}{N}{({X_{new}} - {M_{new}})^T}({X_{new}} - {M_{new}}) \end{aligned}$$
(5)

where:

$$\begin{aligned} {X_{new}} - {M_{new}} = {x_{new}}{e^T} - \frac{{N - 1}}{N}\bar{x}_i{e^T} - \frac{1}{N}{x_{new}}{e^T} = \frac{{N - 1}}{N}({x_{new}} - \bar{x}_i){e^T} \end{aligned}$$
(6)
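Since \(C \approx \sum _i \lambda _i v_iv_i^T\), the update of Eqs. (5)-(6) only needs the d stored eigenpairs and the centered new sample; a hedged numpy sketch (with illustrative names and shapes) is:

```python
import numpy as np

def covariance_from_eigenpairs(eigvals, eigvecs, x_new, x_mean, N):
    """Rebuild C_new from the d stored eigenpairs plus a rank-one term,
    as in Eqs. (5)-(6); the full covariance never has to be stored.

    eigvals: (d,) eigenvalues lambda_i; eigvecs: columns are v_i."""
    C_low_rank = (eigvecs * eigvals) @ eigvecs.T      # sum_i lambda_i v_i v_i^T
    diff = (N - 1) / N * (x_new - x_mean)             # Eq. (6), per column of e^T
    return (N - 1) / N * C_low_rank + (1.0 / N) * np.outer(diff, diff)
```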

3.2 Construct the Local Tangent Space in Incremental Form

To update all the points' embedding coordinates in the low-dimensional space in an incremental way, we proceed in the following steps.

Firstly, we perform eigen-decomposition of \(C_{new}\) and obtain its first d eigenvalues \({\lambda _i}\) and eigenvectors \(v_i\), \(i=1,2,...,d\). Then we construct \(d+1\) vectors based on \(v_i\) and \({\lambda _i}\), forming the matrix \(A=[y_1,y_2,\cdots ,y_{d+1}]\) of size \(k \times (d+1)\), where \({y_i} = {v_i}\sqrt{\frac{{N - 1}}{N}{\lambda _i}}, i = 1, \cdots ,d\), and \({y_{d+1}} = \sqrt{\frac{1}{N}} {({X_{new}} - {M_{new}})^T}\). This satisfies \(C_{new} = AA^T\), so A represents the local tangent space of the neighbor points of the new sample, including the new sample itself.

Secondly, an inner-product matrix B of size \((d+1) \times (d+1)\) can be formulated as \(B=A^TA\), which is much smaller than \(C_{new}\). The advantage of this approach is that we can construct the local tangent space matrix incrementally instead of recomputing the new covariance matrix \(C_{new}\). The matrix B collects the inner products of the columns of A and thus characterizes the subspace they span. The local coordinates of the new point \(x_{new}\) in the incremental subspace can therefore be obtained by computing the first d singular vectors of A [21], equivalently the d eigenvectors \(u_1, \cdots , u_d\) corresponding to the d smallest eigenvalues of B. Denote the eigenvectors and eigenvalues of B by \(\{ v_n^i\} \) and \(\{ \lambda _n^i\} \), so that \(Bv_n^i = \lambda _n^iv_n^i\), \(i = 1,2,...,d+1\). Then \({A^T}Av_n^i = \lambda _n^iv_n^i\); multiplying both sides by A gives \(A{A^T}Av_n^i = \lambda _n^iAv_n^i\). Letting \(u_n^i = Av_n^i\), we have \(A{A^T}u_n^i = \lambda _n^iu_n^i\), i.e. \(C_{new}u_n^i = \lambda _n^iu_n^i\). Hence \({C_{new}}\)'s eigenvectors are \(u_n^i = Av_n^i\).
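The following sketch mirrors this construction, with B taken as \(A^TA\) of size \((d+1)\times (d+1)\) as in the derivation above; the column `col_new` stands for \(y_{d+1}\) and all names are illustrative.

```python
import numpy as np

def eigenvectors_via_small_B(eigvals, eigvecs, col_new, N):
    """Form A = [y_1, ..., y_{d+1}], eigen-decompose the small matrix
    B = A^T A and recover eigenvectors of C_new = A A^T as u = A v."""
    d = eigvals.shape[0]
    A = np.empty((eigvecs.shape[0], d + 1))
    A[:, :d] = eigvecs * np.sqrt((N - 1) / N * eigvals)   # y_i, i = 1..d
    A[:, d] = col_new                                     # y_{d+1}
    B = A.T @ A                                           # only (d+1) x (d+1)
    lam, V = np.linalg.eigh(B)
    U = A @ V                                             # u = A v
    nz = lam > 1e-12
    U[:, nz] /= np.sqrt(lam[nz])       # normalize the non-null eigenvectors
    return lam, U
```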

With the eigenvectors of the covariance matrix in incremental form, the local coordinates of the new sample point \({x_{new}}\) can be calculated by Eq. (7):

$$\begin{aligned} {x_{new}} = {w_i}{u_i} + \overline{X}_N = \sqrt{\lambda _n^i} u_n^i + \overline{X}_N = \sqrt{\lambda _n^i} Av_n^i + \overline{X}_N \end{aligned}$$
(7)

3.3 Compute the Low Dimensional Global Coordinates

Now we consider the incremental updating situation. The local coordinates of the new sample point in Eq. (7) can be represented in the following matrix form:

$$\begin{aligned} X_{new} = W U + {\overline{X}_N}, \end{aligned}$$
(8)

where the low-rank matrix WU is the d-dimensional optimal approximation of original coordinates, U is orthonormal and W has k column vectors: \(U=[{u_1}, \cdots ,{u_d}]\), \(W = [{w_1},\cdots ,{w_k}]\). \({\overline{X}_N}\) is the mean of \(X_N\).

Since the structure of global coordinates is linear in low dimensional feature space, the new sample point’s local coordinate can be approximated by Eq. (8). Then we can construct global coordinates according to the local coordinates:

$$\begin{aligned} Y_i = \frac{1}{k} Y_i e e^T + L_i( W U + {\overline{X}_N}) + E_i , i = 1 , ... , N. \end{aligned}$$
(9)

\(Y_i\) denotes the global coordinates in low dimensional space, \(L_i\) and \(E_i\) denote local affine transformation matrix and reconstruction error respectively after inserting new data point. So,

$$\begin{aligned} {E_i} = {Y_i}(I - \frac{1}{k}e{e^T}) - {L_i}(WU + \overline{X}_N) \end{aligned}$$
(10)

To minimize the reconstruction error, Eq. (10) leads to the problem

\(\mathop {\min }\limits _{{Y_i},{L_i}} ||{Y_i}(I - \frac{1}{k}e{e^T}) - {L_i}(WU + \overline{X}_N)||\), and hence:

$$\begin{aligned} {L_i} = {Y_i}(I - \frac{1}{k}e{e^T}){(WU + \overline{X}_N)^\dag } \end{aligned}$$
(11)

where \({(WU + \overline{X}_N)^\dag } = {({(WU + \overline{X}_N)^T}(WU + \overline{X}_N))^{ - 1}}{(WU + \overline{X}_N)^T}\) is the pseudo-inverse of \((WU + \overline{X}_N)\). So:

$$\begin{aligned} {E_i} = {Y_i}(I - \frac{1}{k}e{e^T})(I - {(WU + \overline{X})^\dag }(WU + \overline{X})) \end{aligned}$$
(12)

Suppose \({H_i} = (I - \frac{1}{k}e{e^T})(I - {(WU + \overline{X})^\dag }(WU +\overline{X}))\). To determine \(Y_i\) uniquely, we impose the constraint \( Y_i Y_i^T = I_d \). Let \( Y = [ Y_1 , \cdots , Y_N ] \) and let \( S_i\) be the 0-1 selection matrix such that \( Y S_i = Y_i \); then we have:

$$\begin{aligned} ||{E_i}||_F^2 = ||{Y_i}{H_i}||_F^2 = ||YSH||_F^2 = Tr({H^T}{S^T}{Y^T}YSH) = Tr({H^T}{S^T}SH) \end{aligned}$$
(13)

Then we minimize the reconstruction error \(\min ||{E_i}||_F^2 = \min Tr({H^T}{S^T}SH)\). The Y in Eq. (13) is formed by the d eigenvectors corresponding to the \(2^{nd}\) to the \((d+1)^{th}\) smallest eigenvalues of the matrix \(D = H^T S^T S H\), where \(S = [S_1, \cdots , S_N]\) and \(H = diag(H_1, \cdots , H_N)\).

In order to solve for the eigenvalues and eigenvectors of D locally and linearly, we construct \( U_i \) from the eigenvectors \( u_1 , ... , u_d \) of B: \(U_i = [ e/\sqrt{k} ,u_1 , \cdots , u_d ] \); then \( H_i = I - U_i U_i^T\). After new data points arrive, we update the incremental alignment matrix \(D({I_i},{I_i}) \leftarrow D({I_i},{I_i}) + I - {U_i}U_i^T\), and then compute the \(2^{nd}\) to the \((d+1)^{th}\) smallest eigenvalues and the corresponding eigenvectors \([t_2, \cdots , t_{d+1}]\) of D, which give the optimal low-dimensional global coordinates of the data points.
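A short sketch of this alignment update is given below (numpy, names illustrative); each neighborhood contributes \(I - U_iU_i^T\) to the block of D indexed by \(I_i\).

```python
import numpy as np

def accumulate_alignment(D, nbr_idx, U_cols):
    """One incremental alignment update D(I_i, I_i) += I - U_i U_i^T.

    D: (N, N) alignment matrix (updated in place);
    nbr_idx: (k,) integer index set I_i; U_cols: (k, d) vectors u_1..u_d."""
    k = nbr_idx.shape[0]
    Ui = np.hstack([np.ones((k, 1)) / np.sqrt(k), U_cols])   # [e/sqrt(k), u_1..u_d]
    D[np.ix_(nbr_idx, nbr_idx)] += np.eye(k) - Ui @ Ui.T
    return D
```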

4 Implementation of LITSA

Given a dataset of N m-dimensional points \(X = [x_1 , \cdots ,x_N]\) sampled from a manifold in \({\mathbb {R}^m}\) and a new point \(x_n\) to be inserted into X, the proposed algorithm projects these points into a low-dimensional space, providing the d-dimensional \((d < m)\) coordinates \({y_1},...,{y_N}\) and the new point's low-dimensional mapping \(y_n\) in an incremental way.

Step 1. Update new covariance matrix.

  • Determine k nearest neighbours of each point \( x_i \) (Euclidean distance) and compose the matrix \(X_i\), \( i = 1, ... , N \);

  • Construct the covariance matrix \( C_n \) in Eq. (2) iteratively, which represents the incremental tangent space of each data point;

  • Construct matrix A and B;

  • Compute the eigenvectors \( u_1 , ... , u_d \) corresponding to the d smallest eigenvalues of B;

  • Set \( U_i = [ e/\sqrt{k} , u_1 , \cdots , u_d ] \).

Step 2. Construct alignment matrix in incremental form.

Let \( I_i = \lbrace i_1 , \cdots , i_k \rbrace \) be the index set of the k nearest neighbors of data point \( x_i \), and compute the alignment matrix in incremental form:

  • Initialize \(D \leftarrow 0\);

  • for \( i = 1 \rightarrow N \) do

  •    \( D({I_i},{I_i}) \leftarrow D({I_i},{I_i}) + I - {U_i}U_i^T \)

  • end for

Step 3. Obtain the low dimensional coordinates.

Perform eigen-decomposition of the matrix D and compute the \(2^{nd}\) to the \((d+1)^{th}\) smallest eigenvalues and the corresponding eigenvectors \([t_2 , \cdots , t_{d+1}]\), which give the optimal low-dimensional global coordinates \({y_1},...,{y_N}\) of every point in the original dataset, including the new point's low-dimensional coordinate \(y_n\).
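Step 3 amounts to a symmetric eigen-decomposition; a minimal sketch (using a dense eigensolver for clarity, names illustrative) is:

```python
import numpy as np

def global_coordinates(D, d):
    """Eigen-decompose the alignment matrix D and return the eigenvectors of
    the 2nd to the (d+1)-th smallest eigenvalues as global coordinates; the
    smallest eigenvalue belongs to the trivial constant eigenvector."""
    eigvals, eigvecs = np.linalg.eigh(D)     # eigenvalues in ascending order
    return eigvecs[:, 1:d + 1]               # rows are y_1, ..., y_N (and y_n)
```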

5 Experiments

5.1 Experiments on Olivetti-Faces

In this section we apply LTSA and the proposed LITSA algorithm to the Olivetti-faces dataset, which contains 400 face images with \( 64\times 64 \) pixels. Some samples are shown in Fig. 1(a). One of the images is randomly selected for testing, while the rest serve as training samples. Figure 1(b) shows the dimension reduction results of the training samples by LITSA.

The horizontal axis represents the facial expression changing from unhappy to happy, and the vertical axis represents the change from faces without glasses to faces wearing glasses. The test images shown in Fig. 1(a) are projected into the low-dimensional space accurately according to their characteristics, since images with similar expressions and characteristics cluster together. This experiment intuitively shows the effectiveness of LITSA as a manifold learning dimension reduction algorithm that accurately detects and preserves the intrinsic structure of the original high-dimensional data.

Fig. 1. (a) Some test images, (b) The unfolding results of the Olivetti-faces dataset by LITSA

Dimension reduction is often used as a preprocessing step for classification in machine learning, so it plays an important role in such classification tasks. In this section we evaluate the classification performance after dimension reduction by several incremental algorithms on the Olivetti-faces dataset, where 40 facial images, including 5 images of each individual, are randomly selected for testing and the rest are used for training. The experiment is repeated 10 times. After reducing the dataset to different dimensions, we classify the data using the k-Nearest Neighbors classifier (k-NN, with the number of neighbors k set to 5). From the classification accuracy rates shown in Fig. 2 we observe that LITSA detects the intrinsic structure of the input manifold well and achieves better classification precision than the other incremental algorithms. This indicates that our algorithm plays a positive role in the subsequent classification learning, which is of vital significance in applications such as pattern recognition and image processing.
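For reference, the evaluation protocol can be sketched as below with scikit-learn, where `Y_train`/`Y_test` are the low-dimensional embeddings produced by whichever algorithm is being compared (all names are illustrative, not part of the original experiments' code).

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

def knn_accuracy(Y_train, labels_train, Y_test, labels_test, k=5):
    """Classify low-dimensional embeddings with a k-NN classifier (k = 5),
    as in the Olivetti-faces classification experiment."""
    clf = KNeighborsClassifier(n_neighbors=k)
    clf.fit(Y_train, labels_train)
    return accuracy_score(labels_test, clf.predict(Y_test))
```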

Fig. 2. Accuracy results of kNN classification after dimensional reduction on 10 dimensions

5.2 Local Reconstruction Performance on Rendered Face

In this section we consider experimental results on the Rendered face dataset [1], which contains 698 facial sculpture images with \(64 \times 64\) pixels. The facial images have two groups of pose parameters, varying from up to down and from left to right.

We compare the reconstruction performance of the proposed LITSA algorithm with that of the other incremental algorithms on the Rendered face dataset. To recover the implicit parameters, we use a matrix P to represent the pose parameters. By applying an affine transformation to the dimension reduction results T of the algorithms mentioned above, we obtain the affine transformation expression \(Y = c{e^T} + LT\) and define \(||P - Y||_F^2 = \mathop {\min }\limits _{c,L} ||P - (c{e^T} + LT)||_F^2\). The relative reconstruction error is then measured by \(error = \frac{{||P - Y||_F^2}}{{||P||_F^2}}\).
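A sketch of this error measure is shown below, assuming P holds one pose-parameter column per image and T the corresponding embedding columns (illustrative names); the affine fit \(Y = ce^T + LT\) is obtained by one least-squares solve.

```python
import numpy as np

def reconstruction_error(P, T):
    """Affine-fit the embedding T to the pose parameters P and return
    ||P - Y||_F^2 / ||P||_F^2 with Y = c e^T + L T (least-squares c, L)."""
    n = T.shape[1]
    T_aug = np.vstack([T, np.ones((1, n))])            # append the e^T row
    LC, *_ = np.linalg.lstsq(T_aug.T, P.T, rcond=None) # solve for [L, c] jointly
    Y = (T_aug.T @ LC).T
    return np.linalg.norm(P - Y, 'fro') ** 2 / np.linalg.norm(P, 'fro') ** 2
```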

The reconstruction errors of IIsomap, ILTSA, ILLE, ILE and LITSA for various neighborhood sizes k are shown in Table 1, in which LITSA keeps the minimum reconstruction error. The reconstruction error of IIsomap after dimension reduction already exceeds the comparison range of the other algorithms, which indicates that our method holds a leading position among these algorithms, keeping the reconstruction error in a small range.

To sum up, through the experiments in this section on local reconstruction performance and the reconstruction error measure, the LITSA algorithm shows obvious superiority in both the intuitive and the quantitative sense.

Table 1. Reconstruction error computed by IIsomap, ILTSA, ILLE, ILE and LITSA

6 Conclusion

This paper proposes a new incremental manifold learning algorithm named LITSA based on incremental tangent space construction. Compared with the LTSA algorithm and other incremental manifold learning methods, the innovations of this algorithm can be summarized in three aspects: (1) A novel formulation \(M_{new}\) is proposed in order to insert the new point's geometric information into the existing sample points' covariance matrix. (2) We construct the incremental expression of \(C_{new}\) by inserting \(({X_{new}} - {M_{new}})\). (3) Inspired by the idea of [15], after constructing the smaller inner-product matrix \(B=A^TA\), we can immediately obtain the local coordinates of the dataset with the newly inserted sample point by computing the eigenvectors and eigenvalues of the matrix B. Extensive experiments have been performed to evaluate our algorithm's performance on both synthetic and real-world datasets, demonstrating its dimension reduction effectiveness and recognition accuracy on large-scale datasets.