1 Introduction

Because of the importance of face recognition, face recognition techniques have been widely studied in pattern recognition [1, 2], image processing [3], and machine learning [4, 5]. Over the last decade, many algorithms have been proposed for face recognition, such as Fisherface [6], Eigenface [7], and graph manifold learning [8]. In face recognition, appearance-based methods represent an image of size n × m by a vector in the n × m dimensional space and then produce lower-dimensional features of the face image for better classification [9]. In face recognition applications, these n × m dimensional spaces are too large to allow robust and fast recognition. Dimensionality reduction is a common way to address this problem, and many dimensionality reduction methods have been proposed to produce lower-dimensional features, such as principal component analysis (PCA) [10], linear discriminant analysis (LDA) [11], and 2D-PCA [12]. PCA performs dimensionality reduction by projecting the original data onto a much lower-dimensional linear subspace spanned by the leading eigenvectors of the covariance matrix of the original data. LDA searches for the projection axes on which data points of the same class are close to each other while data points of different classes are far from each other. The basic idea of 2D-PCA is to use 2D matrices directly to represent face images, which improves computational efficiency and increases the face recognition rate because of the preservation of the image structural information.

Sparsity has been a useful principle in neuroscience, information theory, and signal processing over the past few decades [13–15]. Sparse representation encodes an image parsimoniously, using only a small number of atoms chosen from an overcomplete dictionary. It has been developed in computer vision and pattern recognition with promising results [16]. The sparse representation classifier (SRC) [17] has been successfully used for robust face recognition. SRC is a nonparametric learning method similar to nearest neighbor (NN) [18] and nearest subspace (NS) [19].

The basic idea of SRC is to represent a test sample as a linear combination of all training samples and then classify the test sample into the class with the minimal reconstruction error. SRC uses \(l_{1}\) minimization instead of \(l_{0}\) minimization to seek the sparse solution, for computational convenience and efficiency.

Owing to the effectiveness of SRC in face recognition, many extensions have been proposed. In [20], Gao et al. proposed kernel sparse representation, which performs sparse coding in a mapped high-dimensional feature space. In [21], the authors considered a more general case in which the features lie in a union of low-dimensional linear subspaces. He et al. [22] incorporated a non-negativity constraint into the sparse graph to learn the probabilistic latent clustering relationship between data points. Yang et al. proposed robust sparse representation [23].

Despite the many applications of SRC, \(l_{1}\) minimization does not always yield a sufficiently sparse solution. In [24], the authors proposed \(l_{p}\) (0 < p < 1) sparse representation-based classification (\(l_{p}\)-SRC) to seek the optimal sparse representation of a test image. However, how to choose the optimal parameter p is still an open problem. If we could solve the sparse system directly, without iterative steps, we could save much time and improve the classification accuracy.

In this paper, we propose a novel method, named enhancing sparsity via full rank decomposition (ES-FRD), for face recognition. As in sparse representation, it first represents the test sample as a linear combination of the training data; it then performs a full rank decomposition of the training data matrix. From this decomposition, we can obtain the generalized inverse of the training data matrix and solve the general solution of the linear equation directly. To obtain the optimum solution for representing the test sample, we use the least squares method. We classify the test sample into the class with the minimal reconstruction error. The contributions of the proposed method are as follows:

1. Computational efficiency. By introducing the general solution of the linear equation, we obtain the optimum solution directly. Since no norm minimization needs to be solved, the proposed method is more efficient than SRC.

2. Closed-form solution. To our knowledge, the state-of-the-art SRC algorithms have no closed-form solution to the norm minimization problem, whereas the proposed method admits one.

3. Sparser solutions. Our method introduces full rank decomposition, which factorizes the image database into two low-rank matrices. This helps to obtain a much sparser solution.

The rest of this paper is organized as follows: We review the related works on sparse representation in Sect. 2. In Sect. 3, we give the details of the proposed method. Section 4 gives the experimental results on three public face image data sets. Finally, we conclude the paper in Sect. 5.

2 Related work

In this section, we briefly review SRC and \(l_{p}\) (0 < p < 1)-SRC.

2.1 Sparse representation classification (SRC)

Assume that there are n training samples from c object classes and let \(A = [A_{1} ,A_{2} , \ldots ,A_{c} ] \in R^{m \times n}\) denote the entire training set, where m is the dimension of the samples and \(A_{i}\) (i = 1, 2, …, c) is the set of training samples from the ith object class. Given a test sample \(y \in R^{m}\) of the ith class, the goal is to predict the label of y from the training samples of the c classes. The linear representation of y can be written in terms of all training samples A as [17]

$$y = Ax$$
(1)

where x is the vector of coefficients. If the test sample y belongs to the ith class, then the entries of x are expected to be zero except for those associated with the ith class.

In SRC, the problem of finding the coefficient vector is formulated as a convex programming problem

$$\mathop {\hbox{min} }\limits_{x} \left\| x \right\|_{1} \quad {\text{subject to}}\quad y = Ax$$
(2)

where \(\left\| \cdot \right\|_{1}\) denotes the \(l_{1}\) norm. The sparsity of the coefficient vector can be measured by the \(l_{0}\) norm; however, \(l_{0}\) norm minimization is an NP-hard problem. Recent studies [25, 26] show that if the solution x is sparse enough, \(l_{1}\) minimization can be employed to seek the sparse solution, which is computationally convenient and efficient.

After obtaining x from Eq. (2), the class reconstruction residual is used to design the sparse representation-based classifier (SRC). For each class i, let \(\delta_{i} :R^{n} \to R^{n}\) be the characteristic function that selects the coefficients associated with the ith class. For \(x \in R^{n}\), \(\delta_{i} (x) \in R^{n}\) is a new vector whose nonzero entries are the entries of x that are associated with class i. The test sample y can be approximated by \(\hat{y}_{i} = A\delta_{i} (x)\), which uses only the coefficients associated with class i. The reconstruction residual for class i is defined as:

$$r_{i} (y) = \left\| {y - \hat{y}_{i} } \right\|_{2}$$
(3)

We classify the given test sample y to the class i associated with the minimal reconstruction residual. We give the algorithm of SRC as follows.

Algorithm of SRC

1. Input: The training data matrix \(A = [A_{1} ,A_{2} , \ldots ,A_{c} ] \in R^{m \times n}\) for c classes, a test sample y

2. Solve the \(l_{1}\) minimization:

\(\mathop {\hbox{min} }\limits_{x} \left\| x \right\|_{1} \quad {\text{subject to}}\quad Ax = y\)

3. Then compute the residuals \(r_{i} (y) = \left\| {y - \hat{y}_{i} } \right\|_{2}\), \(i = 1,2, \ldots ,c\)

4. Output the identity of y as: Identity(y) = \(\arg \min_{i} \{ r_{i} (y)\}\)
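To make the SRC procedure above concrete, the following is a minimal Python sketch (not the authors' implementation). It recasts the equality-constrained \(l_{1}\) problem as a linear program and assumes SciPy's linprog, a column-wise class label array, and l2-normalized columns of A.

```python
# A minimal sketch of SRC. The constrained l1 problem  min ||x||_1  s.t.  Ax = y
# is recast as a linear program over [x; u] with |x_i| <= u_i.
import numpy as np
from scipy.optimize import linprog


def src_classify(A, y, class_labels):
    """A: m x n matrix whose l2-normalized columns are training samples,
    y: m-dimensional test sample, class_labels: length-n array of class indices."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(n)])            # minimize sum_i u_i
    A_eq = np.hstack([A, np.zeros((m, n))])                   # Ax = y (u not involved)
    A_ub = np.vstack([np.hstack([np.eye(n), -np.eye(n)]),
                      np.hstack([-np.eye(n), -np.eye(n)])])   # encodes |x_i| <= u_i
    bounds = [(None, None)] * n + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n),
                  A_eq=A_eq, b_eq=y, bounds=bounds)
    x = res.x[:n]
    classes = np.unique(class_labels)
    residuals = []
    for i in classes:                                         # r_i(y) = ||y - A delta_i(x)||_2
        delta_i = np.where(class_labels == i, x, 0.0)
        residuals.append(np.linalg.norm(y - A @ delta_i))
    return classes[int(np.argmin(residuals))]                 # identity(y) = arg min_i r_i(y)
```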

2.2 \(l_{p}\) (0 < p < 1) sparse representation for face recognition

The optimization problem of \(l_{1}\) minimization cannot always yield the sparsest solution. Recently, the \(l_{p}\) (0 < p < 1) norm has been used as an alternative to the \(l_{0}\) norm for sparse signal recovery. The \(l_{p}\) (0 < p < 1) sparse representation-based classification seeks the optimal sparse representation of a test image by choosing the most suitable parameter p. The \(l_{p}\) minimization problem is as follows:

$$\mathop {\hbox{min} }\limits_{x} \left\| x \right\|_{p} \quad {\text{subject to}}\quad Ax = y$$
(4)

The authors first proposed an iterative algorithm for solving the non-convex system (4) in [24].

An iterative algorithm for \(l_{p}\) minimization (0 < p < 1)

Step 1: Initialize the iteration count t = 0 and the coding coefficients \(x_{i}^{0} = 1,\ i = 1,2, \ldots ,n\)

Step 2: Update the coding vector \(x^{t + 1}\) by solving the weighted \(l_{1}\) minimization problem

$$x^{t + 1} = \arg \hbox{min} \sum\limits_{i = 1}^{n} {\frac{{\left| {x_{i} } \right|}}{{(\left| {x_{i}^{t} } \right| + \mu_{t} )^{1 - p} }}} \quad {\text{subject to}}\quad Ax = y$$
(5)

Step 3: Terminate on convergence or when the maximal number of iterations \(t_{\max }\) is reached; otherwise, let t = t + 1 and go to Step 2

The solution of the \(l_{p}\) minimization is sought by an iterative \(l_{1}\) minimization algorithm. In the first step, the coding coefficients are initialized as \(x_{i}^{0} = 1,\) i = 1, 2, …, n. Step 2 solves a weighted \(l_{1}\) minimization in which the weights \(w_{i}^{t + 1} = 1/(\left| {x_{i}^{t} } \right| + \mu_{t} )^{1 - p}\) (\(i = 1,2, \ldots ,n\)) depend on the solution of the previous iteration. The weights in Eq. (5) relate inversely to the magnitudes of the coefficients, so the \(l_{p}\) minimization can partially counteract the influence of the coefficient magnitude on the \(l_{1}\) penalty function [27].
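To make the reweighting scheme concrete, here is a minimal Python sketch of the iteration (a sketch under assumptions, not the reference implementation). The weighted \(l_{1}\) subproblem of Eq. (5) is solved with the same linear-programming reduction used for SRC above.

```python
# A sketch of the iterative reweighted-l1 scheme for l_p (0 < p < 1) minimization.
# The weighted l1 subproblem  min sum_i w_i*|x_i|  s.t.  Ax = y  is solved as an LP.
import numpy as np
from scipy.optimize import linprog


def weighted_l1(A, y, w):
    m, n = A.shape
    c = np.concatenate([np.zeros(n), w])                      # objective: sum_i w_i * u_i
    A_eq = np.hstack([A, np.zeros((m, n))])                   # Ax = y
    A_ub = np.vstack([np.hstack([np.eye(n), -np.eye(n)]),
                      np.hstack([-np.eye(n), -np.eye(n)])])   # |x_i| <= u_i
    bounds = [(None, None)] * n + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(2 * n),
                  A_eq=A_eq, b_eq=y, bounds=bounds)
    return res.x[:n]


def lp_sparse_code(A, y, p=0.5, mu=1e-3, t_max=20, tol=1e-6):
    x = np.ones(A.shape[1])                                   # Step 1: x_i^0 = 1
    for _ in range(t_max):                                    # Step 3: at most t_max iterations
        w = 1.0 / (np.abs(x) + mu) ** (1.0 - p)               # weights from the previous iterate
        x_new = weighted_l1(A, y, w)                          # Step 2: weighted l1 minimization
        if np.linalg.norm(x_new - x) < tol:                   # Step 3: convergence check
            return x_new
        x = x_new
    return x
```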

The residuals are computed by

$$r_{i} = \left\| {y - A_{i} \hat{x}_{i} } \right\|_{2}$$
(6)

The test sample y is classified into the object class that has the minimum residual.

The state-of-the-art sparse representation algorithms use norm minimization to obtain the coefficient vector. Among \(l_{p}\) (0 < p < 1) minimization, \(l_{1}\) minimization, and \(l_{2}\) minimization, it is still unknown which one yields the sparsest solution, and which norm minimization is best suited to sparse representation remains an open problem in theory. In the next section, by introducing the full rank decomposition of the dictionary matrix, we solve the sparse representation system directly, precisely, and efficiently.

The algorithm of \(l_{p}\) (0 < p < 1) SRC is as follows.

Algorithm of \(l_{p}\) (0 < p < 1) SRC

1. Input: The training data matrix \(A = [A_{1} ,A_{2} , \ldots ,A_{c} ] \in R^{m \times n}\) for c classes, a test sample y and the error tolerance \(\varepsilon \ge 0\)

2. Solve the following \(l_{p}\) minimization problem

\(\mathop {\hbox{min} }\limits_{x} \left\| x \right\|_{p}\) subject to \(Ax = y\)

3. Then compute the residuals \(r_{i} = \left\| {y - A_{i} \cdot \hat{x}_{i} } \right\|_{2}\), \(i = 1,2, \ldots ,c.\)

4. Output the identity of y as: Identity(y) = \(\arg \min_{i} \left\{ {r_{i} } \right\}\)

3 The proposed method

In this section, we present the proposed method and analyze it. The main steps of the proposed method are as follows. First, we approximately obtain the full rank decomposition of the training data matrix. Second, we solve the general solution of Eq. (1) and calculate the minimum-residual solution. Finally, we classify the test sample into the class that has the minimal residual.

3.1 Full rank decomposition and general solution of the linear equation in our method

For the linear equation y = Ax, no matter what the dimensions of A are, a general solution can always be written down. We present the details of obtaining the general solution of a linear equation as follows.

Definition 1 (full rank decomposition)

Let \(A \in R^{m \times n}\) be a matrix of rank r. If A = FG with \(F \in R^{m \times r}\), \(G \in R^{r \times n}\), and rank(F) = rank(G) = r, then A = FG is a full rank decomposition of the matrix A.
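For instance, the rank-one matrix below admits a full rank decomposition with r = 1:

$$\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}\begin{pmatrix} 1 & 2 \end{pmatrix} = FG,\quad {\text{rank}}(F) = {\text{rank}}(G) = 1$$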

By exploiting the full rank decomposition, we can obtain the generalized inverse matrix of A via Eq. (7). Definition 2 gives the generalized inverse matrix of A.

Definition 2 (Generalized inverse matrix)

\(A \in R^{m \times n}\) is a matrix of rank r with full rank decomposition A = FG, where \(F \in R^{m \times r}\) has full column rank and \(G \in R^{r \times n}\) has full row rank. The generalized inverse matrix of A is

$$A^{ - } = G^{T} (F^{T} AG^{T} )^{ - 1} F^{T}$$
(7)

After obtaining the generalized inverse matrix of A, we can compute a general solution of the linear equation y = Ax by Eq. (8). Definition 3 gives the general solution of a linear equation.

Definition 3 (General solution of linear equation)

If \(A^{ - } \in R^{n \times m}\) is a generalized inverse matrix of \(A \in R^{m \times n}\), then the general solution of the linear equation y = Ax is

$$x = A^{ - } y + (I - A^{ - } A)z,$$
(8)

where \(z \in R^{n \times 1}\) is an arbitrary vector.
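As a numerical illustration of Definitions 1–3, the following Python sketch computes one particular full rank decomposition via the thin SVD (an assumption of this sketch; any rank-r factorization satisfying Definition 1 works), the generalized inverse of Eq. (7), and the general solution of Eq. (8).

```python
# A numerical sketch of Definitions 1-3; the thin SVD is used here only as one
# convenient way to obtain a full rank decomposition A = FG.
import numpy as np


def full_rank_decomposition(A, tol=1e-10):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))          # numerical rank of A
    F = U[:, :r] * s[:r]                     # F in R^{m x r}, full column rank
    G = Vt[:r, :]                            # G in R^{r x n}, full row rank
    return F, G


def generalized_inverse(A):
    F, G = full_rank_decomposition(A)
    # Eq. (7): A^- = G^T (F^T A G^T)^{-1} F^T
    return G.T @ np.linalg.inv(F.T @ A @ G.T) @ F.T


def general_solution(A, y, z=None):
    # Eq. (8): x = A^- y + (I - A^- A) z, where z is an arbitrary vector.
    A_inv = generalized_inverse(A)
    n = A.shape[1]
    z = np.zeros(n) if z is None else z
    return A_inv @ y + (np.eye(n) - A_inv @ A) @ z
```

One can check numerically that the matrix returned by generalized_inverse satisfies \(AA^{ - } A = A\), as required of a generalized inverse.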

3.2 The optimum solution in the proposed method for face recognition

Let \(A = [A_{1} ,A_{2} , \ldots ,A_{c} ] \in R^{m \times n}\) denote the entire training set, in which \(A_{i} \in R^{{m \times n_{i} }}\) contains the training samples of the ith object class and c is the number of classes. For any test sample \(y \in R^{m}\), the linear representation of y can be written as

$$y = Ax$$
(1)

where x is the coefficient vector. By full rank decomposition, we find two matrix factors F and G whose product is an approximation of the matrix A, represented as

$$A = FG$$
(9)

By Eq. (7), we obtain the generalized inverse matrix of A, that is, \(A^{ - } = G^{T} (F^{T} AG^{T} )^{ - 1} F^{T}\). We then obtain the general solution of the linear equation y = Ax by Eq. (8), i.e., \(x = A^{ - } y + (I - A^{ - } A)z\). Because z is an arbitrary vector, this general solution is not necessarily the optimum. We use the least squares method to find the optimum solution of Eq. (1). The details are as follows. Let

$$W = \left\| {y - Ax} \right\|_{2}^{2}$$
(10)

Substituting Eq. (8) into Eq. (10), we obtain

$$\begin{aligned} W & = \left\| {y - Ax} \right\|_{2}^{2} \\ & = (y - Ax)^{T} (y - Ax) \\ & = (y^{T} - x^{T} A^{T} )(y - Ax) \\ & = y^{T} y - 2x^{T} A^{T} y + x^{T} A^{T} Ax \\ & = y^{T} y - 2(A^{ - } y + (I - A^{ - } A)z)^{T} A^{T} y + (A^{ - } y + (I - A^{ - } A)z)^{T} A^{T} A(A^{ - } y + (I - A^{ - } A)z) \\ \end{aligned}$$

Since Eq. (10) is convex and differentiable, any stationary point is a global minimizer. Setting the derivative of W with respect to z to zero, we obtain the following equation

$$W^{\prime}(z) = - 2(I - A^{ - } A)^{T} A^{T} y + 2A^{T} A(I - A^{ - } A)(A^{ - } y + (I - A^{ - } A)z) = 0$$
(11)

We can derive from Eq. (11)

$$z = (I - A^{ - } A)^{ - 1} ((A^{T} A(I - A^{ - } A))^{ - 1} (I - A^{ - } A)^{T} A^{T} y - A^{ - } y).$$
(12)

Substituting the z given by Eq. (12) back into Eq. (8) yields the optimal coefficient vector \(\hat{x}\) for the test sample y.

Then, we calculate the residual of each class by

$$r_{i} (y) = \left\| {y - A_{i} \hat{x}_{i} } \right\|_{2}$$
(13)

where \(\hat{x}_{i}\) denotes the entries of \(\hat{x}\) associated with the ith class. If \(k = \arg \min_{i} \{ r_{i} (y)\} ,i = 1,2, \ldots ,c\), we classify y into the kth class, where c is the number of distinguished classes.

The classification procedure of the proposed method is shown in Algorithm 1.

Algorithm 1 Algorithm of the Proposed Method

1. Input: a set of training samples \(A = [A_{1} ,A_{2} , \ldots ,A_{c} ] \in R^{m \times n}\) for c classes, a test sample \(y \in R^{m}\)

2. Normalize the columns of A

3. Compute the full rank decomposition of A by Eq. (9)

4. Compute the generalized inverse matrix of A by Eq. (7)

5. Compute the general solution of the linear equation y = Ax (1) by Eq. (8)

6. Solve the optimal solution \(\hat{x}\) of Eq. (1) by Eqs. (11) and (12)

7. Compute the residuals \(r_{i} (y) = \left\| {y - A_{i} \hat{x}_{i} } \right\|_{2}\), \(i = 1,2, \ldots ,c\)

8. Output: identity(y) = \(\arg \min_{i} r_{i} (y)\)
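To summarize the procedure, the following is a hedged end-to-end Python sketch of the classification steps in Algorithm 1. As an assumption, it uses the minimum-norm solution \(\hat{x} = A^{ - } y\) of Eq. (8) (see Sect. 3.3) as the coefficient vector rather than evaluating Eq. (12), and class_sizes lists the number of training columns per class in the same order as the columns of A.

```python
# A hedged sketch of the classification procedure in Algorithm 1, using the
# minimum-norm solution x_hat = A^- y as the representation of the test sample.
import numpy as np


def es_frd_classify(A, y, class_sizes):
    A = A / np.linalg.norm(A, axis=0, keepdims=True)     # step 2: normalize columns of A
    # Steps 3-4: A^- of Eq. (7); for a full rank decomposition A = FG this equals
    # the Moore-Penrose pseudoinverse, so np.linalg.pinv is used here for brevity.
    A_inv = np.linalg.pinv(A)
    x_hat = A_inv @ y                                     # minimum-norm solution of y = Ax
    residuals, start = [], 0
    for n_i in class_sizes:                               # step 7: class-wise residuals
        A_i = A[:, start:start + n_i]
        x_i = x_hat[start:start + n_i]
        residuals.append(np.linalg.norm(y - A_i @ x_i))
        start += n_i
    return int(np.argmin(residuals))                      # step 8: identity of y (0-based index)
```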

Figure 1 describes the flowchart of the proposed method.

Fig. 1 Flowchart of the proposed method

3.3 Analysis of the proposed method

In this section, we analyze the characteristics, rationale, and potential advantages of our method. Our method differs from SRC and \(l_{p}\) (0 < p < 1) SRC as follows. Our method uses full rank decomposition to represent the training data matrix approximately and then best represents the test sample as a linear combination of the training data; here, "best" means that the residual between the obtained linear combination and the test sample is the smallest. SRC and \(l_{p}\) (0 < p < 1) SRC both use norm minimization to obtain an approximate solution for classifying a new test sample.

The proposed method represents the training data matrix by full rank decomposition and then expresses the test sample as a linear combination of the training data. We use the generalized inverse of the matrix to obtain the minimum-norm solution of the linear equation, and we classify the test sample by evaluating the reconstruction error class by class. Our method can also be viewed as a method that exploits a linear combination of all training samples to represent the test sample and calculates the solution of this linear representation. The underlying rationale is that, for different test samples, the coefficients of the linear representation are different; we can solve the optimal solution of Eq. (8) for the linear representation of each test sample.

The advantages of our work are as follows:

1. The proposed method introduces full rank decomposition, which factorizes the image database into two low-rank matrices. This helps to obtain a much sparser solution.

2. By introducing the general solution of the linear equation, we can find the optimum solution of the sparse system (1). There is no need to solve a norm minimization problem, which leads to a more efficient procedure.

3. To our knowledge, no existing method can solve the SRC coefficients directly; the state-of-the-art SRC algorithms use norm minimization to obtain the coefficient vector. Collaborative representation-based classification (CRC) uses the \(l_{2}\) norm to obtain the solution of the linear equation; it achieves strong performance for face recognition, but it must take the distribution of the data into account [28]. The proposed method solves the linear representation directly and efficiently without considering the data distribution; that is, it can be applied directly to data with any distribution.

4 Experimental results and analysis

We use the FERET [29], ORL [30], and AR [31] databases to evaluate the performance of the proposed method for face recognition. We compare our method with NN, NS, SRC, and \(l_{p}\) (0 < p < 1) SRC. For all of the above learning algorithms, we test the classification performance with feature subspace dimensions of 36, 49, 64, 81, and 100. The parameters are set as in Refs. [17] and [24].
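For orientation, the following Python sketch outlines the evaluation protocol used throughout this section (random per-subject splits, five trials, accuracy averaged per feature dimension). extract_features and classify are hypothetical placeholders standing in for the dimensionality reduction step and for any one of the compared classifiers; they are assumptions of this sketch, not part of the original protocol description.

```python
# A hedged sketch of the evaluation loop: per-subject random splits, five trials,
# accuracy averaged for each feature subspace dimension.
import numpy as np


def evaluate(images, labels, n_train, dims=(36, 49, 64, 81, 100), n_trials=5, seed=0):
    rng = np.random.default_rng(seed)
    results = {}
    for d in dims:
        accs = []
        for _ in range(n_trials):
            train_idx, test_idx = [], []
            for c in np.unique(labels):                       # split each subject separately
                idx = rng.permutation(np.where(labels == c)[0])
                train_idx.extend(idx[:n_train])
                test_idx.extend(idx[n_train:])
            X_tr = extract_features(images[train_idx], d)     # hypothetical feature extractor
            X_te = extract_features(images[test_idx], d)
            pred = [classify(X_tr.T, x, labels[train_idx])    # hypothetical classifier
                    for x in X_te]
            accs.append(np.mean(np.array(pred) == labels[test_idx]))
        results[d] = float(np.mean(accs))                     # averaged accuracy at dimension d
    return results
```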

4.1 Evaluation on the FERET database

The FERET database [29] was acquired without any restrictions imposed on facial expression and with at least two frontal images shot at different times during the same photo session. The image sets used for evaluating face recognition algorithms display diversity across gender, ethnicity, and age. For the FERET face database, we only use a subset made up of 1,400 images from 200 individuals, with each subject providing seven images [32]. We crop and normalize all FERET face images to 40 × 40 pixels. We randomly select different numbers (3, 4, and 5) of images from each subject to construct the training set, and the remaining images make up the test set. Figure 2 shows some example images used in our experiments. Figure 3 shows the results of NN, NS, SRC, \(l_{p}\) (0 < p < 1) SRC, and our method on the FERET face database.

Fig. 2 Example FERET images used in our experiments (images of one subject from the FERET database)

Fig. 3 Face recognition rate on the FERET database. We randomly select three (a), four (b), and five (c) images from each subject to construct the training set and the rest are used for testing. We conduct five trials for each partition and compare the performance of different algorithms based on the averaged accuracy of the five trials on each dimension for each type of the partition

From Fig. 3, the classification accuracy of the proposed method is higher than that of the other comparison methods. In particular, when the data dimension is 81 and three samples per subject are randomly chosen for training, our method achieves an accuracy approximately 13 % higher than that of SRC.

4.2 Evaluation on the ORL database

The ORL database [30] contains images from 40 individuals, each providing 10 different images. All subjects are in an upright, frontal position (with tolerance for some side movement). The size of each face image is 112 × 92, and the resulting standardized input vectors are of dimensionality 10,304. In the experiments, the images are resized to 40 × 40. Figure 4 shows images of the same subject of ORL. Figure 5 shows the results of NN, NS, SRC, \(l_{p}\) (0 < p < 1) SRC, and our method on the ORL face database.

Fig. 4 Images of one subject in ORL

Fig. 5 Face recognition rate on the ORL database. We randomly select four (a), five (b), and six (c) images from each subject to construct the training set and the rest are used for testing. We conduct five trials for each partition and compare the performance of different algorithms based on the averaged accuracy of the five trials on each dimension for each type of the partition

From Fig. 5, the classification accuracy of the proposed method is higher than that of the other comparison methods. Figure 5b shows that when the data dimension is 81 and five samples per subject are randomly chosen for training, our method achieves an accuracy approximately 1.5 % higher than that of SRC.

4.3 Evaluation on the AR database

The AR database [31] consists of more than 4,000 face images of 126 subjects (70 men and 56 women). The database characterizes divergence from ideal conditions by incorporating various facial expressions (neutral, smile, anger, and scream), occlusion modes (sunglasses and scarf), and luminance alterations (left light on, right light on, and all side lights on). Each individual participated in two sessions, separated by 2 weeks (14 days). In the experiments, we used cropped face images of 100 subjects (50 men and 50 women), and we crop and normalize all AR face images to 40 × 40 pixels. We test the robustness of the proposed method on the AR database. The experiments are conducted for variations in facial expression, variations in lighting conditions, and contiguous occlusion.

4.3.1 Variations in facial expressions

We selected a subset that involves variations in facial expressions. Figure 6 shows the images of one subject used for testing variations in facial expression. Figure 6a, e is used for training and the others are used for testing. The number of training samples is 240 and the number of test samples is 720. Figure 7 shows the results for variations in facial expressions.

Fig. 6 Facial expression variation in the AR database. a–d and e–h correspond to two different sessions incorporating neutral, happy, angry, and screaming expressions, respectively

Fig. 7 Face recognition rate for testing variations in facial expression. We conduct five trials for each partition and compare the performance of different algorithms based on the averaged accuracy of the five trials on each dimension for each type of the partition

From Fig. 7, we can see that SRC has better classification performance under facial expression variations than \(l_{p}\) (0 < p < 1) SRC. Our method achieves the best performance among all the comparison methods.

4.3.2 Variations in lighting conditions

We selected images that involve lighting changes on the left, right, and all sides as a subset for testing variations in lighting conditions. Figure 8a, e is used for training and the remaining images of Fig. 8 are used for testing. Thus, the total number of training samples is 240 and the total number of test samples is 720. Figure 9 shows the experimental results for lighting variation on the AR database.

Fig. 8 Lighting variation images in the AR database. a–d and e–h correspond to two different sessions incorporating neutral, left light on, right light on, and all side lights on, respectively

Fig. 9 Face recognition rate for testing variations in lighting condition. We conduct five trials for each partition and compare the performance of different algorithms based on the averaged accuracy of the five trials on each dimension for each type of the partition

From Fig. 9, we can see that our method achieves the best performance among all the comparison methods. SRC has better classification performance under lighting variations than \(l_{p}\) (0 < p < 1) SRC.

4.3.3 Contiguous occlusion

The problem of face identification in the presence of contiguous occlusion is arguably one of the most challenging paradigms in the context of robust face recognition [33]. To test the performance of ES-FRD under contiguous occlusion, we conduct two sets of experiments in this section, testing occlusion by sunglasses and by a scarf, respectively. Figures 10 and 11 show the images used to test sunglasses and scarf occlusion, respectively. Figure 10a, e is used for training and the others are used for testing; thus, the total number of training samples is 240. The experimental scheme for scarf occlusion is the same as that for sunglasses occlusion.

Fig. 10 Images with sunglasses occlusion in the AR database. b–d and f–h are sunglasses occlusion images of one individual

Fig. 11 Images with scarf occlusion in the AR database. b–d and f–h are scarf occlusion images of one individual

Figures 7, 9 and 12 verify the robustness of the proposed method: it achieves better classification performance under variations in facial expression, lighting variations, and contiguous occlusion on the AR database.

Fig. 12 Face recognition rate for testing contiguous occlusion. We conduct five trials for each partition and compare the performance of different algorithms based on the averaged accuracy of the five trials on each dimension for each type of the partition. a Sunglasses occlusion. b Scarf occlusion

4.4 Sparsity evaluation

In this section, we evaluate the sparsity of the proposed algorithm. According to Ref. [34], the sparseness of a vector can be calculated by the equation defined as:

$${\text{sparseness}}(\nu ) = \frac{{\sqrt t - \left( {\sum\nolimits_{i} {\left| {\nu_{i} } \right|} } \right)/\sqrt {\sum {\nu_{i}^{2} } } }}{\sqrt t - 1}$$

where t is the dimensionality of the vector ν. From Table 1, we can see that the proposed method yields sparser solutions than the other comparison methods.
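A small Python helper implementing the sparseness measure above (a sketch; it reproduces only the measure itself, not the experimental protocol behind Table 1):

```python
# Sparseness measure of a coefficient vector: 1 for a vector with a single nonzero
# entry, 0 for a vector whose entries all have equal magnitude.
import numpy as np


def sparseness(v):
    v = np.asarray(v, dtype=float)
    t = v.size
    l1 = np.abs(v).sum()
    l2 = np.sqrt((v ** 2).sum())
    return (np.sqrt(t) - l1 / l2) / (np.sqrt(t) - 1.0)
```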

Table 1 Sparseness of SRC, \(l_{p}\) (0 < p < 1) SRC, and our method on the ORL and FERET databases

5 Conclusion

Sparse representation-based classification (SRC) has been successfully applied to face recognition. SRC seeks the sparsest linear combination of the training samples for any test sample, but solving the norm minimization problem for the coding coefficients is time consuming. In this paper, we propose a novel method for face recognition that does not require solving a norm minimization problem. The proposed method first approximately represents the training data matrix by full rank decomposition and then represents the test sample as a linear combination of the training data. The generalized inverse of the matrix is used to solve the linear equation, and the test sample is classified into the class with the minimum residual. The experimental results suggest that the proposed method achieves higher accuracy for face recognition.