Abstract
Face recognition is an active topic in recognition systems, and face occlusion is one of its most challenging problems. Recently, robust sparse coding has achieved state-of-the-art performance, especially when dealing with occluded images. However, robust sparse coding only guarantees that the coding coefficients are globally sparse. In this paper, we drive the sum of coefficients in the object region toward the global maximum by fitting the distribution of coefficients observed in the object region of successfully recognized images. The efficacy of the proposed approach is verified on publicly available databases. Furthermore, our method achieves much better performance when the training samples are limited.
5.1 Introduction
Recently, face recognition (FR) approaches based on linear representations have led to state-of-the-art performance [1]. The most representative approach is sparse representation-based classification (SRC) [1]. However, SRC is not robust to contiguous occlusion such as sunglasses and scarves, so many related works [2–4] have been developed. In particular, the robust sparse coding (RSC) method [2] has achieved very good performance in FR under various occlusions. To construct a more robust model for sparse coding of face images, RSC finds a maximum-likelihood estimate of the coding coefficients and obtains high recognition accuracy; however, like SRC, it is very time-consuming. Recently, Yang et al. [5] proposed a fast and robust FR method that is faster than both SRC and RSC.
Motivated by the recent success of RSC, we propose a novel approach that improves recognition rates by approximating the ideal sparse coefficients.
5.2 Related Works
Wright et al. [1] proposed a sparse representation-based classification scheme for FR. Let \( D = [D_{1} ,D_{2} , \ldots ,D_{s} ] \) be the set of training samples from all \( s \) classes, where \( D_{i} \) is the subset of training samples from class \( i \). Denote by \( y \) a testing sample. In SRC, \( y \) is sparsely coded over \( D \) via \( l_{1} \)-minimization:
$$ \hat{\alpha } = \mathop {\arg \hbox{min} }\limits_{\alpha } \{ ||y - D\alpha ||_{2}^{2} + \lambda ||\alpha ||_{1} \} $$(5.1)
where \( \lambda \) is a scalar constant and \( \alpha \) is the coding vector of \( y \) over \( D \).
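The \( l_{1} \)-minimization above can be solved with standard proximal methods. Below is a minimal sketch assuming an ISTA (iterative soft-thresholding) solver, not the particular optimizer used in [1]:

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the l1 norm: shrinks each entry toward zero.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code(D, y, lam=0.01, n_iter=1000):
    """Solve min_a ||y - D a||_2^2 + lam * ||a||_1 by ISTA."""
    # Step size from the Lipschitz constant of the gradient (2 * sigma_max^2).
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - y)          # half the true gradient
        a = soft_threshold(a - grad / L, lam / (2.0 * L))
    return a
```

The vector returned here plays the role of \( \alpha \) in SRC; FISTA or a homotopy solver would converge faster, but the fixed point is the same.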
Yang et al. [2] proposed RSC, which starts from the LASSO formulation of this problem:
$$ \mathop {\hbox{min} }\limits_{\alpha } ||y - D\alpha ||_{2}^{2} \quad {\text{s}} . {\text{t}} .\quad ||\alpha ||_{1} \le \sigma $$(5.2)
where \( \sigma > 0 \) is a constant. They used a weighted regression function to measure the representation residual. The final model can be written as:
$$ \mathop {\hbox{min} }\limits_{\alpha } \{ ||W^{1/2} (y - D\alpha )||_{2}^{2} + \lambda ||\alpha ||_{1} \} $$(5.3)
where \( W \) is a diagonal matrix whose elements are the weights assigned to the pixels of the test image \( y \).
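The weighted model can be optimized by alternating between coding and re-weighting. A sketch of this loop, assuming a logistic weight function of the squared residual (the constant mu and the threshold delta below are illustrative choices, not the paper's exact schedule):

```python
import numpy as np

def _ista(D, y, lam, n_iter=800):
    # Plain ISTA for min_a ||y - D a||_2^2 + lam ||a||_1.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - y) / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / (2.0 * L), 0.0)
    return a

def robust_code(D, y, lam=1e-3, mu=8.0, n_outer=4):
    """Alternate l1 coding with logistic re-weighting of each pixel."""
    w = np.ones(len(y))                      # start from uniform weights
    for _ in range(n_outer):
        sw = np.sqrt(w)
        # Fold W^(1/2) into the data so the weighted problem becomes
        # an ordinary l1-regularized least-squares problem.
        a = _ista(D * sw[:, None], y * sw, lam)
        e2 = (y - D @ a) ** 2                # per-pixel squared residual
        delta = np.percentile(e2, 80)        # residual threshold (assumed)
        w = 1.0 / (1.0 + np.exp(mu * (e2 - delta)))   # outliers -> weight ~ 0
    return a, w
```

Pixels whose residual stays far above delta (e.g., occluded regions) receive weights near zero and stop influencing the coding.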
5.3 Proposed Method
5.3.1 Motivation
In sparse coding, the ideal sparse coefficients \( \alpha \) for a testing sample of class \( i \) can be described as follows:
$$ \alpha = [0, \ldots ,0,\alpha_{i,1} ,\alpha_{i,2} , \ldots ,\alpha_{{i,k_{i} }} ,0, \ldots ,0]^{T} $$(5.4)
where \( k_{i} \) is the number of training samples in class \( i \). RSC uses \( l_{1} \)-regularized least squares [6] to solve for the sparse coefficients. Although this achieves a good sparse representation, it only ensures that the coefficients are globally sparse. In practice, we want the nonzero elements of the sparse coefficients \( \alpha \) to be concentrated in the object region as far as possible, with the remaining values as small as possible. Since the dictionary atoms are labeled, we can easily identify the location of the object region, which makes it possible to control the distribution of the sparse coefficients. We now analyze how RSC solves the sparse coefficients. In RSC, when the testing image \( y \) belongs to class \( i \), the solution takes the form:
$$ \alpha^{y \in i} = [\alpha_{1,1} , \ldots ,\alpha_{{1,k_{1} }} , \ldots ,\alpha_{i,1} , \ldots ,\alpha_{{i,k_{i} }} , \ldots ,\alpha_{s,1} , \ldots ,\alpha_{{s,k_{s} }} ]^{T} $$(5.5)
where \( \alpha_{{1,k_{1} }} , \ldots ,\alpha_{{i,k_{i} }} , \ldots ,\alpha_{{m,k_{m} }} , \ldots ,\alpha_{{s,k_{s} }} \) are the nonzero elements of the coefficient vector \( \alpha^{y \in i} \) and \( s \) is the number of classes. Denote by \( {\text{SOR}}_{x} = \alpha_{x,1} + \alpha_{x,2} + \cdots + \alpha_{{x,k_{x} }} \) the sum of the elements of the coefficient vector in region \( x \), \( x = 1,2, \ldots ,s \).
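Each \( {\text{SOR}}_{x} \) is just a per-class sum over the coefficient vector. A small helper, assuming a labels array (bookkeeping not spelled out in the paper) that records which class each dictionary atom belongs to:

```python
import numpy as np

def sor(alpha, labels):
    """SOR_x: sum of the coefficients whose atoms belong to class x."""
    labels = np.asarray(labels)
    return {int(c): float(alpha[labels == c].sum()) for c in np.unique(labels)}

# Example: 6 atoms, classes 1..3 with two atoms each.
alpha = np.array([0.0, 0.1, 0.6, 0.3, 0.05, 0.0])
labels = [1, 1, 2, 2, 3, 3]
sums = sor(alpha, labels)   # region 2 dominates: SOR_2 = 0.9
```

The region with the largest sum is the candidate object region that the next section tries to make the global maximum.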
5.3.2 Maximum Sparse Coefficients of Object Region
We analyze the distribution of the coefficients \( \alpha \) in the object region for successfully recognized images. Figure 5.1 shows some distributions of elements in object regions. We can see that the values of the nonzero elements in each object region grow approximately exponentially. Considering that \( \alpha < 1 \), we choose the logistic function to fit the distribution of elements in the object region. The basic form of the logistic function is
where \( r \) is the independent variable and \( p \) is a constant. Here, we modify the model in Eq. (5.6) to solve our problem.
The new formula can be described as follows:
where \( a,a_{1} ,a_{2} ,b,b_{1} ,b_{2} ,c,c_{1} ,c_{2} ,p,p_{1} ,p_{2} ,z,z_{1} \), and \( z_{2} \) are constants and \( \alpha_{\text{or}} \) is a coefficient in the object region. From the fits in Fig. 5.2, we identify these parameters as follows: \( a = 2.02 \), \( b = 20.01 \), \( c = 0.40 \), \( p = 8.01 \); \( a_{1} = 4.00 \), \( b_{1} = 20.00 \), \( c_{1} = 0.50 \), \( p_{1} = 5.99 \); \( a_{2} = 3.31 \), \( b_{2} = 10.52 \), \( c_{2} = 0.74 \), \( p_{2} = 4.05 \). For simplicity, we let \( z = 0.2 \) and \( z_{1} = 0.35 \) according to the range of the independent variable. In order to make \( {\text{SOR}}_{i} \) the global maximum, we need to find the maximum \( {\text{SOR}}_{x} \). If \( x \ne i \), we use the following formula to decrease its value:
where \( \alpha_{\text{mr}} \) is a coefficient in the maximum region, \( {\text{SOR}}_{\hbox{max} } \) is the maximum region sum, and \( \Delta {\text{SOR}}_{i} \) is the increment in the object region. Using Eqs. (5.7) and (5.8), we can ensure that \( {\text{SOR}}_{i} \) becomes \( {\text{SOR}}_{\hbox{max} } \). Finally, the sparse coefficients can be described as follows:
where \( \alpha_{\text{rj}} \) is the coefficient in region \( j \), \( j = 1,2, \ldots ,k \), with \( \alpha_{\text{ri}} = \alpha_{\text{or}} \) and \( \alpha_{\text{rs}} = \alpha_{\text{mr}} \).
5.3.3 Algorithm
We summarize the overall procedure of our algorithm as follows:
-
Step 1.
Input: Normalized test sample y, dictionary D = [D 1, D 2,…,D s ]
-
Step 2.
Solve α: \( \mathop {\hbox{min} }\limits_{\alpha } \{ ||W^{1/2} (y - D\alpha )||_{2}^{2} + \lambda ||\alpha ||_{1} \} \)
-
a.
\( j = 1 \); initialize \( W_{j} \) and residual \( e_{j} \)
-
b.
Initialize α
-
c.
Identify \( {\text{SOR}}_{\hbox{max} } \) and \( \alpha_{\hbox{max} } \)
-
d.
If \( ({\text{SOR}}_{i} < {\text{SOR}}_{\hbox{max} } ) \) and \( (\alpha_{i - \hbox{max} } < \alpha_{\hbox{max} } ) \)
$$ \alpha = [\alpha_{r1} ,\alpha_{r2} , \ldots ,L(\alpha_{\text{ri}} ), \ldots ,M(\alpha_{\text{rs}} ), \ldots ,\alpha_{{{\text{rk}} - 1}} ,\alpha_{\text{rk}} ]^{T} $$(5.10)
Else, go to the next step. (\( \alpha_{i - \hbox{max} } \) is the maximal element in the object region.)
-
e.
Update \( \alpha \) by sparse coding via \( l_{1} \)-regularized least squares, then go back to c.
-
f.
Compute residual: \( e_{j} (y) = ||y - D\alpha ||_{2}^{2} \)
-
g.
Update weights: \( W_{j} \)
-
h.
\( j = j + 1 \); go back to b until the maximal number of iterations is reached.
-
Step 3.
Output: the identity of \( y \) as \( {\text{Identity}}(y) = {\text{argmin}}_{i} \{ e_{i} (y)\} \)
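Step 3 is the usual SRC decision rule: reconstruct \( y \) with each class's atoms and coefficients, and pick the class with the smallest residual. A sketch, with a labels array (assumed bookkeeping) mapping dictionary columns to classes:

```python
import numpy as np

def identity(D, labels, alpha, y):
    """Identity(y) = argmin_i ||y - D_i alpha_i||_2 over classes i."""
    labels = np.asarray(labels)
    best, best_err = None, np.inf
    for c in np.unique(labels):
        mask = labels == c
        # Class-wise residual: reconstruct y using only class c's
        # atoms and the corresponding coefficients.
        err = np.linalg.norm(y - D[:, mask] @ alpha[mask])
        if err < best_err:
            best, best_err = int(c), err
    return best
```

If the coefficient mass was successfully concentrated in the object region, the true class reconstructs \( y \) with by far the smallest residual.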
5.4 Experimental Results
In this section, we carry out experiments on benchmark face databases (AR [7] and Extended Yale B [8]) to demonstrate the performance of our algorithm. In all experiments, we compare our method with some popular methods such as CRC_RLS [9], FDDL [10], and RSC [2].
5.4.1 Recognition Without Occlusion
Extended Yale B database: The Extended Yale B database consists of 2414 frontal-face images of 38 individuals. We used the cropped and normalized 45 × 40 face images. Figure 5.3 shows some facial images from the Extended Yale B database. Table 5.1 shows the recognition rates versus feature dimension for CRC_RLS, FDDL, RSC, and our method.
AR database: The AR database consists of over 4,000 frontal images from 126 individuals. Figure 5.4 shows some facial images as examples. The comparison of our method with its competitors is given in Table 5.2. Table 5.3 shows that our method achieves much higher recognition rates than the other three methods for different numbers of training samples. In particular, our method outperforms RSC when there are few training samples.
5.4.2 Recognition with Real Disguise
In this section, we evaluate the robustness of our method to real disguise. We select a subset of the AR database consisting of both neutral and corrupted images. The following three scenarios are considered for performance evaluation: Sunglasses, Scarf, and Sunglasses + Scarf. Figure 5.5 shows one person’s images with different disguises from the training and testing sets. Table 5.4 shows that our recognition rates are higher than those of the other methods. The experimental results also demonstrate the good robustness of our method to face occlusion.
5.5 Conclusion
In this paper, motivated by the recent success of RSC, we present a novel method to recognize face images with illumination variations and occlusion. We add constraints and use the logistic function to optimize the sparse coefficients. Extensive experimental results on benchmark face databases show that the proposed method is effective and robust. Compared with RSC, our method achieves higher recognition rates, even when the training samples are limited.
References
Wright J, Yang AY, Ganesh A et al (2009) Robust face recognition via sparse representation. IEEE Trans Pattern Anal Mach Intell 31(2):210–227
Yang M, Zhang L, Yang J et al (2013) Regularized robust coding for face recognition. IEEE Trans Image Process 22(5):1753–1766
Ou WH, You XG, Tao DC et al (2014) Robust face recognition via occlusion dictionary learning. Pattern Recogn 47(4):1559–1572
Li HJ, Nobile N, Suen CY (2014) Face recognition based on discriminative dictionary with multilevel feature fusion. Lect Notes Artif Intell 8774:252–263
Yang M, Feng ZZ, Shiu SCK et al (2014) Fast and robust face recognition via coding residual map learning based adaptive masking. Pattern Recogn 47(2):535–543
Goldstein T, Osher S (2009) The split Bregman method for L1-regularized problems. SIAM J Imaging Sci 2(2):323–343
Martinez A, Benavente R (1998) The AR face database. Technical report 24, CVC
Georghiades A, Belhumeur P, Kriegman D (2001) From few to many: illumination cone models for face recognition under variable lighting and pose. IEEE Trans Pattern Anal Mach Intell 23(6):643–660
Zhang L, Yang M, Feng X (2011) Sparse representation or collaborative representation: which helps face recognition? In: Proceedings of ICCV, pp 471–478
Yang M, Zhang L, Feng X et al (2014) Sparse representation based fisher discrimination dictionary learning for image classification. Int J Comput Vision 109(3):209–232
Acknowledgments
This work is supported by the National Natural Science Foundation of China (No. 61171077), the University Science Research Project of Jiangsu Province (No. 12KJB510025), the Nantong University Undergraduate Training Program for Innovation (No. 2013067), and the Natural Sciences and Engineering Research Council of Canada.
© 2015 Springer-Verlag Berlin Heidelberg
Xu, Z., Li, H., Jin, X., Suen, C.Y. (2015). Face Recognition Based on Maximum Sparse Coefficients of Object Region. In: Deng, Z., Li, H. (eds) Proceedings of the 2015 Chinese Intelligent Automation Conference. Lecture Notes in Electrical Engineering, vol 336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46469-4_5