5.1 Introduction

Recently, face recognition (FR) approaches based on linear representations have achieved state-of-the-art performance [1]. The most representative approach is sparse representation-based classification (SRC) [1]. However, SRC is not robust to contiguous occlusion such as sunglasses and scarves, and many related works [2–4] have been developed to address this. In particular, the robust sparse coding (RSC) method [2] achieves very good performance in FR under various occlusions. To construct a more robust model for sparse coding of face images, RSC finds a maximum-likelihood estimate of the coding coefficients and obtains high recognition accuracy; however, like SRC, it is very time-consuming. Recently, Yang et al. [5] proposed a fast and robust FR method that is faster than both SRC and RSC.

Motivated by the recent success of RSC, we propose a novel approach that improves recognition rates by approximating the ideal sparse coefficients.

5.2 Related Works

Wright et al. [1] proposed a sparse representation-based classification scheme for FR. Let \( D = [D_{1} ,D_{2} , \ldots ,D_{s} ] \) be the set of training samples from all s classes, where \( D_{i} \) is the subset of training samples from class i, and let y denote a test sample. In SRC, y is sparsely coded over D via \( l_{1} \)-minimization:

$$ \mathop {\text{argmin}}\limits_{\alpha } \{ ||y - D\alpha ||_{2}^{2} + \lambda ||\alpha ||_{1} \} $$
(5.1)

where λ is a scalar constant and α is the coding vector of y over D.
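The SRC scheme of Eq. (5.1) can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation of [1]: scikit-learn's `Lasso` stands in for the \( l_{1} \) solver, and the dictionary, labels, and `lam` value are assumptions for the example.

```python
# Sketch of SRC: code y over D by l1-regularized least squares, then assign
# the class whose own training columns best reconstruct y.
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(D, labels, y, lam=0.01):
    # Solve argmin_a ||y - D a||_2^2 + lam ||a||_1 (Eq. 5.1, up to scaling)
    alpha = Lasso(alpha=lam, fit_intercept=False, max_iter=5000).fit(D, y).coef_
    classes = np.unique(labels)
    # Class-wise reconstruction residuals: keep only class c's coefficients
    residuals = [np.linalg.norm(y - D @ np.where(labels == c, alpha, 0.0))
                 for c in classes]
    return int(classes[int(np.argmin(residuals))])
```

Note that scikit-learn's `Lasso` scales the data-fit term by the number of rows, so `lam` is not numerically identical to λ in Eq. (5.1); for a sketch of the classification rule this does not matter.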

Yang et al. [2] proposed RSC, which casts the coding problem as a LASSO problem:

$$ \mathop {\hbox{min} }\limits_{\alpha } ||y - D\alpha ||_{2}^{2} \quad {\text{s}} . {\text{t}} .\quad ||\alpha ||_{1} \le \sigma $$
(5.2)

where \( \sigma > 0 \) is a constant. RSC then uses a weighted regression function to measure the representation residual. The final formulation can be written as:

$$ \mathop {\hbox{min} }\limits_{\alpha } \{ ||W^{1/2} (y - D\alpha )||_{2}^{2} + \lambda ||\alpha ||_{1} \} $$
(5.3)

where W is a diagonal matrix whose diagonal elements are the weights assigned to the individual pixels of the test image y.

5.3 Proposed Method

5.3.1 Motivation

In sparse coding, the ideal sparse coefficients α can be described as follows:

$$ \alpha = [0, \ldots ,0,\alpha_{i,1} ,\alpha_{i,2} , \ldots ,\alpha_{{i,k_{i} }} ,0, \ldots ,0]^{T} $$
(5.4)

where \( k_{i} \) is the number of training samples in class i. RSC uses \( l_{1} \)-regularized least squares [6] to solve for the sparse coefficients. Although this method achieves a good sparse representation, it only ensures that the coefficients are globally sparse. In practice, we hope that the nonzero elements of the sparse coefficient vector α are concentrated in the object region as far as possible, while the remaining values are as small as possible. RSC is supervised, so the class labels of the training samples are known; thus we can easily identify the location of the object region, and it is possible to control the distribution of the sparse coefficients. We now analyze how RSC solves for the sparse coefficients. In RSC, when the test image y belongs to class i, the sparse coefficient vector can be written as follows:

$$ \begin{aligned} \alpha^{y \in i} & = [\overbrace {{0_{1,1} ,\alpha_{1,2} \ldots ,\alpha_{{1,k_{1} }} }}^{{{\text{region}}\;1}}, \ldots ,\overbrace {{\alpha_{i,1} ,\alpha_{i,2} , \ldots ,\alpha_{{i,k_{i} }} }}^{{{\text{object}}\;{\text{region}}\;i}}, \ldots \\ & \quad \underbrace {{\alpha_{m,1} ,0_{m,2} , \ldots ,\alpha_{{m,k_{m} }} }}_{{{\text{region}}\;m}}, \ldots ,\underbrace {{0_{s,1} ,\alpha_{s,2} , \ldots ,\alpha_{{s,k_{s} }} }}_{{{\text{region}}\;s}}] \, ^{T} \\ \end{aligned} $$
(5.5)

where \( \alpha_{{1,k_{1} }} , \ldots ,\alpha_{{i,k_{i} }} , \ldots ,\alpha_{{m,k_{m} }} , \ldots ,\alpha_{{s,k_{s} }} \) are nonzero elements of the coefficient vector \( \alpha^{y \in i} \) and s is the number of classes. Denote by \( {\text{SOR}}_{x} = \alpha_{x,1} + \alpha_{x,2} + \cdots + \alpha_{{x,k_{x} }} \) the sum of the elements of the coefficient vector in region x, for x = 1, 2, …, s.
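Given a coefficient vector and the class membership of each dictionary column, the region sums \( {\text{SOR}}_{x} \) can be computed directly. A small sketch (the array names are illustrative):

```python
# Region sums SOR_x of a coefficient vector: labels[j] gives the class
# (region) of dictionary column j, so boolean masking groups coefficients.
import numpy as np

def region_sums(alpha, labels):
    """Return {class: sum of its coefficients} and the class with the
    largest sum (SOR_max in the text)."""
    sums = {int(c): float(alpha[labels == c].sum()) for c in np.unique(labels)}
    x_max = max(sums, key=sums.get)
    return sums, x_max
```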

5.3.2 Maximum Sparse Coefficients of Object Region

We analyze the distribution of the coefficients α in the object region for successful recognitions. Figure 5.1 shows some distributions of elements in object regions. We can see that the values of the nonzero elements in each object region grow approximately exponentially. Considering that \( \alpha < 1 \), we choose the logistic function to fit the distribution of elements in the object region. The basic form of the logistic function is

$$ L(r) = \frac{1}{{1 + p^{ - r} }} $$
(5.6)

where r is the independent variable and p is a constant. We modify the model in Eq. (5.6) to fit our problem.

Fig. 5.1
figure 1

Distribution of coefficients in object regions of different testing images

The new formula can be described as follows:

$$ L(\alpha_{\text{or}} ) = \left\{ {\begin{array}{*{20}r} \hfill {\frac{c}{{1 + p^{{(a - b\alpha_{\text{or}} )}} }},} & \hfill {\hbox{max} (\alpha_{\text{or}} ) < z} \\ \hfill {\frac{{c_{1} }}{{1 + p_{1}^{{(a_{1} - b_{1} \alpha_{\text{or}} )}} }},} & \hfill {z \le \hbox{max} (\alpha_{\text{or}} ) < z_{1} } \\ \hfill {\frac{{c_{2} }}{{1 + p_{2}^{{(a_{2} - b_{2} \alpha_{\text{or}} )}} }},} & \hfill {z_{1} \le \hbox{max} (\alpha_{\text{or}} ) < 1} \\ \end{array} } \right. $$
(5.7)

where \( a,a_{1} ,a_{2} ,b,b_{1} ,b_{2} ,c,c_{1} ,c_{2} ,p,p_{1} ,p_{2} ,z \) and \( z_{1} \) are constants and \( \alpha_{\text{or}} \) is a coefficient in the object region. From the fitting result in Fig. 5.2, we identify these parameters as follows: \( a = 2.02 \), \( b = 20.01 \), \( c = 0.40 \), \( p = 8.01 \); \( a_{1} = 4.00 \), \( b_{1} = 20.00 \), \( c_{1} = 0.50 \), \( p_{1} = 5.99 \); \( a_{2} = 3.31 \), \( b_{2} = 10.52 \), \( c_{2} = 0.74 \), \( p_{2} = 4.05 \). For simplicity, we set \( z = 0.2 \) and \( z_{1} = 0.35 \) according to the range of the independent variable. In order to globally maximize \( {\text{SOR}}_{i} \), we need to find the maximum \( {\text{SOR}}_{x} \). If x ≠ i, we use the following formula to decrease its value:
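Eq. (5.7) with the fitted constants above can be written as a small function. A sketch, assuming the coefficients of the object region are passed as an array:

```python
# Piecewise logistic boost of Eq. (5.7): the branch is selected by the
# largest coefficient in the object region (z = 0.2, z1 = 0.35).
import numpy as np

PARAMS = [  # (a, b, c, p) for the three branches of Eq. (5.7)
    (2.02, 20.01, 0.40, 8.01),
    (4.00, 20.00, 0.50, 5.99),
    (3.31, 10.52, 0.74, 4.05),
]

def boost_object_region(alpha_or, z=0.2, z1=0.35):
    alpha_or = np.asarray(alpha_or, dtype=float)
    m = float(np.max(alpha_or))
    if m < z:
        a, b, c, p = PARAMS[0]
    elif m < z1:
        a, b, c, p = PARAMS[1]
    else:
        a, b, c, p = PARAMS[2]
    return c / (1.0 + p ** (a - b * alpha_or))
```

Each branch is increasing in \( \alpha_{\text{or}} \) and bounded above by its c, which keeps the boosted coefficients below 1 as the text requires.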

$$ M\left( {\alpha_{\text{mr}} } \right) = \frac{{{\text{SOR}}_{\hbox{max} } - \Delta {\text{SOR}}_{i} }}{{{\text{SOR}}_{\hbox{max} } }}\alpha_{\text{mr}} $$
(5.8)

where \( \alpha_{\text{mr}} \) is a coefficient in the maximum region, \( {\text{SOR}}_{\hbox{max} } \) is the sum over the maximum region, and \( \Delta {\text{SOR}}_{i} \) is the increment in the object region. Using Eqs. (5.7) and (5.8), we can ensure that \( {\text{SOR}}_{i} \) becomes \( {\text{SOR}}_{\hbox{max} } \). Finally, the sparse coefficient vector can be written as follows:

$$ \alpha = [\alpha_{\text{r}1} ,\alpha_{\text{r}2} , \ldots ,L(\alpha_{\text{ri}} ), \ldots ,M(\alpha_{\text{rs}} ), \ldots ,\alpha_{{{\text{rk}} - 1}} ,\alpha_{\text{rk}} ]^{T} $$
(5.9)

where \( \alpha_{\text{rj}} \) is the coefficient in region j, j = 1, 2, …, k, with \( \alpha_{\text{ri}} = \alpha_{\text{or}} \) and \( \alpha_{\text{rs}} = \alpha_{\text{mr}} \).
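One adjustment step combining Eqs. (5.7)–(5.9) can be sketched as follows. For self-containment the sketch inlines a single branch of the piecewise logistic (the third-branch constants of Eq. (5.7)); the full method selects the branch by \( \max (\alpha_{\text{or}} ) \), and the algorithm iterates this step rather than applying it once.

```python
# Boost the object region i (Eq. 5.7), then shrink the current maximum
# region by (SOR_max - Delta SOR_i) / SOR_max (Eq. 5.8), yielding the
# adjusted coefficient vector of Eq. (5.9).
import numpy as np

def _boost(alpha_or, a=3.31, b=10.52, c=0.74, p=4.05):
    # One branch of Eq. (5.7); the full method is piecewise in max(alpha_or)
    return c / (1.0 + p ** (a - b * np.asarray(alpha_or, dtype=float)))

def adjust_coefficients(alpha, labels, i):
    alpha = np.asarray(alpha, dtype=float).copy()
    obj = labels == i
    boosted = _boost(alpha[obj])                    # Eq. (5.7)
    delta = boosted.sum() - alpha[obj].sum()        # Delta SOR_i
    alpha[obj] = boosted
    sums = {int(c): alpha[labels == c].sum() for c in np.unique(labels)}
    x_max = max(sums, key=sums.get)
    if x_max != i:                                  # Eq. (5.8): scale down
        alpha[labels == x_max] *= (sums[x_max] - delta) / sums[x_max]
    return alpha
```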

Fig. 5.2
figure 2

Distribution of object regions with successful recognition and its fitted curves

5.3.3 Algorithm

We summarize the overall procedure of our algorithm as follows:

  1. Step 1.

     Input: normalized test sample y and dictionary \( D = [D_{1} ,D_{2} , \ldots ,D_{s} ] \)

  2. Step 2.

     Solve for α: \( \mathop {\hbox{min} }\limits_{\alpha } \{ ||W^{1/2} (y - D\alpha )||_{2}^{2} + \lambda ||\alpha ||_{1} \} \)

     a. Set \( j = 1 \); initialize the weight matrix \( W_{j} \) and the residual \( e_{j} \).

     b. Initialize α.

     c. Identify \( {\text{SOR}}_{\hbox{max} } \) and \( \alpha_{\hbox{max} } \).

     d. If \( {\text{SOR}}_{i} < {\text{SOR}}_{\hbox{max} } \) and \( \alpha_{i - \hbox{max} } < \alpha_{\hbox{max} } \), where \( \alpha_{i - \hbox{max} } \) is the maximal element in the object region, update

        $$ \alpha = [\alpha_{r1} ,\alpha_{r2} , \ldots ,L(\alpha_{\text{ri}} ), \ldots ,M(\alpha_{\text{rs}} ), \ldots ,\alpha_{{{\text{rk}} - 1}} ,\alpha_{\text{rk}} ]^{T} $$
        (5.10)

        Otherwise, go to step f.

     e. Update α by sparse coding with \( l_{1} \)-regularized least squares, then go back to c.

     f. Compute the residual \( e_{j} (y) = ||y - D\alpha ||_{2}^{2} \).

     g. Update the weights \( W_{j} \).

     h. Set \( j = j + 1 \) and go back to b until the maximal number of iterations is reached.

  3. Step 3.

     Output: the identity of y as \( {\text{Identity(}}y )= {\text{argmin}}\{ e_{j} \} \)
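The procedure above can be condensed into a Python sketch. This is a simplified illustration, not the authors' implementation: scikit-learn's `Lasso` stands in for the \( l_{1} \)-regularized least-squares solver, the weight update of step 2g is an illustrative logistic choice, and the coefficient-adjustment loop of steps 2c–2d is omitted for brevity.

```python
# Simplified sketch of Steps 1-3: iteratively reweighted l1 coding,
# then classification by the smallest class-wise reconstruction residual.
import numpy as np
from sklearn.linear_model import Lasso

def classify(D, labels, y, lam=0.01, n_outer=3):
    w = np.ones(len(y))                          # Step 2a: initial weights
    for _ in range(n_outer):                     # outer reweighting loop (2h)
        Wh = np.sqrt(w)
        alpha = Lasso(alpha=lam, fit_intercept=False, max_iter=5000) \
            .fit(Wh[:, None] * D, Wh * y).coef_  # Steps 2b/2e: weighted coding
        e = y - D @ alpha                        # Step 2f: residual
        # Step 2g (illustrative): pixels with large residuals get small weights
        w = 1.0 / (1.0 + np.exp(8.0 * (e**2 - np.median(e**2))))
    residuals = {}                               # Step 3: class-wise residuals
    for c in np.unique(labels):
        alpha_c = np.where(labels == c, alpha, 0.0)
        residuals[int(c)] = np.linalg.norm(y - D @ alpha_c)
    return min(residuals, key=residuals.get)
```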

5.4 Experimental Results

In this section, we carry out experiments on benchmark face databases (AR [7] and Extended Yale B [8]) to demonstrate the performance of our algorithm. In all experiments, we compare our method with some popular methods such as CRC_RLS [9], FDDL [10], and RSC [2].

5.4.1 Recognition Without Occlusion

Extended Yale B database: The Extended Yale B database consists of 2414 frontal-face images of 38 individuals. We used the cropped and normalized 45 × 40 face images. Figure 5.3 shows some facial images from the Extended Yale B database. Table 5.1 shows the recognition rates versus feature dimension for CRC_RLS, FDDL, RSC, and our method.

Fig. 5.3
figure 3

Samples of different illuminations from the Extended Yale B database

Table 5.1 Face recognition rates on the Extended Yale B database

AR database: The AR database consists of over 4,000 frontal images from 126 individuals. Figure 5.4 shows some example facial images. The comparison of our method with its competing methods is given in Table 5.2. Table 5.3 shows that our method achieves much higher recognition rates than the other three methods for different numbers of training samples. In particular, our method outperforms RSC when there are few training samples.

Fig. 5.4
figure 4

Samples from the AR database

Table 5.2 Face recognition rates on the AR database
Table 5.3 Recognition rates of different numbers of training samples on the AR database

5.4.2 Recognition with Real Disguise

In this section, we evaluate the robustness of our method to real disguise. We select a subset of the AR database consisting of both neutral and corrupted images for this experiment. The following three scenarios are considered for performance evaluation: Sunglasses, Scarf, and Sunglasses + Scarf. Figure 5.5 shows one person's images with different disguises from the training and testing images. Table 5.4 shows that our recognition rates are higher than those of the other methods. The experimental results also demonstrate the good robustness of our method to face occlusion.

Fig. 5.5
figure 5

The first row shows training images; the remaining rows show testing images

Table 5.4 Face recognition rates on the AR database with disguise occlusion

5.5 Conclusion

In this paper, motivated by the recent success of RSC, we present a novel method to recognize face images with illumination variations and occlusion. We add constraints and use a logistic function to optimize the sparse coefficients. Our extensive experimental results on benchmark face databases show that the proposed method is effective and robust. Compared with RSC, our method achieves higher recognition rates, even when the training samples are limited.