1 Introduction

With the rising awareness of information security, biometric identification has attracted great attention, and a growing body of research has been devoted to it. How to identify a person rapidly while protecting information security is a crucial problem. Since face recognition is user-friendly, reliable and unique, researchers and industry have paid increasing attention to it, making it one of the most active areas of biometric identification.

The traditional algorithms of face recognition fall mainly into two categories: feature-based algorithms and holistic algorithms (Jafri and Arabnia 2009). The key point of feature-based algorithms is how to extract the facial features accurately; they identify the test image through local features and the topology of a structural model. Holistic algorithms identify faces using global representations, i.e., descriptions based on subspace projection or machine learning, such as PCA and neural networks. Proper classifiers are designed that use their learning, classification and adaptive abilities to recognize the test image.

The recognition rate of traditional algorithms is acceptable to a certain extent, but unfavorable factors in the actual environment, such as variable illumination, facial expressions, posture and incomplete face images, can have an obviously negative impact on their performance. Compressive sensing (CS) was proposed by Donoho (2006). CS theory uses the sparse representation of a signal to extract the useful information and constitutes a breakthrough in sampling theory. Meanwhile, CS brings new methods and ideas to face recognition. For example, sparse representation-based classification (SRC) was proposed on the basis of CS (John et al. 2007, 2009).

SRC uses the compressibility of image data to represent the face image sparsely and produces a good recognition rate. Since SRC demands that the face images be aligned, Huang et al. (2008) introduce a simple technique for obtaining a transformation-invariant sparse representation by using an affine transformation to deal with the alignment problem. This technique simultaneously recovers the sparse representation of a target image and improves the recognition rate of SRC. Wagner et al. (2012) turn the posture problem into a problem of different views and use the affine transformation to solve it to some extent; in other words, they propose a simple face recognition system that achieves a higher degree of robustness to illumination variation, image misalignment and partial occlusion. Qian (2011) proposes a block sparse representation of the face image; because blocks of the face are used rather than the whole face, the performance of SRC becomes robust to posture variation, and after an effective estimate of the initial value, SRC can avoid local optima. Zhou (2012) makes full use of the advantage of the OMP algorithm in low dimensions and the classification features of SRC_OMP; the algorithm computes separately for each class and further simplifies SRC. Deng et al. (2012) use an auxiliary intraclass variant dictionary, named ESRC, to represent the possible variation between the training and test images. ESRC extends SRC to applications where images are very few, even one training image per subject. Zhuang et al. (2013) propose the SIT technique, which trains a sparse illumination dictionary to represent the different illumination patterns between the training and test images; it is also able to reduce the required training images to one per class.
Various features extracted from human faces have been provided to SRC, such as the research on the Gabor occlusion dictionary by Yang et al. (2013). Xu et al. (2013) propose a coarse-to-fine face recognition method named CFFR, which uses two-stage SRC to recognize the face; the second stage of CFFR classifies the test samples based on the training samples of the first-stage candidate classes. Liao et al. (2013) propose an alignment-free approach named MKD-SRC, which works for both holistic and partial faces in addressing the one-sample-per-class problem; MKD-SRC uses a new point detector called the Gabor ternary pattern to extract facial features and makes the representation more robust to viewpoint changes. Song et al. (2014) propose an algorithm named DA-TSSR, using a three-step sparse residual measurement to perform face discriminant analysis. Song et al. (2014) also analyze the linear combination of the training samples in depth, and find that the recognition rate can be increased by selecting the training samples carefully.

After several years of use, the SRC algorithm has shown some limits in face recognition performance, and other methods have been proposed. Bonnen et al. (2013) propose a component-based recognition method: it first crops and aligns face components such as mouths and eyes, then extracts a multi-scale local binary patterns (MLBP) feature representation from each component, and finally uses random sampling LDA (RS-LDA) (Wang and Tang 2010) to discriminate different faces. Lu et al. (2013) propose a multimanifold analysis technique for face recognition; the researchers assume that the samples from one face class define a single manifold in the feature space and seek better-separated feature dimensions in low-dimensional feature spaces. Yaniv et al. (2014) propose a deep learning (DL) framework using a neural network for unconstrained face recognition, presenting a system (DeepFace) whose performance is close to that of the human visual system. Karczmarek et al. (2014) propose a fuzzy measure approach to study the saliency of facial regions, observing that the importance of different facial regions in the recognition process varies from one region to another. A fuzzy measure (\(\lambda \)-fuzzy measure) is constructed to describe the relevance of different facial regions, and a face recognition test using the \(\lambda \)-fuzzy measure yields good results.

Although the above algorithms are promising, some problems concerning SRC are still worth deeper research, such as the relationship between SRC face recognition and human face recognition. Many cognitive science studies have examined human face perception in depth, and they offer helpful cues. The studies of DeGutis et al. (2013) support the idea of a unitary holistic processing mechanism involved in skilled face recognition. Galit and Brad (2006) find that specialized face perception mechanisms extract both part and spacing information, which means that when people recognize a face, holistic and featural features can be extracted and used simultaneously. Renzi et al. (2013) provide evidence supporting this mechanism through a transcranial magnetic stimulation (TMS) study. Piepers and Robbins (2012) review existing models of holistic/configural processing and discuss how they differ conceptually; they favor a model in which holistic processing of a face includes some or all of the interrelations between features and has separate coding for features.

Inspired by the above research, we study the properties of the SRC algorithm by comparing an experimental psychology study with SRC technology. On the basis of these experiments, we find a disadvantage of the SRC algorithm in face recognition, and propose an SRC-based twice face recognition algorithm named T_SRC to overcome it. T_SRC uses both holistic and detailed information, extracted from face images through a two-step recognition technique. T_SRC can cope with incomplete face images and is robust to illumination and posture. Firstly, the multichannel analysis of T_SRC, a combination of linear discriminant analysis (LDA) (Belhumeur et al. 1997), BDPCA (Zuo et al. 2006) and GradientFaces (Zhang et al. 2009), is used to extract multi-dimensional holistic features. Secondly, T_SRC uses SRC to identify the test image and calculate the residuals. If the residuals satisfy the conditions for a second recognition, we use the improved Harris point and Gabor filter detector to cut out the facial details after extracting the illumination invariants by the self quotient image (SQI). Finally, we identify the test image with SRC again.

The rest of this paper is organized as follows. In Sect. 2, we introduce the basic principle and main process of SRC. In Sect. 3, we introduce the psychology experiment and discuss the difference between SRC recognition and human recognition. In Sect. 4, we describe the main contents of T_SRC and present the whole T_SRC algorithm. In Sect. 5, we report the experimental results of the different algorithms and analyze their advantages and disadvantages. In Sect. 6, we summarize the paper and put forward a direction for further research.

2 Sparse representation-based classification (SRC)

SRC, proposed by Yang et al., casts the recognition problem as one of finding a sparse representation of the test image in terms of the training set as a whole. The key to SRC lies in finding the appropriate sparse representation.

The main process of SRC is as follows:

  1.

    Form the training set \({{A}} = [{{{A}}_1},{{{A}}_2}, \ldots ,{{{A}}_{{k}}}]\), where \({{{A}}_{{i}}}\,{{(1}} \le {{i}} \le {{k)}}\) contains the sample images of person i, and k is the total number of classes. Further, \({{{A}}_{{i}}} = {{[}}{v_{{{i,1}}}}{{,}}{v_{{{i,2}}}}{{,}} \ldots {{,}}{v_{{{i,n}}}}{{]}}\), where \({v_{{{i,n}}}}\) is the nth sample image of person i.

  2.

    Use the training set to represent the test image y:

    $$\begin{aligned} \qquad {{y}} = \sum \limits _{i = 1}^k {\sum \limits _{j = 1}^n {{{{\alpha }}_{i,j}}{v_{i,j}}} } {{ = }}Ax \end{aligned}$$
    (1)

    \(x = [ {{\alpha _1},{\alpha _2}, \ldots ,{\alpha _{{k}}}} ]\) is the sparse solution for the test image and \({\alpha _{{i}}} = [ {{\alpha _{\mathrm{{i,1}}}},{\alpha _{\mathrm{{i,2}}}}, \ldots ,{\alpha _{{{i,n}}}}} ]( {{{1}} \le {{i}} \le {{k}}} )\) is the sparse coefficient vector of person i. If the class of the test image is i, \(y\) can be sparsely represented by the sample images of person i alone; in other words, \({\alpha _{{j}}} = 0 \,( {{{j}} \ne {{i}}} )\).

  3.

    Get the sparse representation via L1 norm minimization:

    $$\begin{aligned} \qquad \mathop x\limits ^ \wedge = \arg \min {\Vert x \Vert _1},\quad \mathrm{s.t.} \,\,\,y = {{A}} \cdot x. \end{aligned}$$
    (2)
  4.

    Calculate the residual:

    $$\begin{aligned} \qquad {\gamma _{{i}}} = {\Vert {y - {{A}} \cdot {\delta _{{i}}}(x)} \Vert _2},\quad {{i}} = 1,2, \ldots ,{{k}}. \end{aligned}$$
    (3)

    Confirm the class of the test image:

    $$\begin{aligned} \mathrm{Identity} ( y) = \arg \mathop {\min }\limits _{{i}} ({\gamma _{{i}}}). \end{aligned}$$
    (4)

As the core problem of SRC, formula (1) is an underdetermined equation, so a unique solution cannot be guaranteed. The conventional way to solve formula (1) is L2 norm minimization:

$$\begin{aligned} \qquad \mathop x\limits ^ \wedge = \arg \min {\Vert x \Vert _2},\quad \mathrm{s.t.} \,\,\,y = {{A}} \cdot x. \end{aligned}$$
(5)

However, the solution of L2 norm minimization contains numerous non-zero elements, so it cannot meet the sparsity requirement. In other words, L2 norm minimization focuses on the energy of \(x\) rather than on its sparsity. If we choose L0 norm minimization instead, the computational cost is on the order of \(C_N^K\). Tropp and Gilbert (2006) put forward the following conclusion: if \(M \ge cK\log ( {N/K})\), L1 norm minimization is equivalent to L0 norm minimization. Therefore, we choose L1 norm minimization as the sparse decomposition algorithm.

An appropriate L1 norm minimization method can reduce the complexity of the algorithm and improve its performance and recognition rate. The three common families of methods are combinatorial algorithms, convex relaxation algorithms and greedy algorithms. Because convex relaxation yields an accurate solution with low complexity, we choose GPSR (Figueiredo et al. 2007), a representative convex relaxation algorithm, to solve the L1 norm minimization problem.
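The SRC procedure of formulas (1)–(4) can be sketched in a few lines. The following is a minimal illustration of ours, not the GPSR solver used in this paper: it solves the L1 minimization as a linear program via SciPy's `linprog` (splitting x into non-negative parts, a standard basis-pursuit formulation) and then assigns the class with the smallest residual. The names `src_classify` and `labels` are our own.

```python
import numpy as np
from scipy.optimize import linprog

def src_classify(A, y, labels):
    """Sketch of SRC, formulas (1)-(4): represent y sparsely over the
    training matrix A, then assign the class with the smallest residual."""
    m, n = A.shape
    # min ||x||_1  s.t.  y = A x, as a linear program: split x = u - v with
    # u, v >= 0 and minimize sum(u) + sum(v), which equals ||x||_1.
    c = np.ones(2 * n)
    A_eq = np.hstack([A, -A])              # A u - A v = A x
    res = linprog(c, A_eq=A_eq, b_eq=y,
                  bounds=[(0, None)] * (2 * n), method="highs")
    x = res.x[:n] - res.x[n:]
    labels = np.asarray(labels)
    # residual gamma_i = ||y - A * delta_i(x)||_2, formula (3), where
    # delta_i keeps only the coefficients belonging to class i
    residuals = {cls: np.linalg.norm(y - A @ np.where(labels == cls, x, 0.0))
                 for cls in np.unique(labels)}
    # Identity(y) = argmin_i gamma_i, formula (4)
    return min(residuals, key=residuals.get), x
```

In practice the columns of A would be vectorized, normalized face images; GPSR or another dedicated solver replaces the generic linear program for speed.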

3 A difference between SRC face recognition and human face recognition

Although SRC achieves a good recognition rate in the laboratory environment, unfavorable factors in the actual environment can have an obviously negative impact on its performance.

For example, variant illumination, posture and incomplete face images are all unfavorable factors for SRC. Figure 1 shows a situation with variant posture and an incomplete face.

Fig. 1 The effect of variant posture and incomplete face

From Fig. 1, we found that the residuals were not sparse: the class with the minimum residual was 2, but the class of the test image was 9. Since the face image had a variant posture and was incomplete, the facial contour differed from the other sample images of class 9, and recognition failed.

To analyze the limitation of the SRC algorithm, we designed a group of experiments consisting of a psychology experiment and an SRC recognition experiment, which together were used to study the difference between SRC face recognition and human face recognition.

3.1 The psychology experiment

3.1.1 Participants

Eighty students of Hangzhou Dianzi University (mean age 21.75 years, SD 1.13, range 20–25, 47 males) took part in the experiment. All participants took part voluntarily and were randomly divided into two groups: one group of 38 students (17 females and 21 males), denoted G1, and another group of 42 students (16 females and 26 males), denoted G2. G1 was the control group, and G2 was the experimental group.

3.1.2 Materials and procedure

The ORL database was preprocessed into the ORL_P database: the distance between the eyes of each face image was changed using Photoshop. Some examples are shown in Fig. 2. The eye-distance adjustment was smaller than that in the experiment by Renzi et al. (2013).

Fig. 2 The faces used in the experiments (ORL and ORL_P database)

The control group G1 used the ORL database, and the experimental group G2 used both the ORL and ORL_P databases. For every trial in G2, one image was selected from ORL and another from ORL_P, and both were used in the same trial. Following the experiment by Renzi et al. (2013), participants were asked to judge whether two shortly consecutively presented faces were identical or different by pressing the corresponding key. The experiment only measured the accuracy of human face recognition; response speed was not recorded. One test included 21 training trials and 35 testing trials, and each trial presented two face images consecutively. The presented faces were mostly extracted from the database randomly, except that eight testing trials presented face images of the same person.

Participants sat comfortably at a distance of 40–50 cm from a 14” TFT-LCD computer monitor (screen resolution 1,366 \(\times \) 768 pixels; refresh rate 60 Hz). Before the experiment, a short slide presentation was displayed to explain the task. The timeline of an experimental trial is shown in Fig. 3. Face stimuli were presented in the middle of the screen (subtending a visual angle of approximately 8\(^{\circ }\) in height and 6\(^{\circ }\) in width). Each trial started with a 1,000-ms-long central fixation cross, followed by a blank screen for 250 ms. Then the first face image appeared and stayed for 500 ms. After that, a mask screen was shown for 250 ms. Next, the second face image was presented and disappeared after 500 ms, followed by a blank screen for 250 ms. In the end, the screen showed the sentence “Are they the same face?” and waited for participants to respond.

Fig. 3 The timeline of an experimental trial

3.1.3 Results

G1 and G2 produced two groups of face recognition accuracy data. SPSS 19.0 was used to compare the two sets of data and to determine whether there was a significant difference between the groups. The results are shown in Tables 1 and 2.

Table 1 The group statistics
Table 2 Independent samples T test

From Table 1, the human face recognition rate of G1 was about 96 %, and the mean values of G1 and G2 were basically the same. According to Table 2, the value of F was 0.631 with a probability of 0.429, greater than the significance level (0.05), so equal variances were assumed. In the t test, the value of t was \(-0.617\) with a probability of 0.539, greater than the significance level (0.05), and the 95 % confidence interval of the difference crossed 0. Thus, no significant difference in average recognition rate between G1 and G2 could be established, which means G1 and G2 came from the same population, representing the accuracy of human face recognition.

It should be noticed that each trial of G2 contained two different images even when they came from the same class (one from ORL, one from ORL_P). Nevertheless, people seemed to be completely unaffected by the eye-distance change. In other words, slightly adjusting the eye distance did not affect the accuracy of human face recognition.

3.2 The SRC recognition experiment

Similar to the human face recognition experiment, the SRC recognition experiment used the ORL and ORL_P databases. The ORL database had 40 classes, each corresponding to one person and containing ten face images. The training samples were extracted as follows: five images (No. 1 to No. 5) were taken from each class of the ORL database, giving \(5\times 40=200\) images in total. The test samples were extracted as follows: five images (No. 6 to No. 10) were taken from each class of the ORL_P database, giving \(5\times 40=200\) images in total.

Three methods often used in face recognition, original SRC, PCA + SRC and GradientFace + PCA + SRC, were used to test the recognition properties of the SRC algorithm. The PCA dimension was set to 150–200. The results are shown in Table 3.

Table 3 The SRC recognition test

From Table 3, it is obvious that the SRC recognition rate was around 90 % on the ORL database, but decreased rapidly on the ORL_P database, where the best recognition rate was 50 %. The only difference between the ORL and ORL_P databases was the slightly changed eye distance.

3.3 Discussion

According to the above tests, there is an obvious difference between SRC face recognition and human face recognition: a slight change in face relational features has an obviously negative impact on the performance of SRC algorithms, but does not affect human recognition. Face relational features normally belong to the “holistic/configural” face features. In the actual environment, face relational features can be influenced by unstable elements, such as variable illumination, posture and incomplete face images, which results in performance degradation of SRC algorithms.

Since the core of the SRC algorithm is sparse linear representation, unaligned features make sparse linear representation difficult. We assume that the SRC algorithm needs the features extracted from the test samples and the training samples to be in strict alignment, which is too strict for recognizing faces in the actual environment. Variable illumination, posture and incomplete face images all change the positions or values of face features, which results in performance degradation of SRC algorithms.

4 A twice face recognition algorithm based on SRC (T_SRC)

According to the above discussion, there is an obvious disadvantage in the SRC algorithm: its demand for feature alignment is too strict. To overcome this disadvantage, we propose an SRC-based twice face recognition algorithm named T_SRC. Since this problem is difficult to solve within SRC itself, T_SRC uses two steps to relax the feature alignment limitation.

In the first step, T_SRC uses bidirectional PCA, linear discriminant analysis and GradientFace, rather than a single algorithm, to extract more “holistic/configural” face features. Then, according to these features, T_SRC selects some face images close to the test image to generate the 1-step face database, which normally includes 6–8 people who look like the test person. This first step is named multichannel analysis.

In the second step, T_SRC focuses on the facial details (“featural” face features). We suppose that, because of their small size, the facial details are more stable in the actual environment than the whole face. The SRC algorithm can therefore reduce the effects of feature mismatch in the actual environment and achieve better performance than whole-face recognition.

4.1 Multichannel analysis

Many holistic algorithms are able to extract some “holistic” features of a face, but their abilities differ. We studied several algorithms and selected three as the basis for multichannel analysis: bidirectional PCA (Zuo et al. 2006), linear discriminant analysis (Belhumeur et al. 1997) and GradientFace (Zhang et al. 2009). The features extracted by PCA are still affected by variant illumination, which degrades performance, whereas the features extracted by LDA, BDPCA and GradientFace are less sensitive to varied illumination, posture and incomplete faces. However, no algorithm works effectively in all cases: sometimes LDA is sensitive to different images of one person, and sometimes GradientFace only recognizes the face contour features and ignores the relational features. A single algorithm cannot achieve ideal performance in the actual environment. Therefore, multichannel analysis is proposed: each channel uses one algorithm to recognize the face and selects several most likely faces as candidates, and by combining the three channels’ candidates we obtain the 1-step face database. The three channels are BDPCA, PCA + LDA and GradientFace + PCA + LDA.
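The combination step at the end of multichannel analysis can be sketched as follows. This is an illustrative fragment of ours, assuming each channel has already produced a per-class residual vector; the function name `build_1step_database` and the `per_channel` parameter are our own.

```python
import numpy as np

def build_1step_database(channel_residuals, per_channel=2):
    """Combine the candidates of the three channels into the 1-step database.

    channel_residuals: a list of three arrays, one per channel (BDPCA,
    PCA + LDA, GradientFace + PCA + LDA), each holding one residual per class.
    Each channel contributes its `per_channel` classes with the smallest
    residuals; the union of the contributions is the candidate set used
    in the second recognition step.
    """
    candidates = set()
    for r in channel_residuals:
        best = np.argsort(r)[:per_channel]   # classes with smallest residuals
        candidates.update(int(c) for c in best)
    return sorted(candidates)
```

Because the channels disagree in different ways, the union typically covers the true class even when every single channel misranks it, which is the effect exploited in Sect. 5.1.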

4.2 Extract the facial details features

Since variant illumination creates dark regions in face images and affects the facial details, we choose SQI as the image preprocessing step to remove the effect of varied illumination (Wang et al. 2004). In this way, the effect of illumination is reduced. Figure 4b shows the result of SQI extracting the illumination invariant.

Fig. 4 The result of dealing with the image

After processing the image with SQI, we begin to extract the facial details. First, we use the Harris point detector to extract candidate detail points before the second recognition (Mokhtarian and Suomela 1998; Cordelia et al. 2000). Since the detection window is a Gaussian function and points are detected after smoothing, the features are robust to noise. Moreover, the Harris point has rotation and translation invariance, which helps reduce the effect of posture.

Let \(f( {x,y} )\) be the gray level at point \(( {x,y} )\). Summing the shifted differences over a window with offsets \(( {\upsilon ,\nu } )\), the change in gray intensity at \(( {x,y} )\) is:

$$\begin{aligned} E( {x,y} ) = {\sum \limits _{\upsilon ,\nu } {{w_{\upsilon ,\nu }} \cdot [ {f( {x + \upsilon ,y + \nu } ) - f( {x,y} )}]} ^2} \end{aligned}$$

Using a first-order Taylor expansion of \(f\), this can be approximated as

$$\begin{aligned} E( {x,y}) = \sum \limits _{\upsilon ,\nu } {{w_{\upsilon ,\nu }}} \cdot ( {\upsilon ,\nu } )\left[ {\begin{array}{*{20}{c}} {{{\left( {\frac{{\partial f}}{{\partial x}}} \right) }^2}}&{}{\frac{{\partial f}}{{\partial x}}\frac{{\partial f}}{{\partial y}}}\\ {\frac{{\partial f}}{{\partial x}}\frac{{\partial f}}{{\partial y}}}&{}{{{\left( {\frac{{\partial f}}{{\partial y}}} \right) }^2}} \end{array}} \right] {( {\upsilon ,\nu })^{T}}, \end{aligned}$$

where \({w_{\upsilon ,\nu }}\) denotes the coefficients of the Gaussian function, and \(\frac{{\partial f}}{{\partial x}}\), \(\frac{{\partial f}}{{\partial y}}\) denote the variations of the gray intensity along the X and Y axes.

$$\begin{aligned} {M} = \left[ {\begin{array}{*{20}{c}} {{{\left( {\frac{{\partial f}}{{\partial x}}} \right) }^2}}&{}{\frac{{\partial f}}{{\partial x}}\frac{{\partial f}}{{\partial y}}}\\ {\frac{{\partial f}}{{\partial x}}\frac{{\partial f}}{{\partial y}}}&{}{{{\left( {\frac{{\partial f}}{{\partial y}}} \right) }^2}} \end{array}} \right] \end{aligned}$$

denotes the autocorrelation matrix of the point.

Finally, we obtain corner points via the corner response function in formula (6):

$$\begin{aligned} \qquad R = \det M - k \cdot {( {\mathrm{trace}M})^2}. \end{aligned}$$
(6)

The result of the Harris point is shown in Fig. 4c.
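As an illustration of formula (6), the following is a minimal Harris response sketch of ours (Sobel gradients plus a Gaussian window via SciPy); the improved detector used in this paper may differ in its gradient and window choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(img, sigma=1.5, k=0.04):
    """Harris corner response R = det(M) - k * trace(M)^2, formula (6).

    M is the autocorrelation matrix of the image gradients, with its
    entries averaged under the Gaussian window (the weights w_{u,v})."""
    img = img.astype(np.float64)
    fx = sobel(img, axis=1)                # df/dx
    fy = sobel(img, axis=0)                # df/dy
    # entries of M, smoothed by the Gaussian window
    a = gaussian_filter(fx * fx, sigma)
    b = gaussian_filter(fx * fy, sigma)
    c = gaussian_filter(fy * fy, sigma)
    det = a * c - b * b
    trace = a + c
    return det - k * trace ** 2
```

Points with large positive R are corners, strongly negative R indicates edges, and flat regions give responses near zero; thresholding R yields the candidate detail points.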

However, we found that the points extracted by the Harris detector contain many false corners. To remove those false points, which do not express facial features, we adopt a Gabor filter and convolve it with the face image. Figure 4d shows the result of the Gabor filtering: the brightness of the eyes, nose and mouth is more obvious than at other points, so we can obtain the coordinates of the facial features by comparing brightness. For example, we can locate the coordinates of the eyes based on the Gabor filter (Li et al. 2006; Xiong et al. 2007).
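A minimal Gabor filtering sketch, written by us as an illustration (the kernel size, scale and wavelength below are assumed values, not the parameters used in this paper):

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(size=15, sigma=3.0, theta=0.0, lam=6.0):
    """Real part of a Gabor kernel: a Gaussian-windowed cosine wave
    oriented at angle theta with wavelength lam."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2)) \
        * np.cos(2 * np.pi * xr / lam)

def gabor_magnitude(img, **kw):
    """Convolve the face image with the Gabor kernel; bright responses
    concentrate at strongly textured details such as the eyes."""
    img = img.astype(np.float64)
    return np.abs(fftconvolve(img, gabor_kernel(**kw), mode="same"))
```

Textured regions whose spatial frequency matches the kernel respond strongly, while flat regions stay dark, which is the brightness contrast exploited in Fig. 4d.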

Assume the gray value of point \(( {x,y})\) is \(f( {x,y})\) and the size of the image is \(M \times N\).

  1.

    We locate the X-coordinate of eyes by vertical gray-level projection:

    $$\begin{aligned} g( x ) = \sum \limits _{y = 1}^N {f( {x,y}).} \end{aligned}$$

    Since the area of the eyes is not a single pixel but a certain scope, we must average the value:

    $$\begin{aligned} \overline{g} ( x ) = \frac{1}{T}\sum \limits _{i = - \frac{T}{2}}^{\frac{T}{2}} {g( {x + i}).} \end{aligned}$$

    T is the width of the eye area.

  2.

    Judge the X-coordinate of the eyes: if \(\overline{g} ( x ) > \overline{g} ( {x - 1} )\) and \(\overline{g} ( x ) > \overline{g} ( {x + 1} )\), then x may be the X-coordinate of the eyes.

  3.

    We locate the Y-coordinate of eyes by horizontal gray-level projection:

    $$\begin{aligned} h( y) = \sum \limits _{x = 1}^{\frac{M}{2}} {f( {x,y} ).} \end{aligned}$$

    Since the eyes are located in the upper half of the face, we only calculate the gray values of the upper half. Then, we average the value in the same way:

    $$\begin{aligned} \overline{h} ( y) = \frac{1}{T}\sum \limits _{i = - \frac{T}{2}}^{\frac{T}{2}} {h( {y + i}).} \end{aligned}$$
  4.

    Judge the Y-coordinate of the eyes in the same way.
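Steps 1–2 above can be sketched as follows. This is our own illustration: we assume x indexes image rows and that the projection is arranged so that the eye area gives local maxima, as in the text; the function name and window handling at the image border are ours.

```python
import numpy as np

def locate_eye_rows(img, T=5):
    """Candidate eye X-coordinates via vertical gray-level projection.

    Steps 1-2 of the text: project the gray levels of each row, average the
    projection over a window of width T (the eye-area width), then keep the
    strict local maxima of the averaged projection."""
    g = img.sum(axis=1).astype(np.float64)      # g(x) = sum_y f(x, y)
    half = T // 2
    gbar = np.array([g[max(0, x - half):x + half + 1].mean()
                     for x in range(len(g))])    # windowed average over T rows
    # keep x with gbar(x) > gbar(x-1) and gbar(x) > gbar(x+1)
    return [x for x in range(1, len(gbar) - 1)
            if gbar[x] > gbar[x - 1] and gbar[x] > gbar[x + 1]]
```

The Y-coordinate search (steps 3–4) is identical except that the projection runs over the columns of the upper half of the image.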

Figure 5 shows the result of locating the coordinates of the eyes based on the Gabor filter.

Fig. 5 The coordinates of eyes

In this way, we can locate some possible eye coordinates and then remove the false Harris corner points with the help of the Gabor filter.

Assume the point set extracted by the Harris point detector is:

$$\begin{aligned} {{{P}}_{{1}}} = [ {{p_1}( {x,y} ),{p_2}( {x,y} ), \ldots ,{p_{{n}}}( {x,y})}]. \end{aligned}$$

where n is the number of points, and the point set extracted by the Gabor filter is:

$$\begin{aligned} {{{P}}_{{2}}} = [ {{p_1^{'}}( {x,y} ),{p_2^{'}}( {x,y}), \ldots ,{{p_{{m}}}^{'}}( {x,y} )}]. \end{aligned}$$

We calculate the distance between the points:

$$\begin{aligned} d = \Vert {{p_i}( {x,y} ) - p_j^{'}( {x,y} )} \Vert _2,\quad ( {1 \le i \le {{n,1}} \le {{j}} \le {{m}}}). \end{aligned}$$

If \(d \le D\), where \(D\) is the distance threshold, we choose \(h( {x,y} ) = \frac{1}{2}( {{p_i}( {x,y} ) + p_j^{'}( {x,y} )})\) as a facial feature point. The set of facial feature points is:

$$\begin{aligned} {{H}}( {x,y} )&= [ {{h_1}( {x,y} ),{h_2}( {x,y} ), \ldots ,{h_k}( {x,y})}],\\&( {k \le {{m}},k \le {{n}}}) \end{aligned}$$

After collecting all the facial feature points, we take the coordinates of each point as a center and cut out an image patch of size \(30 \times 30\). The facial details after cutting are shown in Figs. 6 and 7.

Fig. 6 The images of face

Fig. 7 The facial details extracted by T_SRC

4.3 The process of T_SRC algorithm

The main process of T_SRC is shown in Fig. 8 and illustrated as follows:

Fig. 8 The main process of T_SRC algorithm

  1.

    Form the training set:

    $$\begin{aligned}\mathbf{{A}} = [ {{A_1},{A_2}, \ldots ,{A_{{k}}}}]\end{aligned}$$

    using all of the face images \({v_{{{i,j}}}}({{i}} \le k,j \le n)\), where k is the number of classes and n is the number of training images per class, usually 5.

  2.

    Apply multichannel analysis to preprocess the matrix A, that is, use BDPCA, PCA + LDA and GradientFace + PCA + LDA, respectively, to process A and obtain the result matrices A \(_{1}\), A \(_{2}\), A \(_{3}\).

    (a)

      BDPCA left- and right-multiplies the matrix A by two projection matrices and obtains the matrix A \(_{1}\). Normally, A \(_{1}\) has fewer dimensions than A.

    (b)

      PCA + LDA calculates the projection matrix W from A, and W is used to project the matrix A onto the matrix A \(_{2}\).

    (c)

      The GradientFace algorithm processes the face image matrix A to obtain the matrix Tmp, and then PCA + LDA processes Tmp to obtain the matrix A \(_{3}\).

    (d)

      The test image y is also processed by multichannel analysis, just like the matrix A, yielding the vectors T \(_{1}\), T \(_{2}\), T \(_{3}\).

  3.

    Use the SRC algorithm to calculate the sparse representation of the test image vector in each of the three matrices A \(_{1}\), A \(_{2}\) and A \(_{3}\):

    $$\begin{aligned} {T_i} = {A_i} \cdot x,\quad i = 1,2,3 \end{aligned}$$
  4.

    Select the most likely images to compose the 1-step face database. The method is as follows: calculate the residual between the test image and every class

    $$\begin{aligned} {\gamma _j} = {\Vert {{T_i} - {A_i}{\delta _j}(x)} \Vert _2},\quad (1 \le i \le 3,\; 1 \le j \le k). \end{aligned}$$

    Then select the 1–3 images with the smallest \({\gamma _j}\) from each of A \(_{1}\), A \(_{2}\), A \(_{3}\).

  5.

    Determine whether the second recognition is needed: if the difference between the minimum residual and the other residuals is less than the residual threshold T, the second recognition is needed; otherwise, we can identify the test image directly:

    $$\begin{aligned} \mathrm{Identify} ( y) = \arg \mathop {\min }\limits _{{j}} ({\gamma _{{j}}}). \end{aligned}$$
  6.

    If the second recognition is needed, we take the 1-step face database and apply SQI to its face images. Then, we extract the facial details with the Harris point and Gabor filter detector. The facial detail features are stored in the 2-step database.

  7.

    Extract the facial details of the test image.

  8.

    Identify the class of the test image by applying the SRC algorithm to the 2-step database and the facial details of the test image.
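The decision in step 5 can be sketched as follows. This fragment is our own illustration: we interpret "the difference between the minimum residual and other residuals" as the gap between the best and second-best class, and the function names and threshold value are assumptions.

```python
import numpy as np

def needs_second_recognition(residuals, T=0.1):
    """Step 5 of T_SRC: if the smallest residual is not separated from the
    others by more than the threshold T, the first-step decision is treated
    as unreliable and the second (facial-details) recognition is run."""
    r = np.sort(np.asarray(residuals, dtype=float))
    return bool(r[1] - r[0] < T)           # gap between best and runner-up

def identify(residuals, labels):
    """Otherwise identify directly: Identify(y) = argmin_j gamma_j."""
    return labels[int(np.argmin(residuals))]
```

A small T means the second recognition is triggered rarely and only for ambiguous cases; the influence of T on the recognition rate is studied in Sect. 5.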

5 Experimental results and analysis

To evaluate the performance of the proposed T_SRC scheme, we conducted experiments on three typical face image databases: the ORL database, the Extended YaleB database and an actual environment database. The ORL database contains 40 people with ten images per person; the image size is \({{112 \times 92}}\). The Extended YaleB database contains 39 people with 25 images per person; the image size is \({{192 \times 168}}\). For convenient comparison, we randomly selected ten images per person (matching ORL) from the 25. The actual environment database was collected with our computer's front camera via OpenCV; it contains 36 people with 10 images per person, with an image size of \({{128 \times 128}}\). The face databases are shown in Fig. 9.

Fig. 9
figure 9

The face database

To explore the properties of the T_SRC algorithm, the 1-step T_SRC recognition rate was compared with the human recognition rate, and the 2-step T_SRC recognition rate was compared with the SRC + PCA, SRC + BDPCA, SRC + LDA and SRC + GradientFace + LDA algorithms. The second recognition stage was also tested separately to study the difference between "holistic" feature recognition and detail feature recognition. Finally, the residual threshold T of the T_SRC algorithm, which affects the recognition rate to a certain extent, was studied.

5.1 Experiment of the 1-step T_SRC recognition rate

Multichannel analysis was the 1-step process of the T_SRC algorithm, and its result was the 1-step database, which contained the most likely candidate images. In T_SRC, multichannel analysis was used to extract multi-dimensional holistic features to recognize faces. It was hoped that multichannel analysis would approach the human face recognition rate.

This experiment compared the 1-step T_SRC recognition rate with BDPCA, LDA and LDA + GradientFaces, respectively. The 1-step T_SRC recognition rate was calculated as follows: for every test, if the 1-step database included the class of the test image, the test was counted as a success; otherwise, it was recorded as a failure. Images No. 1–5 were selected from every database as the training set, and the remaining five images were taken as test images. The 1–3 most likely images from each channel were selected to compose the 1-step database: ORL used two most likely images, Extended YaleB used one, and the actual environment database used three. The results are shown in Table 4.
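
The 1-step success criterion above is containment, not exact match: a test counts as correct whenever the candidate set holds the true class. A minimal sketch of this counting, with hypothetical candidate sets and labels:

```python
def one_step_rate(candidate_sets, true_labels):
    """Percentage of tests whose candidate set (the 1-step database)
    contains the true class -- success is containment, not exact match."""
    hits = sum(1 for cands, t in zip(candidate_sets, true_labels) if t in cands)
    return 100.0 * hits / len(true_labels)

# Hypothetical results for four test images (class labels are illustrative):
cands = [{3, 7}, {1, 2}, {5, 9}, {4, 8}]
truth = [7, 2, 6, 4]
rate = one_step_rate(cands, truth)   # three of the four sets contain the truth
```

This is why the 1-step rate upper-bounds, but does not equal, the final recognition rate reported in Sect. 5.2.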

Table 4 The recognition rate of 1-step T_SRC recognition (%)

According to Table 4, each single channel had a different recognition rate on each database, and none of them performed equally well on all three. In the actual environment database especially, every single-channel recognition rate was low, but by combining their results, the 1-step recognition rate came close to the human recognition rate. The multichannel analysis technique was therefore effective.

5.2 Experiment of the 2-step T_SRC recognition rate

Although 1-step T_SRC had a high recognition rate, it was not the real recognition rate; it only meant that the set of most likely classes included the right class of the test image. The 2-step T_SRC recognition rate still needed to be tested.

This experiment tested the recognition rates of SRC + PCA, SRC + BDPCA, SRC + LDA, SRC + GradientFace + LDA and T_SRC on the three databases, respectively. In the test, we chose 5, 6, 7 and 8 images per person according to the serial number as the training images, and the rest were the test images. The residual threshold T was 100. Tables 5, 6 and 7 show the results of the experiment with 5, 6, 7 and 8 sample images. Figure 10 is the graphical representation of Tables 5, 6 and 7.

Fig. 10
figure 10

The recognition rate using different recognition algorithms

Table 5 The recognition rate of actual environment database (%)
Table 6 The recognition rate of ORL database (%)
Table 7 The recognition rate of extended YaleB database (%)

In Fig. 10, the red lines show the recognition rate of the T_SRC algorithm, which was better than the other recognition algorithms, especially on the actual environment database. According to Tables 5, 6 and 7, the recognition rates on the ORL and YaleB databases were obviously better than on the actual environment database. Comparing the three databases, we found that the effect of illumination in the ORL and YaleB images was smaller than in the actual environment images. Moreover, the images in ORL and YaleB were all frontal faces, whereas the images in the actual environment database varied in posture to some degree, and some of them were incomplete. Backlighting in the actual environment database also increased the difficulty of correct recognition. This demonstrates that variable illumination, posture and incomplete face images degrade the performance of SRC algorithms.

Comparing the recognition rates on the three databases, we found that T_SRC performed better than the other algorithms on the whole, especially on the actual environment database. Although the performance of T_SRC on the ORL and YaleB databases was not ideal when compared with SRC + BDPCA and SRC + GradientFace + LDA, respectively, it had a better recognition rate and robustness to illumination, posture and incomplete faces. We inferred that the reason lies in the second recognition stage of T_SRC: only "details" features were used and "holistic" features were ignored, so some images could not be distinguished accurately. To examine this inference, the next experiment was conducted.

5.3 Experiment of the second recognition of T_SRC

This experiment used "details" features to recognize face images and compared the recognition rates with "holistic" recognition such as SRC + PCA and SRC + BDPCA. It used the actual environment database, with the other conditions the same as in Sect. 5.2. Table 8 shows the result. The results of SRC + PCA, SRC + BDPCA and 1-step T_SRC were the same as in Sects. 5.1 and 5.2. Since the computational cost of extracting details was high, the experiment randomly selected ten persons from the database to compose a details-test-database, on which the recognition rate of the second stage of T_SRC, which uses face "details" features, was tested.

Table 8 The recognition rate of the second stage of T_SRC (%)

From Table 8, we concluded that using "holistic" and "details" features together benefits face recognition, but neither alone achieves good performance. The improvement of T_SRC relies on the combination of the 1-step and 2-step processes. We believe that further improvements to the second recognition stage would make T_SRC even better.

5.4 Experiment of the residual threshold

The residual threshold T was used to determine whether the second recognition was needed, and it affects the recognition rate of the algorithm. This experiment measured the recognition rate under different values of T. The experimental database was the actual environment database, the algorithm was T_SRC, and the other conditions were the same as in Sect. 5.2. The result is shown in Fig. 11.

From Fig. 11, we found that the recognition rate of T_SRC varied with the residual threshold. If T was too small, T_SRC did not trigger the second recognition when it was needed, so ambiguous test images were decided by the first stage alone, which reduced the accuracy of the T_SRC algorithm. If T was too big, T_SRC triggered the second recognition too frequently and the candidate set passed to the second stage became too large; according to the experiment in Sect. 5.3, this also reduced the accuracy of the T_SRC algorithm. Thus, an appropriate residual threshold plays a significant role in the recognition rate.
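
The trade-off above can be made concrete with a small sketch: under the rule of Sect. 4, a test triggers the second recognition when its top-two residual gap falls below T, so T directly controls how often the detail stage runs. The gap values here are hypothetical, not measured data.

```python
def second_stage_fraction(gaps, T):
    """Fraction of tests that trigger the second recognition: those whose
    gap between the smallest and second-smallest residual is below T."""
    return sum(g < T for g in gaps) / len(gaps)

# Hypothetical residual gaps (min vs. runner-up) for six test images:
gaps = [0.2, 1.5, 0.8, 3.0, 0.1, 2.2]

small_T = second_stage_fraction(gaps, 0.5)    # few tests reach the detail stage
large_T = second_stage_fraction(gaps, 100.0)  # nearly every test reaches it
```

A very small T leaves ambiguous cases to the first stage, while a very large T routes almost everything through the details-only stage, which Sect. 5.3 showed is weak on its own; the optimum lies between.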

Fig. 11
figure 11

The recognition rate of different residual threshold

6 Conclusions

With the increase of face recognition rates, the technology of face recognition is widely used in our lives, such as in public security forensics and building security management. Accordingly, people have higher expectations for face recognition research. To solve the problems of variable illumination, posture and incomplete faces in face recognition, we studied the differences between SRC recognition and human recognition and found an obvious disadvantage in the SRC algorithm. Based on this observation, we propose a two-stage recognition algorithm based on SRC, named T_SRC. T_SRC reduces the effect of variable illumination, posture and incomplete face images and improves the recognition rate. In the experiments, we found that T_SRC performs better than SRC + PCA, SRC + BDPCA, SRC + LDA and SRC + GradientFace + LDA, and is robust to posture and incomplete faces.

However, some problems remain to be researched further, such as how to combine the "holistic" and "details" features, and how to extract facial details more precisely. We believe the performance of T_SRC would be improved by more precise and better-combined facial details. In summary, the next phase of research is to extract the facial details more precisely and to combine them with holistic information for face recognition.