
1 Introduction

There are many types of personal authentication systems, and face recognition has been one of the active research areas for the last several decades. Several methods have been proposed to recognize faces [1–3]. There are two main categories of face recognition methods: feature-based and appearance-based [1]. In appearance-based methods, a face image of size N × N pixels is represented by a vector in an \( N^{2} \)-dimensional space. In practice, these spaces are too large to perform robust and fast recognition of faces. To solve this problem, dimensionality reduction is performed using the Principal Component Analysis (PCA) technique. PCA was first used to represent face images by Sirovich and Kirby in 1987 [4]. Turk and Pentland applied PCA to face recognition and presented the eigenfaces method in 1991 [5]. We study the effect of 15 similarity measures on the performance of face recognition using PCA. The following characteristics are used to measure system performance: the area above the cumulative match characteristic (CMC) curve, the recognition rate, and the percentage of images that must be extracted to achieve 100 % cumulative recognition.

The organization of the paper is as follows: In Sect. 2, we present face recognition using the PCA technique in detail. Various similarity measures are described in Sect. 3. In Sect. 4, the experimental work and the results obtained are presented. Section 5 offers the conclusion.

2 Principal Component Analysis

We implemented face recognition using PCA as proposed by Turk and Pentland [5]. Let the gallery set of M face images be \( \Gamma_{1}, \Gamma_{2}, \ldots, \Gamma_{M} \). The average face image of the whole set is defined by

$$ \Psi = \frac{1}{M} \sum\nolimits_{i = 1}^{M} {\Gamma _{i} } $$
(1)

Each face image differs from the average face \( \Psi \) by the vector \( \phi_{i} = \Gamma_{i} - \Psi \), where i = 1 to M. The covariance matrix C is then found as

$$ C = A A^{T} ,\;{\text{where}}\;{\text{matrix}}\;A = [\phi_{1} \,\phi_{2} \ldots \phi_{M} ]. $$
(2)

Matrix C is of size \( N^{2} \times N^{2} \), so finding its \( N^{2} \) eigenvectors is computationally expensive. Therefore, we form the M × M matrix \( L = A^{T} A \) and obtain its M eigenvectors \( v_{i} \). The most significant M eigenvectors of C are then found as:

$$ u_{l} = \sum\nolimits_{k = 1}^{M} {v_{lk} \phi_{k} }, \quad l = 1, \ldots, M $$
(3)
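
This shortcut works because eigenvectors of L carry over to C. A one-line check, using only the definitions above:

$$ A^{T}A\,v_{l} = \mu_{l} v_{l} \quad \Rightarrow \quad AA^{T}\left( {A v_{l} } \right) = A\left( {\mu_{l} v_{l} } \right) = \mu_{l} \left( {A v_{l} } \right), $$

so \( u_{l} = A v_{l} \), which Eq. (3) expands column by column, is an eigenvector of \( C = AA^{T} \) with the same eigenvalue, at the cost of an M × M rather than an \( N^{2} \times N^{2} \) eigendecomposition.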

From these M eigenvectors, we keep the M′ (< M) eigenvectors with the M′ largest eigenvalues. A new probe image Γ is projected into the face space using the operation:

$$ w_{k} = u_{k}^{T} (\Gamma - \Psi ) $$
(4)

for k = 1, … , M′.

These values of w form the projection vector \( \Omega = [w_{1}, w_{2}, \ldots, w_{M'}] \). The probe image is then classified as belonging to the closest face class using a similarity measure.
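
The training and projection steps above fit in a short sketch. The following Python/NumPy code is only illustrative (the paper's experiments were implemented in MATLAB, see Sect. 4); the function names and array layout are our own assumptions.

```python
import numpy as np

def train_pca(gallery, num_components):
    """Eigenfaces training following Eqs. (1)-(3).

    gallery: (M, N*N) array, one flattened face image per row.
    num_components should be at most M - 1 (see Sect. 4).
    Returns the mean face, the top eigenfaces, and their eigenvalues.
    """
    mean_face = gallery.mean(axis=0)           # average face, Eq. (1)
    A = (gallery - mean_face).T                # columns are phi_i, shape (N*N, M)
    L = A.T @ A                                # M x M surrogate of C = A A^T, Eq. (2)
    eigvals, V = np.linalg.eigh(L)             # eigh: L is symmetric
    order = np.argsort(eigvals)[::-1]          # decreasing eigenvalue order
    eigvals, V = eigvals[order], V[:, order]
    U = A @ V[:, :num_components]              # u_l = A v_l, Eq. (3)
    U /= np.linalg.norm(U, axis=0)             # normalize each eigenface
    return mean_face, U, eigvals[:num_components]

def project(face, mean_face, U):
    """Project a flattened face image into the face space, Eq. (4)."""
    return U.T @ (face - mean_face)            # Omega = [w_1, ..., w_M']
```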

3 Similarity Measures

Consider two feature vectors x and y of dimension n each. The distances between these feature vectors can be calculated as follows [6–11] (an illustrative code sketch of several of these measures is given after the list):

  1. City block distance (or Manhattan distance):

    $$ d\left( {x,y} \right) = \sum\nolimits_{i = 1}^{n} {\left| {x_{i} - y_{i} } \right|} $$
    (5)
  2. Euclidean distance:

    $$ d\left( {x,y} \right) = \sqrt {\sum\nolimits_{i = 1}^{n} {(x_{i} - y_{i} )^{2} } } $$
    (6)
  3. Squared Euclidean distance (sum square error, SSE):

    $$ d\left( {x,y} \right) = \sum\nolimits_{i = 1}^{n} {(x_{i} - y_{i} )^{2} }. $$
    (7)
  4. Mean square error (MSE):

    $$ d\left( {x,y} \right) = \frac{1}{n}\sum\nolimits_{i = 1}^{n} {(x_{i} - y_{i} )^{2} } $$
    (8)
  5. Cosine distance:

    $$ d\left( {x,y} \right) = - \frac{{\sum\nolimits_{i = 1}^{n} {x_{i} y_{i} } }}{{\sqrt {\mathop \sum \nolimits_{i = 1}^{n} x_{i}^{2} \mathop \sum \nolimits_{i = 1}^{n} y_{i}^{2} } }} $$
    (9)
  6. Mahalanobis distance:

    $$ d\left( {x,y} \right) = \sqrt {(x - y)S^{ - 1} (x - y)^{t} } $$
    (10)

    where S is the covariance matrix of the distribution.

  7. Standardized Euclidean distance:

    $$ d\left( {x,y} \right) = \sqrt {(x - y)V^{ - 1} (x - y)^{t} } $$
    (11)

    where V is the n × n diagonal matrix whose jth diagonal element is \( S(j)^{2} \), S being the vector of standard deviations.

  8. Minkowski distance:

    $$ d\left( {x,y} \right) = (\sum\nolimits_{i = 1}^{n} {\left| {x_{i} - y_{i} } \right|^{p} } )^{1/p} .$$
    (12)

    where p is a scalar exponent, p > 0.

  9. Chebychev distance:

    $$ d\left( {x,y} \right) = \max_{i} \{ |x_{i} - y_{i} |\} $$
    (13)
  10. Correlation distance:

    $$ d\left( {x,y} \right) = - \frac{{n\sum\nolimits_{i = 1}^{n} {x_{i} y_{i} } - \sum\nolimits_{i = 1}^{n} {x_{i} } \sum\nolimits_{i = 1}^{n} {y_{i} } }}{{\sqrt {\left( {n\sum\nolimits_{i = 1}^{n} {x_{i}^{2} } - \left( {\sum\nolimits_{i = 1}^{n} {x_{i} } } \right)^{2} } \right)\left( {n\sum\nolimits_{i = 1}^{n} {y_{i}^{2} } - \left( {\sum\nolimits_{i = 1}^{n} {y_{i} } } \right)^{2} } \right)} }} $$
    (14)
  11. Canberra distance:

    $$ d\left( {x,y} \right) = \mathop \sum \nolimits_{i = 1}^{n} \frac{{|x_{i} - y_{i} |}}{{\left| {x_{i} } \right| + |y_{i} |}} $$
    (15)
  12. Modified SSE distance:

    $$ d\left( {x,y} \right) = \frac{{\mathop \sum \nolimits_{i = 1}^{n}(x_{i} - y_{i} )^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} x_{i}^{2}\mathop \sum \nolimits_{i = 1}^{n} y_{i}^{2} }} $$
    (16)
  13. Modified Manhattan distance:

    $$ d\left( {x,y} \right) = \frac{{\mathop \sum \nolimits_{i = 1}^{n} |x_{i} - y_{i} |}}{{\mathop \sum \nolimits_{i = 1}^{n} |x_{i} |\mathop \sum \nolimits_{i = 1}^{n} |y_{i} |}} $$
    (17)
  14. Weighted Modified SSE distance:

    $$ d\left( {x,y} \right) = \frac{{\mathop \sum \nolimits_{i = 1}^{n} z_{i} \left( {x_{i} - y_{i} } \right)^{2} }}{{\mathop \sum \nolimits_{i = 1}^{n} x_{i}^{2}\mathop \sum \nolimits_{i = 1}^{n} y_{i}^{2} }},\quad z_{i} = \sqrt {1/\lambda_{i} } $$
    (18)

    where \( \lambda_{i} \) are the eigenvalues.

  15. Weighted Modified Manhattan distance:

    $$ d\left( {x,y} \right) = \frac{{\mathop \sum \nolimits_{i = 1}^{n} z_{i} \left| {x_{i} - y_{i} } \right|}}{{\mathop \sum \nolimits_{i = 1}^{n}|x_{i} | \mathop \sum \nolimits_{i = 1}^{n}|y_{i} |}},\quad z_{i} = \sqrt {1/\lambda_{i} } $$
    (19)
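
As a rough illustration of how several of these measures translate into code, the following Python/NumPy sketch implements four of them; the function names are our own, and the weighted variant takes the eigenvalues \( \lambda_{i} \) as an extra argument.

```python
import numpy as np

def city_block(x, y):                  # Eq. (5)
    return np.sum(np.abs(x - y))

def cosine_dist(x, y):                 # Eq. (9): negated, so smaller = more similar
    return -np.dot(x, y) / np.sqrt(np.sum(x**2) * np.sum(y**2))

def canberra(x, y):                    # Eq. (15); assumes x_i, y_i never both zero
    return np.sum(np.abs(x - y) / (np.abs(x) + np.abs(y)))

def weighted_modified_sse(x, y, lam):  # Eq. (18), with z_i = sqrt(1 / lambda_i)
    z = np.sqrt(1.0 / lam)
    return np.sum(z * (x - y)**2) / (np.sum(x**2) * np.sum(y**2))
```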

In this paper we perform the identification task. The CMC curve graphically represents the performance of an identification system: it plots rank values on the X axis against the probability of correct identification at or below that rank on the Y axis [10, 12].
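
As an illustration, the CMC curve can be computed from the rank at which each probe's true identity is retrieved. This is a minimal Python/NumPy sketch under that assumption (1-based ranks); it is not taken from the paper.

```python
import numpy as np

def cmc_curve(correct_ranks, num_identities):
    """correct_ranks[i] = 1-based rank at which probe i's true identity
    is retrieved. Returns P(correct identification at rank <= r)
    for r = 1, ..., num_identities."""
    correct_ranks = np.asarray(correct_ranks)
    ranks = np.arange(1, num_identities + 1)
    return np.array([(correct_ranks <= r).mean() for r in ranks])
```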

4 Experiments and Results

We tested the performance of the system using the ORL face image database [13]. The face images were taken at the AT&T Laboratories between April 1992 and April 1994. The ORL database contains images of 40 different persons, with 10 images per person. The face images were taken at different times, with varying lighting, facial details, and facial expressions. All face images are frontal with some pose variation. Each image is of size 112 × 92 pixels with 256 gray levels.

For the experimental work, each face image is resized to 50 × 40 pixels. Training is done on the first five images per person and testing on the remaining five, giving a gallery set of 200 images and a probe set of 200 images. Experiments are implemented using MATLAB® R2013a. We used the nearest mean rule, in which a template is computed for each identity in the database and the identity whose template is closest to the probe is chosen as the match.
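
A minimal sketch of the nearest mean rule, again in illustrative Python/NumPy rather than the paper's MATLAB: we assume each identity's template is the mean of its gallery projection vectors, and `distance` is any measure from Sect. 3.

```python
import numpy as np

def build_templates(gallery_projs, labels):
    """One template per identity: the mean of its gallery projections."""
    return {ident: gallery_projs[labels == ident].mean(axis=0)
            for ident in np.unique(labels)}

def nearest_mean_identify(probe_proj, templates, distance):
    """Return the identity whose template is closest to the probe."""
    return min(templates, key=lambda ident: distance(probe_proj, templates[ident]))
```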

The experimental results with the 15 similarity measures are listed in Tables 1, 2 and 3. Face recognition performance is measured by the area above the cumulative match characteristic curve (CMCA); a smaller CMCA indicates better recognition performance. We also show how many images (in percent) must be extracted to reach cumulative recognition rates between 80 % and 100 %; a smaller value implies that fewer images need to be extracted to reach the required rate. Finally, we report the recognition rate achieved when only the closest match is extracted from the system; a higher value of this rank-1 rate indicates better results. In Tables 1, 2 and 3, subscripts mark the best results.
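
Continuing the `cmc_curve` sketch from Sect. 3, the three reported characteristics could be derived as below; the exact scaling of the CMCA values in the tables is our assumption.

```python
def cmca(cmc):
    """Area above the CMC curve: smaller = better."""
    return np.sum(1.0 - cmc)

def percent_extracted_for(cmc, target=1.0):
    """Smallest rank reaching the target cumulative rate, as a percent
    of identities. Assumes the target is actually reached."""
    rank = np.argmax(cmc >= target) + 1   # first index where cmc >= target
    return 100.0 * rank / len(cmc)

def rank1_rate(cmc):
    """Recognition rate when only the closest match is extracted."""
    return 100.0 * cmc[0]
```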

Table 1. Performance using 20 % of features (40).
Table 2. Performance using 60 % of features (120).
Table 3. Performance using 90 % of features (180).

Table 4 reports the similarity measures sorted with respect to system performance using the following characteristics: (i) recognition rate, (ii) overall recognition accuracy (i.e., CMCA), and (iii) the percentage of images extracted to reach 100 % cumulative identification.

Table 4. Sorted similarity measures with respect to the performance of the system.

The training set has 200 images (5 images per person). This produces 199 eigenvectors: since the mean-subtracted vectors \( \phi_{i} \) sum to zero, there are only M − 1 meaningful eigenvectors, the remaining ones having associated eigenvalues of zero [5]. Figure 1 shows the variation in system performance with the number of eigenvectors for the top six performers. Fewer images (30 %) need to be extracted to achieve 100 % cumulative recognition using the Cosine distance measure when 10–90 % of the eigenvectors are used, and using Correlation when 30–90 % are used. Performance with respect to rank increases up to 10 % and 12 % of the eigenvectors for the Cosine and Correlation similarity measures, respectively, and then stabilizes.

Fig. 1. Performance of the system with respect to the number of eigenvectors: (a) cumulative 100 % recognition, (b) recognition rate, (c) CMCA (Color figure online).

The system achieved the best CMCA using the Cosine similarity measure (336.25–364.38) when 10–20 % of the eigenvectors are used and 323.13 when 90 % are used, and using Correlation (322.5–323.75) when 30–60 % are used. Thus, for CMCA, the top performance is shown by the Correlation and Cosine similarity measures. Performance with these measures improves until approximately 30 % of the eigenvectors are used, and then almost stabilizes.

We achieved the largest recognition rates using Correlation (82.5–86 %) when 10–30 % of the eigenvectors are used, Cosine distance (85–86 %) when 20–30 % are used, and City block distance (86–87.5 %) when 30–90 % are used. The best recognition rate is achieved using the City block measure, whose recognition rate increases with the number of eigenvectors. The variation in performance indicates that selecting the similarity measure is a critical decision in designing a PCA-based face recognition system.

5 Conclusions

This paper investigates 15 different similarity measures for face recognition using PCA. We examined the performance of the system by varying the number of eigenvectors. The experiments were conducted on the ORL face database, which contains 400 face images. The best identification performance is obtained with the following similarity measures: Cosine, Correlation, and City block. Using the Cosine distance, fewer images need to be extracted to achieve 100 % cumulative recognition than with any other similarity measure. This research shows the effect of the similarity measure on the performance of the system. It is observed that, as the number of eigenvectors increased, recognition rates also increased; this observation is consistent with prior studies [10]. Performance of the system increased until roughly 30 % of the eigenvectors were used, after which it almost stabilized. Standardized Euclidean, Weighted Modified SSE, and Weighted Modified Manhattan were among the worst performers, with CMCA values of 510.7–1538, 687.5–1323.8, and 734.4–1516.9, respectively, for 10–90 % of the eigenvectors.