
1 Introduction

Face recognition and verification have been at the top of the research agenda of the computer vision community for more than a decade. The scientific interest in this topic is motivated by several factors. The main attraction is the inherent challenge posed by face image processing, face detection and recognition. The impetus for a better understanding of the issues raised by automatic face recognition is also fuelled by the immense commercial significance that robust and reliable face recognition technology would entail. Its applications are envisaged in physical and logical access control, security, man–machine interfaces and low bit-rate communication.

To date, most research efforts, as well as commercial developments, have focused on two-dimensional (2D) approaches. This focus on monocular imaging has partly been motivated by cost, but to a certain extent also by the need to retrieve faces from existing 2D image and video databases. With recent advances in image capture techniques and devices, various types of face-image data have been utilized and algorithms have been developed for each type. Among these, the 2D intensity image has been the most popular and common data used for face recognition because it is easy to acquire and utilize. It has, however, the intrinsic problem of being vulnerable to changes in illumination. Sometimes a change of illumination produces a larger difference than a change of identity, which severely degrades recognition performance. Therefore, illumination-controlled images are required to avoid such an undesirable situation when 2D intensity images are used. To overcome this limitation, three-dimensional (3D) images, such as 3D meshes and range images, are being used. A 3D mesh is a rich representation of a 3D object: it contains the 3D structural information of the surface as well as the intensity information at each point. By utilizing the 3D structural information, the vulnerability to illumination change can be overcome. A 3D mesh is thus suitable image data for face recognition, but it is complex and difficult to handle.

A range image is simply an image with depth information. In other words, a range image is an array of numbers that quantify the distances from the sensor's focal plane to the surfaces of objects within the field of view, along rays emanating from a regularly spaced grid. Range images have some advantages over 2D intensity images and 3D mesh images. First, range images are robust to changes in illumination and color, because the value at each point represents a depth that depends on neither. Second, range images are simple representations of 3D information. The 3D information in mesh images is useful for face recognition but difficult to handle; in contrast, the 3D information of a range image is easy to utilize because it is explicit at each point of a regularly spaced grid. Owing to these advantages, range images are very promising for face recognition.
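As a small illustration of this data structure, a range image can be held as a plain 2D array of depth values; the values below (nominally millimetres) are made up purely for illustration and are not taken from the database.

```python
import numpy as np

# A range image is just a 2D grid of depth values on a regular grid;
# the numbers are illustrative only.
range_image = np.array([
    [812.4, 810.9, 809.7],
    [811.2, 805.3, 806.1],
    [813.0, 807.8, 808.5],
])

# Unlike intensity values, these depths do not change with scene lighting,
# which is the robustness property discussed above.
print(range_image.shape, range_image.min(), range_image.max())
```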

The majority of 3D face recognition studies have focused either on holistic statistical techniques based on the appearance of face range images or on techniques that employ 3D surface matching. A survey of the literature on the potential problems and challenges in 3D face recognition can be found in [15]. Gupta et al. [6] presented a novel anthropometric 3D face recognition algorithm that employs 3D Euclidean and geodesic distances between 10 automatically located anthropometric facial fiducial points and a linear discriminant classifier, achieving a 96.8 % recognition rate. Lu et al. [7] constructed many 3D models as registered templates and matched 2.5D images (original 3D data) to these models using the iterative closest point (ICP) algorithm. Chang et al. [8] describe a "multi-region" approach to 3D face recognition, a type of classifier ensemble in which multiple overlapping sub-regions around the nose are independently matched using ICP and the results of the 3D matching are fused. Jahanbin et al. [9] presented a verification system based on Gabor features extracted from range images; multiple landmarks (fiducials) on the face are automatically detected, and the Gabor features at all fiducials are concatenated to form a feature vector. Hiremath et al. [10] discussed 3D face recognition using the Radon transform and PCA, with a recognition accuracy of 95.30 %. Tang et al. [11] presented a 3D face recognition algorithm based on sparse representation, using geometrical features, namely triangle area, triangle normal and geodesic distance.

In this paper, our objective is to propose a discriminant analysis method for face recognition based on the Radon transform, principal component analysis (PCA) and linear discriminant analysis (LDA), applied to 3D facial range images. The experimentation is done using the Texas 3D Face Database [12].

2 Materials and Methods

For experimentation, we consider the Texas 3D Face Database [12]. The 3D models in this database were acquired using an MU-2 stereo imaging system. All subjects were requested to stand at a known distance from the camera system. The stereo system was calibrated against a target image containing a known pattern of dots on a white background. The database contains 1,149 3D models of 118 adult human subjects. The number of images per subject varies from 2 to 89. The subjects' ages range from 22 to 77 years. The database includes images of both males and females from the major ethnic groups of Caucasians, Africans, Asians, East-Indians, and Hispanics. The facial expressions present are smiling or talking faces with open/closed mouths and/or closed eyes; the neutral faces are emotionless.

3 Proposed Method

3.1 Radon Transform

The Radon transform (RT) is a fundamental tool in many areas. The 3D Radon transform is defined using 1D projections of a 3D object \( f(x, y, z) \), where these projections are obtained by integrating \( f(x, y, z) \) over a plane whose orientation can be described by a unit vector \( \overrightarrow{\alpha } \). Geometrically, the continuous 3D Radon transform maps a function on \( {\mathbb{R}}^{3} \) into the set of its plane integrals in \( {\mathbb{R}}^{3} \). Given a 3D function \( f(\overrightarrow{x})\, \triangleq \,f(x, y, z) \) and a plane represented by its normal \( \overrightarrow{\alpha } \) and its distance s from the origin, the 3D continuous Radon transform of f for this plane is defined by

$$ \begin{aligned} \Re f(\overrightarrow{\alpha }, s) & = \int\limits_{ - \infty }^{\infty } \int\limits_{ - \infty }^{\infty } \int\limits_{ - \infty }^{\infty } f(\overrightarrow{x})\, \delta (\overrightarrow{x}^{T} \overrightarrow{\alpha } - s)\, d\overrightarrow{x} \\ & = \int\limits_{ - \infty }^{\infty } \int\limits_{ - \infty }^{\infty } \int\limits_{ - \infty }^{\infty } f(x, y, z)\, \delta (x\sin \theta \cos \phi + y\sin \theta \sin \phi + z\cos \theta - s)\, dx\, dy\, dz \end{aligned} $$

where \( \overrightarrow{x} = [x, y, z]^{T} \), \( \overrightarrow{\alpha } = [\sin \theta \cos \phi ,\, \sin \theta \sin \phi ,\, \cos \theta ]^{T} \), and \( \delta \) is the Dirac delta function defined by \( \delta (x) = 0 \) for \( x \ne 0 \) and \( \int\nolimits_{ - \infty }^{\infty } \delta (x)\, dx = 1 \). The Radon transform maps the spatial domain (x, y, z) to the domain \( (\overrightarrow{\alpha }, s) \); note that \( (\overrightarrow{\alpha }, s) \) are not the polar coordinates of (x, y, z). The 3D continuous Radon transform satisfies the 3D Fourier slice theorem [10, 13].
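In the proposed method the transform is applied to 2D range images, so the discrete 2D analogue of the plane integrals above is what is actually computed. The sketch below is a hedged illustration of that step using skimage.transform.radon (an assumed dependency); the random input image stands in for a real range image from the database.

```python
import numpy as np
from skimage.transform import radon

# Hypothetical 256 x 256 facial range image; in practice this would be
# loaded from the Texas 3D Face Database.
range_image = np.random.rand(256, 256)

# Projection angles from 0 to 180 degrees in steps of h (here h = 1 degree),
# the 2D analogue of the plane integrals defined above.
h = 1.0
theta = np.arange(0.0, 180.0, h)

# Each column of the sinogram holds the line integrals of the range image
# along one orientation.
sinogram = radon(range_image, theta=theta, circle=False)
print(sinogram.shape)  # (number of projection positions, number of angles)
```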

3.2 Linear Discriminant Analysis

Principal component analysis (PCA) is a standard technique used to approximate the original data with lower-dimensional feature vectors [14, 15]. The basic approach is to compute the eigenvectors of the covariance matrix and approximate the original data by a linear combination of the leading eigenvectors. The mean square error (MSE) of the reconstruction is equal to the sum of the remaining eigenvalues. The coefficients of projection of an arbitrary data vector along the principal components (eigenvectors) form its feature vector. Since PCA uses no class membership information, data vectors of the same class and of different classes are treated alike. In linear discriminant analysis (LDA), the class membership information is used to emphasize the variation of data vectors belonging to different classes and to de-emphasize the variation of data vectors within a class. LDA produces an optimal linear discriminant function \( f(x) = W^{T} x \) which maps the input into the classification space, in which the class identity of the sample is decided based on some metric such as the Euclidean distance [16, 17]. A typical LDA implementation is carried out via scatter matrix analysis. The within-class and between-class scatter matrices are defined as follows:

$$ \begin{aligned} S_{w} & = \frac{1}{M}\sum\limits_{i = 1}^{M} \Pr \left( C_{i} \right)\, \Sigma_{i} \\ S_{b} & = \frac{1}{M}\sum\limits_{i = 1}^{M} \Pr \left( C_{i} \right)\, \left( m_{i} - m \right)\left( m_{i} - m \right)^{T} \end{aligned} $$

Here \( S_{w} \) is the within-class scatter matrix, showing the average scatter \( \Sigma_{i} \) of the sample vectors x of the different classes \( C_{i} \) around their respective means \( m_{i} \):

$$ \Sigma_{i} = E\left[ (x - m_{i})(x - m_{i})^{T} \mid C = C_{i} \right] $$

Similarly, \( S_{b} \) is the between-class scatter matrix, representing the deviation of the conditional mean vectors \( m_{i} \) from the overall mean vector m. Various measures are available for quantifying the discriminatory power; the commonly used one is

$$ J(W) = \frac{\left| W^{T} S_{b} W \right|}{\left| W^{T} S_{w} W \right|}, $$

where W is the optimal discriminant projection, which can be obtained by solving the generalized eigenvalue problem \( S_{b} W = \lambda S_{w} W \). The distance measure used in the matching can be a simple Euclidean distance. Thus, the fundamental difference between the PCA and LDA approaches is that, while PCA performs eigenvalue analysis on the covariance matrix, LDA performs it on the scatter matrices.
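A minimal sketch of this scatter-matrix formulation is given below. It uses unweighted class sums rather than the prior-weighted averages above (equivalent up to scaling), adds a small ridge to keep \( S_{w} \) invertible, and solves the generalized eigenvalue problem with scipy.linalg.eigh. The function name lda_projection and the ridge value are illustrative choices, not part of the original method.

```python
import numpy as np
from scipy.linalg import eigh

def lda_projection(X, y, n_components):
    """Fisher LDA: maximise |W^T Sb W| / |W^T Sw W| by solving the
    generalized eigenvalue problem Sb w = lambda Sw w.

    X: (n_samples, n_features) feature vectors, y: class labels.
    """
    classes = np.unique(y)
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)

    Sw = np.zeros((n_features, n_features))   # within-class scatter
    Sb = np.zeros((n_features, n_features))   # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - overall_mean).reshape(-1, 1)
        Sb += Xc.shape[0] * (d @ d.T)

    # Symmetric-definite generalized eigenproblem; a small ridge keeps
    # Sw invertible when there are few samples per class.
    evals, evecs = eigh(Sb, Sw + 1e-6 * np.eye(n_features))
    order = np.argsort(evals)[::-1]
    return evecs[:, order[:n_components]]      # columns form W
```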

3.3 Proposed Methodology

The Radon transform is applied to an input facial range image I1 in steps of h from 0° to 180° orientations, where h may be 1°–3° or any convenient value. It yields a binary image I2 in which the facial area is segmented. Superposing I2 on I1 gives the cropped facial range image I3. Next, principal component analysis (PCA) is applied to the complete set of such cropped facial range images corresponding to the face images in the face database, yielding the set of eigenfaces. Linear discriminant analysis (LDA) is then performed on the PCA weight vectors, and the resulting LDA features are used for recognition of a given test face range image. Figs. 1 and 2 show the overview of the proposed framework and the intermediate results of the Radon transformation of an input face image, respectively. The algorithms of the training phase and the testing phase of the proposed method are given below, each followed by an illustrative code sketch:

Fig. 1

Overview of proposed framework

Fig. 2

a The range image I1. b Radon transform of I1 in 0°–180° orientations. c Binary image I2 obtained after Radon transformation. d Cropped facial range image I3 after superposing I2 on I1

Algorithm 1: Training Phase

1. Input the range image I1 from the training set containing M images (Fig. 2a).

2. Apply the Radon transform, from 0° to 180° orientations (in steps of h), to the input range image I1, yielding a binary image I2 (Fig. 2c).

3. Superpose the binary image I2 obtained in Step 2 on the input range image I1 to obtain the cropped facial range image I3 (Fig. 2d).

4. Repeat Steps 1–3 for all the M facial range images in the training set.

5. Apply PCA to the set of cropped facial range images obtained in Step 4 and obtain M eigenfaces.

6. Compute the weights \( w_{1}, w_{2}, \ldots, w_{p} \) for each training face image, where p < M is the dimension of the eigenface subspace on which the training face image is projected.

7. Store the weights \( w_{1}, w_{2}, \ldots, w_{p} \) of each training image as its facial features in the PCA feature library of the face database.

8. Perform LDA on the feature subspace (i.e., the weight vectors).

9. Store the LDA components (feature vectors) in the LDA feature library of the face database.
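The following Python sketch mirrors the training phase under stated assumptions: the binary image I2 is taken to be a threshold of the back-projected Radon transform (the thresholding rule is our assumption, since the text does not spell it out), eigenfaces are obtained via SVD of the centred cropped images, and lda_projection is the function from the LDA sketch above. The threshold value, image sizes and function names are illustrative, not the authors' implementation.

```python
import numpy as np
from skimage.transform import radon, iradon

def crop_face(range_image, h=1.0, thresh=0.5):
    """Steps 1-3: Radon-transform the (square) range image, back-project,
    threshold to a binary facial mask I2, and superpose it on I1."""
    theta = np.arange(0.0, 180.0, h)
    sinogram = radon(range_image, theta=theta, circle=False)
    recon = iradon(sinogram, theta=theta, circle=False,
                   output_size=range_image.shape[0])
    mask = recon > thresh * recon.max()        # binary image I2 (assumed rule)
    return range_image * mask                  # cropped image I3

def train(range_images, labels, p, n_lda):
    """Steps 4-9: PCA eigenfaces on the cropped images, then LDA on the
    PCA weight vectors (uses lda_projection from the LDA sketch)."""
    cropped = np.stack([crop_face(im).ravel() for im in range_images])
    mean_face = cropped.mean(axis=0)
    A = cropped - mean_face                    # centred data, one row per image

    # Leading right singular vectors of A are the eigenfaces.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    eigenfaces = Vt[:p]                        # p < M eigenfaces

    weights = A @ eigenfaces.T                 # PCA feature library (Steps 6-7)
    W = lda_projection(weights, np.asarray(labels), n_lda)   # Step 8
    lda_features = weights @ W                 # LDA feature library (Step 9)
    return mean_face, eigenfaces, W, lda_features
```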

Algorithm 2: Testing Phase

1. Input the test range image Z1.

2. Apply the Radon transform, from 0° to 180° orientations (in steps of h), to the input range image Z1, yielding a binary image Z2.

3. Superimpose the binary image Z2 on Z1 to obtain the cropped facial range image Z3.

4. Compute the weights \( w_{i}^{test}, i = 1, 2, \ldots, p \), for the cropped test image Z3 by projecting it onto the LDA feature subspace of dimension p.

5. Compute the Euclidean distance D between the feature vector \( w_{i}^{test} \) and the feature vectors stored in the LDA feature library.

6. The face image in the face database corresponding to the minimum distance D computed in Step 5 is the recognized face.

7. Output the texture face image corresponding to the recognized facial range image of Step 6.
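A corresponding sketch of the testing phase, reusing crop_face and the quantities returned by train from the training sketch above; the Euclidean nearest-neighbour rule follows Steps 5–6, and all names are the illustrative ones introduced earlier rather than the authors' code.

```python
import numpy as np

def recognize(test_image, mean_face, eigenfaces, W, lda_features, labels):
    """Testing phase Steps 1-6: crop the test range image, project onto the
    eigenfaces, map into the LDA subspace, and return the nearest identity."""
    cropped = crop_face(test_image).ravel()            # Steps 2-3
    weights = (cropped - mean_face) @ eigenfaces.T     # PCA weights
    test_features = weights @ W                        # LDA components (Step 4)

    # Euclidean distance to every stored LDA feature vector (Step 5).
    dists = np.linalg.norm(lda_features - test_features, axis=1)
    return labels[int(np.argmin(dists))]               # Step 6
```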

4 Results and Discussion

For experimentation, we consider the Texas 3D face database [12]. The proposed method is implemented on an Intel Core 2 Quad processor @ 2.66 GHz machine using MATLAB 7.9. In the training phase, 2 frontal face images with neutral expression of each of 100 subjects are selected as the training data set. In the testing phase, 200 randomly chosen face images of the Texas 3D face database with variations in facial expressions are used. The sample training images used for our experimentation are shown in Fig. 3, and their corresponding texture images are shown in Fig. 4. The eigenfaces and the mean facial range image computed for PCA during the training phase are shown in Figs. 5 and 6, respectively.

Fig. 3

The first five range images of the training dataset

Fig. 4

The facial texture images corresponding to the training range images of Fig. 3

Fig. 5

The first five eigenfaces obtained by using the PCA in the training phase

Fig. 6

Mean facial range image computed for PCA in Step 5 of the training phase

The comparison of the recognition rates and times obtained by the proposed (RT + PCA + LDA) approach, PCA (alone) and the RT + PCA approach is presented in Table 1. The projection orientation of the Radon transform is taken in steps of 1°, 2° and 5°, and 2, 4 and 5 LDA components have been considered. We observe that the proposed method, namely RT (in steps of 1° orientation) with PCA and LDA, yields better results than PCA (alone) and the RT + PCA method.

Table 1 The face recognition accuracy (%) obtained by the proposed method using different numbers of eigenfaces and LDA components

The graph of recognition rate versus the number of eigenfaces is shown in Fig. 7 for the proposed method (RT + PCA + LDA) together with the PCA and RT + PCA methods. It is observed that the recognition rate improves as the number of eigenfaces is increased, reaching 99.16 % for 40 eigenfaces in the case of the proposed method. Further, the proposed method based on RT, PCA and LDA outperforms the PCA method.

Fig. 7

The recognition accuracy (%) versus the number of eigenfaces of the proposed method and the other methods

We compare the rank-one recognition rates of the proposed method with those of state-of-the-art 3D face recognition methods, namely 3D face recognition using RT + PCA [10], LDA [8] and sparse representation [11], in Table 2.

Table 2 Comparison of the proposed method with the state-of-the-art 3D face recognition algorithms

5 Conclusion

In this paper, we have proposed a novel hybrid method for three-dimensional (3D) face recognition using the Radon transform with PCA- and LDA-based features computed on face range images. In this method the LDA-based feature computation can be done at high speed, since only a few LDA components are adequate to yield good classification results. Our experimental results yield a 99.16 % recognition rate with a small number of features, which compares well with other state-of-the-art methods. The experimental results demonstrate the efficacy of the method and its robustness to illumination and pose variations. The recognition accuracy can be further improved by considering a larger training set and a better classifier.