Abstract
Extracting and representing high-quality facial features is an essential step in many applications, such as face recognition, pose normalization, expression recognition, human–computer interaction and face tracking. We are interested in extracting the pertinent features of the 3D face. In this paper, we propose an improved algorithm for 3D face characterization. We propose novel characteristics based on seven salient points of the 3D face, using the Euclidean distances and the angles between these points. This step is highly important in 3D face recognition. Our technique allows fully automated processing and handles incomplete and noisy input data. Besides, it is robust against holes in a meshed image and insensitive to facial expressions. Moreover, it is suitable for different image resolutions. All the experiments have been performed on the FRAV3D and GAVAB databases.
1 Introduction
Automatic recognition and authentication of individuals is becoming an important research field in computer vision. The applications are varied and serve the safety of goods and lives: they are used in airports, military sites and banks, as well as by police forces.
3D authentication is also used for video surveillance, biometrics, autonomous robotics, intelligent human–machine interfaces, photography, and image and video indexing. Different biometric characteristics, such as fingerprint, vein, hand geometry and iris, can be used for automatic authentication. The drawback of these modalities is that the cooperation of the candidate is usually essential. Face recognition, by contrast, is a very important biometric modality. Its main advantages are that the candidate does not need direct physical contact with the system, and that it is a transparent process, since hidden cameras can be used without the candidate’s knowledge.
It has been shown that 2D face recognition is sensitive to illumination changes and to variations of pose and expression. 3D face recognition is likely to solve these problems, since it is more robust against pose changes and copes better with illumination changes [1]. In the last 5 years, the usage of 3D techniques for applications in video, face recognition and virtual reality has increased rapidly in both academia and industry. The acquisition technology of 3D images has become simpler and cheaper. The main advantage of 3D data is that it conserves all the geometric information of the object, giving a representation that is close to reality.
In this paper, we work in the region that is unaffected by pose changes and variations of facial expression. We focus on the anthropometric landmarks, which play an important role in performing quantitative measurements, and we extract the most important features that help us to recognize the 3D face. Our feature vector is composed of the Euclidean distances and angles between these keypoints, the entropy of the region of interest and the width of the forehead.
The paper is organized as follows. In Sect. 2, an overview of some research works is presented. The 3D face databases are shown in Sect. 3. In Sect. 4, we detail our contribution in 3D face characterization. The achieved experiments and the results are discussed in Sect. 5. The conclusion and the perspectives are presented in Sect. 6.
2 Related Works
Recently, many authors have worked on the 3D face. The algorithm developed in [2] relied on differential geometry descriptors such as derivatives, coefficients of the fundamental forms, different types of curvature and the Shape Index. The authors then computed the geodesic and Euclidean distances between landmarks, the nose volume and the ratios between geodesic and Euclidean distances.
The method in [3] is based on a new triangular surface patch (TSP) descriptor to localize the landmarks in the 3D face. This descriptor represents a surface patch of an object, along with a fast approximate nearest neighbour lookup to find similar surface patches from training heads.
In the research work of Salazar et al. [4], each landmark is located automatically on the face surface by means of a Markov network. The network captures the statistics of a property of the surface around each landmark and the structure of the connections. The location of a landmark on a test model is estimated by probabilistic inference over the Markov network, performed with the loopy belief propagation algorithm.
Another work [5] used a linear method, namely linear discriminant analysis, and a nonlinear method, namely AdaBoost, to characterize the 3D face.
The work of Böer et al. [6] used the Discriminative Generalized Hough Transform (DGHT) to describe 3D face scans.
Ballihi et al. [7] were interested in facial curves, namely level sets (circular curves) and streamlines (radial curves). They used a geometric shape analysis framework based on Riemannian geometry to compare pairwise facial curves and a boosting method to highlight geometric features according to the target application.
Ramadan and Abdel-Kader [8] suggested using a spherical wavelet parameterization of the 3D face image.
In [9], feature extraction is done by geodesic distances and linear discriminant analysis (LDA).
Han et al. [10] mapped a deformable model to a given test image, which involves two transformations, rigid and nonrigid. They utilized two intrinsic geometric descriptors, namely geodesic distances and Euclidean distances, to represent the facial models by describing the set of distance-based features and contours.
Berretti et al. [11] proposed three different signatures to describe the 3D face locally at the keypoints, namely the histogram of gradients (HOG), the histogram of orientations (SHOT) and the geometric histogram (GH).
In [12], Fang et al. applied principal component analysis (PCA) to the feature space to reduce its dimensionality. Then, they applied LDA to find the optimal subspace that preserves the most discriminant information. These two methods are thus applied sequentially. The feature vectors combine geometric information of the landmarks and statistics on the density of edges and curvature around the landmarks.
The method introduced in [13] is based on wavelet descriptors as a multiscale tool to analyze the 3D face. The authors also used an ellipsoidal cropping, driven by the detection of facial landmarks, to detect and crop the facial region. In the preprocessing of the facial region, they applied a procedure called Sorted Exact Distance Representation in order to fill holes. A Gaussian filter is applied to the range image to smooth the surface. Then, they used the Gaussian function as the mother wavelet to extract the landmarks.
In our work, we prefer to improve geometric descriptors that are based on anthropometric measurements and Euclidean distances between feature points of a 3D meshed face.
Experiments are carried out on the GAVAB and FRAV 3D databases. Our technique outperforms other recent state-of-the-art methods.
3 3D Face Databases
3.1 GAVAB Database
GAVAB is a database that consists of 549 3D facial range scans (Fig. 1) of 61 different subjects (45 male and 16 female) captured by a Minolta Vivid 700 scanner. The faces were placed at a distance of 1.5 ± 0.5 m from the scanner. Although the chair (with wheels) was at a distance of 1.5 m, the individuals were free to move their heads normally. This produces some resolution changes from one image to another. The height of the chair could be changed in order to keep the head always visible to the scanner [14].
Each image is a mesh of connected 3D points of the facial surface. The texture information of each vertex has been eliminated to reduce the size of the face models. The subjects are all Caucasian, and most of them are aged between 18 and 40 [15]. Some examples from the GAVAB database are shown in Fig. 2.
Each subject has been scanned 9 times with many poses and different facial expressions. The scans with pose variations contain 1 facial scan while looking up (+35°), 1 while looking down (−35°), 1 for the right profile (+90°) and 1 for the left profile (−90°). Figure 3 shows some examples of pose variation [15].
The scans without pose changes include 4 different frontal facial scans: 2 of them with a neutral expression, 1 with a smile and 1 with an accentuated laugh. Figure 4 shows different facial expressions of some subjects in the database [15].
In addition, GavabDB is the noisiest dataset currently available to the public, showing many holes and missing data, especially in the eyebrows (Fig. 5a). Also, the images of this database contain much noise and nonfacial regions, such as neck, hands, shoulders and hair (Fig. 5b). Moreover, each person has several scans with different poses and facial expressions.
The most important problem in this database is that the scans of a given person do not all have the same number of faces and vertices (Fig. 6), which can cause problems when extracting the region of interest. Besides, this database shows a high interpersonal variation.
As a matter of fact, in this database we find all types of scans with pose variations, many facial expressions, occlusions, fat and slim persons, etc. Fig. 7 presents the variation between some individuals.
Therefore, we use this database to evaluate our technique, since it has been used for automatic face recognition experiments and other possible facial applications such as pose correction or registration of 3D facial models.
3.2 FRAV 3D Database
This database contains 106 subjects, with approximately one woman for every three men. The data were acquired with a Minolta VIVID 700 scanner, which provides texture information and a VRML file. Some examples from the FRAV 3D database are shown in Fig. 8 [16].
Each person has 16 captures with different poses and lighting conditions, trying to cover all possible variations, including turns in different directions, as shown in Fig. 9 [16].
Moreover, this database provides scans with different facial expressions, as presented in Fig. 10.
In every case, only one parameter was modified between two captures. This is one of the main advantages of this database with respect to others [16].
4 Proposed Approach
In this paper, a novel approach based on landmark localization and characterization of a 3D face is presented. It contains five essential steps (Fig. 11). We begin by reading the 3D mesh. Then we extract the region of interest (ROI) using the anthropometric proportions and a cropping filter. After that, the extracted part is preprocessed. Finally, after the detection of the salient points, we use the Euclidean distances and the angles between them to determine the feature vector.
4.1 Region of Interest Detection
According to anthropometric proportions, the face is divided into three parts: the first part lies between the top of the forehead and the eyebrows, the second between the eyebrows and the nose tip, and the third between the nose tip and the chin.
We focus on the second part of the face, where the vertices lie in the interval [1/3, 2/3] of the image (Fig. 12). We then work specifically in the region of the nose tip, which is located at the end of the second part.
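As an illustration only, the selection of the middle third can be sketched as follows. The sketch assumes that the scan is already limited to the face, that the y axis is vertical, and that the interval [1/3, 2/3] refers to the vertical extent of the face; the text could also be read as an interval over vertex indices. The function name `middle_third` is hypothetical.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]  # (x, y, z), with y assumed vertical


def middle_third(vertices: List[Vertex]) -> List[Vertex]:
    """Keep the vertices whose height lies in the middle third of the face.

    Assumption: the thirds are measured along y between the lowest and
    highest vertex of the scan.
    """
    if not vertices:
        return []
    ys = [v[1] for v in vertices]
    y_min, y_max = min(ys), max(ys)
    height = y_max - y_min
    lower = y_min + height / 3.0
    upper = y_min + 2.0 * height / 3.0
    return [v for v in vertices if lower <= v[1] <= upper]


if __name__ == "__main__":
    # Ten vertices stacked vertically at y = 0, 1, ..., 9.
    column = [(0.0, float(i), 0.0) for i in range(10)]
    print(middle_third(column))
```

With real scans, a pose-normalization step would normally be needed before the vertical axis can be used in this way.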
The regions of interest (eyes and nose) are automatically detected for frontal scans and for scans with pose variation. We choose to work on these regions, which lie above the mouth, to avoid expression variations. This part is called the static part because it is only weakly affected by variations of facial expression, whereas the lower part, which includes the mouth, is strongly affected. In this part we detect the keypoints that serve as the foundation for the most prominent features composing the characteristic vector.
4.2 3D Face Preprocessing
Data preprocessing is a very important and nontrivial step in biometric recognition systems. We preprocess a 3D face to minimize the effects of the intrinsic noise of the face scanner, such as holes and spikes, and to remove undesired parts such as clothes, neck, ears and hair. This step is common to all situations involving 3D scans. Our preprocessing has three stages: segmentation, hole filling with Delaunay triangulation, and denoising with a median filter. The details of these stages are given in the following.
- Segmentation
The first step of the preprocessing is the segmentation, in which we split the 3D face into two areas. We rely on the anthropometric proportions of the face to detect the region of interest that we need (the eyes and the nose), that is, the second part of the face. Then, we use a cropping filter to extract the ROI. Figure 13 shows the regions of interest that we work on. These regions are unaffected by the variations of facial expression.
- Filling holes and Delaunay triangulation
The holes are created by the absorption of the laser in dark areas, such as eyebrows and mustaches, by self-occlusion or by open mouths [17]. These holes and the noise in a 3D scan reduce the accuracy of the recognition system, which is why it is important to fill the holes in the input mesh.
However, building a triangular mesh is time-consuming and raises many issues, so we apply Delaunay triangulation (Fig. 14) to fill the holes in the mesh of the extracted face region and to place the points on a regular grid. This technique is not only fast and simple, but also solves the problem efficiently.
- Denoising
In this stage, we use a median filter to remove the undesirable spikes and clutter caused by the 3D scanner (Fig. 15). This filter may also suppress small genuine variations of the 3D surface.
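A minimal sketch of such a filter is given below, assuming that the depth values have already been resampled onto a regular grid (a list of rows). The window radius and the function name `median_filter` are illustrative choices, not parameters reported in this chapter.

```python
from statistics import median
from typing import List


def median_filter(depth: List[List[float]], radius: int = 1) -> List[List[float]]:
    """Replace each depth value by the median of its square neighbourhood.

    Windows are clipped at the borders, so edge cells use fewer samples.
    """
    rows, cols = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for i in range(rows):
        for j in range(cols):
            window = [
                depth[a][b]
                for a in range(max(0, i - radius), min(rows, i + radius + 1))
                for b in range(max(0, j - radius), min(cols, j + radius + 1))
            ]
            out[i][j] = median(window)
    return out


if __name__ == "__main__":
    # A flat patch with a single spike in the centre.
    patch = [[1.0, 1.0, 1.0], [1.0, 100.0, 1.0], [1.0, 1.0, 1.0]]
    print(median_filter(patch))
```

The isolated spike is removed because it never forms the majority of any window, which is also why fine surface detail smaller than the window can be lost.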
Finally, we perform a mesh heat diffusion flow in order to eliminate as much of the undesirable noise as possible (Fig. 16).
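The exact diffusion scheme is not specified here. One simple stand-in is explicit uniform (umbrella) Laplacian smoothing, in which each vertex moves a fraction of the way towards the mean of its neighbours at every iteration. The sketch below illustrates that substitute only; the step size, the iteration count and the name `diffuse` are assumptions.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def diffuse(
    points: List[Point],
    neighbours: Dict[int, List[int]],
    step: float = 0.5,
    iterations: int = 10,
) -> List[Point]:
    """Explicit uniform Laplacian smoothing of mesh vertex positions.

    `neighbours` maps a vertex index to the indices of its adjacent
    vertices; vertices without an entry are left where they are.
    """
    current = list(points)
    for _ in range(iterations):
        updated = []
        for i, p in enumerate(current):
            ring = neighbours.get(i, [])
            if not ring:
                updated.append(p)
                continue
            # Mean position of the adjacent vertices.
            mean = tuple(sum(current[j][k] for j in ring) / len(ring) for k in range(3))
            # Move the vertex part of the way towards that mean.
            updated.append(tuple(p[k] + step * (mean[k] - p[k]) for k in range(3)))
        current = updated
    return current


if __name__ == "__main__":
    # Three vertices in a row; the middle one carries a spike in z.
    pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 5.0), (2.0, 0.0, 0.0)]
    print(diffuse(pts, {1: [0, 2]}, step=0.5, iterations=1))
```

A cotangent-weighted Laplacian would follow the surface geometry more faithfully; uniform weights are used here only to keep the sketch short.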
4.3 3D Face Characterization
Craniofacial measurements are used in different fields: in sculpture to create well-proportioned facial ideals, in anthropology to analyze prehistoric human remains and, more recently, in computer vision (estimation of the orientation of the head, detection of characteristic points of a face) and in computer graphics to create parametric models of human faces.
In this paper, we propose an approach to 3D face characterization using anthropometric measures based on seven salient landmarks: left inner eye corner, left outer eye corner, right inner eye corner, right outer eye corner, centre of the root of the nose, nose tip and forehead point.
We extract the desired region of interest (eyes and nose), which is unaffected by variations of pose, occlusions and facial expressions. We focus on the third phase of a facial recognition system: the characterization. This step is the most important and interesting step in any recognition system. It takes place after the detection and preprocessing phases and is based on an analysis of the 3D shape image.
After detecting the points of interest, we build our feature vector, which is composed of:
- Entropy of the region of interest;
- Euclidean distances;
Figure 17 shows the different Euclidean distances suggested in our approach as a feature to characterize the 3D face.
Eqs. (1)–(5) give, respectively, the length of the right eye, the length of the left eye, the length of the root of the nose, the distance between the eyes and the distance between the eyes and the nose. These features are calculated as Euclidean distances between the keypoints detected in the region of interest. We extract the coordinates of each point on the three axes (x, y, z). The Euclidean distance between two points is then the square root of the sum of the squares of the differences between corresponding coordinates.
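The computation can be sketched as follows. The keypoint labels A to H are those listed further down in this section. The pairing of keypoints with each of the five distances is not spelled out in the text, so the pairs used below, as well as the choice of midpoints for the derived points E, F and G, are assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float, float]


def euclidean(p: Point, q: Point) -> float:
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def midpoint(p: Point, q: Point) -> Point:
    return tuple((a + b) / 2.0 for a, b in zip(p, q))


def distance_features(kp: Dict[str, Point]) -> Dict[str, float]:
    """Five distances from keypoints A, B, C, D and H.

    Assumed pairing: right eye = A-B, left eye = C-D, nose root = B-C,
    eye-to-eye = E-F, eyes-to-nose = G-H, with E, F and G taken as
    midpoints of A-B, C-D and B-C respectively.
    """
    e = midpoint(kp["A"], kp["B"])  # centre of the right eye (assumed midpoint)
    f = midpoint(kp["C"], kp["D"])  # centre of the left eye (assumed midpoint)
    g = midpoint(kp["B"], kp["C"])  # centre of the nose root (assumed midpoint)
    return {
        "right_eye_length": euclidean(kp["A"], kp["B"]),
        "left_eye_length": euclidean(kp["C"], kp["D"]),
        "nose_root_length": euclidean(kp["B"], kp["C"]),
        "inter_eye_distance": euclidean(e, f),
        "eyes_to_nose_distance": euclidean(g, kp["H"]),
    }


if __name__ == "__main__":
    # Synthetic coordinates, not taken from any database.
    keypoints = {
        "A": (0.0, 0.0, 0.0), "B": (2.0, 0.0, 0.0),
        "C": (4.0, 0.0, 0.0), "D": (6.0, 0.0, 0.0),
        "H": (3.0, -4.0, 0.0),
    }
    print(distance_features(keypoints))
```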
- Angles between keypoints
Figure 18 shows the angles between the keypoints. Here, (a) is the angle between the nose tip, the left outer eye corner and the right outer eye corner, (b) is the angle between the nose tip, the left inner eye corner and the right inner eye corner, (c) is the angle between the nose tip, the right inner eye corner and the right outer eye corner, (d) is the angle between the nose tip, the left inner eye corner and the left outer eye corner and (e) is the angle between the nose tip, the centre of the root of the nose and the centre of the forehead.
We use the inverse cosine (arccosine), which returns the angle whose cosine is the specified number, to determine the angles between the landmarks in degrees.
This function calculates the angle between two three-dimensional vectors, as shown above.
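A minimal sketch of this computation is given below. The text lists three points per angle without stating which one is the vertex; the function therefore takes the vertex explicitly, and using the nose tip as the vertex for angles (a) to (d) is an assumption. The name `angle_at` is hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float, float]


def angle_at(vertex: Point, p: Point, q: Point) -> float:
    """Angle in degrees at `vertex` between the vectors vertex->p and vertex->q."""
    u = tuple(a - b for a, b in zip(p, vertex))
    v = tuple(a - b for a, b in zip(q, vertex))
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("angle is undefined when a point coincides with the vertex")
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


if __name__ == "__main__":
    # Synthetic points forming a right angle at the origin.
    print(angle_at((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

The clamping step matters in practice: for nearly collinear landmarks the computed cosine can exceed 1 by a rounding error, which would otherwise raise a domain error.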
As shown in Fig. 19, noses can take many different shapes (dorsal hump, long nose, tension tip, over projected tip, bulbous tip and under projected tip). Therefore, we suggest adding another primitive, namely the angle between the nose tip, the centre of the root of the nose and the centre of the forehead (Fig. 18e). This angle differs considerably from one individual to another, depending on the shape of the nose.
The principal keypoints detected are as follows:
- A: Right outer eye corner.
- B: Right inner eye corner.
- C: Left inner eye corner.
- D: Left outer eye corner.
- H: Nose tip.
- I: Centre of the forehead.
The keypoints E, F and G are calculated from the principal keypoints:
- E: Centre of the right eye.
- F: Centre of the left eye.
- G: Centre of the root of the nose.
- Width of forehead;
Both the GAVAB and the FRAV 3D databases contain a variety of scans across people. Each person has individual properties and a specific face shape (see Fig. 20). We propose adding a specific characteristic, the width of the forehead, to the feature vector. This feature helps us to determine the form of the face.
We begin by extracting the face within a circle centred on the nose tip (Fig. 22). Then, applying Eqs. 6 and 7, we calculate the width of the forehead as an arc of the circle. In Fig. 21, L is the arc of the circle.
where:
- r = Radius: distance between the nose tip and the first vertex of the 3D face.
- θ = Angle (in radians) between nose tip, left outer eye corner and right outer eye corner (Fig. 18a).
- L = Length of the arc = Width of the forehead.
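For reference, the standard arc-length relation that matches these definitions is given below. It is offered as a reading consistent with the symbols r, θ and L, not as a verbatim copy of Eqs. 6 and 7. The expression for θ assumes that the angle is measured at the nose tip H between the directions to the outer eye corners D and A.

```latex
L = r\,\theta \quad (\theta \text{ in radians}),
\qquad
\theta = \arccos\!\left(
  \frac{\overrightarrow{HD}\cdot\overrightarrow{HA}}
       {\lVert\overrightarrow{HD}\rVert\,\lVert\overrightarrow{HA}\rVert}
\right)
```

If θ is expressed in degrees instead, the equivalent form is L = 2πr · θ/360.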
The advantage of our approach is that we work only in the upper part of the face, so we need only the half circle to extract the pertinent characteristics of the 3D face. These characteristics are then fed into a standard classifier, such as a support vector machine (SVM), to handle the classification.
5 Experiments and Results
We test our algorithm on the GAVAB and FRAV 3D face databases. Experimental results show that the proposed method is comparable to state-of-the-art methods in terms of its robustness, flexibility and accuracy (Table 1).
We work with geometric measurements because they are permanent and highly perceptible. These signatures are the most discriminative and consistent in dealing with pose variation, occlusions, missing data and facial expressions. Furthermore, to our knowledge, the literature contains no 3D biometric system based on the fusion of these features.
Accordingly, the major contribution of this paper is a new geometric-feature approach for 3D face recognition systems that handles the challenge of facial expression.
Moreover, in our fully automatic approach, we work only in the upper part of the 3D face. We succeed in extracting this region in 100 % of cases. Therefore, the approach is not only permanent, but also fast in execution time.
6 Conclusion and Perspectives
In our approach, we use anthropometric measurements to extract the pertinent primitives of the 3D face. We focus on the upper face because this region is unaffected by variations of facial expression. We use seven salient points that are unaffected by variations of pose, occlusions and facial expressions. We use the Euclidean distances and the angles between these points to build the feature vector of the 3D face. This vector is necessary in a 3D face recognition system.
Future evaluations should be carried out on more challenging datasets with an increased number of images, also incorporating rotations and translations to improve the success rate.
References
Bockeler M, Zhou X (2013) An efficient 3D facial landmark detection algorithm with haar-like features and anthropometric constraints. In: International conference of the Biometrics Special Interest Group (BIOSIG), pp 1–8. IEEE
Vezzetti E, Marcolin F, Fracastoro G (2014) 3D face recognition: An automatic strategy based on geometrical descriptors and landmarks. Robot Auton Syst 62(12):1768–1776
Papazov C, Marks TK, Jones M (2015) Real-time 3D head pose and facial landmark estimation from depth images using triangular surface patch features. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4722–4730
Papazov C, Marks TK, Jones M (2015) Real-time 3D head pose and facial landmark estimation from depth images using triangular surface patch features. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4722–4730
Creusot C, Pears N, Austin J (2013) A machine-learning approach to keypoint detection and landmarking on 3D meshes. Int J Comput Vision 102(1–3):146–179
Böer G, Hahmann F, Buhr I et al (2015) Detection of facial landmarks in 3D face scans using the discriminative generalized hough transform (DGHT). Bildverarbeitung für die Medizin. Springer, Berlin, pp 299–304
Ballihi L, Ben Amor B, Daoudi M et al (2012) Boosting 3-D-geometric features for efficient face recognition and gender classification. IEEE Trans Inf Forensic Secur 7(6):1766–1779
Ramadan RM, Abdel-Kader RF (2012) 3D Face compression and recognition using spherical wavelet parametrization. Int J Adv Comput Sci Appl 3(9)
Hiremath PS, Hiremath M (2013) 3D face recognition based on deformation invariant image using symbolic LDA. Int J 2(2)
Han X, Yap MH, Palmer I (2012) Face recognition in the presence of expressions. J Softw Eng Appl 5:321–329
Berretti S, Werghi N, Del Bimbo A, Pala P (2013) Matching 3D face scans using interest points and local histogram descriptors. Comput Graph 37(5):509–525
Fang T, Zhao X, Ocegueda O et al (2011) 3D facial expression recognition: a perspective on promises and challenges. In: IEEE international conference on automatic face and gesture recognition and workshops, pp 603–610
Pinto SCD, Mena-Chalco JP, Lopes FM et al (2011) 3D facial expression analysis by using 2D and 3D wavelet transforms. In: 18th IEEE international conference on image processing (ICIP), pp 1281–1284
Hatem H, Beiji Z, Majeed R et al (2013) Nose tip localization in three-dimensional facial mesh data. Int J Adv Comput Technol 5(13):99
Zhang Y (2014) Contribution to concept detection on images using visual and textual descriptors (Doctoral dissertation, Ecully, Ecole centrale de Lyon)
Grgic M, Delac K (2013) Face recognition homepage. Zagreb, Croatia (www.face-rec.org/databases), 324. Accessed 1 May 2016
Drira H, Ben Amor B, Srivastava A et al (2013) 3D face recognition under expressions, occlusions, and pose variations. IEEE Trans Pattern Anal Mach Intell 35(9):2270–2283
Copyright information
© 2017 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Sghaier, S., Souani, C., Faeidh, H., Besbes, K. (2017). Improved Approach for 3D Face Characterization. In: Dey, N., Santhi, V. (eds) Intelligent Techniques in Signal Processing for Multimedia Security. Studies in Computational Intelligence, vol 660. Springer, Cham. https://doi.org/10.1007/978-3-319-44790-2_13
DOI: https://doi.org/10.1007/978-3-319-44790-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44789-6
Online ISBN: 978-3-319-44790-2
eBook Packages: Engineering, Engineering (R0)