1 Introduction

Face recognition is one of the most rapidly developed biometric techniques among identification techniques such as fingerprint recognition, hand geometry, handwriting verification, signature verification, retina scan, voice recognition and iris scanning. One of the main advantages of face recognition is that it does not require the collaboration of the individual. The human face is a biometric trait that can easily be used for identification or authentication in various security systems, without passers-by even being aware of the system. In traditional identification systems, a key or password has to be remembered, which can be forgotten with the passage of time or stolen by an intruder; this is not possible with biometrics, as they are integral to a person. Even after decades of research in the face recognition field, face recognition systems are not fully reliable due to spoofing attacks. A face liveness module can be integrated into face recognition systems to improve their reliability and efficacy against such attacks. In this research, three liveness indicators have been utilized for greater precision and reliability.

A face recognition system acquires the facial image of a candidate, normalizes or enhances that image, and extracts and compares the facial features of the image against the features stored in a database for recognition. Face recognition systems can generally be categorized into holistic methods and feature-based methods [1]. Each of these methods has certain advantages as well as disadvantages, and a particular method can be selected depending upon specific requirements. Many accurate and efficient face recognition systems are available, but they have low reliability due to the problem of spoofing. Face recognition algorithms should be capable of differentiating a live face from a non-authentic face [2]. Face spoofing is an attack in which an intruder tries to bypass the face recognition system by interacting with it directly, like a normal user.

Based on the type of liveness indicator, face liveness detection can be separated into three main categories: (a) motion analysis, (b) texture analysis and (c) life-sign detection [3]. Motion analysis relies on motion features, texture analysis uses texture descriptors, and life-sign detection relies on biometrics such as eye blinking and head and lip movement. A pose estimation algorithm proposed for comparing a user's head movements against given instructions is time consuming, cumbersome and requires the user's collaboration [4]. A 3D model provides 3D face information, but its limitations are rigidity and lack of physiological information. The human vision system can easily identify physiological liveness cues such as variation in facial expression and head rotation, but computing these cues remains very complicated for a machine [5]. A 3D scanner has been used based on the 3D structural properties of a live face [6]; the main drawback of this method is increased system cost, as an expensive 3D optoelectronic sensor is required.

A sparse representation method, a collaborative representation method and a combination of these methods have been proposed for lookalike faces [7]. Features were extracted from facial depth and texture for lookalike faces from a single 2D frontal image; the authors mainly assumed that facial depth cannot be altered by lookalike faces. Another method for liveness detection, based on optical flow fields, has also been proposed [8]. The difference between the optical flow fields generated by 2D planar objects and 3D objects was analysed by considering properties such as moving forwards/backwards, rotation and translation. This method was affected by background sensitivity and illumination changes. Eye blinking has been used for detecting liveness with a Conditional Random Fields (CRF) model [9]. In another method, Local Binary Patterns (LBP) were implemented and the texture pattern analysed [10]. A Dynamic Mode Decomposition (DMD) algorithm has been proposed along with local binary patterns and support vector machines (SVMs) for detecting face spoofing [11]. The CRF model has also been compared with a discriminative cascaded AdaBoost model and a generative HMM model [12].

Most researchers have utilized eyeblink, as it is an essential function of the eyes. For more effective and reliable face liveness detection, combined biometric traits can be utilized instead of a system based on capturing eyeblink alone. In the present work, a robust anti-spoofing face liveness detection algorithm is proposed that considers three liveness indicators simultaneously: eyeblink, lip movement and chin movement, recorded in the In-House dataset for an enhanced and secure face recognition system. Changes in consecutive frames are analysed to detect the motion that ensures face liveness: the proposed anti-spoofing method compares a static background frame with the current frame of a video sequence on a pixel-by-pixel basis, as sketched below. For testing and result analysis, three datasets have been used. Eyeblink has been captured in the first two datasets, while eyeblink, lip movement and chin movement have been captured simultaneously in the In-House dataset, and the experimental results have been validated.
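A minimal Python sketch of this frame-differencing step, assuming OpenCV is available; the clip filename and the use of the first frame as the static background are illustrative assumptions, not details taken from the paper:

```python
import cv2

def difference_frame(background_gray, frame):
    """Absolute pixel-wise difference between the static background
    frame and the current frame (both converted to grayscale)."""
    current_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.absdiff(background_gray, current_gray)

cap = cv2.VideoCapture("candidate_clip.avi")  # e.g. one dataset clip
ok, first = cap.read()                        # assumption: first frame
background = cv2.cvtColor(first, cv2.COLOR_BGR2GRAY)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    diff = difference_frame(background, frame)  # per-pixel change map
    # regions of `diff` are then analysed for eye and lip/chin motion
    # as described in Sect. 3
cap.release()
```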

2 Datasets Used

2.1 ZJU Eyeblink Dataset

This dataset is publicly available on the web [5]. It contains 80 video clips of 20 candidates in AVI format, with four clips per candidate. The first clip is the front view of the candidate without glasses, the second is the front view with thin-rim glasses, the third is the front view with black-frame glasses, while in the fourth clip the candidate is looking upwards without glasses.

2.2 Print-Attack Replay Dataset

This dataset has been developed by the Idiap Research Institute in Switzerland. It contains 200 video clips of 50 clients in which photo attacks have been attempted, along with 200 real-access videos of the same clients. The data has been split into three subsets: the training set holds 30% of the data, the development set 30%, and the test set the remaining 40%, with different clients in each subset. Higher-resolution photos of the users were captured under the same lighting conditions to generate the photo attacks [13].

2.3 In-House Dataset

For liveness detection based on eyeblink, lip movement and chin movement, the In-House dataset has been developed by us at Sant Longowal Institute of Engineering and Technology (SLIET). This dataset consists of 65 video clips of students of the Department of Computer Science and Engineering, SLIET, captured under different lighting conditions. Videos have been acquired with a Sony Handycam SR 68E at a frame rate of 25 fps for at least 10 s. High-resolution photos of valid users were captured under the same lighting conditions in a computer lab to generate photo attacks. The dataset is divided into two categories: the first consists of videos of authentic users, while the other consists of pictures of valid users held by an attacker.

3 Face Liveness Detection Algorithm

A live person can move the entire face as well as facial features such as the eyes and lips. Opening and closing of the eyes and movement of the lips and chin are extracted in the proposed method to test the liveness of a person. The average intensity is calculated to detect the changes that take place across frames, for both the eye region (A) and the lip and chin region (B). Finally, the average intensity is compared against a threshold value (T) to detect face liveness.

Step 1 The average intensity (E) for eyeblink detection over the eye region (A) of p × q pixels is calculated as

$$E = \frac{1}{p \times q}\sum_{i=1}^{p}\sum_{j=1}^{q} A(i,j)$$
(1)

Step 2 Similarly, the average intensity (M) for lip and chin movement detection over the lip and chin region (B) of s × t pixels is calculated as

$$M = \frac{1}{s \times t}\sum_{i=1}^{s}\sum_{j=1}^{t} B(i,j)$$
(2)

Step 3 Compute the eyeblink liveness flag (Q^E) by comparing the average intensity (E) with the threshold value (T). Liveness is detected if E is greater than or equal to T.

$$Q^{E} = \begin{cases} 1, & \text{if } E \ge T \\ 0, & \text{otherwise} \end{cases}$$
(3)

Step 4 Compute the lip and chin movement liveness flag (Q^M) by comparing the average intensity (M) with the threshold value (T). Liveness is detected if M is greater than or equal to T.

$$Q^{M} = \begin{cases} 1, & \text{if } M \ge T \\ 0, & \text{otherwise} \end{cases}$$
(4)
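The four steps can be summarized in a short Python sketch. The region boundaries, the threshold T, the requirement that both flags respond, and the random stand-in for the frame-difference image are illustrative assumptions, not values from the paper:

```python
import numpy as np

def average_intensity(region: np.ndarray) -> float:
    """Eqs. (1)/(2): mean pixel value over a p x q (or s x t) region."""
    return float(region.astype(np.float64).mean())

def liveness_flag(avg: float, threshold: float) -> int:
    """Eqs. (3)/(4): 1 if the average intensity reaches the threshold."""
    return 1 if avg >= threshold else 0

# Stand-in for the frame-difference image of one 320x240 frame; in
# practice this comes from the pixel-wise comparison sketched in Sect. 1.
diff = np.random.randint(0, 50, size=(240, 320), dtype=np.uint8)

T = 15.0                                  # assumed threshold value
eye_region      = diff[80:140, 60:200]    # region A (p x q pixels, assumed)
lip_chin_region = diff[180:240, 90:170]   # region B (s x t pixels, assumed)

E = average_intensity(eye_region)         # Eq. (1)
M = average_intensity(lip_chin_region)    # Eq. (2)
Q_E = liveness_flag(E, T)                 # Eq. (3): eyeblink liveness
Q_M = liveness_flag(M, T)                 # Eq. (4): lip/chin liveness

# Assumption: requiring both indicators to respond is what defeats the
# eye-mouth photo attack discussed in Sect. 4.
live = bool(Q_E and Q_M)
```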

4 Experimental Results and Discussions

Eyeblink, lip movement and chin movement of authentic users and fake attacks have been tested to detect face liveness. Experimental results show that eyeblink has been effectively detected, as shown in Figs. 1e, 2e and 4e. In Fig. 1, a video clip of a candidate with thin-rim glasses from the ZJU Eyeblink Dataset has been considered.

Fig. 1

Eyeblink detection with glasses (ZJU Eyeblink Dataset)

Fig. 2

Eyeblink detection of valid user (Print-Attack Replay Dataset)

Video clips from the Print-Attack Replay Dataset have been used for Figs. 2 and 3. The photo attack has been successfully detected in Fig. 3, as no motion is detected from the photograph of the authentic user held by the attacker in front of the camera. Video clips from the In-House dataset have been used for Figs. 4 and 5. Eyeblink has been successfully detected as shown in Fig. 4e, and lip movement and chin movement have been successfully detected simultaneously as shown in Fig. 5e. The eye-mouth photo imposter attack, in which the attacker tried to spoof the system by cutting out the eye and lip regions of a valid user's photo, has been successfully detected as shown in Fig. 6 by testing the parallel response of lip movement and chin movement: no chin movement was detected. Table 1 shows the average intensity values for the eye region and the chin region of clips from the three datasets (ZJU Eyeblink Dataset, Print-Attack Replay Dataset and In-House dataset), used for detecting face liveness.

Fig. 3

Photo attack with no motion detected (Print-Attack Replay Dataset)

Fig. 4

Eyeblink detection (In-House dataset)

Fig. 5

Lips and chin motion detected (In-House dataset)

Fig. 6

Eye-mouth photo imposter attack (In-House dataset)

Table 1 Intensity calculations for different datasets

In our proposed method, multiple liveness indicators have been considered: consecutive frames are analysed, with eyeblink, lip movement and chin movement serving as the indicators for liveness detection. The average intensity is calculated for the eye region and the mouth region and compared against the threshold value for motion detection. Face liveness results have been validated using the True Positive Rate (TPR), True Negative Rate (TNR), False Positive Rate (FPR) and False Negative Rate (FNR).

The effectiveness of the proposed algorithm has been demonstrated on three different datasets. A comparison of the True Positive Rate (TPR) obtained by the proposed algorithm on the three datasets with the single indicator (eyeblink) is represented graphically in Fig. 7. Video clips from the different datasets were tested to compute the TPR; the proposed algorithm gives a higher TPR even with a single indicator on the In-House dataset. The TPR is calculated as

$$\mathrm{TPR} = \frac{TP}{TP + FN}$$
(5)

where TP denotes True Positive and FN denotes False Negative. TP is set to 1 when motion is captured in both the input and output frames of a video, while TN (True Negative) is set to 1 when there is no motion in either.
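For illustration, the four validation rates can be computed from per-clip counts as in the Python sketch below; the counts shown are hypothetical, not the paper's results:

```python
def rates(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Validation rates used in Sect. 4; the TPR line is Eq. (5)."""
    return {
        "TPR": tp / (tp + fn),  # Eq. (5): detected motion / actual motion
        "TNR": tn / (tn + fp),  # attacks correctly rejected
        "FPR": fp / (fp + tn),  # attacks wrongly accepted as live
        "FNR": fn / (fn + tp),  # live users wrongly rejected
    }

# Hypothetical counts for a batch of 100 test clips.
print(rates(tp=48, tn=47, fp=3, fn=2))
```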

Fig. 7

Comparison of TPR for eyeblink on different datasets

Table 2 shows the liveness detection rate comparison between the CRF model and the proposed algorithm on the ZJU Eyeblink Dataset. Liveness detection has been computed for both eyes on frontal facial images without glasses, with thin-rim glasses and with black-frame glasses, and on upward images without glasses. The average liveness detection rate for the CRF model is 94.65% for window size W = 3, while the rate is 97.23% with the proposed algorithm.

Table 2 Comparison of liveness detection

The comparison of face liveness detection rates is represented graphically in Fig. 8. The proposed algorithm gives an overall 97.23% liveness detection rate with a single indicator on the ZJU Eyeblink Dataset, outperforming the 94.65% rate of the CRF eyeblink model. The comparison covers images of frontal faces without glasses, with thin-rim glasses and with black-frame glasses, and upward images without glasses.

Fig. 8

Face liveness detection rate with eyeblink

The face liveness detection rate with multiple liveness indicators on the In-House dataset is represented graphically in Fig. 9. The higher liveness detection rate (99.41%) clearly indicates the successful implementation of the proposed algorithm with multiple liveness indicators as compared with a single indicator.

Fig. 9

Liveness detection rate with multiple liveness indicators utilizing In-House dataset

5 Conclusion

This paper investigates multiple liveness indicators for detecting face liveness against spoofing. The face liveness detection algorithm has been successfully implemented, as the physical presence of a person is required for liveness detection. Eyeblink, lip movement and chin movement, being essential biometric traits of the human face, are used: the average intensity is calculated for each region and finally compared with the threshold value to detect face liveness. The reliability of face liveness detection has been ensured by mounting different attacks from the three datasets. The method effectively detects the eye-mouth photo imposter attack, as the lip movement and chin movement responses are computed simultaneously; chin movement has been detected even with very slight whispering of the lips. The face liveness detection algorithm gives a 99.41% face liveness detection rate with multiple liveness indicators on the In-House dataset.