1 Introduction

Facial recognition is one of the most popular topics in visual recognition. It is becoming increasingly present in our daily lives owing to its wide range of applications, such as security, home automation and photo identification on social networks.

Facial expression recognition follows the same principle as face recognition. As pattern recognition problems, both consist of two main parts: feature extraction and classification. Feature extraction plays a crucial role in the recognition stage: the richer the extracted features, the more accurate the classification. Over the last few decades, many features have been developed, among which Local Binary Patterns (LBP) are the most popular and successful. Owing to the simplicity and efficiency of LBP, a large number of variants have been proposed, focusing on various aspects such as pixel neighborhood topology, thresholding and quantization, encoding, and the grouping of complementary features.

This paper focuses on the neighborhood topology of the LBP variants dedicated to face and facial expression recognition. Five new LBP-like descriptors are introduced and analysed: Doubled Local Binary Pattern (d-LBP), Reduced Divided Local Binary Pattern (RedDLBP), Median Block Local Binary Pattern (MedBLBP), Divided Block Local Binary Pattern (DBLBP) and Divided Median Block Local Binary Pattern (DMedBLBP). The proposed descriptors are compared with some existing variants on facial expression recognition, using the JAFFE [1] database, and on face recognition, using the YaleB [2] and ORL [3] databases. Moreover, the noise robustness of the methods is evaluated.

The rest of the paper is organized as follows. Section 2 introduces the general concepts of the LBP process, followed by a presentation of various existing variants adapted to face recognition and facial expression recognition. Section 3 presents the proposed variants. Experiments and the comparison of the proposals with the existing approaches are presented in Sect. 4. Finally, Sect. 5 discusses the study and concludes the paper.

2 Local Binary Pattern Features and Variants

2.1 Basics of Local Binary Pattern

The original LBP was proposed by Ojala et al. in 1996 [4]. It describes each pixel of an image using the \(3\times 3\) neighborhood around it. The value of the central pixel is subtracted from that of each of its eight neighbors. If the resulting difference is negative, the neighbor is assigned ‘0’; otherwise it is assigned ‘1’. The eight bits are then concatenated into an 8-bit code, i.e. an integer ranging from 0 to 255. The original LBP is defined by Eq. (1) and illustrated in Fig. 1.

Fig. 1. Basic LBP operator

$$\begin{aligned} LBP_{8,1} = \sum _{p=0}^{7}S(g_{p}-g_{c})2^{p} \end{aligned}$$
(1)

where \(S(x)=0\) if \(x<0\) and \(S(x)=1\) if \(x\ge 0\), \(g_{p}\) is the value of the \(p^{th}\) neighboring pixel and \(g_{c}\) is the value of the central pixel.
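As an illustration of Eq. (1), the following Python sketch computes the \(LBP_{8,1}\) code of one interior pixel. The neighbor ordering (clockwise from the top-left pixel) and the names `OFFSETS` and `lbp_8_1` are assumptions made for this example; the paper does not fix them.

```python
# (row, col) offsets of the 8 neighbours. The ordering p = 0..7
# (clockwise from the top-left pixel) is an assumption of this sketch.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_8_1(image, r, c):
    """Return the LBP_{8,1} code of the interior pixel (r, c), as in Eq. (1)."""
    g_c = image[r][c]
    code = 0
    for p, (dr, dc) in enumerate(OFFSETS):
        g_p = image[r + dr][c + dc]
        if g_p - g_c >= 0:      # S(x) = 1 if x >= 0, else 0
            code += 2 ** p
    return code


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_8_1(patch, 1, 1)     # bits 1,0,0,0,1,1,1,1 for p = 0..7
```

For this sample patch the code is 1 + 16 + 32 + 64 + 128 = 241.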

With the basic LBP operator, dominant features with a large-scale structure cannot be captured because of the small neighborhood. Hence, a variant was introduced [4] that extends the neighborhood to (P, R), i.e. P sampling points symmetrically arranged on a circle of radius R, as illustrated in Fig. 2.

Fig. 2. Examples of the extended LBP’s with different (P, R)
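The (P, R) neighborhood can be sketched as follows. Sampling points that do not fall on pixel centers are estimated here by bilinear interpolation, which is a common choice but an assumption of this example, as are the starting angle (p = 0 to the right of the center), the counter-clockwise direction and all function names.

```python
import math


def sample_points(P, R):
    """(dy, dx) offsets of P points evenly spaced on a circle of radius R."""
    return [(round(-R * math.sin(2 * math.pi * p / P), 9),
             round(R * math.cos(2 * math.pi * p / P), 9)) for p in range(P)]


def bilinear(image, y, x):
    """Grey value at the real-valued position (y, x), bilinearly interpolated."""
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1 = min(y0 + 1, len(image) - 1)
    x1 = min(x0 + 1, len(image[0]) - 1)
    fy, fx = y - y0, x - x0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def lbp_p_r(image, r, c, P, R):
    """LBP_{P,R} code of pixel (r, c); the pixel must lie at least R pixels
    away from every image border."""
    g_c = image[r][c]
    return sum(2 ** p for p, (dy, dx) in enumerate(sample_points(P, R))
               if bilinear(image, r + dy, c + dx) >= g_c)
```

The code length grows with P: a (16, 2) neighborhood, for instance, yields 16-bit codes.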

One of the greatest advantages of the circular \(LBP_{P,R}\) neighborhood is that rotation-invariant codes can be derived from it. Examples of images processed with \(LBP_{8,1}\) are shown in Fig. 3.

Fig. 3. Examples of LBP images

In order to reduce the size of the original \(LBP_{P,R}\) descriptor, Ojala et al. introduced the concept of uniform patterns [5], which represent about 90% of the patterns encountered. A pattern is uniform if its binary code, considered as circular, contains at most two transitions between ‘0’ and ‘1’. For example, 01110000 (2 transitions) is uniform, whereas 11001001 (4 transitions) is not.
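The uniformity test can be sketched in a few lines of Python. The function names are hypothetical; the transition count is obtained by comparing the code with a copy of itself rotated by one bit.

```python
def transitions(code, bits=8):
    """Number of 0/1 transitions in the binary code, read circularly."""
    # Circular right shift by one position.
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    # Each bit set in the XOR marks two adjacent positions that differ.
    return bin(code ^ rotated).count("1")


def is_uniform(code, bits=8):
    """A pattern is uniform if it has at most two circular transitions."""
    return transitions(code, bits) <= 2


n_uniform = sum(is_uniform(c) for c in range(256))
```

With 8 bits, 58 of the 256 possible codes are uniform, so a histogram with one bin per uniform pattern plus one bin for all the others needs only 59 bins.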

2.2 Local Binary Pattern Variants for Face Analysis

Several other variants have been developed to improve performance, focusing on different aspects, e.g. MB-LBP [6], MQLBP [7] and many others. In this subsection, we review four LBP variants specifically adapted to facial analysis.

2.2.1 Multi-Block Local Binary Pattern (MB-LBP)

MB-LBP was proposed in 2007 by L. Zhang et al. [6]. It uses the LBP principle but applies it to the mean values of the surrounding blocks instead of single pixels. Blocks of size \(2\times 3\) are used in [8]: the mean value of each surrounding block is compared with that of the central block, and the block is assigned ‘0’ if its mean is lower and ‘1’ otherwise. By using mean values, MB-LBP is more robust to noise and more stable for face analysis than LBP.
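A minimal sketch of the MB-LBP comparison is given below, assuming a \(3\times 3\) grid of equally sized blocks visited clockwise from the top-left block. The ordering and the names `GRID`, `block_mean` and `mb_lbp` are assumptions of this example.

```python
# (row, col) positions of the 8 surrounding blocks in the 3x3 grid of blocks,
# clockwise from the top-left block (assumed ordering).
GRID = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def block_mean(image, top, left, h, w):
    """Mean grey value of the h-by-w block whose top-left pixel is (top, left)."""
    values = [image[r][c]
              for r in range(top, top + h) for c in range(left, left + w)]
    return sum(values) / len(values)


def mb_lbp(image, top, left, h, w):
    """MB-LBP code of the 3x3 grid of h-by-w blocks starting at (top, left)."""
    centre = block_mean(image, top + h, left + w, h, w)
    code = 0
    for p, (i, j) in enumerate(GRID):
        if block_mean(image, top + i * h, left + j * w, h, w) >= centre:
            code += 2 ** p
    return code
```

With `h = 2` and `w = 3` this corresponds to the block size quoted above; with `h = w = 1` it reduces to the basic LBP comparison on single pixels.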

2.2.2 Median Local Binary Pattern (MBP)

Proposed by Hafiane et al. [9], MBP compares each pixel of the \(3\times 3\) neighborhood with the median value of the neighborhood and assigns ‘0’ if it is lower and ‘1’ otherwise. The central pixel is included in the code, which yields a 9-bit code.
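The MBP encoding can be sketched as follows. The row-major bit ordering and the function name `mbp` are assumptions made for illustration.

```python
from statistics import median


def mbp(patch):
    """9-bit MBP code of a 3x3 patch given as a list of three rows."""
    # Row-major order, centre pixel included (bit ordering is assumed).
    pixels = [v for row in patch for v in row]
    m = median(pixels)
    return sum(2 ** p for p, v in enumerate(pixels) if v >= m)


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = mbp(patch)   # median is 6, bits 1,0,0,1,1,0,1,1,1 for p = 0..8
```

For this patch the code is 1 + 8 + 16 + 64 + 128 + 256 = 473; in general it ranges from 0 to 511.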

2.2.3 Divided Local Binary Pattern (DLBP)

Hua et al. [10] proposed the Divided Local Binary Pattern (DLBP), which breaks the LBP code down into two parts: one for the even indices of the neighborhood and the other for the odd indices. This reduces the range of the data, since each code is 4 bits long instead of 8.
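The even/odd split can be sketched as below. Which neighbors carry even indices depends on the neighbor ordering, which is assumed here to run clockwise from the top-left pixel; the names are hypothetical.

```python
# Assumed neighbour ordering: clockwise from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
           (1, 1), (1, 0), (1, -1), (0, -1)]


def dlbp(image, r, c):
    """Return the two 4-bit DLBP codes (even indices, odd indices)."""
    g_c = image[r][c]
    bits = [1 if image[r + dr][c + dc] >= g_c else 0 for dr, dc in OFFSETS]
    even = sum(b << k for k, b in enumerate(bits[0::2]))   # p = 0, 2, 4, 6
    odd = sum(b << k for k, b in enumerate(bits[1::2]))    # p = 1, 3, 5, 7
    return even, odd


patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
codes = dlbp(patch, 1, 1)
```

Each code lies between 0 and 15, so two 16-bin histograms replace the single 256-bin histogram of the basic LBP. For the sample patch the two codes are 13 and 12.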

2.2.4 Multi-quantized Local Binary Patterns (MQLBP)

Proposed by Patel et al. [7], MQLBP extends Local Ternary Patterns (LTP) [11]. The main idea is to split the pixel differences into 2L levels instead of two, using a set of 2L − 1 thresholds. It utilizes both the sign and the magnitude of the difference between the central pixel and the surrounding ones. Each quantized level is encoded separately to generate multiple local binary patterns. The basic LBP corresponds to the case L = 1.

3 Proposed Variants

After analysing the strengths and weaknesses of existing methods, new variants are proposed. First, we observed that capturing more global features could give better information about the pixel environment. So the neighborhood is extended by enlarging the radius R or by adding another radius. To capture more global features, a group of pixels is used instead of a single pixel: this way, one can reduce the noise effect too. The proposed new variants are introduced hereafter.

3.1 Doubled Local Binary Pattern (d-LBP)

It extends the neighborhood of the basic LBP by using two rings of radii 1 and 3. d-LBP thus uses two neighborhoods of 8 pixels each, as illustrated in Fig. 4(a). This leads to two LBP codes and two local histograms, which are concatenated.
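A possible reading of d-LBP is sketched below. It assumes that the 8 neighbors of each ring are taken along the 8 principal directions at a distance scaled by the radius, so that no interpolation is needed; the exact sampling positions, the neighbor ordering and the names are assumptions of this example.

```python
# Eight principal directions, clockwise from the top-left (assumed ordering).
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def ring_code(image, r, c, radius):
    """8-bit code of the ring located `radius` pixels away from (r, c)."""
    g_c = image[r][c]
    return sum(2 ** p for p, (dr, dc) in enumerate(DIRECTIONS)
               if image[r + radius * dr][c + radius * dc] >= g_c)


def d_lbp(image, r, c, radii=(1, 3)):
    """Return one LBP code per ring; (r, c) must be at least max(radii)
    pixels away from every image border."""
    return tuple(ring_code(image, r, c, radius) for radius in radii)
```

Each of the two codes feeds its own 256-bin histogram, and the two histograms are then concatenated.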

3.2 Reduced Divided Local Binary Pattern (RedDLBP)

This variant uses a radius of 2 and a neighborhood of 6 pixels. The neighborhood is then split into two groups, as shown in Fig. 4(b): this gives two 3-bit codes and hence two histograms, which are concatenated to form the descriptor.

Fig. 4. Operating principle of: a) d-LBP and b) RedDLBP
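The RedDLBP encoding can be sketched as follows, starting from six grey values already sampled on the ring of radius 2. The sketch assumes that the two groups are formed by alternating neighbors, as in DLBP; the grouping actually used is the one shown in Fig. 4(b), and the function name is hypothetical.

```python
def red_dlbp(neighbours, centre):
    """Return two 3-bit codes from six ring samples given in circular order.

    The alternating grouping (indices 0, 2, 4 and 1, 3, 5) is an assumption.
    """
    bits = [1 if g >= centre else 0 for g in neighbours]
    first = sum(b << k for k, b in enumerate(bits[0::2]))
    second = sum(b << k for k, b in enumerate(bits[1::2]))
    return first, second


codes = red_dlbp([9, 1, 9, 1, 9, 1], centre=5)
```

Each code lies between 0 and 7, so the descriptor of a block holds only 2 × 8 = 16 bins, against 256 for the basic LBP.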

3.3 Median Block Local Binary Pattern (MedBLBP)

Inspired by the Multi-Block Local Binary Pattern (MB-LBP)  [6], it uses the median values of the surrounding blocks instead of mean values. This helps discard abnormal values and reduce the influence of noise.
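The benefit of the median can be seen on a small numerical example. The block values below are made up for illustration: a nearly flat \(3\times 3\) block in which one pixel is saturated, as happens with salt noise.

```python
from statistics import mean, median

# Hypothetical 3x3 block, flattened, with one saturated (noisy) pixel.
block = [52, 50, 51, 49, 255, 50, 51, 50, 52]

block_mean = mean(block)       # pulled upwards by the outlier
block_median = median(block)   # stays at a typical grey value
```

Here the mean rises to about 73.3 whereas the median remains 51, so a comparison based on the median is not affected by the single corrupted pixel.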

3.4 Divided Block Local Binary Pattern (DBLBP) and Divided Median Block Local Binary Pattern (DMedBLBP)

DBLBP and DMedBLBP are derived from a combination of the MB-LBP [6] and DLBP [10] operators. DBLBP uses a block of \(3\times 3\) pixels around the central pixel and 8 surrounding blocks of the same size. The mean values of the blocks are used instead of the pixel values. Finally, the binary code is split into two parts to generate two values, as in DLBP [10]. DMedBLBP operates exactly as DBLBP, except that it uses the median value of each block instead of the mean.
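Both operators can be sketched with a single function whose block statistic is a parameter. The block ordering (clockwise from the top-left block), the even/odd grouping and all names are assumptions of this example.

```python
from statistics import mean, median

# Positions of the 8 surrounding blocks, clockwise from the top-left block
# (assumed ordering).
GRID = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def block_stat(image, top, left, size, stat):
    """Apply `stat` to the size-by-size block whose top-left pixel is (top, left)."""
    return stat([image[r][c] for r in range(top, top + size)
                 for c in range(left, left + size)])


def divided_block_lbp(image, top, left, size=3, stat=mean):
    """Two 4-bit codes for the 3x3 grid of blocks starting at (top, left).

    stat=mean sketches DBLBP, stat=median sketches DMedBLBP.
    """
    centre = block_stat(image, top + size, left + size, size, stat)
    bits = [1 if block_stat(image, top + i * size, left + j * size,
                            size, stat) >= centre else 0
            for i, j in GRID]
    even = sum(b << k for k, b in enumerate(bits[0::2]))
    odd = sum(b << k for k, b in enumerate(bits[1::2]))
    return even, odd
```

Calling `divided_block_lbp(image, top, left, stat=median)` switches from the mean to the median without any other change, which makes the two variants easy to compare on the same data.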

4 Experiments

4.1 Databases

JAFFE: this dataset consists of 213 frontal black and white photos of 10 Japanese women [1]. They were photographed posing 7 different facial expressions: 6 basic facial expressions (happiness, sadness, fear, anger, surprise, disgust) and a neutral expression (see Fig. 5 for some examples). This publicly available database is used for facial expression recognition.

Fig. 5. Some sample images from JAFFE database

YaleB: this database [2] includes 5850 face images of 10 human subjects in 9 poses and 65 illumination conditions. The photos were taken under laboratory-controlled lighting conditions. This database is used for the evaluation of face recognition methods under variable lighting conditions.

ORL: this database consists of 10 different images of each of 40 distinct subjects, representing 7 facial expressions under different conditions: open or closed eyes, with and without smiling, and with and without glasses. All the images were taken against a dark homogeneous background, with the subjects in frontal position with some side movement [3]. It is used for the evaluation of face recognition under variable lighting.

4.2 Experimental Setup

This section describes the methodology used to evaluate the performance of the proposed methods against existing methods. To facilitate the experiments, we developed a MATLAB application named LBP Studio, which simplifies the implementation and testing of new LBP methods: only a single file needs to be edited to generate the target LBP code. The tool uses an SVM as its classification engine. The user can set up the experiments through a friendly graphical interface, as illustrated in Fig. 6.

Fig. 6. Graphical interface of LBP Studio

First, the selected LBP operator is applied to the input image, and the resulting code image is divided into local blocks. Then, a histogram of the LBP values is built for each block. Finally, these histograms are concatenated to form the image descriptor.
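The construction of the descriptor can be sketched as follows. The grid size (`n_rows` by `n_cols` blocks) is a free parameter of this example, since the number of blocks is not specified here, and the function name is hypothetical.

```python
def block_histograms(codes, n_rows, n_cols, n_bins=256):
    """Concatenate the code histograms of an n_rows x n_cols grid of blocks.

    `codes` is the image of LBP codes (list of rows of integers below n_bins).
    """
    height, width = len(codes), len(codes[0])
    descriptor = []
    for i in range(n_rows):
        for j in range(n_cols):
            hist = [0] * n_bins
            for r in range(i * height // n_rows, (i + 1) * height // n_rows):
                for c in range(j * width // n_cols, (j + 1) * width // n_cols):
                    hist[codes[r][c]] += 1
            descriptor.extend(hist)
    return descriptor
```

The descriptor length is `n_rows * n_cols * n_bins`. For operators that produce several codes per pixel, such as d-LBP or DLBP, the same function can be applied to each code image and the results concatenated.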

Experimental Evaluation: experiments are conducted to evaluate the proposed new variants and to compare their performance with those of the existing variants. To do this, each class of the database is randomly divided into a training set and testing set. The recognition rate is then computed as:

$$\begin{aligned} Recognition\ rate = \frac{No.\ of\ images\ classified\ correctly}{Total\ no.\ of\ test\ images} \end{aligned}$$
(2)

To take the variability of the performance into account, this procedure is repeated 100 times on the JAFFE database and 30 times on the YaleB and ORL databases. The comparison is then based on the average rate of all iterations.
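The evaluation protocol can be sketched as below. The classifier is passed in as a callable `fit_predict(train_x, train_y, test_x)`, a hypothetical interface standing in for the SVM of LBP Studio; the training fraction of 0.5 and the seed are also assumptions of this example.

```python
import random


def recognition_rate(predicted, actual):
    """Eq. (2): fraction of test images classified correctly."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def split_per_class(labels, train_fraction, rng):
    """Randomly split the sample indices of every class into train and test."""
    train, test = [], []
    for label in sorted(set(labels)):
        indices = [i for i, l in enumerate(labels) if l == label]
        rng.shuffle(indices)
        cut = int(round(train_fraction * len(indices)))
        train += indices[:cut]
        test += indices[cut:]
    return train, test


def average_rate(features, labels, fit_predict, repeats,
                 train_fraction=0.5, seed=0):
    """Average recognition rate over `repeats` random per-class splits."""
    rng = random.Random(seed)
    rates = []
    for _ in range(repeats):
        train, test = split_per_class(labels, train_fraction, rng)
        predicted = fit_predict([features[i] for i in train],
                                [labels[i] for i in train],
                                [features[i] for i in test])
        rates.append(recognition_rate(predicted, [labels[i] for i in test]))
    return sum(rates) / len(rates)
```

With `repeats=100` for JAFFE and `repeats=30` for YaleB and ORL, this reproduces the number of iterations described above.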

Implementation Parameters of LBP Descriptors: the basic LBP is tested with \(P=8\) and \(R=1\), and then with \(R=2\). The MBP variant is considered with \(P=8\) and \(R=1\), and the MB-LBP, MedBLBP, DBLBP and DMedBLBP variants with blocks of size \(3\times 3\). DLBP uses two layers of \(P=4\) neighbors at a radius of \(R=1\), and RedDLBP two layers of \(P=3\) neighbors at a radius of \(R=2\). Finally, the d-LBP variant is implemented with two layers of \(P=8\) neighbors at radii \(R_1=1\) and \(R_2=3\).

4.3 Results

General Results: Table 1 shows the recognition rates of the proposed and existing operators on the JAFFE [1] database for facial expression recognition, and on the YaleB [2] and ORL [3] databases for face recognition. Table 1 shows the effectiveness of the proposed variants.

Table 1. Comparison of average recognition rates on JAFFE, YaleB and ORL databases. Red color indicates the highest rate, cyan the \(2^{nd}\) and blue the \(3^{rd}\).

The RedDLBP descriptor stands out by obtaining the best expression recognition rate and also has high performance in face recognition. These results show that the proposed variants perform well both for face and facial expression recognition. For the tests on the YaleB database, the d-LBP stands out by obtaining the \(2^{nd}\) best score while DBLBP performs the best on ORL database.

Robustness Against Noise: to evaluate the robustness to noise, the JAFFE database is corrupted with Gaussian noise or with salt-and-pepper noise, with a standard deviation ranging from 0 to 0.5. Examples of images with Gaussian noise and salt-and-pepper noise are shown in Fig. 7.
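The two corruptions can be sketched as follows for images with grey values in [0, 1]. The value range, the clipping and the function names are assumptions of this example. In particular, salt-and-pepper noise is parameterized here by a density (the expected fraction of corrupted pixels); how this relates to the noise level quoted above is not specified.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise of standard deviation sigma, clipped to [0, 1]."""
    return [[min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in row]
            for row in image]


def add_salt_and_pepper(image, density, rng):
    """Replace each pixel, with probability `density`, by 0 or 1 (equally likely)."""
    noisy = []
    for row in image:
        new_row = []
        for v in row:
            if rng.random() < density:
                new_row.append(1.0 if rng.random() < 0.5 else 0.0)
            else:
                new_row.append(v)
        noisy.append(new_row)
    return noisy
```

Passing a seeded generator such as `random.Random(0)` makes a noisy test set reproducible across the descriptors being compared.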

The results are summarized in Table 2 as average recognition rates. Based on these results, we can observe that the basic \(LBP_{8,2}\) is very sensitive to noise: as the noise level increases from 0 to 0.5, its recognition rate decreases from 85.37% to 57.40% for Gaussian noise and to 63.64% for salt-and-pepper noise. The table shows that the proposed DBLBP is the most robust to Gaussian noise. The DMedBLBP also performs well against Gaussian noise and performs the best against salt-and-pepper noise. The DBLBP and MedLBP variants also stand out on salt-and-pepper noise, obtaining respectively the \(2^{nd}\) and the \(3^{rd}\) highest average recognition rates, behind DMedBLBP.

Fig. 7. Original image (left) and image with Gaussian noise (middle) and salt-and-pepper noise (right) of 0.2 standard deviation

Table 2. Average recognition rate for noised JAFFE database. Red color indicates the highest rate, cyan the \(2^{nd}\) and blue the \(3^{rd}\).

5 Conclusion

After analysing several variants of the LBP features, this paper has introduced five new ones. The proposed variants exploit several properties that can help improve performance:

  • taking more pixels in the neighborhood to catch large scale objects,

  • capturing more distant pixels,

  • using blocks of pixels instead of pixels, to reduce the effect of the noise,

  • using the mean value and the median value of the surrounding blocks,

  • splitting the binary code.

Evaluated over the JAFFE, YaleB and ORL databases, the proposed variants perform satisfactorily compared with state-of-the-art variants. These results suggest that taking more global information into account reduces the effect of noise while keeping a good description. The proposed variants have also been evaluated on the FERET database for the challenging task of gender recognition [7]. The preliminary performance appears satisfactory and calls for further investigation. In order to confirm the efficiency of the proposed LBP-like variants in face, facial expression and gender recognition, future work includes extensive experiments over more databases and against various noise types.