1 Introduction

Facial expression recognition has been an active research field for many years. A facial expression recognition system is a computer application that recognizes the expression, i.e. the emotion, of a person from face images. Seven universal facial expressions are considered: surprise, happiness, fear, anger, sadness, disgust and neutral. Facial expression recognition finds applications in security, medicine, computer entertainment, human–computer interaction and related fields.

A facial expression recognition system consists of three main modules: image preprocessing, feature extraction and classification. In the image preprocessing step, face images are acquired and various preprocessing tasks are performed to bring them to a quality suitable for further analysis. In the feature extraction module, valuable features are extracted from the images so that the recognition task can be carried out on them. In the last module, the extracted features are analysed to recognize the facial expression.

There are various methods to recognize facial expressions, such as those based on Gabor filter selection [1, 2], linear discriminant analysis [3,4,5], principal component analysis (PCA) [6,7,8,9] and two-dimensional principal component analysis (2DPCA) [10, 11]. PCA and 2DPCA are among the most popular. PCA operates on one-dimensional vectors, so each image matrix must be transformed into a 1D vector before the covariance matrix is calculated. 2DPCA operates on the 2D image matrix directly, so the covariance matrix can be calculated from the image matrices themselves; eigenvectors and eigenvalues are then computed from this covariance matrix. Both PCA- and 2DPCA-based methods find the covariance of the overall dataset based on the whole image. 2DPCA has several advantages over PCA [10]: the covariance matrix is easier to calculate accurately, the principal eigenvectors are identified more quickly, the computational cost is lower and the recognition rate is better.
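To make the difference in covariance computation concrete, the following NumPy sketch contrasts the two approaches on synthetic data; the image sizes and the random images are placeholders chosen for illustration, not values from this paper.

```python
import numpy as np

# Toy illustration: M greyscale images of size h x w (placeholder values)
M, h, w = 100, 32, 32
imgs = np.random.rand(M, h, w)

# PCA: each image is flattened to a 1D vector first, so the covariance
# matrix has size (h*w) x (h*w), here 1024 x 1024
flat = imgs.reshape(M, -1)
cov_pca = np.cov(flat, rowvar=False)

# 2DPCA: the image scatter matrix is built directly from the 2D images
# and has size w x w, here only 32 x 32
mean = imgs.mean(axis=0)
cov_2dpca = sum((a - mean).T @ (a - mean) for a in imgs) / M
```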

The proposed method concentrates on partitioning the whole image into small sub-images and then performing 2DPCA over these sub-images, because focusing on smaller units, i.e. local details, yields more significant information. Classification is then used to find the expression of an image. 2DPCA achieves a higher recognition rate than PCA. The Japanese Female Facial Expression (JAFFE) dataset has been used in the experiments, and the results show that 2DPCA gives a higher recognition rate when applied to face parts instead of the entire face.

The remainder of this article is organized as follows. The proposed scheme is outlined in Sect. 2. Section 3 contains the experimental details and result analysis. Concluding remarks are given in Sect. 4.

2 The Proposed Method

In the proposed method, a 2DPCA-based transformation is performed to extract features, and a minimum distance classifier is then used to recognize the facial expression. 2DPCA is not applied directly to the face images but to segmented sub-images. The block diagram of the proposed system is shown in Fig. 1.

Fig. 1 Structure of the proposed facial expression recognition system

The three main steps are image preprocessing, feature extraction and classification. Feature extraction comprises two sub-steps: segmentation into sub-images and a 2DPCA-based transformation. The details of each step are discussed below.

2.1 Image Preprocessing

All input images are greyscale and of the same size, covering the seven different expressions. Faces are detected in the images using the Viola–Jones face detection algorithm [12]. The tasks involved in image preprocessing are acquisition of the face images, removal of different types of noise, image enhancement and image registration. In the proposed method, the face images are assumed to be properly preprocessed.
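As a concrete illustration of the detection step, the sketch below uses OpenCV's Haar cascade implementation of Viola–Jones; the paper itself uses MATLAB, and the image file name here is a placeholder.

```python
import cv2

# Viola-Jones face detection via OpenCV's bundled Haar cascade
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("face.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name
faces = cascade.detectMultiScale(img, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    face = img[y:y + h, x:x + w]  # cropped face region for later steps
```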

2.2 Feature Extraction

Features carry the most important information of an image. An image contains a huge number of features, some of which are unwanted, so feature extraction is required to extract the important ones [13]. In this step, features are extracted from each face image, and faces are compared based on these extracted features.

The main aim of the proposed method is to extract meaningful features from faces. For this purpose, each face image is divided into sub-images and a 2DPCA-based transformation is then performed to extract features. As noted above, partitioning the whole image into small sub-images before applying 2DPCA gives more significant information because it focuses on local details. For example, a face image is partitioned into sub-images such as the mouth, nose, left eye and right eye; 2DPCA is then performed over all mouth images, nose images, left eye images and right eye images separately, so the variance is considered on a subgroup basis. The two steps of the proposed feature extraction method are discussed below.

2.2.1 Partitioning the Face Image into Sub-images

Some parts of the face, such as the mouth, nose and eyes, carry more information than others and play an important role in expression recognition, so it is better to operate on these portions instead of the entire face. The mouth, nose and eyes are cropped from the detected face using the computer vision system toolbox detector, i.e. CascadeObjectDetector [12]; these portions are called feature images. The cropped images have different sizes, so they are resized before further processing: the mouth and nose images to 50 × 50 pixels and the left and right eye images to 40 × 40 pixels. The four cropped portions of a face image are shown in Fig. 2, and a cropping sketch is given after the figure.

Fig. 2 Segmented sub-images
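A minimal Python/OpenCV sketch of this cropping step follows. The paper uses MATLAB's CascadeObjectDetector for all four parts; in this sketch only the eye cascade bundled with OpenCV is used, while the mouth and nose regions are approximated by fixed proportions of the detected face box, an assumption made purely for illustration.

```python
import cv2

def crop_parts(face):
    """Crop mouth, nose and eye regions from a detected face (a sketch)."""
    h, w = face.shape[:2]
    # Rough geometric proportions stand in for dedicated part detectors
    mouth = cv2.resize(face[int(0.65*h):h, int(0.25*w):int(0.75*w)], (50, 50))
    nose = cv2.resize(face[int(0.4*h):int(0.7*h), int(0.3*w):int(0.7*w)], (50, 50))

    # Eye detection with OpenCV's bundled cascade; assumes both eyes are found
    eye_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_eye.xml")
    eyes = sorted(eye_cascade.detectMultiScale(face), key=lambda b: b[0])[:2]
    left, right = [cv2.resize(face[y:y+eh, x:x+ew], (40, 40))
                   for (x, y, ew, eh) in eyes]  # sorted left-to-right in the image
    return {"mouth": mouth, "nose": nose, "left_eye": left, "right_eye": right}
```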

2.2.2 2DPCA-Based Features Extraction from Sub-images

The two-dimensional principal component analysis (2DPCA) algorithm [10] is used to extract features from the feature images (mouth, nose, left eye and right eye). The seven expressions form seven different classes for training. 2DPCA is performed over the mouth, nose, left eye and right eye images separately, and the eigenvectors and eigenvalues are calculated for each part separately. The steps to extract the features using 2DPCA [14] are as follows.

Let the total number of images (denoted \(A_i\)) be M. The mean image \(\mu _A\) of all images is calculated by the following equation.

$$\begin{aligned} \mu _A = \frac{1}{M}\sum \limits _{{i\,=\,1}}^{{M}} A_{i} \end{aligned}$$
(1)

The image scatter (covariance) matrix, \(G_t\), is calculated by

$$\begin{aligned} G_t = \frac{1}{M}\sum \limits _{{i\,=\,1}} ^{{M}} (A_{i} - \mu _A)^{T}(A_{i} - \mu _A) \end{aligned}$$
(2)

Now, the eigenvectors corresponding to the largest eigenvalues of the covariance matrix \(G_t\) are calculated. A set of the largest eigenvalues and their corresponding eigenvectors is selected; multiplying the face images by these eigenvectors produces the eigenimages.

Let \(X_1, X_2, X_3, \ldots , X_d\) be the d eigenvectors corresponding to the d largest eigenvalues; these are used as the projection axes.

The projected feature vectors, called the principal components of an image, are then obtained by multiplying the image by the eigenvectors. The principal components contain the significant information of the image.

$$\begin{aligned} Y_{k}^{i} = A_{i}X_{k}, \end{aligned}$$
(3)

where \(i=1, \ldots , M\) and \(k=1, \ldots , d\). The projected feature vectors of image \(A_i\) are then stored as

$$\begin{aligned} Y_{i} = [Y_{1}^{i}, Y_{2}^{i}, \ldots , Y_{d}^{i}] \end{aligned}$$
(4)
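A minimal NumPy sketch of Eqs. (1)–(4) is given below, assuming the sub-images of one face part are stacked in an (M, h, w) array; the helper name twodpca_features and the array layout are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def twodpca_features(images, d):
    """2DPCA feature extraction (a sketch following Eqs. (1)-(4)).

    images: array of shape (M, h, w), one greyscale sub-image per row
    d:      number of projection axes to keep
    returns (features, axes) where features has shape (M, h, d)
    """
    M, h, w = images.shape
    mean = images.mean(axis=0)              # Eq. (1): mean image
    centered = images - mean
    # Eq. (2): image scatter matrix G_t (w x w), averaged over all images
    G = sum(a.T @ a for a in centered) / M
    # Eigenvectors of G_t; keep the d axes with the largest eigenvalues
    vals, vecs = np.linalg.eigh(G)          # eigenvalues in ascending order
    X = vecs[:, ::-1][:, :d]                # projection axes X_1, ..., X_d
    # Eqs. (3)-(4): projected feature vectors Y_k^i = A_i X_k
    Y = images @ X                          # shape (M, h, d)
    return Y, X
```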

2.3 Classification

The minimum distance classifier (MDC) is used for classification [10]. For a test image, there are four feature vectors, one for each of the cropped mouth, nose, left eye and right eye images. The distance between the feature vector of each sub-image of the test image and the corresponding sub-image of each training image is calculated; the Euclidean distance is used here. The feature vectors of a sub-image of the test image and of a sub-image of the training set are represented as

$$\begin{aligned} (Y_{1}, Y_{2}, \ldots , Y_{d}) \end{aligned}$$
(5)

and

$$\begin{aligned} (Y_{1}^i, Y_{2}^i, \ldots , Y_{d}^i), \end{aligned}$$
(6)

where \(i=1, \ldots , M\). The distance between two such vectors is computed as

$$\begin{aligned} d_{x, y} = \sqrt{\sum \limits _{{i\,=\,1}}^{{m}} (x_{i} - y_{i})^{2}} \end{aligned}$$
(7)

After all the distances for the mouth, nose, left eye and right eye have been found, the distances are normalized to bring all values into the range [0, 1]. The normalized distance is computed as

$$\begin{aligned} X^{i} = \frac{X - X_{min}}{X_{max} - X_{min}} \end{aligned}$$
(8)

This yields four normalized values per training image, one each for the mouth, nose, left eye and right eye. The four normalized values are summed, and the minimum summed value over all training images is found; the training image with this minimum value determines the expression of the test image [15]. A sketch of this procedure follows.
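The following NumPy sketch implements Eqs. (7)–(8) and the summed-minimum decision rule under the data layout assumed in the earlier sketches; the function and dictionary-key names are assumptions for illustration.

```python
import numpy as np

def classify(test_feats, train_feats, train_labels):
    """Minimum distance classification over four face parts (a sketch).

    test_feats:   dict part -> feature matrix of the test image
    train_feats:  dict part -> array of training feature matrices, (M, ...)
    train_labels: array (M,) of expression labels
    """
    parts = ["mouth", "nose", "left_eye", "right_eye"]
    total = np.zeros(len(train_labels))
    for p in parts:
        # Eq. (7): Euclidean distance to every training image for this part
        diffs = train_feats[p] - test_feats[p]
        d = np.sqrt((diffs ** 2).reshape(len(train_labels), -1).sum(axis=1))
        # Eq. (8): min-max normalization into [0, 1]
        d = (d - d.min()) / (d.max() - d.min())
        total += d
    # Training image with minimum summed normalized distance decides the class
    return train_labels[np.argmin(total)]
```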

3 Experiment and Evaluation

In this section, the experimental details of facial expression recognition using 2DPCA are discussed. The JAFFE facial expression database is used, and the experiments were carried out in MATLAB R2013a. The two major stages of the experiment are extraction of the feature images and classification of the facial expression by the minimum distance classifier.

3.1 Description of the Dataset

The JAFFE dataset contains frontal greyscale face images of size \(256 \times 256\) covering the seven universal facial expressions. We rearranged the database into seven classes of 25 images each. For training, 15 images are taken per expression, giving a training set of 105 images covering all seven expressions. For testing, 10 images are taken per expression, giving 70 test images. Sample facial expressions are shown in Fig. 3.

Fig. 3 Different facial expressions of the JAFFE dataset: 1, surprise; 2, sadness; 3, neutral; 4, joy; 5, fear; 6, disgust; 7, anger

3.2 Experimental Details

In this experiment, 25 images are taken per expression: the training set is divided into seven classes of 15 images each, and 10 images per class are used for testing. Each image is split into four parts (mouth, nose, left eye and right eye), and a feature vector is calculated for each part, so the training set contains 105 feature vectors per part, one for each of the 105 training images. The four feature vectors of a test image are compared with the corresponding feature vectors of all training images: for each part, the Euclidean distance between the feature vectors is calculated and then normalized, giving four normalized distances per training image. The four normalized distances are summed, and the minimum summed distance over all training images classifies the facial expression. An end-to-end sketch is given below.
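Putting the earlier sketches together, the following hypothetical driver shows how the pieces could be combined; load_training_parts, load_test_parts and the value d = 5 are assumptions, and twodpca_features and classify refer to the sketches above.

```python
parts = ["mouth", "nose", "left_eye", "right_eye"]
d = 5  # number of projection axes; an assumed value, not from the paper

# train_parts[p]: (105, h, w) array of training sub-images; labels: (105,)
train_parts, train_labels = load_training_parts()  # assumed loader
test_parts, test_labels = load_test_parts()        # assumed loader

# Learn projection axes and training features separately per face part
axes, train_feats = {}, {}
for p in parts:
    train_feats[p], axes[p] = twodpca_features(train_parts[p], d)

correct = 0
for i in range(len(test_labels)):
    # Project each sub-image of the test image onto the learned axes
    feats = {p: test_parts[p][i] @ axes[p] for p in parts}
    if classify(feats, train_feats, train_labels) == test_labels[i]:
        correct += 1
print(f"accuracy: {correct / len(test_labels):.2%}")
```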

3.3 Result Analysis

15 images per class from the dataset are selected for training and 10 images per class for testing, with no overlap between training and test data. The experiment was repeated 10 times with different choices of training and test images. Class-wise accuracy rates are given in Table 1. The overall classification accuracy of the proposed method is 89.86%.

Table 1 Comparison of class-wise and overall classification accuracy (in %)

The performance of the proposed method is compared with two existing methods: facial expression recognition using principal component analysis (PCA) [9] and facial expression representation and classification using 2DPCA on the entire face [10]. The proposed method has a higher recognition rate than both. The performance of the three methods is shown in Table 1.

4 Conclusions

This paper presents a new method for recognizing facial expressions that applies 2DPCA to important parts of the face (mouth, nose, left eye and right eye) instead of the entire face, extracting significant features from these feature images. Eigenvectors and eigenvalues are calculated from the covariance matrix, and a minimum distance classifier is used to classify the facial expression. Performing 2DPCA on significant face parts instead of the entire face yields a higher recognition rate, and 2DPCA is more efficient than PCA. The proposed method outperforms the two existing methods, PCA and 2DPCA on the entire face, achieving an average recognition rate of 89.86%.