1 Introduction

Facial expression plays a vital role in non-verbal communication between human beings, and facial expression recognition is attracting growing attention. Facial appearance carries key information about the psychological, emotional, and even physical state of a person in a conversation. Facial expression recognition also has practical impact. It has wide application prospects, such as friendly human–machine interfaces, humanistic design of goods, and emotional robots. With a facial expression recognition system, a machine can assess human emotions [1]. An intelligent computer will be able to recognize, understand, and respond to human intentions, expressions, and moods [2].

Facial expression recognition systems are deployed in several real-world settings such as security and surveillance; they can help predict criminal behavior by scanning facial images captured by a surveillance camera. In addition, facial expression recognition has been used to build answering machines that interact more naturally with humans. The answering machine becomes more intelligent by analyzing the customer's voice and responding according to the emotion shown on their face [3].

In addition, facial expression is important in sign language recognition, which addresses the challenges faced by deaf and mute people. Facial expression recognition also has a significant impact on the entertainment field, and its use improves the effectiveness of machines such as medical robots and industrial inspection systems [4]. In general, automated systems with facial expression recognition have been used to improve our everyday lives.

Depression is a commonly untreated health issue, associated with work stress and other health problems, and it affects the academic performance of many students. To help address this, researchers have built real-time facial expression classification approaches so that a teacher can monitor students' mental state during classroom activity.

1.1 Categories of Facial Emotions

Six basic human expressions, plus a neutral state, are recognized universally. Real-life facial emotions are therefore divided into seven classes: happy, fear, anger, surprise, sad, disgust, and neutral. Example images from the online Japanese Female Facial Expression (JAFFE) database for six classes are given in Fig. 1. This paper concentrates on real-time Indian facial expressions such as fear, happy, anger, disgust, surprise, and neutral.

Fig. 1 Facial emotions from JAFFE database

2 Related Work

This section reviews work carried out so far by different researchers on emotion recognition from facial expressions. The approaches of several authors are discussed. Daisy and Kannan [5] proposed a facial expression recognition approach based on a novel limited, rotated local Gabor filter method. It uses a two-step feature compression technique, principal component analysis (PCA) followed by linear discriminant analysis (LDA), to select and compress the Gabor features, and a minimum-distance classifier to recognize the facial emotions. These techniques are effective for both dimensionality reduction and recognition performance compared with the conventional full Gabor filter method. The approach achieves good recognition results on the GTAVE, MIT, CMU, PIE, and real-time home databases.

Bashyal et al. [6] proposed an efficient method for facial expression recognition that uses Gabor filters for feature extraction. The JAFFE dataset was used for testing, and the reported accuracy is above 91%. Xie et al. [7] proposed a spatially maximum occurrence model based on statistical characteristics and employed elastic shape-texture matching for shape- or texture-based facial expression recognition.

Zhang et al. [8] take as their starting point a combination of facial features that may be static or dynamic, position-based geometric, or region-based appearance features. The authors focused on static frames and reported experimental results for the dimensional features. These features were extracted using Gabor filters, and the algorithm produced excellent experimental results. In [9], the authors presented a novel technique that uses both geometric and texture information of facial points. The Gauss–Laguerre method is employed to extract texture information for a variety of facial emotions. Backpropagation and probabilistic neural networks have been used to classify different types of X-ray images [10]. Other related work includes a fuzzy-logic-based aeration control approach for contaminated stream water [11], fuzzy chaos whale optimization and BAT integrated techniques for parameter estimation in sewage management [12], and fault modeling and stability analysis of wireless rechargeable sensor networks [13].

This review shows the variety of techniques that researchers have adopted for facial emotion classification. A detailed investigation is carried out in the following sections of this work.

3 Proposed Facial Expression Recognition System

Facial expressions reflect a person's emotions and disclose valuable information about their sentiment, feedback, and so on. Correctly distinguishing these emotions is a difficult task. This section describes the facial emotion recognition system.

3.1 System Architecture

The overall architecture of the proposed facial expression recognition system is given in Fig. 2. The system comprises four modules: preprocessing as the first phase, face detection as the second, facial feature extraction as the third, and facial emotion recognition as the final phase, which uses two techniques, SVM and tree-based classifiers.

Fig. 2 System architecture for facial emotion recognition system

3.2 Preprocessing

Preprocessing is the first phase, in which the video frames are fed into the face detection and facial expression recognition system. The essential information required by most facial expression recognition methods is the face location. In the preprocessing module, frames are resized from 640 × 640 pixels to 400 × 400 pixels.
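The resizing step can be illustrated with a minimal sketch. The paper does not state which interpolation method is used, so nearest-neighbor sampling is assumed here, and the function name `resize_nearest` and the synthetic frame are hypothetical.

```python
def resize_nearest(frame, new_h, new_w):
    """Resize a 2-D grayscale frame (list of rows) by nearest-neighbor sampling.

    Assumption: the interpolation method is not given in the text;
    nearest-neighbor is used only because it is the simplest choice.
    """
    old_h, old_w = len(frame), len(frame[0])
    return [
        [frame[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


# Synthetic 640 x 640 frame standing in for a captured video frame.
frame = [[(r + c) % 256 for c in range(640)] for r in range(640)]
small = resize_nearest(frame, 400, 400)
print(len(small), len(small[0]))
```

In practice an image library would normally perform this step with a smoother interpolation; the sketch only makes the change in resolution concrete.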

3.3 Face Detection Using Cascade Classifier

In applications such as human–computer interfaces, face recognition, video surveillance, and facial expression analysis, the first step is to localize and detect the human face. The Viola–Jones object detection procedure, based on a cascade of classifiers, is used to locate the face within every frame of the video sequence. More specifically, we apply 14 feature prototypes [14], which comprise four edge features, eight line features, and two center-surround features. These prototypes are scaled independently in the vertical and horizontal directions in order to produce a rich and over-complete set of features. This set of features can be computed in constant, short time irrespective of location, as described in [15]. Face detection and extraction using the cascade classifier are shown in Fig. 3.
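The constant-time evaluation of rectangular features is usually obtained with an integral image (summed-area table). The following is a minimal sketch of that idea, not the detector used in this work; the function names and the tiny test image are hypothetical.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def edge_feature(ii, top, left, height, width):
    """Two-rectangle edge feature: left half minus right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2), edge_feature(ii, 0, 0, 3, 4))
```

Because every rectangle sum costs four lookups regardless of its size or position, each feature prototype can be evaluated at any scale and location in the same time.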

Fig. 3 Face detection and extraction using cascade classifier

3.4 Feature Extraction for Emotion Detection

A feature is an informative region identified in a frame or a long video sequence. Visual data exhibits various types of features that can be used to recognize or represent the information it contains. The gray-level run length method (GRLM) is a way of extracting higher-order statistical texture features. The theory and techniques behind the method are presented in [16]. A set of consecutive, collinear pixels with the same gray level in a given direction constitutes a gray-level run. The run length is the number of pixels in the run, and the run length value is the number of times such a run occurs in a face image. The gray-level run length matrix (GRLM) is a two-dimensional matrix in which every element f(x, y | θ) gives the total number of occurrences of runs of length y at gray level x in a particular direction θ.
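To make the definition of f(x, y | θ) concrete, the sketch below counts runs in the horizontal direction (θ = 0°) on a small hypothetical image with four gray levels. The function name `glrlm_horizontal` and the sample image are illustrative assumptions; other directions would scan the image along different lines.

```python
def glrlm_horizontal(img, levels):
    """Gray-level run length matrix for theta = 0 degrees.

    f[x][y - 1] is the number of runs of length y at gray level x.
    """
    max_run = len(img[0])
    f = [[0] * max_run for _ in range(levels)]
    for row in img:
        run_level, run_len = row[0], 1
        for pixel in row[1:]:
            if pixel == run_level:
                run_len += 1
            else:
                f[run_level][run_len - 1] += 1  # close the finished run
                run_level, run_len = pixel, 1
        f[run_level][run_len - 1] += 1  # close the last run in the row
    return f


img = [[0, 0, 1, 1],
       [0, 2, 2, 2],
       [1, 1, 1, 1],
       [3, 3, 0, 0]]
f = glrlm_horizontal(img, levels=4)
for x, counts in enumerate(f):
    print(x, counts)
```

Texture descriptors are then computed as weighted sums over this matrix; which descriptors the proposed system uses is not specified here.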

4 Facial Emotion Recognition Using SVM and Tree-based Classifier

Machine learning (ML) encompasses many methods, such as support vector machines (SVM), artificial neural networks (ANN), genetic algorithms (GA), Bayesian learning, and probabilistic models [17]. Of these, this work requires only the first, SVM, together with tree-based classifiers, for facial emotion classification.

4.1 Support Vector Machine (SVM)

SVM is a well-established technique for visual pattern classification [18] and is widely used in kernel-based learning. It achieves strong pattern classification performance by solving an optimization problem [19]. Classification tasks generally involve training and testing data. The training data consists of pairs (x1, y1), (x2, y2), …, (xm, ym) drawn from two classes, where xi ∈ Rn is an n-dimensional feature vector and yi ∈ {+1, −1} is the class label. The goal of SVM is to build a model that predicts the target value for the testing data. In binary classification, the hyperplane w · x + b = 0, where w ∈ Rn and b ∈ R, is used to separate the two classes in some space Z [20]. The maximum margin is given by M = 2/||w||.
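As a small worked illustration of the hyperplane and margin formulas, the sketch below evaluates w · x + b and M = 2/||w|| for a hand-picked weight vector. The values of w and b are hypothetical and are not obtained by training.

```python
import math


def decision_value(w, b, x):
    """Signed value of w . x + b for a feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Class label in {+1, -1} from the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def margin_width(w):
    """Margin M = 2 / ||w|| of the separating hyperplane."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical hyperplane in a 2-D feature space.
w, b = [3.0, 4.0], -5.0
print(predict(w, b, [3.0, 4.0]), predict(w, b, [0.0, 0.0]), margin_width(w))
```

Training an SVM amounts to choosing w and b so that this margin is as large as possible while the training points remain on the correct side.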

The hyperplane partitions the input feature space into the required target classes. Among all separating hyperplanes, the decision boundary that maximizes the margin, that is, the distance to the nearest feature data points, is preferred for recognition. A sample maximum margin is shown in Fig. 4.

Fig. 4 Representation of hyperplane

SVM can be built with a variety of kernel functions to improve accuracy: polynomial, Gaussian, and sigmoidal. SVM is well suited to both unstructured and structured feature data.
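The three kernel families can be written out in a few lines. The sketch below uses the common parameterization with `gamma`, `coef0`, and `degree`; these parameter names and default values are assumptions for illustration, not the settings used in the experiments.

```python
import math


def dot(x, z):
    """Inner product of two feature vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=0.0):
    """K(x, z) = (gamma * x . z + coef0) ** degree"""
    return (gamma * dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel K(x, z) = exp(-gamma * ||x - z||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def sigmoid_kernel(x, z, gamma=1.0, coef0=0.0):
    """K(x, z) = tanh(gamma * x . z + coef0)"""
    return math.tanh(gamma * dot(x, z) + coef0)


print(rbf_kernel([1.0, 2.0], [1.0, 2.0]),
      polynomial_kernel([1.0, 2.0], [3.0, 4.0], degree=2, coef0=1.0))
```

The RBF kernel equals 1 for identical vectors and decays toward 0 as they move apart, which is why `gamma` controls how local the resulting decision boundary is.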

4.2 Tree-Based Classifier

4.2.1 Random Forest

A random forest (RF) is an ensemble of decision trees trained with random features. It works as follows. Given a set of training examples, a set of random trees H is built such that for the kth tree in the forest, a random vector φk is generated independently of the past random vectors φ1, …, φk−1. This vector φk is then used to grow the tree, resulting in a classifier hk(x, φk), where x is a feature vector. Random forest [21] is a fast and robust classification technique that can handle multiclass problems.
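The role of the random vector φk can be sketched with a deliberately simplified forest. Here each "tree" is a single-split stump, φk is a bootstrap sample of rows plus one randomly chosen feature, and the toy data has only two classes; all of these are simplifying assumptions, and the names `train_stump`, `train_forest`, and `predict_forest` are hypothetical.

```python
import random
from collections import Counter


def train_stump(X, y, feature):
    """One-split tree on one feature (assumes exactly two classes in y)."""
    classes = sorted(set(y))
    means = {
        c: sum(x[feature] for x, t in zip(X, y) if t == c) / y.count(c)
        for c in classes
    }
    low, high = sorted(classes, key=lambda c: means[c])
    threshold = (means[low] + means[high]) / 2
    return lambda x: high if x[feature] > threshold else low


def train_forest(X, y, n_trees, seed=0):
    """Grow n_trees classifiers h_k(x, phi_k), each from its own random phi_k."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # phi_k: bootstrap rows (redrawn until both classes appear) and a feature
        while True:
            rows = [rng.randrange(len(X)) for _ in range(len(X))]
            if len({y[i] for i in rows}) == 2:
                break
        feature = rng.randrange(len(X[0]))
        forest.append(train_stump([X[i] for i in rows], [y[i] for i in rows], feature))
    return forest


def predict_forest(forest, x):
    """Majority vote over all trees."""
    return Counter(tree(x) for tree in forest).most_common(1)[0][0]


# Toy, clearly separable data with hypothetical labels.
X = [[1, 1, 5], [2, 1, 6], [1, 2, 5], [8, 9, 1], [9, 8, 0], [8, 8, 1]]
y = ["sad", "sad", "sad", "happy", "happy", "happy"]
forest = train_forest(X, y, n_trees=7)
print(predict_forest(forest, [1.5, 1.5, 5.5]), predict_forest(forest, [9, 9, 0]))
```

A full random forest grows deep trees and samples a feature subset at every split; the sketch keeps only the two ingredients named in the text, independent random vectors and voting.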

4.2.2 Decision Tree (J48)

Decision trees are commonly used methods for pattern classification. Chi-squared automatic interaction detection (CHAID) was introduced in [22] and C4.5 (J48) in [17]. In this study, the J48 decision tree algorithm was applied to the extracted facial features. J48 is the Weka implementation of the C4.5 decision tree for supervised classification. In a decision tree, feature selection is driven by information gain. The information gain for a particular feature Z at a node is calculated as

$$\mathrm{Information\ Gain}\left(X,Z\right)=\mathrm{Entropy}\left(X\right)-\sum_{n\in \mathrm{Values}(Z)}\frac{\left|{X}_{n}\right|}{\left|X\right|}\,\mathrm{Entropy}\left({X}_{n}\right)$$

where X is the set of instances at that particular node, X_n is the subset of X for which Z takes the value n, and |X| denotes the cardinality of X.

The entropy of X is found as:

$$\mathrm{Entropy}\left(X\right)=-\sum_{i=1}^{c}{p}_{i}\,{\mathrm{log}}_{2}\,{p}_{i}$$

where p_i is the fraction of instances in X that belong to class i and c is the number of classes.
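The two quantities can be checked numerically with a short sketch. The labels and feature values below are hypothetical, and the feature is treated as categorical; C4.5 additionally normalizes the gain by the split information (gain ratio) and handles continuous features by thresholding, both of which are omitted here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the node minus the weighted entropy of its child subsets."""
    total = len(labels)
    subsets = {}
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


labels = ["happy", "happy", "sad", "sad"]
informative = ["a", "a", "b", "b"]    # splits the classes perfectly
uninformative = ["a", "b", "a", "b"]  # leaves each subset evenly mixed
print(entropy(labels),
      information_gain(informative, labels),
      information_gain(uninformative, labels))
```

A feature that separates the classes perfectly recovers the full entropy of the node as gain, whereas a feature that leaves every subset evenly mixed yields zero gain.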

4.2.3 Naive Bayes (NB)

The NB tree is a hybrid algorithm that is a cross between the Naive Bayes classifier and the C4.5 decision tree classifier, and it is best described as a decision tree with nodes and branches [23]. In Naive Bayes, the features are assumed to be independent of one another given the class. If a is the class variable and b = (b_1, …, b_k) is the feature vector, the predicted class is

$$\hat{a}=\underset{a}{\mathrm{arg\,max}}\;p(a)\prod_{n=1}^{k}p\left({b}_{n}\mid a\right)$$

where p(a) is the class probability and p(b_n | a) is the conditional probability of the nth feature given the class.

Bayes' theorem states

$$\mathrm{Posterior}=\frac{\mathrm{Prior}\times \mathrm{Likelihood}}{\mathrm{Evidence}}$$
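The decision rule can be sketched for categorical features by estimating the probabilities from counts. The feature values ("raised", "open", and so on) and labels are hypothetical, no smoothing is applied, and the texture features used in this work are numeric, so this is an illustration of the rule rather than of the actual classifier.

```python
from collections import Counter, defaultdict


def train_nb(X, y):
    """Count classes and per-class feature values."""
    class_counts = Counter(y)
    # value_counts[(n, a)][v]: class-a samples whose nth feature equals v
    value_counts = defaultdict(Counter)
    for features, label in zip(X, y):
        for n, value in enumerate(features):
            value_counts[(n, label)][value] += 1
    return class_counts, value_counts


def predict_nb(model, features):
    """Return the class a maximizing p(a) * prod_n p(b_n | a)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())

    def score(a):
        p = class_counts[a] / total  # class probability p(a)
        for n, value in enumerate(features):
            p *= value_counts[(n, a)][value] / class_counts[a]  # p(b_n | a)
        return p

    return max(class_counts, key=score)


X = [("raised", "open"), ("raised", "closed"),
     ("lowered", "closed"), ("lowered", "closed")]
y = ["surprise", "surprise", "neutral", "neutral"]
model = train_nb(X, y)
print(predict_nb(model, ("raised", "open")), predict_nb(model, ("lowered", "closed")))
```

The evidence term of Bayes' theorem is the same for every class, so it can be dropped when only the arg max is needed.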

5 Proposed Experimental Results

In this section, the experimental results obtained with the facial emotion recognition system are presented. The experiments are performed using SVM Torch and the Weka tool. LIBSVM [24] is used to build a model for each emotion, and these models are then used to test recognition performance. Weka is a collection of Java implementations of data mining and machine learning algorithms [10]. We evaluate the performance of tree methods, namely NBTree, REPTree, random forest, and random tree.

5.1 Database

The experiments are carried out with 25 subjects, with images captured by a camera in a natural scene at 25 fps and a resolution of 640 × 640. The real-time dataset includes 1500 images of six facial emotions, including neutral, posed by 25 Indian female subjects. Every image has been rated on the six emotion classes by 1500 Indian subjects.

Figure 5 shows the real-time database images considered for facial expression recognition. The training set consists of 900 images (15 persons, each contributing 60 images). The test set consists of the remaining 600 images, selected by randomly choosing 10 images from each emotion.

Fig. 5 Real-time database for facial emotion

5.2 Performance Evaluation

This work provides a systematic evaluation of the multiclass system. Standard evaluation measures, namely precision (P) = True Positive/(True Positive + False Positive), recall (R) = True Positive/(True Positive + False Negative), specificity (S) = True Negative/(True Negative + False Positive), and F-measure = 2 × Precision × Recall/(Precision + Recall), are used to assess the performance of the proposed facial emotion classification system. These measures give good insight into the recognition performance of the facial emotion system.
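For a multiclass problem, these measures are typically computed per class in a one-vs-rest fashion from the confusion matrix. The sketch below assumes rows are true classes and columns are predicted classes; the 3 × 3 matrix is hypothetical and is not taken from the tables of this paper.

```python
def per_class_metrics(cm, k):
    """Precision, recall, specificity, and F-measure for class k (one-vs-rest).

    cm[i][j] is the number of samples of true class i predicted as class j.
    Assumes none of the denominators is zero.
    """
    n = len(cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp
    fp = sum(cm[r][k] for r in range(n)) - tp
    tn = sum(map(sum, cm)) - tp - fn - fp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, specificity, f_measure


# Hypothetical confusion matrix for three classes.
cm = [[8, 1, 1],
      [2, 7, 1],
      [0, 2, 8]]
print(per_class_metrics(cm, 0))
```

Averaging the per-class values over all emotions gives a single summary figure for each classifier.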

5.3 Results Obtained Using SVM and Tree-Based Classifiers

This section presents the facial emotion classification results obtained with the support vector machine and with the WEKA tool [10]. For this purpose, the facial emotion dataset shown in Fig. 5 was used. The performance of the decision tree (DT) and random forest (RF) classifiers was calculated using tenfold cross-validation, and that of the support vector machine using multiclass classification. The average recall, precision, and specificity for the experiments are shown in Fig. 6. Table 1 gives the confusion matrix for the proposed facial emotion recognition using SVM (RBF) multiclass classification with the proposed features. Tables 2, 3, and 4 give the corresponding confusion matrices of the tree-based classifiers using the proposed facial features on six emotions with tenfold cross-validation.

Table 1 Confusion matrix for proposed facial emotion recognition using SVM (RBF)
Table 2 Confusion matrix for proposed facial emotion recognition using random forest
Table 3 Confusion matrix for proposed facial emotion recognition using decision tree (J48)
Table 4 Confusion matrix for proposed facial emotion recognition using Naive Bayes
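The tenfold cross-validation used for the tree-based classifiers can be sketched as an index split. The generator below is a minimal illustration that assigns samples to folds in an interleaved order; shuffling and per-class stratification, which evaluation tools often apply, are omitted, and the name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross-validation."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


splits = list(k_fold_indices(1500, k=10))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each sample is used for testing exactly once, and the reported figures are aggregated over the ten folds.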

The accuracy obtained with the proposed features using the SVM RBF kernel and the tree-based classifiers for each individual facial emotion is presented in Table 5. Figure 6 compares the precision, recall, and specificity of the proposed facial features using the support vector machine (SVM) with radial basis function, random forests, decision trees (DT), and Naïve Bayes. The proposed features perform better with SVM RBF than with the tree-based classifiers. Computer-aided detection and diagnosis techniques are proposed in Balaji et al. [25].

Table 5 Comparison for the accuracy of all facial emotions
Fig. 6 Comparison of the precision, recall, and specificity value of SVM with RBF, random forests, decision trees, and Naïve Bayes classifier

6 Conclusion and Future Work

This work has evaluated the performance of SVM and tree-based classifiers for facial emotion classification. It gives particular attention to four steps: preprocessing, face detection, facial feature extraction, and facial emotion recognition. The SVM RBF kernel gives higher accuracy than the tree-based classifiers. In future work, the flexibility of the proposed approach will be extended by using deep learning (DL) to classify facial emotions with a variety of methods in complex environments.