Abstract
In the field of Brain-Computer Interfaces (BCI), real-life applications such as emotion recognition from recorded electrical brain activity have become a popular research topic. Learning representations that perform consistently from electroencephalogram (EEG) signals is one of the main difficulties in recognition tasks. This research proposes a discriminative and effective classification approach that categorizes brain signal patterns according to their level of activity or frequency in order to recognize emotional states. The paper classifies three emotional states (neutral, negative and positive) using recordings from the Muse EEG headset with four electrode channels (AF7, AF8, TP9, TP10), captured while a subject was watching an emotional video clip on a screen. In this experiment, various statistical, linear and non-linear features are extracted, and machine learning and deep learning based models are then implemented to classify the EEG-evoked emotions. A brief comparison study of the implemented models is carried out with respect to train and test accuracy, recall, precision and F1 score. The highest average accuracies achieved are 98.13% for the proposed Convolutional Neural Network (CNN) model among the implemented deep learning models and 98.12% for Random Forest among the implemented machine learning techniques. The proposed Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models, with 97.42% and 97.19%, and the Decision Tree and Support Vector Machine, with 96.25% and 96.42% respectively, also provided comparable results for emotion classification.
1 Introduction
Emotions are considered to have a strong impact on how people interpret information, make decisions, and shape their actions when interacting with others. The study of psychological and behavioral interaction is a relatively recent area of research. Emotion analysis is an essential task in everyday life, particularly in the domain of human-computer interfaces (HCI) [1, 2]. Emotion analysis helps to improve the quality of communication between a computer’s intelligence and the human brain. It is also used in health care to understand patients’ neurocognitive functioning [3], and physiological signals such as galvanic skin response, heart rate, electromyography and electroencephalography (EEG) are commonly used to assess emotional state [4]. The relationship between various emotional states and EEG signals has been extensively analyzed in the last decade [5]. EEG is a particularly suitable technique for measuring brain responses to emotional stimuli because it is non-invasive, fast, and inexpensive. EEG signals are commonly applied to emotion recognition since they can be used to investigate different aspects of emotions based on frequency band, electrode location, and temporal details [6]. In recent studies, EEG signals have been shown to achieve good classification scores for emotion classification, and they can successfully describe how cognitive and emotional acts are physiologically interrelated. Based on these characteristics, EEG signals are considered the primary source of information for emotion detection in HCI systems. It is a promising area of study that has attracted considerable attention from different disciplines, ranging from neuroscience to computer science. Although previous investigations have examined different methodologies for emotion recognition, the complexity of the underlying patterns means that emotion recognition techniques are still in demand for many innovative applications.
Emotion recognition is often essential in medical care in order to understand a patient’s social and mental state. Virtual-world applications [7], driving assistance [8], gaming [9], mental health care [10], and social security [11] are some of the other application areas of emotion recognition.
EEG can acquire brain activity signals, which can then be explored with machine learning algorithms to predict emotion-related behaviors and translate them into instructions. Typically, either a classification or a regression model is used for this task. The kind of machine learning algorithm is determined by the outputs and the sensory input presented during the experiment. Machine learning algorithms provide the ability to learn from problem-specific training data in order to automate the process of building analytical models and solving the associated tasks. Deep learning is a machine learning paradigm based on artificial neural networks. Deep learning models outperform shallow machine learning algorithms and conventional data analysis approaches in many applications. The main distinction between deep learning and traditional machine learning is how performance scales as data volume increases. Processing features by hand takes time and requires specialized knowledge, and the quality of the extracted features determines the performance of most ML algorithms. A significant difference between DL and traditional machine learning algorithms is that DL attempts to obtain high-level features directly from the data. As a result, DL reduces the effort required to design a feature extractor for each problem. Therefore, in this research work both machine learning and deep learning based algorithms have been implemented for emotion recognition, and their accuracy has been evaluated and compared. This paper aims to investigate and analyze the different classifiers and their accuracy on EEG signals for emotion detection.
The overall workflow of the designed architecture is given in Fig. 1. In the first stage, EEG signals are acquired while a subject wearing a Muse EEG headset is shown an emotional video on a screen. The second stage comprises preprocessing of the acquired EEG signals with a basic filtering process to remove artifacts, followed by a feature extraction step that extracts statistical, temporal, frequency and time-frequency based features from the preprocessed EEG signals. Finally, these features are divided into train and test samples, and the train samples are used as input to the proposed machine learning and deep learning based models for a multi-class classification task over the three emotional states, i.e. positive, negative and neutral. The paper also presents the emotion classification accuracy of the implemented machine learning and deep learning networks designed to classify stimulus-evoked EEG recordings. The results show that the deep learning based models provide better accuracy than some of the machine learning based classifiers. The remainder of the paper presents an extensive literature review in Sect. 2, followed by the methodology in Sect. 3, including the dataset description, the proposed architecture and the implementation of the various models. Finally, Sect. 4 documents the results and discussion, followed by the conclusion in Sect. 5.
2 Related Work
An extensive literature survey has been carried out along three lines: (i) previous work on the features and feature extraction techniques used for EEG signals; (ii) machine learning based EEG-evoked emotion recognition; and (iii) deep learning based emotion recognition.
2.1 Related Work on Feature Extraction Techniques on EEG Signals
Several innovative, BCI-inspired studies focused on feature extraction and machine learning techniques have been published. Various feature extraction techniques have been used successfully in EEG analysis to retrieve complex and non-linear patterns that can help in classification tasks [12, 13]. Existing strategies for discovering complex patterns rely on hand-crafted procedures for extracting prominent features from EEG signals, which are then used to classify the signals with a variety of classifiers. Several studies have concentrated on the feature extraction step, with the goal of identifying the most important features for EEG-evoked emotions. The EEG-evoked emotion classification procedure can be separated into two primary stages. The main stage is feature extraction from the EEG signals, which represents the prominent emotional state. In EEG-based processes, signal processing techniques such as Fourier transforms, wavelet transforms and chaos theory are widely used for feature extraction. However, since these signals are dynamic, time-variant and non-linear in nature, analyzing EEG signals with frequency- and time-domain techniques is intricate [14]. Measures of complexity and chaos within the time-domain mechanism have been demonstrated to be the most discriminative in EEG classification [15]. The extracted EEG features can be divided into the frequency domain and the temporal domain. Temporal features primarily capture the time-related information of EEG signals, for example fractal dimension [16], Hjorth features [17], and higher order crossing features [18]. Features extracted from the frequency domain aim to capture frequency-based EEG information, for example differential entropy (DE) [19], rational asymmetry (RASM) and power spectral density (PSD) [20]. PSD features computed with the Fast Fourier Transform and the Short-Time Fourier Transform, as described in [21], are considered the most basic and widely used.
Other studies looked into the characteristics of decomposition techniques, including Intrinsic Mode Functions (IMF) and the Discrete Wavelet Transform (DWT) [22]. The multiband feature matrix (MFM) in [23] ensures that the location of the sensors on the scalp is taken into account during the feature extraction process. Other EEG features reported in different works are Shannon entropy, sample entropy, log-energy entropy, differential entropy, wavelet entropy, approximate entropy, Common Spatial Patterns (CSP) and the Asymmetry Index (AI) on the five frequency bands of the EEG signal, i.e. the delta, theta, alpha, beta, and gamma bands [24]. Due to the complicated nature of brain signals, recent publications consider an increasing number of non-linear features such as higher order crossings or fractal dimensions. Nonetheless, simple features such as band powers remain almost omnipresent, even though they are computed with various underlying algorithms and are often included solely for comparison purposes [25].
2.2 Machine Learning Based EEG Evoked Emotion Classification Related Work
Multiple machine learning (ML) based methods, such as linear discriminant analysis (LDA) [26], Naive Bayes (NB), support vector machine (SVM) [27], random forest (RF), k-nearest neighbors (k-NN) [28], and others, have been used with reasonable accuracy as classifiers for EEG-evoked emotion classification tasks. The Multilayer Perceptron (MLP) was used to detect emotions from normalized feature vectors of EEG signals, with a reported accuracy of 69.69%. MLP and SVM classifiers were also implemented in [20] to classify the emotion states from EEG recordings taken while listening to music, using PSD and DASM features, which improved the classification precision to 82.29%. An average accuracy of 87.53% was achieved by an SVM on the linear dynamic features of EEG signals recorded while watching videos [29]. Non-linear EEG features have also been extracted and classified into emotional states with k-nearest neighbors, in order to demonstrate the superiority of non-linear extraction methods over earlier frequency-based extraction techniques [30]. Another work extracted different non-linear features from the empirical mode decomposition (EMD) of the SEED EEG dataset and fed the evaluated attributes into a random forest (RF) algorithm, obtaining classification accuracies of 93.87%, 91.45% and 89.59% for negative, neutral and positive emotions respectively [31]. In [32], the ICA technique was applied to extract features from the SEED dataset, and these features were classified by an ANN. In [33], a tunable Q wavelet transform (TQWT) is employed as a feature extractor, and a rotation forest ensemble (RFE) classifier is then combined with several classifiers such as the k-NN, SVM, DT, ANN and RF algorithms, attaining over 93% classification precision with RFE + SVM. Recently, Laura et al. [34] used Naïve Bayes, SVM with various kernels, Random Forest and an Artificial Neural Network classifier for emotion classification on the AMIGOS dataset.
In the current work, some of the conventional ML-based algorithms have been implemented on EEG features for comparative analysis, and their classification precision has been measured, which gave further inspiration to develop models that refine the results.
2.3 Deep Learning Based EEG Evoked Emotion Classification Related Work
Deep learning methods, as opposed to traditional machine learning approaches, automatically extract deeply layered features from large datasets and are better suited to representing the most prominent features of the data. Deep learning methods have been successful in recognizing emotions in numerous studies. In the last few years, deep learning (DL) methodology has achieved notable success in the fields of computer vision [35, 36], natural language processing [37], speech recognition [38], and visual stimuli based EEG classification [39], as well as EEG-based emotion recognition [40]. DL-based approaches such as Deep Belief Networks (DBNs) [41], Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs) [40], Gated Recurrent Units (GRUs), Long Short-Term Memory (LSTM) [42] and others have been adapted for EEG-dependent emotion classification. The LSTM is an enhanced form of the RNN that is particularly good at processing time series data such as EEG signals. The LSTM network is well suited to EEG-related tasks because of its ability to remember past information. It also mitigates the vanishing gradient problem of a standard recurrent network, and it has been used in a wide range of applications. Among the various deep learning models, the CNN is one of the most accurate and promising deep neural network models for classification tasks. Different CNNs, as well as some multimodal frameworks with CNNs, have commonly been used to obtain results in object, text, recorded video, visual and speech classification tasks [43, 44]. Recently, the CNN and its variants have been applied to the emotion recognition task [45].
3 Methodology
3.1 Dataset Acquisition
In this experiment, a publicly accessible Kaggle emotion dataset is used for the emotion classification task [46]. Data was collected using the Muse headband sensor from a male and a female subject for three minutes per emotion state, i.e. positive, negative and neutral. The Muse is a commercial EEG sensing system with five dry sensors: one for a reference point (NZ) and four electrode channels (AF7, AF8, TP9, TP10) for recording brain wave activity. In addition, six minutes of neutral data were collected while the subject was at rest, and six stimuli (negative and positive) were used to evoke the emotions. During all recordings, the subjects were asked not to close their eyes. In the relaxed task, the subjects were advised to ease their muscles and relax while listening to moderate music and sound effects intended to help with meditation. Similarly, another test was performed to record a neutral emotion with no stimuli at all. This test was performed before the others to avoid the lingering effects of a relaxed state. The Muse headset’s EEG samples were automatically recorded for sixty seconds. The data was found to be streaming at a variable frequency between 150 and 270 Hz. The recorded EEG data was resampled before extracting the features.
3.2 Feature Extraction
Since EEG data is non-linear and non-stationary in nature, single values cannot be used to determine the class. The classification of EEG signals depends mostly on the temporal behavior of the signals rather than on individual values. Identifying the patterns that govern the various frequency bands of the EEG signal therefore necessitates temporal analysis, and temporal statistical extraction is carried out for this purpose. A sliding time window of size 1 s with a 0.5 s overlap is used for temporal statistical extraction. In the current paper, statistical attributes of EEG signals that have previously been used successfully [47] are extracted as features. This section explains why statistical extraction is necessary and how it is implemented. The statistical features extracted from the initial dataset are described below:
1. Mean: For a sequence of signal values \(x_{1}, x_{2}, x_{3}, \ldots , x_{n}\) within a time window, the mean is calculated as in Eq. 1:
$$\begin{aligned} \hbox {mean}(\mu )= \frac{1}{n}\sum _{i=1}^{n}{x_{i}} \end{aligned}$$(1)
2. Standard Deviation: The standard deviation of each signal is calculated as in Eq. 2:
$$\begin{aligned} \hbox {SD}(\sigma )= \sqrt{\frac{\sum _{i=1}^{n}({x_{i}-\mu })^2}{n}} \end{aligned}$$(2)
3. Skewness and Kurtosis: The skewness and kurtosis indicate the asymmetry and peakedness of the waves. They are based on the third- and fourth-order central moments and are defined in Eqs. 3 and 4:
$$\begin{aligned} \hbox {Skewness}(s)= \frac{\mu _{3}}{\sigma ^{3}} \end{aligned}$$(3)
$$\begin{aligned} \hbox {Kurtosis}(\kappa )= \frac{\mu _{4}}{\sigma ^{4}}, \qquad \mu _{k}= \frac{1}{n}\sum _{i=1}^{n}({x_{i}-\mu })^k \end{aligned}$$(4)
where \(\mu _{k}\) is the k-th central moment, with k = 3 for the skewness and k = 4 for the kurtosis.
4. Maximum value within every time window, \({max_{1},max_{2},\ldots , max_{n}}\).
5. Minimum value within every time window, \({min_{1},min_{2},\ldots , min_{n}}\).
6. Min-Max Derivatives: By splitting the time window in half and computing the mean of each half, the derivative of the mean is calculated as in Eq. 5:
$$\begin{aligned} D=\frac{\mu ^p-\mu ^{p/2}}{2} \end{aligned}$$(5)
where p = 1 s and p/2 = 0.5 s, i.e. the two halves of the one-second time window are compared. The same strategy is used to obtain the derivatives of the max and min features over the sub time windows (t), as in Eqs. 6 and 7:
$$\begin{aligned} max(t)=\frac{max^p-max^{p/2}}{2} \end{aligned}$$(6)
$$\begin{aligned} min(t)=\frac{min^p-min^{p/2}}{2} \end{aligned}$$(7)
7. The next temporal features are extracted after slicing the original one-second time window into four batches of 0.25 s each. The mean, maximum, and minimum values of each batch are then calculated: {\(\mu _1, \mu _2, \mu _3, \mu _4\)}, {\(max_1, max_2, max_3, max_4\)} and {\(min_1, min_2, min_3, min_4\)}.
8. The 1D Euclidean distances between all pairs of mean values are then calculated as \(\delta _{\mu 12}= |\mu _1- \mu _2|\), \(\delta _{\mu 13} = |\mu _1- \mu _3|\), \(\delta _{\mu 14} = |\mu _1- \mu _4|\), \(\delta _{\mu 23} = |\mu _2- \mu _3|\), \(\delta _{\mu 24} = |\mu _2- \mu _4|\), \(\delta _{\mu 34} = |\mu _3- \mu _4|\), and likewise for the minimum and maximum values, yielding 18 distance-based features. Together with the four mean, four max and four min values, this gives 30 features per signal in each short time window, i.e. a total of 150 temporal features per second over the five signals.
9. From the preceding 150 features, the last 6 are discarded to obtain 144 features, which allows a \(12 \times 12\) square matrix M to be constructed and its log-covariance to be evaluated as in Eq. 8:
$$\begin{aligned} lcM=U(logm(cov(M))) \end{aligned}$$(8)
where U(.) returns the upper triangular elements, logm(.) is the matrix logarithm, and the covariance matrix is given by Eq. 9:
$$\begin{aligned} cov(M)=cov_{ij}=\frac{1}{n}\sum _{k=1}^{n}(x_{ik}-\mu _{i})(x_{jk}-\mu _{j}) \end{aligned}$$(9)
10. Entropy is a measure of uncertainty that is used in brain-machine interface applications to quantify the amount of chaos in the system; it is a non-linear measure of the level of relevance of the data. Shannon entropy has proven efficient for non-linear time series data and is calculated as in Eq. 10:
$$\begin{aligned} \hbox {Shannon Entropy}\,(h) = -\sum _{j} S_j \times log(S_j) \end{aligned}$$(10)
where h is computed in each 1 s time window and \({S_{j}}\) denotes each normalized signal value in that window.
11. The log-energy entropy is then computed by splitting the same time window in half, as in Eq. 11:
$$\begin{aligned} loge=\sum _{i} log (S_{i}^{2})+ \sum _{j} log (S_{j}^{2}) \end{aligned}$$(11)
where i and j index the values of the first sub window (0–0.5 s) and the second sub window (0.5–1 s) respectively.
12. Fast Fourier Transform (FFT) is a useful tool for analyzing the spectrum of a time series, and is computed in each time window as in Eq. 12:
$$\begin{aligned} \hbox {FFT}_{k}=\sum _{n=0}^{N-1} S_{n}^{t}\, e^{-i2\pi k \frac{n}{N}}, \qquad k=0,\ldots , N-1 \end{aligned}$$(12)
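The windowed statistics of items 1 to 8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the original extraction code: the function names `sliding_windows` and `window_features` are hypothetical, the skewness and kurtosis use the standardized central moments, and the half-window derivative is read as the second-half statistic minus the first-half statistic, divided by two.

```python
import numpy as np

def sliding_windows(signal, fs, win_s=1.0, step_s=0.5):
    """Yield 1 s windows that start every 0.5 s (i.e. 0.5 s overlap)."""
    win, step = int(win_s * fs), int(step_s * fs)
    for start in range(0, len(signal) - win + 1, step):
        yield signal[start:start + win]

def window_features(x):
    """Statistical features of one window for one signal (items 1-8)."""
    x = np.asarray(x, dtype=float)
    mu, sigma = x.mean(), x.std()            # Eqs. 1 and 2 (division by n)
    centred = x - mu
    # Assumed reading of Eqs. 3-4: standardized 3rd and 4th central moments.
    skew = np.mean(centred ** 3) / sigma ** 3 if sigma > 0 else 0.0
    kurt = np.mean(centred ** 4) / sigma ** 4 if sigma > 0 else 0.0
    feats = {"mean": mu, "std": sigma, "skew": skew, "kurt": kurt,
             "max": x.max(), "min": x.min()}

    stats = (("mean", np.mean), ("max", np.max), ("min", np.min))

    # Eqs. 5-7 (assumed reading): second half minus first half, over 2.
    first, second = np.array_split(x, 2)
    for name, fn in stats:
        feats[f"d_{name}"] = (fn(second) - fn(first)) / 2.0

    # Items 7-8: quarter-window statistics and pairwise absolute differences,
    # giving 12 + 18 = 30 features per signal.
    quarters = np.array_split(x, 4)
    for name, fn in stats:
        vals = [fn(q) for q in quarters]
        for i, v in enumerate(vals, start=1):
            feats[f"{name}_q{i}"] = v
        for i in range(4):
            for j in range(i + 1, 4):
                feats[f"dist_{name}_{i + 1}{j + 1}"] = abs(vals[i] - vals[j])
    return feats
```

Applying `window_features` to every window produced by `sliding_windows` for each of the five signals would yield the 150 quarter-window features per second described in item 8, alongside the whole-window statistics.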
The EEG signals are represented by the statistical features described above, computed for each electrode channel and time window. This produces a feature set of size \(2147 \times 2548\), where 2548 is the number of features in each row of the feature set. These features are then used as input to the various models for classification of the emotion states.
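The log-covariance, entropy and spectral features of Eqs. 8 to 12 can be sketched in the same spirit. The code below is an illustration under assumptions rather than the original implementation: all function names are hypothetical, the matrix logarithm is computed through an eigendecomposition of the symmetric covariance matrix, eigenvalues are floored at a small `eps` because the sample covariance of a \(12 \times 12\) matrix is rank-deficient, the signal is normalized to a probability-like vector for the Shannon entropy, and zero samples are skipped in the log-energy entropy.

```python
import numpy as np

def log_covariance(features_144, eps=1e-6):
    """Upper triangle of logm(cov(M)) for a 12 x 12 matrix M (Eqs. 8-9)."""
    M = np.asarray(features_144, dtype=float).reshape(12, 12)
    cov = np.cov(M, bias=True)               # 1/n normalisation as in Eq. 9
    eigval, eigvec = np.linalg.eigh(cov)     # cov is symmetric
    eigval = np.maximum(eigval, eps)         # assumed floor; avoids log(0)
    log_cov = (eigvec * np.log(eigval)) @ eigvec.T   # V diag(log w) V^T
    return log_cov[np.triu_indices(12)]      # 78 upper-triangular values

def shannon_entropy(x):
    """Eq. 10 on |x| normalised to sum to one (assumed normalisation)."""
    p = np.abs(np.asarray(x, dtype=float))
    p = p[p > 0] / p.sum()
    return float(-np.sum(p * np.log(p)))

def log_energy_entropy(x):
    """Eq. 11: sum of log(S^2) over the two half windows, skipping zeros."""
    first, second = np.array_split(np.asarray(x, dtype=float), 2)
    total = 0.0
    for half in (first, second):
        nonzero = half[half != 0]
        total += np.sum(np.log(nonzero ** 2))
    return float(total)

def fft_magnitudes(x):
    """Magnitude of the discrete Fourier transform of one window (Eq. 12)."""
    return np.abs(np.fft.fft(np.asarray(x, dtype=float)))
```

For a one-second window sampled at 150 Hz, `fft_magnitudes` returns 150 values, of which the first 75 cover the non-negative frequencies up to the Nyquist limit.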
3.3 EEG Evoked Emotion Classification Using Machine and Deep Learning Algorithm
The current work proposes an EEG emotion classification model and compares various machine learning and deep learning based models for classifying EEG signals. The ML and DL architectures implemented and evaluated in this paper have previously been applied to emotion recognition tasks on other emotion datasets, and they achieve comparable classification accuracy on the dataset described in the section above.
3.4 Implemented Machine Learning Classification Algorithms for EEG Evoked Emotion Classification
In this paper, the machine learning based methods Support Vector Machine (SVM), Decision Tree (DT), Gaussian Naive Bayes (GNB), K-Nearest Neighbor (k-NN), Random Forest (RF), Multilayer Perceptron (MLP) and Artificial Neural Network (ANN) have been implemented, trained and tested on the emotion dataset described in Sect. 3.1.
A short description of these techniques, with their parameter settings, follows:
Support Vector Machine (SVM) is widely favored as it produces high accuracy with little computation. SVM can be applied to regression as well as classification problems, although it is mostly used for classification tasks. In this emotion classification task, SVM has been implemented on the dataset described in Sect. 3.1, and the extracted EEG temporal feature sets were fed to the SVM model as input for classifying each emotion state as negative, neutral or positive. The Radial Basis Function (RBF) kernel has been applied for this purpose, as it is widely used for many classification problems and has a localized and finite response along the entire x-axis. In its basic form, SVM performs binary classification, separating data points into two classes. The same method is used for multi-class problems after decomposing the multi-class problem into several binary classification problems. The goal is to map the emotion-based feature set to a high-dimensional space in order to achieve linear separation between each pair of classes. To classify the features of two classes, SVM considers the possible separating hyperplanes and selects the one that maximizes the margin between the features of each pair of classes for multiple emotion classification.
Random Forest is one of the most popular machine learning algorithms because of its simplicity and versatility. It builds a “forest” of multiple decision trees and combines them with the bagging algorithm, which can raise the overall classification accuracy. Here, the estimator parameter, i.e. the number of trees in the forest, is 50.
K-Nearest Neighbor (k-NN) is a supervised machine learning algorithm that can be used for both regression and classification problems. In k-NN, the similarity between feature sets is used to predict the value of a new data point, which means that a new test tuple is assigned according to how closely it resembles its neighboring tuples in the training set. In the current work, the default number of neighbors, 5, is used.
Gaussian Naive Bayes is a basic yet surprisingly strong predictive method. NB classifiers are a collection of classification algorithms based on Bayes’ theorem with the “naive” assumption that every pair of emotion-based features is conditionally independent given the value of the class variable. All of these algorithms share this common principle, i.e. each pair of features being classified is independent of the other. Gaussian Naive Bayes is a variant of Naive Bayes that assumes a Gaussian (normal) distribution and works well with continuous data.
Decision Tree is a systematic classification technique for binary as well as multi-class problems. It queries the dataset, i.e. the feature sets of the EEG-evoked emotions, and splits each set into nodes to arrive at the final emotional states. The decision tree algorithm can be represented as a binary tree. A query is posed at the root and at each internal node, and the data at that node is divided further into records with different features. The leaves of the tree are the classes into which the dataset is divided. High-dimensional information can be handled with remarkable precision using decision trees. The function for measuring the quality of a split can be “gini” for the Gini impurity or “entropy”, which is employed in this work as well.
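The five classical classifiers described above can be configured with scikit-learn as in the following sketch. It is an illustration only: synthetic three-class data from `make_classification` stands in for the EEG feature set, the 80/20 split and the parameters named in the text (RBF kernel, 50 trees, 5 neighbors) are applied, the entropy criterion is one reading of the decision tree description, and every other hyperparameter is left at the library default, which is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the EEG feature matrix (NOT the Kaggle dataset).
X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

classifiers = {
    "SVM (RBF)": SVC(kernel="rbf"),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "Gaussian NB": GaussianNB(),
    # Entropy criterion is an assumed reading of the text.
    "Decision Tree": DecisionTreeClassifier(criterion="entropy", random_state=0),
}

# Fit each model on the training split and record its test accuracy.
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)
```

The `scores` dictionary then holds one test accuracy per classifier; the values obtained on synthetic data say nothing about the accuracies reported for the EEG dataset.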
Artificial Neural Networks (ANNs), also known as Neural Networks (NNs), are computing structures loosely modeled on the biological neural networks that make up the human brain. ANN is a supervised learning technique in which information is acquired and stored in the form of connected network units. It is impossible for a person to extract this information by hand, a consideration that prompted the extraction of classification rules in data mining. The classification process begins with the dataset, which has two components: a training sample and a test sample. The training sample is used to train the network, while the test sample is used to assess the classifier’s accuracy. Various methods, such as hold-out, cross-validation and random sampling, can be used to divide a dataset. Generally, the learning steps of a neural network are as follows:
1. The input, output, and hidden layers each have a fixed number of nodes.
2. For the learning method, an algorithm is used.
A neural network’s most vital capability is that it learns by changing its weights and network structure, which makes it suitable for the artificial intelligence field. In this work, three dense layers are used to build the ANN architecture for emotion recognition. The first dense layer, with 256 neurons, takes the input emotion feature vectors described in Sect. 3.2 and passes the transformed feature map to the second dense layer, which has 128 neurons. Finally, the output dense layer, with 3 neurons representing the 3 emotion states, takes the output of the previous dense layer and produces the emotion classification, from which the accuracy and the other evaluation parameters of the ANN model on the emotion dataset are obtained.
Multilayer Perceptron (MLP) is a form of ANN. The simplest MLP has three layers of nodes: an input layer, at least one hidden layer, and an output layer. For training, MLP employs a supervised learning mechanism known as backpropagation, which conducts learning over a set number of passes measured in epochs. Backpropagation is an example of automatic differentiation: the outputs of the network are compared with the ground truth to obtain the classification error, and the error is propagated backwards from the final layer to obtain a gradient, which determines how the weights of the network’s neurons are updated in what is referred to as gradient descent optimization. The weights of the neurons are thus calculated by an algorithm from the error rate of the network. Following the learning process, the trained neural network acts as a function that maps inputs to the output class. A simple MLP model has been implemented in this paper, with two dense layers of 100 and 3 neurons, for classification of the three emotional states.
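To make the sizes of the two dense architectures concrete, the sketch below counts their trainable parameters and runs a forward pass with NumPy. It assumes the 2548-dimensional input vector used elsewhere in the paper, ReLU on the hidden layers and softmax on the output layer; the hidden activations are an assumption here, and the helper names are hypothetical.

```python
import numpy as np

ANN_SIZES = [2548, 256, 128, 3]   # input, two hidden dense layers, output
MLP_SIZES = [2548, 100, 3]        # input, one hidden dense layer, output

def count_dense_params(layer_sizes):
    """Weights plus biases of a stack of fully connected layers."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def init_network(layer_sizes, seed=0):
    """Small random weights and zero biases (illustrative initialisation)."""
    rng = np.random.default_rng(seed)
    weights = [rng.normal(0.0, 0.01, size=(n_in, n_out))
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]
    biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]
    return weights, biases

def forward(x, weights, biases):
    """ReLU hidden layers (assumed) followed by a softmax output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, h @ W + b)
    logits = h @ weights[-1] + biases[-1]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)
```

Under these assumptions the ANN has 685,827 trainable parameters and the MLP has 255,203, so most of the capacity of both models sits in the first dense layer that reads the 2548 input features.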
3.5 Implemented Deep Learning Classification Algorithms for EEG Evoked Emotion Classification
In this work, three deep learning based strategies have been implemented to design the emotion recognition model and improve the classification accuracy, and their performance is compared with the ML methods described above. The DL-based models, namely the Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN), are trained and evaluated for detection of the three emotion states (Fig. 2).
Convolutional Neural Network (CNN) is an established, popular and widely applicable deep learning approach. Its ability to extract feature maps through the convolutional layers and to select the strongest responses by means of the max-pooling layer helps to remove the separate feature extraction and selection stage of conventional classification tasks.
In this paper, the CNN architecture is composed of 2 Conv1D layers, a MaxPooling1D layer, a Flatten layer and 2 fully connected (FC) layers. The first Conv1D layer receives the feature vector of shape (2548, 1), representing a single one-dimensional input feature row, and outputs a feature map by applying 16 kernels of size 10 with stride 1 and no padding. The second Conv1D layer comprises 16 kernels of size 3 with stride 1 and no padding, and yields a feature map that is passed to a 1D max-pooling layer with pool size 8, which downsamples the output and reduces the dimension of the output vector. Finally, the resulting feature map is flattened into a 1D vector and passed through 2 fully connected layers with 100 and 3 output neurons, which derive the probabilities for the 3 classes, i.e. the negative, positive and neutral emotion states. Both Conv1D layers use the non-linear ReLU (Rectified Linear Unit) activation function, which sets negative values of the feature map to zero and leaves positive values unchanged. The first dense layer likewise employs the ReLU activation function, while the final dense layer employs the softmax activation function, which returns for each class a value between 0 and 1.
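The layer dimensions implied by this description can be checked with a few lines of arithmetic. The sketch below assumes that the max-pooling stride equals the pool size (non-overlapping pooling), which is not stated in the text; the helper names are hypothetical.

```python
def conv1d_out(length, kernel, stride=1):
    """Output length of a 1D convolution with no padding."""
    return (length - kernel) // stride + 1

def pool1d_out(length, pool):
    """Output length of non-overlapping max pooling (stride = pool, assumed)."""
    return length // pool

def conv1d_params(in_channels, out_channels, kernel):
    """Kernel weights plus one bias per output channel."""
    return in_channels * out_channels * kernel + out_channels

def dense_params(n_in, n_out):
    """Weight matrix plus bias vector of a fully connected layer."""
    return n_in * n_out + n_out

len_conv1 = conv1d_out(2548, kernel=10)      # first Conv1D, 16 kernels
len_conv2 = conv1d_out(len_conv1, kernel=3)  # second Conv1D, 16 kernels
len_pool = pool1d_out(len_conv2, pool=8)     # MaxPooling1D with size 8
flat = len_pool * 16                         # Flatten over 16 channels

total_params = (conv1d_params(1, 16, 10) + conv1d_params(16, 16, 3)
                + dense_params(flat, 100) + dense_params(100, 3))
```

Under this assumption the sequence length shrinks from 2548 to 2539, 2537 and then 317, the flattened vector has 5072 entries, and the model has 508,563 trainable parameters, almost all of them in the first fully connected layer.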
Long Short-Term Memory (LSTM): Human emotions change over time, and this variability is expressed in the temporal interrelationships of the EEG signal. Long Short-Term Memory (LSTM) networks are used to investigate these associations. LSTM is an enhanced design of the Recurrent Neural Network (RNN), in which information is passed from the current loop to the following loop in a network of loops. This looping nature makes the RNN architecture useful for time series data. A standard RNN, however, has a problem with long-term dependencies: as the distance between loops widens, the RNN can lose its ability to link information. Because of the special design of its recurring module, an LSTM can learn both short- and long-term dependencies. Since LSTM is good at learning short-term as well as long-term dependencies of time series, it was used in this study to examine the temporal correlations of EEG signals. In this work, an LSTM architecture is proposed that is composed of 2 LSTM layers, 2 dropout layers and a dense layer. The extracted feature vector is fed to the first LSTM layer, which consists of 64 consecutive LSTM blocks with the ReLU activation function, to learn long-term dependencies in the input feature vector. The learned output features are then fed to a dropout layer, which randomly drops 20% of the neurons to prevent overfitting and improve the model’s generalization. The second LSTM layer, with 32 LSTM blocks and the sigmoid function, is then applied to obtain more generalized feature vectors, followed by a dropout layer. Finally, a dense layer of 3 neurons with the sigmoid function, representing the 3 emotion states, is applied to obtain the classification. A brief overview of the proposed architecture is shown in Fig. 3a, and the architecture of a single LSTM cell is shown in Fig. 4.
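The way an LSTM cell carries information from one time step to the next can be illustrated with a single-step update in NumPy. This sketch uses the standard LSTM formulation with tanh as the candidate and output activation; the first LSTM layer described above uses ReLU instead, so the code illustrates the mechanism rather than the exact configuration. The function names and the gate ordering are choices made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM time step.

    W: (4 * units, n_in), U: (4 * units, units), b: (4 * units,).
    Gate order along the first axis: input, forget, candidate, output.
    """
    units = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[:units])                   # input gate
    f = sigmoid(z[units:2 * units])          # forget gate
    g = np.tanh(z[2 * units:3 * units])      # candidate cell update
    o = sigmoid(z[3 * units:])               # output gate
    c_new = f * c + i * g                    # long-term memory (cell state)
    h_new = o * np.tanh(c_new)               # short-term output (hidden state)
    return h_new, c_new

def run_lstm(sequence, W, U, b, units):
    """Apply lstm_step over a sequence of input vectors; return the last h."""
    h, c = np.zeros(units), np.zeros(units)
    for x in sequence:
        h, c = lstm_step(x, h, c, W, U, b)
    return h
```

The forget gate `f` decides how much of the previous cell state is kept, which is what lets the cell preserve information across long gaps between relevant time steps.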
Gated Recurrent Unit (GRU): The GRU is a newer form of recurrent neural network that is very similar to LSTM. The GRU replaces the cell state and the forget gate of the LSTM with a reset gate and an update gate to control the flow of information. In this experiment, a GRU model has also been implemented to build an emotion classification model. The extracted emotion feature vectors are fed to a GRU layer with 256 units, which retain details of previous features while learning the current feature map in order to classify the emotions. These learned features are then converted to a 1D vector using a flatten layer. Finally, a fully connected layer with 3 neurons representing the 3 emotion states provides the classification output. The proposed architecture is outlined in Fig. 3b, and the architecture of a single GRU cell is shown in Fig. 5.
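The reset and update gates can be illustrated with one time step of a GRU cell. This is a minimal sketch in plain Python under stated assumptions: it follows the common convention in which the new state is (1 - z) times the old state plus z times the candidate (some formulations swap the two roles), the weights are random, and all names and toy sizes are hypothetical rather than taken from the paper.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]

def add(*vectors):
    return [sum(vals) for vals in zip(*vectors)]

def gru_step(x, h_prev, p):
    """One GRU time step: h = (1 - z) * h_prev + z * h_candidate (assumed convention)."""
    z = [sigmoid(v) for v in add(matvec(p["Wz"], x), matvec(p["Uz"], h_prev), p["bz"])]  # update gate
    r = [sigmoid(v) for v in add(matvec(p["Wr"], x), matvec(p["Ur"], h_prev), p["br"])]  # reset gate
    # The reset gate decides how much of the previous state feeds the candidate.
    gated = [rj * hj for rj, hj in zip(r, h_prev)]
    h_cand = [math.tanh(v) for v in add(matvec(p["Wh"], x), matvec(p["Uh"], gated), p["bh"])]
    # The update gate blends the previous state with the candidate.
    return [(1.0 - zj) * hj + zj * cj for zj, hj, cj in zip(z, h_prev, h_cand)]

def random_params(n_in, n_hidden, seed=0):
    """Random toy weights; a trained model would learn these."""
    rng = random.Random(seed)
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    p = {}
    for gate in "zrh":
        p["W" + gate] = mat(n_hidden, n_in)
        p["U" + gate] = mat(n_hidden, n_hidden)
        p["b" + gate] = [0.0] * n_hidden
    return p

# Toy sizes (2 inputs, 4 hidden units), chosen only for illustration.
params = random_params(n_in=2, n_hidden=4)
h = [0.0] * 4
for x in [[0.2, -0.1], [0.0, 0.3], [-0.4, 0.1]]:
    h = gru_step(x, h, params)
print(h)
```

Because the GRU keeps a single state vector and three weight blocks instead of the LSTM's four, it has fewer parameters for the same number of units.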
A brief summary of the layer names, parameter sizes and activation functions used in the MLP, ANN, CNN, LSTM and GRU models is given in Table 1. These architectures were trained on 80% of the dataset, and the remaining 20% was used to test overall model performance. All implemented models were trained for 100 epochs with a batch size of 16, the Adaptive Moment Estimation (ADAM) optimizer and the categorical cross-entropy loss function. For the ADAM optimizer, a learning rate of 0.0001 and a decay of \(10^{-6}\) were used. The architectures were tested on the dataset described above, and the results and further analysis are discussed in the Result and Discussion section.
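As an illustration of the loss used during training, the following plain-Python sketch computes a softmax output and the categorical cross-entropy for a single three-class sample. The class ordering and the example scores are assumptions made for illustration and do not come from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax: outputs lie in (0, 1) and sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """Loss for one sample: minus the log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))

CLASSES = ["negative", "neutral", "positive"]  # assumed ordering, for illustration only

logits = [2.0, 0.5, -1.0]   # made-up raw scores from the last dense layer
probs = softmax(logits)
target = [1, 0, 0]          # one-hot label for "negative"
loss = categorical_cross_entropy(target, probs)

print(dict(zip(CLASSES, (round(p, 3) for p in probs))))
print("loss:", round(loss, 4))
```

The loss is zero only when the model assigns probability 1 to the true class, and it equals ln 3 when the three classes are predicted with equal probability.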
4 Result and Discussion
In the current work, various learning algorithms are implemented for emotion detection to obtain better classification accuracy, and a comparative analysis of these methods with respect to their performance and accuracy is provided. As discussed in the previous section, ten different models are implemented to classify the EEG-based emotion-evoked signals: the machine learning based models SVM, RF, k-NN, GNB, DT, MLP and ANN, and the deep learning based models CNN, LSTM and GRU. As described in the dataset description section, the Kaggle emotion dataset was employed to train and test the proposed models. EEG signals are uncertain and weak, and they hide many important details and cues that can contribute to classification tasks. Various statistical features such as mean, standard deviation, skewness, kurtosis, maximum, minimum and their derivatives, along with the Euclidean distance between mean, max and min values, were extracted. Furthermore, the log-covariance matrix, non-linear features such as Shannon and log energy entropy, and a linear feature, i.e. the fast Fourier transform (FFT), were extracted. To achieve better accuracy from EEG signals, these extracted features then served as the input data for the proposed ML and DL based classification techniques. Additionally, the categorical cross-entropy loss and the Adaptive Moment Estimation (Adam) optimizer were used for all the models described above for 100 epochs. The performance of the classification is further analyzed using the average train and test Accuracy (Acc.), Recall (R), Precision (P) and F1 score values (in %), together with the loss and accuracy graphs for the proposed DL based models. A comparison of the performance of the proposed work with other existing work is provided in Table 3. The accuracy (Acc.) is the number of correct predictions divided by the total number of observations in the dataset, as described in Eq. 13.
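\[ \text{Acc.} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (13) \]
Here TP, TN, FP and FN denote the numbers of true positives, true negatives, false positives and false negatives, respectively. The highest level of accuracy is 1.0, while the lowest level is 0.0.
Recall is the proportion of correctly classified emotion states among all entries in the dataset that actually belong to that state (Eq. 14), while Precision is the proportion of correctly classified emotion states among all entries predicted as that state (Eq. 15).
\[ R = \frac{TP}{TP + FN} \qquad (14) \]
\[ P = \frac{TP}{TP + FP} \qquad (15) \]
Accuracy gives the overall percentage of entries that have been correctly classified, whereas the F1 score is the harmonic mean of Precision and Recall, as described in Eq. 16.
\[ F1 = \frac{2 \, P \, R}{P + R} \qquad (16) \]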
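The four metrics can be computed directly from a confusion matrix. The sketch below is a plain-Python illustration for the three-class case; the counts are made up and are not results from this paper, and the macro averaging of the F1 score across classes is an assumption, since the text does not state how per-class values were averaged.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j.

    Returns the overall accuracy and a dict mapping each class index to
    its (precision, recall, F1) triple.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    report = {}
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                        # true class k, predicted otherwise
        fp = sum(confusion[i][k] for i in range(n)) - tp   # predicted k, true class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[k] = (precision, recall, f1)
    return correct / total, report

# Made-up counts for three classes, for illustration only (not the paper's results).
confusion = [
    [48, 1, 1],
    [2, 47, 1],
    [0, 2, 48],
]
accuracy, report = per_class_metrics(confusion)
# Macro average: unweighted mean of the per-class F1 scores (assumed averaging scheme).
macro_f1 = sum(f1 for _, _, f1 in report.values()) / len(report)

print("accuracy:", round(accuracy, 4))
for k, (p, r, f1) in report.items():
    print(f"class {k}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print("macro F1:", round(macro_f1, 4))
```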
The overall train and test accuracy, along with the precision, recall and F1 score, for all models on the dataset are provided in Table 2.
It can be observed from this table that most of the implemented ML and DL based methodologies achieved high accuracy. Among the ML based algorithms implemented for emotion recognition, RF, SVM and DT achieved the highest classification accuracy, with 98.12, 96.42 and 96.25%, respectively. The DL based models LSTM, GRU and CNN achieved 97.42, 97.19 and 98.13% accuracy on the test dataset, respectively, and 100% classification accuracy on the training dataset. The CNN and RF models achieved the highest performance, while LSTM, GRU, DT and SVM achieved comparable accuracy and also performed well. Figures 6, 7, 8 and 9 show the training and testing accuracy and loss graphs of ANN, CNN, LSTM and GRU, respectively. The comparison graph of all implemented methods is shown in Fig. 10. The accuracy, F1 score, precision and recall of all machine learning and deep learning models are compared separately in Figs. 11 and 12.
5 Conclusion
This study presented several ML and DL based algorithms for classifying EEG-evoked emotion states, i.e. neutral, positive and negative. To develop an emotion classification model, various features were first extracted from raw EEG data, and these feature sets were then supplied to various ML and DL based models to obtain each model’s accuracy and loss. The implemented methods were compared and evaluated on the train and test datasets: the average train and test accuracy, F1 score, precision and recall were evaluated for each implemented model, and the accuracy and loss graphs were also reported. This paper experimented with previously used ML based methods such as SVM, KNN, DT, RF and GNB for comparative analysis with the proposed deep learning based models. The proposed CNN model was able to classify the feature sets extracted from the EEG dataset into 3 different emotion states (positive, negative and neutral) from the EEG signals captured while a subject viewed emotional video clips on a screen. The efficiency of the other algorithms, such as the LSTM and GRU models, is evident from their accuracy, which is comparable to that of the CNN model. A future extension of the work may include testing the model on other datasets acquired through different visual stimuli. Different optimization techniques can also be applied to achieve an optimal solution and better performance.
References
Cheng, J., Chen, M., Li, C., Liu, Y., Song, R., Liu, A., & Chen, X. (2020). Emotion recognition from multi-channel EEG via deep forest. IEEE Journal of Biomedical and Health Informatics, 25(2), 453–464.
Liu, Y., Ding, Y., Li, C., Cheng, J., Song, R., Wan, F., & Chen, X. (2020). Multi-channel EEG-based emotion recognition via a multi-level features guided capsule network. Computers in Biology and Medicine, 123, 103927.
Ali, M., Mosa, A. H., Machot, F. A., & Kyamakya, K. (2016). EEG-based emotion recognition approach for e-healthcare applications. In 2016 Eighth International Conference on Ubiquitous and Future Networks (ICUFN) (pp. 946–950). IEEE.
Tao, W., Li, C., Song, R., Cheng, J., Liu, Y., Wan, F., & Chen, X. (2020). EEG-based emotion recognition via channel-wise attention and self attention. IEEE Transactions on Affective Computing, 1–12. https://doi.org/10.1109/TAFFC.2020.3025777
Jenke, R., Peer, A., & Buss, M. (2014). Feature extraction and selection for emotion recognition from EEG. IEEE Transactions on Affective Computing, 5(3), 327–339.
Alhagry, S., Fahmy, A. A., & El-Khoribi, R. A. (2017). Emotion recognition based on EEG using LSTM recurrent neural network. Emotion, 8(10), 355–358.
Menezes, M. L. R., Samara, A., Galway, L., Sant’Anna, A., Verikas, A., Alonso-Fernandez, F., et al. (2017). Towards emotion recognition for virtual environments: an evaluation of EEG features on benchmark dataset. Personal and Ubiquitous Computing, 21(6), 1003–1013.
De Nadai, S., D’Incà, M., Parodi, F., Benza, M., Trotta, A., Zero, E., Zero, L., & Sacile, R. (2016). Enhancing safety of transport by road by on-line monitoring of driver emotions. In 2016 11th System of Systems Engineering Conference (SoSE) (pp. 1–4). IEEE.
Wang, F., Zhong, S. H., Peng, J., Jiang, J., & Liu, Y. (2018). Data augmentation for EEG-based emotion recognition with deep convolutional neural networks. In International Conference on Multimedia Modeling (pp. 82–93). Springer.
Guo, R., Li, S., He, L., Gao, W., Qi, H., & Owens, G. (2013). Pervasive and unobtrusive emotion sensing for human mental health. In 2013 7th International Conference on Pervasive Computing Technologies for Healthcare and Workshops (pp. 436–439). IEEE.
Verschuere, B., Crombez, G., Koster, E., & Uzieblo, K. (2006). Psychopathy and physiological detection of concealed information: A review. Psychologica Belgica, 46, 1–2.
Acharya, U. R., Sree, S. V., Ang, P. C. A., Yanti, R., & Suri, J. S. (2012). Application of non-linear and wavelet based features for the automated identification of epileptic EEG signals. International Journal of Neural Systems, 22(02), 1250002.
Kumari, N., Anwar, S., & Bhattacharjee, V. (2020). Correlation and relief attribute rank-based feature selection methods for detection of alcoholic disorder using electroencephalogram signals. IETE Journal of Research, 1–13. https://doi.org/10.1080/03772063.2020.1780166
Anuragi, A., & Sisodia, D. S. (2019). Alcohol use disorder detection using EEG signal features and flexible analytical wavelet transform. Biomedical Signal Processing and Control, 52, 384–393.
Musselman, M., & Djurdjanovic, D. (2012). Time-frequency distributions in the classification of epilepsy from EEG signals. Expert Systems with Applications, 39(13), 11413–11422.
Liu, Y., & Sourina, O. (2013). Real-time fractal-based valence level recognition from EEG. Transactions on computational science XVIII (pp. 101–120). Berlin: Springer.
Hjorth, B. (1970). EEG analysis based on time domain properties. Electroencephalography and Clinical Neurophysiology, 29(3), 306–310.
Petrantonakis, P. C., & Hadjileontiadis, L. J. (2009). Emotion recognition from EEG using higher order crossings. IEEE Transactions on Information Technology in Biomedicine, 14(2), 186–197.
Shi, L.-C., Jiao, Y.-Y., & Lu, B.-L. (2013). Differential entropy feature for EEG-based vigilance estimation. In 2013 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) (pp. 6627–6630). IEEE.
Lin, Y.-P., Wang, C.-H., Jung, T.-P., Wu, T.-L., Jeng, S.-K., Duann, J.-R., & Chen, J.-H. (2010). EEG-based emotion recognition in music listening. IEEE Transactions on Biomedical Engineering, 57(7), 1798–1806.
Wang, Z.-M., Hu, S.-Y., & Song, H. (2019). Channel selection method for EEG emotion recognition using normalized mutual information. IEEE Access, 7, 143303–143311.
Li, Y., Huang, J., Zhou, H., & Zhong, N. (2017). Human emotion recognition with electroencephalographic multidimensional features by hybrid deep neural networks. Applied Sciences, 7(10), 1060.
Ha, K. W., & Jeong, J. W. (2019). Motor imagery EEG classification using capsule networks. Sensors, 19(13), 2854.
Alarcao, S. M., & Fonseca, M. J. (2017). Emotions recognition using EEG signals: A survey. IEEE Transactions on Affective Computing, 10(3), 374–393.
Ackermann, P., Kohlschein, C., Bitsch, J. A., Wehrle, K., & Jeschke, S. (2016). EEG-based automatic emotion recognition: Feature extraction, selection and classification methods. In 2016 IEEE 18th International Conference on e-Health Networking, Applications and Services (Healthcom) (pp. 1–6). IEEE.
Estepp, J. R., & Christensen, J. C. (2015). Electrode replacement does not affect classification accuracy in dual-session use of a passive brain-computer interface for assessing cognitive workload. Frontiers in Neuroscience, 9, 54.
Zheng, W.-L., & Lu, B.-L. (2015). Investigating critical frequency bands and channels for EEG-based emotion recognition with deep neural networks. IEEE Transactions on Autonomous Mental Development, 7(3), 162–175.
Naji, M., Firoozabadi, M., & Azadfallah, P. (2015). Emotion classification during music listening from forehead biosignals. Signal, Image and Video Processing, 9(6), 1365–1375.
Nie, D., Wang, X.-W., Shi, L.-C., & Lu, B.-L. (2011). EEG-based emotion recognition during watching movies. In 2011 5th International IEEE/EMBS Conference on Neural Engineering (pp. 667–670). IEEE.
Bahari, F., & Janghorbani, A. (2013). EEG-based emotion recognition using recurrence plot analysis and k nearest neighbor classifier. In 2013 20th Iranian Conference on Biomedical Engineering (ICBME) (pp. 228–233). IEEE.
Veeramallu, G. K. P., Anupalli, Y., Jilumudi, S. K., & Bhattacharyya, A. (2019). EEG based automatic emotion recognition using EMD and random forest classifier. In 2019 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (pp. 1–6). IEEE.
Lahane, P., & Sangaiah, A. K. (2015). An approach to EEG based emotion recognition and classification using kernel density estimation. Procedia Computer Science, 48, 574–581.
Subasi, A., Tuncer, T., Dogan, S., Tanko, D., & Sakoglu, U. (2021). EEG-based emotion recognition using tunable q wavelet transform and rotation forest ensemble classifier. Biomedical Signal Processing and Control, 68, 102648.
Martínez-Tejada, L. A., Yoshimura, N., & Koike, Y. (2020). Classifier comparison using EEG features for emotion recognition process. In 2020 IEEE 18th World Symposium on Applied Machine Intelligence and Informatics (SAMI) (pp. 225–230). IEEE.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324.
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770–778).
Liu, X., He, P., Chen, W., & Gao, J. (2019). Multi-task deep neural networks for natural language understanding. arXiv preprint arXiv:1901.11504.
Yao, Z., Wang, Z., Liu, W., Liu, Y., & Pan, J. (2020). Speech emotion recognition using fusion of three multi-task learning-based classifiers: HSF-DNN, MS-CNN and LLD-RNN. Speech Communication, 120, 11–19.
Kavasidis, I., Palazzo, S., Spampinato, C., Giordano, D., & Shah, M. (2017). Brain2image: Converting brain signals into images. In Proceedings of the 25th ACM International Conference on Multimedia (pp. 1809–1817).
Yang, Y., Wu, Q., Fu, Y., & Chen, X. (2018). Continuous convolutional neural network with 3D input for EEG-based emotion recognition. In International Conference on Neural Information Processing (pp. 433–443). Springer.
Zheng, W.-L., Zhu, J.-Y., Peng, Y., & Lu, B.-L. (2014). EEG-based emotion classification using deep belief networks. In 2014 IEEE International Conference on Multimedia and Expo (ICME) (pp. 1–6). IEEE.
Sun, B., Wei, Q., Li, L., Xu, Q., He, J., & Yu, L. (2016). LSTM for dynamic emotion and group emotion recognition in the wild. In Proceedings of the 18th ACM International Conference on Multimodal Interaction (pp. 451–457).
Kumari, N., Anwar, S., & Bhattacharjee, V. (2022). Automated visual stimuli evoked multi-channel EEG signal classification using EEGCapsNet. Pattern Recognition Letters, 153, 29–35.
Kumari, N., Anwar, S., & Bhattacharjee, V. (2022). A deep learning-based approach for accurate diagnosis of alcohol usage severity using EEG signals. IETE Journal of Research, 1–15. https://doi.org/10.1080/03772063.2022.2038705
Kumari, N., Anwar, S., & Bhattacharjee, V. (2022). Time series-dependent feature of EEG signals for improved visually evoked emotion classification using EmotionCapsNet. Neural Computing and Applications, 34, 13291–13303.
Bird, J. J., Faria, D. R., Manso, L. J., Ekárt, A., & Buckingham, C. D. (2019). A deep evolutionary approach to bioinspired classifier optimisation for brain-machine interaction. Complexity, 1–14. https://doi.org/10.1155/2019/4316548
Bird, J. J., Manso, L. J., Ribeiro, E. P., Ekárt, A., & Faria, D. R. (2018). A study on mental state classification using EEG-based brain-machine interface. In 2018 International Conference on Intelligent Systems (IS) (pp. 795–800). IEEE.
Li, Z., Tian, X., Shu, L., Xu, X., & Hu, B. (2017). Emotion recognition from EEG using RASM and LSTM. In International Conference on Internet Multimedia Computing and Service (pp. 310–318). Springer.
Cui, H., Liu, A., Zhang, X., Chen, X., Wang, K., & Chen, X. (2020). EEG-based emotion recognition using an end-to-end regional-asymmetric convolutional neural network. Knowledge-Based Systems, 205, 106243.
Funding
Not Applicable
Author information
Contributions
All authors contributed equally to this work.
Ethics declarations
Conflict of interest
Not Applicable
Ethics approval
Not Applicable
Consent to participate
Not Applicable
Consent for publication
Not Applicable
Availability of data and materials
Not Applicable
Code availability
Not Applicable
About this article
Cite this article
Kumari, N., Anwar, S. & Bhattacharjee, V. A Comparative Analysis of Machine and Deep Learning Techniques for EEG Evoked Emotion Classification. Wireless Pers Commun 128, 2869–2890 (2023). https://doi.org/10.1007/s11277-022-10076-7
DOI: https://doi.org/10.1007/s11277-022-10076-7