1 Introduction

Brain computer interface (BCI) facilitates a connection between the human brain and external devices such as computers. The BCI system can be used for assisting physically disabled and impaired people [1, 2]. The BCI system requires analysis, monitoring, measurement, and evaluation of the electrical activity of the brain, which is extracted either by a set of electrodes placed over the scalp or by electrodes implanted inside the brain. The BCI system can be used for analysis and classification of EEG signals corresponding to different emotions.

Emotion is one of the main factors that affect the activities of our day-to-day life. The applications of emotion classification include medical areas such as neurology and psychology. The diagnosis of neurological disorders has been suggested based on automatic emotion recognition systems using various signals like the electromyogram (EMG), electrocardiogram (ECG) and facial images [3, 4]. Emotions expressed via speech and facial expressions are commonly used for classification of human emotions [5, 6]. However, speech and facial expressions can be consciously controlled and may therefore convey false emotions. This motivates the use of physiological signals such as the EEG for analysis and classification of human emotions. EEG signals can play an important role in detecting emotional states for developing BCI-based analysis and classification of emotions.

It should be noted that the EEG signal indicates the electrical potential differences generated by the human brain corresponding to different emotions. Research areas like psychology, neurophysiology and BCI are focusing on the indication of emotions using EEG signals. Different emotional states can be affected by conditions like age, gender, background, and ethnicity. Moreover, different people have different personal emotional experiences of the same stimulus. In [7, 8], significant differences in the emotional states generated by the autonomic nervous system have been reported, but automated classification was not carried out. Most emotions exist for a very short interval of time, in the range of a few micro to milliseconds [9]. Generally, emotions develop in a deeper part of the human brain called the limbic system, which initiates emotional interpretation of the signals from the autonomic nervous system. These incoming signals propagate to the hypothalamus to trigger the corresponding perceptive physiological effects like an increase in heart rate, R-to-R interval and blood volume pulse. These processed signals travel to the amygdala, an important part of the human brain for learning connections to stimuli by comparing them to past experiences. Some results in emotion recognition research have shown that the amygdala and corticothalamic connections mainly participate in the emotion recognition process. In addition, the prefrontal cortex, cerebral cortex and occipital lobe areas also have a significant role in provoking emotions such as happiness, fear, and sadness [10].
The regions of the human brain which contribute to emotions are as follows: (a) sadness (left temporal areas), (b) sadness, happiness and disgust (right prefrontal cortex area), (c) anger (right frontal cortex activation), (d) fear (bilateral temporal activation), (e) sadness and happiness (most brain areas contribute) and (f) all emotions also share common areas (prefrontal cortex, cingulate gyrus, and temporal cortex).

Although most of the activation for emotions emerges in the right hemisphere corresponding to different time-segments of EEG signals, the left hemisphere also plays a significant role in the activation of emotions. Apparently, the brain might be partly or entirely engaged in emotional processing during emotions like sadness, anger, happiness, disgust and fear. Thus, the results support the hypothesis that there are no exclusive emotion centers in the brain; rather, they indicate that several brain areas are activated during emotion processing in a well-defined and specific dynamic process. It has been noticed that the left and right hemispheres of the brain together experience different classes of emotions [11]. The left hemisphere is responsible for approach, whereas the right hemisphere is responsible for withdrawal. In [12], it has been explained that the left frontal region is an important center for self-regulation, motivation and planning. Damage to the left frontal region can lead to apathetic behavior in combination with a loss of interest and pleasure in objects. The right anterior region shows high activation of the right frontal and anterior temporal regions during arousing emotional states like fear and disgust. In [11], it has been noticed that there is less alpha power in the right frontal region for disgust than for happiness, while happiness causes less alpha power in the left frontal region than disgust. In addition, the analysis of EEG signals has been carried out for brain asymmetries during reward and punishment. It has been found that punishment is associated with less alpha power in the right mid and lateral frontal regions of the brain, whereas reward is associated with less alpha power in the left mid and lateral frontal regions [13]. In an experiment, it has been shown that alpha power over the left hemisphere increases in happy conditions in comparison to negative emotional conditions.
During the study, the three emotions fear, happiness and sadness were induced by using visual and auditory stimuli [10]. Researchers have defined between two and twenty basic or prototype emotions. Most theories suggest that each emotion reflects a particular motivational tendency and behavior. Emotions represent particular forms of action and physiological patterns [14]. Physiological patterns have been applied for classification of emotions into three types: (1) distress, (2) interest and (3) pleasure [15]. The basic emotions as defined in [16] are as follows: anger, fear, sadness, disgust, surprise, curiosity, acceptance, and joy.
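As an illustration of the frontal alpha-asymmetry measures discussed above, the following sketch estimates alpha-band (8–13 Hz) power for a left and a right frontal channel (e.g. F3/F4) and forms a log-ratio asymmetry index. This is a minimal illustration, not the protocol of any cited study; the channel pairing and the periodogram-based power estimate are assumptions.

```python
import numpy as np
from numpy.fft import rfft, rfftfreq

def band_power(x, fs, lo=8.0, hi=13.0):
    """Mean power of x within [lo, hi] Hz, via a simple periodogram."""
    freqs = rfftfreq(len(x), d=1.0 / fs)
    psd = np.abs(rfft(x)) ** 2 / len(x)
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()

def frontal_alpha_asymmetry(left, right, fs):
    """log(right alpha power) - log(left alpha power).

    Positive values indicate relatively less alpha power on the left,
    i.e. relatively greater left-hemisphere activation."""
    return np.log(band_power(right, fs)) - np.log(band_power(left, fs))
```

In practice such an index would be computed per trial and per subject, and more robust spectral estimators (e.g. Welch averaging) are usually preferred over the raw periodogram.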

Most of the methods developed in the literature for neuropsychological studies have reported correlations between EEG signals and emotional states. These methods have been based on time-domain and frequency-domain analysis. In the time-domain analysis, event-related potential (ERP) components have been shown to reflect emotional states [17]. The ERP components of short to middle latencies have been shown to correlate with valence [18, 19], whereas the ERP components of middle to long latencies have been shown to correlate with arousal [20, 21]. The computation of ERPs requires averaging EEG signals over multiple trials, rendering ERP features inappropriate for online processing. However, recent developments in single-trial ERP computation methods have increased the possibility of using ERP features for online emotional state estimation [22–24]. In the frequency domain, the spectral power of different frequency bands corresponds to different emotional states. The frequency bands have been used for analysis of emotions (happy, sad, angry, fear) or the neutral state. It has been noted that a stimulus can modulate the power synchronization within frequency bands [25, 26]. In [27], it has been shown that frontal alpha asymmetry reflects the approach/avoidance aspects of emotion. The gamma band power has been related to emotions like happiness and sadness [28, 29]. The theta-band event-related synchronization (ERS) has been related to transitions in the emotional state [30–32]. In [33], the EEG signals have been decomposed into frequency bands and then principal component analysis (PCA) has been employed for reduction of features. These features have been used as input to a binary classifier for classification of emotions based on the bi-dimensional valence-arousal approach. In [34], the EEG signals have been used for recognition of human emotions with the help of humanoid robots.
The aim of this experiment was to provide robots with the ability to detect emotions and react to them in the same way as occurs in human-to-human interaction. The discrete wavelet transform (DWT) based features, namely energy, recoursing energy efficiency (REE) and root mean square (RMS), have been used for classification of four emotions (happy, disgust, surprise and fear) with fuzzy C-means (FCM) clustering [35].

In [36], the participants have been asked to remember past emotional events, and an SVM classifier has been used to obtain classification accuracies of 79 % for three classes and 76 % for two classes using EEG signals. In another study [37], wavelet coefficients and chaotic parameters like fractal dimension, correlation dimension and wavelet entropy have been used to extract features from EEG and psychophysiological signals. The selected features, combined with linear discriminant analysis (LDA) and SVM, obtained classification accuracies of 80.1 and 84.9 % for two classes of emotional stress using LDA and SVM, respectively. In [38], a combination of music and story has been used as stimuli to introduce a user-independent system. The classification accuracy obtained with this method was 78.4 and 61.8 % for three and four classes, respectively. In [39], film clips have been used to stimulate participants with five different emotions: joy, anger, sadness, fear, and relaxation. The statistical features extracted from EEG have been used as input to an SVM classifier; as a result, 41.7 % of the patterns have been correctly recognized.

In [40], a BCI system for the recognition of human emotions has been developed with a 64-channel EEG recording system. A Laplace filter has been applied for pre-processing of the EEG signals. A wavelet transform algorithm has been used for feature extraction from the EEG signals, and two different classifiers, namely k-nearest neighbors and linear discriminant analysis, have been used for classification of discrete emotions such as happiness, surprise, fear, disgust and neutral. The efficiency of asymmetry index (ASI) based emotional filters has been justified through an extensive classification process involving higher-order crossings and cross-correlation as feature-vector extraction techniques and a support vector machine classifier for six different classification scenarios in the valence/arousal space. This study has resulted in mean classification rates from 64.17 up to 82.91 % on a user-independent basis, revealing the potential of establishing such filtering for reliable EEG-based emotion recognition systems [41]. The electric potential associated with brain activity, as measured through EEG signals, is a potential source for emotion detection. The power of the EEG signal in specific bandwidths or brain waves has been used to analyze positive or negative expressions, particularly the change of power in the alpha and beta waves [42]. In [43], different stimuli like sounds, images, and a combination of both have been used to distinguish between three emotion classes (neutral, happy, unhappy). In [44], calm and excited emotions have been evoked using images; the best classification accuracy achieved was 72 %. In [45], the authors have compared three feature extraction methods based on the fractal dimension of EEG signals, including Higuchi, Minkowski-Bouligand, and fractional Brownian motion, using kNN and SVM classifiers on four classes of emotions.
Principal component analysis (PCA) has been used to correlate EEG features with complex music appreciation, and the resulting features have been used as input to an SVM classifier to classify EEG dynamics into four subjectively-reported emotional states [46]. In [47], a system has been proposed for estimating the feelings of joy, anger, sorrow and relaxation by using a neural network, which has obtained classification accuracies of 54.5 % for joy, 67.7 % for anger, 59 % for sorrow and 62.9 % for relaxation. In [34], a system has been implemented based on EEG signals to enable a robot to recognize human emotions. Emotions have been evoked by images and classified into three different classes, namely: pleasant, unpleasant and neutral.

The emotions have been elicited by stimulating participants with a Pong game and anagram puzzles. Four machine learning methods, k-nearest neighbor, regression tree (RT), Bayesian network and SVM, have been used for emotion classification; the best average accuracy, 85.51 %, has been obtained with SVM [48]. In [49], a dynamic difficulty adjustment (DDA) mechanism has been developed for adjustment of game difficulty in real time based on anxiety measures. This demonstrates the interest of using affective computing for the purpose of game adaptation. In [50], the authors have proposed a technique to continuously assess the emotional state of a player using fuzzy logic. The obtained results have shown that the emotional states evolve according to the events of the game, but no exact measure of performance has been reported. This tool could be used to include the player’s experience in the design of innovative video games. In [51], three emotional states, namely boredom, anxiety and engagement, have been detected from peripheral signals by using an SVM classifier. The emotions have been elicited by using a Tetris game.

The features extracted from the mutual information and magnitude squared coherence estimation of EEG signals have been used as input to k-nearest neighbors (kNN) and SVM classifiers for classification of emotions. The performance of the EEG-based emotion recognition system has then been evaluated using five-fold cross-validation [52]. In [53], features extracted by independent component analysis (ICA) and the K-means algorithm have been used to distinguish emotions in EEG. In [54], the use of the naive Bayes classifier, SVM, and ANN to detect different emotions in EEG has been investigated. The EEG signals have been recorded from 10 participants by using the international affective picture system (IAPS) database. The frequency band power has been measured along with the cross-correlation between EEG band powers, the peak frequencies in the alpha band, and the Hjorth parameters [55].

Different emotional states in EEG signals have been measured by the Kolmogorov entropy and the principal Lyapunov exponent [56]. Non-linear dynamic complexity has been used to measure the complexity of EEG signals during meditation [30]. The fractal dimension, the energy of the different frequency bands, and the Lyapunov exponent have been used as features for the classification of human emotions [57]. The correlation dimension measures the complexity of EEG signals and has also been used for analysis of human emotions [58]. The statistical and energy features obtained by using the discrete wavelet transform (DWT) of EEG signals have been used for human emotion classification [59]. A wireless system for detection of the state of valence using EEG signals has been proposed in [60]. Higher order spectra (HOS) together with a genetic algorithm have been used for classification of two emotional stress states with an average accuracy of 82 % [61].

The event-related potential and event-related oscillation based features have been proposed as an input feature set for emotion classification [62]. The obtained classification accuracies are 79.5 and 81.3 % for the Mahalanobis distance based classifier (MD) and the support vector machine (SVM), respectively. The time-frequency domain based features have been suggested as an input feature set to an SVM classifier for classification of three emotional states, with an obtained average classification accuracy of 63 % [63]. A methodology based on surface Laplacian (SL) filtering, wavelet transforms (WT) and a linear classifier has been developed for classification of emotions using EEG signals [64]. The classification accuracies reported in this study are 83.04 and 79.17 % for kNN and linear discriminant analysis (LDA), respectively. The short-time Fourier transform (STFT) based features have been suggested as an input to an SVM classifier for classification of emotions [65, 66]. The obtained classification accuracy in this work is 82.29 % for classification of four emotions using the SVM classifier. The features obtained using higher order crossings have been used for classification of emotions from EEG signals [67]. The classification accuracy achieved with this methodology for six emotions is 83.33 %. Time and frequency domain based features have been suggested for classification of emotions from EEG signals, providing a classification accuracy of 66.5 % for four emotions [68]. The spectrogram, Zhao-Atlas-Marks and Hilbert-Huang spectrum based features have been used for classification of arousal and neutral states with a classification accuracy of 86.52 % [69].

The classification of emotions is probabilistic. Previous research on human emotion has dealt with classification using probability theory to estimate the human emotional state by checking the presence or absence of a certain emotion [70, 71]. The techniques based on probability theory are still insufficient to handle all the facets of uncertainty in human emotion classification [72]. Fuzzy set theory can provide a systematic approach to processing uncertain information, just as humans are able to interpret imprecise and inadequate information. In order to incorporate human expertise, fuzzy C-means clustering (FCM) has been used to cluster each component to obtain different emotional descriptors [73]. These descriptors have been combined together to form the fuzzy-GIST in order to generate the emotional feature space for human emotion recognition [74]. Fuzzy sets have attracted interest in information technology, production techniques, decision making, pattern recognition, diagnostics, data analysis, etc. [75–77]. Neuro-fuzzy systems are fuzzy systems which use ANN theory in order to determine their properties, like fuzzy sets and fuzzy rules, by processing data samples. Neuro-fuzzy systems employ fuzzy logic and artificial neural networks (ANNs) by utilizing the mathematical properties of ANNs in tuning rule-based fuzzy systems that represent the way humans process information. The adaptive neuro-fuzzy inference system (ANFIS) has been shown to be significant in the modeling of nonlinear functions. The ANFIS learns features in the data set and adjusts the system parameters based upon a given error criterion [78, 79]. Applications of ANFIS in biomedical engineering have been reported to be significant for classification [80, 81] (Übeyli and Güler [82, 83]) and data analysis [84]. The most prominent classification methods are the support vector machine (SVM) [85], fuzzy k-means [86], and fuzzy c-means [87].
These classifiers have resulted in moderate classification accuracy for up to three [88], four [89], and five [39] distinct emotions. Other researchers have made efforts to study operator engagement, fatigue, and workload using EEG signals with respect to the complexity of a task [90–94].

Emotion classification methods have been developed based on different feature extraction techniques for EEG signals. Many EEG signal analysis methods employ preprocessing to reduce artifacts. The EEG signals recorded in response to stimuli pass through the preprocessing step, in which noise reduction algorithms and spatio-temporal filtering methods are applied to improve the signal-to-noise ratio (SNR). Then, the feature extraction step determines specific band powers, ERPs, and phase coupling indices that correlate with the targeted emotional states. Commonly, this feature selection process is optimized in order to achieve maximum emotion classification accuracy. The classification step computes the most probable emotional states from the selected EEG features. The number of classes depends on the definition of the emotional state space, such as the continuous states of arousal and valence, or discrete states.

In this chapter, we present an emotion classification system based on the multiwavelet transform (MWT) of EEG signals. The EEG signals have been acquired using an audio–video stimulus. The MWT decomposes the EEG signals into a set of sub-signals. Three features, the ratio of norms based measure, the Shannon entropy measure, and the normalized Renyi entropy measure, have been computed from the sub-signals of the EEG signals. The extracted features have been used as input to the multiclass least squares support vector machine (MC-LS-SVM) for emotion classification from EEG signals. This chapter is organized as follows: Sect. 8.2 presents the experimental setup, pre-processing, the MWT, feature extraction and the MC-LS-SVM classifier. The experimental results and discussion for emotion classification using EEG signals based on the proposed methodology are provided in Sect. 8.3. Finally, Sect. 8.4 concludes the chapter.

2 Methodology

2.1 Experimental Setup

The EEG signals have been acquired from 8 healthy subjects (4 males and 4 females) during an audio–video stimulus. The subjects were aged between 20 and 35 years and were undergraduate students or employees of the Indian Institute of Technology Indore, India. A 16-channel EEG module (BIOPAC Systems, Inc.) with the 10–20 electrode system was used for recording the EEG signals. The sampling frequency of the EEG signals was 1,000 Hz. A bipolar montage has been used during recording of the EEG signals. The prefrontal cortex plays a significant role in impulse control and in many other emotions [95, 96]. Therefore, the electrode positions Fp1/Fp2 and F3/F4 have been used to record the EEG signals. The right (A2) and left (A1) earlobes have been used for the ground and reference electrodes, respectively.

Generally, the number of basic emotions can be up to 15 [97]. Eight basic emotions, namely anger, fear, sadness, disgust, surprise, curiosity, acceptance, and joy, have been described in [16]. The emotions can be represented based on their valence (positive and negative) and arousal (calm and excited) on a two-dimensional scale [98]. The different ways of inducing emotions are: visual, including images and pictures [41]; recalling of past emotional events [44]; audio, such as songs and sounds [99]; and audio–video, including film clips and video clips [100, 101]. In this work, we have studied four basic emotional states based on the 2-D valence-arousal emotion model [102]: happy, neutral, sad, and fear. In this study, the EEG signals have been obtained from eight subjects with 5 trials each using 3 audio–video stimuli. Figure 8.1 shows the EEG data recording and the main parts of the proposed methodology for emotion classification. The EEG signals for the four emotional states happy, neutral, sad and fear are shown in Fig. 8.2. The subsections of the proposed method for emotion classification from EEG signals are shown in Fig. 8.3. These include pre-processing, the multiwavelet transform, feature extraction, and the MC-LS-SVM classifier. The details of each subsection are explained as follows:

Fig. 8.1
figure 1

EEG data recording and main parts of the proposed methodology for emotion classification

Fig. 8.2
figure 2

The EEG signals of different emotional states: a happy, b neutral, c sad, and d fear

Fig. 8.3
figure 3

The block diagram of proposed methodology for emotion classification from EEG signals

2.2 Pre-processing

The recorded EEG signals are contaminated with noise such as power-line interference, external interferences and other artifacts. An 8th-order band-pass Butterworth filter with a bandwidth of 0.5–100 Hz has been used for removing noise. A 50 Hz notch filter has been applied to remove the noise due to power-line interference. The MWT requires pre-processing, which includes generation of a vectored input stream and pre-filtering. There are many ways to obtain the vectored input stream [103]. In this work, the vectored input stream has been obtained using the repeated row pre-processing scheme. The matrix-valued multiwavelet filter bank requires multiple streams of input, as decided by the multiplicity.
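A minimal sketch of this filtering stage, assuming SciPy is available. The filter order and cut-off frequencies follow the text; the notch quality factor Q is an assumed value, and second-order sections are used for the band-pass stage for numerical stability.

```python
import numpy as np
from scipy.signal import butter, iirnotch, sosfiltfilt, filtfilt

def preprocess_eeg(x, fs=1000.0):
    """Band-pass 0.5-100 Hz (8th-order Butterworth) plus a 50 Hz notch.

    Filters are applied zero-phase (forward-backward) so that the
    emotional EEG waveform is not delayed or phase-distorted."""
    # Band-pass in second-order-section form (stable at high order)
    sos = butter(8, [0.5, 100.0], btype="bandpass", output="sos", fs=fs)
    y = sosfiltfilt(sos, x)
    # 50 Hz power-line notch; Q = 30 is an assumed quality factor
    b, a = iirnotch(50.0, Q=30.0, fs=fs)
    return filtfilt(b, a, y)
```

Note that zero-phase filtering doubles the effective attenuation, and the sampling frequency of 1,000 Hz matches the recording setup described in Sect. 8.2.1.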

2.3 Multiwavelet Transform

Scalar wavelets, which are obtained from a mother wavelet with a single scaling function, have been widely used in non-stationary signal processing. Scalar wavelets have led to the notion of multiwavelets, a more recent generalization having several distinct scaling functions, which offers many theoretical and experimental advantages. For example, multiwavelets have been constructed to simultaneously possess symmetry, orthogonality, and compact support [104–108]. Multiwavelets have some unique characteristics that cannot be obtained with scalar wavelets: they can simultaneously provide perfect reconstruction while preserving length (orthogonality), good performance at the boundaries (via linear-phase symmetry), and a high order of approximation (vanishing moments), all of which are important for signal processing applications [103]. These features lead to better performance of multiwavelets over scalar wavelets. Particular applications where multiwavelets have been found to offer superior performance over single wavelets include signal/image classification [107, 108], compression [104], and denoising [106]. Wavelet transform based features have also been used for epileptic EEG signal classification and recognition [109, 110]. In [111], it has been shown that the multiwavelet transform is an efficient signal processing technique for feature extraction from EEG signals in comparison with the scalar wavelet. This motivates us to use the multiwavelet transform of EEG signals for classification of human emotions.

The standard multi-resolution analysis (MRA) for the scalar wavelet uses one scaling function \( \phi (t) \) and one wavelet \( \psi (t) \). The integer translates and the dilates of the scaling function are represented as \( \phi (t - k) \) and \( \phi (2^{j} t - k) \) respectively. The multiwavelet is the extension of the scalar wavelet in which multiple scaling functions and associated multiple wavelets are used. In the case of a multiwavelet, a basis for the subspace \( V_{0} \) is generated by the translates of r scaling functions denoted by \( \phi_{1} (t - k),\phi_{2} (t - k), \ldots ,\phi_{r} (t - k) \). The multiwavelet can be considered as a vector-valued wavelet which satisfies a two-scale relationship involving matrices rather than scalars. The vector-valued scaling function \( \varPhi (t) = [\phi_{1} (t),\phi_{2} (t), \ldots ,\phi_{r} (t)]^{T} \), where \( T \) represents the transpose, and the associated r wavelets \( \varPsi (t) = [\psi_{1} (t),\psi_{2} (t), \ldots ,\psi_{r} (t)]^{T} \) satisfy the following matrix dilation and matrix wavelet equations [103]:

$$ \varPhi (t) = \sum\limits_{k} G[k]\varPhi (2t - k) $$
(8.1)
$$ \varPsi (t) = \sum\limits_{k} H[k]\varPhi (2t - k) $$
(8.2)

where the coefficients \( G[k] \) and \( H[k] \) are matrices. The matrices \( G[k] \) and \( H[k] \) are the low-pass and high-pass filters of the multiwavelet filter bank, respectively. The multiplicity r is generally 2 for most multiwavelets [103]. The multiwavelet can simultaneously exhibit symmetry, orthogonality, and short support, which is not possible using a scalar wavelet [103, 112]. In this study, we consider the multiple scaling functions and multiwavelets developed by Geronimo, Hardin and Massopust (GHM) [113–115], shown in Fig. 8.4. The GHM dilation and translation equations for this system have the following four coefficients:

$$ G_{0} = \left[ {\begin{array}{*{20}c} \frac{3}{5} & {\frac{4\sqrt 2 }{5}} \\ {\frac{ - 1}{10\sqrt 2 }} & {\frac{ - 3}{10}} \\ \end{array} } \right],G_{1} = \left[ {\begin{array}{*{20}c} \frac{3}{5} & 0 \\ {\frac{9}{10\sqrt 2 }} & 1 \\ \end{array} } \right],G_{2} = \left[ {\begin{array}{*{20}c} 0 & 0 \\ {\frac{9}{10\sqrt 2 }} & {\frac{ - 3}{10}} \\ \end{array} } \right],G_{3} = \left[ {\begin{array}{*{20}c} 0 & 0 \\ {\frac{ - 1}{10\sqrt 2 }} & 0 \\ \end{array} } \right]; $$
(8.3)
$$ H_{0} = \left[ {\begin{array}{*{20}c} {\frac{ - 1}{10\sqrt 2 }} & {\frac{ - 3}{10}} \\ \frac{1}{10} & {\frac{3\sqrt 2 }{10}} \\ \end{array} } \right],H_{1} = \left[ {\begin{array}{*{20}c} {\frac{9}{10\sqrt 2 }} & { - 1} \\ {\frac{ - 9}{10}} & 0 \\ \end{array} } \right],H_{2} = \left[ {\begin{array}{*{20}c} {\frac{9}{10\sqrt 2 }} & {\frac{ - 3}{10}} \\ \frac{9}{10} & {\frac{ - 3\sqrt 2 }{10}} \\ \end{array} } \right],H_{3} = \left[ {\begin{array}{*{20}c} {\frac{ - 1}{10\sqrt 2 }} & 0 \\ { - 1} & 0 \\ \end{array} } \right] .$$
(8.4)
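The GHM low-pass matrices of Eq. (8.3) can be encoded directly and sanity-checked: for a refinable vector scaling function satisfying Eq. (8.1), the sum of the matrices \( G[k] \) must have an eigenvalue equal to 2, whose eigenvector contains the integrals of the two scaling functions. A sketch:

```python
import numpy as np

s = np.sqrt(2.0)

# GHM low-pass (scaling) filter matrices G_0..G_3 from Eq. (8.3)
G = [
    np.array([[3 / 5,        4 * s / 5],
              [-1 / (10 * s), -3 / 10]]),
    np.array([[3 / 5,        0.0],
              [9 / (10 * s),  1.0]]),
    np.array([[0.0,          0.0],
              [9 / (10 * s), -3 / 10]]),
    np.array([[0.0,          0.0],
              [-1 / (10 * s), 0.0]]),
]

# Refinability check: integrating Phi(t) = sum_k G[k] Phi(2t - k)
# gives (sum_k G[k]) v = 2 v, where v holds the integrals of phi_1, phi_2.
M = sum(G)
eigvals = np.linalg.eigvals(M)
assert np.isclose(eigvals.real.max(), 2.0)
```

With the values printed in Eq. (8.3), the eigenvalues of the summed matrix come out as 2 and -0.4, consistent with a valid two-scale relation.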
Fig. 8.4
figure 4

The GHM pair of scaling functions and wavelet functions

The GHM multiwavelet has several remarkable properties. The GHM scaling functions have short supports of [0, 1] and [0, 2]. The scaling functions are symmetric and the system exhibits a second order of approximation. Moreover, the multiwavelets form a symmetric/antisymmetric pair. The translates of the scaling functions and wavelets satisfy orthogonality, which is not possible in the case of the scalar wavelet. Figures 8.5, 8.6, 8.7 and 8.8 show the third level sub-band signals obtained by multiwavelet decomposition of the EEG signals shown in Fig. 8.2a–d respectively.

Fig. 8.5
figure 5

The third level sub-band signals obtained by multiwavelet decomposition of EEG signal corresponding to happy emotion as shown in Fig. 8.2a

Fig. 8.6
figure 6

The third level sub-band signals obtained by multiwavelet decomposition of EEG signal corresponding to neutral emotion as shown in Fig. 8.2b

Fig. 8.7
figure 7

The third level sub-band signals obtained by multiwavelet decomposition of EEG signal corresponding to sad emotion as shown in Fig. 8.2c

Fig. 8.8
figure 8

The third level sub-band signals obtained by multiwavelet decomposition of EEG signal corresponding to fear emotion as shown in Fig. 8.2d

2.4 Features Extraction

Many entropy-based methods have been proposed for EEG signal analysis, and different approaches for computing entropy in physiological systems have been developed in the literature. In [116], the researchers have suggested that a measure of entropy, as the rate of information generation of a chaotic system, would be a useful parameter for characterizing such a system. In [117], the authors have developed a method to calculate the Kolmogorov-Sinai (K-S) entropy of a time series. A modified version of the Eckmann and Ruelle (E-R) entropy has been proposed in [118] by modifying the distance metric proposed in [119]. The authors of [120] have suggested a modification of the E-R entropy by introducing a statistical entropy named approximate entropy (ApEn). However, it has been demonstrated that the method used to compute ApEn introduces a bias, as the ApEn algorithm counts each sequence as matching itself [121]. In order to reduce this bias, a modified version of the ApEn algorithm known as sample entropy (SampEn) has been proposed. The sample entropy measures the irregularity of a time series. In [122], the authors have compared the approximate entropy and sample entropy methods for neurophysiological signals. They have addressed issues related to the choice of the input parameters and have shown that the sample entropy approach produces more consistent results. They have also shown that the sample entropy is less sensitive to the length of the data. Recently, the sample entropy has been used as a feature for the classification of different classes of EEG signals [123].
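A compact, unoptimized sketch of the sample entropy computation described above. The template length m = 2 and tolerance r = 0.2 times the standard deviation are conventional defaults, not values taken from this chapter.

```python
import numpy as np

def sample_entropy(x, m=2, r=None):
    """SampEn(m, r) = -log(A / B), where B counts pairs of length-m
    templates matching within tolerance r (Chebyshev distance) and A
    counts pairs that still match at length m + 1. Unlike ApEn,
    self-matches are excluded, which removes the self-match bias."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * np.std(x)
    N = len(x)

    def count_matches(length):
        # All overlapping templates of the given length
        templates = np.array([x[i:i + length] for i in range(N - length)])
        count = 0
        for i in range(len(templates)):
            # Chebyshev distance to every later template (no self-match)
            d = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += np.sum(d <= r)
        return count

    B = count_matches(m)
    A = count_matches(m + 1)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf
```

A regular signal (e.g. a sampled sinusoid) yields a low SampEn, while white noise yields a high value, which is why the measure works as an irregularity feature.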

The features, namely the ratio of norms based measure, the Shannon entropy measure, and the normalized Renyi entropy measure, have been computed from the sub-signals obtained by the multiwavelet decomposition of EEG signals. These features are briefly described as follows:

Ratio of Norms Based Measure The ratio of norms based measure is defined as the ratio of the fourth-power norm to the square of the second-power norm [124]. It is expressed as:

$$ E_{RN} = \frac{{\sum\nolimits_{n = 1}^{N} |x[n]|^{4} }}{{\left[ {\sum\nolimits_{n = 1}^{N} |x[n]|^{2} } \right]^{2} }} $$
(8.5)

where \( x[n] \) is the signal under study and \( N \) is its number of samples.

Shannon Entropy Measure The Shannon entropy is a measure of the uncertainty of the signal [125]. For a probability distribution \( p_{k} \), \( k = 1, 2, \ldots ,L \), obtained from the signal, it can be defined as:

$$ E_{SE} = - \sum\limits_{k = 1}^{L} p_{k} \log [p_{k} ] $$
(8.6)

Normalized Renyi Entropy Measure The Renyi entropy measure can be normalized either with respect to the signal energy or to the distribution volume [126]. In this study, the normalized Renyi entropy \( E_{NE} \), normalized with respect to the signal energy, has been used. The \( E_{NE} \) can be expressed as follows:

$$ E_{NE} = \frac{1}{1 - \alpha }\log \left[ {\frac{{\sum\nolimits_{k = 1}^{L} p_{k}^{\alpha } }}{{\sum\nolimits_{k = 1}^{L} |p_{k} |}}} \right] $$
(8.7)

where α is the order of the Renyi entropy, which has been taken here as 3, the smallest odd integer order.
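Under the definitions in Eqs. (8.5)-(8.7), the three features can be sketched as follows. The function names are illustrative, and the distribution \( p_k \) would in practice be derived from a sub-signal (e.g., as normalized bin energies), an assumption made explicit in the comments:

```python
import numpy as np

def ratio_of_norms(x):
    """Eq. (8.5): fourth-power norm over the squared second-power norm."""
    x = np.asarray(x, dtype=float)
    return np.sum(np.abs(x) ** 4) / np.sum(np.abs(x) ** 2) ** 2

def shannon_entropy(p):
    """Eq. (8.6): Shannon entropy of a probability distribution p.

    p is assumed to be a nonnegative distribution summing to 1,
    e.g., normalized energies of a sub-signal.
    """
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # log(0) terms contribute zero
    return -np.sum(p * np.log(p))

def renyi_entropy_normalized(p, alpha=3):
    """Eq. (8.7): Renyi entropy of order alpha, normalized by sum(|p|)."""
    p = np.asarray(p, dtype=float)
    return np.log(np.sum(p ** alpha) / np.sum(np.abs(p))) / (1.0 - alpha)
```

For a uniform distribution both entropy measures reduce to log L, which is a quick sanity check on an implementation.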

2.5 Multiclass Least-Squares Support Vector Machine

Multiclass support vector machine (SVM) classifiers have become popular in recent years in the fields of classification, regression analysis, and novelty detection [127]. Multiclass least squares support vector machine (MC-LS-SVM) algorithms have shown very promising results as EEG signal classifiers [128].

The effectiveness of the proposed features in emotion classification from EEG signals is evaluated using an MC-LS-SVM. The least-squares support vector machines are a group of supervised learning methods that can be applied for the classification of data [129–132]. For the multiclass classification problem, we have considered the training data \( \{ x_{i} ,y_{i}^{k} \}_{i = 1,k = 1}^{i = P,k = m} \), where \( y_{i}^{k} \) denotes the output of the kth output unit for pattern i and P denotes the number of training patterns. The derivation of the MC-LS-SVM is based upon the following formulation [127, 133]:

$$ {\kern 1pt} {\text{Minimize}}\,J_{LS}^{(m)} (w_{k} ,b_{k} ,e_{i,k} ) = \frac{1}{2}\sum\limits_{k = 1}^{m} w_{k}^{T} w_{k} + \frac{\gamma }{2}\sum\limits_{i = 1}^{P} \sum\limits_{k = 1}^{m} e_{i,k}^{2} $$
(8.8)

with the following equality constraints:

$$ \left\{ {\begin{array}{*{20}l} {y_{i}^{1} [w_{1}^{T} g_{1} (x_{i} ) + b_{1} ] = 1 - e_{i,1} ,\quad i = 1,2, \ldots ,P} \hfill \\ {y_{i}^{2} [w_{2}^{T} g_{2} (x_{i} ) + b_{2} ] = 1 - e_{i,2} , \quad i = 1,2, \ldots ,P} \hfill \\ \cdot \hfill \\ \cdot \hfill \\ \cdot \hfill \\ {y_{i}^{m} [w_{m}^{T} g_{m} (x_{i} ) + b_{m} ] = 1 - e_{i,m} ,\quad i = 1,2, \ldots ,P} \hfill \\ \end{array} } \right. $$
(8.9)

where \( w_{k} \) and \( \gamma \) are the weight vector of the kth output unit and the regularization factor, respectively. The \( e_{i,k} \) and \( b_{k} \) denote the classification error and the bias, respectively. The \( g_{k} (\cdot) \) is a nonlinear function that maps the input space into a higher dimensional space. The Lagrangian with multipliers \( \alpha_{i,k} \) can be defined as [128]:

$$ L^{(m)} \left( {w_{k} ,b_{k} ,e_{i,k} ;\alpha_{i,k} } \right) = J_{LS}^{(m)} - \sum\limits_{i,k} \alpha_{i,k} \left\{ {y_{i}^{(k)} \left[ {w_{k}^{T} g_{k} (x_{i} ) + b_{k} } \right] - 1 + e_{i,k} } \right\} $$
(8.10)

which provides the following conditions for optimality:

$$ \left\{ {\begin{array}{*{20}l} {\frac{\partial L}{{\partial w_{k} }} = 0, \to w_{k} = \sum\limits_{i = 1}^{P} \alpha_{i,k} y_{i}^{(k)} g_{k} (x_{i} )} \hfill & {} \hfill \\ {\frac{\partial L}{{\partial b_{k} }} = 0, \to \sum\limits_{i = 1}^{P} \alpha_{i,k} y_{i}^{(k)} = 0} \hfill & {} \hfill \\ {\frac{\partial L}{{\partial e_{i,k} }} = 0, \to \alpha_{i,k} = \gamma e_{i,k} } \hfill & {} \hfill \\ {\frac{\partial L}{{\partial \alpha_{i,k} }} = 0, \to y_{i}^{(k)} [w_{k}^{T} g_{k} (x_{i} ) + b_{k} ] = 1 - e_{i,k} } \hfill & {} \hfill \\ \end{array} } \right. $$
(8.11)

where i = 1, 2, …, P and k = 1, 2, …, m. Elimination of \( w_{k} \) and \( e_{i,k} \) provides the linear system as:

$$ \left[ {\begin{array}{*{20}c} 0 & {Y_{M}^{T} } \\ {Y_{M} } & {\Omega _{M} } \\ \end{array} } \right]\left[ {\begin{array}{*{20}c} {b_{M} } \\ {\alpha_{M} } \\ \end{array} } \right] = \left[ {\begin{array}{*{20}c} 0 \\ {\bar{1}} \\ \end{array} } \right] $$

with the following matrices:

$$ \begin{aligned} Y_{M} & = {\text{blockdiag}}\left\{ {\left[ {\begin{array}{*{20}c} {y_{1}^{(1)} } \\ \vdots \\ {y_{P}^{(1)} } \\ \end{array} } \right], \ldots ,\left[ {\begin{array}{*{20}c} {y_{1}^{(m)} } \\ \vdots \\ {y_{P}^{(m)} } \\ \end{array} } \right]} \right\} \\ \Omega _{M} & = {\text{blockdiag}}\{\Omega _{1} , \ldots ,\Omega _{m} \},\quad \Omega _{k} = \left[ {y_{i}^{(k)} y_{j}^{(k)} g_{k}^{T} (x_{i} )g_{k} (x_{j} )} \right]_{i,j = 1}^{P} + \gamma^{ - 1} I \\ \bar{1} & = [1, \ldots ,1]^{T} ,\quad b_{M} = [b_{1} , \ldots ,b_{m} ]^{T} \\ \alpha_{M} & = [\alpha_{1,1} , \ldots ,\alpha_{P,1} ; \ldots ;\alpha_{1,m} , \ldots ,\alpha_{P,m} ]^{T} \\ \end{aligned} $$

where \( K_{k} (x,x_{i} ) = g_{k}^{T} (x)g_{k} (x_{i} ) \) is the kernel function, which satisfies Mercer's condition [127]. The decision function of the MC-LS-SVM is defined as [134]:

$$ f(x) = {\kern 1pt} {\text{sign}}{\kern 1pt} \left[ {\sum\limits_{i = 1}^{P} \alpha_{i,k} y_{i}^{(k)} K_{k} (x,x_{i} ) + b_{k} } \right] $$
(8.12)

The radial basis function (RBF) kernel for MC-LS-SVM can be defined as [45]:

$$ K_{k} (x,x_{i} )\, = {\kern 1pt} \,{ \exp }{\kern 1pt} \left[ {\frac{{ - ||x - x_{i} ||^{2} }}{{2\sigma_{k}^{2} }}} \right] $$
(8.13)

where \( \sigma_{k} \) controls the width of the RBF kernel.
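To make the derivation concrete, a minimal binary LS-SVM sketch with the RBF kernel of Eq. (8.13) is given below: it solves the linear KKT system that follows from Eq. (8.11) directly and evaluates the decision function of Eq. (8.12). For the multiclass case, one such machine would be trained per output unit k (one-vs-rest) and the class with the largest decision value selected. Function names and the toy hyperparameter values are illustrative, not taken from this chapter:

```python
import numpy as np

def rbf_kernel(X, Z, sigma=1.0):
    """Gaussian (RBF) kernel matrix between row-vector sets X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Binary LS-SVM: solve the (P+1)x(P+1) KKT linear system for (b, alpha)."""
    P = len(y)
    K = rbf_kernel(X, X, sigma)
    Omega = (y[:, None] * y[None, :]) * K + np.eye(P) / gamma
    A = np.zeros((P + 1, P + 1))
    A[0, 1:] = y          # [0  y^T] [b    ]   [0]
    A[1:, 0] = y          # [y  Om ] [alpha] = [1]
    A[1:, 1:] = Omega
    rhs = np.concatenate(([0.0], np.ones(P)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]            # bias b, multipliers alpha

def lssvm_decision(X, Xtr, ytr, b, alpha, sigma=1.0):
    """Decision values sum_i alpha_i y_i K(x, x_i) + b for each row of X."""
    return rbf_kernel(X, Xtr, sigma) @ (alpha * ytr) + b
```

Because the equality constraints replace the inequality constraints of the classical SVM, training reduces to a single dense linear solve, which is what makes the LS-SVM attractive for moderate training-set sizes.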

The multidimensional wavelet kernel function for the MC-LS-SVM can be given as [134, 135], where \( \psi (\cdot) \) denotes the mother wavelet and \( a_{k} \) the dilation parameter:

$$ K_{k} (x,x_{i} ) = \prod\limits_{l = 1}^{d} \psi \left( {\frac{{x^{l} - x_{i}^{l} }}{{a_{k} }}} \right) $$
(8.14)

The kernel function of Mexican hat wavelet for MC-LS-WSVM can be defined as [128]:

$$ K_{k} (x,x_{i} ) = \prod\limits_{l = 1}^{d} \left[ {1 - \frac{{(x^{l} - x_{i}^{l} )^{2} }}{{a_{k}^{2} }}} \right]{\kern 1pt} { \exp }{\kern 1pt} \left[ { - \frac{{(x^{l} - x_{i}^{l} )^{2} }}{{2a_{k}^{2} }}} \right] $$
(8.15)

Similarly, the kernel function of Morlet mother wavelet for MC-LS-WSVM can be defined as [128]:

$$ K_{k} (x,x_{i} ) = \prod\limits_{l = 1}^{d} {\kern 1pt} { \cos }{\kern 1pt} \left[ {\omega_{0} \frac{{(x^{l} - x_{i}^{l} )}}{{a_{k} }}} \right]{\kern 1pt} { \exp }{\kern 1pt} \left[ { - \frac{{(x^{l} - x_{i}^{l} )^{2} }}{{2a_{k}^{2} }}} \right] $$
(8.16)

where \( x_{i}^{l} \) is the lth component of the ith training vector and \( \omega_{0} \) is the center frequency of the Morlet wavelet.
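The product-form wavelet kernels of Eqs. (8.15) and (8.16) are straightforward to evaluate componentwise; a minimal sketch follows. The choice ω0 = 5 is a typical Morlet center frequency assumed here, and the function names are illustrative:

```python
import numpy as np

def mexican_hat_kernel(x, xi, a=1.0):
    """Product of 1-D Mexican hat wavelets over the d components (Eq. 8.15)."""
    d = np.asarray(x, dtype=float) - np.asarray(xi, dtype=float)
    return np.prod((1.0 - d ** 2 / a ** 2) * np.exp(-d ** 2 / (2.0 * a ** 2)))

def morlet_kernel(x, xi, a=1.0, w0=5.0):
    """Product of 1-D Morlet wavelets over the d components (Eq. 8.16)."""
    d = np.asarray(x, dtype=float) - np.asarray(xi, dtype=float)
    return np.prod(np.cos(w0 * d / a) * np.exp(-d ** 2 / (2.0 * a ** 2)))
```

Both kernels evaluate to 1 when x = x_i and are symmetric in their arguments, two quick sanity checks for any implementation.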

3 Results and Discussion

In the proposed method, measuring emotions through EEG signals is advantageous because it is difficult to influence electrical brain activity intentionally. The EEG signals are acquired using an audio–video stimulus, since such stimuli are more effective for evoking emotions. The EEG signals are first pre-processed with the repeated-row method to form an input signal vector; this vector is then decomposed into sub-signals through the GHM multiwavelet with 3-level decomposition. Multiwavelets simultaneously offer orthogonality, symmetry, and compact support, and therefore outperform scalar wavelets. The features, namely the ratio of norms based measure, the Shannon entropy measure, and the normalized Renyi entropy measure, have been extracted from the sub-signals obtained by the GHM multiwavelet decomposition of EEG signals. To the knowledge of the authors, there is no other work in the literature related to emotion classification using features based on the multiwavelet transform of EEG signals. Emotion classification is a multiclass classification problem, and it has recently been shown that wavelet-based kernels perform better than the RBF kernel in the MC-LS-SVM classifier for multiclass classification problems. This motivates the use of these kernels with the MC-LS-SVM classifier for emotion classification. The features have therefore been used as input to the MC-LS-SVM classifier with the RBF kernel, the Mexican hat wavelet kernel, and the Morlet wavelet kernel for the classification of emotions from EEG signals.

The classification performance of the MC-LS-SVM classifier for emotion classification has been determined by computing the classification accuracy under ten-fold cross-validation, together with the confusion matrix. The classification accuracy (Acc) is defined as the ratio of the number of correctly detected events to the total number of events:

$$ {\kern 1pt} {\text{Acc}} = \frac{{{\kern 1pt} {\text{number}}\,{\text{of}}\,{\text{correctly}}\,{\text{detected}}\,{\text{events}}}}{{{\text{total}}\,{\text{number}}\,{\text{of}}\,{\text{events}}}} \times 100 $$
(8.17)

In ten-fold cross-validation, a dataset Y is randomly divided into 10 disjoint subsets \( Y_{1} ,Y_{2} , \ldots ,Y_{10} \) of nearly equal size, each preserving the class proportions. The procedure is then repeated 10 times; each time, the test set is formed from one of the 10 subsets and the remaining 9 subsets form the training set. The average accuracy across all 10 trials gives the final classification accuracy. A confusion matrix contains information about the actual and predicted classifications performed by a classification method, and reveals the common misclassifications in the classification of emotions from EEG signals.
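The ten-fold evaluation and confusion matrix described above can be sketched generically as follows; `train_fn` and `predict_fn` stand in for any classifier (the MC-LS-SVM here), and the fold assignment below is random rather than stratified, a simplification relative to the text:

```python
import numpy as np

def ten_fold_accuracy(X, y, train_fn, predict_fn, k=10, seed=0):
    """Average classification accuracy (%) over k random folds (Eq. 8.17)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_fn(X[train], y[train])
        accs.append(np.mean(predict_fn(model, X[test]) == y[test]))
    return 100.0 * float(np.mean(accs))

def confusion_matrix(y_true, y_pred, labels):
    """Rows index the actual class, columns the predicted class."""
    index = {c: i for i, c in enumerate(labels)}
    cm = np.zeros((len(labels), len(labels)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1
    return cm
```

Off-diagonal entries of the confusion matrix directly expose which emotion pairs, such as sad and neutral, are most often confused.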

Table 8.1 shows the classification accuracy (%) for the RBF, Mexican hat wavelet, and Morlet wavelet kernel functions of the MC-LS-SVM classifier for emotion classification with the GHM multiwavelet. The proposed method obtains classification accuracies of 89.17 % for happy, 81.67 % for neutral, 85.00 % for sad, and 83.33 % for fear. The classification accuracy for the happy class is the highest, while the neutral emotion has the lowest accuracy, possibly because it is influenced by the other emotion classes. The overall classification accuracy for emotions from EEG signals obtained by the proposed method is 84.79 % with the Morlet wavelet kernel function of the MC-LS-SVM classifier. Table 8.2 shows the confusion matrix for the classification of emotions from EEG signals with the Morlet wavelet kernel function. The highest misclassification occurs between the sad and neutral emotions, while the happy–fear and happy–sad pairs show the same level of misclassification. Table 8.3 presents a comparison between the proposed method and other existing methods in the literature for emotion classification. It is clear from Table 8.3 that the proposed method provides better classification performance than the existing methods, which may be attributed to the combination of the proposed MWT-based features and the MC-LS-SVM classifier.

Table 8.1 The classification accuracy (%) with different kernels of the MC-LS-SVM classifier for emotion classification from EEG signals using GHM multiwavelet
Table 8.2 The confusion matrix of Morlet wavelet kernel function of the MC-LS-SVM classifier for classification of emotion from EEG signals
Table 8.3 A comparison of classification accuracy of the different emotion classification methods

4 Conclusion

This chapter explores the capability of the proposed features derived from the MWT for the classification of emotions from EEG signals. The EEG signals are first decomposed into several sub-signals through the 3-level MWT with repeated-row preprocessing. In the multiwavelet transform, the repeated-row preprocessing of the scalar input oversamples the EEG signal, which makes the extracted features more discriminative. In addition, because the multiwavelet decomposition involves two or more scaling and wavelet functions, the low-pass and high-pass filters are matrices instead of scalars. The features, namely the ratio of norms based measure, the Shannon entropy measure, and the normalized Renyi entropy measure, are extracted from the sub-signals obtained by the multiwavelet decomposition of EEG signals. These features are then used as input to the MC-LS-SVM classifier for the automatic classification of emotions. The experimental results indicate that the Morlet wavelet kernel function of the MC-LS-SVM classifier provides a classification accuracy of 84.79 % for the classification of emotions from EEG signals.

The EEG signal processing based methodology for emotion classification may be improved further. The method developed in this chapter captures only the static properties of the EEG signal in response to emotional stimuli. Methodologies can be developed to include the temporal dynamics of emotional information processing in the human cognitive system; it is expected that such processing may estimate the emotional state more accurately. It would also be of interest to develop new nonstationary signal decomposition based methodologies and machine learning algorithms for improving the classification accuracy of human emotion classification from EEG signals. In this study, the kernel functions used in the MC-LS-SVM classifier and their parameters have been selected by trial and error. In future, research can be directed toward the automatic selection of kernel functions and kernel parameters for the automatic classification of human emotions from EEG signals.