1 Introduction

Emotions play a fundamental role in human communication and interaction, significantly influencing our behavior, decision-making, and overall well-being. Accurately detecting and interpreting human emotions has far-reaching applications, ranging from affective computing to mental health monitoring and human-robot interaction. Among the various modalities used for emotion analysis, electroencephalography (EEG) stands out as a promising, non-invasive technique for capturing the neural correlates of emotions. EEG offers several advantages for emotion recognition: it is non-invasive and relatively affordable, and its high temporal resolution allows precise examination of rapid changes in emotional states. Moreover, EEG captures brain activity associated with both conscious and unconscious emotional processes, offering a comprehensive perspective on emotional experiences. Unlike physiological signals from the autonomic nervous system (such as heart rate or galvanic skin response), which are vulnerable to noise, signals captured directly from the central nervous system, such as the EEG, express the emotional experience at its origin. This has sparked extensive research in EEG-based emotion recognition, aiming to harness the power of EEG signals to advance our understanding of emotions and pave the way for practical applications [10]. Leveraging machine learning and pattern recognition techniques has made it possible to translate complex EEG data into meaningful emotional states, bridging the gap between neuroscience and technology.

Research in this area has centered around two widely accepted emotion models: discrete and dimensional. Under the discrete basic-emotion approach, emotions are categorized into six fundamental emotions: sadness, joy, surprise, anger, disgust, and fear. Alternatively, the dimensional approach classifies emotions along multiple dimensions (valence, arousal, and dominance). Valence pertains to the level of positivity or negativity experienced by an individual, while arousal reflects the degree of emotional excitement or indifference. The dominance dimension spans a spectrum from submissiveness (lack of control) to dominance (assertiveness). In practice, emotion recognition predominantly relies on the dimensional approach because it is simpler than the detailed description of discrete basic emotions [13] and allows for quantitative analysis; our investigations in this work follow it. Emotion recognition from EEG involves extracting relevant time- or frequency-domain features in response to stimuli evoking different emotions. However, a common limitation of existing methods is that univariate feature extraction ignores the spatial correlation between EEG electrodes. The EEG brain network is a highly valuable approach for examining EEG signals, wherein each EEG channel serves as a node and the connections between nodes are referred to as edges. The concept of brain connectivity encompasses functional connectivity and effective connectivity [1, 2]. Moreover, findings in cognitive neuroscience have provided evidence for structural and functional dissimilarities between the brain hemispheres [5].

To address this limitation and leverage hemispherical functional brain connections for emotion recognition, our work employs the phase locking value (PLV) method [9], which enables the investigation of task-induced changes in long-range synchronization of neural activity in EEG data.

Fig. 1. Block diagram of our proposed method

Based on this, we investigate connections both within each hemisphere and across hemispheres. This paper therefore proposes an EEG emotion recognition scheme based on significant phase locking value (PLV) features extracted from hemispherical brain regions in the EEG data of the DEAP dataset [7], in order to understand the functional connections underlying within-hemisphere and cross-hemisphere activity. By comparing how well various machine learning models recognize human emotions from EEG signals, this work sheds light on which rhythmic EEG bands (alpha, beta, theta, gamma, or all) and which hemispherical brain connections (within or cross) are most efficient and responsive for measuring emotional state.

2 Related Work

Many studies have used the DEAP dataset for emotion recognition. Wang et al. (2018) [14] used an EEG-specific 3-D CNN architecture to extract spatio-temporal emotional features, which are then used for classification. Chen et al. (2015) [4] used connectivity-feature representations for valence and arousal classification.

Findings in cognitive neuroscience have provided evidence for structural and functional dissimilarities between the brain hemispheres [5]. Zheng et al. (2015) [17] investigated the cognitive characteristics induced by emotional stimuli, revealing distinct cognitive variances between the left and right hemispheres; the study indicated that the right hemisphere exhibits enhanced sensitivity towards negative emotions. Similarly, Li et al. (2021) [11] computed differential entropy between pairs of EEG channels positioned symmetrically in the two hemispheres, and used a bi-hemisphere domain adversarial neural network to learn emotional features distinctively from each hemisphere.

Consequently, the analysis of EEG signals in both the left and right hemispheres holds immense significance for advancing emotion recognition techniques. Following this, Zhang et al. (2022) [16] focused on the asymmetry of the brain's hemispheres and employed cross-frequency Granger causality analysis to extract relevant features from both hemispheres, highlighting the significance of considering functional connectivity between hemispheres and leveraging cross-frequency interactions to improve the performance of EEG-based emotion recognition systems.

Fig. 2. 32 electrode positions in the international 10-20 system; selected electrodes from each hemisphere for PLV features: left (in green color) and right (in blue color) (Color figure online)

Wang et al. (2019) [15] used the phase-locking value (PLV) to extract information about functional connections along with the spatial information of electrodes and brain regions.

3 Proposed Method

3.1 Dataset Description

The DEAP (Database for Emotion Analysis using Physiological Signals) dataset [7] consists of data from 32 participants, who were exposed to 40 one-minute video clips with varying emotional content. These video clips were carefully selected to elicit different emotional states.

The EEG signal was recorded at 512 Hz and down-sampled to 128 Hz in this work. Although each video lasted one minute, recording started 3 s earlier, resulting in recordings of 63 s. The recorded data therefore has dimension 32 \(\times \) 40 \(\times \) 32 \(\times \) 8064 (participants \(\times \) videos \(\times \) channels \(\times \) EEG sampling points).

At the end of each trial, the participants self-reported their emotional levels in terms of valence, arousal, dominance, liking, and familiarity. Each scale spans from a low of one to a high of nine. To create binary classification tasks, each of the valence, arousal, and dominance scales is divided into two categories: high (ratings from five to nine) and low (ratings from one to five).

A block diagram depicting our proposed method is shown in Fig. 1, while Algorithms 1 and 2 depict our proposed approach.

Algorithm 1. Emotion Recognition: Preprocessing

Algorithm 2. Emotion Recognition: Feature Extraction and Selection

3.2 Data Preprocessing

Preprocessing involved several steps to enhance the quality and extract relevant information from the recorded signals, which are described as follows.

First, we applied Z-score normalization, which helps eliminate individual variations and biases in the recorded signals. Since recording started 3 s before the actual video, the signal in the first three seconds is used for z-score-based baseline removal. For an input signal \(x(i)\), we first calculate the mean (\(\mu \)) and standard deviation (\(\sigma \)) of the signal over the first three seconds (N = 3 \(\times \) 128 samples), and then normalize as shown in Eq. 1.

$$\begin{aligned} x_{\text {normalized}}(i) = \frac{x(i) - \mu }{\sigma } \end{aligned}$$
(1)
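As a minimal sketch of Eq. 1 (not the authors' code), the baseline z-score normalization of a single-channel trial can be written with NumPy as follows, using the first 3 s at 128 Hz as the baseline window:

```python
import numpy as np

def baseline_zscore(x, fs=128, baseline_s=3):
    """Eq. 1: subtract the mean and divide by the standard deviation
    computed over the pre-stimulus baseline (first `baseline_s` seconds)."""
    n = fs * baseline_s                       # N = 3 x 128 = 384 baseline samples
    mu, sigma = x[:n].mean(), x[:n].std()
    return (x - mu) / sigma

# toy single-channel trial: 63 s at 128 Hz (3 s baseline + 60 s video)
rng = np.random.default_rng(0)
trial = rng.normal(loc=5.0, scale=2.0, size=63 * 128)
normalized = baseline_zscore(trial)
```

After normalization, the baseline segment itself has zero mean and unit standard deviation, which is a quick sanity check on the implementation.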

We then used wavelet-based multiscale PCA [3] to remove noise from the normalized signal. First, a covariance matrix \(C_j\) of the wavelet coefficients is computed at each scale, as shown in Eq. 2, and the eigenvectors with the highest eigenvalues are obtained by eigenvalue decomposition. In this work, however, we retained the whole set of principal components instead of a subset.

Fig. 3. PLV features extracted for within-hemisphere (top) and cross-hemisphere (bottom) for the first subject and first trial

$$\begin{aligned} C_j = \frac{1}{N} \sum _{n=1}^{N} W_{\text {coeff}}(n, j) \cdot W_{\text {coeff}}(n, j)^T \end{aligned}$$
(2)

Finally, the normalized EEG signal is projected onto the selected eigenvectors \(V_{ij}\) (the \(i\)th eigenvector at scale \(j\)) at each scale to obtain the wavelet-based multiscale PCA features, as shown in Eq. 3.

$$\begin{aligned} \text {PCA}_{ij}(n) = x_{\text {normalized}}(n) \cdot V_{ij} \end{aligned}$$
(3)
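A sketch of the multiscale PCA pipeline of Eqs. 2-3, assuming PyWavelets and scikit-learn (the paper does not name its implementation, and the wavelet choice `db4`/level 4 here is ours): each channel is decomposed with a discrete wavelet transform, PCA is applied across channels at every scale, the coefficients are projected onto the eigenvectors, and the channels are reconstructed. Because all principal components are retained, as in the paper, the round trip is lossless.

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

def multiscale_pca(X, wavelet="db4", level=4):
    """Wavelet-based multiscale PCA sketch.  X: (channels, samples)."""
    coeffs = [pywt.wavedec(x, wavelet, level=level) for x in X]   # per channel
    n_scales = level + 1
    projected = []
    for j in range(n_scales):
        Wj = np.stack([c[j] for c in coeffs])    # (channels, coeff_len) at scale j
        pca = PCA()                              # eigendecomposition of C_j (Eq. 2)
        # project onto the eigenvectors and back (Eq. 3); retaining all
        # components, as the paper does, makes this step lossless
        projected.append(pca.inverse_transform(pca.fit_transform(Wj)))
    rec = [pywt.waverec([projected[j][ch] for j in range(n_scales)], wavelet)
           for ch in range(X.shape[0])]
    return np.stack([r[: X.shape[1]] for r in rec])

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 512))                    # toy 8-channel EEG segment
X_out = multiscale_pca(X)
```

In a denoising configuration one would instead keep only the top eigenvectors at each scale; the full-component variant above reproduces the input, which serves as a correctness check.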

After applying these steps, bandpass filtering is performed on the EEG signals by decomposing them into \(\alpha \) (8–15 Hz), \(\beta \) (16–30 Hz), \(\theta \) (4–7 Hz), and \(\gamma \) (30–45 Hz) bands.
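The band decomposition can be sketched with zero-phase Butterworth filters as below; the filter type and order are our assumptions, since the paper specifies only the band edges.

```python
import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"theta": (4, 7), "alpha": (8, 15), "beta": (16, 30), "gamma": (30, 45)}

def band_decompose(x, fs=128, order=4):
    """Split one EEG channel into the four rhythmic bands used in this
    work, via zero-phase Butterworth band-pass filtering."""
    nyq = fs / 2
    out = {}
    for name, (lo, hi) in BANDS.items():
        b, a = butter(order, [lo / nyq, hi / nyq], btype="band")
        out[name] = filtfilt(b, a, x)            # zero-phase filtering
    return out

fs = 128
t = np.arange(0, 4, 1 / fs)
# test tone: 10 Hz (alpha) plus 40 Hz (gamma)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
bands = band_decompose(x, fs)
```

On this test tone, nearly all signal energy lands in the alpha and gamma outputs, while the theta output is strongly attenuated.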

3.3 Feature Extraction

The extraction of phase-locking value (PLV) features from each EEG band (\(\alpha \), \(\beta \), \(\theta \), and \(\gamma \)) involves a series of steps to quantify phase coupling between electrode pairs within each frequency band. The electrodes selected from each hemisphere for extracting these features, 14 from the left and 14 from the right, are shown in Fig. 2.

The extraction process involves the following steps. Firstly, we segment the preprocessed EEG signals into 10 time windows of 6 s each. Next, for each segment, the EEG signals within the selected frequency band are processed using the Hilbert transform H(x(t)), to obtain the instantaneous phase information, as shown in Eq. 4.

$$\begin{aligned} \phi (t) = \arctan \left( \frac{{H(x(t))}}{{x(t)}}\right) \end{aligned}$$
(4)
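In practice Eq. 4 is obtained from the analytic signal: SciPy's `hilbert` returns \(x(t) + iH(x)(t)\), whose complex angle is exactly \(\arctan (H(x)/x)\). A minimal sketch:

```python
import numpy as np
from scipy.signal import hilbert

def instantaneous_phase(x):
    """Eq. 4: phase of the analytic signal x(t) + i*H(x)(t)."""
    return np.angle(hilbert(x))

fs = 128
t = np.arange(0, 2, 1 / fs)
phi = instantaneous_phase(np.sin(2 * np.pi * 5 * t))
# a pure 5 Hz tone accumulates 2*pi*5 radians of phase per second,
# i.e. 2*pi*5/fs radians per sample
step = np.diff(np.unwrap(phi))
```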
Table 1. Classifier Details

Once the instantaneous phase information is obtained, the pairwise phase difference is computed for each electrode pair within the frequency band of interest. Finally, the PLV for each electrode pair is calculated as the absolute value of the average, over the entire segment, of the complex exponential of the pairwise phase differences (\(\varDelta \phi _n\)), as shown in Eq. 5. An example of PLV features extracted for within-hemisphere and cross-hemisphere is shown in Fig. 3.

$$\begin{aligned} \text {PLV} = \left| \frac{1}{N} \sum _{n=1}^{N} \exp (i\varDelta \phi _n) \right| \end{aligned}$$
(5)
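Eq. 5 can be sketched for one electrode pair as follows (a toy illustration, not the authors' code): two signals with a constant phase lag yield a PLV near 1, while a signal paired with unrelated noise yields a PLV near 0.

```python
import numpy as np
from scipy.signal import hilbert

def plv(x, y):
    """Eq. 5: magnitude of the time-averaged complex exponential of the
    instantaneous phase difference between two channels."""
    dphi = np.angle(hilbert(x)) - np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * dphi)))

fs = 128
t = np.arange(0, 6, 1 / fs)                    # one 6-s segment
a = np.sin(2 * np.pi * 10 * t)
b = np.sin(2 * np.pi * 10 * t + 0.8)           # constant phase lag
rng = np.random.default_rng(2)
noise = rng.standard_normal(t.size)            # unrelated broadband noise
locked, unlocked = plv(a, b), plv(a, noise)
```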

In our proposed approach, we calculate PLVs within each frequency band for both within-hemisphere and cross-hemisphere connections, as follows. The 28 electrodes remaining after removal of the four midline electrodes (Fz, Cz, Pz, and Oz) are symmetrical. To investigate the role of hemispherical functional brain connections, we compute PLVs on these electrodes through two kinds of pairings: (1) within-hemisphere, where each electrode forms a pair with every other electrode in the same hemisphere, and (2) cross-hemisphere, where each electrode in one hemisphere forms a pair with each electrode in the other hemisphere. The former reflects the connections within each hemisphere, while the latter reflects the connections between the left and right hemispheres.

Since there are 14 EEG electrode nodes in each hemisphere, the number of effective PLV values is 14 \(\times \) 14 = 196 in the cross-hemisphere case and 14 \(\times \) 14 \(\times \) 2 = 392 in the within-hemisphere case.
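These counts can be reproduced by enumerating the pairings. The electrode names below are the standard 32-channel DEAP montage with the four midline electrodes removed (assumed to match the selection in Fig. 2); the within-hemisphere count follows the text, i.e. one full 14 \(\times \) 14 PLV matrix per hemisphere.

```python
# 14 electrodes per hemisphere (standard DEAP montage, midline removed)
left  = ["Fp1", "AF3", "F3", "F7", "FC5", "FC1", "C3", "T7",
         "CP5", "CP1", "P3", "P7", "PO3", "O1"]
right = ["Fp2", "AF4", "F4", "F8", "FC6", "FC2", "C4", "T8",
         "CP6", "CP2", "P4", "P8", "PO4", "O2"]

# cross-hemisphere: every left electrode paired with every right one
cross = [(l, r) for l in left for r in right]              # 14 * 14 = 196

# within-hemisphere: a full 14 x 14 PLV matrix per hemisphere,
# as counted in the text (including self- and symmetric pairs)
within = [(a, b) for side in (left, right) for a in side for b in side]

print(len(cross), len(within))
```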

Table 2. Accuracy on the DEAP dataset for Valence Classification using features from Within and Cross Hemisphere

3.4 Feature Selection

We select relevant PLV features for classification using two feature selection methods, mRmR (maximum Relevance - minimum Redundancy) and chi-square, explained as follows. The mRmR algorithm [12] selects features that have high relevance to the target variable (here, the emotion class) while minimizing redundancy among the selected features, as shown for a specific feature \(F_i\) in Eq. 6.

$$\begin{aligned} \text {{mRmR}}(F_i) = \text {{Relevance}}(F_i) - \alpha \times \text {{Redundancy}}(F_i) \end{aligned}$$
(6)
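A greedy selection loop in the spirit of Eq. 6 can be sketched as below. Absolute Pearson correlation stands in here for both the relevance and redundancy terms; the paper's exact scoring follows [12] and may differ.

```python
import numpy as np

def mrmr_select(X, y, k, alpha=1.0):
    """Greedy sketch of Eq. 6: at each step pick the unselected feature
    maximizing Relevance(F_i) - alpha * Redundancy(F_i)."""
    relevance = np.array([abs(np.corrcoef(X[:, j], y)[0, 1])
                          for j in range(X.shape[1])])
    selected = [int(np.argmax(relevance))]     # start from the most relevant
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(X.shape[1]):
            if j in selected:
                continue
            # redundancy: mean correlation with already-selected features
            redundancy = np.mean([abs(np.corrcoef(X[:, j], X[:, s])[0, 1])
                                  for s in selected])
            score = relevance[j] - alpha * redundancy        # Eq. 6
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# toy example: column 1 duplicates column 0; column 2 is only partly relevant
y = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
X = np.column_stack([y, y, np.array([0, 0, 1, 0, 1, 1, 1, 0], dtype=float)])
picked = mrmr_select(X, y, k=2, alpha=1.5)
# the exact duplicate (column 1) is penalized for redundancy,
# so the less redundant column 2 is chosen second
```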

We also use the chi-square statistic [6] to select the features with the highest statistical significance with respect to the target variable. Its calculation for a specific feature \(F_i\), with c classes in the target variable and observed and expected frequencies \(O_{ij}\) and \(E_{ij}\), is shown in Eq. 7. Using these methods, we identify the fifty most relevant and discriminative PLV features for our task.

$$\begin{aligned} \chi ^2(F_i) = \sum _{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \end{aligned}$$
(7)
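With scikit-learn, chi-square selection reduces to `SelectKBest` with the `chi2` scorer; PLV values lie in [0, 1], so they satisfy its non-negativity requirement. The data below is a synthetic stand-in (the paper selects fifty features from real DEAP trials; k = 2 keeps the toy example small):

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2

# toy stand-in: 100 trials x 6 "PLV" features in [0, 1],
# where only feature 0 tracks the binary emotion label
rng = np.random.default_rng(4)
labels = rng.integers(0, 2, 100)
features = rng.uniform(0.0, 1.0, size=(100, 6))
features[:, 0] = 0.2 + 0.6 * labels + 0.05 * rng.uniform(size=100)

# keep the k features with the highest chi-square statistic (Eq. 7)
selector = SelectKBest(chi2, k=2).fit(features, labels)
top = sorted(np.flatnonzero(selector.get_support()))
```

The discriminative feature 0 is reliably among the selected columns.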
Table 3. Accuracy on the DEAP dataset for Arousal Classification using features from Within and Cross Hemisphere

3.5 Classification

PLV features are extracted from each band, for both within- and cross-hemisphere connections. The fifty most significant features are selected through the mRmR and chi-squared methods, and classification is performed on them. We employed several popular machine learning classifiers to learn the underlying patterns and relationships, namely K-Nearest Neighbors (KNN), Decision Tree, Support Vector Machines (SVM), Random Forest, and AdaBoost. The technical specifications of these models are shown in Table 1. The trained models were evaluated using 10-fold cross-validation; the mean and standard deviation of the fold accuracies are used as the evaluation metric to assess the performance of the models and approaches on the emotion recognition task.
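The evaluation protocol for one classifier can be sketched as follows with scikit-learn. The feature matrix is a synthetic stand-in for the fifty selected PLV features (the real pipeline uses DEAP trials), and the scaling step and k = 5 neighbors are our assumptions, since Table 1 carries the actual model settings.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# synthetic stand-in: 160 trials x 50 features, binary labels,
# with a class-dependent shift so the task is learnable
rng = np.random.default_rng(5)
y = rng.integers(0, 2, 160)
X = rng.normal(size=(160, 50)) + 0.8 * y[:, None]

clf = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
scores = cross_val_score(clf, X, y, cv=10)       # 10-fold cross-validation
summary = f"{scores.mean():.3f} +/- {scores.std():.3f}"   # mean accuracy +/- std
```

The same loop, repeated per band, hemisphere setting, selector, and classifier, yields the accuracy tables reported in the next section.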

Table 4. Accuracy on the DEAP dataset for Dominance Classification using features from Within and Cross Hemisphere

4 Results and Discussion

Table 2 shows the accuracy for valence classification from within and cross hemispheres. The approach using gamma-band features, selected through the mRmR method and classified with KNN, performs best at valence classification; moreover, PLV features from cross-hemisphere pairs perform better (accuracy of 79.4%) than those from within-hemisphere pairs (accuracy of 78.1%). Table 3 shows the accuracy for arousal classification from within and cross hemispheres. Again, gamma-band features selected through the mRmR/chi-square methods with the KNN classifier perform best, and cross-hemisphere PLV features perform slightly better (accuracy of 79.6%) than within-hemisphere ones (accuracy of 79.0%). Table 4 shows the accuracy for dominance classification from within and cross hemispheres. Once more, gamma-band features selected through the mRmR method with the KNN classifier perform best, and cross-hemisphere PLV features perform better (accuracy of 79.1%) than within-hemisphere ones (accuracy of 77.1%). Comparing our results with the state of the art (Table 5), we find that our approach achieves state-of-the-art accuracy.

Overall, the experimental results demonstrate that the gamma EEG band is most relevant for emotion recognition and that, among the machine learning classifiers, KNN achieves the best performance across all three ratings. Additionally, there is a minor increase in accuracy when PLV features are acquired cross-hemisphere as compared to within-hemisphere.

Table 5. Comparison with the state of the art

5 Conclusion

In this paper, we have performed an emotion recognition task based on brain functional connectivity. First, EEG signals are preprocessed and denoised using wavelet-based multiscale PCA. Then, PLV features are extracted from these processed signals, and mRmR feature selection is applied to examine the performance of within-hemisphere and cross-hemisphere brain connections. The results show that the gamma band is the most effective and relevant for the emotion recognition task. We achieved the best performance with the KNN classifier across the three emotion rating dimensions (valence, arousal, and dominance) using cross-hemisphere connections. Although the difference between the two scenarios is very small, we conclude that phase information obtained from cross-hemisphere connections is more reliable than that from within-hemisphere connections. Furthermore, since the information extracted via brain connections requires ever more EEG electrodes to enhance recognition performance, which in turn increases the complexity of data acquisition, we are interested in multivariate phase synchronization, which improves the estimation of region-to-region source-space connectivity with fewer EEG channels while eliminating uninformative electrodes. We leave this interesting topic as future work.