1 Introduction

Emotions play a significant role in human life and influence typical daily activities such as cognition, decision-making, and intelligence. In addition to logical thinking, emotional ability is essential to human intellect, and a current trend in human–computer interaction (HCI) research is the development of emotional artificial intelligence. Beyond emotional intelligence, emotions also have a direct bearing on several mental illnesses, such as depression, autism, and game addiction. In this context, affective computing has emerged as a multidisciplinary research area aiming to create artificial intelligence systems capable of understanding, recognizing, and managing human emotions. Its main goal is to identify and simulate human emotions using machine learning and pattern analysis techniques. Emotion recognition from EEG data has proven more credible than frameworks that rely on outward appearances such as facial expressions, gestures, and speech signals, which may convey pretended emotions [1]. Despite having a low spatial resolution, EEG has a high temporal resolution and can be used to assess changes in signal characteristics brought on by emotional inputs. Further, EEG is affordable, easy to set up, and non-invasive, making it a notable option for studying brain responses to emotional stimuli [2]. Since it enables the quantification of neurological activity from the brain using contact electrodes attached to the scalp, EEG has emerged as a vital technology for HCI systems [3].

Most existing works used time-domain feature extraction techniques on EEG, a few used feature selection, and the rest used complex deep learning-based classifiers. Time-domain feature extraction methods may not always capture the hidden, inherently discriminative EEG features, resulting in limited classification performance. Also, using more hand-crafted features together with feature selection techniques may speed up and improve classification, but at the cost of increased computational complexity of cross-subject emotion recognition (CSER) systems. Because of EEG's low spatial resolution [26], several recent works used EEG signals from multiple channels. It may be possible to extract more emotion-intense features by using frequency-domain attributes and temporal information from various EEG channels [5]. This prompted us to explore a time–frequency analysis technique, complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), to decompose EEG into intrinsic mode functions (IMFs) and to compute normal inverse Gaussian (NIG) PDF features from the resulting mode functions. Time–frequency (T-F) analysis techniques have drawn notable interest owing to their efficacy in identifying potential EEG features. Other T-F analysis tools include the short-time Fourier transform (STFT) [27], empirical mode decomposition (EMD) [9, 28], the wavelet transform (WT), and variational mode decomposition [29]. These techniques have drawbacks: the STFT and WT [30] rely on stationarity assumptions at specific scales when analyzing non-stationary EEG. The WT uses a more efficient basis than the STFT, but its basis is likewise pre-defined, through the choice of mother wavelet and several pre-fixed filters. A few studies used EMD to decompose EEG into IMFs to overcome this WT constraint, and various attributes were derived from the IMFs for this task [9, 28]. However, EMD suffers from mode mixing, noisy mode functions, and a divergent intermediate frequency [21].
Improved variants, such as ensemble EMD (EEMD) and the complete ensemble EMD with adaptive noise (CEEMDAN), were explored to work around some of these limitations [29]. However, EEMD has a few drawbacks in addressing mode mixing that make its use tedious for real-time applications. In contrast, CEEMDAN offers greater spectral separation between intrinsic mode functions and ensures that the input signal is accurately reconstructed. These benefits prompted the use of CEEMDAN over the other EMD variants; consequently, CEEMDAN is an excellent technique for analyzing highly non-stationary EEG data. Thus, this study presents a CEEMDAN-domain analysis of EEG for emotion classification. The CEEMDAN-decomposed EEG signal helps capture the nonlinear EEG features relevant to the emotion recognition task. Further, we investigate the suitability of the NIG features for the IMFs resulting from CEEMDAN decomposition. The potency of the NIG density function in modeling the statistics of nonlinear signals, such as financial data, ultrasound images, and video signals [30,31,32,33,34], guided its use here for modeling the CEEMDAN mode functions. The final objective of this study is to explore whether NIG density function parameters determined from the IMFs of a large set of EEG signals can be used to classify diverse emotions, such as happy, relaxed, sad, and neutral. The presented approach has the following key contributions:

  • A novel framework for emotion recognition is introduced by combining CEEMDAN-based signal analysis with a fine-tuned XGboost classifier.

  • To capture the EEG's nonlinearity, NIG density parameters of the CEEMDAN-domain IMFs are computed, making it easier to derive the emotion-intense characteristics of the EEG precisely.

  • A cross-validation strategy is adopted to fine-tune XGboost's hyper-parameters, such as the total counts of trees and leaves, to develop an optimal model that enhances the capacity to discriminate between different emotions.

  • A cross-subject validation is also used to handle inter-subject variability efficiently, establishing the utility of the suggested method in different clinical settings.

2 Literature review

EEG has emerged as a vital technology for HCI systems, with applications in monitoring the health of people with disabilities [4]. Cross-subject emotion recognition (CSER), also called subject-independent emotion recognition (SIER), is a crucial component for system generality and scope, as opposed to subject-dependent emotion recognition (SDER). Several models and frameworks have been developed to assist people in recognizing and expressing their emotions and emotional states [5, 6]. Subject-independent evaluation is an essential aspect of EEG-based emotion identification applications, yet it remains understudied and system performance has so far been modest. Most recent subject-dependent studies have claimed significant recognition accuracy [7,8,9,10,11,12,13,14]. However, very few works have reported both subject-dependent and cross-subject evaluations, and CSER typically underperforms in most works. The work [15] used a fusion approach with linear discriminant analysis, adapting two modalities to distinguish emotions, and yielded inferior accuracy. In a comparable study focused on entropy measures from EEG, CSER obtained a mere 64.82% accuracy compared with 90.97% for SDER [16]. As EEG signals are highly chaotic and dissimilar across individuals, they do not exhibit identical values for the selected features even when subjects are stimulated with the same stimuli. Consequently, this affects the efficacy of traditional recognition algorithms, which assume a similar EEG distribution. To address this problem, a few studies [3, 17, 18] conducted a thorough feature-level analysis to enhance the classifier's CSER performance. Compared to the works discussed above, recent attempts at CSER [19,20,21,22] show better results. Recently, authors worked on pre-trained models [23, 24] and improved the maximum mean CSER accuracy. However, deep pre-trained models improve accuracy at the cost of computational complexity. One recent work on CSER reported a low accuracy of 79.6% on SEED using a deep neural network [25]. Recent advances in emotion identification involve cutting-edge machine learning and deep learning algorithms [9,10,11,12,13,14, 40,41,42,43,44,45,46,47,48,49,50,51] that use manual EEG features as well as automated or image-like attributes, as shown in Table 1. Deep learning techniques recently reported with good EEG-based emotion recognition performance include CNN-LSTM [40, 45], a fusion of CNN and recurrent neural network (RNN), i.e., ACRNN [41], a fusion of CNN, stacked autoencoder (SAE), and dense neural network (DNN) [42], asymmetric CNN [43], and deep CNN [44] with a feature dimension reduction strategy. Despite achieving high accuracy, the work in [42] is computationally complex as it combines three deep models (CNN, SAE, and DNN) for feature extraction, feature reduction, and classification. Additionally, it used advanced techniques, including extracting Pearson correlation coefficient characteristics for emotion categorization and transforming EEG into 2D images. A few complex hybrid deep networks, as in [46], have recently been employed for emotion recognition with a manual feature extraction approach; however, only marginal improvements could be achieved on the DEAP dataset compared to the state-of-the-art methods [47, 48, 50, 51], while higher accuracy was obtained on the SEED and SEED-IV datasets. Among these, [47] achieved high accuracy but involved many stages to perform the recognition task: the Relief algorithm for channel selection, the max-relevance min-redundancy algorithm to further obtain emotion-relevant channels, and a combination of EEGNet [52] with a capsule network for emotion recognition. Moreover, [50] also used a hybrid model, an attention-based convolutional transformer neural network (ACTNN), that cascades a CNN and a transformer fed with two feature types: a spatial feature arranged into a 2D matrix and spectral characteristics of the EEG frequency rhythms. It is worth noting that hybrid deep techniques are resource-intensive and data-hungry, as discussed in Table 1, which summarizes the limitations of existing works that motivated the present study. Thus, this study employs an eminent ensemble learning technique, the XGboost classifier, which has not been thoroughly investigated for emotion recognition using EEG.

Table 1 Performance of existing EEG-based emotion recognition works

The remainder of the article is arranged as follows: the precise details of the datasets used in this work are explained in Section 3. The proposed CEEMDAN splitting of EEG into IMFs, the NIG modeling of the IMFs, the importance of the NIG parameters, and the tuning of XGboost are demonstrated in Section 4. With the aid of accuracy, F1-score, and comparative analysis, Section 5 describes the experimental results obtained. The conclusions of the suggested approach are presented in Section 6.

3 Materials

The SEED [3], SEED-IV [4], and DEAP [5] databases have been used to assess the effectiveness of the proposed work. The specific details of each database are listed in Table 2 and discussed below.

Table 2 Database information

3.1 SEED database

SEED has 15 participants, each with 15 EEG trials. Each trial has 62 channels collected at a 1000 Hz sampling rate and downsampled to 200 Hz. The EEG data were filtered with a 0.3–50 Hz bandpass filter to remove low-frequency noise and eliminate artifacts. After pre-processing, EEG segments matching the length of each video, i.e., around 4 min (240 s), were recovered, giving a data length of 48,000 samples. Every participant watched 15 clips covering 3 emotions: happy, neutral, and sad, and the corresponding EEG was collected via electrodes placed on the individual's head [3].

3.2 SEED-IV database

SEED-IV is also a multi-channel EEG dataset with 62 channels, pre-processed following the same methods as SEED. Fifteen participants watched 24 clips covering 4 emotion types: happy, fear, sad, and neutral, and the corresponding 24 trials were recorded with electrodes placed on the individuals' heads [4].

3.3 DEAP database

DEAP contains 40 channels of data (32 of which are EEG) collected from 32 people. In the pre-processing step, the signals were downsampled from 512 Hz to 128 Hz and bandpass filtered. Participants watched 40 one-minute clips to elicit various emotions, and the corresponding 40 EEG trials were recorded. Each trial is annotated with four emotional variables whose values vary from zero to nine: rating scores in the range 0–5 indicate a low value of the variable, while ratings in the range 5–9 indicate a high value. The induced emotions were grouped into four classes with respect to two of these variables, arousal and valence. Based on this 2D arousal–valence model, the EEG was classified into four emotions: happy, relax, angry, and sad [5].

4 Methods

The primary goal of this study is to introduce NIG parameters, which derive nonlinear attributes from the CEEMDAN IMFs of EEG signals and better capture the embedded emotional information. Further, the fine-tuned XGboost aids in precisely categorizing the emotions into various classes. Figure 1 depicts the schematic of the presented method; each block is discussed in detail in the following subsections.

Fig. 1 Algorithm of the suggested technique

4.1 Complete ensemble empirical mode decomposition with adaptive noise

Before going into detail about CEEMDAN, let us briefly discuss its previous versions, EMD and EEMD. EMD is a data-driven, iterative method for studying non-stationary, complex signals like EEG. However, EMD has a significant drawback known as mode mixing, making it less productive and efficient. EEMD was later proposed to overcome this issue [40]. It applies EMD to an ensemble of copies of the signal with additive white Gaussian noise. The entire time–frequency plane is populated with white Gaussian noise to eliminate mode mixing, allowing EMD to exploit its dyadic filter bank behavior [40,41,42]. EEMD thus overcomes the mode-mixing problem but introduces new issues: residual noise is present in the reconstructed signal, and different noisy realizations of the given signal may produce a different number of modes. CEEMDAN mitigates these issues by adding adaptive noise at every decomposition stage, proceeding through the following steps:

  1. Obtain \(x(n) + \upsigma_{0} GW^{i}(n)\) by adding P groups of white Gaussian noise \(GW^{i}(n)\) to the input EEG signal x(n), where i = 1, 2, 3, …, P and \(\upsigma_{0}\) denotes the standard deviation of the added noise.

  2. Using EMD, compute the first mode function of each of the P noisy realizations, denoted \(IMF_{1}^{i}\) (i.e., \(IMF_{1}^{1}, IMF_{1}^{2}, \ldots, IMF_{1}^{P}\)). The average of these first mode functions is expressed as:

    $$\widetilde{IMF}_{1} \left( n \right) = \frac{1}{P}\sum\limits_{i = 1}^{P} {IMF_{1}^{i} } \left( n \right) = \overline{IMF}_{1} \left( n \right)$$
    (1)
  3. The residue at stage l, denoted \(\upxi_{l}(n)\), is computed using the equation below:

    $$\upxi _{l} (n) = x(n) - \widetilde{IMF}_{l} \left( n \right),\quad l = 1,2,3, \ldots ,L$$
    (2)

    The first residue is obtained by substituting l = 1 into equation (2):

    $$\upxi _{1} (n) = x(n) - \widetilde{IMF}_{1} \left( n \right)$$
    (3)
  4. The realizations \(\upxi_{1}(n) + \upsigma_{1} c_{1}\left( {GW^{i}(n)} \right)\), i = 1, 2, 3, …, P, are decomposed until their first EMD mode, where \(c_{l}(\cdot)\) denotes the operator extracting the lth EMD mode and \(\upsigma_{l}\) indicates the white-noise standard deviation of the lth stage (l = 1 for this iteration). \(\widetilde{IMF}_{2}(n)\) is then determined as:

    $$\widetilde{IMF}_{2} \left( n \right) = \frac{1}{P}\sum\limits_{i = 1}^{P} {c_{1} } \left( {\upxi _{1} (n) +\upsigma _{1} c_{1} \left( {GW^{i} \left( n \right)} \right)} \right)$$
    (4)
  5. Calculate the lth residue for l = 2, 3, …, L:

    $$\upxi _{l} (n) =\upxi _{l - 1} (n) - \widetilde{IMF}_{l} \left( n \right)$$
    (5)
  6. The realizations \(\upxi_{l}(n) + \upsigma_{l} c_{l}\left( {GW^{i}(n)} \right)\) are decomposed until their first EMD mode, and the (l + 1)th mode is defined as follows:

    $$\widetilde{IMF}_{l + 1} \left( n \right) = \frac{1}{P}\sum\limits_{i = 1}^{P} {c_{1} } \left( {\upxi _{l} \left( n \right) +\upsigma _{l} c_{l} \left( {GW^{i} \left( n \right)} \right)} \right)$$
    (6)
  7. Increment l and return to step 5 for the next stage.

  8. As steps 5–7 are repeated, the residue eventually becomes monotonic and no more modes can be recovered, meaning the convergence criterion is satisfied. If \(\upxi_{L}(n)\) is the final residue and L is the maximum number of modes, no further mode decomposition is achievable. The input signal x(n) can then be recovered from all CEEMDAN mode functions with the following equation:

    $$x(n) = \sum\limits_{l = 1}^{L} {\widetilde{IMF}_{l} \left( n \right)} +\upxi _{L} (n)$$
    (7)
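As an illustration, the following minimal Python sketch decomposes a single EEG channel with CEEMDAN. It assumes the third-party PyEMD package (installed as EMD-signal) rather than the authors' implementation; PyEMD's `trials` and `epsilon` parameters correspond roughly to the ensemble size P and the noise scale above.

```python
# Minimal sketch of CEEMDAN decomposition for one EEG channel.
# Assumes the third-party PyEMD package: pip install EMD-signal
import numpy as np
from PyEMD import CEEMDAN

fs = 200                              # SEED sampling rate (Hz)
t = np.arange(0, 10, 1 / fs)          # 10 s of synthetic data for demonstration
x = (np.sin(2 * np.pi * 10 * t)       # stand-in for a real EEG channel
     + 0.5 * np.sin(2 * np.pi * 2 * t)
     + 0.1 * np.random.randn(t.size))

ceemdan = CEEMDAN(trials=100, epsilon=0.2)   # trials ~ P, epsilon ~ noise scale
imfs = ceemdan(x)                            # array of shape (n_imfs, n_samples)
print(f"Extracted {imfs.shape[0]} IMFs from {x.size} samples")
```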

CEEMDAN's superior characteristics already suggest its potential for emotion recognition. This is further illustrated in Fig. 2 by the mode functions of EEG obtained through CEEMDAN for the happy, fear, sad, and neutral emotion classes, which exhibit discriminating characteristics. Each EEG channel resulted in more than 10 IMFs: a maximum of 12, 17, and 17 IMFs for the DEAP, SEED, and SEED-IV data, respectively. However, not all the IMFs are relevant to their respective EEGs. Thus, we computed the correlation coefficient between each IMF and its respective EEG signal and, based on threshold settings [39], found that the first seven IMFs are primarily relevant for each EEG dataset. This selection of CEEMDAN IMFs reduces the number of extracted features, minimizing the proposed model's computational complexity.
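A possible implementation of this relevance screening is sketched below; it keeps the leading IMFs whose correlation with the source signal exceeds a threshold. The helper name and the 0.1 threshold are illustrative assumptions, with the actual setting following [39].

```python
import numpy as np

def select_relevant_imfs(x, imfs, threshold=0.1, max_keep=7):
    """Keep up to max_keep IMFs whose Pearson correlation with the source
    signal x exceeds the threshold (illustrative value; the paper's actual
    rule follows [39])."""
    relevant = [imf for imf in imfs
                if abs(np.corrcoef(x, imf)[0, 1]) > threshold]
    return np.asarray(relevant[:max_keep])

# Example: relevant = select_relevant_imfs(x, imfs)  # first seven IMFs kept
```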

Fig. 2 Various IMFs of CEEMDAN for four emotion categories of a single participant from the SEED-IV dataset, showing variations in the amplitudes

4.2 NIG modeling of CEEMDAN IMFs

The CEEMDAN approach decomposes the EEG into different IMFs, which are then modeled using the symmetric NIG-PDF parameters [34, 35]. The NIG parameters of the IMFs are computed using Eq. (8) below:

$$P_{\alpha ,\delta } (x) = \frac{{\alpha \delta e^{\alpha \delta } }}{\pi }\frac{{K_{1} (\alpha \sqrt {\delta^{2} + x^{2} } )}}{{\sqrt {\delta^{2} + x^{2} } }}$$
(8)

where K1(·) is the first-order modified Bessel function of the second kind, and x denotes the IMF obtained through the CEEMDAN decomposition of EEG. The parameter δ indicates the scale factor of the NIG density function, while α denotes the shape factor. The shape of the density function changes with the values of δ and α, which can be computed with the equations given below: the steepness of the NIG density is assessed by α, while its spread is controlled by the scale factor δ.

$$\alpha = \sqrt {\frac{{3K_{x}^{2} }}{{K_{x}^{4} }}}$$
(9)
$$\delta = \alpha K_{x}^{2}$$
(10)

where \(K_x^2\) and \(K_x^4\) are the second- and fourth-order sample cumulants used to fit the NIG density. These parameters are extracted from each IMF and used as features for the XGboost classifier, as discussed in the following sections. The significance of the NIG density function in representing the fundamental statistics of nonlinear signals with heavy-tailed distributions, including financial data, images, sales information, and multimedia data, prompted the use of these attributes in this study [36]. Figure 3 shows histograms of one of the IMFs fitted with the NIG density function for the four emotions of the SEED-IV dataset, and it reveals some interesting insights. First, the CEEMDAN IMFs are modeled by the NIG-PDF reliably and precisely. Second, the densities show heavy tails and varied dispersion and peaking across the EEG emotions. As shown in Fig. 3, the steepness and dispersion of the fitted density functions vary per emotion class: the happy density is the steepest while the sad density is the shallowest, and the happy emotion is less widely dispersed than the sad emotion. Numerous visual evaluations have confirmed these shape variations in the PDFs. Since the values of α and δ vary among the emotions owing to these variations in steepness and dispersion, both parameters serve as valuable criteria for emotion classification. These shape changes are captured by the NIG features of the IMFs for the different emotion classes and fed to the XGboost classifier.
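A minimal sketch of this estimation is given below, pairing SciPy's `k1` Bessel function for Eq. (8) with sample cumulants for Eqs. (9)–(10); the function names are ours, not the paper's.

```python
import numpy as np
from scipy.special import k1  # 1st-order modified Bessel function of the 2nd kind

def nig_params(imf):
    """Estimate the symmetric NIG shape (alpha) and scale (delta) of one IMF
    from its 2nd- and 4th-order sample cumulants, as in Eqs. (9)-(10)."""
    x = np.asarray(imf) - np.mean(imf)
    k2 = np.mean(x ** 2)                 # 2nd cumulant (variance)
    k4 = np.mean(x ** 4) - 3 * k2 ** 2   # 4th cumulant; > 0 for heavy-tailed IMFs
    alpha = np.sqrt(3 * k2 / k4)
    delta = alpha * k2
    return alpha, delta

def nig_pdf(x, alpha, delta):
    """Symmetric NIG density of Eq. (8)."""
    s = np.sqrt(delta ** 2 + x ** 2)
    return alpha * delta * np.exp(alpha * delta) * k1(alpha * s) / (np.pi * s)
```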

The sizes of the density feature matrices obtained from the three datasets are listed below:

DEAP: (subjects × trials) rows × (channels × IMFs × NIG parameters) columns = 1280 rows × 448 columns
SEED: (subjects × trials) rows × (channels × IMFs × NIG parameters) columns = 225 rows × 868 columns
SEED-IV: (subjects × trials) rows × (channels × IMFs × NIG parameters) columns = 360 rows × 868 columns
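The row/column arithmetic above could be realized as in the sketch below, which assumes the `nig_params` helper from the previous sketch and a hypothetical `decompose_channel` loader returning the relevant CEEMDAN IMFs of one channel.

```python
import numpy as np

# Illustration for SEED: 15 subjects x 15 trials -> 225 rows;
# 62 channels x 7 IMFs x 2 NIG parameters -> 868 columns.
n_subjects, n_trials, n_channels, n_imfs = 15, 15, 62, 7

rows = []
for subject in range(n_subjects):
    for trial in range(n_trials):
        feats = []
        for ch in range(n_channels):
            imfs = decompose_channel(subject, trial, ch)  # hypothetical loader
            for imf in imfs[:n_imfs]:
                feats.extend(nig_params(imf))             # (alpha, delta) pair
        rows.append(feats)

X = np.asarray(rows)   # shape (225, 868), one feature vector per trial
```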
Fig. 3 Histograms of (a–d) four emotions of IMF3 fitted with NIG PDFs, depicting the change in the shape of the density functions among the different emotions of SEED-IV for a single participant

5 Experimental results and discussion

In this section, extensive performance evaluations of the proposed approach are presented using the publicly available EEG databases SEED, DEAP, and SEED-IV. We conducted two validation experiments, cross-validation (CV) and cross-subject validation (CSV), with the XGboost classifier using the proposed CEEMDAN-domain NIG features. The experimental results of the suggested technique are reported in terms of F1-score and accuracy, and their significance is discussed.

5.1 Cross-validation

The efficacy of the proposed framework is examined using the CV strategy. The extracted NIG feature data are randomly split into 70% training data and 30% testing data; the 70% training portion is used for tenfold CV, and the remaining 30% is kept apart to test the optimized XGboost. During CV, XGboost's best parameters, namely the tree and leaf counts, are sought for the NIG features, as these counts are crucial. This study implemented CV over a wide range of decision-tree counts to find the optimum number of trees. The CV serves the following purposes: first, it yields the best parameters for XGboost and the number of trees necessary for better detection efficiency; second, it eliminates the necessity for distinct validation data, allowing us to evaluate XGboost's predictive power [37]; third, it helps create a low-bias model and prevents overfitting. The total counts of trees and leaves in each tree affect how well XGboost performs, so we also examined how leaf size affects algorithmic performance. After several experiments, we observed that the algorithm performs better with a smaller number of leaves, as demonstrated in Fig. 4. The mean square error (MSE) versus the number of grown trees in Fig. 4, plotted for different numbers of leaves, shows that MSE is lowest for 20, 10, and 5 leaves with 120, 100, and 50 trees for the three datasets' features, as depicted in Fig. 4a, b, and c. Therefore, these best parameter groups are used in our experiments to categorize the test data samples. According to the no-free-lunch theorem [38], no single classifier is optimal; hence, other classification models are also investigated to compare their performance with the fine-tuned XGboost. The optimized XGboost obtained through tenfold CV is tested with the 30% unseen test data.
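A minimal sketch of this tuning loop with scikit-learn and the xgboost package follows; the grid values are illustrative, chosen around the tree and leaf counts reported in Fig. 4.

```python
from sklearn.model_selection import GridSearchCV, train_test_split
from xgboost import XGBClassifier

# X: NIG feature matrix, y: emotion labels (e.g., the 225 x 868 SEED matrix)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0, stratify=y)

# Tenfold CV over tree/leaf counts (illustrative grid around Fig. 4's values)
grid = GridSearchCV(
    XGBClassifier(tree_method="hist"),
    param_grid={"n_estimators": [50, 100, 120], "max_leaves": [5, 10, 20]},
    cv=10, scoring="accuracy")
grid.fit(X_train, y_train)

print("best parameters:", grid.best_params_)
print("mean CV accuracy:", grid.best_score_)
print("accuracy on 30% unseen test data:", grid.score(X_test, y_test))
```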

Fig. 4 Mean square error variation with the number of trees and the leaf size

Table 3 compares the results obtained when the NIG features are classified with different classifiers using the 30% unseen test data. All databases perform well with ensemble learning techniques such as bagging, random forest (RF), and XGboost. Nevertheless, XGboost surpasses the other classification models in accuracy and F1-score, demonstrating its suitability for the suggested emotion detection system. The F1-scores achieved with DEAP, SEED, and SEED-IV are 95.2%, 96.3%, and 96.2%, respectively. The remaining experiments in this work are conducted using this optimized XGboost classifier.

Table 3 %Accuracy obtained with the proposed method using various classifiers

5.2 Cross-subject validation (CSV)

CSV is conducted by taking a single subject's features for testing while the remaining subjects' features are fed to XGboost's training process. The experiment is repeated with each subject's features as test data, and the accuracies of the individual CSV runs are averaged to give the average classification accuracy (ACA). Experiments are conducted on the DEAP, SEED, and SEED-IV databases, and the results in terms of %ACA are given in Table 4. The CSV-1 experiment uses DEAP to classify four emotion categories: happy, relax, angry, and sad. For CSV-2, the NIG density features are labeled with the three emotions of SEED: negative, neutral, and positive. CSV-3 is implemented with the density attributes of SEED-IV. As Table 4 shows, in all three experiments the XGboost model obtained superior performance with the computed NIG density characteristics, in terms of higher %ACA at both the individual-class and average levels. For the four-class CSV-1 experiment, XGboost reported its maximum %ACA for the happy class (94.21%) and its minimum for the angry class (89.2%), with an overall F1-score of 90.43%. With CSV-2, XGboost achieved its maximum %ACA for the positive emotion class (97.74%) and its minimum for the negative class (96.01%), with an F1-score of 95.55%. Similarly, for CSV-3, the maximum %ACA (97.10%) was obtained for the happy class and the minimum for the fear class (94.13%). The %F1-scores are highest with SEED-IV (94.79) and SEED (95.55) and lowest with DEAP owing to its imbalanced nature; even so, DEAP reported a respectable F1-score of 90.43%. These CSV experiments show that XGboost achieved higher accuracy for positive emotions than for negative classes. The outcomes illustrate the efficacy of the presented method for SIER, achieving high efficiency even under CSV.
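This protocol is essentially a leave-one-subject-out evaluation; a minimal sketch with scikit-learn's `LeaveOneGroupOut` follows, where `groups` (our naming) holds the subject index of each feature row.

```python
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut
from xgboost import XGBClassifier

# X: NIG feature matrix, y: emotion labels,
# groups: subject index of every row (e.g., 15 subjects for SEED)
scores = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    clf = XGBClassifier(n_estimators=100, max_leaves=10, tree_method="hist")
    clf.fit(X[train_idx], y[train_idx])                 # train on other subjects
    scores.append(clf.score(X[test_idx], y[test_idx]))  # test on held-out subject

print("ACA: %.2f%%" % (100 * np.mean(scores)))  # average classification accuracy
```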

Table 4 Cross-Subject Validation (CSV) Experiments (%ACA) achieved with XGboost

5.3 XGboost quantitative analysis

This section discusses the classification results of the CV and CSV experiments performed with XGboost, with the help of confusion matrices. The confusion matrix (CM) table format allows a quantitative visual representation of classifier potency. The % classification accuracies claimed by XGboost in Table 4 can be verified with the CMs depicted in Fig. 5 for the three datasets. Moreover, the CMs give insight into each emotion group's misclassification rate and accuracy. Figures 5a–f depict CMs for one of the CV and CSV experiments for each of the three datasets. The classification accuracy achieved on the 30% test data in the CV experiments with XGboost is shown in the CMs. While plotting the CMs, the classifier draws an unequal count of samples from each emotion group; as a result, the accuracy varies considerably across groups. From Fig. 5a, the happy emotion class of DEAP achieved the highest accuracy, 98.75%, compared to the other emotion classes. Similarly, Figs. 5b and c show the individual class-level accuracies of SEED and SEED-IV, respectively. The next three CMs, Figs. 5d, e, and f, are plotted for the CSV experiments on the three datasets. The CM for the CSV-1 experiment, shown in Fig. 5d, indicates that the happy class achieved maximum accuracy compared to the other three classes, while the CMs for CSV-2 and CSV-3 (Figs. 5e and f) show maximum accuracy for the positive and happy classes, respectively. From these outcomes we can claim that the suggested CEEMDAN-domain NIG features achieve maximum performance not only in the CV experiments but also in the CSV experiments, which is very helpful for real-world applications. Also, the proposed approach achieved higher accuracy for positive emotions than for negative ones. Therefore, it is verified that integrating the CEEMDAN-based NIG parameters computed from EEG improved XGboost's power to discriminate emotions.
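For reference, a per-class accuracy view like that of Fig. 5 can be produced by row-normalizing a confusion matrix; the short sketch below uses scikit-learn (our tooling choice, not the paper's).

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# y_test: true emotion labels, y_pred: XGboost predictions on the test split
cm = confusion_matrix(y_test, y_pred)

# Row normalization converts counts to per-class accuracies on the diagonal,
# matching the percentage view of Fig. 5
cm_pct = 100 * cm / cm.sum(axis=1, keepdims=True)
for label, acc in enumerate(np.diag(cm_pct)):
    print(f"class {label}: {acc:.2f}% accuracy")
```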

Fig. 5 CMs of (a–c) cross-validation and (d–f) cross-subject validation experiments for the three datasets

5.4 Comparative analysis

Table 5 compares the suggested automatic emotion detection framework with the latest published machine learning-based research, since our proposed work is based on the machine learning-based XGboost classifier. The comparative study is carried out in the context of the accuracy claims made in the previous studies. The comparison table shows that our scheme performs better than most recently published works. Moreover, the suggested work outperforms the most recent EMD-based studies [9, 28], whose highest accuracy on the four DEAP emotions was 93.8%, while we attained 96.7% for the same set of classes. This could result from EMD's drawbacks, such as mode mixing, noisy mode functions, and a deviating intermediate frequency. The work [8] fed various statistical features to a modified random forest along with an SVM and obtained a marginal accuracy of 93.1% on the SEED dataset. A few works explored multidomain features classified with variants of SVM, reporting marginal performance on DEAP and SEED; likewise, variants of wavelet features were derived for discriminating emotions using a random forest, with minimal performance on DEAP [20]. The tuned XGboost attained higher discriminating accuracy than the aforementioned models. This may be due to its ensemble nature, the wider variety of tunable parameters, its built-in capacity to handle missing values, and other features that make XGboost's predictions more precise than those of single classifiers such as SVM [18]. It can be deduced from the comparison that the suggested method outperforms related existing studies while requiring lower computational effort. The CEEMDAN-domain NIG density attributes effectively extract emotion-specific characteristics from the EEG, and the optimized XGboost improves classification efficiency through its tuned hyper-parameters. Hence, the presented method outperformed previous similar studies in overall performance across the three databases.

Table 5 Comparative analysis with existing works in terms of the reported accuracy

6 Conclusion

EEG-based emotion recognition is a significant issue in the field of HCI applications. Inter-subject variability poses a key constraint on attaining system generality and a broader range of applications. This study decomposed EEG signals into IMFs through CEEMDAN to address this concern. The CEEMDAN IMFs are represented by NIG density function parameters, and these features are fed to XGboost to distinguish various emotions. The optimized XGboost classifier demonstrated the potency of the NIG density features as emotion-discriminating traits, and the efficacy of the NIG parameters was also examined with different classifiers. Thorough evaluations of the proposed method were conducted on three publicly accessible databases using two validation protocols, CV and CSV, and the method performed better than previous similar studies, including EMD-based and deep learning-based ones. This demonstrates the suggested model's potential to cope with cross-subject variability, which is crucial in practical settings. Moreover, this aspect makes the approach well suited to the modern world, where stress and depression are highly prevalent among individuals of all ages and have a significant negative influence on health; such emotional disorders can be categorized as negative emotions, which our suggested SIER system can identify. Finally, since the presented EEG-based emotion recognition technique does not need filtering, denoising, or artifact removal from the EEG signals, its suitability and promise for real scenarios are further increased. The current method can thus satisfy the demands of HCI applications owing to its minimal computational demand and significant classification performance.