1 Introduction

Motor imagery-based brain–computer interface (BCI) systems, which establish direct communication between the human brain and communication devices, allow users to control computer cursors, steer interactive robotic wheelchairs, and explore virtual environments (Doud et al. 2011). Among the various brain imaging measurements used in BCI, the electroencephalogram (EEG) is widely used for the classification of motor imagery tasks due to its low cost and noninvasive nature (Lebedev and Nicolelis 2006). However, inferring the category of action that the subject is imagining from raw EEG signals is not easy, as these signals exhibit cross-channel interdependence, strong nonstationarity, a low signal-to-noise ratio (SNR), and other hard-to-analyze features (Graimann 2009; Park et al. 2013; Wang et al. 2018). Therefore, a general approach for motor imagery classification with EEG data involves four steps: (1) preprocessing; (2) feature extraction; (3) feature selection; (4) learning a classifier. Although deep learning methods that have recently been used for the classification of motor imagery tasks (Tabar and Halici 2016; Dai et al. 2019; Zhang et al. 2019) can carry out the above steps jointly, they still require preprocessing approaches to enhance the relevant features of EEG signals (Wang et al. 2018; Hernández and Antelis 2018; Craik et al. 2019).

Over the past two decades, several techniques have been proposed for the preprocessing and feature extraction of motor imagery signals. These methods are based on the neurophysiological changes of EEG signals in specific frequency bands, such as the mu (8–12 Hz) and beta (18–25 Hz) rhythms, when subjects plan and execute hand or finger movements. Specifically, during motor imagery, the mu or beta rhythm exhibits event-related desynchronization (ERD), i.e., an amplitude decrease, over the contralateral scalp and event-related synchronization (ERS), i.e., an amplitude increase, over the ipsilateral area, as observed in previous studies (Yuan and He 2014). These changes in the mu and beta rhythms are then extracted and quantified by the preprocessing and feature extraction methods.

In the preprocessing stage, most methods are based on existing signal processing technologies (Pfurtscheller et al. 1997; Kevric and Subasi 2017; Graimann 2009; Nicolas-Alonso and Gomez-Gil 2012). For example, Pfurtscheller et al. (1997, 2006) utilize the Fourier transform to analyze and filter EEG data for each channel. However, due to the nonlinear and non-stationary nature of EEG data, several researchers have explored the use of the wavelet transform to separate the original data into diverse frequency sub-bands, thereby enhancing the features of specific rhythms (Mousavi et al. 2011; Robinson et al. 2013). Despite their usefulness, classical signal processing methods, including the Fourier and wavelet transforms, are limited by a predefined set of basis functions and cannot provide highly localized time–frequency representations of EEG data (Park et al. 2013). Therefore, empirical mode decomposition (EMD), a fully data-driven time–frequency analysis technique, has also been explored for the preprocessing of EEG data (Wang et al. 2008).

After the preprocessing stage, feature extraction methods are used to quantify the filtered signals. These methods include Common Spatial Patterns (CSP) (Ramoser et al. 2000), energy entropy (Hu et al. 2009), adaptive autoregressive models (Anderson et al. 1998), and wavelet transform coefficients (Bostanov and Kotchoubey 2004). Note that several previously published studies combine preprocessing and feature extraction methods into an integrated framework; such approaches are also referred to as hybrid technologies for feature extraction. For instance, the Common Spatio-Spectral Pattern (CSSP) optimizes a simple filter that employs a one-time delayed sample with the CSP algorithm (Lemm et al. 2005). Additionally, Ang et al. propose the Filter Bank Common Spatial Pattern (FBCSP), which combines a bandpass filter bank with the CSP method to achieve feature extraction (Ang et al. 2008).

However, the above-mentioned methods analyze or filter the signal from each EEG channel separately, without considering cross-channel interdependence, resulting in a problem of uniqueness: the decomposed components for each channel do not correspond in number and frequency (Park et al. 2014; Mandic et al. 2013). To address this issue, one widely used method is multivariate empirical mode decomposition (MEMD) and its noise-assisted version (NA-MEMD) (Rehman and Mandic 2010; Ur Rehman and Mandic 2011). Park et al. (2013) proposed a MEMD-based CSP approach for motor imagery classification, which benefits from the enhanced localization properties of MEMD, its use of cross-channel information, and its improved robustness to noise and artificial interference. Inspired by this work, other studies have focused on improving classification accuracy and extending motor imagery tasks to multiple classes using similar MEMD-based frameworks (Bashar and Bhuiyan 2016; Gaur et al. 2018). However, the core algorithm, MEMD, still has several unsolved problems: (1) MEMD requires excessively high computational resources, especially for multivariate data such as multichannel EEG (Rehman et al. 2015; Lang et al. 2018); (2) the filter bank structure of MEMD is not stable enough and is vulnerable to measurement noise and interference, which can lead to inaccurate decomposition behaviors such as mode mixing (Lang et al. 2018). Therefore, it is challenging for MEMD-based methods to be compatible with BCI devices and practical rehabilitation medical environments.

To address these problems, this paper explores the use of fast multivariate empirical mode decomposition (FMEMD) to analyze motor imagery responses. FMEMD is a computationally less expensive alternative to MEMD that operates by applying univariate EMD to projected signals to obtain a set of intrinsic mode functions (IMFs). These IMFs are combined with their corresponding direction vectors and solved by a least squares algorithm to yield multivariate IMFs (MIMFs) (Lang et al. 2018). FMEMD offers enhanced computational efficiency and a fairly stable filter bank property, making it highly robust to noise when processing low-SNR EEG data. Therefore, this paper proposes an FMEMD-based architecture for motor imagery tasks that fully utilizes the benefits of FMEMD in terms of computational complexity and noise robustness.

Our proposed approach automatically eliminates redundant frequency bands and selects valuable ones by calculating the center frequency of each decomposed component, thereby improving the characterization of brain activity. A comparative analysis between FMEMD and MEMD using simulation signals that resemble EEG rhythms supports the superiority of FMEMD.

In addition, classification experiments conducted on two representative small datasets indicate that the FMEMD scheme is proficient in augmenting data features on small multichannel datasets and can mitigate dimensionality constraints caused by computational complexity. It can also function as a general preprocessing filtering algorithm for adaptive EEG rhythm separation and can be integrated with other complex classification models, including deep learning methods, to enhance the accuracy and robustness of the underlying models.

This paper is organized as follows: Sect. 2 introduces MEMD, FMEMD, and their noise-assisted versions. Sect. 3 illustrates the advantages of FMEMD compared to MEMD using two simulation cases. In Sect. 4, a brief introduction of a representative dataset is provided, and the time–frequency analysis of bi-channel EEG data using FMEMD and other existing methods is presented. Sect. 5 presents a complete FMEMD-based classification scheme, combining some basic feature extraction methods and classifiers. Finally, Sect. 6 presents an overall conclusion, summarizing the contributions and significance of the proposed FMEMD-based classification scheme for motor imagery tasks.

2 Preliminaries

2.1 Multivariate empirical mode decomposition

Empirical mode decomposition (EMD) is a fully data-driven method used for analyzing nonlinear and non-stationary signals (Huang et al. 1971). It decomposes a signal into a finite set of intrinsic mode functions (IMFs) that represent AM/FM components in nature. An IMF must satisfy two conditions: (1) the difference between the number of extrema and the number of zero crossings should be no more than one, and (2) the mean of the upper and lower envelopes defined by the local extrema should be close to zero. However, EMD is only applicable for univariate data.
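The first IMF condition can be illustrated with a minimal Python sketch. The function names are hypothetical, and the sketch assumes a uniformly sampled sequence without flat plateaus; it only checks the counting condition, not the envelope-mean condition.

```python
import math


def count_extrema_and_zero_crossings(x):
    """Return (number of strict local extrema, number of zero crossings)."""
    extrema = 0
    for i in range(1, len(x) - 1):
        left = x[i] - x[i - 1]
        right = x[i + 1] - x[i]
        if left * right < 0:  # slope changes sign: local maximum or minimum
            extrema += 1
    # A zero crossing is a sign change between two consecutive samples.
    crossings = sum(1 for a, b in zip(x, x[1:]) if a * b < 0)
    return extrema, crossings


def satisfies_imf_count_condition(x):
    """IMF condition (1): the two counts differ by at most one."""
    extrema, crossings = count_extrema_and_zero_crossings(x)
    return abs(extrema - crossings) <= 1


# A zero-mean 5 Hz sine (1 s at 1000 Hz) passes; the same sine riding on an offset fails.
sine = [math.sin(2 * math.pi * 5 * t / 1000 + 0.3) for t in range(1000)]
shifted = [2.0 + v for v in sine]
```

The offset example fails because it keeps all its extrema but never crosses zero, which is exactly the situation that sifting is meant to remove.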

To overcome this limitation, multivariate empirical mode decomposition (MEMD) was developed to analyze multivariate data (Mandic et al. 2013; Rehman and Mandic 2010; Rilling et al. 2007). In MEMD, the estimation of the multivariate local mean is a critical step, as the concept of local extrema is not well-defined for multivariate signals. Following an idea similar to that of bivariate empirical mode decomposition (BEMD) (Rilling et al. 2007), the MEMD algorithm uses real-valued projections along a set of directions on hyperspheres to obtain the extrema of multivariate signals. These extrema are then interpolated channel-wise to yield the desired envelopes. The multichannel local mean is finally estimated by averaging these envelopes. To improve the approximation accuracy of the local mean, MEMD utilizes quasi-Monte Carlo-based low-discrepancy sequences to generate a suitable set of direction vectors.
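To make the idea of low-discrepancy direction vectors concrete, the following Python sketch generates unit vectors on the 2-sphere for a 3-channel signal from a two-dimensional Halton sequence. It is a simplified illustration under stated assumptions: the function names are hypothetical, the equal-area mapping from the unit square to the sphere is our choice, and the exact angular parameterization used in MEMD for general hyperspheres may differ.

```python
import math


def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def sphere_directions(num_directions):
    """Low-discrepancy unit vectors on the 2-sphere (3-channel case only)."""
    directions = []
    for k in range(1, num_directions + 1):
        u = radical_inverse(k, 2)  # Halton coordinate in base 2
        v = radical_inverse(k, 3)  # Halton coordinate in base 3
        z = 2.0 * u - 1.0          # uniform height gives equal-area coverage
        phi = 2.0 * math.pi * v    # azimuth angle
        r = math.sqrt(max(0.0, 1.0 - z * z))
        directions.append((r * math.cos(phi), r * math.sin(phi), z))
    return directions


directions = sphere_directions(64)
```

Each returned tuple is one projection direction; projecting the multichannel signal onto it yields a real-valued series whose extrema can be located as in the univariate case.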

The MEMD method is summarized in Algorithm 1.

figure a

The sifting process of a multivariate IMF can be stopped when all K projections of the detail signal \(\mathbf{{s}}\left( t \right) \) satisfy the aforesaid stoppage criterion of the standard EMD. As a result, MEMD decomposes a p-variate signal \(\mathbf{{x}}\left( t \right) \) as

$$\begin{aligned} \mathbf{{x}}\left( t \right) = \sum \limits _{i = 1}^M {{\mathbf{{d}}_i}\left( t \right) } + \mathbf{{r}}\left( t \right) , \end{aligned}$$
(2)

where the p-variate MIMFs, \(\left\{ {{\mathbf{{d}}_i}\left( t \right) } \right\} _{i = 1}^M\), contain scale-aligned intrinsic joint rotational modes (Rehman and Mandic 2010).

Researchers have demonstrated that MEMD enables cross-channel time–frequency analysis and provides high localization of specific frequency components (Park et al. 2013; Mandic et al. 2013; Gaur et al. 2018). In a BCI study based on motor imagery EEG responses, the MEMD algorithm enhanced multicomponent extraction of the mu and beta rhythms of interest. In particular, the noise-assisted version of MEMD (NA-MEMD) allows for a more stable estimation of time–varying frequency information from multichannel EEG signals. Despite the powerful capability of MEMD for analyzing EEG data, there are several obstacles that limit its usefulness for clinical applications, most notably its computational inefficiency when processing multichannel data.

Since MEMD performs cubic spline interpolations in each data channel for a single sifting process, its computational complexity increases as the total number of data channels increases (Lang et al. 2018). Therefore, it is difficult to achieve real-time analysis and discrimination of motor imagery behaviors using multichannel EEG data. Additionally, the filter bank property of MEMD is not sufficiently stable under noise disturbances. Consequently, even when NA-MEMD is used, the mode mixing problem (a poor signal separation phenomenon) still arises in low-SNR complex signals such as EEG. In the following section, we introduce the FMEMD algorithm and its noise-assisted version, which will be shown to overcome these problems and will be used to analyze the time–frequency variation of EEG data.

2.2 Fast multivariate empirical mode decomposition

FMEMD is a recently introduced method that has been shown to outperform MEMD in processing multivariate data at a lower computational cost. It operates by applying univariate EMD to projected signals to obtain a set of IMFs, which are combined with their corresponding direction vectors and then solved by a least squares algorithm to yield multivariate IMFs (MIMFs) (Lang et al. 2018). The details of FMEMD are listed in Algorithm 2.

figure b
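The least squares recombination step can be sketched in Python for the two-channel case. This is an illustration under stated assumptions rather than the implementation of Lang et al. (2018): the function names are hypothetical, the univariate EMD step is omitted (the projections themselves stand in for one IMF level), and the 2 × 2 normal equations are solved in closed form.

```python
import math


def project(signal, directions):
    """Project a 2-channel signal [(x1, x2), ...] onto each unit direction."""
    return [[vx * x1 + vy * x2 for (x1, x2) in signal] for (vx, vy) in directions]


def least_squares_recombine(components, directions):
    """Solve V d(t) ~= c(t) per sample for a 2-channel d(t) via the normal equations."""
    a = sum(vx * vx for vx, _ in directions)   # entries of V^T V
    b = sum(vx * vy for vx, vy in directions)
    c = sum(vy * vy for _, vy in directions)
    det = a * c - b * b
    recombined = []
    for t in range(len(components[0])):
        # Right-hand side V^T c(t)
        r1 = sum(vx * comp[t] for (vx, _), comp in zip(directions, components))
        r2 = sum(vy * comp[t] for (_, vy), comp in zip(directions, components))
        recombined.append(((c * r1 - b * r2) / det, (a * r2 - b * r1) / det))
    return recombined


# Eight directions on the half circle; EMD is skipped, so the input is recovered.
directions = [(math.cos(k * math.pi / 8), math.sin(k * math.pi / 8)) for k in range(8)]
signal = [(math.sin(0.1 * t), math.cos(0.05 * t)) for t in range(50)]
recovered = least_squares_recombine(project(signal, directions), directions)
```

In the full algorithm, `components` would hold the IMFs of a given level extracted from each projection, so the cost is dominated by univariate EMD runs on the projections rather than by channel-wise envelope interpolation.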

The stopping criterion used for univariate IMF extraction in step 3 is borrowed from Huang et al. (2003), where the sifting is stopped when the numbers of zero crossings and extrema remain the same for S successive sifting steps. Typically, a value of \(S = 5\) has proved successful as the default stopping criterion.
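A minimal Python sketch of this S-number check is given below. The function name and the layout of the history list are hypothetical; the additional requirement that the two counts differ by at most one is our reading, taken from the IMF definition in Sect. 2.1.

```python
def s_number_reached(count_history, s=5):
    """Decide whether sifting can stop.

    count_history holds one (num_extrema, num_zero_crossings) pair per sifting step.
    """
    if len(count_history) < s:
        return False
    recent = count_history[-s:]
    # The counts must be unchanged over the last s sifting steps ...
    unchanged = all(pair == recent[0] for pair in recent)
    # ... and (assumption) satisfy the IMF counting condition.
    extrema, crossings = recent[0]
    return unchanged and abs(extrema - crossings) <= 1


history = [(12, 9), (11, 10)] + [(10, 10)] * 5
```

With this history, sifting stops after the seventh step; with one step fewer, the last five pairs are not all identical and sifting continues.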

Similar to the noise-assisted MEMD (NA-MEMD), we add extra noise channels to the original signal and utilize FMEMD to decompose the synthesized signal, thereby reducing mode mixing. The noise-assisted FMEMD (NA-FMEMD) forces the alignment of multivariate IMFs (MIMFs) based on the dyadic filter structure of FMEMD, where each MIMF carries only one frequency sub-band. As mentioned before, compared with MEMD, FMEMD has a much more stable filter bank property and therefore exhibits stronger robustness to noise. Hence, NA-FMEMD solves the mode mixing problem more effectively than NA-MEMD, which will be illustrated in the following experiments. The details of the noise-assisted version are outlined in Algorithm 3.

figure c
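The noise-assisted construction can be sketched as follows. This is only an illustration: the function names, the unit noise standard deviation, and the fixed seed are assumptions, and discarding the noise channels after decomposition follows the usual noise-assisted practice rather than a detail stated in the text.

```python
import random


def add_noise_channels(channels, num_noise_channels=2, noise_std=1.0, seed=0):
    """Append independent white Gaussian noise channels to a list of channels."""
    rng = random.Random(seed)
    length = len(channels[0])
    noise = [[rng.gauss(0.0, noise_std) for _ in range(length)]
             for _ in range(num_noise_channels)]
    return channels + noise  # new list; the input channels are left untouched


def drop_noise_channels(mimf_channels, num_noise_channels=2):
    """Keep only the channels that correspond to the original signal."""
    return mimf_channels[:len(mimf_channels) - num_noise_channels]


channels = [[0.0] * 100, [1.0] * 100, [2.0] * 100]
augmented = add_noise_channels(channels, num_noise_channels=2)
```

The augmented list is what would be passed to the multivariate decomposition; `drop_noise_channels` is then applied to each resulting MIMF.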

3 Comparative analysis on simulation signals

In previously published works (Rehman and Mandic 2010; Park et al. 2013, 2011), researchers designed two representative simulation experiments based on multichannel low-SNR data such as EEG to illustrate the ability of MEMD to extract cross-channel information and its robustness to noise disturbance. These two experiments are referred to as the extraction of common oscillatory modes and component estimation, respectively. MEMD performs better in these experiments than the univariate EMD method (Park et al. 2013). As mentioned before, the FMEMD scheme exhibits a fairly stable filter bank structure and low computational complexity. In this section, we replicate the aforementioned experiments to illustrate that, compared with MEMD, the FMEMD framework provides a more precise, resilient, and efficient analysis of simulated oscillatory data that resemble EEG rhythms. Moreover, it can serve as a general preprocessing filtering algorithm for adaptive EEG rhythm separation.

3.1 Common oscillatory modes of multivariate IMFs

We apply NA-MEMD and NA-FMEMD to a 3-channel synthetic signal to visually investigate the capability of FMEMD on the extraction of common oscillatory modes compared with that of MEMD. We choose the noise-assisted versions of MEMD and FMEMD since these two algorithms can effectively alleviate the mode mixing problem.

Note that the number of noise-assisted channels is set to 4. Halton and Hammersley sequences are used to generate a set of \(K = 64\) directions. The stoppage criterion is \(S = 5\), and end effects are eliminated in advance. The simulation signal \([x_1(t),x_2(t),x_3(t)]\) is constructed as

$$\begin{aligned} \begin{array}{l} x_1(t) = \left\{ {\begin{array}{*{20}{c}} {2\sin (2\pi {f_1}t),t = 1,\ldots ,400}\\ {2\sin (2\pi {f_1}t) + 1.5\sin (2\pi {f_3}t),t = 401,\ldots ,1000} \end{array}} \right. \\ x_2(t) = 2\sin (2\pi {f_1}t) + 2\sin (2\pi {f_2}t),t = 1,\ldots ,1000\\ x_3(t) = \left\{ {\begin{array}{*{20}{c}} {2\cos (2\pi {f_1}t),t = 1,\ldots ,600}\\ {2\sin (2\pi {f_2}t),t = 601,\ldots ,1000} \end{array}} \right. . \end{array} \end{aligned}$$
(3)

where the sampling frequency is \(f_s = 1000\) Hz, and the normalized frequencies of the signal are as follows: \({f_1} = \frac{6}{{{f_s}}}\), \({f_2} = \frac{15}{{{f_s}}}\), \({f_3} = \frac{40}{{{f_s}}}\).
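For reference, the three-channel test signal of Eq. (3) can be generated with the short Python sketch below. The function and constant names are hypothetical; the sample index t runs from 1 to 1000 as in the equation.

```python
import math

FS = 1000.0                                   # sampling frequency in Hz
F1, F2, F3 = 6 / FS, 15 / FS, 40 / FS         # normalized frequencies (6, 15, 40 Hz)


def synthetic_signal(num_samples=1000):
    """Return the channels x1, x2, x3 of Eq. (3) as three lists."""
    x1, x2, x3 = [], [], []
    for t in range(1, num_samples + 1):
        common = 2 * math.sin(2 * math.pi * F1 * t)
        # x1: 6 Hz throughout, with a 40 Hz component switched on after t = 400
        x1.append(common if t <= 400
                  else common + 1.5 * math.sin(2 * math.pi * F3 * t))
        # x2: 6 Hz plus 15 Hz over the whole record
        x2.append(common + 2 * math.sin(2 * math.pi * F2 * t))
        # x3: 6 Hz cosine up to t = 600, then a 15 Hz sine
        x3.append(2 * math.cos(2 * math.pi * F1 * t) if t <= 600
                  else 2 * math.sin(2 * math.pi * F2 * t))
    return x1, x2, x3


x1, x2, x3 = synthetic_signal()
```

The 6 Hz mode is present in all three channels for at least part of the record, which is why it serves as the common oscillatory mode in the alignment test.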

The time–domain decomposition results are shown in Figs. 1 and 2. As we can see, the common oscillatory modes are aligned at the same IMF level; for example, the 6 Hz frequency mode, common to the three channels, is aligned by NA-MEMD at \(d_9\) in Fig. 1. However, NA-MEMD still suffers from mode mixing, which is observed in \(d_7\) and \(d_8\) of Fig. 1. MIMFs extracted by NA-FMEMD in Fig. 2, by contrast, show no mode mixing and are accurately located within different frequency scales. This illustrates that NA-FMEMD achieves a further improvement over NA-MEMD in terms of alleviating mode mixing, indicating that the FMEMD-based method enables a more accurate and unified frequency-band separation across the data channels, which is crucial for the analysis of multichannel EEG data.

Fig. 1
figure 1

The results decomposed by NA-MEMD. The mode mixing problem is present in \(d_7\) and \(d_8\)

Fig. 2
figure 2

The results decomposed by NA-FMEMD. Different frequency modes are separated to the corresponding MIMFs without mode mixing

We highlight that a robust filter bank property is the underlying reason for the stable reduction of mode mixing phenomena. More specifically, the intermittency and the randomness of multivariate multi-component signals easily lead to aliasing among the frequency sub-bands of the filter bank, which means that NA-MEMD cannot recover the high- or low-frequency oscillations of the original signal well. In comparison, FMEMD utilizes its fairly stable filter bank property to eliminate mode mixing within a noise-assisted framework. This enhanced capability of FMEMD, which comes from the stability of the filter bank structure, will be verified more prominently in the following experiment, component estimation.

3.2 Component estimation

In this section, we conduct the component estimation experiment to illustrate the improved noise robustness and higher computational efficiency of FMEMD compared with MEMD. The simulation data is a multichannel signal in which all channels contain the same one-component signal \({s_f}\), corrupted by different realizations of white Gaussian noise (WGN), as shown below

$$\begin{aligned} {x_1} = {s_f} + {n_1},\quad {x_2} = {s_f} + {n_2},\quad {x_3} = {s_f} + {n_3},\quad \ldots ,\quad {x_P} = {s_f} + {n_P}. \end{aligned}$$
(4)

where \(n_P\) represents a realization of WGN in the Pth channel. In this case, \(s_f\) represents one fixed sinusoidal signal (with a frequency of 6 Hz or 15 Hz) common to all channels. We obtain a set of simulation signals by increasing the number of channels P from 4 to 8 and selecting SNRs of 5 dB, 0 dB and \(-5\) dB with respect to \(s_f\). Each signal is decomposed by MEMD and FMEMD for 50 trials (corresponding to 50 noise realizations of each signal), and we select all MIMFs that contain the same frequency as \(s_f\) as the reconstructed sinusoidal signals. To assess performance, we compute the mean and variance of the output SNRs of these reconstructed signals and present the results as error bars. In addition, the sampling frequency is 1000 Hz and each trial lasts for 1 s. The parameters used in MEMD and FMEMD are kept consistent with the previous experiment.
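The noise model of Eq. (4) and the output SNR measure can be sketched in Python as follows. The helper names are hypothetical, and rescaling each noise realization so that the input SNR is met exactly is our assumption; the text does not state how the noise level was set.

```python
import math
import random


def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)


def add_wgn_at_snr(signal, target_snr_db, rng):
    """Add white Gaussian noise scaled so the SNR w.r.t. `signal` equals target_snr_db."""
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    target_noise_power = power(signal) / (10 ** (target_snr_db / 10))
    scale = math.sqrt(target_noise_power / power(noise))
    return [s + scale * n for s, n in zip(signal, noise)]


def snr_db(reference, estimate):
    """SNR in dB of an estimate with respect to the clean reference."""
    error = [e - r for r, e in zip(reference, estimate)]
    return 10 * math.log10(power(reference) / power(error))


# One 4-channel realization: 6 Hz sinusoid, 1 s at 1000 Hz, input SNR of -5 dB.
rng = random.Random(0)
s_f = [math.sin(2 * math.pi * 6 * t / 1000) for t in range(1000)]
channels = [add_wgn_at_snr(s_f, -5, rng) for _ in range(4)]
```

Applying `snr_db` to a reconstructed sinusoid instead of a noisy channel gives the output SNR whose mean and variance over the 50 trials are reported in the figures.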

The estimation results are shown in Figs. 3 and 4, corresponding to the 6 Hz and 15 Hz frequency components, respectively. It can be seen that FMEMD outperforms MEMD for almost all input SNRs and for all tested numbers of channels.

Fig. 3
figure 3

Estimation results for the 6 Hz sinusoidal signal obtained by MEMD (blue lines) and FMEMD (red lines)

Fig. 4
figure 4

Estimation results for the 15 Hz sinusoidal signal obtained by MEMD (blue lines) and FMEMD (red lines)

Fig. 5
figure 5

The average computational times of MEMD and FMEMD corresponding to different total numbers of data channels when the 6 Hz or 15 Hz sinusoidal signals are extracted

In particular, the estimation accuracy (the output SNR) of MEMD decreases significantly as the input noise increases (input SNR from 5 to \(-5\) dB), while FMEMD maintains a consistent performance. In addition, over the 50 repeated decomposition trials of each simulation signal, the output SNRs obtained by FMEMD fluctuate (in terms of variance) within a much smaller range than those of MEMD. This again indicates that FMEMD has a fairly stable filter bank structure and exhibits stronger robustness to different noise disturbances, resulting in more accurate component estimation.

We also calculate the average running times of MEMD and FMEMD with respect to different numbers of data channels when extracting the common sinusoidal signals \(s_f\). The results are shown in Fig. 5. Observe that the computational efficiency of FMEMD in extracting the cross-channel information is greatly improved compared with MEMD. This advantage becomes more striking as the total number of data channels increases. For instance, when the 8-channel simulation data are decomposed to obtain the 6 Hz sinusoidal signals, the computational time of MEMD (59.3 s) is roughly 24 times that of FMEMD (2.5 s).

4 Time–frequency analysis of EEG data

This section verifies the time–frequency analysis ability of FMEMD on practical EEG data. More specifically, we employ NA-FMEMD to detect ERD (event-related desynchronization, an amplitude decrease) and ERS (event-related synchronization, an amplitude increase) phenomena in the contralateral and the ipsilateral somatosensory cortex. Previously published work has illustrated that MEMD produces more accurate spectrogram estimations of EEG data than classic univariate methods, such as the short-time Fourier transform (STFT) and continuous wavelet transform (CWT) (Park et al. 2013, 2014, 2011). In this work, we compare NA-FMEMD and NA-MEMD in terms of the extraction of ERD and ERS, thereby illustrating that FMEMD is a more suitable method for the time–frequency analysis of EEG data.

4.1 Materials

We make use of the publicly available BCI Competition IV Dataset I to select the EEG data to be analyzed (Blankertz et al. 2007). The dataset was provided by the Berlin BCI group. EEG signals were recorded using 59 electrodes from four healthy subjects (a, b, f, and g). For each subject, two classes of motor imagery were selected from three tasks. More precisely, subjects a and f performed left hand and foot motor imagery, while subjects b and g carried out left hand and right hand motor imagery. A total of 200 trials were available for each subject, including 100 trials for each class.

In each trial, the visual cues are displayed on the computer screen for a period of 4 s, during which the subject is instructed to perform one of the possible tasks. A 2 s blank screen and a 2 s fixed cross in the center of the screen follow the 4 s motor imagery task. The EEG signals are sampled at 100 Hz. For more details about BCI Competition IV Dataset I, refer to Blankertz et al. (2007). According to the simulation results in Sect. 3.2, the computational efficiency of FMEMD is almost unaffected by the number of data channels. Hence, we directly apply FMEMD to the EEG data with all 59 channels. To verify how the number of processed channels affects the FMEMD performance on real-world EEG data, we have also considered 11 of the 59 EEG channels, “FC3”, “FC4”, “Cz”, “C3”, “C4”, “C5”, “C6”, “T7”, “T8”, “CCP3”, and “CCP4”, and 4 of these 11 channels, “C3”, “C4”, “CCP3” and “CCP4”, following Park et al. (2013). Next, the ERD and ERS phenomena in the left hand motor imagery data of subject g at electrodes “C3” and “C4” are estimated using FMEMD and MEMD.

4.2 Time–frequency analysis using FMEMD and MEMD

As mentioned before, we choose only MEMD for comparison because previous studies have demonstrated its superiority in time–frequency analysis over other existing univariate methods. Both MEMD and FMEMD decompose 11 data channels with the aid of two additional noise channels, while only the analysis results of the ipsilateral hemisphere (C3 data channel) and the contralateral hemisphere (C4 data channel) with respect to the left hand motor imagery task are displayed.

The mu and beta rhythm time series extracted from \(-2\) to 6 s using NA-MEMD and NA-FMEMD are shown in Fig. 6. Note that the first 2 s and last 2 s correspond to the fixed cross and the blank screen, respectively, while the middle 4 s is the duration of the motor imagery task. In order to highlight the ERD and ERS, all time series display the amplitude changes relative to the mean amplitude of the first 2 s baseline interval, normalized by the standard deviation of the baseline signal. As mentioned in Yuan and He (2014) and Yuan et al. (2008), once the subject starts to perform the motor imagery task, the amplitudes decrease (ERD) for approximately 2 s over the contralateral scalp and the amplitudes increase (ERS) after 2 s over the ipsilateral scalp in the specific rhythms, especially the mu rhythm.
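The baseline normalization described above can be sketched in Python as follows. The function name and the half-open baseline interval are assumptions; the input is taken to be an already extracted rhythm amplitude series together with its time stamps in seconds.

```python
import statistics


def baseline_normalize(series, times, baseline=(-2.0, 0.0)):
    """Express a series as a change from the baseline mean, in units of baseline std."""
    start, stop = baseline
    base = [v for v, t in zip(series, times) if start <= t < stop]
    mean = statistics.fmean(base)
    std = statistics.pstdev(base)  # population standard deviation of the baseline
    return [(v - mean) / std for v in series]


# An 8 s epoch (-2 to 6 s) sampled at 100 Hz, as in BCI Competition IV Dataset I.
times = [-2.0 + i / 100.0 for i in range(800)]
series = [3.0 + 0.1 * (i % 7) for i in range(800)]  # placeholder values, not EEG
normalized = baseline_normalize(series, times)
```

After this step, negative values during the task interval indicate an amplitude decrease relative to baseline (ERD) and positive values an increase (ERS).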

Fig. 6
figure 6

Amplitude changes in the beta and mu rhythms, normalized by the mean and standard deviation of the baseline waveform, as estimated by NA-MEMD and NA-FMEMD. From 1.5 to 4 s, ERS (amplitude increase) in C3 and ERD (amplitude decrease) in C4 can be accurately observed by NA-FMEMD

According to the mu and beta rhythms extracted by NA-MEMD and NA-FMEMD, the ERD appears between 1.5 and 4 s over the contralateral scalp (C4 electrode), while the ERS occurs around 3 s over the ipsilateral scalp (C3 electrode). In particular, the mu rhythm contains more prominent ERD and ERS phenomena than the beta rhythm, which is consistent with the above prior knowledge. Moreover, compared with NA-MEMD, NA-FMEMD reveals a clearer ERS (around 3 s) within the mu and beta rhythms. Similar results are also shown by the power changes of the time series in Fig. 7. The instantaneous powers are computed from the envelopes of the time series.

Fig. 7
figure 7

Power changes in beta and mu rhythms are estimated by NA-MEMD and NA-FMEMD. From 1.5 to 4 s, ERS (amplitude increase) in C3 and ERD (amplitude decrease) in C4 can be accurately noted by NA-FMEMD

Fig. 8
figure 8

NA-MEMD and NA-FMEMD spectra within the ipsilateral and contralateral scalps (C3 and C4) for left hand motor imagery tasks

Figure 8 shows the time–frequency spectra obtained by NA-MEMD and NA-FMEMD, where the motor imagery task lasts from 0 to 4 s. The results are derived by decomposing the 11-channel EEG data of left hand motor imagery and computing the Hilbert–Huang spectra (HHS) from the IMFs corresponding to C3 and C4.

As with the previous analysis results, the time–frequency spectra of NA-MEMD and NA-FMEMD show the ERS around 3 s and the ERD between 1.5 and 4 s. Compared with the classic univariate methods, MEMD has been shown to provide more localized time–frequency representations of EEG data (Park et al. 2013). In this case, FMEMD exhibits a similar ability according to Fig. 8. The difference is that FMEMD preserves less high-frequency (over 30 Hz) noise in the obtained spectra than MEMD, which yields clearer analysis results with respect to the brain activity of left hand motor imagery.

4.3 Average preprocessing times of FMEMD and MEMD

Another comparative experiment is conducted by computing the average running time of NA-MEMD and NA-FMEMD over all trials of each subject in BCI Competition IV Dataset I. The EEG data with 4, 11 and 59 channels (as mentioned in Sect. 4.1) are processed by NA-MEMD and NA-FMEMD. The related parameters of the applied methods are consistent with the previous section. Table 1 shows the final results. Observe that an increase in the total number of data channels has less impact on the running efficiency of FMEMD than on that of MEMD. Moreover, FMEMD reduces the preprocessing time by a factor of more than 15 compared with MEMD. Hence, FMEMD is more compatible with real-world BCI systems. In addition, considering the cross-channel information of more data channels simultaneously is more practicable with FMEMD due to its high computation speed. Combined with the superior noise robustness given by the stable filter bank property, the FMEMD-based method can provide a more accurate estimation of brain responses within specific frequency bands than existing methods, thereby improving the classification accuracy of motor imagery tasks.

5 Classification of motor imagery task using FMEMD

In this section, we propose an FMEMD-based classification method and evaluate it using two motor imagery datasets.

5.1 Materials

As mentioned before, we have chosen the BCI Competition IV Dataset I to evaluate the time–frequency analysis ability of FMEMD. In this section, this dataset is also used to compare the classification performance of the FMEMD-based approach with that of other existing methods that have employed the same dataset. Another representative dataset from the Physiobank Motor/Mental Imagery (MMI) database is also taken into consideration (Schalk et al. 2004). The dataset consists of a total of 109 subjects who performed left and right hand motor imagery tasks. Each subject performed 45 trials and imagined one of two tasks for a duration of 4 s. The 64-channel EEG data are recorded at 160 Hz. Note that we exclude the data of 4 subjects, S088, S092, S100, and S104, since they had damaged recordings and too few samples (Kim et al. 2016). Therefore, a total of 105 subjects are included in this classification experiment. Out of the 64 EEG channels, 11 are chosen for analysis, including “FC3”, “FC4”, “Cz”, “C3”, “C4”, “C5”, “C6”, “T7”, “T8”, “CP3”, and “CP4”. The BCI Competition IV Dataset 2a is also considered to comprehensively evaluate the performance of the proposed method. This dataset includes the EEG signals of four-category MI recognition tasks (left hand, right hand, feet, tongue) from 9 subjects. For each subject, two sessions of EEG signals were collected on different days, and there were a total of 288 trials (72 trials per class) per session. In accordance with the prompt displayed on the screen, four distinct MI tasks were performed by the subjects. The sampling frequency of the EEG signals from 22 electrodes was 250 Hz.

5.2 FMEMD-based classification method

5.2.1 Preprocessing

The motor imagery data from the given datasets are filtered into different frequency modes using NA-FMEMD. As before, NA-FMEMD is applied to decompose the multichannel EEG data simultaneously with two additional noise channels. Note that, in order to examine the relationship between the number of EEG channels and the FMEMD performance, we select the EEG data with 4, 11 and 59 channels to conduct the classification experiments. The parameters of FMEMD itself remain the same as in the previous sections.

Once a set of MIMFs is obtained, we need to identify the specific components and their frequency ranges that contribute to the beta and mu rhythms. One of the contributions of our work is an automatic screening method for frequency modes based on the joint mean frequency of each MIMF.

For a multivariate modulated oscillation (MIMF), the joint instantaneous frequency \(\omega (t)\) is defined as

$$\begin{aligned} \omega (t) = \frac{{\sum \nolimits _{n = 1}^N {a_n^2} (t){\omega _n}(t)}}{{\sum \nolimits _{n = 1}^N {a_n^2} (t)}}, \end{aligned}$$
(5)

which is the power-weighted average of the frequencies \({\omega _n}(t)\) of all N channels. Therein, \({a_n} (t)\) denotes the instantaneous amplitudes of each channel, while the instantaneous powers are represented as \({a_n^2} (t)\). This joint instantaneous frequency \(\omega (t)\) is the generalization of the concept of univariate instantaneous frequency, which has been interpreted specifically in Lilly and Olhede (2009, 2011) and Boashash (1992). The joint mean frequency \(\bar{\omega }\) for each MIMF is then computed by,

$$\begin{aligned} \bar{\omega }= \frac{1}{L}\sum \limits _{t = 1}^L {\omega (t)}, \end{aligned}$$
(6)

where L is the data length. In fact, these mean frequencies of MIMFs can also be regarded as the center frequencies of the power spectrum in the frequency domain. Using these center frequencies, we can automatically select the frequency modes corresponding to the beta (18–25 Hz) and mu (8–12 Hz) rhythms, thereby enhancing the characteristics of the motor imagery responses.
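The screening rule based on Eqs. (5) and (6) can be sketched in Python as follows. The function names are hypothetical, the instantaneous amplitudes and frequencies (in Hz) of each channel are assumed to be available already, for example from a Hilbert transform of each MIMF channel, and treating the band edges as inclusive is our assumption.

```python
BANDS = {"mu": (8.0, 12.0), "beta": (18.0, 25.0)}  # rhythm bands in Hz


def joint_instantaneous_frequency(amplitudes, frequencies):
    """Eq. (5): power-weighted average of channel frequencies at each sample.

    amplitudes[n][t] and frequencies[n][t] belong to channel n at sample t.
    """
    omega = []
    for t in range(len(amplitudes[0])):
        weights = [a[t] ** 2 for a in amplitudes]  # instantaneous powers
        weighted = sum(w * f[t] for w, f in zip(weights, frequencies))
        omega.append(weighted / sum(weights))
    return omega


def joint_mean_frequency(amplitudes, frequencies):
    """Eq. (6): time average of the joint instantaneous frequency."""
    omega = joint_instantaneous_frequency(amplitudes, frequencies)
    return sum(omega) / len(omega)


def select_rhythm_mimfs(mean_frequencies):
    """Map each rhythm to the indices of MIMFs whose mean frequency lies in its band."""
    selected = {name: [] for name in BANDS}
    for index, freq in enumerate(mean_frequencies):
        for name, (low, high) in BANDS.items():
            if low <= freq <= high:
                selected[name].append(index)
    return selected


# Two channels with powers 1 and 4 at 10 Hz and 20 Hz: (1*10 + 4*20) / 5 = 18 Hz.
example = joint_mean_frequency([[1.0, 1.0], [2.0, 2.0]], [[10.0, 10.0], [20.0, 20.0]])
```

MIMFs whose joint mean frequency falls outside both bands are simply left out of the selection, which is how redundant frequency modes are discarded.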

Table 1 Average computational time (s) against diverse data channels

5.2.2 Common spatial patterns

The common spatial patterns (CSP) approach is widely used as an effective tool to extract numerical features relevant to motor imagery responses in BCI applications (Schalk et al. 2004; Kevric and Subasi 2017). It aims at finding linear spatial filters that maximize the variance of EEG signals from one class while minimizing their variance from the others (Lotte and Guan 2010). In particular, the ERD(S) caused by changing mental/brain states can be detected by CSP filters, as the related operations are sensitive to power changes in the time series. The ith trial of EEG data for one task C is denoted as an \(N \times K\) matrix \({\mathrm{{X}}_{i,C}}\), where N is the number of channels and K represents the total number of selected MIMFs. The details of the CSP method are presented in Algorithm 4.

figure d
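The CSP idea described above can be sketched in a few lines of Python with NumPy. This is a textbook two-class formulation (whitening of the composite covariance followed by an eigendecomposition) with normalized log-variance features; it is offered as an assumption about the general technique and is not a transcription of Algorithm 4. All function names are hypothetical.

```python
import numpy as np


def normalized_cov(trial):
    """Trace-normalized spatial covariance of one (channels x samples) trial."""
    c = trial @ trial.T
    return c / np.trace(c)


def csp_filters(trials_a, trials_b, n_pairs=1):
    """Return 2 * n_pairs spatial filters (as rows) for classes a and b."""
    cov_a = np.mean([normalized_cov(x) for x in trials_a], axis=0)
    cov_b = np.mean([normalized_cov(x) for x in trials_b], axis=0)
    # Whiten the composite covariance cov_a + cov_b (assumed full rank).
    d, u = np.linalg.eigh(cov_a + cov_b)
    whiten = np.diag(d ** -0.5) @ u.T
    # eigh returns ascending eigenvalues, so reverse the eigenvector order:
    # the first rows of w then maximize class-a variance, the last rows class-b.
    _, b = np.linalg.eigh(whiten @ cov_a @ whiten.T)
    w = b[:, ::-1].T @ whiten
    return np.vstack([w[:n_pairs], w[-n_pairs:]])


def log_variance_features(filters, trial):
    """Normalized log-variance of the spatially filtered trial (assumed feature)."""
    z = filters @ trial
    var = z.var(axis=1)
    return np.log(var / var.sum())
```

With `n_pairs=1`, each trial is reduced to a two-element feature vector whose first entry is large for class a and whose second entry is large for class b.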

5.2.3 Classification

In this work, the feature vectors, which are yielded by Eq. 6 in Algorithm 4 for \(m=1,2\), are classified using different classifiers. Both the linear discriminant analysis (LDA) and support vector machine (SVM) algorithms (Footnote 2) were, and still are, the most popular types of classifiers for EEG-based BCIs, particularly for online and real-time BCIs (Lotte et al. 2018). Therefore, we select these two representative classifiers to perform the classification tasks, while exploring the impact of different classifiers on the FMEMD-based classification performance. We divide the 200 trials for each subject from BCI Competition IV Dataset I into 140 training and 60 testing trials, and the 45 trials for each subject in the Physiobank MMI database into 32 training and 13 testing trials (Footnote 3).

Fig. 9 Architecture of the proposed FMEMD-based classification approach for left and right hand motor imagery tasks

The classification performance of all classifiers is calculated using five-fold cross-validation, and the classification task of each subject is repeated 100 times with the sample order shuffled. It is worth noting that the upper limit of the chance-level confidence interval for two classes, which depends on the number of trials, was \(56.9\%\) for 200 trials and \(64.0\%\) for 45 trials (Kim et al. 2016; Loboda et al. 2014). Only the subjects with classification rates over \(56.9\%\) or \(64.0\%\), respectively, are categorized as significant subjects and displayed.
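The two thresholds above can be reproduced with a short calculation. The sketch below assumes an adjusted Wald confidence interval around the chance level at a significance level of 0.05; the text cites the thresholds without stating the formula, so this choice is an assumption that happens to match both reported values. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def chance_upper_limit(n_trials, n_classes=2, alpha=0.05):
    """Upper limit of the chance-level interval (adjusted Wald, assumed)."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Adjusted Wald: add two successes and two failures to the trial counts.
    n_adj = n_trials + 4
    p_adj = (n_trials / n_classes + 2.0) / n_adj
    return p_adj + z * sqrt(p_adj * (1.0 - p_adj) / n_adj)


print(round(100 * chance_upper_limit(200), 1))  # 56.9
print(round(100 * chance_upper_limit(45), 1))   # 64.0
```

A subject's accuracy therefore has to exceed a higher bar when fewer trials are available, which is why the Physiobank MMI threshold is stricter.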

The architecture of the FMEMD-based classification method is shown in Fig. 9.

Note that, instead of reconstructing the EEG data by summing the selected MIMFs (Park et al. 2013; Gaur et al. 2018), we extract features separately for each frequency sub-band using common spatial pattern (CSP) filters and integrate these features into a single feature vector. This further highlights the advantages of FMEMD in terms of frequency band separation and mode alignment, while enabling the feature vectors to contain more information on brain activities.
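The per-band feature integration described above can be illustrated as follows. This sketch takes already-computed spatial filters for each sub-band and concatenates the resulting features, rather than summing the bands before filtering. The normalized log-variance feature and all names are assumptions made for illustration, and the code uses only the Python standard library.

```python
import math


def variance(signal):
    """Population variance of a list of samples."""
    mean = sum(signal) / len(signal)
    return sum((v - mean) ** 2 for v in signal) / len(signal)


def spatial_filter(filters, trial):
    """Apply each filter (one weight per channel) to a channels x samples trial."""
    n_samples = len(trial[0])
    return [
        [sum(w_c * ch[t] for w_c, ch in zip(w, trial)) for t in range(n_samples)]
        for w in filters
    ]


def log_variance(signals):
    """Normalized log-variance of each filtered signal (assumed feature)."""
    var = [variance(s) for s in signals]
    total = sum(var)
    return [math.log(v / total) for v in var]


def subband_feature_vector(band_trials, band_filters):
    """Concatenate the features of every sub-band into one vector."""
    features = []
    for trial, filters in zip(band_trials, band_filters):
        features.extend(log_variance(spatial_filter(filters, trial)))
    return features
```

With a mu-band and a beta-band component and two filters per band, the resulting vector has four entries, so band-specific variance patterns stay separate instead of being merged by a reconstruction step.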

Table 2 Classification results (%) for four subjects of BCI competition IV dataset I using all applied algorithms

5.3 Results

Table 2 shows the classification performance for the four subjects of BCI Competition IV Dataset I using NA-FMEMD and NA-MEMD together with the LDA and SVM classifiers.

The proposed Method-1 applies NA-FMEMD to 11-channel EEG data for all subjects (NA-FMEMD+CSP+LDA(11)), extracts the feature vectors using CSP, and then performs the classification with the LDA classifier. In order to examine the impact of the number of data channels on the classification performance of FMEMD-based methods, 59-channel EEG data are used in the proposed Method-2 (NA-FMEMD+CSP+LDA(59)). In addition, Method-3 classifies the CSP features from the 11-channel EEG data using SVM (NA-FMEMD+CSP+SVM(11)), which allows a comparison between different classifiers. These three methods all make use of the proposed FMEMD-based architecture.

As we can see, since Method-2 considers the information from more data channels, it achieves improvements of \(2.5\%\), \(1.2\%\), and \(1.2\%\) for three subjects (a, b, and f) over Method-1. However, the accuracy of Method-2 for subject g is much lower than that of Method-1. This may be because the features of subject g are more concentrated on the selected 11 channels, while the remaining channels have an adverse effect on the feature extraction. Compared with Method-3, Method-1 performs better for all subjects, which suggests that the LDA classifier is more suitable for the FMEMD-based method. On average, the best and most robust performance is obtained by Method-1, which achieves a \(1.0\%\) improvement over Method-2 and a \(0.5\%\) improvement over Method-3.

Table 2 also shows the comparison with other state-of-the-art methods, including Method-4 (MEMD-based CSP with SVM) (Park et al. 2013), Method-5 (FBCSP with SVM) (Kumar et al. 2017), Method-6 (CSP-TSM with SVM) (Kumar et al. 2017) and Method-7 (DRL1-CSP) (Jin et al. 2020). Method-4 used NA-MEMD to filter the 11-channel EEG data, then computed the CSP features from manually selected decomposition components and classified them with SVM. Method-5 and Method-6 applied a Butterworth bandpass filter to the raw EEG data and performed the feature extraction stage using FBCSP and CSP-TSM (CSP and tangent space mapping (TSM)), respectively. These features were then classified by the same SVM classifier. Method-7 applied an internal feature selection for CSP based on the Difference and Ratio of Average L1-Norm. Observe that our proposed method maintains superior classification performance for all subjects over the other methods. On average, the NA-FMEMD-based method reaches an accuracy of \(83.6\%\), a \(6.2\%\) improvement over Method-5 (FBCSP with SVM), a \(2.3\%\) improvement over Method-6 (CSP-TSM with SVM), and a \(15.6\%\) improvement over Method-7 (DRL1-CSP). Although Method-4 (the NA-MEMD-based method) marginally outperforms Method-1 by \(0.2\%\), the proposed method runs 20 times faster than Method-4 (see also Sect. 4.3 and Table 1), which benefits real-time BCIs.

The classification rates for the second dataset, the Physiobank MMI database, are obtained by NA-FMEMD and NA-MEMD with CSP and are shown in Table 4. Among the FMEMD-based approaches, we chose the proposed Method-1 since it showed better overall performance than the proposed Method-2 and Method-3 in the experiment on BCI Competition IV Dataset I. The NA-MEMD-based method is still denoted as Method-4. Following Park et al. (2013), we also display ten subjects for a detailed comparison. Except for subjects 25 and 12, Method-1 showed the best classification accuracies for the remaining eight subjects. On average, the proposed method gave a \(0.9\%\) improvement over NA-MEMD (Method-4).

Table 3 Overview over other works performing motor imagery classification tasks on the Physiobank MMI dataset for all subjects, and comparison to this work’s results
Table 4 Classification Results (\(\%\)) for ten subjects of Physiobank MMI database using Proposed Method-1 and Method-4

Table 3 shows the average classification rates for all significant subjects obtained by the proposed method and other existing approaches. Observe that our proposed method performs comparably to the other methods. Although the FMEMD-based method in this work achieves a slightly lower accuracy than the works of Kim et al. (2016) and Dose et al. (2018), we should note that: (1) the former still uses MEMD to preprocess the EEG data, which consumes considerable running time, especially as the number of data channels increases. Researchers have shown that the SUT-CCSP (strong-uncorrelating transform based complex CSP) used in that method performs better in feature extraction than the conventional CSP used in our proposed method (Park et al. 2013). Therefore, a new method combining NA-FMEMD with SUT-CCSP may achieve better performance than the MEMD-based one; (2) the latter method, based on a convolutional neural network (CNN), is an end-to-end learning approach for the classification of motor imagery tasks. It requires a large amount of high-quality training data and offers poor model interpretability, which leads to low compatibility with practical BCI systems. Besides, this method has not been verified on other datasets.

Table 5 Overview over other works performing motor imagery classification tasks on the BCI Competition IV Dataset 2a for all subjects, and comparison to this work’s results

Table 5 shows the average classification rates on the BCI Competition IV Dataset 2a obtained by the proposed method and popular deep learning methods. Observe that our proposed method performs comparably to the other methods. Although the FMEMD-based method in this work achieves a slightly lower accuracy than EEGNet and ATCNet, it still surpasses most other deep learning methods. These end-to-end learning approaches for the classification of motor imagery tasks require a large amount of high-quality training data and offer poor model interpretability, which leads to low compatibility with practical BCI systems.

Therefore, among the reviewed methods for the classification of motor imagery tasks, our proposed one is a real-time BCI oriented approach with the best overall performance. While it is true that deep learning methods can enhance accuracy, FMEMD as a preprocessing method can also be incorporated into neural network models to replace filter-bank segmentation such as FBCSP. We will conduct relevant research in our future work.

5.4 Discussion

In this paper, we applied our recently proposed FMEMD method to analyze motor imagery EEG data. FMEMD inherited the multivariate analysis ability of MEMD, thereby providing a scale-aligned decomposition and physically meaningful component estimates for motor imagery responses. It is clear that FMEMD showed higher noise robustness than MEMD due to its fairly stable filter bank property. Hence, FMEMD introduces no obvious mode mixing or mode misalignment problems. Besides, it achieved more precise and stable multivariate component estimates under the disturbance of white Gaussian noise with different SNRs. These statements were verified in Sect. 3 (simulation data) and Sect. 4 (real-world data). More specifically, Figs. 1, 2, 3, and 4 clearly revealed the superior decomposition performance of FMEMD in terms of mode mixing and component estimation. In particular, the smaller fluctuations of the FMEMD estimates (as shown in Figs. 3 and 4) across all repeated trials for each frequency mode in the component estimation experiment again illustrated its stable filter bank property. For the real-world two-channel EEG data from BCI Competition IV Dataset I, FMEMD also revealed more prominent ERD and ERS phenomena in terms of both instantaneous amplitudes (Fig. 6) and instantaneous powers (Fig. 7). Therefore, the power spectra yielded by FMEMD not only showed similar enhanced-localization characteristics, but also exhibited fewer high-frequency noisy points than MEMD in the time–frequency plane (see also Fig. 8).

The enhanced noise robustness of FMEMD was reflected in the improved classification rates on two typical datasets, especially the Physiobank MMI dataset (see Table 4). For the ten displayed subjects, the FMEMD-based method generally maintained better classification results than MEMD, which suggests that FMEMD retained more accurate cross-channel information for each subject. For the first dataset, BCI Competition IV Dataset I, the FMEMD-based method outperformed most applied methods, except for showing a slightly lower average classification rate than MEMD (see Table 2). However, the higher computational efficiency of FMEMD compensates for this drawback in the context of real-time BCIs.

Figure 5 (simulation data similar to EEG rhythms) and Table 1 (EEG data with different numbers of channels) revealed that the execution time of FMEMD for processing multichannel data was much shorter than that of MEMD and, in particular, that this advantage became more obvious as the total number of data channels increased. The greatly reduced computational load of FMEMD led to enhanced compatibility with practical and real-time BCI systems. In addition, FMEMD allowed the simultaneous analysis of more channels without concern about the increased running time caused by more data channels. This means that more EEG information from other channels was taken into consideration, which may have resulted in the improved classification performance. In fact, the comparison of the classification accuracies of the proposed Method-1 (with 11-channel EEG) and Method-2 (with 59-channel EEG) on subjects a, b, and f in Table 2 supported this statement.

In this work, we also introduced a more physically meaningful screening index of frequency modes than the frequency measures used in Gaur et al. (2018). The index for each MIMF is computed by averaging the joint instantaneous frequency, which combines all data channels, over time. Further information about the joint instantaneous frequency of a multivariate modulated oscillation can be found in Lilly and Olhede (2009, 2011). Combining this with the newly proposed selection strategy, we developed the FMEMD-based classification scheme shown in Fig. 9. However, although our proposed scheme was comparable to other hybrid methods, it failed to show the highest classification rates (see Table 3), since the feature extraction method and classifier used in this work were not further analyzed and optimized. Several effective directions for improving the overall performance can be identified from the experimental results. For example, the LDA classifier was more suitable for the FMEMD-based method than SVM, as verified by the classification results of the proposed Method-1 and Method-3 in Table 2. Researchers have shown that nonlinear classifiers, such as random forest (RF), can outperform conventional classifiers (like LDA) in motor imagery classification (Kim et al. 2016). This finding may also significantly improve the FMEMD-based classification method. On the other hand, a suitable data channel selection strategy is essential for our proposed method. Too many data channels can sometimes damage the extracted EEG features, thereby resulting in lower accuracy for certain subjects. The lower classification rate of the proposed Method-2 on subject g illustrated this point (see also Table 2). Fortunately, EEG channel selection is already a relatively mature technology. Moreover, FMEMD does not need to consider the impact of the number of channels on the operating efficiency, so the FMEMD-based method is expected to achieve further performance gains. The FMEMD architecture is not limited to motor imagery cases but can also be employed in the analysis of other neural data. A new area of research has arisen pertaining to the separation and parametrization of neural oscillations (Donoghue et al. 2020), and this method can contribute to the development of this field.

6 Conclusion

In this study, our recently proposed FMEMD method has been applied to EEG rhythm separation and time–frequency analysis and compared with other state-of-the-art methods. It was found that FMEMD and its noise-assisted version can significantly improve classification performance on BCI Competition IV Dataset I and attain a performance comparable to that of complex methods on the Physiobank MMI dataset and the BCI Competition IV 2a dataset with less execution time. The stable filter bank property and low computational complexity of FMEMD enable accurate component estimation and high noise robustness, thus providing a more accurate description of the brain activities corresponding to specific frequency bands, especially the mu and beta rhythms. Future work will concentrate on the improvement of the FMEMD-based method and its implementation. Our intention is to utilize the FMEMD architecture as a novel approach for EEG rhythm separation to investigate the mechanisms of neurological disorders, aid in their diagnosis, and conduct classification research on various cognitive tasks.