1 Introduction

A brain-computer interface (BCI), also known as brain-computer fusion sense, is a means of communication between a human or animal brain and an external auxiliary device. With this technology, external auxiliary devices can be controlled and interacted with without relying on the peripheral nervous system or muscle tissue [1]. Nicolas [2] defines a BCI as a hardware and software communication strategy.

BCI systems can be classified in many ways. Figure 1 illustrates four classification schemes, based on the direction of control, dependability, recording method, and mode of operation. By direction of control, BCIs are divided into unidirectional and bidirectional systems. In a unidirectional BCI, instructions flow in only one direction at a time: either the brain sends instructions to an external auxiliary device, or an external device sends instructions to the brain. A bidirectional BCI allows two-way information exchange between the brain and external devices. At present, BCI research focuses mainly on unidirectional systems, and in practice only the brain-to-device direction has been realized.

By dependability, BCIs are classified as dependent or independent. A dependent BCI requires the subject to exercise some form of motor control, as in visual evoked control and motor imagery control, and has been widely used. An independent BCI requires no such control from the subject, which makes it well suited to patients with eye movement disorders or severe physical paralysis. Tello [3] proposed a novel independent BCI based on conventional steady-state visual evoked potentials. It uses figure-ground perception to identify two different targets, so that commands can be sent within a limited visual space without shifting the eyes, and it proved to be effective.

By recording method, BCIs are divided into non-invasive and invasive systems. A non-invasive BCI places electrodes on the scalp to collect electrical signals, whereas an invasive BCI requires electrodes to be surgically implanted inside the skull, in different parts of the brain. Invasive recordings are strong, stable, and long-lasting, but over time scar tissue tends to form, causing signal interference and loss. Although non-invasive signals are weaker and less stable, non-invasive recording does not harm the body and raises no concern about immune reactions. Common non-invasive methods include electroencephalography (EEG), magnetoencephalography (MEG), positron emission tomography (PET), functional magnetic resonance imaging (fMRI), and functional near-infrared spectroscopy (fNIRS). Because it is non-invasive, safe, easy to use, easy to collect, and cost-effective, EEG is widely used, and it supports control signals such as slow cortical potentials (SCP), steady-state visual evoked potentials (SSVEP), motor imagery (MI), error-related potentials (ErrP), and P300 [4].

Finally, according to whether the user's operation of the system is tied to timing cues, BCIs are divided into synchronous and asynchronous systems. If the interaction is based on a prompt imposed by the system at some point in time, the system is a synchronous BCI. The user generates the brain activity in response to the cue, and the cue makes it possible to distinguish whether the neural activity is intentional or unintentional [5]. In an asynchronous BCI, the user can perform a mental task to interact with the application at any time, independent of any system prompt, so the system itself must actively distinguish intentional from unintentional neural activity. A synchronous BCI is simple in design but has many limitations and is, by comparison, less user-friendly.

Motor imagery (MI), one of the four main BCI paradigms, focuses on controlling movement (such as that of the hands, arms, or feet) through visual-motor imagery. Unlike other paradigms, it primarily characterizes an intention to move: limb movement is controlled through neural activity alone, with no actual movement output and no need for external stimulation [6]. When subjects imagine different limb movements, the sensorimotor cortex generates EEG signals that are similar to those of actual movement, so researchers can determine the user's intention, and thereby achieve control of the limb, by identifying the activation patterns in different brain areas. For example, the 29-year-old paraplegic Juliano Pinto kicked off the 2014 World Cup in Brazil through an MI-BCI.

The MI-BCI is important for the therapeutic recovery of stroke patients and of people with motor disorders, severe muscle disorders, or paralysis. Studies of this active motor rehabilitation training approach indicate that it can effectively restore the function of impaired motor and perceptual areas of the brain [7]. Through the MI paradigm, people with motor disabilities, cerebral palsy, and other mobility impairments can also control external assistive devices (such as wheelchairs, nursing beds, and robotic arms) and thereby regain some ability to communicate and move.

The structure of this paper is as follows: Sect. 2 introduces the basics of EEG signals and their acquisition; Sect. 3 introduces preprocessing methods for MI-EEG signals; Sect. 4 introduces feature extraction methods for MI-EEG signals; Sect. 5 introduces feature classification methods for MI-EEG signals; Sect. 6 concludes the paper and gives an outlook on MI-BCI.

Fig. 1. Classification of BCI systems in terms of control direction, dependability, recording method, and mode of operation.

2 EEG Signals and Signal Acquisition

EEG signals are the sum of the changes in extracellular field potentials produced by the electrophysiological activity of large numbers of nerve cells, measured in the cerebral cortex or on the surface of the scalp, and this brain activity can be recorded with EEG acquisition equipment. EEG signals are generally classified as spontaneous or evoked [8]. Spontaneous EEG consists of changes in extracellular field potentials that the brain's nervous system produces without any external stimulus, such as slow cortical potentials and sensorimotor rhythms. Evoked EEG arises when an external stimulus (such as sound, light, a picture, or a video) applied to a person's sensory organs causes fluctuations in the nervous system, which in turn cause potential changes in the corresponding parts of the brain. Examples are steady-state visual evoked potentials, visual evoked potentials, and P300.

The human brain is generally divided into the cerebral cortex and the subcortex, and the cerebral cortex is the main focus of research. It is the most central and complex region of the brain, controlling emotion, memory, thinking, behavior, language, and other functions. The cerebral cortex is divided into two hemispheres, as shown in Fig. 2, each of which contains five parts: the frontal, parietal, occipital, and temporal lobes, and the cerebellum [9]. Research has shown that subjects in different mental states exhibit different EEG characteristics, and that EEG activity is closely related to the subject's emotion and thinking. Because the EEG signal varies most clearly in the frequency domain, over a range that starts at about 0.5 Hz, it is divided by frequency into five bands, the δ, θ, α, β, and γ waves, each of which reflects a different activity state of the brain, as shown in Table 1.

Fig. 2. Physiological structure of the cerebral cortex.

Table 1. EEG characteristics of different bands.

Acquiring EEG signals and processing them accurately is the key to BCI. A complete EEG acquisition system consists of a signal acquisition cap, an amplifier, and a data storage device [10]. The electrodes of the cap are either dry or wet. Dry electrodes generally use stainless steel as the conductor, while wet electrodes usually use silver and silver chloride. Table 2 compares dry-electrode and wet-electrode acquisition devices. Since both have advantages and disadvantages, the appropriate device can be selected according to the length of the experiment, the laboratory environment, and other factors.

Table 2. Comparison table of dry electrode and wet electrode.
Table 3. Typical time domain feature extraction methods.

The electrodes of current EEG acquisition devices are placed according to the international 10–20 system, a standard developed in 1958 [11], as shown in Fig. 3. Here, 10 means that the distance from the root of the nose to the midpoint of the frontal pole and the distance from the occipital point to the external occipital ridge are each 10% of the total connecting distance, and 20 means that the remaining collection points are spaced at 20% of the total distance. Because the collected EEG signal is extremely weak, it must be amplified by an amplifier, which also reduces the effect of environmental noise and the signal loss caused by cable movement. Finally, the collected EEG data are stored on a storage device such as a portable hard disk or a Raspberry Pi.

Fig. 3. International standard 10–20 EEG recording system electrode placement.

3 Preprocessing of MI-EEG Signals

To obtain effective EEG signals, signal processing usually consists of three parts: preprocessing, feature selection and extraction, and feature classification. This section introduces preprocessing, and the next two sections cover the other two parts.

EEG signals collected with EEG acquisition equipment are usually mixed with artifacts and noise. Artifacts usually originate in the human body, for example from eye movements, cardiac activity, and muscle activity. Noise usually originates outside the body, for example from equipment failure, poor electrode contact, electrode impedance, electromagnetic noise, and power line interference. Both greatly hinder the analysis of EEG data. Preprocessing in an MI-BCI system therefore filters the raw EEG signals, mainly with temporal and spatial filters, to eliminate noise and artifacts and obtain signals with specific patterns [12, 13].

Temporal filters, mainly low-pass and band-pass filters, are the most commonly used filters in the preprocessing stage. They restrict the EEG signal to the frequency band that carries the neurophysiological information related to the cognitive task. For example, a low-pass filter blocks high-frequency components such as myoelectric or other noise. Because signals in the α and β bands are closely related to the motor imagery task, the band-pass filter for MI tasks is usually set to 8 to 30 Hz [14, 46].
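As a rough illustration of the 8 to 30 Hz band-pass step, the sketch below zeroes every FFT bin outside the pass band. This brick-wall filter is a simplification chosen to keep the example short; practical MI pipelines more commonly use Butterworth or FIR designs. The function name `bandpass_fft`, the 250 Hz sampling rate, and the synthetic test signal are assumptions made for the example and do not come from the cited studies.

```python
import numpy as np

def bandpass_fft(x, fs, low=8.0, high=30.0):
    """Keep only the [low, high] Hz components of x (last axis = time)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)       # frequency of each FFT bin
    spectrum = np.fft.rfft(x, axis=-1)
    keep = (freqs >= low) & (freqs <= high)      # pass-band mask
    return np.fft.irfft(spectrum * keep, n=n, axis=-1)

fs = 250.0                                       # assumed sampling rate (Hz)
t = np.arange(500) / fs                          # 2 s of samples
mu_rhythm = np.sin(2 * np.pi * 10 * t)           # 10 Hz, inside the pass band
muscle_noise = 0.5 * np.sin(2 * np.pi * 60 * t)  # 60 Hz, above the pass band
slow_drift = 2.0 * np.sin(2 * np.pi * 1 * t)     # 1 Hz, below the pass band
raw = mu_rhythm + muscle_noise + slow_drift

filtered = bandpass_fft(raw, fs)                 # only the 10 Hz component survives
```

Because the mask acts on the whole recording at once, this form suits offline analysis; an online BCI would need a causal filter instead.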

The main function of a spatial filter is to extract the spatial information relevant to the motor imagery task [15]. The common average reference (CAR) and the Laplacian filter are two common spatial filters, and both are computationally inexpensive. CAR removes the components common to all channels, leaving only the channel-specific signals. The Laplacian filter removes the components common to adjacent channels, increasing the difference between channels.
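Both spatial filters reduce to a subtraction of channel averages, which the following sketch shows on synthetic data. The function names, the five-channel layout, and the neighbour map are hypothetical; a real montage would derive the neighbours from the electrode positions.

```python
import numpy as np

def common_average_reference(eeg):
    """Subtract the instantaneous mean over channels; eeg is (n_channels, n_samples)."""
    return eeg - eeg.mean(axis=0, keepdims=True)

def laplacian_filter(eeg, neighbors):
    """Subtract from each listed channel the mean of its neighbouring channels."""
    out = eeg.copy()
    for ch, nbrs in neighbors.items():
        out[ch] = eeg[ch] - eeg[nbrs].mean(axis=0)
    return out

rng = np.random.default_rng(0)
local = rng.standard_normal((5, 1000))       # channel-specific activity
shared = 3.0 * rng.standard_normal(1000)     # component common to all channels
eeg = local + shared

# Hypothetical layout: channel 0 is surrounded by channels 1 to 4.
neighbors = {0: [1, 2, 3, 4]}

car = common_average_reference(eeg)          # shared component removed everywhere
lap = laplacian_filter(eeg, neighbors)       # shared component removed from channel 0
```

In this toy case both filters cancel the shared component exactly, because it is identical on every channel; with real recordings the cancellation is only approximate.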

4 Feature Extraction and Analysis

Because EEG acquisition devices use many electrodes and high sampling rates, a large amount of EEG data is generated every second, but the vast majority of it is not useful. To extract useful information, it is important to correctly distinguish the subject's intentional neural activity (such as a motor imagery task for a specific limb) from unintentional activity (such as background EEG or EMG activity). Feature extraction is the process of deriving, from the preprocessed EEG signals, feature vectors that clearly distinguish different mental states, discarding invalid data and retaining valid data. Feature extraction methods for motor imagery can be broadly classified into time domain, frequency domain, time-frequency domain, spatial domain, spatio-temporal domain, and spatio-spectral domain methods [16, 47].

4.1 Time Domain Methods

The EEG signal is extremely weak and unstable: its amplitude, frequency, period, and phase all change with the sensorimotor rhythm, so it shows different characteristics at each moment. Time domain analysis extracts the features of the EEG signal at each point in time. It is the earliest and most intuitive feature extraction method, is easy to understand, and can yield both time domain and frequency domain features. However, the algorithms are complex and computationally intensive, which makes it difficult to meet the real-time requirements of a BCI [17]. The method is also highly subjective: results depend heavily on the analyst's judgment, so it is often difficult to evaluate EEG signals objectively.

The time domain method first extracts and analyzes the EEG signal of each channel separately, then fuses the features of all channels into one large feature set and applies it to a single motor imagery paradigm. Table 3 summarizes several commonly used time domain feature extraction methods. To extract effective time domain features, the EEG signal is digitally filtered to isolate the motor rhythm components in the frequency band of interest, and the band power of the filtered signal is then calculated. Statistical measures such as the mean, root mean square, standard deviation, and variance are all widely used in MI task classification [18].
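The per-channel statistics named above can be computed and fused into one feature vector in a few lines. This is a minimal sketch: the function name and the two-channel toy epoch are assumptions, and the input is taken to be already band-pass filtered.

```python
import numpy as np

def time_domain_features(eeg):
    """eeg is (n_channels, n_samples); returns [means, RMS values, stds, variances]."""
    mean = eeg.mean(axis=1)
    rms = np.sqrt((eeg ** 2).mean(axis=1))
    std = eeg.std(axis=1)
    var = eeg.var(axis=1)
    # Fuse the per-channel statistics of all channels into one feature vector.
    return np.concatenate([mean, rms, std, var])

# Toy two-channel epoch (assumed to be filtered already).
epoch = np.array([[1.0, -1.0, 1.0, -1.0],
                  [0.0,  2.0, 0.0,  2.0]])
features = time_domain_features(epoch)
```

The resulting vector has four entries per channel, so its length grows linearly with the number of electrodes.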

4.2 Frequency Domain Methods

Spectral domain methods (SDM) extract frequency domain information from EEG signals. Some time domain statistics (such as the mean, standard deviation, and variance) are also applicable in the frequency domain. Samuel [23] used 12 spectral domain descriptors (SDD) and 20 time domain descriptors (TDD), 32 EEG feature extraction methods in total, to decode MI tasks for different limbs. With a linear feature combination technique, the average accuracy was 99.55% for an optimal set of SDD and 90.68% for an optimal set of TDD. The power of specific frequency bands, such as the δ, θ, α, β, and γ bands, can be analyzed using the fast Fourier transform (FFT) [24].
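A band-power computation with the FFT might look like the sketch below. The band edges in `BANDS` are commonly quoted approximate values and are an assumption here, as are the function name, the sampling rate, and the synthetic signal.

```python
import numpy as np

# Assumed band edges in Hz (approximate, commonly quoted values).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0), "alpha": (8.0, 13.0),
         "beta": (13.0, 30.0), "gamma": (30.0, 45.0)}

def band_powers(x, fs, bands=BANDS):
    """Sum the periodogram of a single-channel signal x over each band."""
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2 / x.size   # unnormalised periodogram
    return {name: power[(freqs >= lo) & (freqs < hi)].sum()
            for name, (lo, hi) in bands.items()}

fs = 250.0                                          # assumed sampling rate (Hz)
t = np.arange(500) / fs
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 20 * t)

powers = band_powers(x, fs)
dominant_band = max(powers, key=powers.get)         # band holding the most power
```

Relative band powers, obtained by dividing each entry by the total, are often preferred as features because they are less sensitive to overall amplitude differences between sessions.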

The power spectral density (PSD) is a frequency domain measure of how the power of a signal is distributed over frequency. It can be estimated by parametric or non-parametric methods. Commonly used approaches are Welch's averaged modified periodogram [25], the Yule-Walker equations [26], the Lomb-Scargle periodogram [27], and spectral entropy [28].
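To make the Welch estimate concrete, the sketch below averages Hann-windowed periodograms of overlapping segments using NumPy only. It is a simplified version written for illustration (signal-processing libraries offer more complete implementations); the segment length, the 50% overlap, and the noisy 12 Hz test signal are assumptions.

```python
import numpy as np

def welch_psd(x, fs, nperseg=256, overlap=0.5):
    """Average windowed periodograms of overlapping segments (Welch's method)."""
    x = np.asarray(x, dtype=float)
    step = max(1, int(nperseg * (1.0 - overlap)))
    window = np.hanning(nperseg)
    scale = fs * np.sum(window ** 2)              # normalises to power per Hz
    periodograms = []
    for start in range(0, x.size - nperseg + 1, step):
        seg = x[start:start + nperseg]
        seg = (seg - seg.mean()) * window         # remove the mean, then window
        periodograms.append(np.abs(np.fft.rfft(seg)) ** 2)
    psd = np.mean(periodograms, axis=0) / scale
    # One-sided spectrum: double every bin except DC (and Nyquist if present).
    if nperseg % 2 == 0:
        psd[1:-1] *= 2.0
    else:
        psd[1:] *= 2.0
    return np.fft.rfftfreq(nperseg, d=1.0 / fs), psd

fs = 256.0                                        # assumed sampling rate (Hz)
rng = np.random.default_rng(0)
t = np.arange(8 * 256) / fs                       # 8 s of samples
x = np.sin(2 * np.pi * 12 * t) + rng.standard_normal(t.size)

freqs, psd = welch_psd(x, fs)
peak_freq = freqs[np.argmax(psd)]                 # expected near 12 Hz
total_power = np.sum(psd) * (freqs[1] - freqs[0]) # roughly the signal variance
```

Averaging over segments trades frequency resolution for a lower-variance estimate, which is usually a good exchange for noisy EEG.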

4.3 Time-Frequency Domain Methods

EEG signals with prominent time-frequency characteristics are generally analyzed by time-frequency methods, which extract features in the time and frequency domains simultaneously. The short-time Fourier transform (STFT) [29] and the wavelet transform [30] are the most commonly used. The STFT first splits the EEG signal into overlapping time frames and then applies the fast Fourier transform (FFT) to each frame through a fixed window function. Because the FFT is simple and fast to compute, it has been widely used. The wavelet transform decomposes the signal into wavelets, which are finite-duration oscillating functions (in contrast to the unbounded sine and cosine functions of the Fourier transform). It has a flexible time-frequency resolution: the signal is progressively refined using a variable time-frequency window, and its energy intensity or density can be represented in both the time and frequency domains [31].

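Main formula of the STFT:

$$S\left(m,k\right)=\sum_{n=0}^{N-1}s\left(n+mN\right)w\left(n\right)e^{-j\frac{2\pi }{N}nk}$$
(1)

where s is the EEG signal, w (n) is the window function of length N, m is the frame index, and k is the frequency bin.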
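Eq. (1) maps directly onto a framed FFT, as the following sketch suggests. The function name `stft`, the Hann window, and the test signal are assumptions. With the default `hop=N` the frames follow the indexing of Eq. (1) exactly, while a smaller `hop` gives the overlapping frames described in the text.

```python
import numpy as np

def stft(s, N, hop=None, window=None):
    """Return S[m, k] for frames of length N; hop=N reproduces the indexing of Eq. (1)."""
    s = np.asarray(s, dtype=float)
    hop = N if hop is None else hop
    w = np.hanning(N) if window is None else np.asarray(window, dtype=float)
    n_frames = 1 + (s.size - N) // hop
    frames = np.stack([s[m * hop:m * hop + N] * w for m in range(n_frames)])
    # np.fft.fft evaluates sum_n frame[n] * exp(-2j*pi*n*k/N) for every k.
    return np.fft.fft(frames, axis=1)

fs, N = 128, 128                    # assumed values; bin k then corresponds to k Hz
t = np.arange(4 * N) / fs
# Test signal: 10 Hz for the first 2 s, then 20 Hz.
s = np.where(t < 2.0, np.sin(2 * np.pi * 10 * t), np.sin(2 * np.pi * 20 * t))

S = stft(s, N)
dominant_bin = np.argmax(np.abs(S[:, :N // 2]), axis=1)   # per-frame peak frequency
```

The per-frame peak moves from bin 10 to bin 20 halfway through the signal, which is the time localization that a single FFT of the whole recording cannot provide.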

Main formula of the wavelet transform:

$$\psi_{s,\tau }\left(t\right)=\frac{1}{\sqrt{s}}\psi \left(\frac{t-\tau }{s}\right)$$
(2)

where ψ is the mother wavelet, s is the scale, and τ is the translation.
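The scaling and shifting in Eq. (2) can be tried out numerically. The sketch below assumes a Ricker (Mexican hat) mother wavelet, which is only one possible choice, and approximates integrals by Riemann sums; all function names are hypothetical.

```python
import numpy as np

def mother_wavelet(t):
    """Ricker (Mexican hat) wavelet, an assumed choice of mother wavelet."""
    return (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def daughter_wavelet(t, s, tau):
    """Eq. (2): scale by s, shift by tau, and normalise by 1/sqrt(s)."""
    return mother_wavelet((t - tau) / s) / np.sqrt(s)

def wavelet_coefficient(x, t, s, tau):
    """Inner product of x with the daughter wavelet (Riemann-sum approximation)."""
    dt = t[1] - t[0]
    return np.sum(x * daughter_wavelet(t, s, tau)) * dt

t = np.linspace(-20.0, 20.0, 4001)
dt = t[1] - t[0]

# The 1/sqrt(s) factor keeps the energy the same at every scale.
energy_mother = np.sum(mother_wavelet(t) ** 2) * dt
energy_daughter = np.sum(daughter_wavelet(t, 2.0, 1.0) ** 2) * dt

# A signal shaped like the scale-2 wavelet responds most strongly at scale 2.
x = daughter_wavelet(t, 2.0, 0.0)
scales = [1.0, 2.0, 4.0]
coeffs = [wavelet_coefficient(x, t, s, 0.0) for s in scales]
best_scale = scales[int(np.argmax(np.abs(coeffs)))]
```

Evaluating the coefficient over a grid of scales and translations yields the time-scale map that the wavelet transform uses in place of the fixed-window spectrogram.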

Empirical mode decomposition (EMD) is an analysis method similar to the wavelet transform, but instead of decomposing the EEG signal into wavelet functions, it decomposes it into intrinsic mode functions (IMFs), which are simple oscillatory functions. The IMFs capture the frequency content in order from high to low.

Main formula of EMD:

$$x(t)=\sum_{i=1}^{n}c_{i}\left(t\right)+r_{n}(t)$$
(3)

where c_i (t) is the i-th IMF and r_n (t) is the residual.

4.4 Spatial Domain Methods

Although the time domain method came into use earlier, it can extract and analyze only a single channel at a time, and its algorithms are cumbersome. The spatial domain method extracts features by combining multiple channels that share certain feature relationships, so it processes multiple channels at once. Blind source separation (BSS) [32] is a widely used example: it is an unsupervised feature extraction method, in which there is no correspondence between classes and features. Cortical current density (CCD) and independent component analysis (ICA) are both good applications of BSS.

Main formulas of BSS:

$$x(t)=As(t)$$
(4)
$$s^{\prime}\left(t\right)=Bx(t)$$
(5)

where x (t) is the vector of mixed signals, s (t) is the vector of sources, s′ (t) is the vector of estimated sources, and A is the unknown non-singular mixing matrix. The aim is to find a matrix B that maps the mixed channels back to their sources.
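One way to estimate B from the mixed signals alone is independent component analysis. The sketch below is a compact symmetric FastICA with a tanh nonlinearity, offered as an illustration under simplifying assumptions rather than a production implementation; the function name, the two synthetic non-Gaussian sources, and the mixing matrix are all made up for the example.

```python
import numpy as np

def fastica(x, n_iter=200, tol=1e-6, seed=0):
    """Estimate an unmixing matrix B for centred data x of shape (n_channels, n_samples)."""
    rng = np.random.default_rng(seed)
    x = x - x.mean(axis=1, keepdims=True)
    n = x.shape[1]
    # Whitening: z = k @ x has identity covariance.
    d, e = np.linalg.eigh(x @ x.T / n)
    k = e @ np.diag(d ** -0.5) @ e.T
    z = k @ x

    def decorrelate(w):
        # Symmetric orthogonalisation: w <- (w w^T)^(-1/2) w.
        dw, ew = np.linalg.eigh(w @ w.T)
        return ew @ np.diag(dw ** -0.5) @ ew.T @ w

    w = decorrelate(rng.standard_normal((x.shape[0], x.shape[0])))
    for _ in range(n_iter):
        g = np.tanh(w @ z)
        w_new = decorrelate(g @ z.T / n - np.diag((1.0 - g ** 2).mean(axis=1)) @ w)
        converged = np.max(np.abs(np.abs(np.sum(w_new * w, axis=1)) - 1.0)) < tol
        w = w_new
        if converged:
            break
    return w @ k                                  # B in Eq. (5)

rng = np.random.default_rng(1)
s = np.vstack([rng.laplace(size=5000),            # assumed source 1 (super-Gaussian)
               rng.uniform(-1.0, 1.0, size=5000)])  # assumed source 2 (sub-Gaussian)
A = np.array([[1.0, 0.6], [0.4, 1.0]])            # mixing matrix, unknown to the algorithm
x = A @ s                                         # Eq. (4)

B = fastica(x)
s_est = B @ (x - x.mean(axis=1, keepdims=True))   # Eq. (5)
# Recovered sources match the true ones only up to order, sign, and scale.
match = np.abs(np.corrcoef(np.vstack([s_est, s]))[:2, 2:])
```

The order, sign, and scale ambiguity is inherent to BSS, which is why the comparison uses absolute correlations rather than a direct difference.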

Common spatial pattern (CSP) is a supervised feature extraction method based on classes and features. It detects event-related desynchronization (ERD) effectively, has a high recognition rate and low computational complexity, and is widely used in BCI. The preprocessed EEG data are first wavelet transformed, and the wavelet-transformed signal is then used as input to the common spatial pattern transformation. This projects the EEG data into a new space in which the variance of one class is maximized while the variance of the other is minimized [33]. The spatial filtering algorithm can therefore be viewed as a data-driven dimensionality reduction method that increases the variance difference between the two conditions. The common spatial frequency subspace decomposition (CSFSD) method used by Ramoser [34] and Choi [35] is an improvement on CSP.

Main formula of CSP:

$$J\left(\omega \right)=\frac{\omega^{T}C_{1}\omega }{\omega^{T}C_{2}\omega }$$
(6)

where ω is a spatial filter and C1 and C2 are the estimated covariance matrices of the two MI classes. Optimizing J (ω) with the Lagrange multiplier method leads to the generalized eigenvalue problem C1ω = λC2ω, whose eigenvectors are the spatial filters.
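The optimization of Eq. (6) can be sketched as a generalized eigenvalue problem solved through a Cholesky factor of C2. The trace normalization of the covariance estimates, the function names, and the three-channel synthetic trials are assumptions made for this example.

```python
import numpy as np

def class_covariance(trials):
    """Average trace-normalised covariance; trials is (n_trials, n_channels, n_samples)."""
    return np.mean([x @ x.T / np.trace(x @ x.T) for x in trials], axis=0)

def csp(c1, c2):
    """Solve C1 w = lambda C2 w; columns of W are sorted by decreasing J(w)."""
    chol = np.linalg.cholesky(c2)                     # C2 = L L^T
    l_inv = np.linalg.inv(chol)
    vals, vecs = np.linalg.eigh(l_inv @ c1 @ l_inv.T) # symmetric standard problem
    order = np.argsort(vals)[::-1]
    return vals[order], (l_inv.T @ vecs)[:, order]

def rayleigh_quotient(w, c1, c2):
    """J(w) from Eq. (6)."""
    return (w @ c1 @ w) / (w @ c2 @ w)

rng = np.random.default_rng(0)
# Synthetic trials: class 1 is strong on channel 0, class 2 on channel 2.
trials_1 = rng.standard_normal((20, 3, 500)) * np.array([3.0, 1.0, 1.0])[None, :, None]
trials_2 = rng.standard_normal((20, 3, 500)) * np.array([1.0, 1.0, 3.0])[None, :, None]

c1, c2 = class_covariance(trials_1), class_covariance(trials_2)
vals, W = csp(c1, c2)

# Variance of each class after projection onto the first filter.
var_class_1 = (W[:, 0] @ trials_1[0]).var()
var_class_2 = (W[:, 0] @ trials_2[0]).var()
```

The first column of `W` maximizes J and the last column minimizes it, so features are typically built from the variances of a few filters taken from both ends.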

4.5 Spatio-Temporal Domain Methods

Combining time domain and spatial domain feature extraction methods yields spatio-temporal domain methods, of which the most common in the past were based on Riemannian geometry. A Riemannian manifold is formed from the sample covariance matrices (SCM) of the EEG data, which lie in the space of symmetric positive definite (SPD) matrices [36]. Distances on a Riemannian manifold are measured along curves rather than straight lines, and they can be calculated using the affine invariant Riemannian metric (AIRM) [37].
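For two SPD matrices A and B, the AIRM distance is commonly written as the Frobenius norm of log(A^(-1/2) B A^(-1/2)), which equals the square root of the sum of squared logarithms of that matrix's eigenvalues. The sketch below uses this standard form, which is not quoted from [37]; the function name and the small test matrices are hypothetical.

```python
import numpy as np

def airm_distance(a, b):
    """Affine invariant Riemannian distance between SPD matrices a and b."""
    d, e = np.linalg.eigh(a)
    a_inv_sqrt = e @ np.diag(d ** -0.5) @ e.T          # A^(-1/2)
    lam = np.linalg.eigvalsh(a_inv_sqrt @ b @ a_inv_sqrt)
    return np.sqrt(np.sum(np.log(lam) ** 2))

# Two small SPD matrices standing in for sample covariance matrices.
p = np.array([[2.0, 0.5], [0.5, 1.0]])
q = np.array([[1.0, 0.2], [0.2, 3.0]])
d_pq = airm_distance(p, q)

# Affine invariance: a common invertible transform leaves the distance unchanged.
w = np.array([[2.0, 1.0], [0.0, 1.0]])
d_transformed = airm_distance(w @ p @ w.T, w @ q @ w.T)
```

This invariance is one reason the metric is attractive for EEG: a fixed linear mixing of the channels does not change the distances between covariance matrices.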

Most of the remaining spatio-temporal domain methods are based on deep learning. For example, the method proposed by Echeverri [38] in 2019 uses a blind source separation (BSS) algorithm to separate a single-channel signal into independent components that estimate the source signals, then applies the continuous wavelet transform (CWT) to obtain a 2D representation of the separated components, and finally classifies them with a convolutional neural network (CNN). Yang [39] proposed a method that uses a long short-term memory (LSTM) network and a convolutional neural network to extract temporal and spatial features from the raw EEG signal, and then extracts the spectral information of the EEG signal with the discrete wavelet transform. Li [40] proposed an end-to-end EEG decoding framework, a channel-projection mixed-scale convolutional neural network (CP-MixedNet), that first extracts spatial and temporal features from the raw EEG signal and is assisted by amplitude-perturbation data augmentation to improve the decoding accuracy.

4.6 Spatio-Spectral Domain Methods

Combining spatial domain and spectral domain feature extraction methods yields spatio-spectral domain methods. If temporal and spatial filters can be learned simultaneously, a unified framework can extract information from both the spatial and spectral domains. For example, Wu [41, 48] proposed an iterative spatio-spectral patterns learning (ISSPL) algorithm that learns spatio-temporal filters and spectral filters simultaneously. Suk [42, 49] used the interplay between particle filtering algorithms, feature vectors, and class label information to propose a probabilistic method for optimizing the spatio-temporal spectral filtering of EEG-based BCI. Zhang [43] proposed a structure based on deep recurrent and 3D convolutional neural networks (R3DCNNs) that learns EEG signal features from the spatial, spectral, and temporal dimensions simultaneously. Bang [44] proposed superimposing the spectrally filtered signals to construct a 3D-CNN feature map. With this feature map, a layer-by-layer decomposition model of the framework was implemented and experimental accuracy was ensured.

5 Classification of MI EEG Signals

A feature classification algorithm classifies the extracted feature vectors according to a target discriminant criterion to obtain the best classification result. It is a mapping from the feature space to the target space and usually consists of three parts: the mapping function, the objective function, and the minimization/maximization algorithm. The mapping function determines the feature space and the approximation ability of the classifier, the objective function describes the problem the classifier has to solve, and the minimization/maximization algorithm finds the best mapping function for mapping the data to the target space.

Linear discriminant analysis (LDA), support vector machines (SVM), multilayer perceptrons (MLP), and Bayesian classifiers are commonly used in the feature classification stage. In recent years, some deep learning-based classification algorithms have been proposed, but deep learning is still difficult to apply widely to motor imagery classification because of noise, correlation between channels, and the small size of per-subject datasets [45].
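As a small example of the first classifier in this list, the sketch below fits a two-class Fisher LDA in closed form. In terms of the three parts described earlier, the mapping function is the linear score `x @ w + b`, the objective is the Fisher criterion, and the optimization has a closed-form solution. The function names, the equal-prior assumption, and the synthetic two-dimensional features are all assumptions for illustration.

```python
import numpy as np

def fit_lda(x0, x1):
    """Two-class Fisher LDA with equal priors; x0 and x1 are (n_samples, n_features)."""
    mu0, mu1 = x0.mean(axis=0), x1.mean(axis=0)
    # Pooled within-class covariance.
    sw = ((x0 - mu0).T @ (x0 - mu0) + (x1 - mu1).T @ (x1 - mu1)) / (len(x0) + len(x1) - 2)
    w = np.linalg.solve(sw, mu1 - mu0)        # projection direction
    b = -0.5 * w @ (mu0 + mu1)                # threshold halfway between the class means
    return w, b

def predict_lda(x, w, b):
    """Return 1 where the linear score is positive, otherwise 0."""
    return (x @ w + b > 0).astype(int)

rng = np.random.default_rng(0)
# Synthetic feature vectors standing in for two MI classes.
train0 = rng.standard_normal((100, 2))
train1 = rng.standard_normal((100, 2)) + 3.0
test0 = rng.standard_normal((50, 2))
test1 = rng.standard_normal((50, 2)) + 3.0

w, b = fit_lda(train0, train1)
x_test = np.vstack([test0, test1])
y_test = np.array([0] * 50 + [1] * 50)
accuracy = np.mean(predict_lda(x_test, w, b) == y_test)
```

Real MI features overlap far more than these well-separated clusters, so accuracy on actual recordings would be lower than in this toy setting.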

6 Conclusions

This paper reviews research on motor imagery based brain-computer interfaces, covering the classification of BCIs, an overview of EEG signals and their acquisition, and the preprocessing, feature extraction, and feature classification of MI-EEG signals.

Research on MI-BCI has advanced signal processing methods and substantially improved algorithm performance. However, this research is far from complete, and some key issues remain unsolved. For example, because EEG is highly nonlinear and non-stationary, the target user often has to complete a large number of training trials, which lengthens the calibration period of an MI model. Current research on motor imagery also focuses mainly on offline models, and research on online models needs to be strengthened. Finally, researchers should establish a unified BCI criterion for algorithm evaluation so that performance improvements can be measured more reliably.