1 Introduction

A brain-computer interface (BCI) creates a new communication channel between the brain and an output device, bypassing the conventional motor output pathways of nerves and muscles [1–3]. BCI operation is based on two adaptive controllers: the user’s brain, which produces the activity that encodes the user’s thoughts, and the system, which decodes this activity into device commands [4, 5]. When advanced Computational Intelligence (CI) and machine learning techniques are used, a BCI can learn to recognize the signals generated by a user after a short training period [6–14]. The proposed system depends on the EEG activity that reflects the user’s wishes. Subjects are trained to imagine right and left hand movements during the EEG experiment. Imagining a right or left hand movement results in rhythmic oscillations in the EEG signal. This oscillatory activity comprises event-related changes in specific frequency bands. It can be categorized into event-related desynchronization (ERD), which denotes an amplitude (power) decrease of the μ rhythm (8–12 Hz) or β rhythm (18–28 Hz), and event-related synchronization (ERS), which denotes an amplitude (power) increase in these EEG rhythms. The system outputs commands to a remote control that steers a wheelchair via radio frequency (RF) waves (Fig. 1).

Fig. 1 Block diagram of the system
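The ERD/ERS definitions given in the Introduction can be illustrated with a small numerical sketch. It assumes a common convention in which the band-power change is expressed as a percentage relative to a reference interval, so that negative values indicate ERD and positive values indicate ERS. The signals are synthetic, and the helper names `band_power` and `relative_power_change` are hypothetical; only the 100 Hz sampling rate and the 8–12 Hz μ band are taken from the text.

```python
import numpy as np

def band_power(signal, fs, f_lo, f_hi):
    """Mean spectral power of `signal` between f_lo and f_hi (in Hz)."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return spectrum[mask].mean()

def relative_power_change(reference, active, fs, band=(8.0, 12.0)):
    """Band-power change of `active` relative to `reference`, in percent.

    Assumed convention: negative values indicate ERD, positive values ERS.
    """
    p_ref = band_power(reference, fs, *band)
    p_act = band_power(active, fs, *band)
    return 100.0 * (p_act - p_ref) / p_ref

fs = 100                                  # sampling rate in Hz, as in the dataset
t = np.arange(0, 2.0, 1.0 / fs)           # two seconds of synthetic data
rng = np.random.default_rng(0)

# Synthetic 10 Hz rhythm whose amplitude halves during the "imagery" interval
reference = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
active = 1.0 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)

change = relative_power_change(reference, active, fs)
print(f"mu-band power change: {change:.1f} %")
```

Because power scales with the square of amplitude, halving the amplitude of the rhythm yields a power decrease of roughly three quarters, which this sketch reports as a strongly negative (ERD-like) value.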

2 Methodology

This work uses a dataset of EEG activity recorded from 59 scalp electrodes placed according to the international 10/20 system of channel locations. The signals were sampled at 100 Hz. Two different tasks, an imagined right-hand movement and an imagined left-hand movement, were performed in the experiment. The brain signals recorded from the scalp encode information about the user’s thoughts. A BCI system was built to decode this information and translate it into device commands. Figure 2 shows the processing stages of the BCI. First, the artifacts contaminating the EEG signal are removed in order to increase the signal-to-noise ratio (SNR) of the acquired signal. Second, certain features are extracted and translated into device control commands. Each processing stage is discussed in the following sections.

Fig. 2 The processing stages of BCI

A. Artifact Removal

Artifacts are non-brain-based EEG activity that corrupts and disturbs the signal, making it difficult to interpret or even unusable. To increase the effectiveness of a BCI system, it is necessary to find methods for removing the artifacts. Artifact sources can be internal or external. Internal artifacts are generated by the subject and are unrelated to the movement of interest. They include eye movements, eye blinks, heartbeat and other muscle activity. External artifacts, on the other hand, come from the environment, for example line noise and electrode displacement. Several approaches for removing these artifacts have been proposed. Early approaches that subtracted artifacts using regression methods met with limited success [15–17]. Many of the newer approaches are based on blind source separation. In this chapter, a generally applicable method based on blind source separation by independent component analysis (ICA) is applied to remove a wide variety of artifacts. The ICA work was performed in Matlab (http://www.mathworks.com) using the EEGLAB software toolbox [18]. ICA is a statistical and computational technique that finds a suitable representation of data by means of a linear transformation. It performs a rotation that minimizes the gaussianity of the data projected onto the new axes. In this way it can separate a multivariate signal into additive subcomponents, assuming that the non-Gaussian source signals are mutually statistically independent.

Bell and Sejnowski [19] proposed a simple neural network algorithm that blindly separates mixtures, X, of independent sources, S, using information maximization (infomax). They showed that maximizing the joint entropy of the output of a neural processor minimizes the mutual information among the output components. Makeig et al. [16] proposed an approach to the analysis of EEG data based on the infomax ICA algorithm. They showed that ICA can be used to separate neural activity from muscle and blink artifacts and to find the independent components of the EEG [20]. Once the independent components are extracted, a “corrected EEG” can be derived by identifying the artifactual components and eliminating their contribution to the EEG.

ICA methods are based on the assumptions that the signals recorded from the scalp are mixtures of temporally independent cerebral and artifactual sources, that potentials from different parts of the brain, scalp, and body are summed linearly at the electrodes, and that propagation delays are negligible. ICA solves the blind source separation problem of recovering the independent source signals after they have been linearly mixed by an unknown matrix A. Nothing is known about the sources except that there are N recorded mixtures, X. The ICA model is:

$$ \boxed{X = AS} $$
(1)

The task of ICA is to recover a version U of the original sources, S, by finding a square matrix W, the inverse of the mixing matrix A, that linearly inverts the mixing process:

$$ \boxed{U = WX} $$
(2)

For EEG analysis, the rows of X correspond to the EEG signals recorded at the electrodes, the rows of U correspond to the independent activity of each component (Fig. 3), and the columns of A correspond to the projection strengths of the respective components onto the scalp sensors. The independent sources were visually inspected and artifactual components were rejected to obtain a “corrected EEG” matrix, X′, by back-projecting the matrix of activation waveforms, U, with the artifactual components set to zero, U′:

$$ \boxed{X^{\prime } = (W)^{ - 1} U^{\prime } } $$
(3)

Fig. 3 EEG independent components (ICs)
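Equations (1)–(3) can be traced through a small synthetic example. This is only a sketch of the back-projection step, not of infomax ICA itself: the unmixing matrix W is obtained here by inverting a known mixing matrix A, which stands in for the estimate an ICA algorithm would provide. The sources and the choice of the artifact row are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 500
t = np.arange(n_samples) / 100.0          # 100 Hz sampling, as in the dataset

# Hypothetical sources S: two "cerebral" rhythms and one blink-like artifact
S = np.vstack([
    np.sin(2 * np.pi * 10 * t),
    np.sign(np.sin(2 * np.pi * 3 * t)),
    20.0 * (rng.random(n_samples) > 0.98),
])

A = rng.standard_normal((3, 3))           # mixing matrix (unknown in practice)
X = A @ S                                 # Eq. (1): recorded mixtures

# Assumption: W is taken as the exact inverse of A. A real ICA algorithm
# would estimate W from X alone, up to scaling and ordering of the rows.
W = np.linalg.inv(A)
U = W @ X                                 # Eq. (2): component activations

artifact_rows = [2]                       # components judged artifactual
U_clean = U.copy()
U_clean[artifact_rows, :] = 0.0           # U' with artifact rows zeroed

X_clean = np.linalg.inv(W) @ U_clean      # Eq. (3): corrected EEG
```

In this idealized setting the corrected matrix equals the mixture of the two retained sources alone, which is exactly what zeroing a row of U before back-projection is meant to achieve.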

Before applying the ICA algorithm to the data, it is very useful to do some preprocessing. One popular method is to transform the observed data matrix into a new matrix whose components are uncorrelated, which is a necessary condition for independence. This can be achieved by applying principal component analysis (PCA). PCA transforms the data to a new orthogonal coordinate system whose axes are ordered by the amount of variance they capture. At the same time, PCA can be used to reduce the dimension of the data by keeping the principal components that account for most of the variance and discarding the others. This often has the effect of reducing the noise and preventing overlearning in ICA.
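The PCA preprocessing described above can be sketched as follows. The function name `pca_whiten` and the synthetic data are illustrative assumptions. In addition to decorrelating the channels, this sketch rescales each retained component to unit variance (whitening), which is a common choice before ICA but is not stated in the text.

```python
import numpy as np

def pca_whiten(X, n_components=None):
    """Decorrelate (and optionally reduce) channels-by-samples data X.

    Returns the transformed data Z and the matrix V such that Z = V @ (X - mean).
    """
    Xc = X - X.mean(axis=1, keepdims=True)          # remove channel means
    cov = Xc @ Xc.T / (Xc.shape[1] - 1)             # channel covariance
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]               # sort by variance, descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if n_components is not None:                    # keep the leading components
        eigvals, eigvecs = eigvals[:n_components], eigvecs[:, :n_components]
    V = (eigvecs / np.sqrt(eigvals)).T              # scale each axis to unit variance
    return V @ Xc, V

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4)) @ rng.standard_normal((4, 1000))  # correlated channels
Z, V = pca_whiten(X, n_components=3)
```

After this step the covariance matrix of Z is the identity, so the subsequent ICA stage only has to find a rotation.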

B. Feature Extraction

To obtain a reduced and more meaningful representation of the preprocessed signal for classification, certain features are measured that capture the most relevant information. The patterns of right and left hand movements are concentrated in the channels recorded over the sensorimotor area of the brain in the central lobe, and some of the other channels may be of no use for discriminating between the two motor tasks. Therefore, a minimum number of EEG channels were selected from the primary sensorimotor cortex area for further processing (Fig. 4). The right cerebral hemisphere controls the left side of the body and the left hemisphere controls the right side. It was found that left hand movement appears strongly on the C4 channel and right hand movement appears strongly on the C3 channel. These two channels were found to be sufficient to ensure a high level of classification accuracy, as they contain the most relevant information for discrimination [21]. Commonly used feature extraction techniques such as Fourier analysis have the serious drawback that transitory information is lost in the frequency domain. Investigating features in EEG signals requires a detailed time-frequency analysis. Wavelet analysis comes into play here, since wavelets allow decomposition into frequency components while keeping as much time information as possible [22–26]. Wavelets can determine whether a quick transitory signal exists and, if so, localize it. This property makes wavelets very useful for the study of EEG.

Fig. 4 EEG channel locations showing the selected channels

The wavelet transform is achieved by breaking up a signal into shifted and scaled versions of the original (or mother) wavelet. The mother wavelet ψ is scaled by the parameter s and translated by τ. The wavelet transform (WT) of a time-domain signal x(t) is defined as:

$$ \boxed{w(s,\tau ) = \int x(t)\,\psi_{s,\tau }^{*}(t)\;dt} $$
(4)

This wavelet transform is called the continuous wavelet transform (CWT). In our case both the input signal and the parameters are discrete, so the discrete version of the wavelet transform is used. To create the feature vector of each trial, the discrete wavelet transform (DWT) was applied. An efficient way to implement the DWT is with a digital filter bank using Mallat’s algorithm [27].

In this algorithm the original signal passes through two complementary filters, a low-pass and a high-pass filter, and the wavelet coefficients are quickly produced. This process is iterated to generate, at each level of decomposition, an approximation cA, which is the low-frequency component, and a detail cD, which is the high-frequency component (Fig. 5). The ability to extract features from the signal depends on an appropriate choice of the mother wavelet function. Different orders of the mother wavelet “Coiflet” were tried out to implement the wavelet decomposition (Fig. 6). Each decomposition level corresponds to a breakdown of the main signal into a frequency band. The low-frequency component is the most important part. It carries the information about the motor movement found in the μ rhythm (8–12 Hz). Therefore, the coefficients of the second-level approximation, cA2, were selected to form the feature vector of each trial. As wavelet coefficients have some redundancy, dimensionality reduction of the feature vectors is suggested as a preprocessing step before classification. This can lead to better classification results, as it keeps the minimum number of coefficients that are significant and discriminatory. This was achieved with PCA by projecting the coefficients onto the first n principal components (PCs), where n is much smaller than the dimensionality of the features. The number of PCs can be determined by examining the energy of the data. Here, PCA was performed by projecting the 150-dimensional patterns onto the first 44 PCs, which accounted for 99.98 % of the variability of the data.
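The energy-based choice of the number of principal components can be sketched as below. The function names `n_components_for_energy` and `project` are hypothetical, and the data are synthetic low-rank patterns rather than the wavelet coefficients of the study; only the dimensions (90 trials, 150 features) and the idea of thresholding the cumulative variance are taken from the text.

```python
import numpy as np

def n_components_for_energy(features, energy=0.9998):
    """Smallest number of PCs whose cumulative variance reaches `energy`.

    `features` has one trial per row and one feature per column.
    """
    centered = features - features.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    ratio = np.cumsum(singular_values ** 2) / np.sum(singular_values ** 2)
    n = int(np.searchsorted(ratio, energy)) + 1     # first index reaching the threshold
    return min(n, len(singular_values))

def project(features, n):
    """Project centered feature vectors onto the first n principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n].T

rng = np.random.default_rng(0)
latent = rng.standard_normal((90, 5))                # 5 underlying factors (assumption)
features = latent @ rng.standard_normal((5, 150))    # 90 trials, 150-dimensional patterns
features += 1e-3 * rng.standard_normal((90, 150))    # small additive noise

n = n_components_for_energy(features, energy=0.99)
reduced = project(features, n)
print(f"{features.shape[1]} dimensions reduced to {n}")
```

With real feature vectors the same procedure would be run with the desired energy threshold, for example the 99.98 % reported above.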

Fig. 5 Multi-level wavelet decomposition of EEG signal
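The filter-bank decomposition can be sketched in a few lines. Note the substitution: this sketch uses the two-tap Haar filters rather than the Coiflet wavelets of the study, because for Haar the filtering and downsampling by two reduce to pairwise sums and differences. The function names `dwt_step` and `wavedec` and the test signal are illustrative assumptions.

```python
import numpy as np

# Haar analysis filters (a stand-in for the Coiflet filters used in the study)
LOW_PASS = np.array([1.0, 1.0]) / np.sqrt(2.0)
HIGH_PASS = np.array([1.0, -1.0]) / np.sqrt(2.0)

def dwt_step(x):
    """One decomposition level: returns approximation cA and detail cD."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:                        # pad odd-length input by repeating the last sample
        x = np.append(x, x[-1])
    pairs = x.reshape(-1, 2)              # non-overlapping sample pairs
    return pairs @ LOW_PASS, pairs @ HIGH_PASS

def wavedec(x, levels):
    """Iterate dwt_step on the approximation; returns (cA_levels, [cD1, cD2, ...])."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx, detail = dwt_step(approx)
        details.append(detail)
    return approx, details

fs = 100                                  # sampling rate in Hz, as in the dataset
t = np.arange(0, 2.0, 1.0 / fs)           # one synthetic 2 s trial (200 samples)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)

cA2, (cD1, cD2) = wavedec(x, levels=2)
print(len(cA2), len(cD2), len(cD1))
```

At a sampling rate of 100 Hz, the level-2 approximation nominally covers 0–12.5 Hz, which is why cA2 contains the μ rhythm. With the PyWavelets package, a comparable Coiflet decomposition could be obtained with `pywt.wavedec(x, 'coif5', level=2)`, although the coefficient lengths would differ because of the longer filters and boundary handling.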

Fig. 6 The mother wavelet (Coiflet) with the different orders used in the decomposition

C. Classification

Classification was performed by training nonlinear feedforward neural networks using the standard backpropagation algorithm to minimize the squared error between the actual and the desired output of the network [28]. The use of nonlinear methods is useful when the data is not linearly separable. The network is used to develop a nonlinear classification boundary between the two classes in feature space in which each decision region corresponds to a specific class.

The network was implemented with 3 neurons in the hidden layer and a single neuron in the output layer whose output is mapped to a single value, 0 or 1 (Fig. 7). The target output during training was set to 0 or +1 to represent the two classes. When simulating new input data, an output value greater than or equal to 0.5 represents the first class and a value less than 0.5 represents the other. The data from one subject were divided into a training set and a test set using the “leave-k-out” cross-validation method. In this way the data were divided into k subsets of equal size, and the network was trained k times. Each time, one of the subsets was left out of training and used only for validation, while the remaining subsets were used for training. To estimate the true classification rate, the accuracy was averaged over all subsets.
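The cross-validation procedure can be sketched as follows. To keep the example short, a nearest-class-mean classifier stands in for the neural network. The classifier, the function names, the choice of k = 9, and the synthetic data are all assumptions made for illustration; only the 90 trials per subject come from the text.

```python
import numpy as np

def k_fold_indices(n_trials, k, seed=0):
    """Shuffle the trial indices and split them into k folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_trials), k)

def fit_centroids(X, y):
    """Stand-in classifier: store the mean feature vector of each class."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def predict_centroids(model, X):
    """Assign each row of X to the class with the nearest mean."""
    classes = np.array(sorted(model))
    dists = np.stack([np.linalg.norm(X - model[c], axis=1) for c in classes], axis=1)
    return classes[np.argmin(dists, axis=1)]

def cross_validate(X, y, folds, fit, predict):
    """Hold out each fold once, train on the rest, and average the accuracy."""
    accuracies = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train_idx], y[train_idx])
        accuracies.append(np.mean(predict(model, X[test_idx]) == y[test_idx]))
    return float(np.mean(accuracies))

rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 45)                                  # 90 synthetic trials
features = rng.standard_normal((90, 4)) + 3.0 * labels[:, None]

folds = k_fold_indices(len(labels), k=9)
accuracy = cross_validate(features, labels, folds, fit_centroids, predict_centroids)
print(f"mean accuracy over {len(folds)} folds: {accuracy:.3f}")
```

Every trial is used for validation exactly once, so the averaged accuracy is computed only on data that the corresponding model never saw during training.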

Fig. 7 The architecture of the feed-forward neural network used for classification
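A minimal sketch of such a network is given below: three sigmoid hidden neurons, one sigmoid output neuron, full-batch backpropagation on the mean squared error, and a 0.5 decision threshold. The activation functions, weight initialization, learning rate, number of epochs, and synthetic two-feature data are assumptions, as is the mapping of outputs at or above 0.5 to the class with target 1. The class name `TinyNet` is hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyNet:
    """Feed-forward network: n_inputs -> 3 sigmoid hidden units -> 1 sigmoid output."""

    def __init__(self, n_inputs, n_hidden=3, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = 0.5 * rng.standard_normal((n_inputs, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = 0.5 * rng.standard_normal((n_hidden, 1))
        self.b2 = np.zeros(1)

    def forward(self, X):
        self.hidden = sigmoid(X @ self.W1 + self.b1)
        return sigmoid(self.hidden @ self.W2 + self.b2).ravel()

    def loss(self, X, y):
        """Mean squared error between network output and target."""
        return float(np.mean((self.forward(X) - y) ** 2))

    def train_step(self, X, y, lr):
        """One full-batch gradient-descent step via backpropagation."""
        out = self.forward(X)
        # Error signal at the output pre-activation, shape (n_trials, 1)
        delta_out = ((2.0 / len(y)) * (out - y) * out * (1.0 - out))[:, None]
        # Error propagated back to the hidden pre-activations, shape (n_trials, 3)
        delta_hid = (delta_out @ self.W2.T) * self.hidden * (1.0 - self.hidden)
        self.W2 -= lr * (self.hidden.T @ delta_out)
        self.b2 -= lr * delta_out.sum(axis=0)
        self.W1 -= lr * (X.T @ delta_hid)
        self.b1 -= lr * delta_hid.sum(axis=0)

    def predict(self, X):
        """Threshold the output at 0.5 to obtain a class label (0 or 1)."""
        return (self.forward(X) >= 0.5).astype(int)

rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 45)                       # targets 0 and 1
features = rng.standard_normal((90, 2)) + np.where(labels[:, None] == 1, 1.5, -1.5)

net = TinyNet(n_inputs=2)
initial_loss = net.loss(features, labels)
for _ in range(2000):
    net.train_step(features, labels, lr=0.5)
final_loss = net.loss(features, labels)
predictions = net.predict(features)
print(f"squared error: {initial_loss:.3f} -> {final_loss:.3f}")
```

In practice the training and evaluation would be wrapped in the cross-validation loop so that the reported accuracy is measured on held-out trials rather than on the training data as in this sketch.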

3 Results and Discussion

To evaluate the performance of our system, EEG data acquired from a motor experiment were processed. The subject imagined pressing a keyboard key with the left or right hand, synchronized with a received command. A total of 90 trials were carried out in the experiment for each subject. Table 1 shows the results of all classification experiments as the average percentage of test patterns classified correctly using the “leave-k-out” cross-validation method.

Table 1 Classification accuracy of imagined right and left hand movements for 6 subjects obtained using the different orders of the mother wavelet “Coiflet”

The results presented above show that the best results were obtained with the mother wavelet “Coiflet” of order 5. As there is no well-defined rule for selecting a wavelet basis function for a particular application or analysis, different wavelets were tried out. For a more precise choice, however, the properties of the wavelet function should be matched to the characteristics of the signal to be analyzed, which was the case for the “Coiflet” wavelet of order 5. Other wavelet families, such as Haar, Daubechies, and Symlet, can also be applied. Since the classification accuracy is sensitive to artifacts contaminating the EEG, our processing was performed on artifact-free EEG signals. To improve the performance of the system in real-time applications, the artifacts should ideally be removed using automatic methods [29]. The results also indicate that the two channels C3 and C4 of the sensorimotor cortex area are sufficient to ensure high classification rates. The studies of Müller-Gerking et al. [30] and Ramoser et al. [31] give strong evidence that further increasing the number of channels used can increase the classification accuracy. Other work on the same dataset, using a hierarchical multi-method approach based on spatio-temporal pattern analysis, achieved a classification accuracy of up to 81.48 % [32].

4 Conclusions

With the advances in DSS, expert systems (ES) and machine learning, these tools are now used in many application domains, and the medical field is one of them. Classification systems used in medical decision making allow medical data to be examined in less time and in more detail. Recently, there has been great progress in the development of novel computational intelligence techniques for recording and monitoring the EEG signal. Brain-computer interface technology involves monitoring the brain’s electrical activity using electroencephalogram (EEG) signals and using digital signal processing (DSP) techniques to detect the characteristic EEG patterns that the user produces in order to communicate. The proposed wavelet-based processing technique leads to satisfactory classification rates for imagined hand movements. The results are promising and show the suitability of the technique for this application. It is hoped that the system can be realized in a real-world environment and that the BCI challenges of accuracy and speed can be overcome.