1 Introduction

A brain-computer interface (BCI) provides a communication system in which an individual can send messages to an external device (e.g. a computer) without using the body’s muscular pathways. The person’s intention to control or communicate initiates brain activity, and the patterns in that activity can be detected from electrophysiological signals [1, 2]. Patients suffering from high spinal cord injuries (HSCI), amyotrophic lateral sclerosis (ALS), brainstem stroke, cerebral palsy or other neural disorders have difficulty with communication and with the control of neural prosthetics. BCI aims to resolve these difficulties and raise such patients’ standard of living [3,4,5].

Different invasive and non-invasive electrophysiological signal recording methods are used to detect brain activity. Electroencephalography (EEG) signals are the most studied because EEG is non-invasive and portable. Other non-invasive methods include magnetoencephalography (MEG), positron emission tomography (PET), functional magnetic resonance imaging (fMRI), and optical imaging. These methods have expensive setup and equipment costs, are technically more demanding, have longer time constants and are less suitable for rapid communication [6, 7].

When recording EEG signals, most of the electrodes are not tightly attached to the scalp; therefore many environmental, electromagnetic (EM) and other surrounding sources contribute to the high noise-to-signal ratio of EEG signals. The electrooculography (EOG) artifact is the most disturbing artifact corrupting the EEG signal because the frequency range of EOG activity overlaps with that of the EEG [1, 8,9,10]. An important aspect of a BCI system is, therefore, to minimize the noise-to-signal ratio and remove the non-CNS artifacts from the signals. A high dimensional feature set also reduces the accuracy of the BCI classifier by contributing more noise.

For good classification of mental tasks, a BCI system needs to give more importance to the preprocessing and feature selection units, but constructing an efficient and less complicated system is still a goal to achieve. Different machine learning and signal processing techniques are being explored for the preprocessing stage [11,12,13]. These are quite time consuming because of heavy computations, whereas the main target when developing a BCI system should be good accuracy with a near-instantaneous response. We therefore present a simple yet efficient model for EEG based BCI to detect motor imagery.

The main objective of our research is to improve the feature selection stage for better classification of motor imagery by EEG based BCI. The proposed system takes three different datasets and processes them through several experiments. It reads the recorded EEG signals, handles the missing (NaN) values, separates the data into training and testing sets, low-pass filters the signals, and applies RFECV feature selection. The system then classifies the data, and the experiments are compared to arrive at an appropriate model.

In the next section we discuss our proposed methodology and its implementation on our main dataset. Section 3 presents the experiments on the remaining two datasets to test our model. Section 3.2 discusses the results of the implementation, and finally Sect. 4 concludes the paper.

2 Proposed System Model

This section presents the BCI design, which is tested through different experiments performed on BCI motor-imagery datasets. The main component of the proposed research is the Butterworth low-pass filter bank method for feature construction, combined with “Recursive Feature Elimination with Cross Validation (RFECV)” for feature selection. The model is also applied to the Graz 2A and Graz 2B datasets [14], and a random forest classifier keeps it simple and robust. To evaluate the system, we use the Receiver Operating Characteristic (ROC) and Area Under Curve (AUC) as the evaluation metric for the ‘Grasp-And-Lift’ dataset [15] and Cohen’s kappa score for the Graz 2A and Graz 2B datasets [16]. The proposed model, shown in Fig. 1, is designed to meet the demands of the Graz 2A and Graz 2B datasets.

Fig. 1. BCI generalized proposed model

2.1 Design and Implementation

Before going into the design and implementation, there is an important point shown in the task video of the dataset [15]. The subjects watched a light bulb constantly; when the light bulb glows, the subject performs the hand movements accordingly. When the bulb glows, a visual evoked potential (VEP) is generated in the EEG signal. These potentials occur just before the hand movement [15].

We also performed a basic experiment to analyze the “Grasp-and-Lift” BCI dataset. Using a simple train-and-test approach, it achieved an accuracy of 0.73 in terms of ROC AUC, as shown in Fig. 2. This experiment showed that an accuracy of 0.73 is achieved merely by dividing the six-class problem into two-class problems.

2.2 Classification Method Used

The classifier is trained for each subject with the following steps:

  • For each subject S, there are 6 different classes to detect

  • Combine all the series data and event data for subject S in X and Y

  • Combine all the test series data for subject S in U

  • Train the classifier on each class m, m = 1, …, 6, using X as training data and Y as target labels, and predict the class label P for U

  • Compute the ROC AUC for all 6 classes and the micro-average ROC AUC over all classes.
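The steps above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function name `train_and_score`, the tree count, and the availability of true labels `Y_true` for the test data U are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score


def train_and_score(X, Y, U, Y_true, n_estimators=50, seed=0):
    """Train one binary random forest per class and score it on U.

    X: (n_train, n_features) training data
    Y: (n_train, n_classes) binary target labels, one column per class
    U: (n_test, n_features) test data
    Y_true: (n_test, n_classes) held-out labels for U (assumed available)
    """
    n_classes = Y.shape[1]
    P = np.zeros((U.shape[0], n_classes))
    for m in range(n_classes):
        # Each class is treated as its own two-class problem.
        clf = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
        clf.fit(X, Y[:, m])
        # Probability of the positive class, used as the ROC score.
        P[:, m] = clf.predict_proba(U)[:, 1]
    per_class_auc = [roc_auc_score(Y_true[:, m], P[:, m]) for m in range(n_classes)]
    # Micro-average: pool every (sample, class) decision into one ROC curve.
    micro_auc = roc_auc_score(Y_true.ravel(), P.ravel())
    return P, per_class_auc, micro_auc
```

Each column of Y must contain both positive and negative samples in the training data, otherwise `predict_proba` returns a single column and the indexing fails.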

Fig. 2. EEG segment for channel Cz

2.3 Preprocessing and Feature Construction

To deal with the high frequency components of the EEG signal, a low-pass filter should be used. We therefore used a digital Butterworth low-pass filter with a low cutoff frequency to attenuate them. The EOG artifact due to eye movement and eye blinking also lies in the low frequency range of 0–4 Hz. Figures 3, 4 and 5 show the effect of choosing different cutoff frequencies. Figure 6 shows the boost in classification accuracy for the approach described in the previous section, which makes use of all feature columns. Later, unimportant filtered features can be excluded from the feature set by a suitable feature selection method.
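As an illustration of the filtering step, a digital Butterworth low-pass filter can be sketched with SciPy as below. The helper name `lowpass`, the default order of 5, and the use of a causal second-order-sections filter are assumptions for this example; the sampling rate must be supplied by the caller.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def lowpass(signal, cutoff_hz, fs_hz, order=5):
    """Apply a causal Butterworth low-pass filter along the time axis.

    signal: array of shape (n_samples,) or (n_samples, n_channels)
    cutoff_hz: cutoff frequency in Hz
    fs_hz: sampling rate in Hz
    """
    # Second-order sections stay numerically stable for very low cutoffs,
    # where the (b, a) polynomial form can become ill-conditioned.
    sos = butter(order, cutoff_hz, btype="low", fs=fs_hz, output="sos")
    return sosfilt(sos, np.asarray(signal, dtype=float), axis=0)
```

A causal filter only uses past samples, which suits online use but introduces a delay; a zero-phase alternative such as `scipy.signal.sosfiltfilt` would remove the delay at the cost of needing future samples.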

2.4 Feature Selection

We filter the potential features with low-pass filter banks at five different cutoff frequencies between 0–5 Hz. This multiplies the number of feature columns, and not all of them are important. To overcome this problem, we implemented the automated RFECV feature selection method. As the random forest classifier is used for classification, we also used it as the estimator within RFECV, with the accuracy score for cross-validation scoring and K = 3 folds. There are numerous ways in which the RFECV method can be used to find optimal features.
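A minimal sketch of this selection step with scikit-learn follows, using a random forest as the estimator, accuracy scoring and 3-fold cross-validation as stated above. The helper name `select_features`, the tree count, the elimination step of one feature per iteration, and the synthetic demo data are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV


def select_features(X, y, n_estimators=50, seed=0):
    """Fit RFECV with a random forest and return the fitted selector."""
    selector = RFECV(
        estimator=RandomForestClassifier(n_estimators=n_estimators, random_state=seed),
        step=1,              # drop one feature per elimination round (assumed)
        cv=3,                # K = 3 folds
        scoring="accuracy",  # cross-validation scoring
    )
    selector.fit(X, y)
    return selector


# Synthetic demo: only the first two of eight columns carry label information.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(150, 8))
y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)
selector = select_features(X_demo, y_demo, n_estimators=20)
print(selector.n_features_, selector.support_)
```

The fitted selector exposes the boolean mask of kept columns in `support_`, and `selector.transform(X)` reduces a feature matrix to those columns.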

We chose options 2 and 5 to run our model with the RFECV method because option 1 might not give the optimal features for classification. Each channel has its own importance for a particular task, so option 3 is also out of consideration, and option 4 will not give a true generalization of trials. Option 5 gives the best results of all, but the RFECV method takes considerable time to run.

Fig. 3. Butterworth with filtering f = 0.4

Fig. 4. Butterworth with filtering f = 1

Fig. 5. Butterworth with filtering f = 2

Fig. 6. ROC AUC with the Butterworth low-pass filter applied to all columns; the ROC AUC increased from 0.73 to 0.83

We then filtered all 32 EEG channels with digital Butterworth low-pass filter banks at 5 cut-off frequencies f = [0.5,1.5,2,3,20], giving a total of 32 \(\times \) 5 = 160 features in the feature set. When the RFECV method is applied with option 2, 14 optimal features are selected from the 160 with 90% accuracy; Fig. 7 shows the ROC curve using the RFECV method with option 2 for the ‘Grasp-and-Lift’ dataset. We then applied the RFECV selection method with option 5 in the same way. Again the 32 EEG channels are filtered with 5 low-pass filter banks of order 5 at f = [0.5,1.5,2,3,20], so there is a total of 160 features before RFECV feature selection. Figure 8 shows the ROC curve with option 5.
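The feature construction described here, one filtered copy of every channel per cut-off frequency, can be sketched as follows. The function name `filter_bank_features` and the column ordering (all channels for the first cut-off, then all channels for the next) are assumptions for illustration; the sampling rate is a parameter the caller must supply.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def filter_bank_features(eeg, cutoffs_hz, fs_hz, order=5):
    """Build low-pass filter bank features.

    eeg: (n_samples, n_channels) raw EEG
    Returns an array of shape (n_samples, n_channels * len(cutoffs_hz)).
    """
    eeg = np.asarray(eeg, dtype=float)
    bands = []
    for fc in cutoffs_hz:
        # One order-5 Butterworth low-pass filter per cut-off frequency.
        sos = butter(order, fc, btype="low", fs=fs_hz, output="sos")
        bands.append(sosfilt(sos, eeg, axis=0))
    # Stack the filtered copies side by side as feature columns.
    return np.hstack(bands)
```

With 32 channels and the five cut-offs listed above, the result has 32 × 5 = 160 columns, matching the feature count given in the text.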

Fig. 7. ROC & AUC using Option 2

Fig. 8. ROC & AUC using Option 5

Option 5 gives the best accuracy of 91% with RFECV automated feature selection on the Grasp-And-Lift dataset, compared with about 90% for option 2, but it takes six times longer to run. This is because for option 5 the RFECV method is executed for each class separately, i.e. six times, so that optimal features are found for each class individually; these are presented in Table 1.

Table 1. Optimal features selection using RFECV

3 Experimental Study

3.1 Implementation on Graz 2A BBCI Dataset

To confirm the performance of the developed system, we tested it on more datasets, starting with the Graz dataset 2A from BBCI competition IV [16]. It is a four-class problem for the detection of motor imagery movements of the left hand (class 1), right hand (class 2), both feet (class 3) and tongue (class 4). For each subject, out of 72 trials per class, 50 trials are used for training and 22 trials are used for evaluation. For this dataset, we performed two experiments using our proposed model.

Experiment 1

Digital Butterworth low-pass filter banks of order 5 with 4 cut-off frequencies f = [0.5,1.5,2,3] are used, so 22 \(\times \) 4 = 88 features are given to RFECV for feature selection. Figure 9 plots the cross-validation score against the number of selected features for this dataset, which identifies the optimal number of features. Table 2 summarizes the results of experiment 1 with our approach for the Graz 2A dataset, and Fig. 10 shows the average ROC AUC for experiment 1.

Fig. 9. Exp-1 features for Graz 2A

Fig. 10. Exp-1 ROC & AUC on Graz 2A

We also computed the kappa score for all subjects. The mean kappa score for experiment 1 is 0.2155, which is quite promising and competitive with the top five participants while using 50% less data for each subject.
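For readers unfamiliar with the metric, Cohen's kappa is (p_o − p_e) / (1 − p_e), where p_o is the observed accuracy and p_e the agreement expected by chance. When the true classes are balanced, as assumed in this sketch, p_e equals 1 / (number of classes). The helper names below are hypothetical; `cohen_kappa_score` is the scikit-learn function.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score


def kappa_from_accuracy(accuracy, n_classes):
    """Kappa for balanced true classes, where chance agreement is 1 / n_classes."""
    chance = 1.0 / n_classes
    return (accuracy - chance) / (1.0 - chance)


def kappa_scores(y_true, y_pred, n_classes):
    """Return (kappa from scikit-learn, kappa derived from plain accuracy)."""
    accuracy = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
    return cohen_kappa_score(y_true, y_pred), kappa_from_accuracy(accuracy, n_classes)
```

Under the balanced-class assumption the two values agree, so a kappa score on a four-class problem can be read as accuracy rescaled so that chance level (25%) maps to 0 and perfect classification maps to 1.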

Table 2. Result summary of Experiment 1 & 2 on GRAZ2A

Experiment 2

For artifact processing we implemented a linear regression based artifact removal method for experiment 2 on the Graz 2A BBCI dataset. When this EOG artifact removal method is applied to subject A01, the fluctuations in the EOG channels that indicate eye blinking are removed from the corrected signal. Table 2 presents the summary of experiment 2 on the Graz 2A dataset using the artifact-processing unit.
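One common form of regression based EOG removal estimates how strongly each EOG channel leaks into each EEG channel by least squares and subtracts that contribution. The sketch below assumes this standard formulation; the paper does not give its exact variant, and the function name `remove_eog` is hypothetical.

```python
import numpy as np


def remove_eog(eeg, eog):
    """Subtract the linearly predicted EOG contribution from the EEG.

    eeg: (n_samples, n_eeg_channels) contaminated EEG
    eog: (n_samples, n_eog_channels) simultaneously recorded EOG
    Returns (corrected EEG, propagation coefficients of shape
    (n_eog_channels, n_eeg_channels)).
    """
    eeg = np.asarray(eeg, dtype=float)
    eog = np.asarray(eog, dtype=float)
    # Least-squares fit of eeg ≈ eog @ coeffs.
    coeffs, *_ = np.linalg.lstsq(eog, eeg, rcond=None)
    corrected = eeg - eog @ coeffs
    return corrected, coeffs
```

The coefficients are usually estimated on a calibration segment and then reused on later data, which keeps the correction cheap enough for online use.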

3.2 Implementation on Graz 2B BBCI Dataset

The Graz 2B dataset is a two-class problem for the detection of right and left hand motor imagery movements. For each of the nine subjects (B01 to B09), 5 sessions were recorded, of which 3 are meant for training and 2 for testing/evaluation. We performed two experiments on this Graz 2B BBCI dataset using our proposed model.

Experiment 1

For the first experiment on the Graz 2B dataset we used the same model as in experiment 1 on the Graz 2A dataset. The only difference is that there are only 3 EEG channels (C3, Cz and C4) instead of 22, and all 3 contribute positively to the classification of brain activity. Like the Graz 2A dataset, this dataset also has missing (NaN) values and recorded data for idle/other events. We then applied the digital Butterworth low-pass filter banks of order 5 at seven cut-off frequencies, i.e. f = [1,2,3,4,7,9,20].

Experiment 2

For experiment 2 the missing values are resolved by an averaging method. The signal is then corrected by the linear regression based artifact removal method and filtered with Butterworth low-pass filter banks of order 5 at cut-off frequencies f = [1,2,3,4,7,9,20], just as in experiment 1. For experiment 2 on the Graz 2B dataset, we achieved a mean Cohen’s kappa score of 0.61 and a ROC AUC of about 93%, as shown in Figs. 11 and 12. The kappa score achieved in this experiment is quite promising.
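The text does not specify which averaging method is used for the missing values. One simple reading, assumed here purely for illustration, replaces each NaN with the mean of the valid samples in the same channel; the function name is hypothetical.

```python
import numpy as np


def fill_nan_with_channel_mean(eeg):
    """Replace NaN samples with the mean of the valid samples in their channel.

    eeg: (n_samples, n_channels); each channel needs at least one valid sample.
    """
    filled = np.array(eeg, dtype=float, copy=True)
    channel_means = np.nanmean(filled, axis=0)
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = channel_means[cols]
    return filled
```

Averaging the neighbouring valid samples or interpolating over the gap are alternative readings that preserve more of the local signal shape.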

CSP is widely used for EEG based BCI systems [17], and it shows good results. For benchmarking we applied CSP to the same datasets and kept the random forest as the classifier. A detailed discussion of the experiments on all three datasets is presented below. Table 3 summarizes the results obtained using our proposed approach as compared to the CSP method.
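For context, a common textbook formulation of two-class CSP is sketched below: spatial filters are obtained from a generalized eigenvalue problem on the class covariance matrices, and the normalized log-variance of the filtered trials serves as the feature vector for the classifier. This is not necessarily the exact variant used for the benchmark; the function names and the number of filter pairs are assumptions.

```python
import numpy as np
from scipy.linalg import eigh


def csp_filters(trials_a, trials_b, n_pairs=1):
    """Compute CSP spatial filters for two classes.

    trials_a, trials_b: arrays of shape (n_trials, n_channels, n_samples)
    Returns filters of shape (2 * n_pairs, n_channels).
    """
    def mean_cov(trials):
        # Trace-normalized spatial covariance, averaged over trials.
        covs = [t @ t.T / np.trace(t @ t.T) for t in trials]
        return np.mean(covs, axis=0)

    cov_a, cov_b = mean_cov(trials_a), mean_cov(trials_b)
    # Solve cov_a w = lambda (cov_a + cov_b) w; eigenvalues come back ascending.
    vals, vecs = eigh(cov_a, cov_a + cov_b)
    # Smallest eigenvalues favour class b variance, largest favour class a.
    idx = np.r_[np.arange(n_pairs), np.arange(len(vals) - n_pairs, len(vals))]
    return vecs[:, idx].T


def csp_features(trials, filters):
    """Normalized log-variance of each spatially filtered trial."""
    projected = np.einsum("fc,tcs->tfs", filters, trials)
    var = projected.var(axis=2)
    return np.log(var / var.sum(axis=1, keepdims=True))
```

The resulting feature matrix, one row per trial, can be passed to any classifier, for example a random forest as in the benchmark described above.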

Fig. 11. Exp-1 features for Graz 2B

Fig. 12. Exp-1 ROC & AUC on Graz 2B

Table 3. Summary of results obtained using proposed approach

4 Conclusion and Future Work

To compensate for the noise and artifacts of EEG signals, this paper presents an improved model for feature construction and feature selection and hence provides a more efficient BCI system to classify motor imagery. For the ‘Grasp-And-Lift’ challenge we increased the accuracy from 73% to 91% using our proposed model with 25% less data for training. For the Graz 2A and 2B datasets, we achieved kappa scores of 0.42 and 0.61 respectively, using 50% and 40% less data. For all our experiments, we used a relatively simple random forest classifier. The promising results achieved with our proposed feature construction and feature selection model show its potential for online experimentation. Further work will focus on online experiments to minimize the noise and improve the efficiency of the system.