Abstract
We propose a novel method for the extraction of discriminative features in electroencephalography (EEG) evoked potential latency. Based on our offline results, we present evidence indicating that a full surround sound auditory brain–computer interface (BCI) paradigm has potential for an online application. The auditory spatial BCI concept is based on an eight-directional audio stimulus delivery technique, developed by our group, which employs a loudspeaker array in an octagonal horizontal plane. The stimuli presented to the subjects vary in frequency and timbre. To capture brain responses, we utilize an eight-channel EEG system. We propose a methodology for finding and optimizing evoked response latencies in the P300 range so that they can later be classified correctly to identify the subject’s chosen targets and ignored non-targets. To accomplish this, we propose a feature selection approach based on an analysis of variance. Finally, we identify the subjects’ intended commands with a Naive Bayesian classifier. The results obtained with ten subjects in offline BCI experiments support our research hypothesis by providing higher classification accuracies and an improved information transfer rate compared with state-of-the-art solutions.
Introduction
A brain–computer interface (BCI) is designed to establish a communication link between the human brain and a computer [1]. A BCI does not depend on muscle or peripheral nervous system activity. In particular, a BCI could help patients suffering from amyotrophic lateral sclerosis (ALS) to communicate or to complete various daily tasks, including controlling a computer or typing messages on a virtual keyboard. In recent years, auditory BCI (aBCI) has become a topic of great interest in computational auditory neuroscience. The aBCI utilizes human auditory pathway responses and allows users to operate external devices based on auditory evoked responses to sound stimuli. We utilize a spatial auditory paradigm to create a new aBCI system with which users consciously direct their attention to different locations in a horizontal surround sound environment with various noise stimuli, as depicted in Fig. 1. The spatial aBCI concept is founded on a basic feature of the human auditory pathway, which is very sensitive to changes in the location of spatial auditory sources [2]. The auditory pathway also has a very good temporal resolution, which is an additional feature we would like to utilize in the spatial aBCI design. This makes it possible to reduce the inter-stimulus intervals (ISI) of the presented sounds in comparison with vision-based applications [3]. Contemporary applications have thus far not used loudspeakers placed behind the head, a setting postulated as optimal but not fully realized in Schreuder et al. [4]. Our proposal includes sound presentation from rear loudspeakers, thus realizing an eight-command BCI (octagonal horizontal plane full surround sound set-up). In previous publications, we discussed EEG electrode selection, event-related potential (ERP) feature optimization and linear discriminant analysis classification [5]. These earlier results were not sufficient for a successful online aBCI system application.
In this paper, we discuss a novel aBCI paradigm that uses full surround sound horizontal stimuli as an informative cue and relies on the P300 component (known as the “aha response”), which is evoked by the expected/instructed targets at latencies around and after 300 ms [6]. Our hypothesis is that a significant ERP response will be found when subjects attend to the target direction and ignore the non-targets. To find the significant differences, we propose analyzing the response statistically to identify only those ERP latencies that contribute to the classification enhancement, in contrast to state-of-the-art approaches [1], in which the whole response is taken as a feature for subsequent classification.
The hypothesis of our research is that the horizontal full surround sound aBCI paradigm could be improved with the careful selection of ERP discriminative features that allows the use of the rear-to-the-head sound directions. For this purpose, we introduce a statistical response analysis, which ultimately leads to the final improvement in the information transfer rate (ITR).
The paper is organized as follows. In the next section, the experimental paradigm is explained together with the EEG preprocessing steps. Then, we discuss EEG feature selection using the method of statistical analysis of ERP responses. Finally, we present classification results obtained with a Gaussian Naive Bayesian Classifier (GNBC), which leads to an improvement in the ITR scores. The final section summarizes the paper.
Methods
Within the framework of the proposed novel aBCI paradigm, the subjects were asked to attend to and count targets while ignoring non-targets, as in the classical oddball paradigm [1, 6]. An instruction indicating which direction should be attended to in each trial was shown on a computer display located in front of the subject. First, we conducted psychophysical experiments to check for possible preferred directions of the subjects by comparing response time delays. Next, we conducted EEG recording experiments in an offline BCI setting. The EEG signals were recorded with a g.MOBILab+ EEG amplifier by g.tec. We used novel dry g.SAHARA EEG electrodes to further improve the subjects’ comfort, since these do not require conductive gel. The reference and ground electrodes were attached behind the left and right ears, respectively. To prevent degradation of the ERP responses by electromyography (EMG) noise related to muscular movement, the subjects were asked to minimize eye blinking as well as facial and body movements during the experiments.
EEG experiments designed to validate the proposed spatial aBCI paradigm utilizing the P300 latency were conducted in the Multimedia Lab at the Life Science Center of TARA, University of Tsukuba, Japan. All the experimental procedures and study targets were explained to the subjects, who agreed to participate voluntarily. The experiments were conducted in agreement with the WMA Declaration of Helsinki—Ethical Principles for Medical Research Involving Human Subjects. All the experiments were conducted in a silent and low reverberation room in order to limit any interference from environmental acoustic noise.
The auditory stimuli were presented through eight loudspeakers in an octagonal setting, as depicted in the upper part of Fig. 1. Eight sound stimulus directions proved to be optimal from the points of view of aBCI and human subject spatial auditory performance [4].
Two short noise bursts, one white and one pink, were used as stimuli, as depicted in the lower part of Fig. 1 and described in the following section.
Psychophysical Experiment
In the psychophysical experiment, only the behavioral responses (button presses after the instructed and perceived target stimuli) were recorded. Different response time delays would suggest that cognitive load and task difficulty change as a function of spatial direction. The subjects were requested to press a button immediately after a stimulus was presented from the instructed target direction. The response delays relative to the auditory stimulus onsets were recorded and compared across the spatial directions.
The results of the psychophysical experiment are presented in Tables 1 and 2. From the tests conducted, we conclude that the response delays did not differ among the eight spatial sound stimulus locations (differences among means were statistically non-significant in pairwise t tests) for either white or pink noise, which also confirms the psychophysical experiments reported in Schreuder et al. [4].
The Offline aBCI Experiment Protocol
The experimental hypothesis was that we would be able to distinguish from the ERP shape (mainly based on the P300 response latencies) which direction the subject attended to in the spatial auditory paradigm experiment.
To test the hypothesis, we conducted a series of EEG recording experiments in the offline BCI mode (with no instant feedback or classification results given to the subjects [1]). The EEG recording experiments were conducted with ten healthy subjects (eight males; two females; age range from 23 to 42 years, mean 25.8, SD 6.34). The subjects were requested to sit in a comfortable chair in the center of eight octagonally positioned loudspeakers, and the dry EEG electrodes were positioned on the scalp. The elevation of the loudspeakers was fixed at the subject’s ear level in order to create a horizontal spatial plane defined by the eight loudspeakers (see Fig. 1). The volume of the sound was set to 72 dB.
The sound stimuli were presented in random order, one at a time, each from a single loudspeaker (a single trial consisted of the delivery of a single target and seven non-targets). We employed two broadband noise stimulus types that allowed us to utilize the two spatial localization mechanisms of the human auditory pathway, the interaural time delay (ITD) and the interaural level difference (ILD) [2]. The white and pink noise stimuli were both 30 ms long with 5-ms linear attack and sustain intervals. For each subject and each stimulus, we performed eight sessions (altogether 64 targets and 448 non-targets were presented). Each subject was requested to focus on the instructed target direction, which was presented on a computer display, and to ignore the other non-target directions. Each subject was also requested to control her/his eye movements to decrease unnecessary EMG noise during the experiments. Before each experiment, the subject was allowed a short practice session to become familiar with the spatial auditory stimulus conditions.
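The stimulus counts above can be checked with a minimal sketch of the presentation sequence. The text does not state how trials were distributed over sessions, so the values of `TRIALS_PER_SESSION` and the rule of one target direction per session are assumptions chosen only to reproduce the reported totals; the function name `build_sessions` is hypothetical.

```python
import random

N_DIRECTIONS = 8         # loudspeakers in the octagonal array (from the text)
N_SESSIONS = 8           # sessions per subject and stimulus type (from the text)
TRIALS_PER_SESSION = 8   # assumption: gives 8 x 8 = 64 targets in total


def build_sessions(seed=0):
    """Return sessions -> trials -> (direction, is_target) pairs.

    Each trial presents all eight directions once, in random order,
    so it contains exactly one target and seven non-targets.
    """
    rng = random.Random(seed)
    sessions = []
    for session in range(N_SESSIONS):
        # Assumption: each session has one instructed target direction.
        target = session % N_DIRECTIONS
        trials = []
        for _ in range(TRIALS_PER_SESSION):
            order = list(range(N_DIRECTIONS))
            rng.shuffle(order)  # one stimulus at a time, random order
            trials.append([(d, d == target) for d in order])
        sessions.append(trials)
    return sessions


sessions = build_sessions()
events = [ev for s in sessions for t in s for ev in t]
n_targets = sum(is_target for _, is_target in events)
n_nontargets = len(events) - n_targets
print(n_targets, n_nontargets)  # 64 448
```

Under these assumptions the sequence contains 512 stimuli per subject and stimulus type, which matches the 64 targets and 448 non-targets reported in the text.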
EEG Acquisition
The EEG signals were recorded by the g.MOBILab+ bio-amplifier with eight dry g.SAHARA electrodes. The EEG recording system captured the neurophysiological signals in a frequency range of 0.1–40.0 Hz. The following eight EEG electrode positions were chosen: P3, P4, P5, P6, Cz, CPz, Pz, and POz, as in the 10/10 system [7]. The eight EEG channels were sampled at 256 Hz and stored using a custom application programmed in the MATLAB and Simulink environments.
EEG Response Analysis
The analysis of EEG ERP responses, leading to the final eight-direction spatial auditory classification for target and non-target locations, was composed of the following three steps:
1. EEG signal preprocessing: band-pass filtering, epoch segmentation, and artifact rejection;
2. Discriminative feature extraction using the analysis of variance (ANOVA) method;
3. The final classification of evoked responses using the GNBC.
We describe the above steps in detail in the following sections.
EEG Preprocessing
First, we digitally filtered the signals with two fifth-order Butterworth filters, a high-pass filter with a cut-off frequency of 0.5 Hz and a low-pass filter with a cut-off frequency of 25 Hz. The low-pass filtering removed possible muscle-activity-related artifacts. The high-pass filtering removed the direct current-related drifts of the EEG signals, as well as slow eye movement artifacts.
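The filtering step can be sketched as follows in Python with SciPy. This is an illustration only: the recordings in this study were processed in MATLAB, the function name `bandlimit` is hypothetical, and the use of zero-phase (forward-backward) filtering is an assumption, since the text does not say whether the filters were applied causally.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 256.0  # sampling rate in Hz (from the EEG acquisition description)


def bandlimit(eeg, fs=FS, hp_cut=0.5, lp_cut=25.0, order=5):
    """High-pass, then low-pass an array of shape (n_samples, n_channels)."""
    sos_hp = butter(order, hp_cut, btype="highpass", fs=fs, output="sos")
    sos_lp = butter(order, lp_cut, btype="lowpass", fs=fs, output="sos")
    # Assumption: zero-phase filtering, so that ERP latencies are not
    # shifted by the filter delay. The text does not specify this.
    out = sosfiltfilt(sos_hp, eeg, axis=0)
    return sosfiltfilt(sos_lp, out, axis=0)


# Synthetic check: DC offset + 10 Hz component + 60 Hz interference.
t = np.arange(0, 20.0, 1.0 / FS)
wanted = np.sin(2 * np.pi * 10 * t)
raw = 50.0 + wanted + np.sin(2 * np.pi * 60 * t)
clean = bandlimit(raw[:, None])[:, 0]

# Away from the edges, only the 10 Hz component should remain.
middle = slice(int(5 * FS), int(15 * FS))
residual = np.max(np.abs(clean[middle] - wanted[middle]))
```

Second-order sections are used here because they are numerically more robust than transfer-function coefficients for a 0.5 Hz cut-off at a 256 Hz sampling rate.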
Next, the EEG signals were segmented into ERP-related epochs. Each epoch started 100 ms before stimulus onset and ended after 700 ms. The 100 ms pre-stimulus interval was used as the baseline (see Fig. 2).
In the next step, eye movement artifacts were rejected. Auditory spatial stimuli are known to cause uncontrolled eye movements in subjects [8]; in the current approach, epochs affected by them were removed using a threshold of 80 μV (a signal amplitude above the usual EEG activity). The rejected epochs were not processed further, since the emphasis of the current approach was on the validation of the spatial paradigm. In the following sections, feature extraction and ERP classification results are introduced.
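The segmentation, baseline correction, and threshold rejection can be sketched as below. Several details are assumptions made for illustration: the epoch is taken to end 700 ms after stimulus onset, the 80 μV threshold is applied to the absolute baseline-corrected amplitude on any channel, and the function name `extract_epochs` is hypothetical.

```python
import numpy as np

FS = 256  # sampling rate in Hz


def extract_epochs(eeg, onsets, fs=FS, pre_s=0.1, post_s=0.7, reject_uv=80.0):
    """Cut epochs around stimulus onsets, baseline-correct, reject artifacts.

    eeg:    array (n_samples, n_channels), in microvolts
    onsets: stimulus onset positions as sample indices
    Returns (epochs, kept): epochs has shape (n_kept, n_epoch_samples,
    n_channels) and kept lists the indices of the accepted onsets.
    """
    pre = int(round(pre_s * fs))
    post = int(round(post_s * fs))
    epochs, kept = [], []
    for i, onset in enumerate(onsets):
        if onset - pre < 0 or onset + post > eeg.shape[0]:
            continue  # epoch would run past the recording
        ep = eeg[onset - pre:onset + post].astype(float)
        ep = ep - ep[:pre].mean(axis=0)   # pre-stimulus baseline per channel
        if np.abs(ep).max() > reject_uv:  # assumption: any channel, abs value
            continue
        epochs.append(ep)
        kept.append(i)
    shape = (len(epochs), pre + post, eeg.shape[1])
    return np.array(epochs).reshape(shape), kept


# Synthetic recording: 10 s of low-amplitude noise on eight channels,
# with one large deflection placed inside the second epoch.
rng = np.random.default_rng(0)
eeg = rng.normal(0.0, 5.0, size=(FS * 10, 8))
eeg[950, 3] = 500.0
epochs, kept = extract_epochs(eeg, onsets=[300, 900, 1500])
print(epochs.shape, kept)  # (2, 205, 8) [0, 2]
```

At 256 Hz the epoch spans 26 pre-stimulus and 179 post-stimulus samples, so each accepted epoch has 205 samples per channel.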
ERP Feature Extraction Using ANOVA of the ERP Latencies
The aim here was to optimize the EEG response domain (mainly the P300 response) in order to provide better separability for subsequent classification. To do this, we conducted an ANOVA of the two-class single-trial ERP distributions (target vs. non-target responses) in the spatial auditory experimental setting. The ERP response distributions passed normality tests. More flexible methods, such as the area under the curve analysis, gave comparable results, yet the proposed ANOVA yielded the best results in our case. The majority of spatial aBCI applications aim at the P300 response latency [4, 5, 9, 10]. The example in Fig. 2 shows the averaged ERP responses to targets and non-targets (note the latency range of 300–600 ms). Next, the ANOVA method was applied to compare the single-trial response distributions at each sample point of the collected ERPs. As a result, we were able to extract discriminative information leading to later classification optimization. The results of the above analysis are depicted in Figs. 3 and 4. The bottom panels in these figures visualize the ANOVA p values for the eight electrodes, one electrode per row, using the color scheme explained in the color bars next to the panels. Each p value is the probability, under the null hypothesis that the target and non-target distributions have equal means, of observing a difference at least as large as the one measured (in life sciences, p < 0.05 is usually considered significant). The results in Figs. 3 and 4 show that the postulated P300 latency area in the range 300–600 ms discriminates attended targets from ignored non-targets best. This finding supports our hypothesis that the P300 latencies are also related to spatial cognition in the human brain.
Next, the binary classification problem is discussed. We evaluate our hypothesis that the “hand-picked” P300 latency ERP periods are significant features that improve the binary target vs. non-target classification accuracy. In order to find the most discriminable features of the ERP responses, we used the results of the ANOVA method described above, applied to all ERP latencies. We “hand-picked” only those samples within each subject’s ERPs for which the p values were smaller than 0.05 (as depicted in blue in Figs. 3 and 4) in the range 300–600 ms.
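The per-sample selection described above can be sketched for a single electrode as follows. The sketch uses SciPy's `f_oneway` as a stand-in for the ANOVA implementation, which the text does not name; the function name `select_latencies` and the synthetic data (a Gaussian-shaped deflection at 400 ms added to the target trials) are hypothetical.

```python
import numpy as np
from scipy.stats import f_oneway

FS = 256     # sampling rate in Hz
PRE_S = 0.1  # each epoch starts 100 ms before stimulus onset


def select_latencies(targets, nontargets, fs=FS, pre_s=PRE_S,
                     window=(0.3, 0.6), alpha=0.05):
    """Pick the sample points that separate targets from non-targets.

    targets, nontargets: arrays (n_trials, n_samples), one electrode.
    Returns (mask, p, times): mask is True where p < alpha and the
    latency lies inside the window (in seconds after stimulus onset).
    """
    # One-way ANOVA with two groups, computed independently per sample.
    _, p = f_oneway(targets, nontargets, axis=0)
    times = np.arange(targets.shape[1]) / fs - pre_s
    in_window = (times >= window[0]) & (times <= window[1])
    return (p < alpha) & in_window, p, times


# Synthetic single-trial data: unit-variance noise, plus a deflection
# centred at 400 ms in the target trials only.
rng = np.random.default_rng(0)
n_samples = 205
t_axis = np.arange(n_samples) / FS - PRE_S
bump = 3.0 * np.exp(-0.5 * ((t_axis - 0.4) / 0.05) ** 2)
targets = rng.normal(size=(64, n_samples)) + bump
nontargets = rng.normal(size=(448, n_samples))

mask, p, times = select_latencies(targets, nontargets)
features = targets[:, mask]  # amplitudes at the selected latencies
```

The selected columns of the epoch matrix then serve as the classifier input features; for multiple electrodes the same selection would be repeated per channel.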
The Offline ERP Classification in the aBCI Paradigm
We performed the classification steps for each subject separately in the aBCI offline mode, which means that all the procedures were conducted after the collection of data from each experiment, without any online feedback to the subjects. The classification procedure in our case is a so-called binary task paradigm (target vs. non-target). In the classifier training and testing step, we selected 64 targets and a random subset of 64 non-targets (from the 448 available) to balance the number of members in each class. The resulting theoretical chance level was thus 50 %. Based on our previous classification trials reported in Cai et al. [5, 11], we proposed to use a Bayesian classifier, which yielded similar or even better results on our experimental data than linear discriminant analysis methods. The GNBC is particularly suited to high-dimensional features. The GNBC method produced results comparable to more sophisticated classification methods [12] for particular cases as reported in this paper. In our approach, we utilized the NaN-Toolbox, which is part of the BioSig environment [13]. The classifier input features were the ERP amplitude values in microvolts at the “hand-picked” latencies, as discussed in the previous section.
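The balanced binary classification with leave-one-out cross-validation can be illustrated with a small Gaussian Naive Bayes written directly in NumPy. This is a sketch and not the NaN-Toolbox implementation used in the study: the function names, the small variance floor, and the synthetic feature matrix are all assumptions.

```python
import numpy as np


def gnb_fit(X, y):
    """Estimate per-class feature means, variances, and priors."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Assumption: a small floor keeps the variances strictly positive.
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + 1e-9
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, variances, priors


def gnb_predict(model, X):
    """Return the class with the highest posterior for each row of X."""
    classes, means, variances, priors = model
    diff = X[:, None, :] - means[None]
    # Sum of per-feature Gaussian log-likelihoods (features independent).
    log_lik = -0.5 * (np.log(2 * np.pi * variances)[None]
                      + diff ** 2 / variances[None]).sum(axis=2)
    return classes[np.argmax(log_lik + np.log(priors)[None], axis=1)]


def loo_accuracy(X, y):
    """Leave-one-out cross-validated accuracy."""
    hits = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        model = gnb_fit(X[train], y[train])
        hits += int(gnb_predict(model, X[i:i + 1])[0] == y[i])
    return hits / len(y)


# Synthetic features: 64 targets and 448 non-targets, 20 latencies each.
rng = np.random.default_rng(1)
targets = rng.normal(1.0, 1.0, size=(64, 20))
nontargets_all = rng.normal(0.0, 1.0, size=(448, 20))

# Balance the classes: keep a random subset of 64 non-targets.
pick = rng.choice(448, size=64, replace=False)
X = np.vstack([targets, nontargets_all[pick]])
y = np.array([1] * 64 + [0] * 64)
acc = loo_accuracy(X, y)
```

With equal class sizes the priors are equal, so the chance level of this setup is 50 %, as stated in the text.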
The results of the successful application of the GNBC technique are presented in the next section.
Results
The proposed approach improved the aBCI paradigm results in both the combined and the averaged electrode settings, evaluated for each subject separately. Detailed results are presented in the following sections. First, we introduce the ITR, which is a commonly used measure to compare various paradigms in the BCI research community [4]. We also present classification accuracy results.
Analysis of aBCI Results with ITR and Classification Accuracies
The amount of information carried by each selection in the BCI application is usually quantified by the ITR, which is calculated from the bits per selection R, defined as [4]:
$$R = \log_2 N + C \log_2 C + (1 - C) \log_2 \frac{1 - C}{N - 1},$$
where C is the classification accuracy and N is the number of classes (N = 8 in this paper). The final ITR is obtained by multiplying R by the classification speed V (selections per minute), resulting in a rate in bits per minute (bit/min):
$$\mathrm{ITR} = V \cdot R.$$
The ITR results are summarized in Tables 3 and 5 and discussed in detail in the following sections. The GNBC classification accuracies are also summarized in Tables 4 and 6.
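As a worked illustration of the ITR definition, the sketch below evaluates the standard bits-per-selection expression, log2 N + C log2 C + (1 − C) log2((1 − C)/(N − 1)), and multiplies it by a selection speed. The function names are hypothetical. The speed `v_assumed` is not stated in the text; it is back-calculated here from the reported 14.06 bit/min at 100 % accuracy with N = 8 and should be read as an assumption.

```python
import math


def bits_per_selection(c, n):
    """Bits carried by one selection for accuracy c and n classes."""
    if not 0.0 <= c <= 1.0 or n < 2:
        raise ValueError("need 0 <= c <= 1 and n >= 2")
    r = math.log2(n)
    if c > 0.0:  # the term c*log2(c) tends to 0 as c -> 0
        r += c * math.log2(c)
    if c < 1.0:  # the error term vanishes at c = 1
        r += (1.0 - c) * math.log2((1.0 - c) / (n - 1))
    return r


def itr_bit_per_min(c, n, v):
    """ITR in bit/min for v selections per minute."""
    return v * bits_per_selection(c, n)


r_perfect = bits_per_selection(1.0, 8)   # 3.0 bits per selection
r_chance = bits_per_selection(1 / 8, 8)  # 0 bits at chance level
r_80 = bits_per_selection(0.8, 8)        # about 1.717 bits

# Assumption: speed implied by 14.06 bit/min at 100 % accuracy.
v_assumed = 14.06 / r_perfect            # about 4.69 selections/min
itr_80 = itr_bit_per_min(0.8, 8, v_assumed)
```

Under that assumed speed, an accuracy of 80 % gives slightly more than 8 bit/min, which is consistent with the figures quoted for the averaged-trial results.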
The ITR and Classification Accuracy Results from the P300 ERP Range Latencies in the Single Channel Setting of Target Versus Non-target
A summary of the ITR results is presented in Table 3 (the corresponding classification accuracies are summarized in Table 4). We compare the ITR values obtained for the features drawn from the conventional whole ERP with those obtained for the proposed “hand-picked” features resulting from the ANOVA. The proposed method allows us to boost the classification results by up to 10.43 bit/min (+44 % in classification accuracy). Only a single case shows a decrease when using leave-one-out cross-validation for the GNBC technique.
The ITR Results from the P300 ERP Range Latencies from the Averaged Eight Trials in the Setting of Target Versus Non-target
The single-trial classification discussed in the previous sections mostly resulted in ITRs lower than 14 bit/min (accuracies below 70 %) in the aBCI offline mode. In order to improve the results, for each subject and each stimulus, we averaged the target trials in groups of eight (reducing 64 targets to 8 averaged targets) and the non-target trials likewise (reducing 448 non-targets to 56 averaged non-targets). In the classifier training and testing steps, we selected the 8 targets and a random subset of 8 non-targets (from the 56 available), following the same balancing procedure as for the single-trial training and testing set. For each of the subjects in the case of the pink noise stimulus, the classification resulted in scores higher than 8 bit/min (80 % accuracy). In particular, for subjects 1 and 2, the resulting ITR reached 14.06 bit/min (100 % accuracy). The results are summarized in Tables 5 and 6 for ITR and classification accuracies, respectively. The comparison of the ITR and accuracy values obtained for the features drawn from the conventional whole ERP with those obtained for the proposed “hand-picked” features resulting from the ANOVA supports the improvement brought by the proposed method. The proposed method allows us to increase the classification results (only a single case of a decrease was observed) when using leave-one-out cross-validation for the GNBC technique. An online aBCI application is planned as a next stage by the authors.
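The trial averaging can be sketched as a block mean over the epoch matrix. The text does not state how the eight trials in each average were grouped, so averaging consecutive trials is an assumption here, and the function name `average_blocks` is hypothetical.

```python
import numpy as np


def average_blocks(epochs, block=8):
    """Average consecutive groups of `block` trials.

    epochs: array (n_trials, n_samples); trials left over after the
    last full group are dropped.
    """
    n_full = (epochs.shape[0] // block) * block
    grouped = epochs[:n_full].reshape(-1, block, *epochs.shape[1:])
    return grouped.mean(axis=1)


# Synthetic unit-variance noise epochs with the counts from the text.
rng = np.random.default_rng(0)
targets = rng.normal(size=(64, 205))
nontargets = rng.normal(size=(448, 205))

avg_targets = average_blocks(targets)
avg_nontargets = average_blocks(nontargets)
print(avg_targets.shape, avg_nontargets.shape)  # (8, 205) (56, 205)
```

Averaging eight trials reduces uncorrelated noise by a factor of about the square root of eight, at the cost of eight times fewer examples and a correspondingly slower selection speed.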
Discussion and Conclusions
In this paper, we have presented an approach leading to the improvement of classification accuracies and ITRs in a novel offline aBCI paradigm. This has been achieved by introducing ERP feature extraction in P300 range latencies to replace the classical whole evoked response range approaches.
The proposed method allows the extraction of the most separable ERP features, enabling an increase in the classification accuracy and a maximum ITR improvement of +35.30 bit/min (22 % accuracy) in the case of features drawn from single-electrode ERP distributions. In the case of the features obtained from the ERP responses averaged over eight trials, the majority of subjects also improved their results, with a maximum increase of 10.43 bit/min (44 % in accuracy).
These are very encouraging results, which open the possibility of further improving the auditory-paradigm-based BCI.
The main achievement reported in the paper is an improvement of the spatial aBCI paradigm in the offline mode, which is a step forward in non-vision-based interfacing strategies. We have also shown that, in contrast to contemporary applications of spatial auditory BCI paradigms that do not use loudspeakers placed behind the head, it is possible to utilize all spatial horizontal sound directions thanks to the proposed classification improvement approach based on the “hand-picked” ERP latencies.
References
1. Wolpaw J, Wolpaw EW, editors. Brain–computer interfaces: principles and practice. Oxford: Oxford University Press; 2012.
2. Schnupp J, Nelken I, King A. Auditory neuroscience: making sense of sound. Cambridge: MIT Press; 2010.
3. Rutkowski TM, Cichocki A, Mandic DP. Spatial auditory paradigms for brain computer/machine interfacing. In: International workshop on the principles and applications of spatial hearing 2009 (IWPASH 2009). Proceedings of the international workshop. Miyagi-Zao Royal Hotel, Sendai, Japan; 2009. p. P5.
4. Schreuder M, Blankertz B, Tangermann M. A new auditory multi-class brain-computer interface paradigm: spatial hearing as an informative cue. PLoS ONE. 2010;5:e9813.
5. Cai Z, Terasawa H, Makino S, Yamada T, Rutkowski TM. Sound timbre and spatial location as informative cues in auditory BCI—brain evoked potential enhancement and involuntary eye movements artifacts suppression approach. In: Proceedings of the third APSIPA annual summit and conference (APSIPA ASC 2011). Xi’an, China: APSIPA; 2011. p. paper #241.
6. Schomer DL, Lopes da Silva FH, editors. Niedermeyer’s electroencephalography: basic principles, clinical applications, and related fields. 6th ed. Wolters Kluwer, Lippincott Williams & Wilkins; 2011.
7. Jurcak V, Tsuzuki D, Dan I. 10/20, 10/10, and 10/5 systems revisited: their validity as relative head-surface-based positioning systems. NeuroImage. 2007;34:1600–1611.
8. Rutkowski TM. Auditory brain-computer/machine interface paradigms design. In: Cooper E, Kryssanov V, Ogawa H, Brewster S, editors. Haptic and audio interaction design. Lecture notes in computer science, vol 6851. Berlin/Heidelberg: Springer; 2011. pp. 110–119.
9. Schreuder M, Rost T, Tangermann M. Listen, you are writing! Speeding up online spelling with a dynamic auditory BCI. Front Neurosci. 2011;5:Article 112.
10. Hoehne J, Schreuder M, Blankertz B, Tangermann M. A novel 9-class auditory ERP paradigm driving a predictive text entry system. Front Neurosci. 2011;5:Article 99.
11. Cai Z, Makino S, Rutkowski TM. Multichannel analysis of ERP at early and P300 latencies in application for spatial auditory BCI improvement. In: Proceedings of the fourth APSIPA annual summit and conference (APSIPA ASC 2012). Hollywood, CA, USA: APSIPA; 2012. p. paper #355.
12. Mitchell TM. Machine learning. New York: McGraw-Hill; 1997.
13. Schloegl A, Kronegg J, Huggins J, Mason SG. Evaluation criteria in BCI research. In: Dornhege G, Millan J, Hinterberger T, McFarland DJ, Müller KR, editors. Towards brain-computer interfacing. Cambridge: MIT Press; 2007. pp. 327–342.
Acknowledgments
This research was supported in part by the Strategic Information and Communications R&D Promotion Program No. 121803027 of The Ministry of Internal Affairs and Communication in Japan and by KAKENHI, the Japan Society for the Promotion of Science, Grant Nos. 12010738 and 24700154. We also acknowledge the technical support of YAMAHA Sound & IT Development Division in Hamamatsu, Japan.
Author Contributions
Zhenyu Cai, Tomasz M. Rutkowski: Performed the EEG experiments and analyzed the data. Tomasz M. Rutkowski: Conceived the concept of the spatial auditory BCI and designed the EEG experiments. Shoji Makino: Coordinated and supported the project. Zhenyu Cai, Tomasz M. Rutkowski: Wrote the paper.
Cite this article
Cai, Z., Makino, S. & Rutkowski, T.M. Brain Evoked Potential Latencies Optimization for Spatial Auditory Brain–Computer Interface. Cogn Comput 7, 34–43 (2015). https://doi.org/10.1007/s12559-013-9228-x