1 Why BCIs (Sometimes) Don’t Work

Brain-computer interface (BCI) research has made great progress recently. Initial BCI research efforts focused primarily on validating proof of concept, usually by testing BCIs with healthy subjects in laboratories instead of target users in home or hospital settings (Pfurtscheller et al. 2000; Kübler et al. 2001; Wolpaw et al. 2002). BCIs have since provided practical communication for severely disabled users with no other way to communicate, and many new applications, signal processing approaches, and displays have been explored. Patients and healthy people have successfully used BCIs based on all three major noninvasive BCI approaches—P300 BCIs based on intermittent flashes, Steady State Visual Evoked Potential (SSVEP) BCIs based on oscillating lights, and Event Related Desynchronization (ERD) BCIs based on imagined movement. This progress and enthusiasm are reflected in the dramatic increase in peer reviewed publications, conference presentations and symposia, and media attention (Pfurtscheller et al. 2006, 2008; Allison et al. 2007; Nijholt et al. 2008). Amidst these positive developments, one major problem is becoming apparent: BCIs do not work for all users.

Ideally, any interface should work for any user. However, across the three major noninvasive BCI approaches, numerous labs report that very roughly 20% of subjects cannot attain control. This problem has been called “BCI illiteracy” (e.g., Kübler and Müller 2007; Blankertz et al. 2008; Nijholt et al. 2008). Extensive efforts have been made to overcome this problem through various mechanisms, such as training the subject and/or classifier, alternate displays or instructions, improved signal processing, and error correction. These efforts have been only partly successful. While such options can make BCIs work for some previously “illiterate” users, some people remain unable to use any particular BCI system (Allison et al. 2010b). There is no “universal BCI”.

One likely cause is that some users cannot generate the brain activity necessary to control a particular BCI. A small minority of subjects will probably never attain control with a given approach due to the structure of their brains. While all people’s brains have the same cortical processing systems, in roughly the same locations, with similar functional subdivisions, there are individual variations in brain structure. In some users, the neuronal systems needed for control might not produce electrical activity detectable on the scalp. This is not because of any problem with the user. The necessary neural populations are presumably healthy and active, but the activity they produce is not detectable with a particular neuroimaging method, such as EEG. The key neural populations may be located in a sulcus, or too deep for EEG electrodes, or too close to another, louder group of neurons. For example, about 10% of seemingly normal subjects do not produce a robust P300 (Polich 1986; Conroy and Polich 2007). These users would probably not benefit from training, alternate P300 tasks, or improved signal processing; their best hope is to switch to a BCI that relies on another signal, such as ERD or SSVEP.

There are other reasons why some users cannot use some BCIs. Some subjects produce excessive muscle artifact, or misunderstand or ignore the instructions on how to use a BCI. BCIs might fail because the people responsible for getting the BCI to work made mistakes resulting from inexperience, such as misusing the software or mounting the electrodes incorrectly. Some environments may produce excessive electrical noise that can impair signal quality.

These problems are generally surmountable, whereas individual variations in brain structure are quite difficult to change. This chapter does not address problems resulting from fundamental mistakes by subjects or BCI practitioners. That is, we assume subjects are following instructions, with properly prepared hardware and software, in a reasonable setting.

2 Illiteracy in Different BCI Approaches

What does it mean to say that some users “cannot use” some BCIs? As noted below, comparing illiteracy across different BCI articles is difficult because no standards exist, and various factors must be considered. Recent work that assessed the relationship between illiteracy and the severity of motor impairment used a threshold of 70%, among other values (Kübler and Birbaumer 2008). This was an excellent article, and the threshold was adequate for establishing that the severity of impairment was not correlated with illiteracy, except in completely locked-in patients. However, a thorough and parametric assessment of illiteracy across the three major BCI approaches may be premature until standards for assessing illiteracy are developed.

BCI illiteracy is clearly not limited to any one research group or BCI approach. Anecdotal evidence suggests that ERD BCIs may entail greater illiteracy than BCIs based on evoked potentials (P300 and SSVEP). However, Kübler and Birbaumer (2008) (which did not assess SSVEP BCIs) did not find that ERD BCIs entailed higher illiteracy than P300 BCIs.

2.1 Illiteracy in ERD BCIs

ERD BCIs rely on EEG activity associated with different imagined movements. Some approaches rely on specific imagined movements, such as moving the left hand, right hand, or both feet (Pfurtscheller et al. 2006; Leeb et al. 2007; Blankertz et al. 2008; Scherer et al. 2004, 2008). Other approaches train users to explore different, often less specific, imagined movements until they find imagery that yields good results (Friedrich et al. 2009).

Hence, ERD BCIs can only function well if subjects can produce brain activity patterns that differ across different types of imagined movements. ERD BCIs rely on time-frequency analysis; the raw EEG is transformed into an estimate of power at different frequencies by a method such as a Fourier transform or autoregressive analysis. If the different movement classes (such as left hand vs. right hand) do not produce reliable and reasonably robust differences in power at one or more frequencies and/or electrode sites, then effective communication will not be possible.
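To make this concrete, the sketch below (not the pipeline used in this chapter; the 250 Hz sampling rate, the channel layout, and the 8–12 Hz mu band are assumptions) estimates mu-band power at C3 and C4 for two classes of imagined movement and checks whether the classes produce the kind of lateralized power difference described above.

```python
# Minimal sketch, assuming epochs of shape (n_epochs, n_samples) per channel
# and a 250 Hz sampling rate. Not the authors' pipeline.
import numpy as np
from scipy.signal import welch

FS = 250  # assumed sampling rate in Hz

def mu_band_power(epochs, fs=FS, band=(8.0, 12.0)):
    """Mean 8-12 Hz power per epoch; epochs has shape (n_epochs, n_samples)."""
    freqs, psd = welch(epochs, fs=fs, nperseg=fs)   # psd: (n_epochs, n_freqs)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[:, mask].mean(axis=1)

def class_separability(left_c3, left_c4, right_c3, right_c4):
    """Return the mean C3-minus-C4 mu power for each imagery class.
    A subject with usable ERD should show opposite signs for the two classes."""
    left_diff = mu_band_power(left_c3).mean() - mu_band_power(left_c4).mean()
    right_diff = mu_band_power(right_c3).mean() - mu_band_power(right_c4).mean()
    return left_diff, right_diff
```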

Figure 3.1 presents data from subject A, who could use an ERD BCI. The top two panels show activity over electrode sites C3 (top left panel) and C4 (top right panel) while the subject imagined left hand movement. In the top right panel, ERD is apparent at about 10 Hz, while there is no strong ERD in the top left panel. These top two panels show that left hand movement imagery reduced power at about 10 Hz over the right sensorimotor area, which occurs in most people (Pfurtscheller et al. 2006; Pfurtscheller and Neuper in press).

Fig. 3.1

These four panels present data from subject A, who attained very good control with an ERD BCI. In all panels, the x-axis represents the time from the beginning of the trial. A cue, which appeared 2 seconds after the beginning of the trial, instructed the subject to imagine either left or right hand movement. The y-axis shows the frequency. Blue reflects an increase in power, and red reflects a decrease in power, also called ERD. The two left images show activity over site C3, located over the left sensorimotor area, and the two right images show activity over site C4, located over the right sensorimotor area. The top two images reflect trials with imagined left hand movement, and the bottom two images present trials with imagined right hand movement. Images courtesy of Dr. Clemens Brunner

The bottom two panels of Fig. 3.1 show activity over sites C3 and C4 while the subject imagined right hand movement. These two panels instead show ERD over the left sensorimotor area. Therefore, an ERD BCI could determine whether the subject was imagining left or right hand movement by identifying characteristic activity in sites C3 and C4.

Figure 3.2 presents data from subject B, who could not use an ERD BCI. The top two panels do not differ very much from the bottom two panels. Hence, the classifier did not have any way to determine which hand the subject was thinking about moving.

Fig. 3.2

These four panels present data from subject B, who was illiterate with an ERD BCI. The axes and shading are the same as in Fig. 3.1. The two left images show activity over site C3, located over the left sensorimotor area, and the two right images show activity over site C4, located over the right sensorimotor area. The top two images reflect trials with imagined left hand movement, and the bottom two images present trials with imagined right hand movement. Images courtesy of Dr. Clemens Brunner

2.2 Illiteracy in SSVEP BCIs

SSVEP BCIs require subjects to focus their attention on one of (usually) two or more stimuli, each of which oscillates at a different frequency. This produces oscillations over visual areas at the same frequency as the oscillating stimulus, and often at one or more harmonics of that frequency as well (Pfurtscheller et al. 2006; Allison et al. 2008; Faller et al. 2010).

In an SSVEP BCI, the raw EEG is translated into an estimate of power at different frequencies, much like the procedure in an ERD BCI. The resulting spikes at specific frequencies can be used to determine which stimulus occupied the subject’s attention. Therefore, SSVEP BCIs also depend on clear spikes in the power spectrum at specific frequencies. If these spikes are not apparent, or are too weak to distinguish from background noise, then the SSVEP BCI will not function accurately.
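A minimal sketch of this idea follows (illustrative only; the sampling rate is an assumption, while the 8 and 13 Hz flicker frequencies are taken from Fig. 3.3): power is summed at each candidate frequency and its harmonics, and the frequency with the most power is taken as the attended stimulus.

```python
# Minimal sketch, assuming a 1-D EEG segment from an occipital channel and a
# 250 Hz sampling rate. Not the detection method used in the system shown here.
import numpy as np

FS = 250  # assumed sampling rate in Hz

def harmonic_power(eeg, target_hz, n_harmonics=3, fs=FS):
    """Sum spectral power at target_hz and its first few harmonics."""
    spectrum = np.abs(np.fft.rfft(eeg)) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    power = 0.0
    for k in range(1, n_harmonics + 1):
        idx = np.argmin(np.abs(freqs - k * target_hz))  # nearest FFT bin
        power += spectrum[idx]
    return power

def classify_ssvep(eeg, candidates=(8.0, 13.0)):
    """Return the candidate flicker frequency with the most harmonic power."""
    return max(candidates, key=lambda f: harmonic_power(eeg, f))
```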

Figure 3.3 presents one literate subject (top two panels) and one illiterate subject (bottom two panels) who tried to use an SSVEP BCI. The left and right panels reflect the subject’s desire to communicate left and right movement, respectively. There is a clear difference between the top two panels, and hence this subject attained almost perfect accuracy with an SSVEP BCI. However, there is no clear difference between the bottom two panels, and hence this subject could not attain performance above chance with this SSVEP BCI.

Fig. 3.3

These four panels present data from two subjects who tried to use an SSVEP BCI. All panels show data from electrode site O1, over the primary visual cortex. In all panels, the x-axis represents the time from the beginning of the trial. The cue that instructed the subject to focus on the 8 or 13 Hz LED appeared after 2 seconds. The y-axis shows the frequency. The horizontal blue lines reflect an increase in power. The top two images are from subject B, and the bottom two images are from subject A. The two left images were recorded when the subject focused on an 8 Hz LED (which could be used to move left), and the two right images were recorded when the subject focused on a 13 Hz LED (which could be used to move right). In the top left panel, there are clear power increases at 8 Hz and its harmonics of 16, 24, and 32 Hz. In the top right panel, there are clear power increases at 13 Hz and its harmonics of 26 and 39 Hz. Since there are very clear differences between the top two panels, subject B showed excellent control with this SSVEP BCI. However, neither of the bottom two panels shows these changes, and hence subject A was illiterate with this SSVEP BCI. Images courtesy of Dr. Clemens Brunner

Notably, the two subjects shown in Fig. 3.3 are the same two subjects shown in Figs. 3.1 and 3.2. Subject A was literate with an ERD BCI, but illiterate with an SSVEP BCI. Subject B was literate with an SSVEP BCI, but illiterate with an ERD BCI.

2.3 Illiteracy in P300 BCIs

Like SSVEP BCIs, P300 BCIs rely on selective attention to visual stimuli (Allison and Pineda 2006; Sellers and Donchin 2006; Lenhardt et al. 2008; Kübler et al. 2009; Jing et al. 2010). However, in a P300 BCI, the stimuli flash instead of oscillate. Whenever a user focuses attention on a specific stimulus, a brainwave called the P300 may occur, whereas the P300 to ignored stimuli is much weaker.

P300 BCIs do not rely on time-frequency analyses as ERD and SSVEP BCIs do. Instead, the raw EEG is time-locked to the onset of each flash, producing an event-related potential (ERP). ERPs from several trials are usually averaged together to improve accuracy. The classifier tries to identify which flash elicited a robust P300, sometimes incorporating other ERP components as well. Ideally, only the target stimulus—that is, the stimulus that the user is attending—elicits a robust P300. If none of the flashes elicits an ERP that is reliably different from the others, then effective communication is not possible with that P300 BCI system.
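The sketch below illustrates this procedure under assumed data layouts (it is not the authors’ classifier): epochs are cut around each flash, averaged separately for target and nontarget flashes, and compared in the 300–500 ms window where the P300 is typically prominent.

```python
# Minimal sketch, assuming a 1-D EEG signal, flash onsets given in samples,
# and a 250 Hz sampling rate. Not the authors' classifier.
import numpy as np

FS = 250  # assumed sampling rate in Hz

def average_erp(eeg, flash_onsets, window_s=0.8, fs=FS):
    """Average epochs time-locked to flash onsets."""
    n = int(window_s * fs)
    epochs = np.stack([eeg[t:t + n] for t in flash_onsets if t + n <= len(eeg)])
    return epochs.mean(axis=0)

def p300_amplitude(erp, fs=FS, window=(0.3, 0.5)):
    """Mean amplitude in the 300-500 ms post-flash window."""
    lo, hi = int(window[0] * fs), int(window[1] * fs)
    return erp[lo:hi].mean()

# Usage: a literate subject should show
#   p300_amplitude(average_erp(eeg, target_onsets))
# clearly larger than
#   p300_amplitude(average_erp(eeg, nontarget_onsets)).
```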

Figure 3.4 contains ERPs from three subjects who tried to use a P300 BCI. In all three panels, the solid line shows the ERP to a target flash, and the dashed line shows the ERP to a nontarget flash. The top left panel shows a subject who had a weak P300, although the ERPs to target and nontarget flashes did differ earlier in the time window. The top right panel shows data from a literate subject; this subject’s P300 is clearly visible after target flashes only. The bottom panel shows an illiterate subject, whose ERPs to target and nontarget flashes look similar.

Fig. 3.4

ERP activity from three subjects who tried to use a P300 BCI. In all three panels, the x-axis reflects the time after the flash began, and the y-axis reflects the amplitude of the ERP. Each panel presents ERPs that were averaged over many trials; the solid and dashed lines are much harder to distinguish on a single trial basis. The top left panel shows a subject who did not have a strong P300. The solid and dashed lines look similar in the time window when the P300 is typically prominent, which is about 300–500 ms after the flash. However, these two lines did differ during an earlier time window. The top right panel shows a subject who did have a strong P300. The bottom panel shows a subject whose ERPs look similar for target and nontarget flashes throughout the time window. This subject was illiterate with a P300 BCI. Images courtesy of Dr. Jin Jing

3 Improving BCI Functionality

What can you do if someone cannot use a BCI? As noted, BCI illiteracy is essentially a problem of accuracy. The methods for improving accuracy presented here could make the difference between an ineffective system and a functional communication tool. Of course, improving accuracy could benefit literate users as well; since BCIs very rarely allow sustained communication at 100% accuracy, the approaches below could be useful to almost any BCI system. Again, we do not consider basic problems that may result from simple mistakes in BCI setup or a noisy environment. Four possible solutions to other problems are discussed:

  1. Improve selection and/or classification of existing brain signals through improved algorithms

  2. Use sensor systems that provide richer information

    a. Different neuroimaging technologies

    b. More or better sensors

  3. Incorporate error correction or reduction

    a. Improved interfaces that make errors less likely and/or allow error correction

    b. Additional signals, from the EEG or elsewhere, that convey error

  4. Generate brain signals that are easier to categorize

    a. Within existing BCI approaches

    b. Using novel BCI approaches

    c. By switching to a different approach

    d. By combining different approaches

3.1 Improve Selection and/or Classification Algorithms

Option 1 (improved algorithms) is by far the most heavily pursued. There have been four major data analysis competitions (e.g. Blankertz et al. 2004), but no competitions to (for example) produce the strongest ERD or develop the most discerning sensor system. Signal processing is the easiest component of a BCI to improve, since it requires no special equipment, data collection, device development, etc. Improved signal processing merits further study, and will probably continue to reduce but not eliminate “BCI illiteracy” (Blankertz et al. 2008; Brunner et al. 2010). Improved signal processing cannot help if the subject is not producing any detectable activity that could distinguish different mental states.

Since different people have different brain activity, customizing the classification algorithms for each user can dramatically improve accuracy with some subjects. This customization is now common; relatively few BCIs use the same parameters for all subjects. Hence, an emerging challenge is finding ways to automate this customization process, since a BCI could then customize itself without human intervention. As BCIs move outside the laboratory, and hence further away from experts who can customize BCIs for each user, software that can automatically configure classification algorithms and other parameters becomes increasingly important.

The top left panel of Fig. 3.4 presents a simple example of how a customized signal processing algorithm can improve performance, perhaps enough to make this subject literate. Some P300 BCIs use a linear classification technique that focuses on specific time periods after the flash, such as Stepwise Discriminant Analysis (SWDA). An SWDA classifier that used generic settings for all users would probably only evaluate time periods when the P300 is typically apparent, such as 300–500 ms after the flash. However, software might examine each subject’s ERP, determine which time periods exhibit a strong difference between target and nontarget flashes, and adjust the classifier settings accordingly. In this example, the classifier could be automatically reprogrammed to focus more heavily on the time period about 200 ms after each flash.
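One way such automatic customization might look is sketched below (a hypothetical illustration, not SWDA itself): each post-flash time point is scored by how well it separates target from nontarget epochs, and the best-separating window is handed to the classifier.

```python
# Hypothetical sketch of automatic time-window selection; epochs have shape
# (n_epochs, n_samples) and a 250 Hz sampling rate is assumed.
import numpy as np
from scipy.stats import ttest_ind

def discriminative_window(target_epochs, nontarget_epochs, fs=250, win_s=0.1):
    """Return (start_s, end_s) of the win_s-long post-flash window with the
    largest summed point-wise |t| between target and nontarget epochs."""
    t_vals, _ = ttest_ind(target_epochs, nontarget_epochs, axis=0)
    score = np.abs(t_vals)
    win = int(win_s * fs)
    sums = np.convolve(score, np.ones(win), mode="valid")  # sliding-window score
    start = int(np.argmax(sums))
    return start / fs, (start + win) / fs
```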

The subject shown in the top left panel is not especially unusual. She shows a strong P200, which is a well-known ERP component that often precedes the P300 and can differ with selective attention (Allison and Pineda 2006). Indeed, the subject in the top right panel also has a strong P200, in addition to a strong P300. However, the subject in the bottom panel has a weak P200 and a weak P300. We could not identify any classifier settings that would make this subject proficient with a “P300” BCI.

3.2 Explore Different Neuroimaging Technologies

Option 2a (different neuroimaging technologies) needs more attention; no articles have thoroughly explored whether someone who cannot attain literacy with a BCI based on one neuroimaging approach might perform better with a different approach. This article focuses primarily on EEG-based BCIs, since over 80% of BCIs rely on the EEG (Mason et al. 2007). Other noninvasive methods might be effective when EEG based methods are not, but have other drawbacks such as cost or portability (Wolpaw et al. 2006; Allison et al. 2007).

Invasive BCIs can also be effective communication tools (Hochberg et al. 2006; Schalk et al. 2008; Blakely et al. 2009) and might also work when other methods do not. The brain’s electrical activity is filtered, smeared, and diminished as it travels from the brain to the outer surface of the scalp. Signals recorded from sensors fixed on or in the brain might be easier to categorize, but entail neurosurgery, scarring, risk of infection, and ethical concerns that vary considerably across different users and their needs. Since some invasive BCIs may be able to detect activity from neurons within a sulcus, people who cannot use a noninvasive BCI because of their brain structure might attain better results with an invasive approach. This prospect merits further study, along with the possible benefits of combining noninvasive and invasive approaches (Wolpaw et al. 2002).

Option 2b (more or better sensors) has been heavily pursued, with little success. The conventional Ag/AgCl electrode, with electrode gel and skin abrasion, has not changed much in decades despite many efforts from academic and commercial groups. Dry electrodes might make caps more convenient (Popescu et al. 2007; Sullivan et al. 2008), but even the most enthusiastic developers agree that the signals are at best comparable to those from gel-based electrodes. The prospects of using additional electrodes and optimizing electrode locations are not new (Pfurtscheller et al. 1996, 2006). Furthermore, there are many drawbacks to adding more sensors, such as increased cost and preparation time.

3.3 Apply Error Correction or Reduction

Option 3 (error correction or reduction) could help improve BCIs in many ways. Since BCIs have generally failed to capitalize on fundamental principles from HCI research, there are many unexplored opportunities for improvement (Allison in press). However, like the two options already discussed, error reduction and correction cannot make all subjects proficient. Error related activity can be detected in the EEG, as well as other signals based on eye, heart, or other physiological signals (Schalk et al. 2000; Buttfield et al. 2006; Ferrez and Millán 2008). It can improve performance when a signal is poor but sometimes usable, but is useless if the subject cannot effect control at all. Similarly, software that prevents people from spelling impossible words or sending meaningless commands cannot help a subject who cannot convey anything in the first place (Allison in press).

3.4 Generate Brain Signals that are Easier to Categorize

Option 4a (clearer signals within a BCI approach) has been most heavily pursued within ERD BCIs, with considerable success. Many of the ERD BCI improvements from the Wolpaw lab stem from training subjects to produce signals that convey more usable information. Neuper et al. (2005) showed that instructing subjects to focus on first-person motor imagery (that is, imagining their own hand moving) could improve performance relative to third-person motor imagery (that is, imagining watching a hand move). Nikulin et al. (2008) claimed that a novel type of motor imagery based on “quasi-movements” could yield better performance than conventional ERD tasks.

Unlike the situation with ERD BCIs, there has been little success in generating clearer EEG signals with P300 or SSVEP BCIs. The original paradigm used in Farwell and Donchin (1988) already produced P300s that are about as large as the larger P300s reported in the literature. It is unlikely that a new paradigm will be developed that produces much larger P300s, although novel displays, tasks, or other parameters might enhance other features such as the CNV (Farwell and Donchin 1988; Allison and Pineda 2006).

Paradoxically, some approaches to improve information transfer rate (ITR, also called bit rate or information throughput) in P300 BCIs might increase illiteracy. For example, changing the number or distribution of characters illuminated with each flash can improve P300 BCI ITR in some subjects—not by eliciting larger P300s, but by reducing the number of flashes required to identify each target character (Guger et al. 2009; Jing et al. 2010). However, methods that reduce the number of flashes also entail a shorter target to target interval (TTI), which can reduce P300 amplitude and potentially increase illiteracy (Gonsalvez and Polich 2002).

Conventional SSVEP BCIs already yield SSVEPs that differ considerably between target and nontarget events in most subjects. There seems to be no easy way to create SSVEP differences that are substantially easier to recognize, though the number of events required to identify each target could be reduced. However, other work has shown that better displays or other parameters can create more recognizable SSVEPs and similar VEPs (Cheng et al. 2002; Wang et al. 2006; Allison et al. 2008; Bin et al. 2009).

Option 4a could also entail configuring a BCI to rely more heavily on the signals within a BCI approach that are easiest to categorize. This option has been explored with some BCIs that rely on imagination of different conventional mental tasks. For example, Millán and Mouriño (2003) first explored which of six mental tasks yielded the most discriminable EEG signals for each subject, and then configured a BCI system to control a robot using the three tasks that yielded the clearest signals.
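A hypothetical sketch of this task-selection step follows (it assumes feature vectors have already been extracted per trial and is not the procedure of Millán and Mouriño): every pair of candidate mental tasks is scored by cross-validated classification accuracy, and the tasks appearing in the best-separated pairs are kept for online use.

```python
# Hypothetical sketch of ranking candidate mental tasks by discriminability.
from itertools import combinations
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

def rank_task_pairs(features_by_task):
    """features_by_task maps task name -> array of shape (n_trials, n_features).
    Returns task pairs sorted from most to least separable."""
    scores = {}
    for a, b in combinations(features_by_task, 2):
        X = np.vstack([features_by_task[a], features_by_task[b]])
        y = np.array([0] * len(features_by_task[a]) + [1] * len(features_by_task[b]))
        scores[(a, b)] = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```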

A similar solution might work with ERD BCIs. Consider a BCI that detects foot imagery very poorly, but reliably detects hand imagery. This BCI might be only 10% accurate if the user usually tries to communicate via foot imagery, but 100% accurate if the user only uses hand imagery. Such a BCI should thus be configured to rely more heavily on hand imagery. This solution would reduce errors, but not eliminate them, unless the BCI is configured to operate without foot movement imagery, which limits its alphabet. There might be other reasons why the BCI was designed to include foot imagery. For example, foot movement might seem more natural if the goal is to walk forward (Leeb et al. 2007; Scherer et al. 2008) or control vertically scrolling letters (Scherer et al. 2004). On the other hand, keyboards are highly unnatural interfaces, since moving fingers across a keyboard has little intuitive connection to the message being sent, or indeed any natural activity. Further research should explore the importance of a congruent, literal mapping between mental task and desired outcome.

Improved feedback could make subjects more motivated and involved (Neuper and Pfurtscheller in press). Subjects might find immersive virtual feedback more absorbing than conventional feedback (Leeb et al. 2007; Scherer et al. 2008; Faller et al. 2010). Subjects who are more motivated or engaged could produce clearer brain signals (Nijboer et al. 2008; Nijboer and Broermann in press).

Presenting feedback through different modalities could also result in clearer brain signals. While most BCIs rely on visual stimuli, BCIs have also been developed based on auditory (Kübler et al. 2009) and tactile (Müller-Putz et al. 2006) stimuli. In healthy subjects, visual stimuli usually produce clearer brain signals. However, subjects who have trouble seeing might attain better results with a BCI based on auditory or tactile modalities. It may seem easy to determine whether a user can see, but this is not always true. Subjects who are locked in and cannot communicate have no way to report that they have trouble seeing the visual stimuli used in a BCI. Hence, if a subject who cannot communicate seems illiterate with a BCI based on visual stimuli, experimenters should consider an auditory or tactile BCI.

Option 4b (clearer signals with a novel BCI approach) is receiving more attention. Possibilities such as auditory streaming, imagined music, phoneme imagination, or conventional mental tasks like math or singing have not been tested across many subjects. A previously unknown or underappreciated task will probably not lead to a BCI that works for all users. Hence, all of the options presented so far should reduce, but not eliminate, illiteracy.

Option 4c is very rarely considered: give up on the current BCI approach and try another one. Many labs and researchers focus on only one approach, and thus lack both the tools and cognitive flexibility to explore other options. This option hinges on our belief that there will always be a small minority of users who can never use a specific approach, even after any or all of the above options have been implemented. This prospect was recently explored in the first controlled study devoted to comparing different BCI approaches within subjects. Our team at TU Graz recently compared data from offline simulations of SSVEP vs. ERD BCIs, which suggested that some subjects who could not effectively use an SSVEP BCI could use an ERD BCI, and vice versa (Allison et al. 2010b).

We have also confirmed this result with online BCIs. Figures 3.1–3.3 present clear examples of two subjects who were literate with only one of these two BCI approaches. Subject A was literate with an ERD BCI, but not an SSVEP BCI. Subject B was literate with an SSVEP BCI, but not an ERD BCI.

Allison et al. (2010b) also introduced a potential hybrid BCI that combines two BCI approaches (SSVEP and ERD), which addresses option 4d. A hybrid BCI would ideally have an adaptive classifier that learns how to appropriately weight contributions from different signals. That is, with training, a hybrid BCI using signals X and Y would effectively become a BCI using signal X alone if signal Y were uninformative. If subjects could use both signals, then X and Y could be combined to increase the dimensionality of control or improve the accuracy/speed tradeoff. Subjects A and B were both literate with our hybrid ERD/SSVEP BCI.
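As an illustration only (this is not the fusion scheme of Allison et al. 2010b), the sketch below weights the per-class probabilities from an ERD classifier and an SSVEP classifier by how far each classifier’s calibration accuracy exceeds chance, so an uninformative signal contributes almost nothing to the combined decision.

```python
# Illustrative sketch of weighting two classifiers by calibration accuracy.
import numpy as np

def fuse(p_erd, p_ssvep, acc_erd, acc_ssvep, chance=0.5):
    """p_erd, p_ssvep: per-class probabilities for one trial from each classifier.
    acc_*: calibration accuracies; a signal at chance gets zero weight."""
    w_erd = max(acc_erd - chance, 0.0)
    w_ssvep = max(acc_ssvep - chance, 0.0)
    if w_erd + w_ssvep == 0.0:
        return int(np.argmax(np.asarray(p_erd) + np.asarray(p_ssvep)))  # unweighted fallback
    combined = w_erd * np.asarray(p_erd) + w_ssvep * np.asarray(p_ssvep)
    return int(np.argmax(combined))
```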

Options 4c and 4d, which both involve a different BCI approach, are underappreciated opportunities to provide communication for subjects who are not successful with the first approach they try. A new approach does not mean the subject cannot attain the same goals, such as spelling, moving a cursor, or controlling a robotic device. Major changes to display and feedback parameters may not be needed either. The subject must simply perform different mental tasks, such as paying attention to letters that flash instead of oscillate.

3.5 Predicting Illiteracy

There is currently no way to predict whether someone will be illiterate with a certain BCI approach. Illiteracy is only apparent after a subject tries to use a BCI. Researchers, carers, or others may also try to get the BCI working through some of the methods described above, and/or trial and error with additional BCI sessions. Therefore, considerable time and effort is necessary to diagnose illiteracy.

Additional research into BCI demographics might help identify factors that could predict whether someone will be proficient with a certain BCI approach, and could also help predict the best parameters for each user. Age, gender, personality traits, lifestyle and background, and other factors could help developers and other people find the best BCI for each user. People with a strong history of sports, dance, martial arts, or other movement oriented hobbies might perform better with BCIs based on imagined movement. People who play some types of computer games, or perform well on simple tests of visual attention, might perform better with BCIs that rely on visual attention.

Temporary factors like time of day, fatigue, or recent consumption of food, alcohol, caffeine, or drugs may be relevant. For example, Guger et al. (2009) found that people who reported less sleep the previous night performed better with P300 BCIs—a surprising finding that suggests a rather easy way to temporarily improve P300 BCI performance. Another study found that older subjects performed worse with SSVEP BCIs, but otherwise found no correlation between performance and many other factors (Allison et al. 2010a).

4 Towards Standardized Terms, Definitions, and Measurement Metrics

The term “BCI illiteracy” implies a connection between BCIs and language. BCI illiteracy is not limited to any alphabet of mental signals. That is, just as someone illiterate in German might be fluent in English, a person who cannot use an ERD BCI might communicate effectively through a SSVEP BCI. The graphemes or phonemes in written or spoken Japanese are incomprehensible to someone who only knows Arabic, and the mental tasks (also called “cognemes”) in ERD BCIs are useless in SSVEP BCIs (Allison and Pineda 2006).

Like conventional illiteracy, BCI illiteracy is essentially a problem of accuracy. An illiterate reader or listener is someone who cannot interpret text or speech accurately. Also like conventional illiteracy, BCI illiteracy is a problem of scale that depends on the likelihood of correct communication by chance. A conventional illiterate communicates through text with essentially 0% accuracy, since the likelihood of correctly guessing the right word is very low; natural language vocabularies typically have tens of thousands of options. A person who can understand half the common words in a natural language might be considered reasonably competent. However, a BCI that correctly interprets the user’s intended message only half the time is probably inadequate, since BCIs have smaller alphabets, perhaps as few as two elements.

This point underscores the first of many concerns with the term “BCI illiteracy”: there is no accepted literacy threshold. That is, there are no guidelines that specify which accuracy threshold must be crossed before a subject is considered literate. For example, among BCIs that allow two choices, different articles use different thresholds (e.g. Perelmouter and Birbaumer 2000; Guger et al. 2003; Allison et al. 2008; Kübler and Birbaumer 2008). We used a threshold of 70% in two recent articles involving tasks that simulated a two-choice BCI (Allison et al. 2010b; Brunner et al. 2010). Guger et al. (2003) was written before the term “BCI illiteracy” was coined, but refers to the 6.7% of subjects who attained less than 60% accuracy in a two-choice ERD task as “marginal.” The article assumes that the 93.3% of subjects who attained better performance would be effectively literate. Had a threshold of 70% been used instead, the number of “marginal” (i.e., illiterate) subjects would have increased to 48.7%. Therefore, fairly small changes in the threshold can dramatically affect the percentage of subjects who are deemed literate.

This threshold depends on the number of choices in the BCI’s alphabet, which is called N (Wolpaw et al. 2002). 65% accuracy is probably unacceptable in a BCI with two choices, but might be tolerable in a BCI with many more choices. However, regardless of N, there is no agreement on the best proficiency threshold. Sellers and Donchin (2006) criticized an earlier article for implying that a 36-choice BCI with almost 50% accuracy was a reasonable communication system. Only two of ten subjects in Friedrich et al. (2009) were considered illiterate by that paper’s first author (Friedrich, personal communication, April 2009), although six subjects attained accuracy below 50% in a four-choice task.
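The interaction between N and accuracy can be made concrete with the standard bits-per-selection formula from Wolpaw et al. (2002), computed in the short sketch below.

```python
# Bits per selection, B = log2(N) + P*log2(P) + (1 - P)*log2((1 - P)/(N - 1)),
# following Wolpaw et al. (2002).
import math

def bits_per_selection(n_choices, accuracy):
    """Information per selection, in bits."""
    p = accuracy
    if p >= 1.0:
        return math.log2(n_choices)
    if p <= 1.0 / n_choices:
        return 0.0  # at or below chance, no information by this measure
    return (math.log2(n_choices) + p * math.log2(p)
            + (1 - p) * math.log2((1 - p) / (n_choices - 1)))

# 65% accuracy yields only about 0.07 bits per selection with N = 2,
# but about 2.4 bits per selection with N = 36.
print(bits_per_selection(2, 0.65), bits_per_selection(36, 0.65))
```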

Furthermore, the true “chance level” also depends on the length of the message or sequence of commands (Müller-Putz et al. 2008). While it may seem that chance performance with a two-choice BCI is 50%, this is effectively true only with an infinite number of trials. The proficiency threshold should be higher if the user can only send one very short message.
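The following sketch illustrates why (in the spirit of, but not reproducing, Müller-Putz et al. 2008): under a binomial model of guessing, the accuracy needed to reject chance at p < 0.05 approaches the nominal chance level only as the number of trials grows.

```python
# Minimal sketch: smallest accuracy that is significantly above chance,
# assuming independent trials and a binomial model of guessing.
from scipy.stats import binom

def chance_threshold(n_trials, n_choices=2, alpha=0.05):
    """Smallest accuracy with P(at least that many correct | guessing) <= alpha."""
    p_chance = 1.0 / n_choices
    k = int(binom.ppf(1 - alpha, n_trials, p_chance)) + 1  # smallest significant count
    return k / n_trials

for n in (10, 40, 100, 1000):
    print(n, round(chance_threshold(n), 3))
# With only 10 two-choice trials, about 90% accuracy is needed to beat chance,
# whereas with 1000 trials roughly 53% suffices.
```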

Similarly, “BCI illiteracy” does not account for the possibility of improving accuracy by allowing more time for selections. In some cases, increasing the number of trials or the duration of each trial can improve accuracy, perhaps above the proficiency threshold. For example, P300 BCI articles often note that performance with single trials is typically below any reasonable proficiency threshold, but performance improves if data from many trials are averaged together (Farwell and Donchin 1988; Jing et al. 2010).

In summary, proficiency thresholds might not best be represented by a single number, but rather a formula that includes the number of choices, the number of trials, and the time allowed for each selection (Allison in press). Unfortunately, even after considering these factors, other challenges remain.

Some challenges with developing a standardized proficiency threshold are harder to address. A single formula cannot easily account for different types of errors, such as false positives or misses. Errors of omission or commission may be more or less confusing or frustrating for the user, designer, and/or listener. A proficiency threshold formula might be further complicated because some errors are more likely with certain signals, as discussed under option 4a above. Certain choices may be selected more often than others, which can complicate the standard formula for ITR (Wolpaw et al. 2002) and a standardized threshold approach.

A proficiency threshold is harder to determine with asynchronous BCI systems. In synchronous BCIs, the system determines when messages or commands must be sent, which makes it relatively easy to determine whether a user correctly sent a message within the allotted time. However, in asynchronous BCIs, users may communicate (or not) at their leisure (Millán and Mouriño 2003; Pfurtscheller et al. 2006). There might also be many different ways to accomplish a goal. For example, one user may navigate through a slalom course by turning after every step, while another might only turn once for each obstacle. Either solution would be correct. Any proficiency test for an asynchronous BCI should also ensure that a subject can avoid sending signals at certain times, which reflects effective communication of the “no control” state (Leeb et al. 2007; Scherer et al. 2008; Faller et al. 2010).

Further complicating the discussion, an “effective proficiency threshold” also depends on subjective factors. A subject who attains 69% accuracy with a two-choice system might be classified as illiterate, but could still communicate if persistent and patient. A different subject might consider a two-choice BCI useless if it does not provide at least 90% accuracy. That subject would be effectively illiterate, much like a decent French speaker who is so embarrassed by his accent, and/or by his periodic errors in French grammar, that he never speaks French. Other authors have noted that users may prefer a more accurate system over one that maximizes ITR (Kübler et al. 2001; Wolpaw et al. 2002; Allison et al. 2007).

Finally, illiteracy may vary within subjects with factors like time of day, mood, motivation, lighting, distraction, and testing environment. How should this be addressed? Can someone be literate in one setting, and illiterate in another?

4.1 The Relative Severity of Illiteracy

The discussion so far might suggest that “BCI illiteracy” is a fatal problem in BCI research. However, its severity should be considered in relation to other interfaces. Conventional interfaces are not universal either. Many millions of people cannot use keyboards, mice, cell phone keypads, and other conventional interfaces due to physical or other disability. This serious drawback has not prevented these interfaces from becoming mainstream communication tools. BCIs may also attain wider acceptance among disabled and healthy users even if they do not provide control for some people (Nijholt et al. 2008).

Similarly, ITR is a problematic way to compare BCIs, with many of the same problems as BCI illiteracy. For example, the formula for ITR does not account for types of errors, frequency of certain selections, subjective factors, preferences for higher accuracy over ITR, “extra time” such as the time between selections and breaks, and other issues. These concerns have been widely noted (e.g. Kübler et al. 2001; Wolpaw et al. 2002; Sellers and Donchin 2006; Allison et al. 2007), yet ITR is still widely used in BCI articles.

4.2 (Re) Defining “BCI Illiteracy”

In addition to the problems with measuring illiteracy, there is no widespread agreement on the term itself. The term “BCI illiteracy” is still quite new. Its first publication outside of a conference presentation was in Kübler and Müller (2007). The Berlin group used the term “BCI aphasia” in prior conference presentations. Other terms that might be used include proficiency, reliability, or universality. Authors have described subjects’ unacceptable performance as “bad” (Cheng et al. 2002), “marginal” (Guger et al. 2003), “low” (Allison et al. 2008) or “poor” (Leeb et al. 2007).

Extending the word “illiteracy” from natural languages to BCIs leads to intriguing comparisons, but can also be confusing. Since the word “illiteracy” refers to trouble reading or writing, it is unclear whether illiteracy results from the subject, classifier, or other factors. This distinction may be meaningful. As discussed above, different problems suggest different possible solutions.

“BCI illiteracy” implies that failure to use a BCI results from inadequate effort by the user, which is generally not true. Conventional illiteracy can typically be overcome by (for example) taking German classes. Hence, if someone cannot speak German, one might assume he is lazy, uninterested, or overly focused on other priorities (such as writing articles about BCIs). On the other hand, some subjects could never learn to use a particular BCI.

“Illiteracy” really reflects a problem connecting the different letters in an alphabet into meaningful communication. English, French, Spanish, Dutch, Flemish, Italian, and other languages have alphabets similar to the German alphabet, and native German speakers can recognize most letters in these other languages. Similarly, a native German speaker can produce most of the sounds used in these languages. However, proficiency with the alphabet is only a precursor to literacy with a natural language. With BCIs, the real challenge is mastering the alphabet—the basic signals that convey information. Combining these signals into a vocabulary of messages or commands is then straightforward. There may be some cases when an individual signal can convey meaning (Allison et al. 2007), just as “I” or “a” are letters that are also English words, but such cases are rare.

5 Summary

The rapid increase in BCI research has exposed a problem that remains underappreciated: BCI illiteracy. This problem exists across the three prominent BCI approaches (P300, SSVEP, and ERD) and across different implementations of these approaches in different labs. Many options to reduce illiteracy have been explored. While these have been somewhat successful, some subjects will be unable to use a particular BCI approach, and these subjects might only attain proficiency by switching to another approach. Although we focused on EEG BCIs, BCI illiterates might benefit from switching to another imaging approach, and many of the problems, solutions, and terminological issues discussed here could be extended to non-EEG BCIs.

Hence, the answer to the question “Can anyone use a BCI?” depends on the interpretation of the question. For a specific BCI system, the answer is probably no. A “universal BCI” is unlikely in the near future; at least a minority of subjects will not be proficient with any particular system. Fortunately, the answer becomes “probably” if the question is interpreted as: “Can anyone use at least one BCI?” It is unlikely that anyone would be unable to use all BCI approaches, so long as s/he is mentally capable of goal-directed action, receiving and understanding instructions and feedback, and forming messages or commands (Kübler and Birbaumer 2008). Therefore, while all the options presented above should be explored, more attention should be devoted to exploring different BCI approaches, especially hybrid BCIs, within subjects in real-world settings.

There are also many concerns with defining “BCI illiteracy”. Some of these problems are unique to the term itself, while other problems create challenges in establishing any standards to assess this phenomenon. Ultimately, standards need to be established through discussion among established BCI research groups. Widely agreed terms, definitions, and measurement metrics will help future developers, authors, carers, users, and others unequivocally identify how to distinguish effective communication from illiteracy.