1 Introduction

Brain–computer interfaces are a new technology that could help to restore useful function to people severely disabled by a wide variety of devastating neuromuscular disorders and to enhance functions in healthy individuals. The first demonstrations of brain–computer interface (BCI) technology occurred in the 1960s, when Grey Walter used the scalp-recorded electroencephalogram (EEG) to control a slide projector (1964) [1] and Eberhard Fetz taught monkeys to control a meter needle (and thereby earn food rewards) by changing the firing rate of a single cortical neuron [2, 3]. In the 1970s, Jacques Vidal developed a system that used the scalp-recorded visual evoked potential (VEP) over the visual cortex to determine the eye-gaze direction (i.e., the visual fixation point) in humans, and thus to determine the direction in which a person wanted to move a computer cursor [4, 5]. At that time, Vidal coined the term “brain–computer interface.” From then until the early 1990s, BCI research studies appeared only every few years. In 1980, Elbert et al. showed that people could learn to control slow cortical potentials (SCPs) in scalp-recorded EEG activity and could use that control to adjust the vertical position of a rocket image moving across a TV screen [6]. In 1988, Farwell and Donchin [7] reported that people could use scalp-recorded P300 event-related potentials (ERPs) to spell words on a computer screen. Wolpaw and his colleagues trained people to control the amplitude of mu and beta rhythms (i.e., sensorimotor rhythms) in the EEG recorded over the sensorimotor cortex and showed that the subjects could use this control to move a computer cursor rapidly and accurately in one or two dimensions [8, 9].

The pace and breadth of BCI research began to increase rapidly in the mid-1990s and this growth has continued almost exponentially into the present. The work over the past 20 years has included a broad range of studies in all the areas relevant to BCI research and development, including basic and applied neuroscience, biomedical engineering, materials engineering, electrical engineering, signal processing, computer science, assistive technology, clinical rehabilitation, and human factors engineering [10, 11]. A comprehensive book on BCI has also appeared recently and interested readers can find a detailed treatment in [12].

The central goal of BCI research and development is the realization of powerful new assistive communication and control technology for people severely disabled by neuromuscular disorders such as amyotrophic lateral sclerosis (ALS), stroke, spinal cord injury, cerebral palsy, multiple sclerosis, and muscular dystrophies. This emphasis has been encouraged and strengthened by increased societal appreciation of the needs of people with severe disabilities, as well as by greater realization of their ability to live enjoyable and productive lives if they can be provided with effective assistive technology. In addition, in recent years a number of investigators have begun to explore possibilities for developing BCIs for the general population. These include systems for enhancing or supplementing human performance in demanding tasks such as image analysis or continuous attention, as well as systems for expanding or enhancing media access, computer gaming, or artistic expression. Furthermore, BCI technology has recently begun to be explored as a means to assist in rehabilitation of people disabled by stroke and other acute events. This chapter provides an introduction to the underlying concepts and principles as well as the applications of BCIs.

2 BCI Definition and Structure

2.1 What is a BCI?

According to present understanding, the role of the central nervous system (CNS) is to respond to occurrences in the environment or in the body by producing appropriate outputs. The natural outputs of the CNS are either neuromuscular or hormonal. A brain–computer interface (BCI) gives the CNS new output that is not neuromuscular or hormonal. A BCI is a system that measures CNS activity and converts it into artificial output that replaces, restores, enhances, supplements, or improves natural CNS output and thereby changes the ongoing interactions between the CNS and its external or internal environment [13].

To understand this definition, one needs to understand each of its key terms, starting with CNS. The CNS is composed of the brain and the spinal cord and is differentiated from the peripheral nervous system (PNS), which is composed of the peripheral nerves and ganglia and the sensory receptors. The unique features of CNS structures are their location within the meningeal coverings (i.e., meninges), their distinctive cell types and histology, and their role in integrating the numerous different sensory inputs to produce effective motor outputs. In contrast, the PNS is not inside the meninges, does not have the unique CNS histology, and serves primarily to bring sensory inputs to the CNS and to carry motor outputs from it.

CNS activity comprises the electrophysiological, neurochemical, and metabolic phenomena (such as neuronal action potentials, synaptic potentials, neurotransmitter releases, and oxygen consumption) that occur continually in the CNS. These phenomena can be monitored by measuring electric or magnetic fields, hemoglobin oxygenation, or other parameters employing sensors on the scalp, on the surface of the brain, or within the brain. A BCI records brain signals, extracts particular measures (or features) from them, and converts (or translates) the features into new artificial outputs that act on the environment or on the body itself. Figure 2.1 shows the five kinds of applications that a BCI output might control. It illustrates each of these five kinds of application by showing one of many possible examples.

Fig. 2.1

Design and operation of a brain–computer interface (BCI) system. Signals produced by brain activity are recorded from the scalp, from the cortical surface, or from within the brain. These signals are analyzed to measure signal features (e.g., amplitudes of EEG rhythms or firing rates of individual neurons) that correlate with the user’s intent. These features are then translated into commands that control application devices that replace, restore, enhance, supplement, or improve natural CNS outputs (from [13] with permission)

A BCI output could replace natural output that has been lost to injury or disease. Thus, someone who cannot speak could use a BCI to spell words that are then spoken by a speech synthesizer. Or one who has lost limb control could use a BCI to operate a powered wheelchair.

A BCI output could restore lost natural output. Thus, someone with a spinal cord injury whose arms and hands are paralyzed could use a BCI to control stimulation of the paralyzed muscles with implanted electrodes so that the muscles move the limbs. Or one who has lost bladder function from multiple sclerosis could use a BCI to stimulate the peripheral nerves controlling the bladder so as to produce urination.

A BCI output could enhance natural CNS output. Thus, someone engaged in a task that needs continuous attention over a long time (e.g., driving a car or performing sentry duty) could employ a BCI to detect the brain activity preceding breaks in attention and then produce an output (such as a sound) that alerts the person and restores attention. By preventing the periodic attentional breaks that normally compromise natural CNS output, the BCI enhances the natural output.

A BCI output could supplement natural CNS output. Thus, someone controlling cursor position with a standard joystick might employ a BCI to choose items that the cursor reaches. Or a person could use a BCI to control a third (i.e., robotic) arm and hand. In these examples, the BCI supplements natural neuromuscular output with another, artificial output.

Lastly, a BCI output might possibly improve natural CNS output. For example, a person whose arm movements have been compromised by a stroke damaging sensorimotor cortex might employ a BCI that measures signals from the damaged areas and then excites muscles or controls an orthosis that improves arm movement. Because this BCI application enables the production of more normal movements, its continued use might induce activity-dependent CNS plasticity that improves the natural CNS output and thus helps to restore more normal arm control.

The first two kinds of BCI application, replacement or restoration of lost natural outputs, are the focus of most present-day BCI research and development. At the same time, the other three types of applications are drawing increasing attention.

The final part of the definition says that a BCI changes the ongoing interactions between the CNS and its external or internal environment. The CNS interacts constantly with the outside world and the body. These interactions comprise its outgoing motor outputs along with its incoming sensory inputs. By monitoring CNS activity and translating it into artificial outputs that act on the environment or the body, BCIs modify both CNS motor outputs and sensory inputs (i.e., feedback). Devices that only monitor brain activity and do not employ it to modify the continuing interactions of the CNS with its environment are not BCIs.

2.2 Alternative or Related Terms

BCIs are also called brain–machine interfaces or BMIs. The choice between these two synonymous terms is essentially a matter of personal preference. One reason for using BCI rather than BMI is that the word “machine” in BMI implies a fixed translation of brain signals into output commands, which does not match the reality that a computer and the brain are essentially partners in the interactive adaptive control that is required for successful BCI, or BMI, function.

The terms dependent BCI and independent BCI appeared in 2002 [10]. In accord with the definition of a BCI, both employ brain signals to control applications; however, they differ in how they depend on natural CNS output. A dependent BCI employs brain signals that depend on muscle activity. The BCI developed by Vidal [4, 5] used a VEP that depended on gaze direction and therefore on the muscles that controlled gaze. A dependent BCI is basically an alternative way to detect messages conveyed by natural CNS outputs. Thus, it does not give the brain a new output independent of natural outputs. Nevertheless, it can still be very useful (e.g., [14]).

Contrastingly, an independent BCI does not depend on natural CNS output; muscle activity is not needed to generate the crucial brain signals. Thus, in BCIs that measure EEG sensorimotor rhythms, the user typically employs mental imagery to change sensorimotor rhythms in order to produce the BCI output. For those who are severely disabled by neuromuscular disorders, independent BCIs are likely to be more effective.

The recent term hybrid BCI is used in two ways [15]. It can be applied to a BCI that employs two different types of brain signals (e.g., VEPs and sensorimotor rhythms) to produce its outputs. Or it can be applied to a system that combines a BCI output and a natural muscle-based output. In this second usage, the BCI output supplements a natural CNS output (as Fig. 2.1 illustrates).

2.3 The Components of a BCI

A BCI detects and measures features of brain signals that reveal the user’s intentions and translates these features in real time into commands that achieve the user’s intent (Fig. 2.1). In order to do this, a BCI system has four components: 1) signal acquisition; 2) feature extraction; 3) feature translation; and 4) device output commands. A BCI also has an operating protocol that specifies how the onset and timing of operation are controlled, how the feature translation process is parameterized, the nature of the commands that the BCI produces, and how errors in translation are handled. A successful operating protocol enables the BCI system to be flexible and to serve the particular needs of each of its users.

The signal acquisition component measures brain signals using a particular kind of sensor (e.g., scalp or intracranial electrodes for electrophysiological activity, functional magnetic resonance imaging for hemodynamic activity). It amplifies the signals to enable subsequent processing, and it may also filter them to remove noise such as 60-Hz (or 50-Hz) power line interference. The amplified signals are digitized and transmitted to a computer.
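The power line filtering mentioned above can be illustrated with a small sketch. It assumes a second-order IIR notch (a standard "biquad" design), a hypothetical 250-Hz sampling rate, and a synthetic signal made of a 10-Hz rhythm plus 60-Hz interference; none of these choices comes from the chapter, and real acquisition systems may filter in hardware or with different designs.

```python
import math

def notch_coefficients(f0, fs, q=30.0):
    """Second-order IIR notch centered at f0 Hz (normalized so a[0] == 1)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a

def apply_biquad(x, b, a):
    """Run a direct-form I biquad over the samples in x."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def tone_amplitude(x, f, fs):
    """Amplitude of the sinusoid at f Hz, estimated by correlation."""
    n = len(x)
    c = sum(v * math.cos(2.0 * math.pi * f * i / fs) for i, v in enumerate(x))
    s = sum(v * math.sin(2.0 * math.pi * f * i / fs) for i, v in enumerate(x))
    return 2.0 * math.hypot(c, s) / n

fs = 250.0  # assumed sampling rate in Hz
signal = [math.sin(2.0 * math.pi * 10.0 * i / fs)
          + 0.5 * math.sin(2.0 * math.pi * 60.0 * i / fs)
          for i in range(1000)]
b, a = notch_coefficients(60.0, fs)
filtered = apply_biquad(signal, b, a)
tail = filtered[500:]  # skip the filter's start-up transient
print(tone_amplitude(tail, 10.0, fs), tone_amplitude(tail, 60.0, fs))
```

In this sketch the 10-Hz component passes almost unchanged while the 60-Hz component is strongly attenuated once the start-up transient has decayed. For 50-Hz mains, the same function would be called with `f0=50.0`.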

The feature extraction component analyzes the digitized signals to isolate signal features (e.g., power in specific EEG frequency bands or firing rates of individual cortical neurons) and expresses them in a compact form suitable for translation into output commands. Effective features need to have strong correlations with the user’s intent. Since much of the most relevant (i.e., most strongly correlated) brain activity is transient or oscillatory, the signal features most commonly extracted by present-day BCIs are EEG or ECoG response amplitudes, power in particular EEG or ECoG frequency bands, or firing rates of single cortical neurons. To ensure accurate measurement of the chosen signal features, artifacts such as electromyogram (EMG) from cranial muscles need to be avoided or eliminated.
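As one illustration of a frequency-band feature, the sketch below estimates the power of a digitized signal in a given band with a plain discrete Fourier transform. The band edges (8–12 Hz and 18–26 Hz), the 250-Hz sampling rate, and the synthetic test signal are assumptions chosen for the example; practical BCIs often use other spectral estimators, windowing, and spatial filtering.

```python
import cmath
import math

def band_power(x, fs, f_lo, f_hi):
    """One-sided power of x in [f_lo, f_hi] Hz from a direct DFT.

    Each sinusoid of amplitude A that falls exactly on a DFT bin
    contributes A**2 / 2 (its mean-square value).
    """
    n = len(x)
    total = 0.0
    for k in range(1, n // 2):  # skip DC and Nyquist
        f = k * fs / n
        if f_lo <= f <= f_hi:
            xk = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                     for i, v in enumerate(x))
            total += 2.0 * abs(xk) ** 2 / n ** 2
    return total

fs = 250.0  # assumed sampling rate in Hz
# Synthetic one-second window: a 10-Hz rhythm (amplitude 2) plus a 22-Hz rhythm (amplitude 1).
window = [2.0 * math.sin(2.0 * math.pi * 10.0 * i / fs)
          + 1.0 * math.sin(2.0 * math.pi * 22.0 * i / fs)
          for i in range(250)]
mu_power = band_power(window, fs, 8.0, 12.0)
beta_power = band_power(window, fs, 18.0, 26.0)
print(mu_power, beta_power)
```

The two numbers form a compact feature vector of the kind that could be handed to a translation algorithm.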

The signal features are provided to the feature translation algorithm, which converts them into commands for the output device, that is, into commands that achieve the user’s intent. Thus, a decrease in power in a specific EEG frequency band might be translated into an upward displacement of a computer cursor, or a particular evoked potential measure might be translated into the selection of a letter to be added to a document being composed. The translation algorithm should be able to accommodate and adapt to spontaneous or learned changes in the user’s signal features in order to ensure that the user’s possible range of feature values covers the full range of device control and also to make control as effective and efficient as possible.
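A minimal sketch of such an adaptive translation follows, assuming a single feature (for example, band power) that is normalized by running estimates of its mean and variance and then mapped linearly to a vertical cursor velocity. The class name, the exponential update rule, and all parameter values are hypothetical; the chapter does not specify a particular algorithm.

```python
import math

class AdaptiveLinearTranslator:
    """Maps one signal feature to a signed 1-D cursor velocity command."""

    def __init__(self, gain=1.0, rate=0.05, init_mean=0.0, init_var=1.0):
        self.gain = gain   # velocity per standard deviation of the feature
        self.rate = rate   # adaptation rate of the running statistics
        self.mean = init_mean
        self.var = init_var

    def translate(self, feature):
        # Normalize against the current estimate of the user's feature range.
        z = (feature - self.mean) / math.sqrt(self.var)
        # A decrease in the feature produces an upward (positive) command.
        command = -self.gain * z
        # Then track slow changes in the user's signals.
        delta = feature - self.mean
        self.mean += self.rate * delta
        self.var = max((1.0 - self.rate) * self.var
                       + self.rate * delta * delta, 1e-12)
        return command

translator = AdaptiveLinearTranslator(init_mean=10.0, init_var=4.0)
first_command = translator.translate(8.0)  # one standard deviation below the mean

# The user's feature values drift to a new range around 20.
for i in range(200):
    translator.translate(19.0 if i % 2 == 0 else 21.0)
drifted_command = translator.translate(18.0)
print(first_command, drifted_command)
```

Because the statistics follow the drift, a value below the new mean still yields an upward command, so the user's current range of feature values continues to span the range of device control.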

The commands that the feature translation algorithm produces are the output of the BCI. They go to the application and there produce functions such as letter selection, cursor control, robotic arm operation, wheelchair movement, etc. The operation of the device provides feedback for the user, and thereby closes the control loop.

2.4 The Unique Challenge of BCI Research and Development

As noted earlier, the natural CNS function is to produce muscular and hormonal outputs that act on the outside world or the body. BCIs give the CNS entirely new artificial outputs derived from brain signals. In essence, they ask the CNS, which has evolved to produce muscular and hormonal outputs, to produce entirely new kinds of outputs. Thus, for example, the sensorimotor cortical areas, which normally act in combination with subcortical and spinal areas to control muscles, are now required instead to control specific brain signals (such as neuronal firing patterns or EEG rhythms). The fundamental implications of this requirement become evident when BCI use is considered in terms of two basic principles that govern how the CNS produces its natural outputs.

First, the task of producing natural outputs is distributed throughout the CNS, from the cerebral cortex to the spinal cord. No one area is entirely responsible for a natural output. Actions such as speaking, walking, or playing the piano are produced by the integrated activity of cortical areas, basal ganglia, thalamic nuclei, cerebellum, brainstem nuclei, and spinal cord interneurons and motoneurons. Thus, while the cortex usually initiates walking and monitors its course, the rhythmic rapid sensorimotor interactions that underlie effective walking are handled primarily by circuits in the spinal cord [16–18]. The final result of this highly distributed CNS activity is the proper excitation of the spinal (or brainstem) motoneurons that activate muscles and thereby produce actions. In addition, while activity in the different CNS areas that are participating generally correlates with the action, the activity in a particular area may vary considerably from one performance of the action to the next. At the same time, the coordinated activity in the many areas involved ensures that the action itself is stable.

Second, natural CNS outputs (such as speaking, walking, or playing a musical instrument) are acquired initially and maintained in the long term by adaptive changes in the many CNS areas that contribute to them. Throughout life, CNS neurons and synapses change continually to master new skills and to maintain those already learned (e.g., [19–23]). Referred to as activity-dependent plasticity, this continuing change underlies the acquisition and preservation of both common skills (e.g., walking and talking) and special skills (e.g., athletics, singing), and it is guided by its results. For example, as muscle strength and body size and weight change during life, CNS areas change appropriately to maintain these skills. In addition, the basic CNS features (that is, its anatomy, physiology, and plasticity mechanisms) that support this ongoing adaptation are the results of evolution shaped by the need to produce appropriate muscle-based actions.

Given these two principles, namely that numerous CNS areas participate in natural outputs and that adaptive plasticity occurs continually in all these areas, BCI use presents a unique challenge for the CNS, which has evolved and is continually adapting to optimize its natural outputs. In contrast to natural CNS outputs, which are produced by spinal motoneurons and the muscles they control, BCI-based CNS outputs are produced by signals reflecting activity in another CNS area, such as the motor cortex. Activity in the motor cortex is normally one of multiple contributors to natural CNS output. But when its signals control a BCI, this activity becomes the CNS output. In sum, the cortex is given the role normally performed by spinal motoneurons, that is, it produces the final product, the output, of the CNS. How well the cortex performs this new unnatural role depends on how effectively the multiple CNS areas that normally combine to control spinal motoneurons (which are downstream in natural CNS function) can instead adapt to control the relevant cortical neurons and synapses (which are largely upstream in natural CNS function).

The available evidence indicates that the adaptations needed to control activity in the CNS areas that produce the signals used by BCIs are possible but as yet very imperfect. As a rule, BCI outputs are much less smooth, rapid, and accurate than natural muscle-based CNS outputs, and their moment-to-moment and day-to-day variability is disturbingly high. These problems (especially poor reliability) and the different approaches to solving them represent major challenges in BCI research.

2.5 BCI Operation Depends on the Interaction of Two Adaptive Controllers

Muscle-based CNS outputs are optimized to serve the goals of the organism, and the adaptation responsible for this optimization takes place mainly in the CNS. In contrast, BCI outputs can be optimized by adaptations in the CNS and/or in the BCI itself. Thus, a BCI may adapt to the amplitudes, frequencies, and other basic characteristics of the user’s brain signals; it may adapt to improve the fidelity with which its output commands match the user’s intentions; and it may adapt to improve the effectiveness of CNS adaptations and perhaps to guide the CNS adaptive processes.

In sum, a BCI introduces a second adaptive controller that can also change to ensure that the user’s goals are achieved. Thus, BCI usage requires successful interaction between two adaptive controllers, the user’s CNS and the BCI. The management of the complex interactions between the concurrent adaptations of CNS and BCI is one of the most difficult problems in BCI research.
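The interaction of two adaptive controllers can be made concrete with a deliberately simple toy model. It is an illustrative assumption, not a model taken from the BCI literature: the user's CNS is reduced to one gain that sets how strongly a feature is modulated, the BCI is reduced to one decoder gain, and both are adjusted by gradient steps on the same squared output error.

```python
def simulate_coadaptation(steps=500, bci_rate=0.1, cns_rate=0.1):
    """Toy co-adaptation: user and decoder both learn from the same feedback."""
    user_gain = 0.5      # CNS side: how strongly the user modulates the feature
    decoder_gain = 0.2   # BCI side: how the feature is scaled into output
    errors = []
    for step in range(steps):
        target = 1.0 if step % 2 == 0 else -1.0   # intended output
        feature = user_gain * target              # brain signal feature
        output = decoder_gain * feature           # BCI output
        error = target - output
        errors.append(abs(error))
        # Gradient steps on 0.5 * error**2, computed from pre-update values.
        decoder_step = bci_rate * error * feature
        user_step = cns_rate * error * decoder_gain * target
        decoder_gain += decoder_step
        user_gain += user_step
    return user_gain, decoder_gain, errors

user_gain, decoder_gain, errors = simulate_coadaptation()
print(errors[0], errors[-1], user_gain, decoder_gain)
```

In this toy setting the error shrinks because both gains move, and the final division of labor between user and decoder depends on the two learning rates. Real CNS and BCI adaptation is far more complex, which is why managing their interaction is described above as a difficult problem.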

2.6 Choosing Signals and Brain Areas for BCIs

Brain signals acquired by a number of different electrophysiological and hemodynamic methods can be used as BCI inputs. These signals differ greatly in topographical resolution, frequency content, area of origin, and technical needs. The major electrophysiological methods are illustrated in Fig. 2.2. They range from EEG with its centimeter resolution, to the electrocorticogram (ECoG) with its millimeter resolution, to neuronal action potentials with their tens-of-microns resolution. Each of these electrophysiological methods has been used by BCIs and deserves continued evaluation, as do the hemodynamic methods such as functional magnetic resonance imaging (fMRI) and functional near-infrared imaging. Each has distinctive advantages and disadvantages. Which methods are best for which purposes is not known as yet.

Fig. 2.2

Recording locations for electrophysiological brain signals used by BCI systems. EEG is recorded by scalp electrodes. ECoG is recorded by cortical surface electrodes. Neuronal action potentials (spikes) or local field potentials (LFPs) are recorded by microelectrode arrays inserted in the cortex (or in other brain areas). A few large cortical pyramidal neurons are indicated (from [13] with permission)

The role of neuronal action potentials (spikes) as basic units of communication between neurons suggests that spikes recorded from many neurons could provide multiple degrees of freedom and might therefore be the optimum signals for BCIs to employ. In addition, the clear relationships between cortical neuronal activity and normal motor control provide logical starting points for BCI-based control of applications such as robotic arms. On the other hand, the importance of CNS adaptation for all BCIs and the evidence that appropriate training can elicit multiple degrees of freedom from even EEG signals suggest that the difference between the BCI performance possible with single neurons and that possible with EEG or ECoG may not be nearly as large as the difference in their respective topographical resolutions.

The most important point is that questions about signal selection are empirical questions that can be answered only by experimental evidence, not by a priori assumptions about the fundamental superiority of one kind of signal or another. For BCI usage, the crucial issue is which signals can best indicate the user’s intent, that is, which signals are the best language for communicating to the BCI the output that the user wants. This question can be answered only by actual results.

The choice of the optimum brain areas from which to obtain the signals is also an empirical question. The work to date has focused largely on signals from sensorimotor and visual areas of cortex. The BCI capacities of signals from other cortical or subcortical areas are just beginning to be investigated. This is an important aspect of BCI research, particularly because the sensorimotor cortices of many possible BCI users have been compromised by disease or injury, and/or their vision may be impaired. Different brain areas may differ in their adaptive capabilities and in other factors that could affect their capacity to function as the sources of BCI output commands.

3 Signal Acquisition

As discussed earlier, translation of intent into action is dependent on expression of the intent in the form of a measurable signal. Proper acquisition of this signal is important for the functioning of any BCI. The goal of signal acquisition methods is to detect the voluntary neural activity generated by the user, whether the signals are acquired invasively or noninvasively. Each method of signal acquisition is associated with an inherent spatial and temporal signal resolution. Choice of the appropriate method to use in a particular circumstance depends on striking a balance between the feasibility of acquiring the signal in the operating environment and the resolution required for proper translation.

3.1 Invasive Techniques

Invasive acquisition of brain signals for use in BCIs is primarily accomplished by electrophysiologic recordings from electrodes that are neurosurgically implanted either inside the user’s brain or over the surface of the brain. The motor cortex has been the preferred site for implanting electrodes since it is more easily accessible and has large pyramidal cells, which produce measurable signals that can be generated through simple tasks such as actual or imaginary motor movements. Other brain areas such as the supplementary motor cortex, parietal cortex, and subcortical motor areas can also serve as candidate sites for electrode implantation. Information from complementary imaging techniques such as functional Magnetic Resonance Imaging (fMRI) can help determine potential target areas for a specific subject [24]. fMRI measurement of the blood-oxygenation level dependent (BOLD) response has facilitated determination of cortical areas useful for recording of brain activity and has also been shown to provide reliable BCI control across several cortical areas using different cognitive tasks [26].

3.1.1 Intracortical

With chronic recording using implanted microelectrode arrays, the key factors for successful recording are the spatial/temporal resolution of the desired signal, the number and placement of electrodes, and the functional lifetime of the device. A growing number of electrode technologies have been developed to meet these requirements. Among them, wire bundles and electrode arrays are the simpler technologies and are the most widely used implantable electrodes [27]. A few examples of implantable electrodes are shown in Fig. 2.3. These devices have enabled multichannel parallel recording of single-neuron activity and local field potentials. To date, these implanted electrode systems have mostly been used in BCIs in primates [28–31]. Such studies have shown that monkeys are able to modulate small groups of neurons to achieve one-, two-, or three-dimensional control of robotic arms (e.g., [32, 33]). There are also some studies of BCI use of implanted electrodes in a few patients with tetraplegia (e.g., [34, 35]).

Fig. 2.3

A survey of implantable electrodes. (a) Photograph of a four-shank silicon neural probe having four electrode sites arranged near the tip, each terminated in a bond pad at the tab (NeuroNexus Technologies). (b) High-magnification photographs illustrating four different types of site layouts for specialized interfaces (NeuroNexus Technologies). (c) Photograph of a modular 128-site, three-dimensional array made from several multishank planar arrays (NeuroNexus Technologies). (d) Open-architecture probe designs to improve tissue integration. (e) High-density recording of unit activity in rat neocortex. The placement of an eight-shank silicon device in layer V is overlaid on recordings color coded for different electrodes. Note the presence of spikes on several sites of the same shank and lack of the same spikes across the different shanks, indicating that electrodes placed ≥200 μm apart laterally record from different cell populations. (f) Demonstration of functional connectivity in the cortex of behaving animals. A small network of pyramidal cells (red triangles) and putative inhibitory interneurons (blue circles) in layer V of the prefrontal cortex of the rat are mapped (from [27] with permission)

In a notable experiment by Schwartz’s research group, monkeys controlled a multi-jointed prosthetic arm for direct real-time interaction with the physical environment using neuronal action potentials recorded by an implanted microelectrode array [33]. This study used mathematical models to derive four-dimensional control (velocity of the endpoint in a three-dimensional coordinate frame, and aperture velocity between gripper fingers) from the parallel streams of neuronal activity recorded with the implanted electrodes. In this study, the monkeys fed themselves by controlling a robotic arm. The monkeys could interact with physical objects at a near-natural level via multi-degree-of-freedom control of the prosthetic device. Figure 2.4 gives a more detailed description of this study [33].
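To give a sense of how parallel firing rates can be turned into a four-dimensional velocity command, the sketch below uses a generic linear decoder. It is not the model used in [33]; the weights, baseline rates, three-neuron population, and 30-ms update interval are all hypothetical values chosen only to make the arithmetic easy to follow.

```python
def decode_velocity(rates, baselines, weights):
    """Linear map from firing rates (spikes/s) to [vx, vy, vz, aperture velocity]."""
    centered = [r - b for r, b in zip(rates, baselines)]
    return [sum(w * c for w, c in zip(row, centered)) for row in weights]

def integrate(state, velocity, dt):
    """Advance the [x, y, z, aperture] state by one time step of dt seconds."""
    return [s + v * dt for s, v in zip(state, velocity)]

# Hypothetical decoder: one weight row per controlled dimension,
# one column per recorded neuron.
weights = [
    [0.01, 0.00, 0.00],   # x velocity
    [0.00, 0.02, 0.00],   # y velocity
    [0.00, 0.00, 0.01],   # z velocity
    [0.00, 0.00, -0.02],  # gripper aperture velocity
]
baselines = [20.0, 10.0, 20.0]   # assumed resting firing rates
rates = [30.0, 10.0, 25.0]       # firing rates in the current time bin

velocity = decode_velocity(rates, baselines, weights)
state = integrate([0.0, 0.0, 0.0, 0.5], velocity, dt=0.03)
print(velocity, state)
```

In a real system the weights would be fitted to each animal's or person's recorded activity and refitted as that activity changes.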

Fig. 2.4

(a) Embodied control setup. Each monkey had its arms restrained (inserted up to the elbow in horizontal tubes, shown at bottom of image) and a prosthetic arm positioned next to its shoulder. Spiking activity was processed (boxes at top right) and used to control the three-dimensional arm velocity and the gripper aperture velocity in real time. Food targets were presented (top left) at arbitrary positions. (b) Timeline of trial periods during the continuous self-feeding task. Each trial started with presentation of a food piece, and a successful trial ended with the monkey unloading (UL) the food from the gripper into its mouth. Owing to the continuous nature of the task, there were no clear boundaries between the task periods (from [33] with permission, © 2008, Nature)

The major advantage of invasive techniques is their high spatial and temporal resolution: recordings can be made from individual neurons at very high sampling rates. Since intracranially recorded single-unit (i.e., single-neuron) activity can provide more information and may allow faster responses, the requirements for training and attention might be lower than those for noninvasive methods. However, several important challenges need to be met for the success of intracortical recording in BCI technology. The first issue is the long-term stability and reliability of the signal acquisition over the days, months, and years that a person would expect to be able to use an implanted device. Recording single-unit activity with implanted electrodes requires microelectrodes with roughly 20-μm diameter tips that penetrate several mm into the brain parenchyma. Action potentials are detectable no more than 200 μm lateral to the dendritic arbor of the pyramidal cells. Given these relatively small dimensions, stability can be an issue for long-term recording of single-unit activity. For success as a BCI method, implanted electrodes must allow the user to generate the BCI control signal reliably without the need for frequent re-tuning. The second issue is the quality of the signal over long time periods. As illustrated in Fig. 2.5, tissue in the region surrounding the implanted intracranial electrode reacts after the insertion of electrodes. The reaction includes damage to the local tissue and irritation at the electrode–tissue surface, leading to migration of microglia and astrocytes toward the implant site [36]. The long-term effects may be neuronal cell death and increased tissue resistance, resulting in signal variability. There is also the possibility of infection due to the surgical implantation. Finally, if the device includes a neuroprosthesis that requires a stimulus to activate the disabled limb, this additional stimulus might also produce a significant recurrent effect on the neural circuits that might interfere with the signal of interest. In such cases, BCI systems must be able to accurately detect and adapt to such effects.

Fig. 2.5

Cartoons showing the acute and chronic tissue responses following device insertion. The acute response (a) is characterized by vasculature damage, neuronal injury, plasma protein adsorption, recruitment of activated microglia, and a broad region of reactive astrocytes around inserted devices. The chronic response (b) is characterized by a condensed sheath of cells primarily composed of activated microglia and reactive astrocytes around the insertion sites. Degeneration of neuronal processes and additional neuronal loss may also be seen (from [36], with permission)

It will also be necessary to develop better understanding of the principles by which neural ensembles encode sensory, motor, and cognitive information. In the case of motor control, for instance, the neuroscience of hand orientation is actively being studied and this can be expected to further increase the degrees of freedom for prosthetic control [37, 38]. Neural encoding models for dexterous movements, such as individual finger movement, are being investigated for development of more sophisticated control [39]. In closed-loop control designs such as BCIs, further work will be needed to develop models that elucidate learning-related changes in the neural networks so that artificial devices can incorporate such models for more robust control [40].

Although there are significant research applications for advanced neural interface technologies, the ultimate goal of these technologies is to restore lost function to people with disabilities and to gain better understanding of human nervous system function. Studies using invasive recording techniques in human subjects have thus far been limited. Only a few severely disabled people have been implanted with electrodes. Donoghue and his colleagues reported that a 96-microelectrode array was implanted in the primary motor cortex of a person with tetraplegia three years after spinal cord injury [34]. This person was able to gain neural control of a prosthetic device with up to two degrees of freedom. Further advancements in microelectrodes, however, are required to reliably obtain stable recordings over the long term (the longest duration reported so far is three years [41]). In addition to the areas mentioned earlier, research is focusing on minimizing the number of cells needed for useful control and on providing concurrent sensory feedback to the nervous system via electrical stimulation. For a comprehensive review of neurorobotic research using invasive techniques, see Chapter 3 in this book.

3.1.2 Cortical surface

A less invasive approach, though still requiring surgical implantation of the recording device, is electrocorticography (ECoG). This technique, in which an electrode array is implanted subdurally over cortex, has been used mainly in preparation for surgery in people with epilepsy. As is the case for EEG recording, this technique takes advantage of the fact that most large cortical neurons are orientated perpendicular to the cortical surface and that locally synchronized activity within a cortical column can sum to yield a detectable signal. Subdural electrodes are closer to neuronal structures in superficial cortical layers than EEG electrodes placed on the scalp and therefore the signals that they record have higher amplitude (as well as a broader frequency bandwidth). Whereas scalp electrode recordings represent synchronized activity from a large number of neurons and synapses over extended regions of cortex [42], subdural recordings are sensitive to smaller sources of synchronized neuronal activity. Subdural recordings also have a higher signal-to-noise ratio than scalp recordings and have increased ability to record and study gamma activity (i.e., activity >30 Hz). Since gamma activity has been shown to be well correlated with the surrounding single-unit activity recorded by penetrating microelectrodes [43], ECoG can yield an effective representation of the underlying cortical electrical activity with less invasiveness and more stability than penetrating microelectrodes, albeit still invasive.

The standard clinical electrodes used for ECoG monitoring in epilepsy patients typically have diameters on the order of a few millimeters. Although finer than scalp electrodes, this dimension is still much larger than that of a typical cortical column. Therefore, most studies involving subdural ECoG use gross motor movements to determine tuning parameters. It was shown that overt movements as well as motor imagery are accompanied not only by relatively widespread mu and beta event-related desynchronization (ERD), but also by a more focused event-related synchronization (ERS) in the gamma frequency band [191]. In the first closed-loop ECoG-based BCI study, subjects quickly learned to modulate high-frequency gamma rhythms in motor cortical areas and in Broca’s speech area to control a one-dimensional computer cursor in real time [44]. Subsequent studies achieved two-dimensional control of a computer cursor using the upper arm region of motor cortex for one dimension and the hand region of motor cortex for the other dimension [45, 46]. Other investigators explored distinctly human traits such as speech and language processing that cannot be analyzed in an animal model and have had success using gamma activity from a speech network to control a cursor in one dimension [47]. The subjects used self-selected imagery to modulate gamma band activity at one or more specific electrodes. This represents a new approach in ECoG-based BCIs.

3.2 Noninvasive Techniques

There are many methods of measuring brain activity through noninvasive means. Noninvasive techniques reduce risk for users since they do not require surgery or permanent attachment to the device. Techniques such as positron emission tomography (PET), functional magnetic resonance imaging (fMRI), functional near-infrared spectroscopy (fNIRS), magnetoencephalography (MEG), and EEG have been used to measure brain activity noninvasively.

3.2.1 EEG

EEG is the most prevalent method of signal acquisition for BCIs. EEG recording has high temporal resolution: it is capable of measuring changes in brain activity that occur within a few msec [48]. The spatial resolution of EEG is not as good as that of implanted methods, but signals from up to 256 electrode sites can be measured at the same time [49]. EEG is easy to set up, portable, inexpensive, and has a rich literature of past performance. The practicality of EEG in the laboratory and the real-world setting is unsurpassed. EEG recording equipment is portable and the electrodes can be easily placed on the subject’s scalp by simply donning a cap. In addition, since EEG systems have been widely used in numerous fields since their inception more than 80 years ago, the methods and technology of signal acquisition with this modality have been standardized. Finally, and most important, the method is noninvasive.

Many EEG-based BCI systems use an electrode placement strategy based on the International 10/20 system as detailed in Fig. 2.6. For better spatial resolution, it is also common to use a variant of the 10/20 system that fills in the spaces between the electrodes of the 10/20 system with additional electrodes. Nevertheless, EEG-based BCI control with several degrees of freedom can be achieved with just a few electrodes (e.g., [51, 52]).

Fig. 2.6

Placement of electrodes for noninvasive signal acquisition using EEG. This standardized arrangement of electrodes over the scalp is known as the International 10/20 system and ensures ample coverage over all parts of the head. The exact positions for the electrodes are at the intersections of the lines calculated from measurements between standard skull landmarks. The letter at each electrode identifies the particular subcranial lobe (FP Prefrontal lobe, F Frontal lobe, T Temporal lobe, C Central lobe, P Parietal lobe, O Occipital lobe). The number or second letter identifies its hemispherical location (Z: denotes line zero and refers to an electrode placed along the cerebrum’s midline; even numbers represent the right hemisphere; odd numbers represent the left hemisphere; the numbers are in ascending order with increasing distance from the midline) (from [50] (web edition at http://butler.cc.tut.fi/~malmivuo/bem/bembook/in/in.htm), with permission)

Over the past few decades, EEG-based BCIs have been widely investigated in healthy human subjects, as well as in people with amyotrophic lateral sclerosis (ALS) and in those with severe CNS damage from spinal cord injuries and stroke resulting in substantial deficits in communication and motor function. By modulating their EEG signals, users were able to acquire control of two or three degrees of freedom and reach targets in a virtual three-dimensional space [53, 54], with accuracy comparable to that reported in studies using intracranial recordings [31]. An example of fine control was demonstrated by He’s research group [52, 54] in studies that showed that human subjects could fly a virtual helicopter to any point in a 3D virtual world using control of EEG signals recorded from the scalp (Fig. 2.7). In this study, subjects were given the opportunity for continuous multidimensional control to fully explore an unconstrained virtual 3D space; they learned to fly the helicopter to any target point in the 3D space.

Fig. 2.7

A diagrammatic representation of an EEG-based BCI system. Using a motor imagery paradigm, human subjects control the three-dimensional movement of a virtual helicopter. Raw EEG is temporally and spatially filtered to produce individualized control signal components. These components are weighted and digitized in a subject-specific manner and output to influence control in the virtual world (from [52], with permission)

Compared with invasive BCIs, EEG-based BCI methods have the advantages of no surgical risk, signal stability, and low cost. However, since EEG is the scalp manifestation of brain electrical activity recorded at a distance from its sources, it has a lower signal-to-noise ratio than many invasive methods. The spatial resolution of EEG is also reduced by the volume-conduction effect [55, 56]. Many noninvasive BCIs are based on classification of different mental states rather than decoding kinematic parameters as is typically done in invasive BCIs. Various mental strategies exploiting motor, sensory, and cognitive activity detectable by EEG have been used to build communication systems. In these systems, typically one mental state corresponds to one direction of control and four independent mental states are generally required for full two-dimensional control. Therefore, a substantial period of training is typically required for users to develop the skill to maintain and manipulate various mental states to enable the control. This can be quite demanding for users, especially disabled users [51, 191]. Other investigators attempted to directly decode the kinematic information related to movement or motor imagery and have reported success in revealing information about the (imagined) movement direction and speed from the spatiotemporal profiles of EEG signals [25, 57, 192]. In a closed-loop experiment by Bradberry et al. [58] using the direct decoding of kinematic information, subjects were able to attain two-dimensional control after a much shorter training period (~40 min) than that reported for other EEG-based two-dimensional BCIs.

It will also be important to develop better understanding of the mechanisms of information encoding in EEG signals. It has been demonstrated that detailed kinematic information, not simply gross mental states, is represented in the distributed EEG signals [25, 57, 59]. Interestingly, brain signals recorded on the scalp surface and those recorded intracranially reveal similar encoding models [28, 60], suggesting that knowledge gleaned from invasive BCIs could be transferred to the understanding of EEG-based BCI signals. This might further advance noninvasive BCI technology and thereby possibly achieve high degrees of control and reduce training requirements.

Source analysis has been widely used to estimate the sources of the brain activity that produces noninvasively recorded signals such as EEG (see Chapter 12 of this book for details). The rationale behind this approach is the linear relationship between current source strength and the voltage recorded at the scalp. Thus, one may estimate equivalent current density representations in regions of interest from noninvasive EEG or MEG recordings. He and colleagues proposed to use such EEG-based source signals to classify motor imagery states for BCI purposes [61]. Several groups have reported promising results from source analyses as compared to results from the scalp EEG data [24, 62–66].

The use of source estimation in BCI applications involves increased computational cost due to the need to solve the inverse problem. On the other hand, such source analysis transforms signals from sensor space back to source space and may lead to enhanced performance due to the use of a priori information in the source estimation procedure.
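The linear relationship between source strength and scalp voltage can be made concrete with a small numerical sketch. The example below is illustrative only: it assumes a random lead-field matrix (a real one would come from a head model), hypothetical sensor and source counts, and a Tikhonov-regularized minimum-norm estimate, which is one common way to solve the inverse problem and is not necessarily the method used in the cited studies.

```python
import numpy as np

rng = np.random.default_rng(0)

n_sensors, n_sources = 32, 200          # illustrative sizes, not from the text
# Hypothetical lead-field matrix; a real one is computed from a head model.
leadfield = rng.standard_normal((n_sensors, n_sources))

# Forward problem: one active source produces a pattern of scalp voltages.
true_sources = np.zeros(n_sources)
true_sources[42] = 1.0
scalp = leadfield @ true_sources + 0.01 * rng.standard_normal(n_sensors)

def minimum_norm_estimate(L, v, lam):
    """Tikhonov-regularized minimum-norm inverse: L^T (L L^T + lam I)^-1 v."""
    gram = L @ L.T + lam * np.eye(L.shape[0])
    return L.T @ np.linalg.solve(gram, v)

# Inverse problem: map the sensor-space signal back to source space.
estimate = minimum_norm_estimate(leadfield, scalp, lam=1e-2)
residual = np.linalg.norm(leadfield @ estimate - scalp) / np.linalg.norm(scalp)
print("relative sensor-space residual:", residual)
```

Because there are far more sources than sensors, the estimate is not unique; the regularization term plays the role of the a priori information mentioned above, and the linear solve is the extra computational cost per time sample.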

3.2.2 MEG

MEG measures the magnetic induction produced by electrical activity in neural cell assemblies. The magnetic signal outside of the head is on the order of a few femtoteslas, one part in 10^9 or 10^8 of the earth's geomagnetic field. Such tiny fields are currently detectable only using the SQUID (superconducting quantum interference device). In using this method, it is also necessary to provide shielding from external magnetic signals, including the earth's magnetic field. Thus, MEG recording requires a laboratory setting. A modern MEG system is equipped with an array of up to ~300 gradiometers evenly distributed in a helmet shape with an average distance between sensors of 1–2 cm.

MEG has similarities to EEG. MEG and EEG are, respectively, magnetic and electric fields produced by neuronal and synaptic activity. Both methods sense synchronized brain activity. MEG detects only the tangential components of a neural current source, whereas EEG is sensitive to both tangential and radial components. Importantly, like EEG, MEG is also a noninvasive recording technology. Studies using electrophysiological source imaging techniques have also located common cortical sources underlying the control provided by the EEG- and MEG-based BCIs [66, 67]. Meanwhile, other investigators reported that kinematic parameters are similarly represented in MEG and EEG recordings, since the key information is embedded in the lower frequency ranges [59]. Nonetheless, the high-frequency information in MEG signals is actively being investigated for neural encoding. Notably, it was found that in human subjects who are planning a reaching movement, the 70–90 Hz gamma-band activity originating from the medial aspect of the posterior parietal cortex (PPC) was synchronized and direction sensitive [68]. These results in human subjects are compatible with the functional organization of monkey PPC derived from intracranial recordings. From the viewpoint of BCI research, these findings may suggest new approaches for developing control signals utilizing such high-frequency components in MEG or in EEG as well [69].

An advantage of MEG over EEG is that magnetic fields are less distorted by the skull layer than are electric fields [70]. Thus, it was hoped that MEG would be able to reduce the training time or increase the reliability of BCIs. However, studies so far have shown that the performance and training times for EEG- and MEG-based BCIs are comparable [67, 71]. In addition, the instrumentation necessary for MEG is more sophisticated and more expensive than that for EEG and most importantly the current MEG recording device is not portable. These factors have tended to discourage BCI research using MEG recording.

3.2.3 fMRI

Functional magnetic resonance imaging or functional MRI (fMRI) [72–74] measures changes in the blood flow (i.e., the hemodynamic response) related to neural activity in the brain. It samples very large numbers of spatial locations spanning the whole brain and provides an ongoing stream of information from the many measurement points at the same time. Compared to prior methods for acquiring brain signals, fMRI therefore provides measurements that are highly distributed and highly parallel, on the order of millimeter resolution. For example, a modern MRI scanner can currently sample from ~2^16 spatial locations per second, each location (i.e., each voxel) with a dimension on the order of 3 × 3 × 3 mm. In fMRI, the same volume is sampled repeatedly at short, regular intervals (for example, once per second) using an imaging contrast, such as the blood-oxygen-level-dependent (BOLD) contrast [75], that is sensitive to the hemodynamic response. The intensities of BOLD contrast are related to the changes in the deoxyhemoglobin concentration in the brain tissue. When neurons are activated, increases in blood flow are associated with increases in local glucose metabolism and increases in local oxygen consumption. The changes in local deoxyhemoglobin concentration are reflected in the brightness of the MRI image voxels at each time point. It has also been reported that a strong colocalization of fMRI activation and electrophysiological sources exists during hand movement and motor imagery [66, 76]. fMRI is thought to be quite safe. It does not use an exogenous contrast agent. Typically, it does not involve any invasive procedure, injections, drugs, radioactive substances, or X-rays. It requires an instrument providing a strong external magnetic field and radio-frequency energy pulses.

fMRI images can be processed in real time as they are collected, an approach known as real-time fMRI (rtfMRI) [77], so that the resulting information is immediately available and can thus be used for feedback purposes. For example, the mental states inferred from the rtfMRI can be used to guide a person’s cognitive process or a clinician’s interventions in the case of psychiatric disorders. fMRI has high spatial resolution since the three-dimensional volume information is directly sampled for very small voxels and enables the detection of activity in all areas of the brain including some deep structures such as the amygdala. In contrast, EEG/MEG measurements near the surface of the head are made far from these locations and the spatial resolution for EEG/MEG source imaging of deep brain activity is quite limited at the present time.

On the other hand, an essential limit of rtfMRI or fMRI lies in its underlying mechanism: it measures changes in blood flow rather than neuronal activity. The technique is therefore inherently indirect and noisy. Most important, there is an intrinsic delay of several seconds in the response of fMRI, no matter how fast the images can be obtained. This means that the feedback given to a subject is delayed by several seconds. This could affect the usefulness of rtfMRI in many BCI applications.

3.2.4 NIRS

Functional near-infrared spectroscopy (fNIRS) is another noninvasive technique. It utilizes light in the near-infrared range (700 to 1,000 nm) to determine the oxygenation, blood flow, and metabolic status of localized cortical regions. It is similar to BOLD-fMRI in terms of the imaging contrast, that is, it measures the hemodynamic response. It can produce relatively well-localized signals with a spatial resolution on the order of centimeters and it provides information related to neural activity. However, since the images rely on the shallow-penetrating photons, NIRS operates effectively only for brain structures that are on or near the brain surface. NIRS is also inherently limited in its imaging contrast (i.e., hemodynamic responses) which results in a temporal resolution on the order of seconds and a delay of several seconds for feedback. Thus, in terms of information transfer rate, fNIRS-based BCIs are likely to be less effective than BCIs based on electromagnetic signals. Compared to fMRI, it stands as a compromise between imaging capability and practical usability (i.e., fNIRS is inexpensive and portable). Its flexibility of use, portability, and affordability make NIRS a viable alternative for clinical studies and possibly for practical use.

3.3 Neural Signals Used by BCIs

3.3.1 Sensorimotor Rhythms

Electromagnetic recording from the brain at rest exhibits endogenous oscillatory activity that is widespread across the entire brain. As shown in Fig. 2.8 this activity can be split into several bands. This spontaneous activity consists mainly of oscillations in the alpha-frequency band (8–13 Hz), which is called the mu rhythm when focused over the sensorimotor cortex and the visual alpha rhythm when focused over the visual cortex. This idling oscillation is thought to be caused by complex thalamocortical networks of neurons that create feedback loops. The synchronized firing of the neurons in these feedback loops generates observable oscillations. The frequency of oscillations decreases as the number of synchronized neurons increases. The underlying membrane properties of neurons, the dynamics of synaptic processes, the strength and complexity of connections in the neuronal network, and influences from multiple neurotransmitter systems also play a role in determining the oscillations.

Fig. 2.8

Different signal bands present in the EEG signal. The delta band ranges from 0.5 to 3 Hz and the theta band ranges from 4 to 7 Hz. Most BCI systems use components in the alpha band (8–13 Hz) and the beta band (14–30 Hz). The gamma band, which is just beginning to be applied in BCIs, is >30 Hz

Other oscillations detected over the sensorimotor cortex occur in the beta frequency band (14–30 Hz) and in the gamma band (>30 Hz). Together with the mu rhythm, these oscillations recorded over sensorimotor cortex are called sensorimotor rhythms (SMRs). They originate in sensorimotor cortex and change with motor and somatosensory function. These oscillations occur continually during “idling” or rest. During nonidling periods, however, these oscillations change in amplitude and/or frequency, and these changes are evident in the EEG or MEG. Task-related modulation in sensorimotor rhythms is usually manifested as amplitude decrease in the low-frequency components (alpha/beta band) (also known as event-related desynchronization (ERD) [78]). In contrast, an amplitude increase in a frequency band is known as event-related synchronization (ERS) [78]. For example, it has been found that the planning and execution of movement lead to predictable decreases in the alpha and beta frequency bands [78]. Also, as illustrated in Fig. 2.9, many studies have demonstrated that motor imagery can cause ERD (and often ERS) in primary sensorimotor areas [78, 80–85]. Such characteristic changes in EEG rhythms can be used to classify brain states relating to the planning/imagining of different types of limb movement. This is the basis of neural control in EEG-based BCIs. Studies have demonstrated that people can learn to increase and decrease sensorimotor rhythm amplitude over one hemisphere using motor imagery strategies, and thereby control physical or virtual devices (e.g., [8, 51–54, 66]).
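ERD and ERS are commonly quantified as the percentage change in band power during a task interval relative to a reference (rest) interval. The sketch below illustrates that calculation on a synthetic signal; the sampling rate, the 10 Hz test rhythm, the interval boundaries, and the helper name `band_power` are all assumptions made for illustration rather than values taken from the chapter.

```python
import numpy as np

fs = 250.0                                  # sampling rate in Hz (assumed)
t = np.arange(0, 4.0, 1.0 / fs)             # 4 s synthetic trial
rng = np.random.default_rng(1)

# Synthetic mu rhythm: 10 Hz, amplitude halves after t = 2 s (simulated ERD).
amplitude = np.where(t < 2.0, 2.0, 1.0)
eeg = amplitude * np.sin(2 * np.pi * 10.0 * t) + 0.2 * rng.standard_normal(t.size)

def band_power(x, fs, lo, hi):
    """Mean spectral power of x between lo and hi Hz (periodogram estimate)."""
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / x.size
    mask = (freqs >= lo) & (freqs <= hi)
    return psd[mask].mean()

baseline = band_power(eeg[t < 2.0], fs, 8.0, 13.0)    # reference interval
task = band_power(eeg[t >= 2.0], fs, 8.0, 13.0)       # activity interval

# Negative values indicate ERD (power decrease); positive values indicate ERS.
relative_change = 100.0 * (task - baseline) / baseline
print(f"mu-band power change: {relative_change:.1f}%")
```

Halving the amplitude reduces power to roughly a quarter, so the sketch reports a strongly negative change, which a BCI could compare across hemispheres or against a threshold to classify the imagined movement.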

Fig. 2.9

Event-related desynchronization (ERD) and event-related synchronization (ERS) phenomena before and after movement onset. ERD/ERS is a time-locked event-related potential (ERP) associated with sensory stimulation or mental imagery tasks. ERD is the result of a decrease in the synchronization of neurons, which causes a decrease of power in specific frequency bands; it can be identified by a decrease in signal amplitude. ERS is the result of an increase in the synchronization of neurons, which causes an increase of power in specific frequency bands; it can be identified by an increase in signal amplitude (from [79], with permission, © 2001 IEEE)

Fig. 2.10

Different slow cortical potential (SCP) signals to convey different intents. SCPs are caused by shifts in the dendritic depolarization levels of certain cortical neurons. They occur from 0.5 to 10 s after the onset of an internal event and are thus considered a slow cortical potential (from [89], with permission)

3.3.2 Slow cortical potentials

A completely different type of signal measured by EEG is the slow cortical potential (SCP) that is caused by shifts in the depolarization levels of pyramidal neurons in cortex (Fig. 2.10). Negative SCP generally reflects cortical activation, while positive SCP generally reflects reduced activation. People can learn to control SCPs and use them to operate a simple BCI [86, 87].

3.3.3 The P300 event-related potential

The P300 is an endogenous event-related potential (ERP) component in the EEG and occurs in the context of the “oddball paradigm” [88]. In this paradigm, users are subject to events that can be categorized into two distinct categories. Events in one of the two categories occur only rarely. The user is presented with a task that can be accomplished only by categorizing each event into one of the two categories. When an event from the rare category is presented, it elicits a P300 response in the EEG. As shown in Fig. 2.11, this is a large positive wave that occurs approximately 300 msec after event onset. The amplitude of the P300 component is inversely proportional to the frequency with which the rare event is presented. This ERP component is a natural response and thus especially useful in cases where either sufficient training time is not available or the user cannot be easily trained [90]. P300-based BCIs are the only BCIs in current daily use by severely disabled people in their homes (e.g., [91]).

Fig. 2.11

P300 ERP component. When the user sees objects randomly flashed on a screen, the P300 response occurs when the user sees the flash of the object the user is looking for (or wishes to select), while the flashes of the other objects do not elicit this response. The amplitude of the P300 component is inversely proportional to the rate at which the desired object is presented and occurs approximately 300 ms after the object is displayed. It is a natural response and requires no user training. (From [89], with permission)

3.3.4 Event-related potentials

Exogenous event-related potentials (ERPs) are responses that occur in the EEG at a fixed time after a particular visual, auditory, or somatosensory stimulus. The most common way to derive an ERP from an EEG recording is to align the signals according to the stimulus onset and then average them. The number of stimuli averaged typically ranges from a few (e.g., in BCI applications) to hundreds or thousands in other neuroscience research. ERPs are sometimes characterized as “exogenous” or “endogenous.” In general, exogenous ERPs are shorter latency and are determined almost entirely by the evoking stimulus, while endogenous ERPs are longer latency and are determined to a considerable extent by concurrent brain activity (e.g., the nature of the task in which the BCI user is engaged).
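The align-and-average procedure can be sketched in a few lines. The example uses synthetic data: the sampling rate, the one-per-second stimulus schedule, the epoch length, and the Gaussian-shaped deflection at 300 ms are assumptions chosen only to make the effect of averaging visible.

```python
import numpy as np

fs = 250                                    # sampling rate in Hz (assumed)
rng = np.random.default_rng(2)
n_samples = 60 * fs                         # one minute of synthetic EEG
eeg = rng.standard_normal(n_samples)        # background activity modeled as noise

# Hypothetical stimulus onsets, one per second, given as sample indices.
onsets = np.arange(fs, n_samples - fs, fs)

# Add a positive deflection peaking 300 ms after each stimulus onset.
epoch_len = 200                             # 800 ms at 250 Hz
time = np.arange(epoch_len) / fs
template = 1.0 * np.exp(-((time - 0.3) ** 2) / (2 * 0.05 ** 2))
for onset in onsets:
    eeg[onset:onset + epoch_len] += template

# Align epochs to stimulus onset, then average across repetitions.
epochs = np.stack([eeg[o:o + epoch_len] for o in onsets])
erp = epochs.mean(axis=0)

peak_latency_ms = 1000.0 * time[np.argmax(erp)]
print(f"{len(onsets)} epochs averaged, peak near {peak_latency_ms:.0f} ms")
```

In a single epoch the deflection is about the same size as the background noise; averaging reduces activity that is not phase locked to the stimulus by roughly the square root of the number of epochs, which is why more repetitions give a cleaner ERP at the cost of a slower BCI.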

ERPs are related to the ERD/ERS described earlier. ERPs reflect in large part activity in the ongoing EEG that is phase locked by the stimuli. Typically, after averaging, the ERP contains information about very low-frequency components (i.e., <1 Hz). Other components are cancelled out in the process of averaging across repetitions and the information above 1 Hz is poorly represented. An alternative way to characterize task-related EEG signals is to examine the rhythmic activity before averaging, in terms of power (ERD/ERS) or phase. This method does not require averaging and thus can be applied to single trials. Therefore, it is useful for BCI control (although it is still subject to the limitations of its signal-to-noise ratio).

The ERP most commonly used in BCIs is the visual evoked potential (VEP), which occurs in response to a visual stimulus. One frequently used VEP is the steady-state visual evoked potential (SSVEP). SSVEPs and other VEPs depend on the user’s gaze direction and thus require muscular control. To produce such signals, the user looks at one of several objects on a screen that flicker at different frequencies in the alpha or beta bands. Frequency analysis of the SSVEP shows a peak at the frequency of the object at which the user is looking. Thus, a BCI can use the frequency of this peak to determine which object the user wants to select (e.g., [92–94]).
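The selection rule described here, picking the target whose flicker frequency carries the most spectral power, can be sketched as follows. The target names, flicker frequencies, sampling rate, and window length are hypothetical, and the EEG is simulated as a sinusoid in noise.

```python
import numpy as np

fs = 250.0                                   # sampling rate in Hz (assumed)
t = np.arange(0, 4.0, 1.0 / fs)              # 4 s analysis window
rng = np.random.default_rng(3)

# Hypothetical flicker frequencies assigned to on-screen targets.
target_freqs = {"left": 8.0, "right": 10.0, "up": 12.0, "down": 15.0}

# Synthetic occipital EEG while the user attends the 12 Hz target.
eeg = np.sin(2 * np.pi * 12.0 * t) + rng.standard_normal(t.size)

# Windowed power spectrum of the recorded segment.
freqs = np.fft.rfftfreq(t.size, d=1.0 / fs)
power = np.abs(np.fft.rfft(eeg * np.hanning(t.size))) ** 2

def power_at(f, half_width=0.3):
    """Summed spectral power within +/- half_width Hz of frequency f."""
    return power[np.abs(freqs - f) <= half_width].sum()

# The target whose flicker frequency has the largest spectral peak is selected.
scores = {name: power_at(f) for name, f in target_freqs.items()}
selected = max(scores, key=scores.get)
print("selected target:", selected)
```

A practical system would usually also examine harmonics of each flicker frequency and combine several occipital channels; this sketch uses a single channel and the fundamental only.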

3.3.5 Spikes and local field potentials

Both spikes and local field potentials are acquired from microelectrodes implanted through invasive techniques. Spikes reflect the action potentials of individual neurons. Spike trains from several different individual neurons are shown in Fig. 2.12. Since the central nervous system appears to encode information in the firing rates of neurons, recording spiking activity may be extremely useful. It might potentially achieve multidimensional control for a BCI by using the individual firing rates of multiple neurons. Research in this area has so far been limited largely to animals due to the invasive procedures required to implant the electrodes, as well as to a lack of electrodes that reliably produce stable recording over long periods of time.
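Firing rates are typically estimated by counting spikes in short time bins. The following sketch shows that step for a few simulated neurons; the Poisson spike trains, the rates, the recording duration, and the 100 ms bin width are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
duration = 10.0                              # seconds of recording (assumed)

# Hypothetical spike trains: homogeneous Poisson processes at different rates.
true_rates = [5.0, 20.0, 40.0]               # spikes per second
spike_trains = []
for rate in true_rates:
    n_spikes = rng.poisson(rate * duration)
    spike_trains.append(np.sort(rng.uniform(0.0, duration, n_spikes)))

def binned_rates(spike_times, duration, bin_width):
    """Firing rate (spikes/s) in consecutive bins of about bin_width seconds."""
    n_bins = int(round(duration / bin_width))
    edges = np.linspace(0.0, duration, n_bins + 1)
    counts, _ = np.histogram(spike_times, bins=edges)
    return counts / (duration / n_bins)

# One row per neuron, one column per 100 ms bin: a feature matrix for a decoder.
rates = np.vstack([binned_rates(s, duration, 0.1) for s in spike_trains])
print(rates.shape, rates.mean(axis=1))
```

Each column of `rates` is a vector of simultaneous firing rates across neurons, which is the kind of input a multidimensional decoder would operate on.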

Fig. 2.12

Neuronal action potentials (i.e., spikes). These signals are voltage spikes produced by individual neurons. They are used to determine the average firing rates, temporal patterns, and functional correlations of neuronal firing. The top frame shows spikes recorded from several different idling neurons (note differences in amplitude) and the bottom frame shows spikes from a larger number of active neurons (from [95], with permission)

Local field potentials (LFPs) represent mainly synchronized events (largely in the frequency range of <300 Hz) in neural populations. The major sources of LFPs are synaptic potentials (which are also the major sources for EEG/MEG/ECoG). Other integrative soma-dendritic processes, including voltage-dependent membrane oscillations and after-potentials following soma-dendritic spikes, can contribute to LFPs. LFPs and their different band-limited components (e.g., theta (4–7 Hz), alpha, beta, gamma) are tightly related to cortical processing. Gamma-band LFP activity is especially tightly coupled to spiking activity. Because LFPs reflect signals from many neurons, their spatial resolution (and possibly their functional specificity) is lower than that of spiking activity.

4 Signal Processing

The goal of BCI signal processing is to extract features from the acquired signals and translate them into logical control commands for BCI applications. A feature in a signal can be viewed as a reflection of a specific aspect of the physiology and anatomy of the nervous system. Based on this definition, the goal of feature extraction for BCI applications is to obtain features that accurately and reliably reflect the intent of the BCI user. See Fig. 2.1 for typical components of a BCI.

4.1 Feature Extraction

The goal of all processing and extraction techniques is to characterize an item (i.e., the desired user selection) by discernable measures whose values are very similar for those in the same category but very different for items in another category. Such characterization is accomplished by choosing relevant features from the numerous choices available. This selection process is necessary since unrelated features can cause the translation algorithms to have poor generalization, increase the complexity of calculations, and require more training samples to attain a specific level of accuracy.

In addition, even though a BCI user is able to generate detectable signals that convey her or his intent, signal acquisition methods also capture noise generated by other unrelated activity in or outside of the brain. Thus, it is important that feature extraction maximizes the signal-to-noise ratio.

4.1.1 Artifact/noise removal and signal enhancement

Artifact or noise removal plays an important role in EEG-based BCIs. Since signals are often captured across several electrodes over a series of points in time, existing methods concentrate on either spatial-domain processing or temporal-domain processing or both. To minimize noise in the signal, it is important to understand its sources. First, noise can be captured from neural sources when brain signals not related to the target signal are recorded. Noise can also be generated by nonneural sources such as muscular movements, particularly of the facial muscles (e.g., [96]). This type of noise in EEG is especially important as signals generated by muscular movements may have much higher amplitudes and can easily be mistaken for actual EEG activity. The problem is further complicated when the frequencies and scalp locations of the nonneural noise and the chosen EEG features are similar.

Typically non-CNS artifacts are the result of unwanted potentials from eye movements, EMG, and other nonneural sources. They are often more prominent in the EEG than brain signals. Simple instructions to the user to not use facial muscles can help and trials that contain such artifacts can be disregarded, but these approaches are not always adequate to remove such noise. Mathematical operations such as linear transformations and component analyses are also used for artifact removal.
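One simple linear transformation for artifact removal is to regress a reference channel out of the EEG. The sketch below assumes a separate EOG channel that records the ocular artifact but no brain activity, which is a simplification; the signals, the leakage coefficient of 0.6, and the variable names are all synthetic and illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000
fs = 250.0                                   # sampling rate in Hz (assumed)

brain = np.sin(2 * np.pi * 10 * np.arange(n) / fs)    # synthetic 10 Hz rhythm
eog = 0.5 * np.cumsum(rng.standard_normal(n))         # slow ocular drift
eeg = brain + 0.6 * eog                               # contaminated EEG channel

# Least-squares estimate of how strongly the EOG leaks into the EEG channel.
b = np.dot(eog, eeg) / np.dot(eog, eog)

# Subtract the scaled reference to obtain the cleaned signal.
cleaned = eeg - b * eog

corr_before = np.corrcoef(eeg, brain)[0, 1]
corr_after = np.corrcoef(cleaned, brain)[0, 1]
print(f"leakage estimate {b:.3f}; correlation with brain signal "
      f"{corr_before:.2f} -> {corr_after:.2f}")
```

When the reference channel also picks up brain activity, this subtraction removes part of the signal of interest, which is one reason component analyses are often used instead.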

After artifact removal, spatial filtering techniques are useful for enhancing features with a specific spatial distribution. In BCI systems that use mu or beta rhythms, the selection of spatial filters can greatly affect the signal-to-noise ratio (e.g., [97]). A high-pass spatial filter such as the bipolar derivation calculates the first spatial derivative and emphasizes the difference in the voltage gradient in a particular direction. The surface Laplacian [98–105] also acts as a high-pass filter and can be approximated by subtracting the average of the signal at four surrounding nodes from the signal at the node of interest [98]. Because it is the second derivative of the spatial voltage distribution, it emphasizes the contributions from the neural areas closest to the recording electrode (node of interest) [104].
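The two spatial filters described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the cited studies; the electrode names, voltages, and function names are assumptions chosen for the example.

```python
def bipolar(sample, ch_a, ch_b):
    """First spatial derivative: voltage difference between two electrodes."""
    return sample[ch_a] - sample[ch_b]


def small_laplacian(sample, center, neighbors):
    """Approximate surface Laplacian: center minus the mean of its neighbors."""
    mean_neighbors = sum(sample[ch] for ch in neighbors) / len(neighbors)
    return sample[center] - mean_neighbors


# Hypothetical single time sample (microvolts); channel names are illustrative.
sample = {"C3": 12.0, "FC3": 8.0, "CP3": 9.0, "C1": 7.0, "C5": 8.0}
neighbors = ["FC3", "CP3", "C1", "C5"]

lap_c3 = small_laplacian(sample, "C3", neighbors)  # 12 - mean(8, 9, 7, 8)
bip = bipolar(sample, "FC3", "CP3")                # 8 - 9

# A potential shared by every electrode (broad, distant activity) cancels out.
shifted = {ch: v + 5.0 for ch, v in sample.items()}
lap_shifted = small_laplacian(shifted, "C3", neighbors)
```

In this toy montage the Laplacian output is unchanged when the same offset is added to all channels, which is the sense in which the filter suppresses spatially broad activity and favors sources close to the node of interest.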

Temporal-domain processing techniques are also useful in maximizing the signal-to-noise ratio. These methods work by analyzing the signal across a period of time. Some temporal-domain processing methods such as Fourier analysis require significantly long signal segments, while others such as band-pass filtering or autoregressive analysis can work on shorter time segments. Though all temporal-domain processing methods work well during offline BCI analysis, some of them are not as useful as spatial-domain processing methods during online analysis because of the rapid responses required.

4.1.2 Feature Extraction Methods

The methods for extracting features depend largely on the type of neural signals used in the BCI and the characteristics associated with the underlying neural process. For BCIs based on spiking activity, the goal of feature extraction is to model the spike train and determine the BCI output commands (e.g., movement trajectories) from neural firing rates. Several models have been developed to relate the single or population neural firing rates to the user’s intended movements (i.e., the tuning relationship). The seminal discoveries on this topic were made in the late 1980s (i.e., the cosine-tuning properties of neurons recorded in the primary motor cortex [106]). Later, other investigators determined that the position, velocity (including direction and speed), and other kinematics of hand movement were represented in single- or multiunit activity [60, 107–110]. Similar tuning models have also been developed for LFP recordings. Based on these spike/LFP tuning models, several methods such as linear filtering methods and neural networks have been used to extract useful features from the neural firing rates.
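As an illustration of a tuning relationship, the sketch below simulates cosine-tuned firing rates and recovers the movement direction with a simple population-vector readout. It is a toy model under stated assumptions: the baseline rate, modulation depth, the eight evenly spaced preferred directions, and all function names are hypothetical and are not taken from the cited studies.

```python
import math


def cosine_rate(direction, preferred, baseline=10.0, depth=8.0):
    """Cosine tuning: the rate peaks when movement matches the preferred direction."""
    return baseline + depth * math.cos(direction - preferred)


def population_vector(rates, preferred_dirs, baseline=10.0):
    """Sum each neuron's preferred-direction unit vector, weighted by its rate change."""
    x = sum((r - baseline) * math.cos(pd) for r, pd in zip(rates, preferred_dirs))
    y = sum((r - baseline) * math.sin(pd) for r, pd in zip(rates, preferred_dirs))
    return math.atan2(y, x)


# Assumed population: eight neurons with evenly spaced preferred directions.
preferred_dirs = [2.0 * math.pi * k / 8 for k in range(8)]
true_direction = math.radians(60.0)

rates = [cosine_rate(true_direction, pd) for pd in preferred_dirs]
decoded = population_vector(rates, preferred_dirs)
decoded_degrees = math.degrees(decoded)
```

With noise-free rates and evenly spaced preferred directions the decoded angle matches the simulated movement direction; real recordings are noisy and unevenly tuned, which is one reason linear filters and neural networks are used in practice.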

For other neural signals (EEG/MEG/ECoG), spatial resolution/specificity is lower and the acquired signal usually reflects activity in larger regions. For these methods, defining features by spatial location is as important as defining them by temporal/spectral characteristics. In order to optimize the spatial information, the channels used for BCI control are usually a selected subset of a few channels. These can be selected with methods such as principal components analysis (PCA) [90] or independent component analysis (ICA) [111], or based on a priori knowledge of the functional organization of the relevant cortical area(s). Recently, electrophysiological source imaging methods have also been proposed as a spatial deconvolution approach to extracting spatial information about the features used in a BCI [61–63, 66].

In order to define the temporal/spectral parameters of the chosen features, the neural signals are usually subjected to time–frequency analysis. Frequency-based features have been widely used in signal processing because of their ease of application, computational efficiency, and straightforward interpretation. However, because these features carry no time-domain information, they cannot capture the nonstationary nature of EEG signals. Thus, mixed time–frequency representations (TFRs) that map a one-dimensional signal into a two-dimensional function of time and frequency are used to analyze the time-varying spectral content of the signals. A typical example is the extraction of the ERD feature in sensorimotor rhythms, which can be obtained using a traditional moving-average method [78] (as shown in Fig. 2.13), an envelope-extraction method [82] (Fig. 2.14), or a TFR method based on wavelets [190] (Fig. 2.15). Parametric approaches are also commonly used to estimate the time–frequency features, such as autoregressive (AR) modeling for stationary signals and adaptive autoregressive modeling for nonstationary signals, which are widely implemented in online BCI systems due to their computational efficiency. However, it is worth noting that such parametric modeling approaches usually require predetermined parameters, such as the model order, which can influence BCI performance.
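The parametric approach can be made concrete with a small AR spectral estimate. The sketch below fits an AR model with the Yule-Walker equations (solved by the Levinson-Durbin recursion) and evaluates the resulting spectrum; it is one common way to fit such a model and is not necessarily the estimator used in the systems cited here. The synthetic signal, the sampling rate, the model order of 4, and the function names are all assumptions.

```python
import math
import random


def autocorrelation(x, max_lag):
    """Biased sample autocorrelation of the mean-removed signal, lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    return [sum(xc[i] * xc[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def fit_ar(x, order):
    """Yule-Walker fit; model is x[t] = sum_k a[k-1] * x[t-k] + e[t]."""
    r = autocorrelation(x, order)
    a, err = [], r[0]
    for m in range(order):
        reflection = (r[m + 1] - sum(a[k] * r[m - k] for k in range(m))) / err
        a = [a[k] - reflection * a[m - 1 - k] for k in range(m)] + [reflection]
        err *= 1.0 - reflection ** 2
    return a, err


def ar_power(a, noise_var, freq, fs):
    """AR power spectrum (arbitrary units) at one frequency in Hz."""
    re, im = 1.0, 0.0
    for k, coef in enumerate(a, start=1):
        angle = 2.0 * math.pi * freq * k / fs
        re -= coef * math.cos(angle)
        im += coef * math.sin(angle)
    return noise_var / (re * re + im * im)


# Synthetic 1-s segment: a 10-Hz rhythm plus noise (illustrative values only).
fs = 100.0
rng = random.Random(0)
segment = [math.sin(2.0 * math.pi * 10.0 * n / fs) + rng.gauss(0.0, 0.2)
           for n in range(100)]

coeffs, noise_var = fit_ar(segment, order=4)  # the order must be chosen in advance
spectrum = {f: ar_power(coeffs, noise_var, f, fs) for f in range(1, 50)}
peak_hz = max(spectrum, key=spectrum.get)
```

The model order is the predetermined parameter mentioned above: too low an order blurs the spectral peak, while too high an order can introduce spurious peaks.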

Fig. 2.13

Techniques required to extract ERD and ERS from raw EEG signals. First, the raw EEG signal from each trial is bandpass filtered. Second, the amplitude samples are squared to obtain the power samples. Third, the power samples are averaged across all trials. Finally, variability is reduced and the graph is smoothed by averaging over time samples (from [78], with permission from Elsevier)
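The steps listed in this caption can be sketched as follows, assuming the trials have already been band-pass filtered. The final conversion to a percentage change relative to a reference interval is a common convention for reporting ERD/ERS and is added here as an assumption; the synthetic trials, window width, and function names are likewise illustrative.

```python
import math


def trial_averaged_power(trials):
    """Square each sample, then average the power across trials."""
    n_trials = len(trials)
    return [sum(trial[t] ** 2 for trial in trials) / n_trials
            for t in range(len(trials[0]))]


def moving_average(values, width):
    """Smooth by averaging over consecutive time samples."""
    return [sum(values[s:s + width]) / width
            for s in range(len(values) - width + 1)]


def percent_change(power, ref_start, ref_stop):
    """Express power relative to a reference interval (negative values = ERD)."""
    ref = sum(power[ref_start:ref_stop]) / (ref_stop - ref_start)
    return [100.0 * (p - ref) / ref for p in power]


# Synthetic band-passed trials: a 10-Hz rhythm whose amplitude halves after 1 s.
fs = 100
phases = [2.0 * math.pi * k / 8 for k in range(8)]  # one phase per trial
trials = [[(2.0 if n < 100 else 1.0)
           * math.cos(2.0 * math.pi * 10.0 * n / fs + ph)
           for n in range(200)]
          for ph in phases]

power = trial_averaged_power(trials)
smoothed = moving_average(power, width=10)
erd = percent_change(smoothed, ref_start=0, ref_stop=50)
```

Halving the amplitude reduces the power to one quarter, so the late part of this toy example shows a power decrease of about 75% relative to the reference interval.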

Fig. 2.14

Steps of feature extraction for sensorimotor rhythms. It is difficult to detect a coherent component in the raw EEG signal depicted in the top frame because there is a lot of noise in the signal. The second frame shows the signal after being processed through a surface Laplacian filter that focuses on EEG components in a specific spatial frequency range. As shown in the third frame, the signal is then band-pass filtered to isolate the frequencies of interest. The features become evident in the fourth frame as they are extracted by using a grand averaging method over a fixed bin or window size (from Wang and He [83], with permission)

Fig. 2.15

Time–frequency representations (TFRs) of sensorimotor rhythms during motor imagery. TFRs were realigned at time = 0 s (dashed line) and the target times were normalized to be 2 s (solid line) (from [66], with permission, © 2008 IEEE)

4.1.3 Feature Selection and Dimensionality Reduction

Feature selection algorithms are used in BCI designs to find the most informative features for determining the user’s intent. This approach is especially useful for BCI designs with high-dimensional input data, as it reduces the dimension of the feature space. Since a feature-selection block reduces the complexity of the translation problem, higher translation accuracies (i.e., higher accuracies of determining the user’s intent) can be achieved.

As discussed in Blum and Langley [112], feature-selection techniques can be divided into three major categories. In the first category, called embedded algorithms, the feature selection is a part of the translation (also called classification) method. The feature-selection procedure adds or removes features to counteract prediction errors as new training data are introduced. Embedded algorithms, however, are of little use when there is a high level of interaction among relevant features.

In the second category, filter algorithms, specific features are selected prior to, and independent of, the translation process. These algorithms work by removing irrelevant features (those providing redundant data or contaminated by noise) prior to training the translation technique. One approach to filtering involves calculating each feature’s correlation with the user’s intent and then selecting a fixed number of features with the highest scores. Another filtering approach derives higher order features based on features from the raw data, sorts these higher order features based on the amount of variance they explain, and then selects a fixed number of the highest scoring features.
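The first filtering approach, ranking features by their correlation with the user's intent and keeping a fixed number, can be sketched as below. The feature names, the values, and the choice of the absolute Pearson correlation as the score are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_top_k(features, labels, k):
    """Score every feature independently of any classifier and keep the best k."""
    scores = {name: abs(pearson(values, labels))
              for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical training set: six trials, labels encode the intended class (0 or 1).
labels = [0, 0, 0, 1, 1, 1]
features = {
    "mu_C3": [9.0, 10.0, 11.0, 4.0, 5.0, 6.0],   # strongly related to intent
    "beta_C4": [5.0, 6.0, 5.0, 7.0, 6.0, 8.0],   # weakly related
    "noise": [3.0, 7.0, 5.0, 5.0, 3.0, 7.0],     # unrelated
}

selected, scores = select_top_k(features, labels, k=2)
```

Because each feature is scored on its own, this kind of filter is fast but cannot detect features that are informative only in combination.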

The final category consists of wrapper algorithms. Wrapper algorithms select features by using the translation algorithms to rate the viability or quality of a feature set. Rather than selecting a feature set based on the results of the translation, these algorithms use the translation algorithm as a subroutine to estimate the accuracy of a particular subset of features. The selected feature set is therefore specific to the translation algorithm used, and this type of algorithm is particularly useful with limited training data.
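A wrapper can be sketched as a greedy forward search that calls the translation algorithm as a subroutine. In the sketch below, a nearest-centroid classifier evaluated with leave-one-out cross-validation stands in for the translation algorithm; this choice, the toy data, and the function names are assumptions and do not represent a specific published BCI method.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in translation algorithm: assign x to the class with the closest mean."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [r for r, l in zip(train_x, train_y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])))


def loo_accuracy(data, labels, subset):
    """Leave-one-out accuracy of the classifier restricted to a feature subset."""
    hits = 0
    for i in range(len(data)):
        train_x = [[row[j] for j in subset] for k, row in enumerate(data) if k != i]
        train_y = [l for k, l in enumerate(labels) if k != i]
        test_x = [data[i][j] for j in subset]
        hits += nearest_centroid_predict(train_x, train_y, test_x) == labels[i]
    return hits / len(data)


def forward_select(data, labels, max_features):
    """Greedily add the feature that most improves the estimated accuracy."""
    chosen, best = [], 0.0
    remaining = list(range(len(data[0])))
    while remaining and len(chosen) < max_features:
        scored = [(loo_accuracy(data, labels, chosen + [j]), j) for j in remaining]
        acc, j = max(scored, key=lambda s: (s[0], -s[1]))
        if acc <= best:          # stop when no candidate improves accuracy
            break
        chosen.append(j)
        remaining.remove(j)
        best = acc
    return chosen, best


# Toy data: feature 0 separates the classes, features 1 and 2 do not.
labels = [0, 0, 0, 1, 1, 1]
data = [[1.0, 5.0, 3.0], [1.2, 1.0, 4.0], [0.9, 3.0, 3.0],
        [3.0, 4.0, 4.0], [3.1, 2.0, 3.0], [2.8, 5.0, 4.0]]

chosen, best = forward_select(data, labels, max_features=2)
```

Leave-one-out evaluation reuses every trial for both training and testing, which is one reason wrappers remain practical when training data are limited, at the cost of retraining the classifier many times.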

For certain situations, existing signals are not sufficient for high accuracy feature extraction. Some methods introduce more signals to capture additional information about the state of the brain (e.g., by using 56 electrodes where only 2 were previously used). For example, the increased spatial data can be processed to derive common spatial patterns. This is achieved by projecting the high-dimensional spatio-temporal signal onto spatial filters that are designed such that the most discriminative information is inherent in the variances of the resulting signals [113].

4.2 Feature Translation

Translation techniques are algorithms developed with the goal of converting the input features (independent variables) into device control commands (dependent variables) that achieve the user’s intent [10]. Translation techniques used widely in other areas of signal processing are adapted to BCI technology. Ideally, the translation algorithm will convert the chosen features into output commands that achieve the user’s intent accurately and reliably. Furthermore, an effective translation algorithm will adapt so as to adjust for spontaneous changes in the features and will also encourage and facilitate the user’s acquisition of better control over the features.

There are numerous types of feature translation algorithms. Some use simple characteristics such as amplitude or frequency, and some use single features. Some advanced algorithms utilize a combination of spatial and temporal features produced by one or more physiological processes. Algorithms currently in use include, but are not limited to, linear classifiers [83, 114], Fisher discriminants [115], Mahalanobis distance-based classifiers [116], neural networks (NN) [80, 117–119], support vector machines (SVM) [120], hidden Markov models (HMM) [121], and Bayesian classifiers [122, 123].

Whatever translation algorithm is used, its output can be converted into control commands in two ways: continuously or discretely. The following sections describe the difference between these two modes of translation.

4.2.1 Continuous Feature Translation

In continuous feature translation, consecutive output commands are generated continually based on the features. Examples of this translation are the kinematic parameters (e.g., arm position, velocity, etc.) that control a prosthetic arm. The features are usually derived from short-time windowed signals and are then continuously fed into the translation algorithm so that dynamic outcomes are obtained for BCI control. A fixed translation algorithm can be used for continuous feature translation. Algorithms that adapt can often yield better performance. Due to the demands of processing the features in consecutive short-time windows, the choice of feature extraction methods and translation methods should favor those with less computational load, which may not be those algorithms that perform best in offline testing. However, the advantage of using continuous translation is that it allows the users to adjust their strategies in the course of control. This is beneficial for learning by the user as well as by the BCI.
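A minimal sketch of continuous translation is given below: one feature value per short window is mapped linearly to a cursor velocity, and the offset adapts slowly to track spontaneous drifts in the feature. The class name, gain, adaptation rate, and feature values are hypothetical and serve only to illustrate the idea of a lightweight, adaptive translation step.

```python
class AdaptiveLinearTranslator:
    """Map a windowed feature to a velocity command: v = gain * (feature - offset)."""

    def __init__(self, gain, adapt_rate=0.1):
        self.gain = gain
        self.adapt_rate = adapt_rate
        self.offset = None  # running estimate of the feature's typical value

    def step(self, feature):
        if self.offset is None:
            self.offset = feature
        velocity = self.gain * (feature - self.offset)
        # Exponential moving average: the offset follows slow drifts in the
        # feature so that the output stays centered around zero.
        self.offset += self.adapt_rate * (feature - self.offset)
        return velocity


# One feature value per short analysis window (illustrative numbers).
window_features = [10.0, 10.0, 12.0, 12.0, 8.0]

translator = AdaptiveLinearTranslator(gain=2.0)
position = 0.0
trajectory = []
for feature in window_features:
    position += translator.step(feature)  # a command is issued in every window
    trajectory.append(position)
```

Each update costs only a few arithmetic operations, which reflects the preference stated above for methods with low computational load during online operation.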

4.2.2 Discrete Feature Translation

In contrast, discrete feature translation produces periodic commands at fixed intervals. An example of this type of translation is a BCI that uses a P300 signal. A P300-based BCI will typically issue a command every several seconds. Thus, it is particularly suited for applications such as word processing, which requires discrete letter selections, and less suited for applications such as multidimensional robotic arm control, which is best implemented by a continuous series of output commands.

In this section, we briefly introduced signal processing methods used in BCI research. Due to limited space it is impossible to discuss these methods in depth. Several review articles are readily available for further reading with regard to the signal processing methods [124–127]. See also Chapter 4 of this book for various feature signal processing and decoding algorithms.

5 Major BCI Applications

5.1 Replacing Lost Communication

An important application for BCI technology is providing a new method for communication so that a person who has lost normal means of communication can interact with his or her external environment. Current BCIs are suitable for environmental control (e.g., temperature, lights, television), for answering yes/no questions, and for simple word processing.

While such communication can be provided through brain control, there are alternative options not involving neural signals. Those who retain control of only a single muscle can often use this for communication. For example, the electric activity associated with finger muscles, eyebrows, or the diaphragm can be used to build an alternative control channel that may be faster and more accurate than current BCIs driven by neural signals. Thus, BCIs are particularly needed for users who lack all muscle control or whose remaining control is easily fatigued or otherwise unreliable. These people include those who are nearly totally paralyzed but retain cognitive function (e.g., people with advanced ALS) and those who have movement disorders that abolish useful muscle control (e.g., people with severe cerebral palsy). Although people with these disorders may have lost the ability to control any muscle movement, their cognitive function may still be intact and they may therefore have the potential to control a BCI and use it to communicate. For these locked-in people, conventional communication methods based on muscle activity may have little to offer them, so that even the simplest BCI-based communication, like the ability to say yes or no, can be extremely valuable.

Thus far, most BCI research has been carried out in healthy subjects. A few studies have been conducted to test the feasibility of BCI communication in severely disabled people in laboratory settings or even in their homes. The transfer of current BCI communication systems into use by severely disabled people for useful purposes faces several challenges. First, the disease states that abolish voluntary muscle control may also impair user control of the signal features used by a BCI. For example, ALS may lead to loss of cortical neurons, which might conceivably affect generation or control of the sensorimotor rhythms or evoked potentials used for BCI-based communication. Thus, it may be important to develop diverse BCI systems that are based on various types of neural signals, so that more options can be provided for different types of brain impairments. Furthermore, damage to prefrontal cortex (e.g., in multiple sclerosis, Parkinson’s disease, or ALS) can impair attention and thereby adversely affect BCI use. For these users, a long-duration training protocol may be problematic. Thus, for these users, BCI systems that require minimal training, such as SSVEP-based systems, may be most suitable.

5.2 Replacing Lost Motor Function and Promoting Neuroplasticity to Improve Defective Function

Perhaps the highest degrees of control achieved so far in BCI development are with neuroprostheses developed for restoring motor function. The state of the art in movement control is multidimensional and point-to-point control of a robotic arm or a virtual object. A design of natural self-feeding with four degrees of freedom has been demonstrated in monkeys using intracranial recordings [33]. In humans, three-dimensional control of a computer cursor [53] or continuous real-time control of flight of a virtual helicopter [52, 54] based on noninvasive EEG recordings has been demonstrated using sensorimotor rhythms. A direct decoding of three-dimensional movement trajectory from human EEGs has also been reported [57]. Although these studies have generally controlled a robotic arm or a virtual object, their applications could eventually extend to wheelchair control, vehicle driving, dexterous finger control, or robots for various other functions. Such replacement of motor function could be very valuable for patients who suffer from various degrees of paralysis. It is estimated that there are currently over two million people in the United States suffering from paralysis. Additionally, every year there are approximately 12,000 new cases of spinal cord injury alone in the United States. The list of causes of paralysis is extensive and includes stroke, cerebral palsy, ALS, multiple sclerosis, muscular dystrophies, trauma, and other neurodegenerative conditions. Many individuals suffer from permanent loss of motor function. A neuroprosthesis, therefore, offers an opportunity to regain a useful substitute for normal motor control. While conventional options based on limited muscle activity may also provide such function, BCI-operated neuroprostheses could provide an embodied prosthetic control that is directly related to the user’s intention. For example, when users want to move their arms, they could instead move a robotic arm by communicating to the BCI their intention to move their own arms. They would not have to use different muscle activity, such as eye blinking, to move a robotic arm.

Another exciting possible application of BCI technology is promoting neuroplasticity to restore lost function. Many studies have shown that training for and using BCIs can lead to changes in neural activity that facilitate use of prosthetic devices, especially when combined with functional electric stimulation (FES) [128, 129]. Such learning-related changes are especially important for people with brain injuries, such as those who have suffered strokes. In a study using MEG recordings, patients with chronic hand hemiplegia after stroke successfully learned to use motor imagery to control their sensorimotor rhythms, and they were able to use a BCI to control an orthotic device that opened and closed their paralyzed hands [130]. As shown in Fig. 2.16, subjects’ performances steadily improved as they learned to use the device. Comparison between the early and late training stages revealed enhanced sensorimotor rhythms in the ipsilesional hemisphere, which was the hemisphere used to control the device. Several randomized controlled studies have indicated that assisting movement with FES coupled to BCI use can substantially improve upper-limb function in individuals who have been mildly to moderately [132, 133] or severely [134] impaired by stroke. Studies with both invasive and noninvasive BCIs also indicate that learning-related changes can occur over days to months [135]. Interestingly, once users have learned to operate a neuroprosthesis with a BCI they retain this skill months later without intervening use [51], suggesting a long-term learning-related change in neural circuits. Thus, BCIs might be used to help actually restore motor function by promoting beneficial neuroplasticity in neuromuscular pathways.

Fig. 2.16

Patients with chronic hand hemiplegia after stroke were trained to move a cursor on a screen via modulation of ipsilesional sensorimotor mu rhythm recorded by MEG. Successful trials with the BCI resulted in the opening or closing of the patient’s paralyzed hand via a mechanized orthosis. This figure shows the results from three patients. (a) The performance of these patients across sessions indicates that the proportion of successful trials increased over time. The statistical maps for the correlations between sensorimotor mu rhythm amplitudes from signals recorded from sensors above the ipsilesional primary motor cortex and successful performance at (b) early or (c) late training time points demonstrate modulation of sensorimotor rhythms with BCI training. Red and yellow colors identify areas where there was a high degree of correlation. (d) Single axial MRI scans obtained for each patient. Each patient’s lesion is highlighted in red (from [131], with permission, © 2011 Nature)

5.3 Supplementing Normal Function

BCI technology may also be used to supplement normal neuromuscular function. This is particularly true when considering BCI applications for use in the daily life of healthy individuals for the purpose of enhancing quality of life or functionality. One potential application is to aid navigation by means of BCI use. Controlling a computer cursor represents one such application aimed not only at helping disabled people to gain control of external devices, but also serving as a means for healthy individuals to control external devices without using normal neuromuscular channels. Studies have shown promise in accomplishing navigation in a virtual world, including moving a computer cursor [51, 53], walking in a virtual world [136], and recently, continuous real-time controlling of flight of a helicopter in a three-dimensional virtual campus [52, 54].

A challenge in using BCI technology to supplement normal function is the limited information transfer rate compared with that of normal muscular control. A healthy subject will prefer manual typing over BCI use to accomplish that task. Nevertheless, BCI-based control may meet the need in cases where a high information transfer rate is not essential and nonmuscular control is desirable.

6 Examples of EEG-Based BCI Systems

With the growing kinds and combinations of signals, feature extraction methods, and translation techniques, the number and variety of different BCI systems are increasing rapidly. Basic research typically starts using offline analyses, where signal acquisition is followed by feature extraction and translation as a separate step. This type of BCI simulation allows researchers to refine and test extraction and translation algorithms before testing them in actual online use. On the other hand, ultimately, any new BCI technique needs to be tested online to assess its performance.

A useful categorization of BCI systems is external versus internal. External BCI systems, also known as exogenous BCI systems, classify based on a fixed temporal context in regard to an external stimulus not under the user’s control. These systems use brain signals evoked by external stimuli, such as VEPs. These BCI systems do not require extensive training but do require a controlled environment and stimulus. Internal BCI systems, also known as endogenous BCI systems, on the other hand, classify based on a fixed temporal context with regard to an internal event. These systems use brain signals evoked by tasks such as motor imagery and usually require significant user training.

6.1 General-Purpose Software Platform for BCI Research

With the advances in BCI research and development that have taken place during the past decade, the number of laboratories conducting BCI research has grown substantially. However, when building new BCI systems, problems often arise in trying to integrate hardware and software from different sources. As more new BCI paradigms are proposed, it is very useful to have a general software platform for comprehensive evaluation of different BCI methodologies.

Such a general platform should readily support different BCI methodologies and facilitate the interchange of data and experimental protocols.

Perhaps the most widely used general-purpose software platform for BCI research is BCI2000 (http://www.bci2000.org/). BCI2000 was developed and is being maintained by the BCI laboratory at the Wadsworth Center, New York State Department of Health, Albany, New York, USA, in collaboration with the University of Tübingen in Germany [137, 138]. Figure 2.17 shows the overall structure of BCI2000. It consists of four modules (Source, Signal Processing, User Application, and Operator Interface) that communicate with each other. BCI2000 supports incorporation of different data acquisition hardware, signal processing routines, and experimental paradigms. BCI researchers can use it to start their research quickly and effectively. Use of BCI2000 is free for academic and research institutions. A detailed description of the BCI2000 software platform and its practical applications can be found in Schalk and Mellinger [138].

Fig. 2.17

BCI2000 design. BCI2000 consists of four modules: Operator, Source, Signal Processing, and User Application. The Operator module acts as a central relay for system configuration and online presentation of results to the investigator. It also defines onset and offset of operation. During operation, information (i.e., signals, parameters, or event markers) is communicated from the Source module to the Signal Processing to the User Application module and back to the Source module (from [138], with permission)

6.2 BCIs Based on Sensorimotor Rhythms

Wolpaw and coworkers developed a BCI system that allows users to move a computer cursor in one, two, or three dimensions. The EEG is recorded as the users actively control mu and/or beta rhythm power (amplitude squared) at one or several specific electrode locations over sensorimotor cortex. The EEG power spectra are calculated by an autoregressive method to generate the feature vector (e.g., [51, 53]). This methodology provides multidimensional control that is comparable in speed and accuracy to that achieved to date in humans with microelectrodes implanted in cortex [139].

Pfurtscheller and coworkers developed a BCI system that used mu-rhythm EEG recordings measured over sensorimotor cortex. The raw EEG signals were filtered to yield the mu band (8–12 Hz) and then squared to estimate the instantaneous mu power. Five consecutive mu-power estimates during ERD were combined to create a five-dimensional feature vector that was classified using a one-nearest-neighbor (1-NN) classifier with reference vectors generated by a learning vector quantization (LVQ) method. LVQ is a vector quantization method in which the high-dimensional input space is divided into different regions, with each region having a reference vector and a class label attached. During feature translation, an unknown input vector is classified by assigning it to the class label of the reference vector to which it is closest [140].
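The combination of LVQ training and 1-NN classification can be sketched as follows. The update rule shown is the basic LVQ1 variant, chosen here as an assumption because the text does not specify which variant was used; the five-dimensional vectors, class names, learning rate, and number of epochs are illustrative.

```python
def nearest(refs, x):
    """Index of the reference vector closest to x (squared Euclidean distance)."""
    return min(range(len(refs)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(refs[i][0], x)))


def train_lvq1(initial_refs, samples, rate=0.2, epochs=10):
    """LVQ1: pull the winning reference toward same-class samples, push it away otherwise."""
    refs = [(list(vec), cls) for vec, cls in initial_refs]
    for _ in range(epochs):
        for x, label in samples:
            i = nearest(refs, x)
            vec, cls = refs[i]
            sign = 1.0 if cls == label else -1.0
            refs[i] = ([v + sign * rate * (xi - v) for v, xi in zip(vec, x)], cls)
    return refs


def classify(refs, x):
    """1-NN translation: return the class label of the closest reference vector."""
    return refs[nearest(refs, x)][1]


# Hypothetical five-dimensional mu-power vectors (five consecutive estimates).
samples = [
    ([1.0, 1.1, 0.9, 1.0, 1.2], "left"),
    ([0.8, 1.0, 1.1, 0.9, 1.0], "left"),
    ([3.0, 2.9, 3.1, 3.2, 3.0], "right"),
    ([3.1, 3.0, 2.8, 3.0, 3.1], "right"),
]
initial_refs = [([1.5] * 5, "left"), ([2.5] * 5, "right")]

refs = train_lvq1(initial_refs, samples)
label_low = classify(refs, [0.9] * 5)
label_high = classify(refs, [3.2] * 5)
```

After training, only the small set of reference vectors has to be stored and compared, which keeps the online classification step inexpensive.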

He and his students investigated the possibility of using BCI control based on sensorimotor rhythms for continuous navigation in a virtual three-dimensional world [52, 54]. Control signals were derived from motor imagery tasks and intelligent control strategies were used to improve the performance of navigation. By using a constant forward flying velocity, three-dimensional navigation was reduced to two-dimensional navigation which allowed human subjects to fly a virtual helicopter to any point in the three-dimensional space [54]. Further studies have enabled human subjects to perform fast, accurate, and continuous control of a virtual helicopter in three-dimensional space [52]. In this BCI system, the virtual helicopter's forward–backward translation and elevation controls were actuated through the modulation of sensorimotor rhythms that were converted to forces applied to the virtual helicopter at every simulation time step, and the helicopter's angle of left or right rotation was linearly mapped, with higher resolution, from sensorimotor rhythms associated with other motor imaginations. These different resolutions of control allow for interplay between general intent actuation and fine control as is seen in the gross and fine movements of the arm and hand. Subjects controlled the helicopter with the goal of flying through rings (targets) randomly positioned and oriented in a three-dimensional space.

6.3 BCIs Based on P300

A BCI based on P300 was first explored by Farwell and Donchin [7]. The P300-BCI has now become one of the most widely used and successful BCI paradigms.

The P300 is a positive deflection in the ERP, with a latency of 200 to 700 ms after stimulus onset (Fig. 2.11). The response is elicited when subjects attend to a sequence of stimulus events including an infrequently presented target (i.e., the “oddball”) event. The P300 response is typically recorded over central-parietal areas [141]. Most P300-based BCIs use visual stimuli, but systems using auditory stimuli have also been studied. The latter is discussed further in Section 6.5.

Most P300-BCIs use the visual P300 ERP with the row/column paradigm (RCP) [7, 142]. In the RCP, a matrix (e.g., 6 × 6 cells) containing the alphabet, numbers, and other items is presented to the user for selection. The rows and columns of the matrix flash in a random order (see Fig. 2.18). The subject attends to the desired item and counts how many times the row and column containing it flash. Since P300 potentials are prominent only in the responses elicited by the target stimulus, the computer is able, after a sufficient number of repetitions, to identify the row and column that evoke a P300 response. The item at the intersection of this row and column is recognized as the target item, that is, the item desired by the user.
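The selection logic of the row/column paradigm can be sketched as below, assuming that some classifier has already assigned each flash a score that is higher when a P300 is present. The matrix layout, the scores, and the function names are hypothetical; the sketch only shows how scores are averaged over repetitions and combined into an item choice.

```python
MATRIX = ["ABCDEF", "GHIJKL", "MNOPQR", "STUVWX", "YZ1234", "56789_"]


def mean_scores(flashes, kind, n_groups=6):
    """Average the per-flash scores of each row (or column) over repetitions."""
    totals = [0.0] * n_groups
    counts = [0] * n_groups
    for flash_kind, index, score in flashes:
        if flash_kind == kind:
            totals[index] += score
            counts[index] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def select_item(flashes, matrix):
    """Pick the row and column with the highest mean score; return their intersection."""
    row_scores = mean_scores(flashes, "row")
    col_scores = mean_scores(flashes, "col")
    row = row_scores.index(max(row_scores))
    col = col_scores.index(max(col_scores))
    return matrix[row][col]


# Two repetitions of all 12 flashes; the (made-up) scores are higher for the
# row and column that contain the attended item, here row 2 and column 3.
flashes = []
for target_score, other_score in [(1.0, 0.2), (0.8, 0.3)]:
    for i in range(6):
        flashes.append(("row", i, target_score if i == 2 else other_score))
        flashes.append(("col", i, target_score if i == 3 else other_score))

selected = select_item(flashes, MATRIX)
```

Averaging over more repetitions makes the choice more reliable but lengthens the time needed for each selection.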

Fig. 2.18

Classical visual P300-based BCI: the row/column paradigm. The rows and columns of the matrix flash in random order. The infrequent event (i.e., the row or column containing the item the BCI user wishes to select) has a 1/6 probability of appearing

The row/column paradigm P300-based BCIs have some inherent sources of errors. Items that are adjacent to the target item (and thus flash in the same row or column) are selected incorrectly more often than other nontarget items (i.e., “the adjacency problem”). In addition, when the row and column of the target item flash in succession, the amplitude of the P300 to the second flash may be reduced or have a different morphology (i.e., “the double-flash problem”). Townsend et al. [143] therefore proposed an alternative P300 paradigm called the checkerboard paradigm. An 8 × 9 matrix containing 72 items was virtually superimposed on a checkerboard (Fig. 2.19(a)). The items in white cells and black cells were segregated into two 6 × 6 matrices. The item locations in these two matrices were randomly arranged (Fig. 2.19(b)). During online operation, each stimulus sequence consisted of the six virtual rows in the white matrix flashing in order from top to bottom, followed by the six virtual rows in the black matrix; the six virtual columns in the white matrix then flashed in order from left to right, followed by the six virtual columns in the black matrix. After each sequence of stimuli, the positions of the items in each matrix were re-randomized for the next trial. Because the virtual matrices are randomized, the flashing of a virtual row or column is perceived by the subject as a set of randomly distributed items. In Fig. 2.19(c), the highlighted items in the 8 × 9 matrix are the cells in the first row of the white matrix.

Fig. 2.19

The checkerboard paradigm for the 8×9 matrix. (a) The 8×9 checkerboard pattern. (b) The two virtual 6×6 matrices. (c) The items in the top row of the white 6×6 virtual matrix are shown on the 8×9 matrix. (From [143], with permission)

With the checkerboard paradigm, the double-flash problem is eliminated because after an item flashes it cannot flash again for a minimum of six intervening flashes. In addition, because adjacent items cannot be included in the same flash group, adjacency errors are generally avoided. Initial results suggest that, compared to the standard row/column paradigm, the checkerboard paradigm is faster, more accurate, and more reliable. BCI users also tend to find it more pleasant. Recently, Jin et al. [145] proposed another alternative P300 paradigm to reduce adjacency and double-flash errors.
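The two properties stated above can be checked with a small simulation. The sketch assumes a flash order of white virtual rows, black virtual rows, white virtual columns, and then black virtual columns within each sequence, and it tests adjacency only for horizontally or vertically neighboring cells; the function name, the seed, and the data layout are hypothetical.

```python
import random


def checkerboard_flash_groups(n_rows=8, n_cols=9, seed=0):
    """Return the 24 flash groups (lists of grid cells) of one stimulus sequence."""
    rng = random.Random(seed)
    by_color = {0: [], 1: []}                 # 0 = white cells, 1 = black cells
    for r in range(n_rows):
        for c in range(n_cols):
            by_color[(r + c) % 2].append((r, c))
    virtual = {}
    for color, cells in by_color.items():
        rng.shuffle(cells)                    # random placement in a 6 x 6 virtual matrix
        virtual[color] = [cells[i * 6:(i + 1) * 6] for i in range(6)]
    groups = []
    for axis in ("row", "col"):
        for color in (0, 1):
            m = virtual[color]
            for i in range(6):
                groups.append(m[i] if axis == "row" else [m[r][i] for r in range(6)])
    return groups


groups = checkerboard_flash_groups()

# Record the flash indices at which every cell is highlighted.
flash_times = {}
for t, group in enumerate(groups):
    for cell in group:
        flash_times.setdefault(cell, []).append(t)

# Smallest number of flashes between the two flashes of any single item.
min_gap = min(times[1] - times[0] - 1 for times in flash_times.values())

# True if any flash group contains two horizontally or vertically adjacent cells.
edge_adjacent = any(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
                    for g in groups for a in g for b in g)
```

Under these assumptions every item flashes exactly twice per sequence, the two flashes are separated by at least six other flashes, and no flash group contains cells that share an edge, because all cells in a group have the same checkerboard color.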

Recent studies suggest that gaze shift may play an important role in traditional P300 spellers [146, 147]. However, for severely disabled patients, the ability to control eye gaze may be impaired or totally lost. Liu et al. [148] developed a gaze-independent visual P300-based BCI. It differs from the classical row/column paradigm P300 speller in that each row or column of a 6 × 6 matrix is clustered and presented around the gaze point in a small near-central visual field. The stimulus is composed of an ‘image’ phase with 6 characters in a circle and an ‘interval’ phase without characters (Fig. 2.20). The durations of the ‘image’ and ‘interval’ phases are 240 and 160 ms, respectively. One basic stimulus sequence consists of 12 ‘image’ phases (similar to the 6 rows and 6 columns in a classical P300 speller) and 12 ‘interval’ phases. The participant is asked to fixate on the center of the circle and note the target character among clustered nontarget symbols by simple visual search in the ‘image’ phase. In this study, eight healthy subjects showed an average accuracy greater than 90% at a rate of about one character per minute. This paradigm may benefit those with severe paralysis and limited gaze.

Fig. 2.20
figure 000220

Schematic view of the stimulus presentation sequencing. See text for explanation (from [148], with permission)

P300-based BCIs are one of the most commonly used BCI systems and one of the few BCI systems that have been tested in severely disabled people (e.g., [91]). Current research focuses on improving system performance such as speed, accuracy, consistency, and user comfort. Hong et al. [149] proposed a new type of BCI speller (i.e., the N200-speller) that uses a motion-onset visual ERP component. This system has the advantage of lower luminance and contrast thresholds and thus reduces the discomfort of bright stimuli.

6.4 BCIs Based on Visual Evoked Potentials

Among noninvasive EEG-based BCIs, systems based on visual evoked potentials (VEPs) have been studied extensively [150, 151]. VEPs recorded over occipital areas are triggered by sensory stimulation of a subject’s visual field. VEPs reflect visual information-processing mechanisms in the brain. Stimulation of the central visual field evokes larger VEPs than does peripheral stimulation. A VEP-based BCI is a tool that can identify a target on which a user is visually fixated via analysis of concurrently recorded EEG. In a VEP-based BCI, each target is coded by a unique stimulus sequence, which in turn evokes a unique VEP pattern. To ensure reliable identification, VEPs derived from different stimulus sequences should be orthogonal, or near orthogonal, to each other in some transform domain (e.g., the frequency domain).

Stimulus sequence design is an important consideration for an SSVEP-based BCI. Depending on the specific stimulus sequence (i.e., the modulation approach) used, current SSVEP-based BCIs fall into four categories: frequency-modulated VEP (f-VEP) BCIs [92, 152–154]; time-modulated VEP (t-VEP) BCIs [155–157]; code-modulated VEP (c-VEP) BCIs [14, 158–161]; and phase-modulated VEP (p-VEP) BCIs [94, 162–164].

As shown in Fig. 2.21(a) [151], each target in a frequency-modulated (f-VEP) BCI flickers at a different frequency. This generates a periodic visual evoked response with the same fundamental frequency as that of the flickering stimulus, as well as its harmonics. Because the flicker frequency of f-VEP BCIs is usually higher than 6 Hz, the evoked responses from consecutive flashes of the target overlap with each other. This generates a periodic sequence of VEPs—a steady-state visual evoked potential (SSVEP)—which is frequency locked to the flickering target. As such, f-VEP BCIs are often referred to as SSVEP BCIs. Target identification can be achieved through power spectral analysis. In past decades, the robustness of f-VEP BCI systems has been convincingly demonstrated in many laboratory and clinical tests. The advantages of an f-VEP BCI include simple system configuration, little or no user training, and high information transfer rate (ITR) (30–60 bits/min).
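Target identification by power spectral analysis can be illustrated with a minimal sketch. It is not the method of any cited study: the sampling rate, the candidate flicker frequencies, the number of harmonics, and the synthetic signal are all assumptions chosen for illustration, and the helper names `power_at` and `identify_target` are hypothetical.

```python
import math

def power_at(signal, fs, freq):
    """Power of `signal` at `freq` (Hz), from a single-frequency Fourier sum."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * i / fs) for i, x in enumerate(signal))
    return (re * re + im * im) / (n * n)

def identify_target(signal, fs, target_freqs, n_harmonics=2):
    """Return the index of the flicker frequency with the largest summed power
    over the fundamental and its harmonics."""
    scores = [
        sum(power_at(signal, fs, f * h) for h in range(1, n_harmonics + 1))
        for f in target_freqs
    ]
    return max(range(len(scores)), key=scores.__getitem__)

# Synthetic 2 s recording (assumed values): a response locked to a 10 Hz flicker,
# a weaker second harmonic, and an unrelated 3.7 Hz background component.
fs = 250
t = [i / fs for i in range(2 * fs)]
eeg = [
    math.sin(2 * math.pi * 10 * x)
    + 0.4 * math.sin(2 * math.pi * 20 * x)
    + 0.5 * math.sin(2 * math.pi * 3.7 * x)
    for x in t
]
freqs = [8.0, 10.0, 12.0, 15.0]  # hypothetical flicker frequencies, one per target
chosen = identify_target(eeg, fs, freqs)
print("fixated target flickers at", freqs[chosen], "Hz")
```

Summing power over harmonics reflects the observation above that the evoked response contains the fundamental frequency and its harmonics; a practical system would typically work on real multichannel EEG and longer or overlapping analysis windows.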

Fig. 2.21
figure 000221

(a) Left, The stimulus sequences of an f-VEP based BCI. Targets flash at different frequencies. Right, The power spectrum of the VEP derived from a target flickering at 10 Hz. (b) Left, The stimulus sequences of a t-VEP based BCI. Target flashes are mutually independent. Right, The evoked response to a single stimulus. (c) Left, The stimulus sequences of a c-VEP based BCI. Right, A sample of time course of the evoked response. (d) Left, The stimulus sequences of a p-VEP based BCI. The phase difference between adjacent targets is 60°. Right, The phase distribution of response signals from stimuli with different phases (revised from [151] and [165, 166] with permission)

As shown in Fig. 2.21(b) [151], in time-modulated VEP (t-VEP) BCIs, the flash sequences of different targets are mutually independent. This may be achieved by requiring that flash sequences for different targets are strictly nonoverlapping or by randomizing the duration of ON and OFF states in each target’s flash sequence. The briefly flashed stimuli elicit visual evoked potentials, which have short latencies and durations.

In a t-VEP BCI, a synchronous signal must be given to the EEG amplifier for marking the flash onset of each target. t-VEPs are time-locked and phase-locked to visual stimulus onset. Thus, since the flash sequences for all targets are mutually independent, averaging over several short epochs synchronized according to the flash onset time of each possible target will produce VEPs for each possible target. Since foveal (i.e., fixation-point) VEPs are larger than peripheral VEPs, the target producing the largest average peak-to-valley VEP amplitude can be identified as the fixation target. Accurate target identification in a t-VEP BCI requires averaging over many epochs. Furthermore, to prevent overlap of two consecutive VEPs, t-VEP BCIs usually have low stimulus rates (<4 Hz). Thus, t-VEP BCIs have a relatively low information transfer rate (<30 bits/min).
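The averaging and peak-to-valley comparison described above can be sketched as follows. This is an illustrative simulation only: the epoch length, flash schedule, response template, gains, and noise level are hypothetical, and the function names are not taken from any cited system.

```python
import math
import random

def average_epochs(eeg, onsets, epoch_len):
    """Average the fixed-length epochs that start at each flash-onset sample."""
    epochs = [eeg[s:s + epoch_len] for s in onsets if s + epoch_len <= len(eeg)]
    return [sum(samples) / len(epochs) for samples in zip(*epochs)]

def identify_fixated_target(eeg, onsets_per_target, epoch_len):
    """Return (index of the target with the largest peak-to-valley amplitude
    in its averaged epoch, list of all amplitudes)."""
    amplitudes = []
    for onsets in onsets_per_target:
        avg = average_epochs(eeg, onsets, epoch_len)
        amplitudes.append(max(avg) - min(avg))
    best = max(range(len(amplitudes)), key=amplitudes.__getitem__)
    return best, amplitudes

# Synthetic data (all values assumed): 3 targets with strictly nonoverlapping flashes.
random.seed(0)
epoch_len, slot_len, n_targets, n_flashes = 75, 100, 3, 20
n_slots = n_targets * n_flashes
eeg = [random.gauss(0.0, 1.0) for _ in range(n_slots * slot_len)]  # background EEG
template = [math.sin(2 * math.pi * i / epoch_len) for i in range(epoch_len)]
gains = [0.3, 2.0, 0.3]  # the fixated (foveal) target, index 1, evokes the largest VEP
onsets_per_target = [[] for _ in range(n_targets)]
for slot in range(n_slots):
    target = slot % n_targets
    onset = slot * slot_len  # onset marker, as supplied by the synchronous signal
    onsets_per_target[target].append(onset)
    for i, v in enumerate(template):
        eeg[onset + i] += gains[target] * v

fixated, amplitudes = identify_fixated_target(eeg, onsets_per_target, epoch_len)
print("fixated target:", fixated)
```

Because the background activity is not time-locked to the flashes, it shrinks with averaging while the time-locked VEP does not, which is why accurate identification needs many epochs.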

In a code-modulated VEP (c-VEP) BCI, pseudorandom stimulus sequences are used. The most commonly used pseudorandom sequence in c-VEP BCIs is the m-sequence. An m-sequence has an autocorrelation function that closely approximates a unit impulse function, so it is nearly orthogonal to its own time-lagged versions. Thus, in c-VEP BCIs, an m-sequence and its time-lagged versions can be used for different stimulus targets. Sample stimulation sequences and the time courses of their evoked potentials are shown in Fig. 2.21(c) [151]. At the beginning of each stimulation cycle, a synchronous signal, which provides a trigger for target identification, is given to the EEG amplifier. The template matching method is generally used for target identification.
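The impulse-like autocorrelation of an m-sequence, and template matching against its circular shifts, can be sketched as follows. This is a toy illustration under strong assumptions: the 63-bit sequence length, the lag of 4 bits between targets, and the noise level are arbitrary choices, and the "evoked response" is simulated as the bipolar code itself, whereas a real system would use a response template recorded from the user.

```python
import random

def m_sequence(n_bits=6, taps=(0, 1)):
    """One period of a binary m-sequence from the linear recurrence
    a[k + n_bits] = XOR of a[k + t] for t in taps (here x^6 + x + 1)."""
    state = [1] * n_bits
    seq = []
    for _ in range(2 ** n_bits - 1):
        seq.append(state[0])
        new_bit = 0
        for t in taps:
            new_bit ^= state[t]
        state = state[1:] + [new_bit]
    return seq

def circular_shift(x, lag):
    """Delay sequence `x` by `lag` samples, wrapping around."""
    lag %= len(x)
    return x[-lag:] + x[:-lag] if lag else list(x)

def correlate(a, b):
    """Zero-lag correlation (inner product) of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))

def classify(observed, template, lags):
    """Template matching: index of the time-lagged template that best matches."""
    scores = [correlate(observed, circular_shift(template, lag)) for lag in lags]
    return max(range(len(scores)), key=scores.__getitem__)

bits = m_sequence()
code = [2 * b - 1 for b in bits]  # map {0, 1} to {-1, +1}
# Circular autocorrelation: 63 at lag 0 and -1 at every other lag.
autocorr = [correlate(code, circular_shift(code, k)) for k in range(len(code))]

# 16 hypothetical targets, each using the same code delayed by a multiple of 4 bits.
lags = list(range(0, 64, 4))
random.seed(1)
true_index = 5
observed = [v + random.gauss(0.0, 0.5) for v in circular_shift(code, lags[true_index])]
decoded = classify(observed, code, lags)
print("decoded target:", decoded)
```

The near-zero off-peak autocorrelation is what lets a single sequence, shifted in time, serve many targets while a single template is correlated against all candidate lags.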

A c-VEP-based BCI system was developed by Sutter [158]. Recently, Bin et al. [161] described a PC-based c-VEP BCI and tested it in five subjects. The average information transfer rate (ITR) reached 108 ± 12 bits/min, with a maximum of 123 bits/min for one of the subjects studied.

As shown in Fig. 2.21(d) [165, 166], in a phase-modulated VEP (p-VEP) BCI, several targets flicker at the same frequency but with different phases so that more targets can be presented in less time. Jia et al. [164] proposed a coding method using a combination of frequency and phase information. With this method, they developed a BCI system with 15 targets and only three stimulus frequencies. Through the optimization of lead position, reference phase, data segment length, and harmonic components, the average ITR exceeded 60 bits/min in a simulated online test with ten subjects.
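Phase-based target identification can be sketched as follows. This is a simplified illustration rather than the method of Jia et al. [164]: it assumes a single stimulus frequency, six targets spaced 60° apart, and a response phase equal to the stimulus phase plus a fixed, known reference phase; all names and numbers are hypothetical.

```python
import cmath
import math

def fourier_coefficient(signal, fs, freq):
    """Complex Fourier coefficient of `signal` at `freq` (Hz)."""
    n = len(signal)
    return sum(
        x * cmath.exp(-2j * math.pi * freq * i / fs) for i, x in enumerate(signal)
    ) / n

def angular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)

def identify_phase_target(signal, fs, freq, target_phases_deg, reference_phase_deg=0.0):
    """Return the index of the target whose expected response phase is closest
    to the phase measured at the stimulus frequency."""
    measured = math.degrees(cmath.phase(fourier_coefficient(signal, fs, freq)))
    distances = [
        angular_distance(measured, reference_phase_deg + p) for p in target_phases_deg
    ]
    return min(range(len(distances)), key=distances.__getitem__)

# Synthetic 2 s response (assumed values): 10 Hz flicker, user fixates the 120-degree
# target, the response is delayed by a 35-degree reference phase, plus background.
fs, freq, reference = 250, 10.0, 35.0
phases = [0.0, 60.0, 120.0, 180.0, 240.0, 300.0]
phi = math.radians(120.0 + reference)
eeg = [
    math.cos(2 * math.pi * freq * i / fs + phi)
    + 0.5 * math.sin(2 * math.pi * 6.3 * i / fs)
    for i in range(2 * fs)
]
chosen = identify_phase_target(eeg, fs, freq, phases, reference)
print("fixated target phase:", phases[chosen], "degrees")
```

In practice the reference phase depends on the user and the recording site, which is consistent with its appearing among the parameters optimized in the study described above.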

Wang et al. [150] and Bin et al. [151] summarized the pros and cons of VEP-BCIs. The advantages of VEP-BCIs are their simplicity, lower training time, and high information transfer rate. The disadvantages of the system are the need for good gaze control (which people with severe neuromuscular disabilities may lack) and visual fatigue from prolonged fixation.

6.5 BCIs Based on Auditory Evoked Potentials

BCIs that use visual stimuli have been shown to be effective as we discussed earlier. However, some severely disabled people may have difficulty using a BCI that requires good vision due to compromised vision or loss of eye movement control. Nevertheless, even in severely paralyzed patients, such as those suffering from amyotrophic lateral sclerosis (ALS), hearing is usually preserved. Thus, a BCI based on auditory evoked potentials (AEP-BCI) becomes an alternative paradigm.

Auditory evoked potentials (AEPs) are the brain’s response to external auditory stimuli. Two types of AEP-based BCIs have been explored. One uses auditory stimuli as feedback in order to help subjects learn to regulate their sensorimotor rhythms [167] or to regulate the slow cortical potential [168, 169]. The second type of system uses an auditory “oddball” paradigm [170–173]. Most current AEP-based BCIs use an “oddball” paradigm (e.g., [172, 173]). As in the case of the visual P300 described earlier in this chapter, the auditory stimuli in auditory oddball BCIs are divided into two types: frequently presented nontargets and rarely presented targets. For example, spoken digits could comprise a stimulus sequence. The digits would be presented in random order and used to represent the possible selections. In the sequence, all the digits would be standard nontarget stimuli except for one target stimulus, i.e., the subject’s desired choice. The subject is instructed to pay attention to the target digit and perform a mental task when the target digit is spoken (e.g., count each time it is heard). The auditory event-related potentials (auditory ERPs) in response to the target stimulus are similar to those in visual P300-based BCIs. An auditory spelling system was proposed by Furdea et al. [172] and tested with four ALS patients [174]. To compare a user’s performance with the auditory and visual modalities, a 5 × 5 visual support matrix was displayed to the participants. Rows were coded with numbers 1–5 and columns with numbers 6–10. The flashes in a typical visual P300 speller were replaced by spoken digits. As in a visual P300 speller, the subjects using the auditory system were instructed to first select the row number and then the column number containing the target letter. The auditory system was first tested with healthy subjects. Nine of 13 subjects achieved accuracies above 70% [172]. In the study by Kübler et al. [174], four ALS patients used the system and performed above chance level.

Another auditory BCI system was reported by Guo et al. [173]. This system used no visual display support; spoken digits 1–8 were used to represent eight possible selections and the subjects performed a mental task (see Fig. 2.22). During operation, one of the 8 digits is the target that corresponds to the subject’s desired choice. Instead of simple silent counting, the subjects’ task in this paradigm included discriminating the laterality (left or right) or the gender (male or female voice) of the target digit. The spatiotemporal pattern of the target ERP in this paradigm is identified by the N2 component (latency 100-300 ms) and the late positive component (LPC) (latency 400-700 ms). The N2 component represents the auditory processing negativity enhanced by voluntary endogenous attention, and the broad LPC reflects the memory-updating operations. Both N2 and LPC were selected as salient markers of the brain’s response to the attended target (see Fig. 2.23). This task significantly enhances the LPC in amplitude and duration [173].

Fig. 2.22
figure 000222

Auditory BCI scheme with a voice-sequence design: Each sequence consists of eight Chinese voice stimuli in random order; the eight digits are presented once each in a trial. The duration of each stimulus is 200 ms and the ISI is random from 50 to 200 ms

Fig. 2.23
figure 000223

The temporospatial pattern of auditory ERPs from all subjects’ grand averages for the condition “Task-LR.” (a) Grand-average waveform. Solid curves show the averaged waveform at electrode P3 for target stimuli, while dashed curves show the averaged waveform at the same electrode for nontarget stimuli. (b) Grand-average amplitude topographic maps of all subjects at 155 ms (N2) and 505 ms (LPC) (from [173], with permission)

Compared to the visual spelling system, users’ performance with the auditory speller was lower and the peak latencies of the auditory ERPs were longer. However, for severely disabled people with compromised vision or loss of eye movement control, AEP-based BCIs might provide a preferred way to communicate with the external world, and thus are worthy of further study.

6.6 Attention-Based BCI

In a conventional SSVEP BCI system, the subject overtly directs attention to one of the stimuli by changing his or her gaze direction. The attended stimulus elicits enhanced SSVEP responses at the corresponding frequency over occipital brain areas. This kind of system is considered a ‘dependent’ BCI since muscle activity such as that producing gaze shifting may be necessary. Therefore, it might not be usable by people who have lost control of gaze direction.

A large number of psychophysical and neurophysiological studies have shown that people can covertly shift attention to different spatial locations without redirecting gaze [175, 176]. In addition, shifting attention to one out of several superimposed objects can improve behavioral performance (reaction time and accuracy) and increase neuronal responses compared to paradigms in which the object is unattended [177]. Kelly et al. [178, 179] reported a BCI based on spatial visual selective attention. Two bilateral flickers with superimposed letter sequences were presented to the subjects. The subjects covertly attended to one of the two bilateral flickers for target selection. Greater than 70% average accuracy was achieved with this system. Zhang et al. [180] explored a nonspatial visual selective attention-based BCI. Two sets of dots with different colors and flicker frequencies, rotating in opposite directions, were used to induce the perception of two superimposed, transparent surfaces. Because the surfaces flickered at different frequencies, they elicited distinguishable SSVEPs. When a subject selectively attended to one of the two surfaces, the SSVEP amplitude at the corresponding frequency was enhanced, so that the subject could select between two different BCI outputs. This system was tested in healthy subjects in a 3-day online training program. An average online classification accuracy of 72.6 ± 16.1% was achieved on the last training day.

Visual selective attention-based BCIs have thus far provided only binary control. However, their gaze-independent performance encourages further study, including the development of a multiple-selection system. These systems may be a good option for paralyzed people who cannot control gaze direction well. They might enable such users to control a BCI by employing covert attention shifts instead of changes of gaze direction.

7 BCI Performance Assessment

A BCI user controls brain signal features that the BCI can recognize and translate into control commands. The performance of BCIs can be affected by the differences among users, by the varying signal-processing abilities of the BCI systems, or by the signal acquisition protocols used in the BCI systems. In order to better understand the impact of these factors, researchers usually assess BCI performance with respect to one factor at a time.

For example, for communication systems, the traditional unit of measure is the amount of information transferred in a unit of time. Therefore, the performance measure can be indicated by bits per trial and bits per minute. This provides a tangible measure for making intra-system and inter-system performance comparisons. For other systems aimed at replacing motor function, it is not only the attainment of the goal (i.e., reaching a target location) that matters, but also how well the continuous trajectories are reconstructed. Therefore, the performance measure can be indicated by statistical measures for goodness of fit, such as the coefficient of determination (\( r^2 \)).

7.1 User Performance Assessment

The square of the Pearson product–moment correlation coefficient (PPMCC) is denoted as \( r^2 \) and has been widely used in the assessment of BCI user performance.

The PPMCC between two variables X and Y is defined as the covariance of the two variables divided by the product of their standard deviations:

$$ {\rho_{{X,Y}}} = \frac{{{\rm cov} (X,Y)}}{{{\sigma_X}{\sigma_Y}}} = \frac{{E[\left( {X - {\mu_X}} \right)\left( {Y - {\mu_Y}} \right)]}}{{{\sigma_X}{\sigma_Y}}} $$
(2.1)

where \( \mu_X \) and \( \mu_Y \) are the means and \( \sigma_X \) and \( \sigma_Y \) are the standard deviations of X and Y, respectively.

Substituting estimates of the covariances and variances based on samples gives the sample correlation coefficient, commonly denoted by r:

$$ r = \frac{\sum\limits_{i = 1}^n \left( X_i - \bar{X} \right)\left( Y_i - \bar{Y} \right)}{\sqrt{\sum\limits_{i = 1}^n \left( X_i - \bar{X} \right)^2} \, \sqrt{\sum\limits_{i = 1}^n \left( Y_i - \bar{Y} \right)^2}} $$
(2.2)

where r ranges between −1 and +1. Its square (\( r^2 \)) then has a value between 0 and 1. A value of \( r^2 \) close to 1 indicates a strong linear relationship between X and Y, whereas a value close to 0 indicates that there is very little linear correlation.

In BCI systems, user performance can be defined as the level of correlation between the user’s intent and the brain signal feature(s) that the BCI translates into its output commands.
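As a small worked illustration of Eq. (2.2), the sketch below computes r and its square for a hypothetical set of trials in which the user's intent is coded as 0 or 1 and the signal feature is an amplitude in arbitrary units. The data values and the function name `pearson_r` are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Sample correlation coefficient r between two equal-length sequences, Eq. (2.2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical trials: intent (0 = one target, 1 = the other) and the feature value
# that the BCI would translate into its output command.
intent = [0, 0, 0, 0, 1, 1, 1, 1]
feature = [2.1, 1.8, 2.4, 2.0, 3.9, 4.2, 3.6, 4.1]

r = pearson_r(intent, feature)
r_squared = r * r  # roughly 0.95 for these made-up values
print(round(r_squared, 3))
```

A value this close to 1 would indicate that the feature varies almost linearly with the user's intent; features that do not distinguish the two intents would give values near 0.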

7.2 System Performance Assessment

Many different BCI systems have been studied. They differ in inputs, outputs, translation algorithms, and other characteristics. To compare and evaluate the performance of different BCI systems, an objective measure is required. BCIs provide the capability of communication between brain signals and external devices. Therefore, the information transfer rate (ITR) has been used as one of the primary metrics to evaluate BCI system performance.

Most current BCI systems translate the user’s brain signal features into output commands by a regression method or by a classification method. The former has the advantage of requiring only one translation function for each dimension of the matrix of possible output commands, while the latter requires additional functions as additional output commands are added.

The most popular current method for ITR calculation was defined by Wolpaw et al. [181] and discussed further in McFarland and Krusienski [127]. The definition is a simplified computational model based on Shannon channel theory under several assumptions. The measure of ITR is the bit rate B (bits/symbol), as shown in Eq. (2.3).

$$ B = {\log_2}N + P{\log_2}P + (1 - P){\log_2}[(1 - P)/(N - 1)] $$
(2.3)

where N is the number of possible selections, P is the accuracy (the probability that the desired selection will be selected), and B is the number of bits per trial. If the execution time per symbol selection is T (in seconds), then the bits per minute \( B_t \) can be calculated as follows.

$$ {B_t} = B \times (60/T) $$
(2.4)

It is worth noting that the use of Eqs. (2.3) and (2.4) is conditional, because the following assumptions were used in the derivation of Eq. (2.3).

(1) BCI systems are memoryless and stable transmission channels.

(2) All the output commands (i.e., selections) have the same probability of selection \( (p({w_i}) = 1/N) \).

(3) The translation accuracy is the same for all the selections \( (p({y_i}/{x_i}) = p({y_j}/{x_j})) \).

(4) The translation error is equally distributed among all the remaining selections \( {p_{{j \ne i}}}({y_j}/{x_i}) = \frac{{1 - p({y_i}/{x_i})}}{{N - 1}} \).

(5) The translation accuracy is above the chance level.

The ITR given by Eqs. (2.3) and (2.4) depends on both speed and accuracy. Figure 2.24 illustrates the relationship between accuracy and bit rate for different numbers of selections.
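Equations (2.3) and (2.4) translate directly into a few lines of code. The sketch below is a minimal illustration with hypothetical function names; it takes T in seconds and treats the terms with P = 0 or P = 1 as zero, which is their limiting value.

```python
import math

def bits_per_trial(n, p):
    """Bit rate B of Eq. (2.3) for n possible selections and accuracy p."""
    if n < 2 or not 0.0 <= p <= 1.0:
        raise ValueError("need n >= 2 and 0 <= p <= 1")
    b = math.log2(n)
    if p > 0.0:
        b += p * math.log2(p)
    if p < 1.0:
        b += (1.0 - p) * math.log2((1.0 - p) / (n - 1))
    return b

def bits_per_minute(n, p, t_seconds):
    """Bit rate B_t of Eq. (2.4), with the selection time T given in seconds."""
    return bits_per_trial(n, p) * 60.0 / t_seconds

# Four choices selected with perfect accuracy carry 2 bits per trial;
# at one selection every 5 s (12 trials/min) this gives 24 bits/min.
print(bits_per_trial(4, 1.0))        # 2.0
print(bits_per_minute(4, 1.0, 5.0))  # 24.0
```

At chance accuracy (P = 1/N) the formula gives 0 bits per trial, which is consistent with assumption (5): the expression is meant to be applied only at or above the chance level.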

Fig. 2.24
figure 000224

Information transfer rate in bits/trial (i.e., bits/selection) and in bits/min (for 12 trials/min) when the number of possible choices (i.e., N) is 2, 4, 8, 16, or 32. As derived from Pierce [182] (and originally from [183]), if a trial has N possible choices, if each choice has the same probability of being the one that the user desires, if the probability (P) that the desired choice will actually be selected is always the same, and if each of the other (i.e., undesired) choices has the same probability of selection [i.e., (1 − P)/(N − 1)], then bit rate, or bits/trial (B), is \( B = \log_2 N + P\log_2 P + (1 - P)\log_2[(1 - P)/(N - 1)] \). For each N, bit rate is shown only for accuracy ≥ 100/N % (i.e., ≥ chance) (from [10], with permission)

In reality, \( r^2 \) and ITR are just two factors that can be used for BCI performance assessment. Other factors important for BCI evaluation include invasiveness, training time, ease and comfort of use, cost, and others. The significance of these various factors may vary across different BCI applications.

7.3 BCI Training

The effectiveness of a BCI depends on the capacity of the user to produce brain signals that reflect intent and that the BCI can decode accurately and reliably into output commands that achieve that intent [10, 184]. Control of brain activity is harder to achieve than control of motor activity partly because the user can neither identify nor discern the activity. The user can only comprehend EEG activity through the feedback received from the BCI system. Different BCI systems use different strategies to help users learn to control the crucial brain signals.

7.3.1 Cognitive tasks

Many BCIs ask the user to perform specific cognitive tasks that generate recognizable EEG components (i.e., components that the BCI can decode into intent). Motor imagery (MI) tasks have been the most widely used cognitive tasks. For each selection, the user imagines or plans one of several motor movements (e.g., left- or right-hand movement) based on visual or aural cues. Research has shown that this generates brain signals (e.g., from sensorimotor cortex) that can be detected by EEG or fMRI [24, 66]. After several training sessions, the user is usually able to produce a specific pattern of signal features (e.g., amplitudes in specific frequency bands at specific locations) by performing a specific cognitive task [185].

Other cognitive tasks can be used, such as arithmetic (addition of a series of numbers), visual counting (sequential visualization of numbers), geometric figure rotation (visualization of rotation of a 3D object around an axis), letter composition (nonvocal letter composition), and baseline (relaxation). Numerous studies have shown that these tasks produce components detectable in the EEG [24, 25, 87, 140, 185–188].

7.3.2 Operant conditioning

In contrast, the operant conditioning approach does not require the user to perform specific cognitive tasks. The focus of this method is to help the user gain automatic control of the device by thinking about anything he or she chooses. The feedback provided by the system serves to condition the user to continue to produce and control the EEG components that have achieved the desired outcome. With continuous practice, the user is able to gain control of the device without necessarily being aware of the specific EEG components being produced (e.g., [8, 53]). It is important to note, however, that the operant conditioning method often uses motor imagery tasks to initially acclimate users to the concept that brain waves can be controlled.

7.3.3 Factors that affect training

Both methods of training, cognitive tasks and operant conditioning, are influenced by many external factors. Some of the most common factors are concentration, distraction, frustration, emotional state, fatigue, motivation, and intentions. It is important to counteract these factors during training by providing ample feedback and varying the duration or frequency of the training sessions.

In addition, the EEG components produced by cognitive tasks are vulnerable to the amount of direction provided to the user. Motor imagery, for example, is subject to issues such as first/third person perspective, visualization of the action versus retrieving a memory of the action performed earlier, imagination of the task as opposed to a verbal narration, etc. Research has yet to prove whether users can effectively control such fine details to produce significant change in the components they produce.

It is important to note that these two methods of training, cognitive tasks and operant conditioning, are not completely different. As BCI use continues with either method, the brain is likely to adapt its signals to optimize BCI performance. Such adaptation occurs with both EEG-based and single-neuron-based BCIs (e.g., [31, 51]). Thus, in terms of long-term BCI use, cognitive tasks may essentially provide starting points (i.e., initially distinguishable brain signal feature vectors that represent specific BCI outputs) for continuing operant conditioning that improves and maintains the user’s production of brain signals that the BCI can decode accurately and reliably into output commands that achieve the user’s intent.

The major focus of BCI development thus far has been to provide communication for severely disabled people. It is possible that some potential users have disorders that are also cognitively debilitating in ways that preclude their control of signals from areas of the brain that may be important for BCI control. The left hemisphere of the brain, for example, is the center of activity for tasks involving language, numbers, and logic, while the right hemisphere is more active during spatial relations and movement imagery. Users need to be paired with the cognitive tasks that best suit their capabilities.

As indicated earlier, it is possible to discern different cognitive tasks based on the EEG components generated when the task is performed. When using a set of cognitive tasks during training, overlap of EEG signals can occur if the tasks require similar skills or cortical areas. It is important to choose tasks with contrasting EEG components for easy discrimination.

Another factor to consider during training is the particular EEG component to use. P300 responses, for example, require less training time than that needed by a user learning to control sensorimotor rhythms. As mentioned earlier, choosing contrasting cognitive tasks accelerates training. It is also important to maintain consistent training regimens to ensure that subjects retain their ability to control their EEG components.

The tasks used in training carry forward into general BCI usage. The method of training, therefore, is associated with the method of signal acquisition. Neuronal activity generated by specific cognitive tasks is focused in specific areas of the brain. This allows signal acquisition to occur over a few electrodes that encompass these areas.

8 Future Expectations and Critical Needs

8.1 Expectations

BCI research and development evokes a great deal of excitement in scientists, engineers, clinicians, and the public in general. This excitement is largely in response to the considerable promise of BCIs. With continued development, they may replace or restore useful function to people severely disabled by neuromuscular disorders. In addition, BCIs might augment natural motor outputs for pilots, surgeons, other professionals, or ordinary citizens for daily activities. They might also give new opportunities and challenges to artists, athletes, and video-gaming enthusiasts. Furthermore, BCIs might conceivably also improve rehabilitation methods for people with strokes, head trauma, and other devastating disorders.

At the same time, it is clear that this exciting future can become reality only if BCI researchers and developers address and resolve problems in crucial areas including: signal acquisition, BCI validation and dissemination, and reliability.

8.2 Signal Acquisition

BCI systems depend on the sensors and the related hardware that record the crucial brain signals. Improvements in this hardware are essential. EEG-based (noninvasive) BCIs should: have electrodes that do not need skin abrasion or conductive gel (i.e., so-called dry electrodes); be small and portable; use comfortable, convenient, and attractive mountings; be easy to set up; work for many hours without needing maintenance; work reliably in any environment; use telemetry rather than connecting wires; and interface easily with many different applications. Reliable performance in all relevant environments may be especially hard to ensure and should therefore be a major research goal.

BCIs that employ implanted electrodes (i.e., invasive BCIs) face a number of complex issues, some of which are not yet fully understood. These systems require hardware that: is safe and completely implantable; stays intact, functional, and reliable for many years; records stable signals for many years; transmits the recorded signals using telemetry; is able to be recharged in situ (or has batteries that last for many years); has external components that are durable, comfortable, convenient, and unobtrusive; and interfaces readily with a range of high-performance applications. While considerable progress has been made in the past few years, it is not yet clear which possible solutions will be most successful, or how successful they can be. Fundamental innovations in sensor technology may be needed for invasive BCIs to achieve their full promise. Much of this critical research will continue to rely mainly on animal studies to supply the technology and justification essential for human trials.

8.3 Clinical Validation and Dissemination

Numerous different noninvasive and invasive BCIs are being developed. As this work proceeds and BCIs start to actually be used clinically, two key questions must be addressed: how good a particular BCI can get (e.g., how capable and reliable) and which BCIs are the best choices for which clinical purposes. To address the first question, each candidate BCI should be optimized and the limits on users’ capacities with it should be determined. Engaging the second question will require some consensus among researchers concerning which applications to use for comparing BCIs and concerning how their performance should be measured. One obvious example is the question of whether BCIs that use intracortical signals can perform better than BCIs that use ECoG signals, or even EEG signals, and if their performance justifies the necessary electrode implantation by surgery. For many people, invasive BCIs will need to perform much better to be considered preferable to noninvasive BCIs. It is as yet unclear whether they can do so. Contrary to widespread expectations, the available data do not provide a clear answer to this critical question [139].

Furthermore, the widespread clinical usage of BCIs by people with disabilities requires definite validation of their real-life value in efficacy, practicality, and effect on quality of life. Such validation depends on multidisciplinary groups able and willing to perform chronic studies of real-life use in complex and frequently difficult environments. These studies, which are just beginning (e.g., [91]), are a critical step if BCIs are to achieve their promise. The results of these studies could also shape the development of BCIs for the general population. The clear validation of BCIs for functional rehabilitation after strokes or in other disorders will be similarly demanding and will necessarily entail direct comparisons with the outcomes of conventional methods alone.

Present-day BCIs, with their modest capabilities, are likely to be useful primarily for people with very severe disabilities. This user population is relatively small, and thus these BCIs are essentially an orphan technology. That is, there is not sufficient incentive for commercial entities to develop and manufacture them and to promote and support their widespread dissemination. Definite evidence that BCIs can help in rehabilitation might considerably increase the potential user population. Furthermore, if and when further research increases BCI capacities and makes them commercially viable, their dissemination will depend on effective business models that provide both financial incentive for the commercial enterprise and sufficient reimbursement to the clinical and technical personnel who will be needed to deploy and support the BCI systems. The best scenario might be one in which BCIs for people with severe disabilities and BCIs for the general population develop synergistically. Development of the former could produce crucial knowledge, hardware, and clinical experience, and development of the latter could provide the commercial incentive, simplifications, and robustness essential for widespread dissemination.

8.4 Reliability

As the previous sections indicate, the future of BCIs depends on improvements in signal acquisition, definitive validation studies, and effective dissemination models. However, these needs are far exceeded by those related to the problem of reliability. For all researchers, with any recording method, signal type, or signal-processing algorithm, BCI reliability remains poor for any but the most basic applications. BCIs adequate for actual use in real life need to be as dependable as conventional muscle-based actions. Without substantial improvements in dependability, the practical value of BCIs will remain confined to the simplest, most basic communication functions for people with very severe disabilities.

Effective solutions to this problem require the recognition and engagement of three key issues: the critical role of adaptive interactions in BCI operation; the value of developing BCIs that imitate the distributed functioning typical of the normal CNS; and the need to incorporate additional brain signals and provide additional sensory feedback during BCI operation.

BCIs provide the CNS with the chance to master novel skills in which brain signals substitute for the spinal motoneurons that produce natural muscle-based skills. Muscle-based skills rely for their initial mastery and long-term preservation on continual activity-dependent plasticity in many CNS areas, from the cortex to the spinal cord. This plasticity, which can require practice over many months or even years, allows infants to learn to walk and talk; children to master reading, writing, and arithmetic; and adults to acquire many different athletic and intellectual skills.

Acquisition and maintenance of BCI-based skills, such as robust multidimensional movement control, depend on comparable plasticity (e.g., [31, 51–53, 189]). BCI operation requires the successful interaction of two adaptive controllers, the CNS and the BCI. The BCI needs to adapt so that its output commands correspond to the intent of the user. Concurrently, the BCI needs to encourage and facilitate CNS plasticity that improves the reliability and precision with which the brain signals encode the intent of the user. In summary, the BCI and CNS need to work together to master and maintain a partnership that is reliable in all circumstances. The work required to realize this essential partnership has just started. It engages basic neuroscientific questions and may produce valuable new insights into CNS function. Thus, BCI research has importance for neuroscience in general, independent of the practical uses that are the primary focus of most BCI research and development.
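The BCI side of this two-controller interaction can be pictured with a toy adaptive decoder. The following is a minimal sketch under stated assumptions, not a method described in this chapter: the decoder is linear, the user's intent is known during training trials, and the weights are updated with a least-mean-squares (LMS) rule. The names `AdaptiveDecoder`, `decode`, and `adapt`, as well as the training data, are hypothetical. The CNS side of the partnership (the user's own plasticity) is not modeled here.

```python
class AdaptiveDecoder:
    """Toy linear decoder with LMS weight updates (illustrative only)."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = [0.0] * n_features
        self.learning_rate = learning_rate

    def decode(self, features):
        # The output command is a weighted sum of brain-signal features.
        return sum(w * f for w, f in zip(self.weights, features))

    def adapt(self, features, intended):
        # LMS step: shift the weights to reduce the error between the
        # decoded command and the user's known intent on this trial.
        error = intended - self.decode(features)
        self.weights = [w + self.learning_rate * error * f
                        for w, f in zip(self.weights, features)]
        return error


decoder = AdaptiveDecoder(n_features=2)
# Hypothetical training pairs (features, intended command), all
# consistent with the mapping intent = 2 * f0 - 1 * f1.
trials = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
for _ in range(200):
    for features, intended in trials:
        decoder.adapt(features, intended)
```

In a real system the features themselves would drift as the user learns, so the decoder would be tracking a moving target rather than the fixed mapping assumed here.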

The fundamental importance of CNS adaptation implies that the key problems in BCI research are neurobiological. The principles that determine how the CNS masters, improves, and preserves its natural muscle-based skills are likely to be the best guide for designing BCI systems. CNS control of actions is typically distributed among multiple areas. While cortical areas may define the goal and the broad outlines of an action, the details (especially high-speed sensorimotor interactions) are often managed subcortically. Thus, spinal reflex circuits respond earliest to sudden load changes or postural disturbances; the cortex learns of these events only later and may or may not initiate further corrective responses. Furthermore, control is distributed in the CNS in accord with the demands of the task. Piano playing can require cortical control of every finger individually, while merely grasping an object may not do so.

The performance of BCIs is also likely to benefit from comparable distribution of control. In this case, the distribution would be between the BCI’s output commands (that is, the user’s intent) and the application that receives the commands and then converts them into action. The most effective distribution will probably vary with the BCI and with the application. Reliable BCI performance could be facilitated by putting into the application as much control as is consistent with the action that is to be produced, just as the distribution of control in the CNS normally adapts to fit each muscle-based action.
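One simple way to picture such a distribution is a blending rule in which the application contributes part of each movement command. The sketch below is an illustrative assumption rather than a scheme taken from the literature cited here; the function `blend_command` and its `autonomy` parameter are hypothetical names.

```python
def blend_command(user_velocity, assist_velocity, autonomy):
    """Mix the BCI-decoded command with the application's own command.

    autonomy = 0.0 -> the user controls every detail;
    autonomy = 1.0 -> the application handles the details itself.
    """
    if not 0.0 <= autonomy <= 1.0:
        raise ValueError("autonomy must lie in [0, 1]")
    return tuple((1.0 - autonomy) * u + autonomy * a
                 for u, a in zip(user_velocity, assist_velocity))


# Hypothetical 2-D example: a noisy decoded velocity and the
# application's straight-line velocity toward a selected target.
decoded = (1.0, 0.5)
toward_target = (1.0, 0.0)
grasp_like = blend_command(decoded, toward_target, autonomy=0.75)
piano_like = blend_command(decoded, toward_target, autonomy=0.0)
```

A task resembling a simple grasp could run with high `autonomy`, leaving the details to the application, while a task that needs fine individual control would keep `autonomy` low.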

The natural muscle-based CNS outputs are products of the combined contributions of numerous areas from the cortex to the spinal cord. This reality suggests that BCI performance might be improved and stabilized by employing signals from more than one brain area and by employing brain signal features that represent relationships among different areas (e.g., coherences). By permitting the CNS to operate more in the way it does in producing muscle-based actions, this approach could substantially increase BCI reliability. Research of this kind has recently begun.
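As an illustration of a feature that represents a relationship between areas, magnitude-squared coherence at a frequency f can be estimated as |Sxy(f)|^2 / (Sxx(f) Syy(f)), with the spectra averaged over several signal segments. The sketch below is a minimal pure-Python version under stated assumptions: non-overlapping segments, no windowing, a single frequency bin, and synthetic signals. The function name `coherence`, the sampling rate, and the signals `area_a` and `area_b` are hypothetical.

```python
import cmath
import math
import random


def coherence(x, y, sample_rate, freq, segment_len):
    """Magnitude-squared coherence of x and y at one frequency.

    Spectra are averaged over non-overlapping segments; with a single
    segment the estimate is trivially 1, so several are required.
    """
    n_segments = min(len(x), len(y)) // segment_len
    if n_segments < 2:
        raise ValueError("need at least two segments")
    # Complex exponential for the single DFT bin of interest.
    kernel = [cmath.exp(-2j * math.pi * freq * n / sample_rate)
              for n in range(segment_len)]
    sxx = syy = 0.0
    sxy = 0j
    for s in range(n_segments):
        start = s * segment_len
        fx = sum(x[start + n] * kernel[n] for n in range(segment_len))
        fy = sum(y[start + n] * kernel[n] for n in range(segment_len))
        sxx += abs(fx) ** 2          # auto-spectrum of x
        syy += abs(fy) ** 2          # auto-spectrum of y
        sxy += fx * fy.conjugate()   # cross-spectrum of x and y
    return abs(sxy) ** 2 / (sxx * syy)


random.seed(0)
fs, seg = 128, 128  # 1-second segments at a hypothetical 128 Hz rate
n_samples = 16 * seg
shared = [math.sin(2 * math.pi * 10 * k / fs) for k in range(n_samples)]
# Two synthetic "areas": a shared 10 Hz rhythm plus independent noise.
area_a = [s + random.gauss(0, 0.5) for s in shared]
area_b = [s + random.gauss(0, 0.5) for s in shared]
coh_10hz = coherence(area_a, area_b, fs, 10.0, seg)
coh_30hz = coherence(area_a, area_b, fs, 30.0, seg)
```

With these synthetic signals the estimate at 10 Hz, where the rhythm is shared, should be high, and the estimate at 30 Hz, where only independent noise is present, should be low. For real recordings a windowed, overlapping estimator such as `scipy.signal.coherence` would be a more appropriate choice.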

Employing signals from multiple brain areas might also mitigate another obstacle to practical BCIs. In present-day systems, the BCI rather than the user generally determines when BCI output commands are produced. Ideally, however, BCIs should be self-paced so that the BCI is continuously available and the user’s brain signals control when BCI output commands are produced. BCI systems that employ signals from multiple areas are more likely to be sensitive to current context, and therefore may be better able to determine when it is or is not appropriate for them to send output commands to their applications.
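Self-paced operation can be sketched as a gate that keeps the BCI idle unless the user's control signal shows a sustained change. The example below is a hypothetical illustration, not an approach specified in this chapter: it assumes a single scalar control signal and issues a command only after the signal has exceeded a threshold for a fixed number of consecutive samples. The names `self_paced_commands`, `threshold`, and `dwell` are assumptions.

```python
def self_paced_commands(control_signal, threshold, dwell):
    """Yield sample indices at which an output command is issued.

    A command fires only after the signal has stayed above `threshold`
    for `dwell` consecutive samples; otherwise the BCI stays idle.
    """
    run = 0
    for i, value in enumerate(control_signal):
        # Count consecutive supra-threshold samples; reset on any dip.
        run = run + 1 if value > threshold else 0
        if run == dwell:
            yield i


# Hypothetical control signal: one brief excursion (ignored) and one
# sustained excursion (treated as an intentional command).
signal = [0.1, 0.9, 0.2, 0.8, 0.9, 0.95, 0.9, 0.1]
issued = list(self_paced_commands(signal, threshold=0.7, dwell=3))
```

Because a command fires only when the run length equals `dwell`, a single sustained excursion produces one command rather than a stream of them. A context-sensitive system of the kind described above would replace the fixed threshold with a decision based on signals from several areas.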

Lastly, the feedback that present-day BCIs give their users is primarily visual, and thus relatively slow and often imprecise. Natural muscle-based skills rely on multiple types of sensory input (e.g., proprioceptive, cutaneous, visual, auditory). BCIs that control applications that produce complex high-speed movements (e.g., limb movements) would benefit from sensory feedback that is faster, more precise, and more comprehensive than vision alone. Work seeking to provide such feedback using stimulators in cortex or elsewhere has begun. The best techniques will almost certainly vary with the BCI, the application, and the user’s disability (e.g., peripheral inputs may not be useful in many people with spinal cord injuries).

9 Conclusion

Numerous researchers throughout the world are realizing BCI systems that only a few years ago might have been considered science fiction. Figure 2.25 illustrates the publication years of essentially all peer-reviewed BCI articles that have appeared to date and shows that a majority of all the articles ever published have appeared just in the past few years. These BCIs use a variety of different brain signals, recording techniques, and signal-processing methods. They can operate a wide variety of different applications, including communication programs, cursors on computer screens, wheelchairs, and robotic arms. A small number of people with severe disabilities are already employing BCIs for simple communication and control functions in their everyday lives. With improved signal-acquisition hardware, definitive clinical validation, effective dissemination models, and, most importantly, better reliability, BCIs could become a major new technology for people with disabilities, and perhaps for the general population as well.

Fig. 2.25

Peer-reviewed BCI articles in the scientific literature. Over the past 15 years, BCI research, which was previously limited to a very few research groups, has become an extremely active and rapidly growing scientific field. The majority of research articles have been published in the last 5 years (from [13], with permission)