Introduction

What is it like to control the world with your mind? Psychokinesis (“mind movement” in Greek) is “an alleged psychic ability allowing a person to influence a physical system without physical interaction” (Wikipedia). While there is no evidence that such parapsychological abilities actually exist, the integration of two technologies – brain-computer interfaces (BCI) and virtual reality (VR) – now allows a wide range of experiences whereby participants can control various aspects of their environment using mental effort alone.

This chapter is intended neither as a tutorial on BCI nor as a tutorial on immersive virtual reality. Rather, we focus on the outcome of bringing these two disciplines together. For recent reviews on brain-computer interfaces, we recommend other sources (Huggins and Wolpaw 2014; Krusienski et al. 2011; van Gerven et al. 2009), and we provide only a brief introduction here. In addition, we focus on the human-computer interface aspects, getting into the BCI engineering aspects only when they are relevant.

Most BCI research in humans is done with electroencephalography (EEG), in which electrodes are placed on the scalp. Neuroscientific studies overcome the low signal-to-noise ratio of EEG by averaging responses over multiple subjects and multiple events. BCI does not have this luxury, as it requires reasonable accuracy in decoding every single trial, in real time, and thus only a small number of “thought”-based interaction paradigms are possible. In the last two decades, only three EEG-based paradigms have been recruited for BCI. Two of these methods, P300 and SSVEP, are based on evoked potentials and are thus externally driven; i.e., the interaction requires an external stimulus to be provided to the participant, and the participant’s commands are inferred from the neural response to this stimulus. The P300 paradigm utilizes the fact that infrequent events that the subject anticipates – the so-called oddball paradigm – elicit the P300 component of the event-related potential (ERP) (Donchin et al. 2000). The steady-state visually evoked potential (SSVEP) paradigm utilizes the fact that when the retina is excited by a flickering visual stimulus, the brain generates electrical activity at the same frequency (or at multiples of it) (Cheng et al. 2002). Although these paradigms are based on brain signals, they can be argued to be functionally equivalent to control using eye gaze (Brunner et al. 2010). The third paradigm is based on subjects imagining moving their left hand, right hand, or legs, which is referred to as motor imagery. This paradigm is internally driven and can be used in ways that intuitively map “thoughts” to functionality. However, it is limited in that it requires extensive training, not everyone can use it (Guger et al. 2003), and its information transfer rate is lower than that of the other two paradigms.
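To make the SSVEP principle concrete, here is a minimal, hypothetical sketch – not taken from any of the systems cited here – of deciding which of several flickering targets a user is attending to, by comparing spectral power at each target’s flicker frequency and its second harmonic. The sampling rate, target frequencies, and bandwidth are illustrative assumptions.

```python
import numpy as np

def ssvep_classify(eeg, fs=250.0, target_freqs=(10.0, 12.0, 15.0)):
    """Pick the SSVEP target whose flicker frequency (plus its 2nd
    harmonic) carries the most spectral power in an EEG segment."""
    windowed = eeg * np.hanning(len(eeg))        # reduce spectral leakage
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)

    def band_power(f, width=0.5):                # power within +-0.5 Hz of f
        mask = (freqs >= f - width) & (freqs <= f + width)
        return power[mask].sum()

    scores = [band_power(f) + band_power(2 * f) for f in target_freqs]
    return int(np.argmax(scores))

# Example: 2 s of synthetic EEG dominated by a 12 Hz steady-state response.
fs = 250.0
t = np.arange(0, 2.0, 1.0 / fs)
eeg = np.sin(2 * np.pi * 12.0 * t) + 0.5 * np.random.randn(len(t))
print(ssvep_classify(eeg, fs))                   # most likely 1 (12 Hz)
```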

In this chapter, we focus on virtual reality not only as a technology but also as a conceptual framework. The early pioneer Ivan Sutherland envisioned VR as the ultimate display (Sutherland 1965). Brain-computer interface, in theory, has the potential to become the ultimate interaction device – just “think” of something and it happens. Current state of the art in BCI is, of course, very far from that vision; at the moment, BCI should be referred to as “brain reading” rather than “mind reading,” i.e., it is often based on decoding brain waves rather than decoding mental processes (“thoughts”). Eventually, there may be a one-to-one mapping from brain waves to mental processes, but with the current recording techniques, the brain patterns that can be detected are much coarser than specific thoughts.

The relationship between VR and BCI goes further. Recent attempts at explaining the illusions that can be so powerfully induced by highly immersive VR mostly rely on the sensorimotor contingencies between perception and action (Slater 2009). Thus, unlike more traditional interfaces such as keyboard and mouse, VR is based on body-centered interaction and on the immediate feedback that participants receive when they move their bodies. BCI, however, bypasses the muscles and the body, allowing the brain to control the environment directly. The combination of VR and BCI may thus lead to an extreme state of disembodiment – the closest we can get to being a “brain in a vat” (Putnam 1982). Char Davies, in her VR art pieces Osmose and Ephemere, set out to challenge the “disembodied techno-utopian fantasy” by having participants control VR with their breathing – thus bringing the body back into VR (Davies and Harrison 1996; Davies 2004). In this sense, BCI-VR takes us a step backward: while VR attempts to bring our whole body back into the digital realm, BCI attempts to bypass our bodies (Friedman et al. 2009). Until recently, video games were not played in highly immersive setups and thus did not utilize the full consequences of VR. However, at the time of writing, the popularity of low-cost VR devices suggests that this may change.

Why is VR a natural addition to BCI? First, the reasons to use VR for BCI are the same as for using VR in general: it is the best option for exploring and practicing tasks in an environment that is dynamic and realistic yet controlled and safe. For example, VR can be used for evaluating BCI and training paralyzed patients before they attempt to use the BCI in the physical world (Leeb et al. 2007a). In addition, VR can provide motivation for BCI training, which is often lengthy and tedious; motivation has also been shown to play an important role in BCI used by paralyzed patients (Alkadhi et al. 2005). Emotionally relevant stimuli enhance BCI performance, which has led some researchers to embed faces in the visual stimuli used for SSVEP and P300 BCIs, rather than just letters or abstract symbols. Using BCI in VR is expected to lead to higher emotional responses. An interesting finding relates to changes in heart rate in VR BCI. In “typical” BCI, with abstract feedback, heart rate is expected to decrease, but it has been found to increase in VR BCI (Pfurtscheller et al. 2008); this is further evidence that VR feedback has a different physiological effect on subjects than “typical” BCI.

While developers of both VR and BCI still face many technical challenges, both fields may be at the stage of moving out of the research laboratories and into the real world. At the time of this writing, low-cost VR devices are becoming available to the mass market. Low-cost EEG devices, such as the Emotiv EPOC or the Interaxon MUSE, are also available. Most of these EEG devices are limited in signal quality, but they may be at least partially sufficient for BCI (Liu et al. 2012). There are open software platforms for BCI development and customization. The OpenVibe platform may be an easy way to get started – it offers visual programming for nonprogrammers – and it is integrated with a VR environment (Renard et al. 2010).

In this chapter we review over 10 years of BCI-VR research. Our focus will be on human-computer interaction paradigms, and our main goal is to highlight both the constraints and the opportunities of BCI and VR combined. Consequently, the chapter will be divided into four themes: (i) navigation, (ii) controlling a virtual body, (iii) controlling the world directly, and (iv) paradigms beyond direct control.

Navigation: Controlling the Viewpoint

Typically, our brain controls our body in an action-perception loop: the brain sends commands to the muscles for generating motor movement, and sensory information provides feedback to the brain regarding the resulting body motion and its effects on the environment. A natural BCI paradigm would therefore aim at substituting the physical body with a virtual body. Such substitution can take place in two ways. The first is by allowing the participant to perform navigation – implicitly controlling the viewpoint; this can be considered a limited form of first-person view. The second is by providing the VR participant with explicit control over a virtual body – an avatar.

A typical BCI navigation experiment follows three steps: (i) training, (ii) cue-based BCI, and (iii) a free-choice navigation task. The training stage is used to establish a first model of the user’s brain activity: the user is given a series of discrete instructions, such as left, right, and forward commands, and no feedback is provided. Cue-based BCI is typically similar, but since a model is already available, feedback is provided after each trigger about what the system “thinks” that the subject is “thinking.” Typically, several sessions of cue-based BCI take place for further training of both the user and the classifier model. Eventually, the goal is to let the users perform a task with free choice, and the subject performs a navigation task. Here, we distinguish between real and fake free choice; in BCI we often prefer fake free choice – we instruct the user to perform specific actions throughout the session – in order to evaluate the BCI performance.

EEG-based BCI suffers from several limitations and constraints as a user input device. Although the details vary among the different BCI paradigms, in general: (i) it is often not 100 % accurate, (ii) it has a long delay, (iii) it has a low information rate, (iv) it requires extensive training, (v) some users cannot perform BCI despite training, (vi) it is difficult to recognize the non-control state, and (vii) it is often synchronous, i.e., the initiation and timing of actions are driven by the software.

Most BCI-VR studies to date used BCI for navigation. The first ever BCI navigation experiment tested whether BCI could be used in a flight simulator (Nelson et al. 1997). Subjects were trained to control a plane along a single axis in a wide field-of-view dome display, using a combination of EEG and electrical signals from the muscles – the electromyogram (EMG).

In the years 2004–2006, I was fortunate to take part in a set of BCI navigation studies in immersive VR (Friedman et al. 2007a; Leeb et al. 2006; Pfurtscheller et al. 2006). We integrated the Graz BCI, based on motor imagery, with the cave automatic virtual environment (CAVE)-like system (Cruz-Neira et al. 1992) at UCL, London. We explored several scenarios. For example, one study included a social scenario whereby the subject sat in a virtual bar room, various virtual characters spoke to the subject, and he or she had to rotate left or right to face the speaking character. Rotation was achieved by left- and right-hand imagery, which rotated the virtual bar around the subject. The reason we eventually focused on a navigation task is that it seemed to provide the best motivation – subjects were competitive and wanted to reach further down the virtual street each time.

Three subjects, already trained with the Graz BCI, performed BCI tasks with three different setups: (i) abstract feedback, (ii) a head-mounted display (HMD), and (iii) the CAVE-like system, over a duration of 5 months. In order to assess the impact of the interface on BCI performance, the subjects all went through the order: abstract feedback, HMD, CAVE, HMD, abstract feedback. In order to be able to determine BCI performance, the navigation experiment was trigger based (this is what we referred to as “fake free choice”): the subjects received one of two cues, “walk” or “stop,” and had to respond with feet or right-hand imagery, respectively. If the cue was “walk” and they correctly activated feet imagery, they moved forward; if they instead activated hand imagery, they stayed in place. If the cue was “stop” and they correctly activated hand imagery, they stayed in place, and if they incorrectly activated feet imagery, they moved backward. Thus, the distance covered in the virtual street served as a measure of BCI performance (https://www.youtube.com/watch?v=QjAwmSnHC1Q). This study did not find any consistent performance trend related to the type of interface (abstract, HMD, or CAVE), but the event-related synchronization (ERS) was most pronounced in the CAVE (Pfurtscheller et al. 2006).
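These cue-response contingencies fit in a few lines of code. The sketch below is our reading of the protocol just described, assuming a per-trial classifier output of “feet” or “hand”; the names and step size are illustrative.

```python
def update_position(position, cue, imagery, step=1.0):
    """Cue-response contingencies of the virtual-street task; the
    distance covered serves as the BCI performance measure."""
    if cue == "walk":
        # Correct feet imagery moves the subject forward; hand imagery
        # (an error here) leaves them in place.
        return position + step if imagery == "feet" else position
    else:  # cue == "stop"
        # Correct hand imagery keeps them in place; feet imagery
        # (an error here) moves them backward.
        return position if imagery == "hand" else position - step

position = 0.0
for cue, imagery in [("walk", "feet"), ("stop", "hand"), ("walk", "hand")]:
    position = update_position(position, cue, imagery)
print(position)  # 1.0: correct "walk", correct "stop", one missed "walk"
```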

Self-paced, asynchronous BCI is more difficult, since the system needs to recognize the non-control (NC) state. Leeb et al. first attempted experimenter-cued asynchronous BCI, i.e., the subject was cued when to rest (move into the NC state) and when to move (Leeb et al. 2007c). Five participants navigated in a highly immersive setup in a model of the Austrian National Library, using binary classification: the motor imagery class that was most accurate in training – left hand, right hand, or feet – was compared against NC, or no activation. The results indicate a very low false-positive rate of 1.8–7.1 %, but the true-positive rate was also low: 14.3–50 %. The authors suggest that the main challenge in this specific study was that sustaining imagery for long durations is very difficult for subjects.

Self-paced BCI navigation based on motor imagery was demonstrated for controlling a virtual apartment (Leeb et al. 2007b). Although the study was successful, we also detail its limitations, in order to illustrate the general limitations of BCI referred to above. After training, subjects performed free-choice binary navigation (left hand vs. right hand). Walking was along predefined trajectories: subjects had to reach specific targets, but the left/right decisions were made freely. Motor imagery recognition was based on offline processing of a training session, using the interval between 1.5 s and 4.5 s after the trigger. Separating motor imagery from the NC state in real time was done as follows: classification took place at the sample rate, 250 Hz, and only a unanimous classification over a period of 2 s resulted in an action. This study allowed estimating the delay required to classify motor imagery – between 2.06 s and 20.54 s, with a mean of 2.88 s and a standard deviation (SD) of 0.52 s. The delay was slightly shorter than in cue-based BCI – 3.14 s. Performance in VR was better than cue-based BCI with abstract feedback, and there were no significant differences in BCI performance between a desktop-based virtual environment and an immersive virtual environment (a “power wall” setup). Despite extensive training, two out of nine subjects were not able to perform the task, and for the rest, the mean error was between 7 % and 33 %.
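The decision rule – per-sample classification at 250 Hz, with an action only on unanimous agreement over a 2 s window – can be sketched as follows. The window length and rate are from the study; the “NC” label and the buffer reset after firing are our assumptions.

```python
from collections import deque

class UnanimousTrigger:
    """Fire an action only when every per-sample classification in the
    last window_s seconds agrees on the same (non-NC) class."""
    def __init__(self, fs=250, window_s=2.0):
        self.buffer = deque(maxlen=int(fs * window_s))

    def push(self, label):
        self.buffer.append(label)
        full = len(self.buffer) == self.buffer.maxlen
        if full and len(set(self.buffer)) == 1 and label != "NC":
            self.buffer.clear()     # assumed: avoid immediate re-triggering
            return label            # unanimous over the window: act
        return None                 # otherwise keep waiting
```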

In Leeb et al. (2007a), we showed that a tetraplegic patient could also navigate immersive VR, in the UCL CAVE-like system, in a self-paced study. The subject was trained over 4 months with the Graz BCI until he reached high performance with one class – inducing 17 Hz activity by imagining foot movement. Classification was achieved with a simple threshold on the bandpower of a single EEG channel near Cz, which determined “go” or NC. Since the subject’s control was very good, there was no dwell time (a minimum time over threshold required to activate motion) or refractory period (a minimum time between two activations). The virtual environment involved moving along a straight line and meeting virtual female characters on the way (https://www.youtube.com/watch?v=cu7ouYww1RA). The subject performed 10 runs with 15 avatars each and was able to stop in front of 90 % of the avatars. The average duration of the motor imagery periods was 1.58 ± 1.07 s, the maximum 5.24 s, and the minimum 1.44 s.
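The essence of such a one-channel classifier fits in a few lines. This is a sketch under stated assumptions: the band edges around 17 Hz and the threshold value are illustrative, and Welch’s method stands in for whatever bandpower estimator was actually used.

```python
import numpy as np
from scipy.signal import welch

def go_or_nc(channel, fs=250.0, band=(16.0, 18.0), threshold=2.0):
    """One-channel go/no-control decision: bandpower around 17 Hz on an
    electrode near Cz, compared against a subject-specific threshold."""
    freqs, psd = welch(channel, fs=fs, nperseg=int(fs))
    bandpower = psd[(freqs >= band[0]) & (freqs <= band[1])].sum()
    return "go" if bandpower > threshold else "NC"
```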

In a post-experimental interview, the subject indicated that the VR experience was significantly different from his previous BCI training: “It has never happened before, in the sense of success and interaction. I thought that I was on the street and I had the chance to walk up to the people. I just imagined the movement and walked up to them. However, I had the sensation that they were just speaking but not talking to me…” He said that he had the feeling of being in that street and forgot that he was in the lab and that people were around him. “Of course the image on the CAVE wall didn’t look like you or me, but it still felt as if I was moving in a real street, not realistic, but real. I checked the people (avatars). We had 14 ladies and 1 man” (actually, there were 15 female avatars).

Scherer et al. demonstrated a self-paced four-class motor imagery BCI for navigating a virtual environment (Scherer et al. 2008). They combined two classifiers: one “typical,” separating among left-hand, right-hand, and feet imagery, and another to detect motor imagery-related activity in the ongoing EEG. They selected the top three of eight subjects who performed training; after three training sessions, these subjects were able to perform cue-based two-class BCI with accuracies of 71 %, 83 %, and 86 %. The second classifier used two thresholds – one for switching from intentional control (IC) to non-control (NC) and another for switching from NC to IC. The thresholds were applied to the LDA classifier’s output vectors. The task was to navigate a virtual environment and reach three targets, including obstacle avoidance. The second classifier, separating NC and IC, resulted in performance of 80 %, 75 %, and 60 %. The mean true-positive (TP) rates for the 8 s action period were 25.1 % or 28.4 %. Adapting the thresholds can yield a higher TP rate, but at the cost of more false-positives (FPs). Again, we see that sustaining motor imagery for long durations is difficult for subjects.
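The two-threshold scheme is essentially a hysteresis gate on the classifier output. Below is a minimal sketch; the threshold values are illustrative, and a scalar LDA output is assumed.

```python
class HysteresisGate:
    """Switch between non-control (NC) and intentional control (IC)
    using two different thresholds on the LDA output, so that small
    fluctuations around a single boundary do not toggle the state."""
    def __init__(self, enter_ic=1.0, exit_ic=0.4):
        assert enter_ic > exit_ic   # the gap between them is the hysteresis
        self.enter_ic, self.exit_ic = enter_ic, exit_ic
        self.state = "NC"

    def update(self, lda_output):
        if self.state == "NC" and lda_output > self.enter_ic:
            self.state = "IC"
        elif self.state == "IC" and lda_output < self.exit_ic:
            self.state = "NC"
        return self.state
```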

Given the limitations of motor imagery for BCI, Lotte et al. suggested an improvement in the control technique (Lotte et al. 2010): the navigation commands were sorted into a binary tree, which the subjects traversed using self-paced motor imagery – left and right to select from the tree and feet for “undo.” One branch of the tree allowed selection of points of interest, which were automatically generated based on the subject’s location in the VE. Using this interface, users were able to navigate a large environment and were twice as fast as with low-level, “traditional” BCI.
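A sketch of this control principle, assuming three imagery classes; the tree contents are invented for illustration (in the actual system, points of interest were generated automatically from the user’s location).

```python
class CommandTree:
    """Traverse a binary tree of commands with three imagery classes:
    "left"/"right" descend the tree, and "feet" undoes the last step."""
    def __init__(self, root):
        self.root, self.path = root, []

    @property
    def node(self):
        n = self.root
        for step in self.path:
            n = n[step]
        return n

    def select(self, imagery):
        if imagery == "feet":             # undo the last selection
            if self.path:
                self.path.pop()
            return None
        child = self.node[imagery]        # "left" or "right"
        if isinstance(child, str):        # reached a leaf: issue command
            self.path.clear()
            return child
        self.path.append(imagery)
        return None

# Toy tree: inner nodes are dicts, leaves are navigation commands.
tree = {"left": {"left": "turn left", "right": "turn right"},
        "right": {"left": "go to point of interest", "right": "stop"}}
ct = CommandTree(tree)
ct.select("right")
print(ct.select("left"))                  # -> "go to point of interest"
```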

Most BCI-VR navigation studies aim at improving navigation performance. Only a few studies investigate scientific questions around this fascinating setup. In one such example, we compared free choice with trigger-based BCI in the CAVE (Friedman et al. 2010). Ten subjects were split into two conditions: both used left-hand and right-hand imagery to navigate in VR, but one group was instructed at each point in time what “to think” and the other was not. The subjects in the control condition, which was cue-based, performed significantly better. Post-experimental interviews may have revealed the reason – the subjects were used to being conditioned by the trigger-based training. This highlights the fact that BCI training under strict conditions, while necessary to achieve a good classifier model, might result in mistraining with respect to the target task, which is typically untriggered.

Larrue et al. compared the effects of VR and BCI on spatial learning (Larrue et al. 2012). Twenty subjects navigated a real city, twenty navigated a VR model of the city using a treadmill with rotation, and eight navigated the same model using BCI. Surprisingly, spatial learning was similar in all conditions. More studies of this type are needed if we want to understand how BCI interacts with cognitive tasks; for example, one limitation of this study is that the BCI condition required much more time than the other conditions.

Controlling a Virtual Avatar

VR navigation is equivalent to controlling the virtual camera, which corresponds to the trajectory of your viewpoint when you walk or drive in the physical world. In the physical world, however, you also have a body. In video games, controlling the camera directly is often referred to as “first-person view,” but this is misleading. If you look at yourself now, you will (hopefully) not only see the world around you but also see a body (albeit without a head, unless you are looking in a mirror). The sensation of our own body is so natural that we often forget it, but body ownership has been shown to be highly important for the illusions induced by VR (Maselli and Slater 2013). In this section we focus on studies whereby the visual feedback for the BCI involves a virtual body. Such an experience can be regarded as a radical form of re-embodiment; it is as if the system disconnects your brain from your original body and reconnects it to control a virtual body.

Lalor et al. (2005) demonstrated SSVEP control of a virtual character in a simple video game: the subjects had to keep a tightrope-walking character balanced using two checkerboard SSVEP targets. Whenever the walker lost balance, a 3 s animation was played, and the subject had to attend to the correct checkerboard to shift the walker back to the other side. Thus, the game consisted of multiple mini-trials of controlling two SSVEP targets, with a video game context instead of abstract feedback.

Lalor et al.’s study was a first step, but it did not attempt to provide the subjects with a sense of body ownership, and it was based on an arbitrary mapping: gazing at a checkerboard to shift the balance of the character. We performed a study aimed at testing ownership of a virtual body using motor imagery BCI (Friedman et al. 2007b, 2010). Since this study took place in a CAVE-like system, we opted for third-person embodiment: the subjects sat on a chair in the middle of the CAVE room and saw a gender-matched avatar standing in front of them, with its back toward them. In one condition the subjects used feet imagery to make the avatar walk forward and right-hand imagery to make the avatar wave its arm; in the other condition, the control was reversed: hand imagery caused walking and feet imagery caused arm waving. After several training sessions with abstract feedback, three subjects performed the task in eight sessions – four normal and four reversed, in interleaved order. We expected the more intuitive mapping to result in better BCI performance, but the results were not conclusive – one of the subjects did even better in the reversed condition; more studies, with a larger number of subjects, are required to establish the effect of intuitive vs. nonintuitive mapping between imagery and body motion. During the experiment, we deliberately avoided setting any expectations in the subjects regarding body ownership – e.g., in our instructions, we referred to “feet” rather than to “the avatar’s feet” or “your avatar’s feet.” Anecdotally, we witnessed that one of the subjects, as the experiment progressed, started referring to her avatar as “I” instead of “she.”

A more systematic experiment, intended to induce a virtual hand ownership illusion with BCI, was carried out by Perez-Marcos et al. (Slater et al. 2009). In the rubber hand illusion (Botvinick and Cohen 1998), tactile stimulation of a person’s hidden real hand in synchrony with touching a substitute rubber hand placed in a plausible position results in an illusion of ownership of the rubber hand. This illusion was reconstructed in virtual reality (Slater et al. 2008), and even a full-body illusion was achieved (Ehrsson 2007; Marcos et al. 2009). In the BCI version of this setup, 16 participants went through left-hand vs. right-hand imagery BCI training without receiving any feedback. In the VR stage, subjects had their real arm out of view in a hollow box while wearing stereo goggles in front of a “power wall.” The subjects saw a virtual arm and used left-hand imagery to open its fingers and right-foot imagery to close the fingers into a fist. Eight subjects experienced a condition whereby motor imagery was correlated with the virtual hand movement, and eight went through a control condition, in which the virtual hand motion was uncorrelated with the motor imagery. The strength of the virtual arm ownership illusion was estimated from questionnaires, electromyogram (EMG) activity, and proprioceptive drift, and the conclusion was that BCI motor imagery was sufficient to generate a virtual arm illusion; this is instead of the “classic” method for inducing the illusion, which is based on synchronous stimulation of the real and virtual arms.

Evans et al. showed that reduced BCI accuracy, which degrades the sensory feedback, decreases the reported sense of ownership of the virtual body (Evans et al. 2015). Their results also suggest that bodily and BCI actions rely on common neural mechanisms of sensorimotor integration for agency judgments, but that visual feedback dominates the sense of agency, even when it is erroneous.

The combination of VR, BCI, and body ownership is a promising avenue toward stroke rehabilitation. While BCI for rehabilitation is an active area of research (Huggins and Wolpaw 2014), we are aware of only one study attempting to combine these necessary ingredients (Bermúdez et al. 2013). The authors describe a non-immersive, desktop-based setup, which includes a first-person view with only virtual arms visible. They compared several conditions: passive observation of virtual hand movement, motor activity, motor imagery, and simultaneous motor activity and imagery. The BCI phase included three conditions: left arm stretching, right arm stretching, and none. Unfortunately, the subjects were asked to imagine the avatar moving its hands, rather than imagine moving their own hands, which rules out virtual body ownership. In addition, BCI performance results are not reported. We support the authors’ assumption that the combination of motor imagery and movement is likely to recruit more task-related brain networks than the other conditions, making such a setup promising for rehabilitation.

More recently, we have performed several studies using a BCI based on functional magnetic resonance imaging (fMRI) to control avatars. FMRI is expensive, is much less accessible than EEG, and suffers from an inherent delay and low temporal resolution, since it is based on blood oxygen levels rather than directly on electrical brain activity. Nevertheless, fMRI, unlike EEG, has high spatial resolution: in our typical study using a 3 T fMRI scanner, we perform a whole-brain scan every 2 s, and each scan includes approximately 30,000 informative voxels. Our studies aim to show that despite its sluggish signal, fMRI can be used for BCI. We suggest that this method could be extremely useful in BCI for paralyzed patients; due to the limitations of noninvasive BCIs (based on EEG or functional near-infrared spectroscopy – fNIRS), there is a growing effort to opt for invasive BCIs (Hochberg et al. 2012). Prior to surgery, fMRI-BCI could be used for identifying new mental strategies for BCI, localizing brain areas for implants, and training subjects.

In our studies, subjects controlled a virtual body from a third-person perspective (Cohen et al. 2014b) (https://www.youtube.com/watch?v=rHF7gYD3wI8), as well as a robot from a first-person perspective (Cohen et al. 2012) (https://www.youtube.com/watch?v=pFzfHnzjdo4). In our experiments the subject, lying down in the fMRI scanner, sees an image projected on a screen (e.g., Fig. 1). We do not use stereo projection, but since the screen covers most of the field of view, the experience is visually immersive. Our subjects were able to perform various navigation tasks, including walking a very long footpath in the jungle (video: https://www.youtube.com/watch?v=PeujbA6p3mU). Our first version was based on the experimenter locating regions of interest (ROIs) corresponding to left-hand, right-hand, and feet imagery or movement, and on a simple threshold-based classification scheme (Cohen et al. 2014b). Recently, we completed an improved version of fMRI-based BCI, based on machine learning, using information gain (Quinlan 1986) for feature (voxel) selection and a support vector machine (SVM) classifier (Cohen et al. 2014a). This allowed us to test more complex navigation tasks and to shorten the delay; we showed that subjects can control a four-class BCI (left hand, right hand, feet, or the NC state) with a 2 s delay with very high accuracy.
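The pipeline can be sketched with scikit-learn, whose mutual information scorer is a close relative of the information-gain criterion cited above; the linear SVM, the value of k, and the synthetic data below are illustrative stand-ins, not the study’s actual parameters.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Toy stand-in for fMRI data: rows are scans (one per 2 s TR),
# columns are voxels; y holds the cued class for each scan.
rng = np.random.default_rng(0)
X = rng.standard_normal((150, 2000))   # 150 cues x 2000 candidate voxels
y = np.repeat([0, 1, 2], 50)           # left hand, right hand, feet
X[:, :20] += 0.8 * y[:, None]          # make the first 20 voxels informative

clf = Pipeline([
    # Rank voxels by mutual information and keep the top k.
    ("select", SelectKBest(mutual_info_classif, k=256)),
    ("svm", SVC(kernel="linear")),
])
print(cross_val_score(clf, X, y, cv=10).mean())   # tenfold cross-validation
```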

Fig. 1

The subject lying down in the fMRI scanner (top) sees an avatar lying down in a virtual fMRI scanner (bottom) and controls it using motor movement or imagery

In addition to proving that fMRI-BCI is possible, these studies provided new insights into motor imagery-based BCI. A few anecdotal results came from repeated administration of body ownership questionnaires to the subjects after each experimental session. In one study, in which the subjects had to navigate toward a balloon (Fig. 2a) (https://www.youtube.com/watch?v=l1yMd_UFp5s), the questionnaires revealed that the sense of body ownership over the avatar was significantly higher when using motor imagery than when using motor execution for BCI. In another study, in which the subjects had to navigate along a footpath (Fig. 2b), subjects seemed to be significantly more confused about their body ownership when the delay was reduced to 2 s; this difference was nearly significant for the statement, “I was aware of a contradiction between my virtual and real body,” and significant for the statement, “It felt like I had more than one body.”

Fig. 2

Snapshots from the fMRI navigation studies: the subjects had to navigate toward a balloon (a) or along a trail (b) (Journal of Neural Engineering (IOP Science))

Owing to its spatial resolution, superior to that of EEG, fMRI can highlight the differences between motor execution and motor imagery. Figure 3 compares voxels captured by information gain against voxels captured by a general linear model (GLM) analysis, which is typically used in fMRI studies to obtain brain activation patterns. Since each method captures voxels differently, with different thresholds, the figures cannot be compared directly; however, inspection suggests pre-motor cortex activation during motor imagery, whereas motor execution was mostly based on the specific body representations in primary motor cortex. In addition, the differential activations were much stronger for motor execution than for motor imagery. Figure 4 compares classification accuracy over time for motor execution and imagery: with imagery, classification accuracy drops faster than with motor execution. The results are based on tenfold cross-validation of 150 cues, 50 from each class: left hand, right hand, and feet.

Fig. 3

A subset of corresponding slices from S1. The left column shows the GLM contrast (right, left, forward) > baseline (thresholds: t = 4.6 for MM and t = 3.2 for MI), and the right column shows the 1024 voxels with highest information gain selected by our algorithm. The top row shows imagery and the bottom row shows motor movement

Fig. 4

A comparison of (a) motor execution (MM) and (b) motor imagery (MI) classification accuracy across six (MM) and three (MI) subjects, between machine learning and ROI-based classification. Each repetition time (TR) is 2 s. Error bars indicate the 95 % confidence interval. The machine learning results were obtained by using either all voxels with information gain above 0 or the smallest number of voxels that permit perfect classification of all training examples

Taken together, these findings suggest that people find it hard to activate motor imagery and, especially, to keep it active for long durations. Our evidence from fMRI-based BCI thus corresponds to similar evidence obtained with EEG-based BCI. This indicates that these challenges in activating motor imagery are most likely not the result of the limitations of the specific recorded signals but an inherent difficulty of motor imagery. In another study using real-time fMRI, we found significant differences in the degree to which different brain areas lend themselves to internal control (Harmelech et al. 2015); this was demonstrated in the context of neurofeedback but should apply equally to BCI. Using fMRI, we may be able to extend the repertoire of BCI interaction paradigms and find the paradigms that are easiest for subjects.

Controlling the World Directly

In the previous sections, we discussed navigation and virtual re-embodiment – using BCI to control a virtual body or its position. These interaction paradigms are based on how we interact with the physical world. But in VR we can go further – why not control the world directly?

As an example of a practical approach, consider using a P300 matrix BCI to control a room in VR (Edlinger et al. 2009). This simulates a scenario whereby a paralyzed patient controls a smart home. Such a setup allows people to rapidly select a specific command out of many different choices. The study suggests that more than 80 % of the healthy population could use such a BCI within only 5 min of training. In a further study, this approach was improved using a hybrid paradigm: SSVEP was used to toggle the P300 BCI on and off, in order to avoid false-positive classifications (Edlinger et al. 2011).

Using this approach, the P300 matrix serves as a BCI remote control. While practical, this goes against the philosophy of VR: even the best BCI requires several seconds of attention to the P300 matrix for each selection, and the matrix lies outside the VR display. This greatly reduces the sense of being present in the VR (Heeter 1992; Lombard and Ditton 1997; Sanchez-Vives and Slater 2005; Slater 1993; Witmer and Singer 1998); indeed, the same authors noted that subjects reported a very low sense of presence in post-experiment questionnaires. In a follow-up study (Groenegress et al. 2010), post-experiment questionnaires revealed that subjects reported a significantly higher sense of presence with a gaze-based interface than with the P300 interface, when controlling the same virtual apartment in the same VR setup.

In-place Control

Given the limitations arising from having the P300 or SSVEP targets outside the VR, several attempts were made to embed the target visual stimuli more naturally into the VR scene. Imagine what it would be like if you could just focus on an object around you and thereby activate it. In fact, one of the first ever BCI-VR studies used this approach by turning the traffic lights in a driving simulation into P300 targets (Bayliss and Ballard 2000; Bayliss 2003). The setup included a modified go-cart and an HMD. The red stoplight served as the P300 oddball stimulus: most lights were yellow, and the subject was instructed to ignore the green and yellow lights and to detect the red ones, which were less frequent.

Donnerer and Steed (Donnerer and Steed 2010) embedded P300 in a highly immersive CAVE-like system and compared three paradigms: (i) spheres arranged in an array, (ii) different objects cluttered around the virtual room, and (iii) tiles – different areas of the virtual world that can be selected, instead of objects. Each sphere, object, or tile flashed separately, enabling its selection via the subject’s P300 response after eight flashes (16 in the training phase). The setup was successful, but the results do not show very high accuracy. In addition, the interaction is relatively slow, since the stimuli must be flashed sequentially, as opposed to SSVEP.
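The selection logic shared by such P300 paradigms can be sketched as follows: average the post-flash EEG epochs of each object and pick the object with the strongest response in the P300 latency window. This is a deliberately simplified stand-in – the cited systems use trained classifiers rather than a raw amplitude score – and the latency window is an illustrative assumption.

```python
import numpy as np

def p300_select(epochs, fs=250.0, window=(0.25, 0.45)):
    """epochs[obj] holds the post-flash EEG epochs (1-D arrays) recorded
    after each flash of one object. The attended object should show the
    largest average response in the P300 latency window."""
    lo, hi = int(window[0] * fs), int(window[1] * fs)
    scores = {}
    for obj, trials in epochs.items():
        erp = np.mean(trials, axis=0)      # average over the 8 flashes
        scores[obj] = erp[lo:hi].mean()    # mean amplitude ~250-450 ms
    return max(scores, key=scores.get)
```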

Faller et al. developed such a system using SSVEP, for controlling VR and even augmented reality (Faller and Leeb 2010; Faller et al. 2010). They achieved high classification accuracy using just two occipital electrodes – O1 and O2. They demonstrated three applications, but in all of them, the BCI was used for navigation rather than for controlling the world. They report an average of 8.5, 7.1, and 6.5 true-positive (TP) events per minute.

In a similar study, Legeny et al. also demonstrated BCI navigation with embedded SSVEP targets (Legény et al. 2011). They attempted a more natural embedding, which they call mimesis: rather than controlling buttons or arrows, the SSVEP stimuli were embedded in the wings of butterflies. Three butterflies hovered around the middle of the screen and were used for navigating forward, left, or right. The wings changed color for SSVEP stimulation, and the butterflies also flapped their wings; the latter did not interfere with SSVEP classification. Feedback about the BCI’s confidence toward one of the classes (the distance from the separating hyperplane used by the LDA classifier) was also provided through the appearance of the butterflies’ antennae. Since the BCI was self-paced, such feedback is useful, especially when none of the classes is activated. The study was carried out in a 2 × 2 design: overlay/mimesis and feedback/no feedback. The mimesis interaction increased subjective preference and the sense of presence but reduced performance in terms of speed, as compared with a more “standard” SSVEP overlay interface; feedback had no effect on the sense of presence.

The studies by Faller et al. and Legeny et al. used in-place SSVEP, but only for navigation. In my lab we have also developed in-place SSVEP, but our interaction approach is different – we are interested in using BCI to activate arbitrary objects in the virtual world, as a form of virtual psychokinesis. We have developed a generic system that makes it easy to turn any object in a 3D scene in the Unity game engine into an SSVEP target. A Unity script attached to the object makes it flicker at a given frequency. Another script connects to the BCI system over the user datagram protocol (UDP), assigns different frequencies to different objects, and activates objects in real time based on classifier input. We have shown that this software implementation of SSVEP allows for very high classification rates and robust BCI control.
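The actual implementation consists of Unity (C#) scripts; the sketch below illustrates the same architecture from the classifier side in Python. The message format, port, and object names are hypothetical – the real protocol is whatever the Unity-side script defines.

```python
import json
import socket

# Hypothetical endpoint and message format for the Unity-side listener.
UNITY_ADDR = ("127.0.0.1", 5005)
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

def assign_frequencies(object_names, freqs=(10.0, 12.0, 15.0)):
    """Tell the scene which flicker frequency each SSVEP target uses."""
    mapping = dict(zip(object_names, freqs))
    msg = {"cmd": "assign", "map": mapping}
    sock.sendto(json.dumps(msg).encode(), UNITY_ADDR)
    return mapping

def activate(object_name):
    """Trigger the object the classifier says the user is attending to."""
    msg = {"cmd": "activate", "obj": object_name}
    sock.sendto(json.dumps(msg).encode(), UNITY_ADDR)

targets = assign_frequencies(["lamp", "door", "radio"])
activate("lamp")   # e.g., after the SSVEP classifier picks the 10 Hz target
```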

Given the novelty of this interface, we decided to let participants undergo a “psychokinesis”-like experience without telling them that they had such “powers.” We conducted an experiment in which subjects controlled a BCI without being aware that their brain waves were responsible for events in the scenario. Ten subjects went through a stage of model training in SSVEP-based BCI, followed by three trials of an immersive experience in which stars moved in response to SSVEP classification. Only then were the subjects told that they were using a BCI; this was followed by an additional trial of immersive free-choice BCI and a final validation stage. Three of the ten subjects realized that they controlled the interface; these subjects had better accuracy than the rest and reported a higher sense of agency in a post-study questionnaire (Giron and Friedman 2014).

Furthermore, our study showed that subjects can implicitly learn to use an SSVEP-based BCI (Giron et al. 2014). The SSVEP stimuli were presented in a pseudorandom order in an immersive star field virtual environment, and the participants’ attention to the stimuli made stars move within the immersive space (Fig. 5). Participants were asked to view four short clips of the scene and try to explain why the stars were moving, without being told that they were controlling a BCI. Two groups were tested: one that interacted implicitly with the interface and a control group for which the interaction was a sham (i.e., the interface was activated independently of the participants’ attention, with the same response frequency). Following the exposure to the immersive scene, the participants’ BCI accuracy was tested, and the experimental group showed higher accuracy. This finding may indicate that implicit SSVEP BCI interaction is sufficient to induce a learning effect for the skill of operating a BCI.

Fig. 5

The star field experience, responding to SSVEP-based BCI unbeknownst to the subjects (From Giron and Friedman 2014)

Hybrid Control

Due to its limitations, a promising direction for BCI is to serve as an additional input channel complementing other interaction devices, rather than replacing them. This is true for able-bodied users – BCI cannot compete with the keyboard, mouse, or similar devices in terms of information rate and accuracy. A similar case can be made for paralyzed patients: BCI does not need to compete with other assistive technologies but can be part of a basket of solutions, such that patients leverage whatever muscle control works best for them, in parallel to using brain waves as an input signal.

Leeb et al. demonstrated a hybrid BCI for skiing in a CAVE-like system: steering with a game controller and jumping (to collect virtual fish targets) with a feet motor imagery BCI (Leeb et al. 2013). The joystick controller did not deteriorate BCI performance. The BCI was continuous, based on the classifier output crossing a threshold for 0.5–1.5 s. The threshold was defined for each subject as the mean plus one standard deviation of the classifier output during the time of the fixation cross, and the dwell time was selected as half of the time over this threshold during the imagery period. The detected events were transferred into control commands for the feedback. After every event, a refractory period of 4 s was applied, during which event detection was disabled. The study compared using a push button (94–97 % success) with BCI (45–48 % success).
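A sketch of this event-detection scheme as we read it from the description above: a subject-specific threshold calibrated from the fixation period, a dwell requirement, and a 4 s refractory period. The continuous classifier-output arrays and the sampling rate are assumptions.

```python
import numpy as np

def calibrate(fixation_out, imagery_out, fs=250.0):
    """Per-subject parameters: threshold = mean + 1 SD of the classifier
    output during fixation; dwell = half the time the imagery-period
    output spends above that threshold."""
    threshold = fixation_out.mean() + fixation_out.std()
    time_over = np.sum(imagery_out > threshold) / fs
    return threshold, time_over / 2.0

def detect_events(output, threshold, dwell_s, fs=250.0, refractory_s=4.0):
    """Emit an event (e.g., a jump) when the output stays above threshold
    for the dwell time; then disable detection for the refractory period."""
    events, over_since, blocked_until = [], None, -1.0
    for i, v in enumerate(output):
        t = i / fs
        if t < blocked_until:                  # refractory: ignore output
            over_since = None
            continue
        if v > threshold:
            over_since = t if over_since is None else over_since
            if t - over_since >= dwell_s:      # dwell satisfied: fire
                events.append(t)
                blocked_until = t + refractory_s
                over_since = None
        else:
            over_since = None
    return events
```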

Another form of hybrid BCI involves the combination of two or more BCI paradigms. For example, Su et al. used two-class motor imagery for navigating a virtual environment and a five-target P300 for controlling a device (Su et al. 2011). The control was toggled between P300 and motor imagery rather than being simultaneous, and the toggle was activated automatically based on the subject’s location inside the virtual environment: the subject used motor imagery to navigate a virtual apartment and the P300 to control a virtual TV set. Subjects reported that hybrid control was more difficult than standard BCI but showed no drop in performance.

Beyond Control

So far, we have discussed BCI for direct control of VR, but BCI technologies can also be used for other closed-loop interaction paradigms. For example, aspects of the user’s cognitive and emotional state can be computed online, and the application can be adapted accordingly. Applications based on automatic recognition of emotions have been studied extensively in the field of affective computing (Picard 1997). A more recent term is passive BCI, referring to applications that respond to online cognitive monitoring (Zander and Kothe 2011). Despite the great promise of this field, there is very little work in this direction, and almost none of it involves VR.

One question is how to extract emotional and cognitive states from brain signals; this is a major challenge that is still open (Berka et al. 2004; Liu et al. 2011). The other challenge is how to adapt the application to the feedback; in the context of VR, this opens up opportunities for new types of experiences. In one such creative example, affective mood extracted from online EEG was coupled to the avatar in the massively multiplayer game World of Warcraft (Plass-Oude Bos et al. 2010). Parietal alpha-band power was mapped to shape-shifting between animal forms in the fantasy world: e.g., an increase in parietal alpha is related to relaxed readiness and was thus mapped in the game world to transforming into an elf. The authors do not validate or evaluate the brain activity or the accuracy of the BCI but provide some useful lessons regarding interaction – for example, they use hysteresis and some dwell time in order to avoid shape-shifting too frequently.
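A sketch of this mapping with the two interaction safeguards the authors mention – hysteresis (two thresholds) and a dwell time before committing to a new form. The threshold values, the relative-power measure, and the second animal form (“bear”) are our illustrative assumptions.

```python
import numpy as np
from scipy.signal import welch

def parietal_alpha(eeg, fs=250.0):
    """Relative alpha (8-12 Hz) power of a parietal EEG channel."""
    freqs, psd = welch(eeg, fs=fs, nperseg=int(fs))
    alpha = psd[(freqs >= 8) & (freqs <= 12)].sum()
    return alpha / psd.sum()

class ShapeShifter:
    """Map alpha to an avatar form, with hysteresis plus a dwell time so
    the avatar does not flicker between forms."""
    def __init__(self, high=0.5, low=0.35, dwell_s=2.0):
        self.high, self.low, self.dwell_s = high, low, dwell_s
        self.form, self.pending_since = "bear", None

    def update(self, alpha, t):
        # Hysteresis: the form only changes on crossing the outer bounds.
        target = ("elf" if alpha > self.high
                  else "bear" if alpha < self.low else self.form)
        if target == self.form:
            self.pending_since = None
        elif self.pending_since is None:       # candidate change: start dwell
            self.pending_since = t
        elif t - self.pending_since >= self.dwell_s:
            self.form, self.pending_since = target, None
        return self.form
```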

Finally, Gilroy et al. suggested a new interaction technique in which empathy derived from brain signals drives interactive narrative generation (Gilroy et al. 2013). Subjects used EEG neurofeedback, based on frontal alpha asymmetry (Coan and Allen 2004; Davidson et al. 1990), to modulate empathic support of a virtual character in a medical drama, and their degree of success affected how the narrative unfolded. An fMRI analysis also showed activations in associated brain regions during the expression of support. This study demonstrates that there are still many opportunities for integrating real-time information from brain activity into virtual environments and VR. While some progress can be made with peripheral physiological signals, such as heart rate and its derivatives, electrodermal activity (EDA, the “sweat response”), or EMG (indicating muscle activity), signals from the central nervous system are expected to carry more information.

Conclusion and Future Directions

BCI still faces many challenges, but it has matured, especially over the last decade. There is now growing interest in getting BCI out of the laboratory and into real-world applications. For paralyzed patients, the goal is restoring basic communication and control abilities. For able-bodied participants, the greatest potential seems to lie in hybrid BCI and passive BCI. In all cases, VR is a natural partner for BCI.

Due to the limitations of EEG, there is an effort to exploit other brain signals. For medical applications, methods such as fMRI and the electrocorticogram (ECoG) hold much promise for moving BCI forward. For other applications, the devices need to be low cost and noninvasive. FNIRS may allow for novel BCI paradigms, instead of or in addition to EEG. Furthermore, we see potential in combining brain signals with other signals, such as those from the autonomic nervous system – heart rate and its derivatives, electrodermal activity, and respiration – as well as eye tracking. It remains to be seen whether the value of these joint signals will be greater than their sum and, if so, how this value can be translated into new interaction paradigms and applications.

The combination of VR and BCI offers radically new experiences. Since both of these fields are young, especially BCI, we have only scratched the surface, and we have barely begun to study the resulting psychological impact and user experience. Each breakthrough in BCI will allow us to provide VR participants with novel experiences.