
1 Introduction: The Challenge of Walking in VR

Walking is probably the oldest and still most common mode of transportation for humans. Walking allows for easy and intuitive locomotion, and even with eyes closed enables us to remain oriented in our immediate environment with little cognitive effort [80, 97]. This phenomenon is typically ascribed to an (at least partially) automated mental process that spatially updates our egocentric mental spatial representation so that it stays aligned with where we are with respect to our immediate surroundings. Thus, it seems to make sense that we should be able to walk through virtual environments in a similar manner, in the hope that walking will enable us to more easily remain oriented and reach our destination with little effort or cognitive load, just like in the real world. As several chapters in this book discuss in detail, however, enabling humans to use this most intuitive mode of transportation in VR poses many challenges, both from technical and perceptual points of view (see also [37] for a review). Allowing VR users to walk naturally requires them to carry the visual display with them, typically using position-tracked head-mounted displays (HMDs). Although technology is advancing, there are still major technical limitations (e.g., pixel resolution, limited field of view (FOV), and tracking/display latencies) as well as perceptual challenges including spatial misperception such as underestimation of distance [59] or motion sickness [31, 66]. Moreover, allowing for actual and unencumbered walking requires huge tracked free-space walking areas, especially if virtual environments larger than room-sized are intended.

A variety of techniques have been proposed to address these fundamental issues, including virtual walking interfaces, walking-in-place metaphors, or redirected walking. While many of these approaches are promising and discussed in detail in other chapters of this book, they include non-trivial technical challenges, and often either restrict the walking motions or possible trajectories, as is the case for redirected walking (e.g., [111], and Chap. 10 of this book), change the biomechanics of walking fundamentally (as is the case for walking-in-place interfaces, see Chap. 11 of this book) and/or require considerable technical, financial, and safety efforts to implement (as is the case for larger or omni-directional treadmills, where additional safety measures like harnesses are needed). Many of these issues are actively researched, and we are hopeful that most of them might be solved eventually.

Treadmills are probably the most promising and most widely used and researched approaches to allow for walking in VEs, as they are commercially available for relatively affordable prices and allow for fairly natural biomechanical cues from walking, especially when augmented with force-feedback harnesses for linear or omni-directional locomotion ([37], and Chap. 6 of this book). Somewhat counter-intuitively, though, despite allowing for fairly natural walking motions, even the most advanced treadmills do not seem to provide the user with an actual compelling sensation of self-motion unless accompanied by wide-FOV visual motion cues. That is, while actual walking is naturally accompanied by an embodied sensation of self-motion through the environment, even in the absence of visual or auditory cues, walking on a linear treadmill is typically not. Walking can, however, sometimes affect our visual perception: for example, Yabe and Taga [131] showed that walking on a linear treadmill can affect the perception of ambiguous visual motion, similar to motion or action capture phenomena. This “treadmill capture” effect seems to disappear, however, with extended experience of treadmill locomotion in regular treadmill runners [132].

There is little published research on the perception or illusion of self-motion (“vection”) on linear treadmills. Durgin et al. [26] observed, for example, that “during treadmill locomotion, there is rarely any illusion that one is actually moving forward” (p. 401) and continue to state that “people do not have the illusion that they are moving when running on a treadmill, nor do their inertial systems experience any net acceleration” (p. 415). Informal observation, discussions with colleagues, and pilot studies by the authors corroborate the notion that biomechanical cues from walking on linear treadmills hardly ever lead to compelling and reliable sensations of self-motion that match the walker’s biomechanical motion, even for the most advanced linear treadmills that include force-feedback harnesses.

This might, of course, be related to the lack of any net acceleration cues, as Durgin et al. pointed out [26]. Most treadmills simply do not seem to be long enough to allow for sufficient motion cueing and physical translations that would allow for sustained biomechanically-induced linear vection that would approach the intensity and compellingness of self-motion illusions induced by moving visual stimuli (for recent reviews in the context of VR, see [34, 86, 100]).

Hence, for the current chapter we will pursue an alternate approach, by focusing not on how to enable realistic walking in VR (which is covered in depth by other chapters in this book), but on how to provide a compelling and embodied sensation of self-motion through computer-mediated environments with minimal or no physical motion of the observer, with or without walking. In particular, we will review and discuss how we can utilize and maximize illusory self-motions (“vection”) that can be induced by visual, auditory, and sometimes biomechanical/somatosensory cues, and how these different cues contribute and interact, often in a synergistic manner.

Especially for visually-induced vection, there is a large body of literature that provides essential guidelines and dates back more than a century [33, 60]. Here, we will start with a brief review on visually-induced self-motion illusions, as they have received by far the most attention in research and are known to induce quite compelling vection (Sect. 2.2). After this general introduction to vection, we will review potential relations between walking and perceived self-motion and self-motion illusions (Sect. 2.3). In particular, we will discuss how walking interacts with other sensory information such as visual or auditory motion cues (see Sect. 2.4) and briefly cover further cross-modal effects (Sect. 2.5) and potential relations between vection and simulator sickness in VR (Sect. 2.6). We will discuss both perceptual factors and cognitive contributions (such as participants’ perception/knowledge of whether or not actual self-motion might be possible), and how to best utilize such factors and interactions in VR to provide a compelling and embodied sensation of self-motion through computer-simulated environments while trying to minimize overall costs and efforts (Sect. 2.7). We will continue by discussing how self-motion illusions might facilitate spatial orientation in VR (Sect. 2.8), and conclude by proposing a conceptual framework that integrates perceptual and cognitive factors and is centered on perceptual as well as behavioral effectiveness of VR simulations (Sects. 2.9 and 2.10).

2 Visually Induced Self-Motion Illusions

In this section, we will provide a brief review of the literature on self-motion illusions that is relevant for the current context. More comprehensive reviews on visually induced vection are provided by, e.g., [2, 23, 38, 39, 61, 123]. Vection with a specific focus on VR, motion simulation, and undesirable side-effects has more recently been reviewed in [34, 86, 100].

When stationary observers view a moving visual stimulus that covers a large part of the field of view (FOV), they can experience a very compelling and embodied illusion of self-motion in the direction opposite to the visual motion. Many of us have experienced this illusion in real life: For example, when we are sitting in a stationary train and watch a train pulling out from the neighboring track, we will often (erroneously) perceive that the train we are sitting in is starting to move instead of the train on the adjacent track [33]. This phenomenon of illusory self-motion has been termed “vection” and has been investigated for well over a century [33, 60, 114, 122, 127]. Vection has been shown to occur for all motion directions and along all motion axes: Linear vection can occur for forward-backward, up-down, or sideways motion [38]. Circular vection can be induced for upright rotations around the vertical (yaw) axis, and similarly for the roll axis (frontal axis along the line of sight, like in a “tumbling room”), and also around the pitch axis (an imagined line passing through the body from left to right). The latter two forms of circular vection are especially nauseating, since they include a strong conflict between visual and gravitational cues and in particular affect the perceived vertical [11].

2.1 Circular Vection

In a typical classic circular vection experiment, participants are seated inside an upright rotating drum that is painted with black and white vertical stripes (see illustration in Fig. 2.1a), a device called an optokinetic drum [16, 23]. After the optokinetic drum starts to rotate around the earth-vertical axis, the onset latency until the participant reports perceiving self-motion is measured, which typically ranges from about 2 to 20 s, depending on various stimulus and procedural parameters as discussed below.

Fig. 2.1 Top-down sketch of different circular vection conditions. a Visual vection induced by an optokinetic drum rotating around the stationary observer. b Auditory vection induced by sound sources rotating around blindfolded listeners. c Biomechanical or “apparent stepping around” vection induced by blindfolded participants stepping along a rotating floor (“circular treadmill”)

Note that vection typically does not occur instantly with the stimulus motion, and takes some time to saturate, as sketched in Fig. 2.2. The strength of the illusion can be measured by a variety of introspective measures including the onset latency and duration of the illusion, or by some indication of perceived speed, intensity, or compellingness of self-rotation, e.g., by magnitude estimation or by letting the participant press a button every time they think they have turned \(90^{\circ }\) [8]. As Riecke et al. point out, one of the challenges for utilizing self-motion illusions in VR is to reduce the vection onset latency and increase the intensity and compellingness of the illusion [94].

Fig. 2.2 Schematic depiction of typical stimulus motion and resulting vection time course
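The introspective measures mentioned above (onset latency, duration, and a perceived rotation speed derived from button presses at every perceived \(90^{\circ }\) turn) can be illustrated with a minimal sketch. The record layout, the field names, and the function `summarize_trial` are assumptions made for illustration only; they are not taken from any of the cited studies.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class VectionTrial:
    """One trial of a hypothetical circular vection experiment (times in seconds)."""
    stimulus_onset_s: float
    vection_onset_s: Optional[float] = None   # None: participant never reported vection
    vection_offset_s: Optional[float] = None  # None: end of vection was not recorded
    # One button press for every perceived 90 degree turn (assumed response method)
    press_times_s: List[float] = field(default_factory=list)


def summarize_trial(trial: VectionTrial, degrees_per_press: float = 90.0) -> dict:
    """Derive onset latency, vection duration, and mean perceived rotation speed."""
    summary = {"onset_latency_s": None, "duration_s": None, "perceived_speed_deg_s": None}
    if trial.vection_onset_s is None:
        return summary  # no vection on this trial

    summary["onset_latency_s"] = trial.vection_onset_s - trial.stimulus_onset_s
    if trial.vection_offset_s is not None:
        summary["duration_s"] = trial.vection_offset_s - trial.vection_onset_s

    presses = sorted(trial.press_times_s)
    if len(presses) >= 2 and presses[-1] > presses[0]:
        # n presses delimit (n - 1) perceived turns of `degrees_per_press` each
        turned_deg = degrees_per_press * (len(presses) - 1)
        summary["perceived_speed_deg_s"] = turned_deg / (presses[-1] - presses[0])
    return summary


# Illustrative (invented) trial: vection starts 8 s after stimulus onset
example = VectionTrial(stimulus_onset_s=0.0, vection_onset_s=8.0,
                       vection_offset_s=48.0, press_times_s=[10.0, 13.0, 16.0, 19.0])
print(summarize_trial(example))
```

Estimating speed from the intervals between presses, rather than from the time since stimulus onset, keeps the onset latency from diluting the speed estimate.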

The most frequently investigated type of vection is circular vection around the earth-vertical axis (see illustrations in Fig. 2.1). In this special situation where the observer perceives self-rotation around the earth-vertical axis, there is no interfering effect of gravity, since the body orientation always remains aligned with gravity during illusory self-rotation. Roll and pitch vection are consequently harder to induce and can lead to paradoxical sensations of continuous illusory self-rotation despite perceiving only limited overall body tilt of generally no more than \(20^\circ \) [1, 32, 133]. Complete head-over-heels tumbling sensations can, however, be induced when the conflict between rotating visual cues and gravitational cues (from otoliths and somatosensory system) is reduced, e.g., in bilateral vestibular loss patients [22] or in micro-gravity conditions [21, 134]. Alternatively, even under normal gravitational conditions, \(360^{\circ }\) head-over-heels tumbling sensations can be induced in most observers when a fully furnished naturalistic room is rotated around a stationary observer [1, 40, 43, 71].

2.2 Linear Vection

In a similar manner, linear vection can be induced by presenting optic flow patterns that simulate translational motion. The traditional method used to induce linear vection in the laboratory is to use two monitors or screens facing each other, with the participant’s head centered between the two monitors and aligned parallel to the screens, such that they cover a large part of the peripheral visual field [10, 47, 58]. Optic flow presented in this peripheral field induces strong linear vection. For example, Johansson showed that observers perceive an “elevator illusion”, i.e., upward linear vection, when downward optic flow is shown [47]. More recent studies often use a monitor or projection screen in front of the participant to show expanding or contracting optic flow fields [3, 70]. Comparing different motion directions shows vection facilitation for up-down (elevator) vection, presumably because visual motion does not suggest a change in the gravito-inertial vector as compared to front-back or left-right motion [30, 112].

In recent times, VR technology has been successfully introduced to perceptual research as a highly flexible research tool (see recent reviews by Hettinger [34] and Riecke [86]). It has been shown that both linear and circular vection can be reliably induced using modern VR technology, and the fact that this technology allows for precise experimental stimulus control under natural or close-to-natural stimulus conditions is much appreciated by researchers.

3 Self-Motion Sensation from Walking

Although walking on a linear treadmill cannot itself reliably induce vection, walking in a circular pattern on a rotating disc (“circular treadmill”, see Fig. 2.1c) can induce compelling curvilinear or circular vection [13, 14]. That is, stepping along a circular treadmill in darkness or with eyes blindfolded can induce strong sensations of self-rotation in the direction opposite to the floor motion (i.e., congruent with the walking motion), irrespective of step size and without any net body motion [13, 14]. Several names have been used to refer to this phenomenon, including “apparent stepping around” by Bles and colleagues [13, 14], “podokinetic vection” by Becker et al. [8], or “biomechanical vection” by Bruggeman et al. [18] and Riecke et al. [92]. Note that the mere act of moving one’s leg as if walking but without floor contact does not induce any vection. While the above-mentioned studies reported reliable and consistent biomechanical vection for circular treadmill walking without net body motion, Becker and colleagues observed biomechanical vection only in rare cases: only 25 % of participants occasionally reported biomechanical vection, suggesting that their procedure did not reliably induce vection [7]. As suggested by Bruggeman et al. [18], this unusually low rate of biomechanical circular vection occurrences might be related to the specific instructions used by Becker et al., in that they asked participants to “track angular self-displacement relative to the platform” (p. 461), not relative to the surrounding stationary room.

In addition to biomechanically-induced self-motion illusions, Bles and colleagues also reported nystagmus and Coriolis-like effects when participants performed active head tilts, corroborating the strength of vection that can be induced by biomechanical cues [13, 14]. Biomechanical vection from stepping-around occurs similarly in labyrinth-defective patients, although their somatosensory nystagmus was stronger [12]. While actual rotation results in self-rotation illusion after-effects in the direction opposite to the prior motion, circular vection induced by blindfolded stepping along a rotating disc results in illusory self-rotation after-effects in the same direction as the prior perceived self-motion [44].

Apart from walking on a circular treadmill, passive arm or foot movement can induce similar circular vection [15]: Participants sat stationary in complete darkness inside a slowly rotating optokinetic drum (\(10^{\circ }\)/s). When they touched the rotating surrounding wall with their extended hand such that it was passively rotated around their shoulder joint, compelling arthrokinetic circular vection in the direction opposite to the arm movement occurred. Illusory self-rotation occurred within 1–3 s and was indistinguishable from actual self-motion. Arthrokinetic vection was accompanied by arthrokinetic nystagmus and resulted in considerable after-effects. DiZio and Lackner [24] remarked that “actively pedaling the free wheeling floor while seated or turning the railing with a hand-over-hand motion makes the experience very powerful” (p. 766). We are currently investigating the feasibility of such a circular walking paradigm for rotational self-motion simulation in VR (http://iSpaceLab.com/iSpaceMecha).

4 Interaction of Walking and Other Modalities for Vection

4.1 Walking and Auditory Cues

While both biomechanical and visual cues can induce compelling vection, moving auditory cues can elicit self-motion illusions only in 1/4–3/4 of participants, and such auditory vection is much weaker, less compelling, and only occurs when participants are blindfolded (for reviews, see [95, 118]). Despite their low vection-inducing potential, however, moving auditory cues have recently been shown to significantly enhance visually induced vection [88, 118] as well as biomechanically induced circular vection [86]. In the latter study, participants were blindfolded and seated stationary above the center of a circular treadmill. Auditory circular vection was induced by binaural recordings of rotating sound fields presented via headphones (Fig. 2.1b), and biomechanical circular vection was induced by stepping along the floor disc that rotated at the same velocity (\(60\,^{\circ }\)/s) as the auditory stimulus (Fig. 2.1c). Although auditory vection by itself was weak and occurred in less than half of the trials, adding rotating sound fields significantly enhanced biomechanically-induced vection. Moreover, there were synergistic, super-additive effects when combining auditory and biomechanical vection-inducing stimuli, in that bi-modal stimulation resulted in vection intensities and perceived rotation realism that were higher than the sum of the uni-modal vection ratings. This corroborates the importance of consistent multi-modal simulation and suggests that even a fairly weak stimulus can sometimes make a significant contribution. This is also promising from an applied perspective of improving VR simulations, as sound spatialization can be of high fidelity while still being affordable and technically feasible.
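The super-additivity criterion described above (the bi-modal rating exceeds the sum of the two uni-modal ratings) can be written as a simple check. This is a minimal sketch: the function names and the ratings in the example are invented for illustration and are not data from the cited study.

```python
def superadditivity_index(unimodal_a: float, unimodal_b: float, bimodal: float) -> float:
    """Bi-modal rating minus the sum of the two uni-modal ratings.

    Positive values indicate a super-additive (synergistic) combination,
    zero an additive one, and negative values a sub-additive one.
    """
    return bimodal - (unimodal_a + unimodal_b)


def is_superadditive(unimodal_a: float, unimodal_b: float, bimodal: float) -> bool:
    """True if the bi-modal rating exceeds the sum of the uni-modal ratings."""
    return superadditivity_index(unimodal_a, unimodal_b, bimodal) > 0


# Invented mean vection intensity ratings on a 0-100 scale (illustration only)
auditory_only = 10.0
biomechanical_only = 40.0
auditory_plus_biomechanical = 60.0

print(superadditivity_index(auditory_only, biomechanical_only, auditory_plus_biomechanical))
print(is_superadditive(auditory_only, biomechanical_only, auditory_plus_biomechanical))
```

With bounded rating scales, ceiling effects can mask super-additivity, so a non-positive index does not by itself rule out a synergistic interaction.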

4.2 Walking and Visual Cues

4.2.1 Circular Vection

Lackner and DiZio [55] used a circular treadmill inside an optokinetic drum to demonstrate that visual cues that did not match treadmill (i.e., walking) speed systematically affected not only perceived self-motion, but also the perceived stride length and frequency and even the perceived stepping direction. Of particular interest in our context is condition 3 in their experiment, in which participants were stationary and stepped along with the rotating floor disc while the optokinetic drum did not move. Whereas half of the participants ‘correctly’ perceived themselves to be stationary while stepping along a rotating disc, the other half experienced illusory self-motion in the sense that they (erroneously) reported walking forward on a stationary disc while the optokinetic drum was moving along with them. This suggests that biomechanical cues from walking can (at least for some participants) induce self-motion illusions even in the presence of conflicting visual cues, illustrating that visual cues do not necessarily dominate in cross-modal cue conflict situations. This further corroborates the different vection-inducing potential of walking in circular patterns (where biomechanical vection is strong and can even overpower conflicting visual cues) as compared to linear walking, where biomechanical vection does not reliably occur at all. DiZio and Lackner [24] reported that combining biomechanical and visual vection by rotating the disc of a circular treadmill together with the optokinetic drum could even yield immediate vection onset.

Although Jürgens and Becker [49] demonstrated that Bayesian sensor fusion could be successfully applied to model rotation perception based on vestibular, biomechanical, visual, and cognitive information, further research is needed to fully explain and predict cross-modal and higher-level effects and contributions. The current data predict substantial vection benefits for consistent multi-modal stimulation, at least for the case of self-rotation perception. Surprisingly, however, cue combination benefits are much more ambiguous for translational vection, as we will discuss below.
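To illustrate the general idea of Bayesian sensor fusion, the following sketch combines independent Gaussian cue estimates by weighting each cue with its reliability (inverse variance). This is the generic textbook form of such a model and is not claimed to be the specific model in [49]; the function name, the cue values, and the use of a broad zero-velocity cue as a stand-in for cognitive knowledge of being stationary are assumptions for illustration.

```python
import math
from typing import Iterable, Tuple


def fuse_gaussian_cues(cues: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Fuse independent Gaussian cues given as (mean, standard deviation) pairs.

    Each cue is weighted by its precision 1 / sd**2. Returns the fused mean and
    the fused standard deviation, which is never larger than that of the most
    reliable single cue.
    """
    cue_list = list(cues)
    if not cue_list:
        raise ValueError("at least one cue is required")
    if any(sd <= 0 for _, sd in cue_list):
        raise ValueError("standard deviations must be positive")

    precisions = [1.0 / (sd * sd) for _, sd in cue_list]
    total_precision = sum(precisions)
    fused_mean = sum(p * mean for p, (mean, _) in zip(precisions, cue_list)) / total_precision
    fused_sd = math.sqrt(1.0 / total_precision)
    return fused_mean, fused_sd


# Invented rotation velocity estimates in deg/s (illustration only)
visual = (60.0, 5.0)            # fairly reliable visual estimate
biomechanical = (55.0, 10.0)    # noisier estimate from stepping
stationary_prior = (0.0, 40.0)  # weak expectation of not moving (hypothetical)

print(fuse_gaussian_cues([visual, biomechanical, stationary_prior]))
```

In this form, adding a consistent second cue always reduces the uncertainty of the fused estimate, which is one way to think about why consistent multi-modal stimulation could benefit self-rotation perception.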

4.2.2 Linear Vection

Whereas walking on a linear treadmill apparently cannot by itself induce a compelling sensation of self-motion (linear vection), it can modulate the occurrence and strength of visually-induced linear vection: Although one would normally assume that perceived self-motion during visual motion simulation in VR should benefit from additional walking cues, a recent study by Kitazaki et al. [52] suggests that providing biomechanical cues from walking on a linear treadmill might, in fact, impair visually-induced vection (see also [51]). Participants watched expanding or contracting optic flow patterns on a \(2.4{\times }1.8\) m projection screen while either standing still or walking forward on a linear treadmill with the same 4 km/h velocity as the visually simulated self-motion. When the visual cues simulated a forward motion, vection occurred later when participants also walked forwards as compared to standing still. An additional study extended these findings by including backwards walking on the linear treadmill [69]. Vection onset was delayed when the visually simulated self-motion matched participants’ walking direction, that is, in the condition that most closely matches real-world walking.

The authors suggest that this surprising finding might be caused by a decrease of the relative weight of the visual cues when observers are walking as compared to standing still. We propose that this effect might also be related to Wertheim and Reymond’s explanation of the freezing illusion (where an optic flow pattern suddenly appears to freeze when vestibular stimulation is added) and the Pavard and Berthoz effect, in that the perceived relative velocity of the visual motion might be reduced by the biomechanical motion [124]. Additional factors might also have contributed: Apart from affecting the occurrence and amount of vection, differences in the velocity of treadmill walking and visually presented motion can also induce changes in perceived self-motion and stepping movements [25, 55] as well as adaptation and re-calibration (e.g., [25, 98]).

While Kitazaki and colleagues observed an inhibition of vection when locomotor cues matched the direction of visual motion, Seno et al. recently reported the opposite effect [104]: Using visual motions that were roughly 30 times faster than the treadmill walking motions (58 km/h as compared to 2 km/h, respectively), they observed that visually-induced forward vection was facilitated by consistent biomechanical cues, whereas inconsistent walking cues impaired vection. In addition, they showed that locomotion cues from walking on a linear treadmill could systematically bias the strength and direction of vection perceived for up-down and left-right translational visual motion. Comparing the results from Kitazaki et al. and Seno et al. suggests that the differences between visual and walking speed might be critical, with vection facilitation occurring for larger visual motion speeds, and impairment found for matching visual speeds.

A recent study confirmed that forward walking on a linear treadmill can indeed impair visually induced vection when visual and treadmill velocities are matched [4]. Similar impairments of visually-induced linear vection were observed when the visual display depicted backward motion while participants walked forwards (exp. 2) or when participants simply walked on the spot while viewing forward vection displays (exp. 3). When the head motions that naturally occurred during treadmill walking were tracked and used to update the visual stimulus according to the changed viewpoint (thus mimicking real-world walking), vection strength increased [4]. However, a similar facilitation of vection was observed in passive viewing conditions when participants stood still and simulated viewpoint jitter was added to the visual display, thus confirming earlier studies (see review by Palmisano et al. [72]). Thus, even when head motions were tracked during treadmill walking, vection was still reduced compared to standing still and passively viewing the jittered display.

In conclusion, it remains puzzling how adding velocity-matched treadmill walking to a visual motion simulation can impair vection [4, 52, 69] while active head motions and simulated viewpoint jitter clearly enhance vection [72]. More research is needed to better understand under what conditions locomotion cues facilitate or impair linear vection, and what role the artificiality of treadmill walking might play. Nevertheless, the observation that self-motion perception can, at least under some circumstances, be impaired if visual and biomechanical motion cues are matched seems paradoxical (as it corresponds to natural eyes-open walking) and awaits further investigation. These results do, however, suggest that adding a walking interface to a VR simulator might potentially (at least in some cases) decrease instead of increase the sensation of self-motion and thus potentially decrease the overall effectiveness of the motion simulation. Thus, caution should be taken when adding walking interfaces, and each situation should be carefully tested and evaluated as one apparently cannot assume that walking will always improve the user experience and simulation effectiveness.

5 Further Cross-Modal Effects on Self-Motion Perception in VR

Helmholtz suggested as early as 1866 that vibrations and jerks that naturally accompany self-motions play an important role for self-motion illusions, in that we expect to experience at least some vibrations or jitter [33]. Vibrations can nowadays easily be included in VR simulations and are frequently used in many applications. Adding subtle vibrations to the floor or seat in VR simulations has indeed been shown to enhance both visually-induced vection [94, 100] and auditory vection [85, 88], especially if accompanied by a matching simulated engine sound [119, 120].

Vection can also be substantially enhanced when the vection onset is accompanied by a small physical motion (such as a simple jerk of a few centimeters or degrees) in the direction of visually-simulated self-motion. This has been shown for both passive movements of the observer [9, 93, 100, 126] and for active, self-initiated motion cueing using a modified manual wheelchair [84] or a modified Gyroxus gaming chair where participants controlled the virtual locomotion by leaning into the intended motion direction [87]. For passive motions, combining vibrations and small physical movements (jerks) together was more effective in enhancing vection than either vibrations or jerks alone ([100], exp. 6).

These findings are promising for VR applications, as both vibrations and minimal motion cueing can be added to existing VR simulations with relatively little effort and cost. Moreover, these simple means of providing vibrations or jerks were shown to be effective despite being physically incorrect: while jerks normally need to be in the right direction and synchronized with the visual motion onset to be effective, their magnitude seems to be of lesser importance. Indeed, for many applications there seems to be a surprisingly large coherence zone in which visuo-vestibular cue conflicts are either not noticed or at least seem to have little detrimental effect [115]. Surprisingly, physical motion cues can enhance visually-induced vection even when they do not match the direction or phase of the visually-displayed motion [128]: When participants watched sinusoidal linear horizontal (left-right) oscillations on a head-mounted display, they reported more compelling vection and larger motion amplitudes when they were synchronously moved (oscillated) in the vertical (up-down) and thus orthogonal direction. Similar enhancement of perceived vection and motion amplitude was observed when both the visual and physical motions were in the vertical direction, even though visual and physical motions were always in opposite directions and thus out of phase by \(180^{\circ }\) (e.g., the highest visually depicted view coincided with the lowest point of their physical vertical oscillatory motion). In fact, the compellingness and amplitude of the perceived self-motion were not significantly smaller than in a previous study where visual and inertial motions were synchronized and not phase-shifted [129]. Moreover, for both horizontal and vertical visual motions, perceived motion directions were almost completely dominated by the visual, not the inertial motion. That is, while there was some sort of “visual capture” of the perceived motion direction, the extent and convincingness of the perceived self-motion was modulated by the amount of inertial acceleration.

Recently, Seno et al. [106] demonstrated that air flow provided by a fan positioned in front of the observer’s face significantly enhanced visually induced forward linear vection. Backward linear vection was not facilitated, however, suggesting that the air flow needs to at least qualitatively match the direction of simulated self-motion, similar to head wind.

In two recent studies, Ash et al. showed that vection is enhanced if participants’ active head movements are updated in the visual self-motion display, compared to a condition where the identical previously recorded visual stimulus was replayed while observers did not make any active head movements [5, 6]. This means that vection was improved by consistent multisensory stimulation where sensory information from participants’ own head movements (vestibular and proprioceptive) matched visual self-motion information on the VR display [6]. In a second study with a similar setup, Ash et al. [5] found that adding a deliberate display lag between the head and display motion modestly impaired vection. This finding is highly important since end-to-end system lag is present in most VR applications, especially in interactive, multisensory, real-time VR simulations. Despite technical advancement, it is to be expected that this limitation cannot be easily overcome in the near future.

In conclusion, there can often be substantial benefits in providing coherent self-motion cues in multiple modalities, even if they can only be matched qualitatively. Budget permitting, allowing for actual physical walking or full-scale motion or motion cueing on 6DoF motion platforms is clearly desirable and might be necessary for specific commercial applications like flight or driving simulation. When budget, space, or personnel is more limited, however, substantial improvements can already be gained by relatively moderate and affordable efforts, especially if consistent multi-modal stimulation and higher-level influences are thoughtfully integrated. Although they do not provide physically accurate simulation, simple means such as including vibrations, jerks, spatialized audio, or providing a perceptual-cognitive framework of movability (see Sect. 7.2) can go a long way. Even affordable, commercially available motion seats or gaming seats can provide considerable benefits to self-motion perception and overall simulation effectiveness [87].

As we will discuss in more detail in our conceptual framework in Sect. 2.9, it is essential to align and tailor the simulation effort with the overarching goal: e.g., is the ultimate goal physical correctness, perceptual effectiveness, or behavioral realism? Or is a stronger value put on users’ overall enjoyment, engagement, and immersion, as in the case of many entertainment applications, which represent a considerable and increasing market share?

6 Simulator Sickness and Vection in VR

While a compelling sensation of self-motion in VR clearly increases the overall believability and realism of a simulation, the occurrence and strength of vection can sometimes also correlate with undesirable side-effects like motion after-effects or motion/simulator sickness [34, 35, 50, 73]. It remains unclear, however, whether and how vection might be causally related to simulator sickness, as vection is more easily observed when visuo-vestibular cue conflicts are small, whereas motion sickness tends to increase for larger cue conflicts [50, 73]. Moreover, visually-induced motion sickness can occur without either vection or optokinetic nystagmus [46], indicating that vection cannot be a necessary pre-requisite of visually-induced motion sickness.

Carefully planned research is needed to investigate and disambiguate underlying factors promoting desirable outcomes (like compelling self-motion perception with reduced simulation cost) versus undesirable side-effects (like simulator sickness, after-effects, or (re)adaptation effects) and their potential interactions. As displays become more effective in inducing vection, they might also become more powerful in inducing undesirable side-effects. Thus, applications should be carefully evaluated in terms of not only intended benefits but also potential undesirable side-effects (see also conceptual framework in Sect. 2.9).

7 Perceptual Versus Cognitive Contributions to Vection

While self-motion illusions have traditionally been explained by perceptual (lower-level) factors and bottom-up processes (e.g., stimulus frequency, velocity, or field of view), recent studies provide converging evidence that self-motion illusions can also be affected by cognitive (higher-level) factors and top-down processes. In the following, we will briefly review and discuss relevant findings before attempting to integrate them into a conceptual framework in the final sections of this chapter.

7.1 Lower-Level and Bottom-Up Contributions to Vection

Visually-induced self-motion illusions have clearly received the most attention in vection research so far, and a number of lower-level/perceptual factors and bottom-up processes have been shown to facilitate visually-induced vection, which will be briefly discussed below. More in-depth discussion of lower-level factors and bottom-up contributions for vection can be found in [2, 23, 38, 39, 61, 86, 123].

Visual field of view. Although vection can sometimes be induced using fields of view as small as \(7.5^{\circ }\) [3], increasing the field of view subtended by the moving stimulus generally enhances all aspects of vection [10, 16, 23, 32]. Strongest vection is observed with full-field stimulation, up to a point where illusory self-motion cannot be distinguished from physical self-motion any more. When perceived depth is held constant, vection strength linearly increases with increasing stimulus size, independent of stimulus eccentricity [63]. This suggests that most affordable fishtank VR (desktop-monitor-based) and HMDs are unsuitable for reliably inducing compelling vection, as their field of view is typically not sufficiently large.

Eccentricity of moving stimulus. Earlier studies argued that visual motion in the periphery is more effective in inducing vection than central motion [16, 23, 47]. When display areas are equated, however, central and peripheral stimulus areas have similar vection-inducing potential [3, 41, 63, 79, 125]. However, peripheral stimuli need to be of lower spatial frequency to be maximally effective in inducing vection, as our visual acuity systematically decreases in the periphery [76]. From an applied perspective, this suggests that peripheral displays need not be of high resolution unless users frequently need to focus there [125].

Stimulus velocity. Increasing stimulus velocities generally tends to enhance both the perceived velocity and intensity of vection, at least up to an optimal stimulus velocity of, e.g., around \(120\,^{\circ }\)/s for circular visual vection [1, 16, 23, 39, 101]. Note that these maximum effective velocities are larger than the maximum stimulus velocities that can easily be displayed in VR without noticeable and disturbing image artifacts (such as motion blur or seeing multiple images) due to the limited update/refresh rate of typically 60 Hz.
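To make the refresh-rate limitation concrete, the following minimal sketch computes how far a rotating stimulus jumps between two successive frames. The function name is hypothetical and introduced only for illustration. The calculation simply divides angular velocity by refresh rate and makes no claim about the displacement at which artifacts become visible.

```python
def per_frame_displacement_deg(angular_velocity_deg_s: float,
                               refresh_rate_hz: float) -> float:
    """Angular jump of the stimulus between two successive frames, in degrees.

    Illustrative helper (not from the chapter): velocity / refresh rate.
    """
    if refresh_rate_hz <= 0:
        raise ValueError("refresh_rate_hz must be positive")
    return angular_velocity_deg_s / refresh_rate_hz


if __name__ == "__main__":
    # Values taken from the text: about 120 deg/s stimulus velocity, 60 Hz display.
    step = per_frame_displacement_deg(120.0, 60.0)
    print(f"Image displacement per frame: {step} deg")
```

At the velocity and refresh rate quoted above, the stimulus moves 2 degrees of visual angle from one frame to the next, which illustrates why discrete image steps can become noticeable at high stimulus velocities.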

Density of moving contrasts. The occurrence and strength of vection in general increases with the number and density of moving objects and contrasts [17, 23]. This suggests that VR simulations that are too sparse (e.g., driving in fog, or flight simulations in clouds with low density of high-contrast objects) might not be able to reliably induce vection without artificially increasing contrast and/or the density of moving objects.

Viewpoint jitter. A common explanation why vection does not occur instantaneously is the inter-sensory conflict between those cues indicating stationarity (e.g., vestibular cues) and those suggesting self-motion (e.g., moving visual cues or circular treadmill walking). This cue conflict account is corroborated by showing that bilaterally labyrinthine defective participants perceive visual vection much earlier and more intensely [48], and can perceive unambiguous roll or pitch vection through head-over-heels orientations [22]. All the more surprisingly, however, there are situations where increasing visuo-vestibular conflicts can enhance vection, as reviewed in [72]: In a series of carefully designed experiments, Palmisano and colleagues demonstrated that forward linear vection occurred earlier, lasted longer, and was more compelling when coherent viewpoint jitter was added to the expanding optic flow display [77], whereas incoherent jitter impaired vection [74]. This was found even when the display was perceived as flat and did not contain any depth cues [64]. Overall, simulated viewpoint jitter shows a larger vection-facilitating effect if it is orthogonal to the main vection direction [64, 73, 78]. In VR, such findings could be used to enhance vection by, for example, adding viewpoint oscillations induced by walking or head motions [4, 19] as is sometimes done in gaming. This should be carefully tested, however, as adding image jitter or oscillations can increase not only vection, but also motion sickness [73].
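As a rough illustration of how viewpoint oscillations orthogonal to the main motion direction could be added to a simulated forward translation, consider the sketch below. All names and default values (forward speed, oscillation amplitude and frequency) are hypothetical placeholders chosen for illustration and are not taken from the cited studies. A sinusoidal vertical oscillation is used here for simplicity; random jitter would be an alternative.

```python
import math


def oscillating_viewpoint(t, forward_speed=1.5, amplitude=0.03, frequency=2.0):
    """Return the simulated camera position (x, y, z) at time t (seconds).

    The main motion is forward along z; a sinusoidal oscillation is added
    along y, i.e., orthogonal to the main motion direction.
    All default values are illustrative assumptions.
    """
    z = forward_speed * t  # constant-velocity forward translation (m)
    y = amplitude * math.sin(2.0 * math.pi * frequency * t)  # vertical bob (m)
    return (0.0, y, z)


def sample_path(duration, frame_rate=60.0, **kwargs):
    """Sample the viewpoint once per frame over the given duration (seconds)."""
    n_frames = int(round(duration * frame_rate))
    return [oscillating_viewpoint(i / frame_rate, **kwargs)
            for i in range(n_frames + 1)]


if __name__ == "__main__":
    for x, y, z in sample_path(0.1):
        print(f"x={x:.3f}  y={y:+.4f}  z={z:.3f}")
```

In an actual application, the amplitude and frequency would need to be tuned and evaluated empirically, since such oscillations may increase motion sickness as well as vection.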

7.2 Cognitive and Top-Down Contributions to Vection

While earlier vection research focused predominately on perceptual and lower-level factors, there is increasing evidence that vection can also be affected by what is outside of the moving stimulus itself, by the way we move and look at a moving stimulus, our pre-conceptions, intentions, and how we perceive and interpret the stimuli, which is of particular importance in the context of VR. Vection might even be directly or indirectly affected by cognitive/top-down processes [3, 57, 61, 96]. Below we will discuss some of these examples. More comprehensive reviews are provided by [85, 86, 100].

Viewing pattern and perceived foreground-background relationship. Fixation on a stationary foreground object or simply staring at the moving visual stimulus has long been known to enhance visual vection, as compared to natural viewing or smooth pursuit [28, 60, 121, 122]. Suppressing the optokinetic reflex seems to play a central role here, and this is facilitated when a fixation object is provided [8]. Potentially related to this, stationary foreground objects facilitate vection (especially if centrally presented), whereas stationary background stimuli reduce vection, especially if presented peripherally [17, 42, 62]. Of particular importance seems to be the perceived foreground-background or figure-ground relationship, in that vection tends to be dominated by motion of the perceived background, even if the background is not physically further away than the perceived foreground [17, 45, 53, 63, 65, 67, 68]. This “object and background hypothesis for vection” has been elaborated upon and confirmed in an elegant set of experiments using perceptually bistable displays like the Rubin’s vase that can be perceived either as a vase or two faces [103].

In VR simulations, these findings could be used to systematically reduce or enhance illusory self-motions depending on the overall simulation goal, e.g., by modifying the availability of real or simulated foreground objects (e.g., dashboards), changing peripheral visibility of the surrounding room (e.g., by controlling lighting conditions), or changing tasks/instructions (e.g., instructions to pay attention to instruments which are typically stationary and in the foreground).

Naturalism, presence, and interpretation of the moving stimulus. Naturalism and ecological validity of the moving stimulus has also been suggested to affect vection [84, 116], potentially due to our inherent assumption of a stable environment [23, 81, 82]. For example, auditory vection was enhanced when the moving sounds represented “acoustic landmarks”, i.e., objects that do not normally move such as church bells, as compared to typically moving objects like cars or artificial sounds like pink noise [56, 96, 118].

For visual vection Riecke et al. [84] demonstrated that vection as well as presence were impaired when the naturalistic stimulus of a city environment was systematically degraded by mosaic-like scrambling. Different aspects of presence were correlated with specific aspects of vection: Whereas spatial presence correlated most strongly with the convincingness of illusory self-motion, attention/involvement in the simulation correlated predominately with vection onset latency. In a second experiment, the visual stimulus of a natural scene was compared to an upside-down version of the same stimulus. Even though the inversion of the stimulus left the physical stimulus characteristics (i.e., the image statistics and thus perceptual/bottom-up factors) essentially unaltered, both presence and the convincingness of vection were significantly reduced. This strongly suggests a cognitive or top-down contribution to presence and the convincingness of self-motion illusions. We posit that the natural, ecologically more plausible upright stimulus might have more easily been accepted as a stable “scene”, which in turn facilitated both presence and the convincingness of vection.

These findings are supported by tumbling room studies, where the tumbling sensation (roll vection) is enhanced for naturalistic environments that include a clear visual frame of reference and objects with an obvious intrinsic upright orientation [1, 40]. That is, whereas simple textured displays only tend to produce limited tilting sensations [1, 32, 133], observing a fully furnished natural room rotating around stationary participants can induce compelling \(360^{\circ }\) head-over-heels tumbling sensations in most people [1, 40, 43, 71]. Moreover, Palmisano et al. stated that “the \(360^{\circ }\) illusory self-rotations produced by rotating a furnished room around the stationary observer’s roll axis were very similar to the sensations of self-rotation produced by rotating the observer inside the stationary room” (p. 4057). The importance of a naturalistic visual stimulus is corroborated by Wright et al. who demonstrated that visual motion of a photo-realistic visual scene can dominate even conflicting inertial motion cues in the perception of self-motion [128, 129].

Metaphorical cross-modal facilitation of vection. Recently, Seno et al. demonstrated that linear visual vection could even be facilitated by auditory cues that do not move by themselves, but only match the visual motion metaphor [102]. For example, sounds increasing in amplitude (as if coming closer) facilitated visually-induced forward vection, but not backwards, sideways (left-right) or vertical (up-down) vection. Sounds decreasing in amplitude did not show any clear effects on vection, though. Whereas forward motions in normal life are often accompanied by increasing sound amplitudes for sounding stationary objects in front of us, this physical correspondence to real-world situations does not seem to be necessary for sound to facilitate visually-induced vection: Sounds ascending (“going up”) in frequency facilitated upwards vertical vection, but had no influence on downwards, sideways (left-right), or forward-backwards vection [102]. Correspondingly, sounds decreasing in frequency (“going down”) facilitated downwards vertical vection, but had no effect on any other vection direction. Similar effects of spatial metaphor mapping have been observed for the emotional connotation of sounds, in that emotionally “positive” sounds facilitated upwards vection compared to neutral sounds [99]. Together, these findings further corroborate the proposition that multi-modal consistency between different stimuli can facilitate vection [86, 94, 96, 102], even in situations where this correspondence is only metaphorical and not purely sensorial. However, as vection is an inherently subjective phenomenon, vection researchers need to carefully assess potential experimental biases such as perceived demand characteristics of the experimental situation and participants’ expectations and prior knowledge.

Cognitive-perceptual framework of movability. A number of studies demonstrated that merely knowing or perceiving that actual motion is impossible (rather than possible) can reduce visual vection, both in the real world and VR [3, 57, 130]. For example, Andersen and Braunstein [3] remark that pilot experiments had shown that in order to perceive any self-motion, participants had to believe that they could actually be moved in the direction of perceived vection. Accordingly, participants were asked to stand in a movable booth and look out of a window to view the optic flow pattern. This procedure allowed them to elicit vection with a visual FOV as small as \(7.5^{\circ }\). Lepecq et al. [57] demonstrated that seven-year-old children perceived vection earlier when they were previously shown that the chair they were seated on could physically move in the direction of simulated motion, even though this never happened during the actual experiment. Similarly, knowing that actual motion is possible in VR (by demonstrating the motion capabilities of a motion platform prior to testing) can make people believe that they actually moved, even though they never did [86, 100]. Recently, Riecke et al. [85] demonstrated that providing such a cognitive-perceptual framework of movability can also enhance auditory vection. When blindfolded participants were seated on a hammock chair while listening to binaural recordings of rotating sound fields, auditory circular vection was facilitated when participants’ feet were suspended by a chair-attached footrest as compared to being positioned on solid ground. This supports the common practice of seating participants on potentially moveable platforms or chairs in order to elicit auditory vection [54, 117, 118].

Attention and cognitive load. There seems to be mixed evidence about the potential effects of attention and cognitive load on vection. Whereas Trutoiu et al. [113] observed vection facilitation when participants had to perform a cognitively demanding secondary task, vection inhibition was reported by Seno and colleagues [105]. When observers in [53] were asked to specifically attend to one of two simultaneously presented upward and downward optic flow fields of different colors, the non-attended flow field was found to determine vection direction. This might, however, also be explained by attention modulating the perceived depth-ordering and foreground-background relationship, as discussed in detail in [103]. Cognitive priming can also affect the time course of vection [75]: Adult participants experienced vection earlier when they were seated on a potentially movable chair and were primed towards paying attention to self-motion sensation, compared to a condition where they were seated on a stationary chair and instructed to attend to object motion, not self-motion. Thus, while attention and cognitive load can clearly affect self-motion illusions, further research is needed to elucidate underlying factors and explain seemingly conflicting findings. A recent study suggests that vection can even be induced when participants are not consciously aware of any global display motion, which was cleverly masked by strong local moving contrasts [107].

Finally, the occurrence, onset latency, and perceived strength of vection tend to vary considerably between participants. Although there is little research investigating potential underlying factors, recent research suggests that personality traits might be a contributing factor. In a linear visual vection study, more narcissistic observers reported weaker vection, indicated by increased vection onset latencies, reduced vection duration, and decreased vection magnitude [108]. Future research is needed to investigate if differences in personality traits indeed directly affect the self-motion illusions, and/or if the observed vection reduction for increasing narcissism might also be related to a criterion shift for reporting vection.

In general, cognitive factors seem to become more relevant when stimuli are ambiguous or have only weak vection-inducing power, as in the case of auditory vection [85] or sparse or small-FOV visual stimuli [3]. It is conceivable that cognitive factors generally have an effect on vection, but that this has not been widely recognized for methodological reasons. For example, the cognitive manipulations might not have been powerful enough or free of confounds, or sensory stimulation might have been so strong that ceiling level was already reached, which is likely the case in an optokinetic drum that covers the full visible FOV.

8 Does Vection Improve Spatial Updating and Perspective Switches?

Spatial updating is seemingly automatic and requires little cognitive resources if participants physically move to the new position [80, 97]. For example, humans can continuously and accurately point to a previously-seen target when either walking or being passively transported, both for linear motions [20, 109] and curvilinear motions [29]. However, when participants in Frissen et al. [29] were stationary and only biomechanical cues from stepping along a circular treadmill indicated the curvilinear motion, spatial updating performance (quantified using continuous pointing) declined and showed systematic errors. The authors did not assess whether participants in some trials might have perceived biomechanical vection. In a follow-up study by Frissen et al., continuous pointing responses indicated that participants can indeed perceive a slow drift (about \(7^{\circ }\)/s) for curvilinear off-center walking-in-place on a large (3.6 m diameter) circular treadmill, but only at about 16 % of their actual walking speed of \(40^{\circ }\)/s (cf. Chap. 6 of this book). Surprisingly, although participants were always walking forward, pointing responses indicated backward self-motion in 42 % of the trials. This suggests that biomechanical cues from curvilinear forward walking were not sufficient for inducing a reliable sensation of forward self-motion. Indeed, when averaged over trial repetitions, participants did not report any substantial net self-motion. This might have contributed to the above-mentioned decline in spatial updating performance when participants did not physically move [29].

It is, however, conceivable that a compelling illusion of self-motion (even without any actual physical motion) might be sufficient to enable spatial updating performance similar to physical motions, or at least better than in purely imagined perspective switches. Riecke et al. [90] tested this hypothesis and provide first evidence that self-motion illusions might indeed help us to update target locations in the absence of physical self-motions. After learning the layout of nine irregularly arranged objects in the lab, participants were blindfolded and asked to point to those previously-learned objects from novel imagined perspectives (e.g., “imagine facing ‘mic’, point to ‘hat’ ”). As predicted by prior research [80, 97], imagined perspective switches were difficult when participants remained stationary and simply had to imagine the perspective switch. Both pointing accuracy and consistency (“configuration error”) improved, however, when participants had the illusion of rotating to the to-be-imagined perspective, despite not physically moving. Circular vection in this study was induced by combining auditory vection (induced via rotating sound fields) with biomechanical vection (induced by stepping on a circular treadmill, similar to sitting stationary above a turning carousel) in order to avoid visual cues that might interfere with imagined perspective-taking.

While further studies are needed to corroborate these findings, these data suggest that providing the mere illusion of self-motion might provide similar benefits in terms of spatial orientation and perspective switches as actual self-motion. This could ultimately enable us to design effective yet affordable VR simulations, as the need for physical motion of the observer could be largely reduced, which, in turn, reduces overall costs, space and equipment needs, and required safety and simulation effort.

9 Conclusions and Conceptual Framework

In conclusion, the above review of the literature supports the notion that cognitive or top-down mechanisms like spatial presence, the cognitive-perceptual framework of movability, as well as the interpretation of a stimulus as stable and/or belonging to the perceptual background, do all affect self-motion illusions, a phenomenon that was traditionally believed to be mainly bottom-up driven ([85], for reviews, see [86, 100]). This adds to the small but growing body of literature that suggests cognitive or top-down contributions to vection, as discussed in Sect. 7.2. Furthermore, correlations between the amount of presence/immersion/involvement and self-motion perception [91, 92] suggest that these factors might mutually affect or support each other. While still speculative, this would be important not only for our theoretical understanding of self-motion perception, presence, and other higher-level phenomena, but also from an applied perspective of affordable yet effective self-motion simulation. In the following, we would like to broaden our perspective by trying to embed these ideas and findings into a more comprehensive tentative framework. This conceptual framework is sketched in Fig. 2.3 and will be elaborated upon in more detail below. It is meant not as a “true” theoretical model but as a tentative framework to support discussion and reasoning about these concepts and their potential interrelations.

Any application of VR, be it more research-oriented or application-oriented, is typically driven by a more or less clearly defined goal. In our framework, this is conceptualized as the effectiveness concerning a specific goal or application (Fig. 2.3, bottom box). Possible examples include the effectiveness of a specific pilot training program in VR, which includes how well knowledge obtained in the simulator transfers to corresponding real world situations, or the degree to which a given VR hardware and software can be used as an effective research tool that provides ecologically valid stimulation of the different senses.

So how can a given goal be approached and the goal/application-specific effectiveness be better understood and increased? There are typically a large number of potential contributing factors, which span the whole range from perceptual to cognitive aspects (see Fig. 2.3, top box). Potentially contributing factors include straight-forward technical factors like the FOV and update rate of a given VR setup or the availability of biomechanical cues from walking, the quality of the sensory stimulation with respect to the different individual modalities and their cross-modal consistency, and task-specific factors like the cognitive load or the users’ instructions.

Fig. 2.3

Tentative conceptual framework that sketches how different factors that can be manipulated for a given VR/research application (top box) might affect the overall effectiveness with respect to a specific goal or application (bottom box). Critically, we posit that the factors affect the overall goal not (only) directly, but also mediated by the degree to which they support both the perceptual effectiveness and behavioral effectiveness and the resulting perception-action loop (middle box)

All of these factors might affect both our perception and our action/behavior in the VE. Here, we propose a framework where the different factors are considered in the context of both their perceptual effectiveness (e.g., how they contribute to the perceived self-motion) and their behavioral effectiveness (e.g., how they contribute by empowering the user to perform a specific behavior like robust and effortless spatial orientation and navigation in VR), as sketched in Fig. 2.3, middle box.

Perception and action are interconnected via the perception-action loop, such that our actions in the environment will also change the input to our senses. State-of-the-art VR and human-computer interface technology offer the possibility to provide highly realistic multi-modal stimuli in a closed perception-action loop, and the different contributing factors summarized in the top box of Fig. 2.3 could be evaluated in terms of the degree to which they support an effective perception-action loop [27].

Apart from the perceptual and behavioral effectiveness, we propose that psychological and physiological responses might also play an important role. Such responses could be emergent and higher-level phenomena like spatial presence, immersion, enjoyment, engagement, or involvement in the VE, but also other psychological responses like fear, stress, or pleasure on the one hand and physiological responses like increased heart rate or adrenalin level on the other hand. In the current framework, we propose that such psychological and physiological responses are not only affected by the individual factors summarized in the top box in Fig. 2.3, but also by our perception and our actions themselves. Slater et al. [110] demonstrated, for example, that increased body and head motions can result in an increased presence in the VE. Presence might also be affected by the strength of the perceived self-motion illusion [81, 91]. Conversely, certain psychological and physiological responses might also affect our perception and actions in the VE. By systematically manipulating the naturalism and global scene consistency of a visually simulated scene, Riecke et al. [84] showed that the degree of presence in a simulated scene might also affect self-motion perception. Our actions and behaviors in a VE might, however, also be affected by our psychological and physiological responses. Von der Heyde and Riecke proposed, for example, that spatial presence might be a necessary prerequisite for robust and effortless spatial orientation based on automatic spatial updating or certain obligatory behaviors like fear of height or fear of narrow enclosed spaces [36, 83].

In summary, we posit that our understanding of the nature and usefulness of the cognitive factors and higher-level phenomena and constructs such as presence, immersion, or a perceptual-cognitive framework of movability might benefit if they are embedded in a larger conceptual framework, and in particular analyzed in terms of possible relations to perceptual and behavioral aspects as well as goal/application-specific effectiveness. Similar benefits are expected if other higher-level phenomena are analyzed in more detail in the context of such a framework.

10 Outlook

A growing body of evidence suggests that there is a continuum of factors that influence the perceptual and behavioral effectiveness of VR simulations, ranging from perceptual, bottom-up factors to cognitive, top-down influences. To illustrate this, we reviewed recent evidence suggesting that self-motion illusions can be affected by a wide range of parameters including attention, viewing patterns, the perceived depth structure of the stimulus, perceived foreground/background distinction (even if there is no physical separation), cognitive-perceptual frameworks, ecological validity, as well as spatial presence and involvement. While some of the underlying research is still preliminary, findings are overall promising, and we propose that these issues should receive more attention both in basic research and applications.

These factors might turn out to be crucial especially in the context of VR applications and self-motion simulations, as they have the potential of offering an elegant and affordable way to optimize simulations in terms of perceptual and behavioral effectiveness. Compared to other means of increasing the convincingness and effectiveness of self-motion simulations like increasing the visual field of view, using a motion platform, or building an omni-directional treadmill, cognitive factors can often be manipulated rather easily and without much cost, such that they could be an important step towards a lean and elegant approach to effective self-motion simulation [86, 94, 96]. This is nicely demonstrated by many theme park rides, where a conducive cognitive-perceptual framework and expectations are set up already while users are standing in line. Although there seems to be no published research on these priming phenomena in theme parks, they likely help to draw users more easily and effectively into the simulation and into anticipating and “believing” that they will actually be moving. Thus, we posit that an approach that is centered around the perceptual and behavioral effectiveness and not only the physical realism is important both for gaining a deeper understanding in basic research and for offering a lean and elegant way to improve a number of applications, especially in the advancing field of virtual reality simulations. This might ultimately allow us to come closer to fulfilling the promise of VR as an alternate reality that enables us to perceive, behave, and more specifically locomote and orient as easily and effectively in virtual worlds as we do in our real environment.