Introduction

As we move through the world, different patterns of visual motion are presented to our left and right eyes. If this stereoscopic optic flow is filmed and then later presented to stationary observers (e.g., through a head-mounted display), they will typically experience compelling illusions of self-motion, known as vection (see Footnote 1). While such illusions can be induced by presenting optic flow to only a single eye, research has shown that there is a stereoscopic advantage for vection (Allison, Ash, & Palmisano, 2014; Lowther & Ware, 1996; Palmisano, 1996, 2002; Palmisano, Summersby, Davies, & Kim, 2016b; Palmisano, Davies, & Brooks, 2019; Seya & Shinoda, 2018). These studies have reported that stereoscopic optic flow induces vection that is not only stronger and perceived to be faster, but also starts sooner and lasts longer, than the vection induced by comparable monocular or synoptic self-motion displays. This paper further investigates these stereoscopic contributions to the experience of self-motion-in-depth.

Monocular motion signals are known to be important for visual perceptions of both self-motion and scene layout (e.g., Braunstein & Andersen, 1981; Gibson, 1950; Koenderink & van Doorn, 1981; Longuet-Higgins & Prazdny, 1980). Self-motion-in-depth not only produces changes in the image locations of environmental objects over time (known as motion perspective; see Gibson, 1950), but also local changes in the image sizes of individual objects (known as changing-size cues; see Regan & Beverley, 1979). While it is possible to induce vection using monocular motion perspective alone (Andersen & Braunstein, 1985; Telford & Frost, 1993), these illusions can still be improved by adding changing-size cues (Palmisano, 1996; Palmisano & Chan, 2004; Seya & Shinoda, 2018). However, even when displays contain both of these monocular cues, vection can still be further enhanced by adding consistent stereoscopic information (Palmisano, 1996; Seya & Shinoda, 2018).

Palmisano (1996, 2002) proposed that stereoscopic optic flow improves vection by providing extra, purely binocular information about motion-in-depth and self-motion. However, the binocular cues it provides about 3-D layout might also improve vection indirectly (Palmisano, 1996, 2002). For example, binocularly perceived scene distances might be used to scale the speed of monocularly induced vection (particularly in displays where the monocular speed information is ambiguous). Consistent with the latter possibility, binocular viewing improves perceptions of scene depth, ground slant and surface roughness at distances relevant to self-motion (Allison, Gillam, & Palmisano, 2009a; Allison, Gillam, & Vecellio, 2009b; Palmisano et al., 2010). As these binocular (over monocular) viewing advantages are still found for stationary observers, they have been primarily attributed to the extra distance and depth information provided by binocular scene disparities (which arise due to individual objects falling on different retinal positions in the left and right eyes – Howard & Rogers, 2012). Static horizontal binocular disparities can generate compelling perceptions of depth (Wheatstone, 1838) and vertical binocular disparities can provide useful information about absolute egocentric distances (Gillam & Lawergren, 1983; Rogers & Bradshaw, 1993). However, these binocular scene disparities will change over time during self-motion in depth, and as a result, the images of environmental objects will also move at different velocities in the two eyes. These stereo-motion cues are known to produce compelling perceptions of object motion-in-depth (see Allison & Howard, 2011, for a review). Palmisano (1996) proposed that global patterns in these stereo-motion cues might even provide extra, purely binocular information about self-motion-in-depth.

As noted above, while binocular information about 3-D layout could contribute to vection, most evidence suggests that stereoscopic optic flow improves vection by providing extra information about motion-in-depth. In one of the earliest of these studies, Palmisano (1996) compared the vection-in-depth induced in: (1) stereo-consistent conditions, where the binocular and monocular information specified the same self-motion through the same 3-D environment; (2) stereo-conflicting-near conditions, where the dynamic monocular information specified self-motion relative to a 3-D scene, but the static binocular cues indicated the observer was stationary relative to a near 2-D scene; and (3) monocular-viewing conditions, where the observer wore an eye-patch while viewing the optic flow used in stereo-consistent conditions, and was therefore provided with only monocular information about self-motion and 3-D layout. Palmisano found that the stereo-consistent conditions induced vection-in-depth that was stronger and had shorter onset latencies than that induced by the stereo-conflicting-near and monocular-viewing conditions. As all three conditions contained changing-size cues to motion-in-depth, he next compared the vection-in-depth induced by monocularly viewed patterns of optic flow with and without these cues (object size remained constant and only monocular motion perspective information was provided in the latter same-size patterns of optic flow). Palmisano found changing-size cues also improved vection-in-depth. He concluded that dynamic stereoscopic and changing-size cues both improved vection by providing additional motion-in-depth information (see Footnote 2).

In a later study, Palmisano (2002) compared the vection, as well as the perceived depths and distances, induced by stereo-consistent, stereo-conflicting-near and stereo-conflicting-far patterns of radial flow. As in his earlier study, the monocularly available information in all three conditions was consistent with self-motion-in-depth relative to a static 3-D environment. While the dynamic binocular information specified exactly the same situation in stereo-consistent displays, the static binocular information in the other conflicting conditions indicated the observer was stationary relative to either a near 2-D or a very distant environment. As expected, the stereo-consistent and stereo-conflicting-far displays were perceived to be more 3-D and to have larger maximum extents than the stereo-conflicting-near displays. However, only stereo-consistent displays were found to improve vection-in-depth. The vection induced by these displays was perceived to be faster, and move the observer further, than that induced by both conflicting displays. Palmisano again concluded that these stereoscopic vection advantages were due to extra motion-in-depth information (because the binocular information in the conflicting conditions did not indicate any motion-in-depth and their effects on perceived scene layout were quite different to those on vection).

Based on the above, Palmisano (2002) proposed that stereoscopic information might need to be consistent with monocular motion signals in order to improve vection-in-depth. Palmisano, Davies and Brooks (2019) recently tested this proposal by examining whether large discrepancies between the observer’s simulated and physical eye-separations affected the vection induced by stereoscopic optic flow. They compared the vection induced by self-motion displays simulating eye-separations of 0 cm (the control condition), 3.25 cm (reduced from normal), 6.5 cm (approximately normal) and 13 cm (exaggerated relative to normal). Instead of the 6.5-cm (stereo-consistent) condition producing the strongest vection, they found that vection strength increased systematically with the simulated eye-separation up to at least 13 cm. Since vection advantages were found for all non-zero simulated eye-separations tested, this showed that stereoscopic benefits were not restricted to strictly ecological conditions. Stereoscopic advantages for vection persisted, and even increased, when stereoscopic and monocular motion signals had very different magnitudes.

Since binocular and monocular motion signals with very different magnitudes can still induce vection, it is possible that stereoscopic information only needs to be dynamic (as opposed to static) in order to generate a vection advantage. In principle, there are many different ways these binocular and monocular signals could be combined with each other (and with other non-visual sources of self-motion information – see Ernst & Banks, 2002; Fetsch et al., 2010; Landy et al., 1995; Perrone, 2018; van den Berg & Brenner, 1994). For example, there have been several recent attempts to model how various binocular, or monocular and binocular, motion-in-depth signals are integrated using Bayesian or maximum likelihood estimation frameworks (Allen et al., 2015; Aguado & López-Moliner, 2019; Welchman, Lam, & Bulthoff, 2008; Thompson, Rokers, & Rosenberg, 2019). While these particular studies were focussed on object-motion perception, their findings – that motion-in-depth cues combine according to cue reliability under some conditions – may generalize to self-motion perception as well. In the context of the current experiments it is important to note that the optimal behaviour of such frameworks assumes unbiased and consistent estimates from multiple cues. It is, however, possible that the perceived consistency of stereoscopic and monocular motion signals is based primarily on sign (rather than on magnitude) during vection processing. If so, linear optimal estimation frameworks would predict that either: (1) the combination of cues differing in sign should result in weaker percepts than those obtained with the individual cues; or (2) one cue would be ignored. The current study was aimed at strongly testing these predictions. It investigated whether stereoscopic advantages for vection persist under pseudoscopic viewing conditions (i.e. when what would normally be seen by the left eye is presented to the right eye, and vice versa). 
This manipulation reverses the signs of all the binocular disparities in the display (e.g. uncrossed binocular disparities become crossed binocular disparities), placing them in direct conflict with the monocularly available information. Pseudoscopic viewing can produce vivid stereoscopic effects when looking at natural scenes (Ewald & Gross, 1906; Kalaugher, 1987; Shimojo & Nakajima, 1981; Stratton, 1898; Wallin, 1905; Wheatstone, 1852), but there can also be large individual differences in the nature and timing of these perceptual experiences (see Palmisano, Hill, & Allison, 2016a). While apparent reversals of scene depth order are common (where near objects appear to be far and vice versa; e.g., Stratton, 1898), pseudoscopic viewing can also generate complex illusory foreground surfaces, reversals in scene border ownership, as well as changes in apparent object size and shape (similar effects to the hollow face and hollow potato illusions – Hill & Johnston, 2007) (please see Palmisano, Hill, & Allison, 2016a for full descriptions of all of these different pseudoscopic viewing effects).
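To make the sign-conflict predictions concrete, the following minimal sketch combines two motion-in-depth estimates using reliability (inverse-variance) weights, as in standard maximum likelihood cue combination. The function name, the signed cue values and the variances are illustrative assumptions; none of them are quantities taken from the studies cited above.

```python
def combine_cues(estimates, variances):
    """Reliability-weighted (inverse-variance) linear combination of cue estimates."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total
    combined_variance = 1.0 / total
    return combined, combined_variance


# Hypothetical signed speed estimates (positive = forward self-motion).
monocular, stereo = 1.0, 0.6

# Cues agree in sign (analogous to stereo-consistent viewing).
same_sign, _ = combine_cues([monocular, stereo], [1.0, 1.0])

# Stereo cue reversed in sign (analogous to pseudoscopic viewing).
opposite_sign, _ = combine_cues([monocular, -stereo], [1.0, 1.0])
```

With equal reliabilities, the opposite-sign combination is weaker than the monocular estimate alone, which corresponds to prediction (1). Prediction (2) corresponds to giving the stereoscopic cue a weight of zero, so that the combined estimate equals the monocular one.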

Here we examine (for the first time) the effects on vection of reversing the available stereoscopic information (i.e. binocular disparity, vergence, changing disparity over time, and interocular velocity differences), thereby putting them into direct conflict with the monocularly available information about self-motion (i.e. monocular motion perspective and changing-size cues). These pseudoscopic viewing conditions should produce conflicts between stereoscopic and monocular motion signals in terms of both their signs and their magnitudes.

Experiment 1: Vection induced by stereo-consistent, stereo-reversed and flat-stereo same-size flow

This experiment examined how vection is affected when stereo-consistent or stereo-reversed information is added to same-size optic flow (i.e. self-motion displays where no changing-size cues are provided). The monocularly available motion perspective information in all of the displays represented forward self-motion-in-depth through a 3-D dot cloud. In flat-stereo conditions (see Footnote 3), the static binocular information indicated the observer was stationary relative to a near 2-D (i.e. flat) environment. By contrast, the dynamic binocular information in stereo-consistent and stereo-reversed displays simulated self-motion in either the same or the opposite direction to the monocular motion perspective. In addition to measuring vection strength and onset latency in this experiment, we also measured perceptions of overall scene depth and motion-in-depth speed to identify the origins of any stereo-consistent/stereo-reversed effects on vection (i.e. to determine if the stereo cues were having direct or indirect effects on the vection). The vection, scene depth and motion-in-depth speed data produced by stereo-consistent and stereo-reversed displays were compared to those obtained using the flat-stereo reference displays.

Method

Participants

Six male and 12 female psychology students and staff at the University of Wollongong participated in this experiment (mean age 22.5 years; SD 2.6 years). All had normal or corrected-to-normal visual acuities, static stereoacuities of 40 s of arc or better, and an average inter-pupillary distance of 6.0 cm (SD = 0.34 cm). They all reported being clear of vestibular impairments and presented no obvious signs of oculomotor or neurological pathology. The University of Wollongong human ethics committee approved the study in advance and each participant provided written informed consent before taking part in the study.

Design

Two independent variables were manipulated in this within-subjects experiment: (1) DISPLAY TYPE. In addition to the non-stereoscopic displays (flat-stereo), participants were shown displays that provided either consistent stereoscopic information (stereo-consistent) or reversed stereoscopic information (stereo-reversed). Participants viewed all three types of display binocularly; (2) SPEED. Displays simulated either a 0.23 m/s, a 0.46 m/s or a 0.92 m/s speed of forward self-motion (i.e. on a straight-ahead and eye-level heading). Four dependent variables were measured for each of these nine DISPLAY TYPE by SPEED conditions, including: (1) the time from the start of the display motion until participants first felt they were moving (i.e. the vection onset latency); as well as (2) the strength of the vection (rated from 0–10); (3) the perceived speed of the object motion-in-depth (rated from 0–10); and (4) the perceived scene depth (rated from 0–10) for each display. In the latter three cases, these ratings were anchored with ‘0’ indicating no sense of vection/motion/depth and ‘5’ corresponding to the standard (see Procedure below).

Apparatus

Prior to the experiment, each participant’s static stereoacuity was measured using the Random Dot Stereo Butterfly Test (Stereo Optical Co., Inc.) and their eye-separation was measured using a digital pupillary distance (PD) meter (PD-NH-L8; http://ophthalmology.innz.se/equipment/refraction/pd-meters/). Self-motion displays were then generated on a Dell Precision T3500 workstation and rear-projected onto a flat screen (1.82 m wide by 1.46 m high) using a Panasonic PT-AE7000 3D projector (1,920 × 1,080 pixel resolution; refresh rate 60 Hz; in top-and-bottom stereoscopic frame sequential presentation mode). Participants viewed all of the displays through Panasonic TY-ER3D4MU 3D shutter glasses (i.e. alternate frame sequencing with infrared time synchronization; these glasses resulted in 30 images/s per eye). They were seated 1.4 m in front of the projection screen, which subtended a visual angle of 66° horizontally and 55° vertically. Like most previous stereo vection studies (e.g., Palmisano, 1996, 2002; Palmisano et al., 2016b), we chose to use this relatively long viewing distance to minimise any potential problems due to accommodation-vergence mismatch (see Hoffman et al., 2008; Watt et al., 2005). A chinrest was used to minimise any head movements and participants viewed these self-motion displays in an otherwise dark room. Participants reported vection onset by pressing the left button of a USB mouse during each trial, and entered their vection strength, perceived scene depth and perceived motion-in-depth speed ratings using the computer’s keyboard following that trial.
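The reported visual angles follow from the screen dimensions and the viewing distance. The short check below is a sketch that assumes the observer's line of sight was centred on a flat screen; the function name is ours.

```python
import math


def visual_angle_deg(extent_m, viewing_distance_m):
    """Visual angle subtended by a flat extent centred on the line of sight."""
    return math.degrees(2.0 * math.atan(extent_m / (2.0 * viewing_distance_m)))


# Screen dimensions and viewing distance as given in the Apparatus section.
horizontal_deg = visual_angle_deg(1.82, 1.4)
vertical_deg = visual_angle_deg(1.46, 1.4)
```

Rounded to the nearest degree, these values reproduce the 66° by 55° field stated above.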

Visual displays

All displays simulated forward self-motion through a 3-D cloud of 1,681 randomly positioned blue dot objects (maximum object luminance was 3.8 cd/m2 on a 1.0 cd/m2 black background). The dimensions of this 3-D cloud were 3.7 m wide by 2.8 m high by 6.3 m deep (i.e. the nearest objects were simulated to be 0.35 m, and the furthest objects were simulated to be 6.65 m, away from the observer along the depth axis). Object image sizes did not increase as the observer was simulated to approach them – each object always subtended a visual angle of 0.29° wide by 0.61° high irrespective of where it was simulated to lie in the 3-D cloud. There was, however, a gradual linear increase in object luminance as the simulated distance of the object from the observer decreased (from 1.7 to 3.8 cd/m2), simulating movement through a fog. The main purpose of this simulated fog was to reduce the salience of new objects suddenly appearing at the far end of the cloud. However, it also served as a potential monocular cue to the depth order of the objects in this simulated environment.
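A display of this kind could be sketched as follows. The cloud dimensions, dot count and luminance range come from the description above; the uniform random placement, the centring of the cloud on the observer and the recycling of dots from the near plane to the far end are assumptions made for illustration, and all names are hypothetical.

```python
import random

N_DOTS = 1681
WIDTH_M, HEIGHT_M = 3.7, 2.8
NEAR_M, FAR_M = 0.35, 6.65
LUM_FAR, LUM_NEAR = 1.7, 3.8  # cd/m^2, simulated fog


def fog_luminance(z_m):
    """Linear luminance ramp: dimmest at FAR_M, brightest at NEAR_M."""
    t = (FAR_M - z_m) / (FAR_M - NEAR_M)
    return LUM_FAR + t * (LUM_NEAR - LUM_FAR)


def make_cloud(seed=0):
    """Randomly position the dots (x, y, z) inside the cloud (assumed uniform)."""
    rng = random.Random(seed)
    return [(rng.uniform(-WIDTH_M / 2, WIDTH_M / 2),
             rng.uniform(-HEIGHT_M / 2, HEIGHT_M / 2),
             rng.uniform(NEAR_M, FAR_M)) for _ in range(N_DOTS)]


def advance(cloud, speed_mps, dt_s):
    """Simulate forward observer motion for one frame.

    Dots that pass the near plane are moved to the far end (assumed recycling).
    """
    depth = FAR_M - NEAR_M
    moved = []
    for x, y, z in cloud:
        z -= speed_mps * dt_s
        if z < NEAR_M:
            z += depth
        moved.append((x, y, z))
    return moved


cloud = make_cloud()
next_frame = advance(cloud, speed_mps=0.92, dt_s=1.0 / 30.0)
```

Because the displays were same-size, each dot would be drawn at a fixed angular size regardless of its `z` value; only its screen position and luminance change as the observer is simulated to approach it.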

Stereoscopic displays presented different patterns of optic flow to the left and right eyes. In stereo-consistent conditions, the stereoscopic and monocular motion signals simulated the same forward self-motion through this 3-D cloud (please see Supplementary Material, Movie 1: Same-sizeStereoMovie.avi). Stereo-reversed conditions were created by swapping the left and right eye views of these stereo-consistent displays, thereby placing static and dynamic stereoscopic information into direct conflict with the monocular motion perspective information. By contrast, the flat-stereo displays projected the same left eye view to both eyes. As a result, only monocular motion signals indicated forward self-motion through the cloud in these control conditions (the available binocular information indicated that observers were not moving in depth; instead it suggested that they were stationary relative to a 2-D frontal surface that was 1.4 m in front of them at the level of the projection screen). The same stereoscopic presentation mode was used for all three different types of display to equate frame rates (always set at 30 Hz per eye) and the observers viewed all conditions through the shutter glasses. Each display was presented for either 25 s (during the vection trials) or 5 s (during the scene depth rating and the motion-in-depth speed rating trials).
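The three display types differ only in which rendered view reaches which eye. The sketch below illustrates this with a simple pinhole projection onto a screen 1.4 m away; the simulated eye-separation (taken here as 6.0 cm, the sample's mean inter-pupillary distance) and all function names are assumptions, since the rendering code is not described in the text.

```python
SCREEN_DISTANCE_M = 1.4


def screen_x(point_x_m, point_z_m, eye_x_m):
    """Horizontal screen position of a point projected towards one eye."""
    return eye_x_m + (point_x_m - eye_x_m) * SCREEN_DISTANCE_M / point_z_m


def eye_views(point_x_m, point_z_m, ipd_m, display_type):
    """Return the (left-eye, right-eye) screen positions for a display type."""
    left = screen_x(point_x_m, point_z_m, -ipd_m / 2.0)
    right = screen_x(point_x_m, point_z_m, +ipd_m / 2.0)
    if display_type == "stereo-consistent":
        return left, right
    if display_type == "stereo-reversed":
        return right, left  # eye views swapped
    if display_type == "flat-stereo":
        return left, left   # same left-eye view shown to both eyes
    raise ValueError(f"unknown display type: {display_type}")


def screen_disparity(point_x_m, point_z_m, ipd_m, display_type):
    """Right-eye minus left-eye screen position (positive = uncrossed)."""
    left_img, right_img = eye_views(point_x_m, point_z_m, ipd_m, display_type)
    return right_img - left_img


# A point straight ahead and 2.8 m away (behind the screen plane).
d_consistent = screen_disparity(0.0, 2.8, 0.06, "stereo-consistent")
d_reversed = screen_disparity(0.0, 2.8, 0.06, "stereo-reversed")
d_flat = screen_disparity(0.0, 2.8, 0.06, "flat-stereo")
```

Swapping the views leaves the magnitude of each screen disparity unchanged but flips its sign, whereas the flat-stereo display has zero disparity everywhere and so specifies a surface at the screen distance.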

Procedure

Each participant’s static stereoacuity was first measured using the Random Dot Stereo Butterfly Test (the range of disparities in this test allowed us to measure static stereoacuity down to 40 s of arc). After the participant’s eye-separation had been measured using a digital pupillary distance meter, the experimenter began the main vection experiment. Participants were told that they would be shown displays of moving objects and that: ‘sometimes the objects may appear to be moving towards you; at other times you may feel as if you are moving towards the objects. If you feel that you are moving then press the left button of the mouse and hold it down as long as the experience continues’. Participants (who wore 3-D shutter glasses throughout the experiment) were not told about the three different DISPLAY TYPE conditions, as we did not want to introduce any unnecessary demand characteristics.

Each session consisted of three blocks of trials. The first block consisted of vection trials. At the start of this block, the standard stimulus was shown to participants to set the modulus for their vection strength ratings (i.e. via the method of magnitude estimation; Stevens, 1957). This standard was always a binocularly viewed flat-stereo self-motion display simulating a forward speed of 0.92 m/s (see Footnote 4). After a 25-s exposure to this standard stimulus, participants were asked whether they felt they were moving or stationary. If they responded that they felt they were moving, then they were told that the strength of this feeling of self-motion corresponded to a value of ‘5’ (with ‘0’ representing ‘no experience of self-motion’). During each subsequent self-motion display participants pressed the mouse button whenever they experienced vection (and held that button down as long as the experience continued). Following each self-motion display, a rating scale was presented on the screen, which participants used to make their vection strength ratings for that trial. Participants used the ‘up’ and ‘down’ arrow keys on the keyboard to move a horizontally elongated needle along the vertical axis of this rating scale (from ‘0’ to ‘10’ in 0.5 steps) and pressed the ‘enter’ key to record their overall vection strength rating for each trial. Finally, if vection was experienced, the experimenter asked the participant for its perceived direction (i.e. forward/backward) at the end of each trial. There was then a 30-s delay before the next trial could begin. During this time the room lights were turned on in order to prevent dark adaptation. After several practice trials, the experimental vection trials began. Each of the nine DISPLAY TYPE by SPEED conditions was presented twice (presentation order was fully randomised).
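The trial structure of the vection block can be sketched as below: the nine DISPLAY TYPE by SPEED conditions, each repeated twice, in a fully randomised order. The function name and the use of a seeded generator are illustrative assumptions rather than a description of the experimental software.

```python
import itertools
import random

DISPLAY_TYPES = ("stereo-consistent", "stereo-reversed", "flat-stereo")
SPEEDS_MPS = (0.23, 0.46, 0.92)
REPETITIONS = 2


def vection_trial_order(seed=None):
    """Return a fully randomised list of (display type, speed) trials."""
    rng = random.Random(seed)
    trials = list(itertools.product(DISPLAY_TYPES, SPEEDS_MPS)) * REPETITIONS
    rng.shuffle(trials)
    return trials


order = vection_trial_order(seed=1)
```

Each participant therefore completes 18 experimental vection trials, which gives the 324 trials analysed across the 18 participants.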

Following the main vection measurement block, participants ran two additional blocks of trials. Prior to each block, participants were instructed that they would be re-exposed to these different self-motion displays, but this time they would instead rate their perceived scene depth or their perceived speeds of object motion-in-depth (depending on the block; the order of these last two blocks was randomised). Again the first display presented (the 0.92 m/s flat-stereo display) was used to set the modulus for their magnitude estimates. They were told that: (1) this reference display had a perceived scene depth or a perceived motion-in-depth speed (depending on the block) that should be rated as a ‘5’; and (2) ‘0’ represented either a ‘flat display’ or ‘no motion-in-depth’ (depending on the block). Following each 5-s display, the rating scale was again presented on the screen, which participants used to make magnitude estimates of the percept from ‘0’ to ‘10’. Each of the nine DISPLAY TYPE by SPEED conditions was presented twice in each of these two rating blocks.

Results

Vection data

When participants experienced vection, it was always in the direction simulated by the monocular motion signals. That is, forwards (as opposed to backwards) vection was perceived even during the stereo-reversed conditions. Average vection strength ratings and onset latencies were calculated for each participant in each of the nine different experimental conditions (according to convention, non-vection trials were assigned a strength rating of 0 and an onset latency of 25 s). Then separate repeated-measures analyses of variance (ANOVAs) were performed on these averaged vection strength rating and vection onset latency data (Greenhouse-Geisser corrections were applied whenever the assumption of sphericity was violated). When significant main effects of DISPLAY TYPE or SPEED were found, selected pairwise comparisons (two-tailed) were used to examine them further (i.e. stereo-consistent vs. flat-stereo, stereo-reversed vs. flat-stereo, stereo-consistent vs. stereo-reversed; 0.23 m/s vs. 0.46 m/s, 0.46 m/s vs. 0.92 m/s). Although the raw uncorrected p values were reported for each of these comparisons, they were evaluated against Bonferroni-corrected critical α values to determine their significance (i.e. α = 0.0167 was used for the three DISPLAY TYPE comparisons; α = 0.025 for the two SPEED comparisons). Significant DISPLAY TYPE by SPEED interactions were also examined via additional one-way repeated-measures ANOVAs.
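The scoring convention and the corrected α values can be illustrated with a short sketch. It assumes a familywise α of 0.05 (implied by, but not stated alongside, the critical values above) and represents a non-vection trial by a missing latency; the function names are hypothetical.

```python
from statistics import mean

TRIAL_DURATION_S = 25.0  # length of each vection trial


def score_trial(strength_rating, onset_latency_s):
    """Non-vection trials score a strength of 0 and a latency of 25 s."""
    if onset_latency_s is None:
        return 0.0, TRIAL_DURATION_S
    return float(strength_rating), float(onset_latency_s)


def condition_means(trials):
    """Average (strength, latency) over one participant's trials in a condition."""
    scored = [score_trial(rating, latency) for rating, latency in trials]
    return mean(s for s, _ in scored), mean(l for _, l in scored)


def bonferroni_alpha(n_comparisons, family_alpha=0.05):
    """Critical alpha per comparison (family_alpha = 0.05 is an assumption)."""
    return family_alpha / n_comparisons


# One vection trial (rating 6, onset at 5 s) and one non-vection trial.
strength, latency = condition_means([(6.0, 5.0), (0.0, None)])
```

Here `bonferroni_alpha(3)` rounds to the 0.0167 used for the three DISPLAY TYPE comparisons and `bonferroni_alpha(2)` gives the 0.025 used for the two SPEED comparisons.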

Numbers of non-vection trials

Participants reported experiencing vection on 256 of the 324 experimental vection trials tested (i.e. 18 participants each responding twice to the nine different DISPLAY TYPE by SPEED conditions). Of the 68 non-vection trials, nine trials were stereo-consistent (six 0.23 m/s, three 0.46 m/s and no 0.92 m/s trials), 36 trials were flat-stereo (twenty-one 0.23 m/s, eleven 0.46 m/s and four 0.92 m/s trials) and 23 trials were stereo-reversed (eleven 0.23 m/s, nine 0.46 m/s and three 0.92 m/s trials). Wilcoxon signed-rank tests were used to examine the numbers of non-vection trials in the three DISPLAY TYPE conditions across the 18 participants (as these data were not normally distributed). These tests revealed that: (1) there were significantly fewer non-vection trials in the stereo-consistent conditions than in the flat-stereo conditions (Z = -2.965, p = 0.003); and (2) the numbers of non-vection trials in the stereo-reversed conditions were not significantly different to those in the flat-stereo conditions (Z = -1.652, p = 0.099) (p values were evaluated against a Bonferroni-corrected critical α of 0.025).

Vection strength

We found a main effect of DISPLAY TYPE on vection strength ratings, F(2,34) = 16.496, p < 0.0001, partial η2 = 0.492 (see Fig. 1, left). Pairwise comparisons on this main effect revealed that on average: (1) the stereo-consistent conditions (M = 5.0) produced significantly stronger vection ratings than the flat-stereo conditions (M = 1.9) (p < 0.0001); (2) the stereo-reversed conditions (M = 3.4) also produced significantly stronger vection ratings than the flat-stereo conditions (p = 0.002); and (3) the stereo-consistent conditions did not produce significantly stronger vection ratings than the stereo-reversed conditions (uncorrected p = 0.02; note that a Bonferroni-corrected critical α of 0.0167 for three comparisons was used). We also found a main effect of SPEED on vection strength ratings, F(2,34) = 76.245, p < 0.0001, partial η2 = 0.818 (see Fig. 1, left). Pairwise comparisons on this main effect revealed that on average: (1) the 0.92 m/s speed conditions (M = 5.0) produced significantly stronger vection ratings than the 0.46 m/s speed conditions (M = 3.1) (p = 0.0001); and (2) the 0.46 m/s conditions produced significantly stronger vection ratings than the 0.23 m/s conditions (M = 2.2) (p < 0.002). The interaction between DISPLAY TYPE and SPEED was not significant, F(4,68) = 1.681, p = 0.165, partial η2 = 0.09.
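For readers who wish to check the effect sizes, partial η2 can be recovered from an F ratio and its degrees of freedom using the standard identity partial η2 = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the vection strength statistics reported above; the function name is ours.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


display_type_effect = partial_eta_squared(16.496, 2, 34)
speed_effect = partial_eta_squared(76.245, 2, 34)
interaction_effect = partial_eta_squared(1.681, 4, 68)
```

These values round to 0.492, 0.818 and 0.09 respectively, matching the reported effect sizes.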

Fig. 1

Effects of DISPLAY TYPE (Stereo-Consistent, Stereo-Reversed or Flat-stereo) and SPEED (0.23, 0.46 or 0.92 m/s) on the vection strength ratings (left) and onset latencies (right) induced by same-size patterns of radially expanding optic flow. Error bars depict standard errors of the mean (SEMs)

In the above analysis, trials where participants did not experience vection were rated as having a vection strength of ‘0’. As was noted in the previous section, flat-stereo conditions were significantly more likely to have vection ratings of ‘0’. Only seven of our 18 participants experienced some vection on all trials. Unfortunately this sample was too small to investigate whether the stereo-reversed advantage for vection strength was due to differences in the number of non-vection trials. However, for 15 of our participants, vection was experienced on all of the trials simulating the fastest speed of self-motion. We therefore ran a repeated-measures ANOVA that compared the strength ratings of these 15 participants on the 0.92 m/s trials. Although all of these trials had non-zero vection ratings, we still found a significant main effect of DISPLAY TYPE on vection strength, F(2,28) = 27.423, p < 0.0001. Post hoc comparisons confirmed that stereo-consistent conditions (M = 6.767, SE = 0.520) still induced stronger vection experiences than flat-stereo conditions (M = 3.417, SE = 0.514) (p < 0.0001), and stereo-reversed conditions (M = 5.217, SE = 0.563) still induced stronger vection experiences than flat-stereo conditions (p < 0.0001). Stereo-consistent conditions also continued to induce stronger vection experiences than stereo-reversed conditions (p = 0.003).

Vection onset latency

Prior to reporting the results of the repeated-measures ANOVA on vection onset latency, we first confirmed that the unstandardized residuals of this data followed a normal distribution, Kolmogorov-Smirnov D(162) = 0.054, p = 0.2. The repeated-measures ANOVA revealed a significant main effect of DISPLAY TYPE on vection onset latencies, F(2,34) = 5.382, p = 0.009, partial η2 = 0.24 (see Fig. 1, right). Pairwise comparisons on this main effect revealed that on average: (1) stereo-consistent conditions (M = 8.1 s) produced significantly shorter vection onset latencies than the flat-stereo conditions (M = 12.9 s) (p = 0.003); (2) the stereo-reversed conditions (M = 11.7 s) did not produce significantly different vection onset latencies to the flat-stereo conditions (M = 12.9 s) (p = 0.41); and (3) the stereo-consistent conditions did not produce significantly different vection onset latencies to the stereo-reversed conditions (uncorrected p = 0.049; note that a Bonferroni-corrected critical α of 0.0167 for three comparisons was used). We also found a main effect of SPEED on vection onset latencies, F(2,34) = 53.372, p < 0.0001, partial η2 = 0.758. Pairwise comparisons on this main effect revealed that on average: (1) the 0.92 m/s speed conditions (M = 5.6 s) produced significantly shorter vection latencies than the 0.46 m/s speed conditions (M = 11.8 s) (p < 0.0001); and (2) the 0.46 m/s conditions produced significantly shorter vection onset latencies than the 0.23 m/s conditions (M = 15.4 s) (p < 0.0001). We also found a significant interaction between DISPLAY TYPE and SPEED, F(4,68) = 5.144, p = 0.001, partial η2 = 0.232. Inspection of the interaction plot (see Fig. 1, right) suggested that the stereo-reversed conditions might have reduced vection onset latencies (relative to flat-stereo conditions) but only for the slowest speed of self-motion tested (M = 15.1 s compared to M = 20.0 s). 
To investigate this possibility further, we first calculated each participant’s signed stereo-reversed effects on vection onset latency for each of the simulated speeds (i.e. relative to the flat-stereo conditions). We then conducted two additional paired-samples t-tests (two-tailed) on this newly calculated data. We found that stereo-reversed effects on latency were significantly greater for the 0.23 m/s speed (M = -4.9 s) than for the 0.46 m/s speed (M = 1.73 s) (t(17) = -3.9214, p = 0.001). However, stereo-reversed effects were not significantly different for the 0.46 m/s (M = 1.73 s) and 0.92 m/s speeds (M = -0.582 s) (t(17) = 1.279, p = 0.218).
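The follow-up analysis can be sketched as two steps: compute each participant's signed effect (stereo-reversed latency minus flat-stereo latency) at each speed, then compare the effects at two speeds with a paired-samples t statistic. The latencies below are hypothetical values for four participants, chosen only to show the calculation, and the function names are ours.

```python
import math
from statistics import mean, stdev


def signed_effects(reversed_latencies_s, flat_latencies_s):
    """Per-participant stereo-reversed effect (negative = earlier vection onset)."""
    return [r - f for r, f in zip(reversed_latencies_s, flat_latencies_s)]


def paired_t(sample_a, sample_b):
    """Paired-samples t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    t_value = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t_value, n - 1


# Hypothetical onset latencies (s) for four participants at two speeds.
effects_slow = signed_effects([14.0, 15.0, 16.0, 13.0], [20.0, 19.0, 21.0, 18.0])
effects_medium = signed_effects([12.0, 13.0, 11.0, 12.0], [11.0, 12.0, 10.0, 10.0])
t_value, df = paired_t(effects_slow, effects_medium)
```

A negative t value in this sketch indicates that the stereo-reversed displays shortened onset latencies more at the slower speed than at the faster one, which is the pattern reported for the 0.23 m/s versus 0.46 m/s comparison.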

Other rating data

Similar analyses to those used for the vection data were conducted on the scene depth and motion-in-depth speed rating data.

Scene depth

We found a main effect of DISPLAY TYPE on scene depth ratings, F(2, 34) = 31.120, p < 0.0001, partial η2 = 0.647 (see Fig. 2, left). Pairwise comparisons on this main effect revealed that on average: (1) the stereo-consistent conditions (M = 5.5) produced significantly greater scene depth ratings than the flat-stereo conditions (M = 2.4) (p < 0.0001); (2) the stereo-consistent conditions also produced significantly greater scene depth ratings than the stereo-reversed conditions (M = 4.0) (p = 0.001); and (3) these stereo-reversed conditions also produced significantly greater scene depth ratings than the flat-stereo conditions (p < 0.0001). We also found a main effect of SPEED on scene depth ratings, F(1.436, 24.411) = 15.767, p < 0.0001, partial η2 = 0.481. Pairwise comparisons on this main effect revealed that on average: (1) the 0.92 m/s conditions (M = 4.6) produced significantly greater scene depth ratings than the 0.46 m/s conditions (M = 3.9) (p = 0.001); and (2) the 0.46 m/s conditions produced significantly greater scene depth ratings than the 0.23 m/s conditions (M = 3.4) (p = 0.02). The two-way interaction between DISPLAY TYPE and SPEED was not significant, F(4,68) = 0.637, p = 0.638, partial η2 = 0.036.

Fig. 2

Effects of DISPLAY TYPE (Stereo-Consistent, Stereo-Reversed or Flat-stereo) and SPEED (0.23, 0.46 or 0.92 m/s) on the scene depth ratings (left) and motion-in-depth (MID) speed ratings (right) produced by same-size patterns of radially expanding optic flow. Error bars depict standard errors of the mean (SEMs)

Motion-in-depth speed

We found a main effect of DISPLAY TYPE on motion-in-depth speed ratings, F(1.352, 22.987) = 4.868, p = 0.028, partial η2 = 0.223 (see Fig. 2, right). Pairwise comparisons on this main effect revealed that on average: (1) the stereo-consistent conditions (M = 4.3) produced significantly faster motion-in-depth ratings than the stereo-reversed conditions (M = 3.5) (p = 0.01); (2) the stereo-consistent conditions did not produce significantly different motion-in-depth ratings to the flat-stereo conditions (M = 3.3) (p = 0.037; note that a Bonferroni-corrected critical α of 0.0167 for three comparisons was used); and (3) the stereo-reversed conditions did not produce significantly different motion-in-depth ratings than the flat-stereo conditions (p = 0.48). We also found a main effect of SPEED on motion-in-depth speed ratings, F(1.367, 23.233) = 60.993, p < 0.0001, partial η2 = 0.782. Pairwise comparisons on this main effect revealed that on average: (1) the 0.92 m/s conditions (M = 5.5) produced significantly faster motion-in-depth speed ratings than the 0.46 m/s conditions (M = 3.5) (p < 0.0001); and (2) the 0.46 m/s conditions produced significantly faster motion-in-depth speed ratings than the 0.23 m/s conditions (M = 2.2) (p < 0.0001). The two-way interaction between DISPLAY TYPE and SPEED was not significant, F(4,68) = 1.15, p = 0.341, partial η2 = 0.063.
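
The Bonferroni decision rule used for these pairwise comparisons can be illustrated with a short sketch. It assumes a family-wise α of 0.05 (the text reports only the corrected value of 0.0167 for three comparisons) and uses the three p values quoted above for the DISPLAY TYPE comparisons; the names are illustrative.

```python
FAMILY_ALPHA = 0.05  # assumed family-wise alpha; only the corrected value is reported

def bonferroni_alpha(family_alpha, n_comparisons):
    """Critical alpha for each test after Bonferroni correction."""
    return family_alpha / n_comparisons

# p values reported in the text for motion-in-depth speed ratings
reported_p = {
    "stereo-consistent vs stereo-reversed": 0.01,
    "stereo-consistent vs flat-stereo": 0.037,
    "stereo-reversed vs flat-stereo": 0.48,
}

critical = bonferroni_alpha(FAMILY_ALPHA, len(reported_p))
decisions = {label: p < critical for label, p in reported_p.items()}

for label, significant in decisions.items():
    verdict = "significant" if significant else "not significant"
    print(f"{label}: p = {reported_p[label]} -> {verdict} (critical alpha = {critical:.4f})")
```

Under this rule only the first comparison falls below the corrected threshold, which matches the pattern described above.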

Relationships between the stereoscopic effects on vection, scene depth and motion-in-depth speed

The above analyses revealed significant stereo-consistent and stereo-reversed effects on vection strength (compared to flat-stereo conditions). In order to further investigate the origins of these stereoscopic vection advantages, we calculated the average stereo-consistent and stereo-reversed effects for each participant for the following three dependent variables: vection strength, scene depth and motion-in-depth speed. These effects were calculated by first averaging each participant’s ratings across SPEED for each DISPLAY TYPE and then subtracting the average ratings for flat-stereo conditions from those for the stereo-consistent conditions or the stereo-reversed conditions in each case. We then conducted correlational analyses to examine the relationships between these different stereo-consistent or stereo-reversed effects.
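
The two-step calculation described above (average across SPEED, then subtract the flat-stereo average) and the subsequent correlation can be sketched as follows. The ratings are hypothetical values for four made-up participants, and the helper names `stereo_effect` and `pearson_r` are illustrative rather than part of the original analysis.

```python
from math import sqrt
from statistics import mean

def stereo_effect(ratings_by_display, display, baseline="flat-stereo"):
    """Average ratings across speeds, then subtract the flat-stereo average."""
    return mean(ratings_by_display[display]) - mean(ratings_by_display[baseline])

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical ratings: one dict per participant, three speeds per display type.
vection = [
    {"stereo-consistent": [4, 5, 7], "flat-stereo": [2, 3, 4]},
    {"stereo-consistent": [3, 4, 5], "flat-stereo": [3, 3, 4]},
    {"stereo-consistent": [6, 7, 8], "flat-stereo": [2, 2, 3]},
    {"stereo-consistent": [2, 3, 4], "flat-stereo": [2, 3, 3]},
]
mid_speed = [
    {"stereo-consistent": [3, 4, 6], "flat-stereo": [2, 3, 5]},
    {"stereo-consistent": [2, 3, 5], "flat-stereo": [2, 3, 5]},
    {"stereo-consistent": [4, 5, 7], "flat-stereo": [2, 3, 5]},
    {"stereo-consistent": [2, 3, 5], "flat-stereo": [2, 4, 5]},
]

vection_effects = [stereo_effect(p, "stereo-consistent") for p in vection]
speed_effects = [stereo_effect(p, "stereo-consistent") for p in mid_speed]
r = pearson_r(vection_effects, speed_effects)
print(f"r = {r:.3f}")
```

The same functions apply to the stereo-reversed effects by passing a different `display` key, provided the dictionaries contain ratings for that display type.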

We found that there were significant positive correlations between stereo-consistent effects on: (1) vection strength and motion-in-depth speed; (2) vection strength and perceived scene depth; and (3) motion-in-depth speed and perceived scene depth (see Table 1). The top plots in Fig. 3 show the relationship between the stereo-consistent effects on vection strength and perceived scene depth (top-left), as well as the relationship between the stereo-consistent effects on vection strength and motion-in-depth speed (top-right).

Table 1 Pearson correlation matrix of stereo-consistent effects on vection strength, scene depth and motion-in-depth speed
Fig. 3

Plots showing the relationships between stereoscopic effects on vection strength and perceived scene depth ratings (left) and between these effects on vection strength and motion-in-depth (MID) speed ratings (right). Stereo-consistent effects are shown in the top-left and top-right plots. Stereo-reversed effects are shown in the bottom-left and bottom-right plots. A positive value along each axis represents a stereoscopic or pseudoscopic advantage for that particular percept

We also found a significant positive correlation between the effects of stereo-reversed information on vection strength and motion-in-depth speed (see Table 2). However, we did not find significant correlations between the effects of stereo-reversed information on: (1) vection strength and perceived scene depth; or (2) motion-in-depth speed and perceived scene depth (see Table 2). The bottom plots in Fig. 3 show the relationship between the stereo-reversed effects on vection strength and perceived scene depth (bottom-left), and also the relationship between the stereo-reversed effects on vection strength and motion-in-depth speed (bottom-right).

Table 2 Pearson correlation matrix of stereo-reversed effects on vection strength, scene depth and motion-in-depth speed

Discussion

As expected, stereo-consistent conditions were found to induce superior vection to flat-stereo conditions (as indicated by their stronger vection ratings and shorter vection onsets). A stereo-consistent advantage was displayed by all 18 participants for vection strength and by 13 of the 18 participants for vection onset latency (as indicated by larger average strength ratings and shorter average onset latencies for stereo-consistent compared to flat-stereo conditions).

However, in addition to these stereo-consistent advantages for vection, we also found evidence of a stereo-reversed advantage for vection strength. Swapping the left and right eye views should have placed binocular and monocular information in these self-motion displays into direct conflict (according to most cue combination models/frameworks this would have been expected to result in either weaker vection or binocular self-motion information being ignored). Even so, these stereo-reversed conditions were still found to induce significantly stronger vection ratings than flat-stereo conditions for all of the simulated speeds tested (i.e. 0.23, 0.46 and 0.92 m/s). This stereo-reversed advantage for vection strength was driven by the ratings of 15 of the 18 participants (with the three remaining participants having larger vection strength ratings on average for flat-stereo conditions).

In principle, this stereo-reversed advantage for vection strength ratings could have been due to: (1) fewer non-vection trials in stereo-reversed conditions than in flat-stereo conditions; (2) stronger vection ratings in stereo-reversed conditions than in flat-stereo conditions; or (3) both factors acting together. Consistent with possibility number 1, we found that the flat-stereo conditions were significantly less likely to induce vection than the stereo-reversed conditions in this experiment. However, when this particular difference was accounted for, stereo-reversed conditions were still rated as inducing stronger vection than the flat-stereo conditions. Thus, when the findings of this experiment are taken together the most likely explanation for the stereo-reversed advantage for vection strength appears to be possibility number 3.

We also found a stereo-reversed advantage for vection onset latency for the slowest simulated speed of self-motion (0.23 m/s). For the 0.23 m/s displays, ten participants had shorter onset latencies for the stereo-reversed condition than the flat-stereo condition, five participants had shorter onset latencies for the flat-stereo condition, and the remaining three participants had similar onset latencies for both stereo-reversed and flat-stereo conditions. One possible explanation for this stereo-reversed effect on vection onset latency might be that the 0.23 m/s displays provided weaker monocular motion signals than the faster 0.46 m/s and 0.92 m/s displays. Thus, there might have been more opportunity for the stereoscopic information (consistent/reversed) to improve the vection-in-depth induced by the 0.23 m/s displays.

Stereo-reversed conditions were often reported to look like fast visual motion viewed through a mud-splattered windscreen (this ‘windscreen’ was perceived to move along with the observer). In these stereo-reversed conditions, the stereo-defined ‘far objects’ appeared to be moving very quickly because they were actually simulated to be near to the observer (via their monocular motion perspective). The change in disparity over time for these ‘far’ objects would also have been large, indicating fast motion (although their stereo-motion also had the opposite sign to what would have been expected for an approaching object). A number of participants spontaneously reported that they felt that their vection was driven primarily by these fast-moving objects and that the slower moving objects (perceived to be near) had little impact on their experiences. Nevertheless, the stationary foreground objects reported by the participants (the dirty ‘windscreen’) might also have enhanced the vection percept (see Howard & Howard, 1994; Howard & Heckman, 1989; Nakamura & Shimojo, 1999; Riecke et al., 2004).

If one assumes that stereo-consistent and stereo-reversed advantages for vection had the same underlying mechanism/s, then they would appear to be better explained by stereoscopic effects on perceived motion-in-depth (as opposed to perceived scene depth). Stereo-consistent effects on vection strength were found to correlate significantly with stereo-consistent effects on both perceived motion-in-depth speed and perceived scene depth. However, stereo-reversed effects on vection strength were only found to correlate significantly with stereo-reversed effects on perceived motion-in-depth speed (not with stereo-reversed effects on perceived scene depth).

While stereo-consistent information was found to significantly alter perceived motion-in-depth speed and vection-in-depth, there appeared to be little difference in the mean motion-in-depth speed ratings for the stereo-reversed and flat-stereo conditions (see Fig. 2, right). How then could perceived motion-in-depth be responsible for the vection advantages found for the stereo-reversed conditions? In Fig. 3 bottom-right we can see that on average stereo-reversed information: (1) increased perceived motion-in-depth speeds for seven participants; (2) decreased perceived motion-in-depth speeds for six other participants; and (3) had little effect on perceived motion-in-depth speeds for the remaining five participants. The relative effects of stereo-reversed information on vection-in-depth were quite similar. That is, stereo-reversed information increased vection more if it also increased the perceived motion-in-depth speed. Stereo-reversed information still increased vection, but by less, if it had no effect on perceived motion-in-depth speed or if it decreased this perceived speed. Thus, one potential explanation for the current findings was that stereo-reversed information provided a general benefit to vection (relative to flat-stereo conditions, e.g. by triggering binocular motion-in-depth processing), and that stereo-consistent (and sometimes also stereo-reversed) information could provide further vection benefits beyond this (e.g. by increasing the perceived speed of motion-in-depth).

Experiment 2: Motion adaptation during stereo-consistent, stereo-reversed and flat-stereo same-size flow

Most available evidence suggests that the mechanisms underlying the stereoscopic advantage for vection-in-depth are based on motion-in-depth processing. Experiment 2 therefore directly compared the motion-in-depth processing and vection triggered by the stereo-consistent, stereo-reversed and flat-stereo self-motion displays used in Experiment 1. Prolonged exposure to these displays (which all simulate constant velocity self-motion-in-depth) should result in neural motion adaptation, which would normally be expected to reduce the experience of vection over time (e.g., Kim & Khuu, 2014; Kim & Palmisano, 2011; Palmisano, Gillam, & Blackburn, 2000; Seno, Palmisano & Ito, 2011). However, while the flat-stereo self-motion displays in Experiment 1 only provided monocular motion signals, stereo-consistent and stereo-reversed self-motion displays also provided stereo-motion cues (i.e. changing-disparities-over-time and interocular-velocity-differences). Thus, in order to determine the degree of motion-in-depth adaptation generated by these three different types of display, we measured the durations of their motion after-effects (MAEs; Wohlgemuth, 1911). On each trial in this experiment, the optic flow ceased after 25 s and the now stationary dots remained on the screen until the observer indicated that the MAEs had been extinguished. Palmisano et al. (2016b) previously found that adding stereo-consistent information to radially expanding optic flow actually increased MAE durations. Presumably this was because their observers were adapting to both stereoscopic and monocular motion signals in stereo-consistent conditions, but only to monocular motion signals in flat-stereo conditions. Here we examine how observers adapt to the conflicting binocular and monocular motion signals in stereo-reversed self-motion displays for the first time.

Method

The apparatus used was identical to that of Experiment 1. There were, however, some important differences in the methods between these experiments. Unlike Experiment 1, only a single block of nine trials was tested (with much longer 5-min intertrial intervals). Each of these trials consisted of two specific phases: (1) an initial motion adaptation phase, where participants were shown either a stereo-consistent, stereo-reversed or flat-stereo pattern of radially expanding optic flow; and then (2) a subsequent MAE test phase, where all motion ceased and the dot objects remained stationary on the screen until the MAE was extinguished. Two different dependent measures were recorded for each trial: (1) the duration of the motion-in-depth aftereffect (i.e. the MAE duration); and subsequently (2) an overall vection strength rating for that trial.

Participants

Eight male and 12 female psychology students and staff at the University of Wollongong participated in this experiment (mean age 24.6 years; SD 8.7 years).Footnote 5 None of them had participated in Experiment 1. All had static stereoacuities of 40 s of arc or better, and an average inter-pupillary distance of 6.2 cm (SD 0.3 cm). All of the other participant details were the same as those of the previous experiment.

Visual displays

Displays during the motion adaptation phase were identical (in all but one respect) to the self-motion displays examined in Experiment 1 (they were stereo-consistent, stereo-reversed and flat-stereo patterns of same-size optic flow). As noted above, the blue objects also remained on the screen during the MAE test phase. However, they did not move at all during this phase of the trial. In the stereo-consistent and stereo-reversed trials, the displays were stereoscopic during both adaptation and test phases (with both static and dynamic stereoscopic information available during the adaptation phase and only static information available during the test phase). In the flat-stereo trials, displays were flat-stereo during both adaptation and test phases (providing only monocular motion perspective information during the adaptation phase, and only relative luminance information about spatial layout during the test phase). Unlike Experiment 1, all of the displays tested had a stationary, central fixation target, which subtended a visual area of 0.29° wide by 0.61° high and had a luminance of 35 cd/m2. This target was simulated to lie 1.4 m directly in front of the observer (i.e. at the same distance as the screen). The purpose of this fixation target was to reduce MAE measurement noise by minimising eye-movements during the motion adaptation phase (see Kim & Khuu, 2014; Palmisano et al., 2016b; Seno et al., 2011).

Procedure

Each trial in this experiment consisted of two phases. During the initial motion adaptation phase, the participant was exposed to same-size optic flow for 25 s and then all display motion ceased, leaving a static dot pattern on the screen for the MAE test phase. During this test phase, the now stationary dots remained visible until the participant indicated that the MAE had been completely extinguished. Then this screen was replaced with a rating scale, which was used to rate the strength of the vection experienced during the earlier motion adaptation phase. The participant was instructed as follows: “You will be shown a variety of displays simulating self-motion. During this period please maintain your fixation on the white target located in the middle of the display. After 25 s, all physical motion in the display will cease. At this time, your task is as follows: press the left mouse button when/if you perceive any motion and hold it down as long as this illusory motion continues. If such a decision becomes difficult, or if this perception of motion disappears, please release the mouse button” (instructions modified from Seno, Ito, & Sunaga, 2010; see also Palmisano et al., 2016b, Experiment 2). Before releasing the mouse button, participants were asked to double check that their MAEs for that trial had been completely extinguished by blinking.Footnote 6 Directly after recording their MAE duration, participants were asked to verbally rate the strength of their vection (from 0–10; the standard stimulus for these ratings was the same as that used in Experiment 1).

Results

The vection strength data, as well as the MAE duration data, were analysed in a similar fashion to Experiment 1. When participants experienced vection during the motion adaptation phase, it was again always in the direction simulated by the monocular motion signals. That is, forwards (as opposed to backwards) vection was perceived during exposure to the optic flow even in the stereo-reversed conditions.

Numbers of non-vection trials

Participants reported experiencing vection on 162 of the 180 experimental trials tested (i.e. 20 participants each responding once to the nine different DISPLAY TYPE by SPEED conditions). Of the 18 non-vection trials, one trial was stereo-consistent (a 0.23 m/s trial), ten trials were flat-stereo (six 0.23 m/s, two 0.46 m/s and two 0.92 m/s trials) and seven trials were stereo-reversed (two 0.23 m/s, four 0.46 m/s and one 0.92 m/s trials). Wilcoxon signed-rank tests were used to examine the numbers of non-vection trials in the three DISPLAY TYPE conditions across the 20 participants (as these data were not normally distributed). These tests revealed that the numbers of non-vection trials were not significantly different in: (1) the stereo-consistent conditions and the flat-stereo conditions (Z = -2.124, p = 0.034); and (2) the stereo-reversed conditions and the flat-stereo conditions (Z = -1.134, p = 0.257) (p values were evaluated against a Bonferroni-corrected critical α of 0.025).

Vection strength

We found a main effect of DISPLAY TYPE on vection strength ratings, F(1.553,29.514) = 9.629, p < 0.001, partial η2 = 0.336 (see Fig. 4, left). Pairwise comparisons on this main effect revealed that on average: (1) the stereo-consistent conditions (M = 4.2) produced significantly stronger vection ratings than the flat-stereo conditions (M = 2.1) (p = 0.001); (2) the stereo-reversed conditions (M = 3.0) also produced significantly stronger vection ratings than the flat-stereo conditions (p = 0.016); and (3) the stereo-consistent conditions did not produce significantly stronger vection ratings than the stereo-reversed conditions (p = 0.03; note that a Bonferroni-corrected critical α of 0.0167 for three comparisons was used). We also found a main effect of SPEED on vection strength ratings, F(2,38) = 9.179, p = 0.001, partial η2 = 0.326. Pairwise comparisons on this main effect revealed that on average: (1) the 0.92 m/s speed conditions (M = 3.5) produced significantly stronger vection ratings than the 0.46 m/s speed conditions (M = 3.0) (p = 0.009); and (2) the 0.46 m/s conditions did not produce significantly stronger vection ratings than the 0.23 m/s conditions (M = 2.8) (p = 0.16). The interaction between DISPLAY TYPE and SPEED was also significant, F(2.395,45.503) = 3.345, p = 0.036, partial η2 = 0.150 (see Fig. 4, left). To investigate this interaction further, we first calculated each participant’s signed stereo-consistent and stereo-reversed effects for vection strength for each of the simulated speeds tested (i.e. relative to flat-stereo conditions). We then conducted separate one-way repeated-measures ANOVAs on this data. We found that SPEED significantly increased stereo-consistent effects on vection strength, F(1.326,25.197) = 7.171, p = 0.008, partial η2 = 0.274, with the average size of the stereo-consistent advantage for vection strength increasing consistently with the speed from +1.29 (at 0.23 m/s) to +2.96 (at 0.92 m/s). However, SPEED did not significantly alter the stereo-reversed effects on vection strength, F(2,38) = 1.094, p = 0.345, partial η2 = 0.054.

Fig. 4

Effects of DISPLAY TYPE (Stereo-Consistent, Stereo-Reversed or Flat-stereo) and SPEED (0.23, 0.46 or 0.92 m/s) on vection strength ratings (left) and motion aftereffect (MAE) durations (right) produced by same-size patterns of radially expanding optic flow. Error bars depict standard errors of the mean (SEMs)

MAE duration

Participants typically experienced MAEs as illusory object or scene motion-in-depth. We found a main effect of DISPLAY TYPE on MAE durations, F(2,38) = 8.071, p = 0.001, partial η2 = 0.298 (see Fig. 4, right). Pairwise comparisons on this main effect revealed that on average: (1) the stereo-consistent conditions (M = 9.0 s) produced significantly longer MAE durations than the flat-stereo conditions (M = 4.9 s) (p = 0.001); (2) the stereo-reversed conditions (M = 7.4 s) also produced significantly longer MAE durations than the flat-stereo conditions (p = 0.012); and (3) the stereo-consistent conditions did not produce significantly different MAE durations than the stereo-reversed conditions (p = 0.115). The main effect of SPEED and the interaction between DISPLAY TYPE and SPEED were both found to be non-significant, F(2,38) = 1.846, p = 0.172, partial η2 = 0.089 and F(4,76) = 1.499, p = 0.211, partial η2 = 0.073, respectively.

Relationship between vection strength and motion aftereffect duration

The ANOVAs in the previous section revealed stereo-consistent and stereo-reversed advantages for vection strength (compared to the flat-stereo conditions). In order to investigate the origins of these vection advantages, we calculated the average stereo-consistent and stereo-reversed effects on vection strength and MAE duration for each participant. These effects were calculated by first averaging each participant’s ratings across SPEED for each DISPLAY TYPE and then subtracting the average ratings for flat-stereo conditions from those for the stereo-consistent conditions or the stereo-reversed conditions in each case. We then conducted separate correlational analyses to examine the relationships between these different stereo-consistent and stereo-reversed effects. We found that the relationship between the stereo-consistent effects on vection strength and MAE duration was significant, r(20) = 0.786, p < 0.0001 (see Fig. 5, left). We also found that the relationship between the stereo-reversed effects on vection strength and MAE duration was significant, r(20) = 0.751, p < 0.0001 (see Fig. 5, right).

Fig. 5

Plots showing (1) the relationship between stereo-consistent effects on vection strength ratings and motion aftereffect (MAE) durations (left); and (2) the relationship between stereo-reversed effects on vection strength ratings and MAE durations (right). A positive value along each axis represents a stereoscopic or pseudoscopic advantage for that particular percept

Discussion

As in Experiment 1, while stereo-consistent conditions produced the strongest vection ratings, stereo-reversed conditions still induced stronger vection-in-depth on average than flat-stereo conditions. A stereo-consistent advantage for vection strength was displayed by 15 of the 20 participants, and a stereo-reversed advantage for vection strength was displayed by 14 of these participants (as indicated by larger average strength ratings for these stereo, compared to flat-stereo, conditions). It is noteworthy that the stereo-reversed advantage for vection strength persisted in this experiment even though there was not a significant difference in the number of non-vection trials between the stereo-reversed and flat-stereo conditions (unlike Experiment 1).

Consistent with a common motion-in-depth based mechanism for both types of stereoscopic vection advantage, stereo-consistent and stereo-reversed information were both found to increase MAE durations compared to flat-stereo conditions (stereoscopic increases in MAE duration were displayed by 16 out of the 20 participants in both cases). Stereo-consistent effects on vection strength were also positively correlated with stereo-consistent effects on MAE duration. Similarly, stereo-reversed effects on vection strength were also positively correlated with stereo-reversed effects on MAE duration.

Interestingly, vection strength in flat-stereo conditions did not appear to be particularly affected by simulated speed in this experiment. By contrast, vection strength in these conditions did increase significantly with the simulated speed in Experiment 1. Given that the only difference between the displays used in these two experiments was the inclusion of a stationary fixation target, we speculate that its presence might have selectively impaired vection during these flat-stereo conditions. While the simulated location in depth of this stationary target object was poorly specified during flat-stereo conditions, it was clearly located at the same distance as the screen during the stereo-consistent and stereo-reversed conditions (as it had zero horizontal screen disparity). Alternatively, it is possible that the apparent lack of an effect of simulated speed in this experiment was a consequence of the longer delays before rating vection. The vection strength ratings in Experiment 1 were made directly after the display motion had ceased. However, in Experiment 2, these vection strength ratings were made from memory only after the MAE had been completely extinguished (7.1 s on average after the display motion had ceased). So it is possible that the effects of SPEED on vection were less salient and memorable than the DISPLAY TYPE effects (which were clearly present in both Experiment 1 and Experiment 2).

Experiment 3: Does the stereo-reversed advantage for vection persist for changing-size flow?

As noted previously, motion perspective does not appear to be the only monocular source of information about self-motion in depth. The self-motion displays used in Experiments 1 and 2 did not contain changing-size cues to motion-in-depth. Here we examine whether stereo-consistent and stereo-reversed advantages for vection-in-depth persist when changing-size cues are added to the optic flow. These additional cues in changing-size optic flow mimicked the natural perspective changes in retinal object size that normally occur during real-world self-motions. Previously Regan and colleagues found that perceptions of motion-in-depth induced by stereoscopic motion could be cancelled by changing-size cues, and vice versa (e.g. Regan & Beverley, 1979; Regan, Beverley, & Cynader, 1979). They concluded that dynamic stereoscopic and changing-size based information converge at the same motion-in-depth processing centre. This suggests that both types of motion-in-depth information might be similarly effective for enhancing vection. Consistent with this notion, adding changing-size cues to optic flow has been previously shown to induce more compelling vection than same-size optic flow (Palmisano, 1996; Palmisano & Chan, 2004; Seya & Shinoda, 2018). In fact, Seya and Shinoda (2018) found that stereoscopic patterns of same-size optic flow and non-stereoscopic patterns of changing-size optic flow produced very similar vection in terms of their rated strength, onset latency and duration. Based on these findings, one might expect that both stereo-consistent and stereo-reversed effects on vection-in-depth should be reduced in Experiment 3 (compared to Experiments 1 and 2, which tested these stereo effects on same-size optic flow).Footnote 7 However, Seya and Shinoda (2018) also found that displays which contained both stereo-consistent and changing-size cues produced superior vection to that induced by displays with only one of these cues. So it is possible that these stereoscopic advantages for vection-in-depth might be robust to the addition of changing-size cues.

Method

The design, apparatus and procedure were identical to those of Experiment 1. The visual displays were also identical to those used in Experiment 1, with one exception: Each of the objects in the 3-D cloud environments was circular and simulated to have a finite size (1.5 cm in diameter) (please see Supplementary Material, Movie 2: Changing-sizeStereoMovie.avi). As a result, the optical sizes of these objects increased from 0.14° up to 2.5° in visual angle as the observer was simulated to approach them (just as they would have during real-world self-motion). In these changing-size patterns of optic flow, depth order was unambiguously specified by both relative size and relative motion (i.e. objects with larger relative image sizes, larger rates of change in size, and faster relative optical velocities should have appeared to be nearer to the observer – at least during the flat-stereo conditions; see Braunstein & Andersen, 1981; Hochberg & Hochberg, 1952).
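
The changing-size cue follows from the geometry of visual angle. As a rough sketch, assuming each object is viewed face-on so that its angular size is 2·atan(diameter / (2·distance)), the code below computes the optical size of a 1.5 cm object at a given distance and inverts that relation to estimate the simulated distances implied by the 0.14° and 2.5° limits quoted above. These implied distances are derived here for illustration and are not stated in the text; the function names are likewise illustrative.

```python
from math import atan, degrees, radians, tan

OBJECT_DIAMETER_M = 0.015  # 1.5 cm, as stated in the text

def visual_angle_deg(size_m, distance_m):
    """Angular size (degrees) of a face-on object of a given physical size."""
    return degrees(2 * atan(size_m / (2 * distance_m)))

def distance_for_angle_m(size_m, angle_deg):
    """Distance (m) at which an object of a given size subtends angle_deg."""
    return size_m / (2 * tan(radians(angle_deg) / 2))

far = distance_for_angle_m(OBJECT_DIAMETER_M, 0.14)
near = distance_for_angle_m(OBJECT_DIAMETER_M, 2.5)
print(f"0.14 deg -> {far:.2f} m, 2.5 deg -> {near:.2f} m")

# Optical size grows as the simulated distance shrinks.
for distance in (6.0, 3.0, 1.5, 0.75):
    print(f"{distance:.2f} m: {visual_angle_deg(OBJECT_DIAMETER_M, distance):.2f} deg")
```

Under this assumption, the quoted range of optical sizes corresponds to simulated distances of roughly 6.1 m down to roughly 0.34 m, and halving the distance approximately doubles the optical size.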

Participants

Six male and 12 female psychology students and staff at the University of Wollongong participated in this experiment (mean age 23.9 years; SD 3.4 years).Footnote 8 None of them had participated in the previous experiments. All had normal or corrected-to-normal visual acuities. While 15 participants had static stereoacuities of 40 s of arc or better, the remaining three participants had stereoacuities of either 100 or 140 s of arc. Overall, the average inter-pupillary distance was 6.1 cm (SD = 0.31 cm). All of the other participant details were the same as those for Experiments 1 and 2.

Results

Vection data

The vection strength and onset latency data were analysed in a similar fashion to Experiment 1.

Numbers of non-vection trials

Participants reported experiencing vection on 262 of the 324 experimental vection trials tested (i.e. 18 participants each responding twice to the nine different DISPLAY TYPE by SPEED conditions). Of the 62 non-vection trials, six trials were stereo-consistent (six 0.23 m/s trials), 32 trials were flat-stereo (seventeen 0.23 m/s, nine 0.46 m/s and six 0.92 m/s trials) and 24 trials were stereo-reversed (ten 0.23 m/s, ten 0.46 m/s and four 0.92 m/s trials). Wilcoxon signed-rank tests were used to examine the numbers of non-vection trials in the three DISPLAY TYPE conditions across the 18 participants (as these data were not normally distributed). These tests revealed that: (1) there were significantly fewer non-vection trials in the stereo-consistent conditions than in the flat-stereo conditions (Z = -3.104, p = 0.002); and (2) the numbers of non-vection trials in the stereo-reversed conditions were not significantly different to those in the flat-stereo conditions (Z = -0.995, p = 0.320) (p values were evaluated against a Bonferroni-corrected critical α of 0.025).

Vection strength

We found a main effect of DISPLAY TYPE on vection strength ratings, F(2,34) = 14.271, p < 0.0001, partial η2 = 0.456 (Fig. 6, left). Pairwise comparisons on this main effect revealed that on average: (1) the stereo-consistent conditions (M = 5.1) produced significantly stronger vection ratings than the flat-stereo conditions (M = 2.7) (p < 0.0001); (2) these stereo-consistent conditions also produced significantly stronger vection ratings than the stereo-reversed conditions (M = 3.2) (p = 0.005); and (3) the stereo-reversed conditions did not produce significantly different vection ratings to the flat-stereo conditions (p = 0.277). We also found a main effect of SPEED on vection strength ratings, F(2,34) = 26.270, p < 0.0001, partial η2 = 0.607. Pairwise comparisons on this main effect revealed that on average: (1) the 0.92 m/s speed conditions (M = 5.1) produced significantly stronger vection ratings than the 0.46 m/s speed conditions (M = 3.4) (p < 0.0001); and (2) the 0.46 m/s conditions produced significantly stronger vection ratings than the 0.23 m/s conditions (M = 2.6) (p = 0.02). The interaction between DISPLAY TYPE and SPEED was not significant, F(2.601,44.220) = 1.226, p = 0.309, partial η2 = 0.067.

Fig. 6 Effects of DISPLAY TYPE (Stereo-Consistent, Stereo-Reversed or Flat-stereo) and SPEED (0.23, 0.46 or 0.92 m/s) on the vection strength ratings (left) and onset latency (right) induced by changing-size patterns of radially expanding optic flow. Error bars depict standard errors of the mean (SEMs)

Vection onset latency

As our vection onset latency data were not found to follow a normal distribution (Kolmogorov-Smirnov D(162) = 0.094, p = 0.01), we examined the effects of DISPLAY TYPE and SPEED using non-parametric Friedman tests (instead of conducting a repeated-measures ANOVA as we did for the vection onset data in Experiment 1). We found a significant main effect of DISPLAY TYPE on vection onset latencies, χ2 (2, N = 18) = 13.00, p = 0.002 (see Fig. 6, right). Additional Wilcoxon signed-rank tests on this main effect revealed that on average: (1) the stereo-consistent conditions produced significantly shorter vection onset latencies than both the flat-stereo conditions (Z = -3.506, p < 0.0001) and the stereo-reversed conditions (Z = -2.461, p = 0.014); and (2) the stereo-reversed conditions did not produce significantly different vection onset latencies to the flat-stereo conditions (Z = -0.762, p = 0.446) (a Bonferroni-corrected critical α of 0.0167 was used to evaluate the significance of these three tests). We also found a main effect of SPEED on vection onset latencies, χ2 (2, N = 18) = 14.778, p = 0.001. Additional Wilcoxon signed-rank tests on this main effect revealed that on average: (1) the 0.92 m/s speed conditions produced significantly shorter vection onset latencies than the 0.46 m/s speed conditions (Z = -3.245, p = 0.001); and (2) the 0.46 m/s conditions did not produce significantly shorter vection onset latencies than the 0.23 m/s conditions (Z = -1.067, p = 0.286) (a Bonferroni-corrected critical α of 0.025 was used to evaluate the significance of these two tests).
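
The Bonferroni-corrected thresholds used for these follow-up tests can be illustrated with a brief Python sketch. It assumes a family-wise α of 0.05 (implied by, but not stated alongside, the corrected values of 0.0167 and 0.025), and it substitutes 0.0001 as an upper bound for the comparison reported as p < 0.0001. The function names and dictionary labels are hypothetical.

```python
def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test critical alpha after a Bonferroni correction."""
    return family_alpha / n_tests


def is_significant(p_value: float, family_alpha: float, n_tests: int) -> bool:
    """True if p_value falls below the Bonferroni-corrected threshold."""
    return p_value < bonferroni_alpha(family_alpha, n_tests)


FAMILY_ALPHA = 0.05  # assumed family-wise alpha

# Reported p values for the DISPLAY TYPE follow-up tests (three tests).
display_p = {
    "consistent vs flat": 0.0001,  # reported as p < 0.0001 (upper bound)
    "consistent vs reversed": 0.014,
    "reversed vs flat": 0.446,
}
# Reported p values for the SPEED follow-up tests (two tests).
speed_p = {"0.92 vs 0.46": 0.001, "0.46 vs 0.23": 0.286}

display_results = {
    k: is_significant(p, FAMILY_ALPHA, len(display_p)) for k, p in display_p.items()
}
speed_results = {
    k: is_significant(p, FAMILY_ALPHA, len(speed_p)) for k, p in speed_p.items()
}

print(round(bonferroni_alpha(FAMILY_ALPHA, 3), 4), display_results)
print(bonferroni_alpha(FAMILY_ALPHA, 2), speed_results)
```

Under these assumptions, the stereo-consistent versus stereo-reversed comparison (p = 0.014) falls just below the corrected threshold of 0.0167, in line with the significance pattern described in the text.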

Other rating data

Scene depth

We found a main effect of DISPLAY TYPE on scene depth ratings, F(2,34) = 21.080, p < 0.0001, partial η2 = 0.554 (see Fig. 7, left). Pairwise comparisons on this main effect revealed that on average: (1) the stereo-consistent conditions (M = 5.5) produced significantly greater scene depth ratings than the flat-stereo conditions (M = 3.2) (p < 0.0001); (2) these stereo-consistent conditions also produced significantly greater scene depth ratings than the stereo-reversed conditions (M = 3.3) (p < 0.0001); and (3) the stereo-reversed conditions did not produce significantly greater scene depth ratings than the flat-stereo conditions (p = 0.808). We also found a main effect of SPEED on scene depth ratings, F(2,34) = 23.571, p < 0.0001, partial η2 = 0.581. Pairwise comparisons on this main effect revealed that on average: (1) the 0.92 m/s conditions (M = 4.7) produced significantly greater scene depth ratings than the 0.46 m/s conditions (M = 4.0) (p = 0.001); and (2) the 0.46 m/s conditions produced significantly greater scene depth ratings than the 0.23 m/s conditions (M = 3.3) (p = 0.001). The two-way interaction between DISPLAY TYPE and SPEED was significant, F(4,68) = 5.418, p = 0.001, partial η2 = 0.242. To investigate this interaction further, we first calculated each participant’s signed stereo-consistent and stereo-reversed effects for scene depth for each of the simulated speeds tested (i.e. relative to the flat-stereo conditions). We then conducted separate one-way repeated-measures ANOVAs on these data. We found that SPEED significantly increased stereo-consistent effects on scene depth, F(2,34) = 5.332, p = 0.01, partial η2 = 0.239. However, SPEED did not significantly alter the stereo-reversed effects on scene depth, F(2,34) = 2.386, p = 0.107, partial η2 = 0.123.

Fig. 7 Effects of DISPLAY TYPE (Stereo-Consistent, Stereo-Reversed or Flat-stereo) and SPEED (0.23, 0.46 or 0.92 m/s) on the scene depth ratings (left) and motion-in-depth (MID) speed ratings (right) produced by these changing-size patterns of radially expanding optic flow. Error bars depict standard errors of the mean (SEMs)

Motion-in-depth speed

We also found a main effect of SPEED on motion-in-depth speed ratings, F(1.347, 22.893) = 105.585, p < 0.0001, partial η2 = 0.861 (see Fig. 7, right). Pairwise comparisons on this main effect revealed that on average: (1) the 0.92 m/s conditions (M = 5.9) produced significantly greater motion-in-depth speed ratings than the 0.46 m/s conditions (M = 3.7) (p < 0.0001); and (2) the 0.46 m/s conditions produced significantly greater motion-in-depth speed ratings than the 0.23 m/s conditions (M = 2.1) (p < 0.0001). The main effect of DISPLAY TYPE and the two-way interaction between DISPLAY TYPE and SPEED were not significant, F(2,34) = 2.043, p = 0.145, partial η2 = 0.107 and F(2.402,40.838) = 0.623, p = 0.570, partial η2 = 0.035, respectively.

Relationships between the stereoscopic effects on vection, scene depth and motion-in-depth speed

The above analyses revealed significant stereo-consistent effects (but not significant stereo-reversed effects) on vection (compared to the flat-stereo conditions). In order to further investigate the origins of the stereo-consistent advantage for vection, we calculated the average stereo-consistent effects on vection strength, scene depth and motion-in-depth speed for each participant. These effects were calculated by first averaging each participant’s ratings across SPEED for each DISPLAY TYPE and then subtracting the average ratings for flat-stereo conditions from those for the stereo-consistent conditions or the stereo-reversed conditions in each case. We then conducted a correlational analysis to examine the relationships between these three stereoscopic effects (see Table 3).
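
The effect-score calculation described above can be sketched in Python as follows. This is a minimal illustration, not the analysis code used in the study: the ratings are hypothetical values invented for the example, the function names are assumptions, and the Pearson coefficient is computed directly from its definition.

```python
from math import sqrt
from statistics import mean


def stereo_effect(ratings: dict, condition: str) -> float:
    """Mean rating (across SPEED) for `condition` minus that for flat-stereo."""
    return mean(ratings[condition]) - mean(ratings["flat-stereo"])


def pearson_r(x: list, y: list) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL ratings for three participants (not data from the study).
# Each list holds one rating per simulated speed (0.23, 0.46, 0.92 m/s).
participants = [
    {"vection": {"stereo-consistent": [4, 5, 6], "flat-stereo": [2, 3, 4]},
     "mid_speed": {"stereo-consistent": [3, 4, 5], "flat-stereo": [2, 3, 4]}},
    {"vection": {"stereo-consistent": [5, 6, 7], "flat-stereo": [2, 2, 2]},
     "mid_speed": {"stereo-consistent": [4, 5, 6], "flat-stereo": [3, 3, 3]}},
    {"vection": {"stereo-consistent": [3, 3, 3], "flat-stereo": [2, 3, 4]},
     "mid_speed": {"stereo-consistent": [3, 4, 5], "flat-stereo": [2, 3, 4]}},
]

# One stereo-consistent effect score per participant and per measure.
vection_effects = [stereo_effect(p["vection"], "stereo-consistent") for p in participants]
mid_effects = [stereo_effect(p["mid_speed"], "stereo-consistent") for p in participants]

print(vection_effects, mid_effects)
print(pearson_r(vection_effects, mid_effects))
```

A positive effect score indicates a stereoscopic advantage for that measure relative to the flat-stereo conditions; the same function yields stereo-reversed effects when called with the corresponding condition label.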

Table 3 Pearson correlation matrix of stereo-consistent effects on vection strength, scene depth and motion-in-depth speed

We found that there was a significant positive correlation between stereo-consistent effects on vection strength and motion-in-depth speed (this significant relationship is also shown in Fig. 8, top-right). However, the relationship between stereo-consistent effects on vection strength and scene depth was not significant (this non-significant relationship is shown in Fig. 8, top-left).

Fig. 8 Plots showing the relationships between stereoscopic effects on vection strength and perceived scene depth ratings (left) and between these effects on vection strength and motion-in-depth (MID) speed ratings (right). Stereo-consistent effects are shown in the top-left and top-right plots. Stereo-reversed effects are shown in the bottom-left and bottom-right plots. A positive value along each axis represents a stereoscopic or pseudoscopic advantage for that particular percept

We performed an equivalent correlational analysis on the stereo-reversed effects on vection strength, scene depth and motion-in-depth speed. However, none of the relationships between the stereo-reversed effects were found to be significant (see Table 4; the non-significant relationships between stereo-reversed effects on vection and scene depth, and between stereo-reversed effects on vection strength and motion-in-depth speed, are shown in Fig. 8, bottom-left and bottom-right, respectively).

Table 4 Pearson correlation matrix of stereo-reversed effects on vection strength, scene depth and motion-in-depth speed

Discussion

Consistent with the previous research (Palmisano, 1996; Seya & Shinoda, 2018), the stereo-consistent advantage for vection was found to persist for changing-size optic flow. These stereo-consistent displays induced vection which was stronger and started sooner than that induced by the flat-stereo displays. This stereo-consistent advantage was displayed by all 18 participants for vection strength and by 16 of the participants for vection onset latency. Also consistent with the notion that this stereoscopic advantage for vection was driven by binocular information about motion-in-depth, we found that the relationship between the stereo-consistent effects on vection strength and perceived motion-in-depth speed remained significant in Experiment 3 (as it was in Experiment 1). By contrast, the relationship between the stereo-consistent effects on vection strength and perceived scene depth was not significant in this experiment. Regan and colleagues previously showed that dynamic stereoscopic information is more effective for object motion-in-depth perception than changing-size information when these objects are observed for an extended period of time (Regan & Beverley, 1979). By contrast, they found that changing-size information was more effective for perceiving the motion-in-depth of briefly glimpsed objects. It is possible then that the 25-s display motion exposures in this experiment might have favoured stereo-consistent, over changing-size based, self-motion information (since the simulated self-motion was always seen for a relatively long time). Conceivably the stereo-consistent advantage for vection might have been weaker if we had used shorter exposures of visually simulated self-motion-in-depth.

In this experiment we also found that the stereo-reversed advantage for vection was reduced by adding changing-size cues to the optic flow. The vection induced by stereo-reversed and flat-stereo patterns of changing-size optic flow did not differ significantly in terms of their strength ratings and onset latencies. These null findings suggest that stereo-reversed information in Experiment 3 was ignored/downplayed in favour of the changing-size cues to motion-in-depth. These local changes in optical size were completely compatible with the global motion perspective – providing consistent monocularly-available information about the direction and speed of the motion-in-depth. While both stereo-consistent and stereo-reversed conditions should have triggered additional, binocular motion-in-depth processing, the stereoscopic information in the stereo-reversed conditions would have conflicted with the monocular motion perspective and changing-size cues. Taken together our findings suggest that, while stereo-reversed information can improve vection-in-depth, there still appears to be an added advantage provided by stereo-consistent information.

General discussion

This study shows that stereoscopic advantages for vection-in-depth persist for most observers even when their left and right eye views are swapped – at least in conditions where changing-size cues to motion-in-depth are not available. When same-size patterns of optic flow were examined in Experiments 1 and 2, stereo-consistent and stereo-reversed conditions both produced significantly stronger vection ratings than flat-stereo conditions. Unlike stereo-consistent conditions, stereo-reversed conditions placed stereoscopic and monocular information about motion-in-depth, scene layout and self-motion into direct conflict. Thus, our finding of a stereo-reversed advantage for vection-in-depth shows that stereoscopic information does not need to be consistent with monocular motion signals (in terms of either sign or magnitude) in order to improve vection. Taken together with previous findings, these results suggest that stereoscopic information might only need to be dynamic (as opposed to static) in order to improve vection-in-depth.

When participants experienced vection in this study, it was always perceived to be in the direction simulated by the monocular motion signals. Participants perceived illusory forwards self-motion in depth even during the stereo-reversed conditions of Experiments 1, 2 and 3. However, in these stereo-reversed conditions, the stereoscopic motion signals should have indicated backwards, rather than forwards, self-motion. Based on this observation, it appears unlikely that stereo-motion cues in the stereo-reversed conditions improved vection by providing extra binocular information about self-motion. Instead, it appears more likely that monocular motion signals generally dominated the visual perception of self-motion. We propose that: (1) the static binocular cues in flat-stereo patterns of same-size optic flow placed additional restrictions on (or otherwise interfered with) vection-in-depth; and (2) the dynamic stereoscopic cues in stereo-consistent and stereo-reversed conditions removed these particular restrictions on vection induction.

If one assumes that stereo-consistent and stereo-reversed advantages for vection had similar underlying mechanisms, then they would appear to be best explained by motion-in-depth processing (rather than by any indirect effects of perceived scene depth on vection). First, we found that both stereo-consistent and stereo-reversed displays produced longer MAE durations than flat-stereo conditions – suggesting that both stereo conditions had similar effects on visual motion-in-depth processing. Second, we found that relationships between vection strength and perceived motion-in-depth were more reliable than those between vection strength and perceived scene depth. Third, we found that whenever there were significant stereo-consistent or stereo-reversed advantages for vection, there were also significant correlations between stereoscopic effects on vection strength and stereoscopic effects on perceived motion-in-depth. In Experiments 1–3, stereo-consistent information always increased vection strength (relative to flat-stereo conditions), and these effects on vection always correlated significantly with stereo-consistent effects on perceived motion-in-depth. By contrast, stereo-reversed information only increased vection strength when same-size (not changing-size) optic flow was used. While stereo-reversed effects on vection correlated significantly with stereo-reversed effects on perceived motion-in-depth in Experiment 1 (same-size flow), they did not do so in Experiment 3 (changing-size flow).

We propose that the stereo-reversed conditions in Experiments 1 and 2 produced stronger vection-in-depth than flat-stereo conditions simply because they triggered binocular motion-in-depth processing. Since both of these experiments used same-size optic flow, this stereo-reversed information was needed to confirm the monocular motion perspective information about self-motion-in-depth. (Also consistent with this proposal, flat-stereo same-size conditions were often less likely to induce vection than conditions with some stereo-motion.) However, stereo-reversed information was not required for this purpose in Experiment 3. In this final experiment, all of the displays (including the flat-stereo conditions) contained changing-size cues, which should have triggered similar motion-in-depth mechanisms to the stereo-motion cues.

While the stereo-reversed advantage for vection was only found for same-size optic flow, the stereo-consistent advantage for vection persisted for changing-size optic flow. It is also worth noting that only stereo-consistent information was able to significantly reduce vection onset latencies in this study. While stereo-consistent conditions produced shorter vection onset latencies than flat-stereo conditions in Experiments 1 and 3, stereo-reversed conditions did not generally produce different onset latencies to flat-stereo conditions (note that vection onset latency was not recorded in Experiment 2). Based on these findings, the stereo-consistent advantage must have involved something more than releasing restrictions on vection produced by presenting stimuli on a 2-D planar screen (i.e. by replacing static with dynamic binocular cues). It seems likely that stereo-consistent conditions continued to improve the vection induced by changing-size optic flow because they provided compatible binocular and monocular information about motion-in-depth. It is even possible that the stronger, more reliable stereo-consistent advantages for vection were due to the presence of extra, purely binocular information about self-motion-in-depth in these displays – as was originally proposed by Palmisano (1996, 2002).

Not surprisingly, we found that there were large individual differences in the effects of stereo-reversed information on perceptions of scene depth, motion-in-depth and vection in this study. This was expected based on previous reports of large individual differences in both the nature and the timing of pseudoscopic experiences (Palmisano, Hill, & Allison, 2016a). While stereo-reversed information increased the perceived scene depths of same-size optic flow for almost all participants (relative to flat-stereo conditions), the magnitudes of these effects were found to vary considerably. However, when changing-size cues were added to the optic flow, this stereo-reversed information was found to increase perceived scene depths for some participants and reduce them for others (i.e. there were individual differences in terms of both the signs and the magnitudes of these stereo-reversed effects). Stereo-reversed information was also found to increase perceptions of motion-in-depth for some participants and reduce them for others (however, in this case the types of individual differences seen were similar for both changing-size and same-size optic flow). There were also individual differences in stereo-reversed effects on vection. While stereo-reversed information increased vection for all but two individuals when same-size optic flow was used, it increased vection for fewer participants when changing-size cues were added. Presumably these individual differences in our participants’ perceptual experiences reflect differences in the ways that they combined these conflicting binocular and monocular sources of information. However, it should be noted that there were also some (albeit lesser) individual differences in stereo-consistent effects on perceived scene depth, motion-in-depth and vection. For example, while adding stereo-consistent information increased the perceived motion-in-depth for most participants, it actually reduced perceived motion-in-depth for 27% of the participants in Experiment 1 and 40% of the participants in Experiment 3. Similarly, adding stereo-consistent information was found to slightly reduce vection-in-depth for two of the participants in Experiment 1. Thus, there can also be substantial individual differences in how observers combine consistent binocular and monocular information.

Finally, we should briefly discuss the possible applications of the current findings – especially our discovery that stereo-reversed information does not impair the induction of vection and sometimes even improves the experience of this illusory self-motion. The knowledge gained in this study could be used to modulate perceived self-motion in a variety of applications including virtual reality and telepresence/teleoperation (depending on the application-specific goals). Our finding that only dynamic stereoscopic information appears to be important for enhancing vection suggests there may be some flexibility in the static accuracy of stereoscopic cues, and thus more emphasis should be placed on providing suitable dynamic stereoscopic information. Future research will need to examine whether the surprising tolerance to errors in binocular viewing geometry seen in this study also generalises to the vection induced by head-mounted displays (HMDs). While reported motion sickness was minimal with the external displays used in the present study, it would also be informative to examine the effects that these stereoscopic display manipulations have on HMD-based cybersickness.

Conclusions

The stereo-reversed advantage for vection (reported here for the first time) shows that stereoscopic information does not need to be consistent with monocularly available information in order to improve vection. Even when the stereoscopic motion used in this study had the opposite sign to the monocular motion signals, it was still capable of enhancing experiences of vection-in-depth. We have previously shown that stereoscopic advantages for vection persist, and sometimes even increase, when there are differences in the magnitudes of the stereoscopic and monocular motion signals provided (e.g. Palmisano et al., 2019). When taken together, these findings, along with the correlation between stereoscopic advantages for perceived motion-in-depth and vection, suggest that stereoscopic information might only need to be dynamic (as opposed to static) in order to improve vection-in-depth. However, we also appeared to find an added advantage for stereo-consistent (over stereo-reversed) conditions in the current study. While stereo-reversed advantages for vection strength were only sometimes found, stereo-consistent advantages for vection strength and vection onset latency were always found. Thus, we propose that stereoscopic motion is capable of enhancing vection-in-depth in at least two different ways: (1) by triggering binocular motion-in-depth processing; and (2) by providing binocular motion-in-depth information which (when compatible) is used to supplement monocularly available information about self-motion.