1 Introduction

For the past few decades, there has been growing demand for effective virtual reality (VR) solutions across a wide range of industries (Lombard and Ditton 1997), including telemedicine (Crump and Pfiel 1995; Hamit 1995), distance education (Hackman and Walker 1990), video gaming (Cook 1992) and health (Chen et al. 2020). The principal perceived benefit of VR, particularly when a head-mounted display (HMD) is used, is its ability to generate presence (i.e., the feeling of “being there” in the virtual environment; see Sheridan 1992; Barfield and Weghorst 1993; Slater and Wilbur 1997; Witmer and Singer 1998; and see Skarbez et al. 2017 for a recent review of the debate on presence). According to Slater (2009), presence depends on the degree to which users perceive that: 1) they are actually there in the virtual environment (the place illusion); and 2) what is apparently happening to them is actually happening (the plausibility illusion). The recent release of next-generation HMDs has made compelling presence and immersion in VR achievable at an increasingly affordable cost to consumers. This financial accessibility offers increased opportunities for creative developers, home-brew entertainment, and laboratory scientists interested in understanding how we perceive our own movement (i.e., both actual and simulated self-motion) through real and virtual environments (Steinicke et al. 2013). Enhancing self-motion perception has the further benefit of increasing engagement in education and training (e.g., Coyne et al. 2019) and remote rehabilitation (Pedram et al. 2020).

1.1 Possible relationship between vection and presence in HMD-VR

Another feature of HMD-VR is its ability to generate illusions of self-motion, known as vection (for a review, see Palmisano et al. 2011). According to Palmisano et al. (2015), vection has previously been defined as: (i) a visual illusion of self-motion in stationary observers (Dichgans and Brandt 1978); (ii) an illusion of self-motion induced by stimulating either the visual or the non-visual senses (e.g., auditory vection; see Sakamoto et al. 2004); (iii) a real or illusory visually mediated perception of self-motion (Kim and Tran 2016); or (iv) the conscious subjective experience of self-motion (as in Ash et al. 2013). Thanks to their wide fields of view and their ability to provide the user with stereoscopic first-person views of the virtual environment, HMDs commonly induce highly compelling visual illusions of self-motion. However, in HMD-VR, auditory and other non-visual sources of stimulation can also contribute to the user’s overall experience of vection.

Several studies suggest that the user’s experience of vection is positively related to their feelings of presence in VR (e.g., Riecke et al. 2006; Keshavarz et al. 2018). In the earliest of these studies, Riecke et al. (2006) used a large external display that simulated a natural light field of a streetscape to generate circular vection. They reduced the coherence and realism of the virtual scene (by either progressively scrambling the image content or changing the orientation of the light field) and found that vection and spatial presence both declined as scene realism was degraded. The authors proposed that this positive relationship between vection and presence arose because participants perceived that they were moving more when they felt more spatially present in the virtual environment. In a more recent study, Keshavarz and colleagues (2018) compared the relationship between vection and spatial presence during different types of simulated head rotation (yaw, pitch and roll) in VR. They found positive relationships between presence and both vection strength and duration for simulated pitch and roll self-rotations, but not for rotations in yaw. The lack of an observed relationship during yaw head rotation could indicate that presence is only related to vection when the simulated self-rotation occurs around an axis that is not aligned with gravity.

While previous studies have provided valuable insights into the relationship between vection and presence, it should be noted that they presented their self-motion displays on large field-of-view stereoscopic external displays. To our knowledge, only one other study has examined the possible relationship between vection and presence in HMD-VR (Clifton and Palmisano 2020). While that study failed to find a significant relationship between vection and presence, only half of its VR exposure trials used a virtual navigation method that produced continuous visual motion stimulation (teleportation was used for all of the remaining trials). Thus, the current study examined this relationship in HMD-VR under more favorable conditions for vection induction.

1.2 Effects of head-to-display lag on vection and presence

Since users in HMD-VR are typically quite active when exploring and interacting with a virtual environment, it is important to consider the effects their tracked head (and hand/body) movements have on their perceptual experiences, particularly as all VR systems have a finite head-to-display lag (also known as motion-to-photon latency). Previously, Allison, Harris and Jenkin (2001) found that human observers could tolerate significant additional display lag before the virtual environment became perceptually destabilized. In that study, significant display destabilization was only perceived when observers executed head movements at velocities of 45°/s or more (which revealed the inconsistencies between their head and display motion). However, other researchers have reported that moderate increases in head-to-display lag (e.g., of 40–60 ms) can impair perceived simulator fidelity (Adelstein et al. 2003) and that even smaller increases in display lag (< 20 ms) can be perceptible to well-trained observers (Mania et al. 2004). Detection thresholds for display lag in these active head movement studies were lower than those reported for passive participant rotations in an oscillating chair (Moss et al. 2010). Given this variability in previous findings, it is important to consider how head-to-display lag affects perceptual experiences in HMD-VR.

No previous study has systematically examined how head-to-display lag affects the experiences of vection and presence induced by HMDs. However, a number of studies have examined the effects this display lag has on cybersickness (i.e., feelings of motion sickness in HMD-VR; LaViola 2000; Palmisano et al. 2017). Moss et al. (2011) had participants view live video of a real scene through an HMD and found that cybersickness was not affected by increased display lag. Similarly, St Pierre et al. (2015) found that: (1) adding a constant display lag of 270 ms did not significantly alter cybersickness (compared to their baseline lag condition of 70 ms); whereas (2) introducing a variable 0.2 Hz display lag increased cybersickness. In another study also using live video, Kinsella et al. (2016) found that a 0.2 Hz varying lag generated greater sickness than 1.0 Hz varying lag and 100 ms fixed lag conditions. However, contrary to the view that only variable display lag affects the user’s well-being, Feng et al. (2019) recently found that adding constant lag to the very low baseline latencies of the Oculus Rift CV1 and S HMDs (estimated at < 5 ms) significantly increased cybersickness. This disruptive effect of display lag was later found to be even larger under binocular, compared with monocular, viewing conditions (Palmisano et al. 2019). These effects of display lag on cybersickness may be attributed to the very low initial display lag levels, and may be enhanced by the use of virtual (as opposed to live video) content in HMDs.

In past studies that measured vection or presence in HMD-VR, baseline system latencies were (or would have been) quite high (e.g., 37.9 ms in Kim et al. 2015 and 72 ms in Palmisano et al. 2017). It is therefore critically important to understand the effects that controlled manipulations of system latency have on vection and spatial presence. However, most HMD-VR studies have only used displays with fixed latencies. Below we briefly review past HMD studies on vection, as well as studies on the perceptual effects of head-display lag using large external displays.

Kim et al. (2015) showed that the Oculus Rift DK1 HMD could induce compelling vection when observers made yaw head rotations at approximately 1 Hz in time with a metronome. They found that synchronized head-display motion generated stronger vection than no compensation, which in turn generated stronger vection than inversely compensated motion. This dependence of vection on head-display synchronization was observed despite the Oculus Rift DK1 having relatively long system latencies, ranging from 37.9 ms up to 196.7 ms depending on scene complexity and rendering mode (Kim et al. 2015). However, that study did not systematically determine the effects of increasing display lag on vection and presence, nor did it use an HMD system capable of an extremely low baseline display lag.

A number of studies have used large external displays to examine how display lag affects the strength of vection. They found that vection decreased when the display moved in a contralateral (as opposed to ipsilateral) direction relative to the observer’s oscillatory linear head movements (e.g., Kim and Palmisano 2008, 2010; cf. Ash et al. 2011a, b). Technically, this contralateral display motion constituted a phase shift of 180° and a system lag of 500 ms. Ash et al. (2011a, b) systematically added display lags ranging between 0 and 200 ms to their baseline system latency of 113 ms. They found a negative relationship between vection strength ratings and perceived lag: the greater the perceived head-display lag, the weaker the vection. Vection was weakest (and perceived lag greatest) when head and display motions were approximately 60° to 75° out of phase (i.e., equivalent to display lags of 113 + 50 ms and 113 + 100 ms for head movements at ~1 Hz). Interestingly, phase offsets above 90° (which were more consistent with counter-phase head-display synchronization) generated relatively strong vection and low perceived lags. These findings suggest that vection depends more critically on the temporal head-to-display lag than on the phase angle of the head-display synchronization.

One limitation of the Ash et al. (2011a, b) study was again its very large baseline system latency. As a result, the explored head-display lags were limited to a range of 40° to 112° in phase angle and 113 to 313 ms in temporal latency. Another potential limitation of that study was its use of an external display, which was fixed to the wall and viewed monocularly. This is unlike VR technology, where the vantage points of the two eyes are rendered separately and presented on an HMD that follows the observer’s own head movements with high fidelity. Previous research has shown that stereoscopic viewing improves linear vection in depth (Palmisano 1996, 2002), and recent work has shown that stereoscopic viewing also enhances the strength of circular vection (Palmisano et al. 2016). The HMDs used in VR should therefore generate superior vection because they support large-field stereoscopic viewing. However, the effects of system latency and the speed of yaw head rotation on vection and spatial presence have yet to be systematically tested. Doing so requires controlled study of active head movements combined with latencies imposed on the performance of the HMD-VR system.

1.3 The current study

In the present study, we examined the perceptual effects of adding head-to-display lag to the Oculus Rift CV1 HMD. We used this particular HMD because of its anticipated low baseline head-display lag (under 10 ms). Specifically, this study had two aims: (1) to determine whether vection depends on interactions between head movement speed and head-display lag; and (2) to determine whether presence also depends on such interactions. The magnitude of the difference between the user’s physical and virtual head orientations should increase with both the head-to-display lag and the speed of their head movements. This in turn should increase visual-proprioceptive conflict (Lee and Lishman 1975) as well as visual-vestibular conflict (Kim et al. 2020). Based on past research (primarily using external displays), we predicted that increases in head-to-display lag and head-movement speed would both reduce vection strength. While the effects of display lag on our other outcome measures had not previously been examined, we also predicted that increases in head-display lag and head-movement speed would decrease user feelings of presence in the virtual environment.

2 Method

2.1 Participants

A total of 23 observers (14 female, 9 male; aged 18–42 years) participated in the experiment. All had normal or corrected-to-normal vision and no known or reported signs of neurological disorder. All reported feeling well at the start of the experiment. Procedures were approved by the biomedical Human Research Ethics Advisory panel (HREA-B) at the University of New South Wales (UNSW, Sydney) and adhered to the principles of the Declaration of Helsinki.

2.2 The virtual environment

We used the Oculus VR software development kit (OVR) to render a 3D cloud of randomly positioned objects. The 3D point cloud was implemented in the same way as in previous research (Kim and Khuu 2014), except that we used circular objects with no local orientation. A total of 6,912 points were rendered for each eye’s view using a combination of calls to OpenGL and custom shaders written in the OpenGL Shading Language (GLSL). Initially, a framebuffer object was created: an offscreen memory allocation representing the displayable image area. Custom GLSL vertex and fragment shaders were written to render the points to the framebuffers prior to display on the Oculus Rift HMD. This rendering method achieved close to real-time performance, as it relies on the GPU of the video card. When the head was held completely stationary, the display expanded radially to simulate smooth forward self-motion at a velocity of approximately 3 m/s.
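For illustration, the following C++/OpenGL sketch shows one way such an offscreen framebuffer object (with a color texture and depth buffer) might be created before the point cloud is rendered into it each frame. The calls are standard OpenGL 3.x; the function name, dimensions and configuration details are our assumptions rather than the study’s actual code:

```cpp
// Hypothetical sketch of the offscreen render-target setup described above.
// Assumes an OpenGL 3.3+ context is already current (e.g., via GLEW/GLFW).
#include <GL/glew.h>

GLuint CreateEyeFramebuffer(GLsizei width, GLsizei height, GLuint* colorTexOut) {
    GLuint fbo = 0, colorTex = 0, depthRb = 0;

    // Color texture that will receive the rendered point cloud for one eye.
    glGenTextures(1, &colorTex);
    glBindTexture(GL_TEXTURE_2D, colorTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

    // Depth buffer so nearer points correctly occlude farther ones.
    glGenRenderbuffers(1, &depthRb);
    glBindRenderbuffer(GL_RENDERBUFFER, depthRb);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24, width, height);

    // The framebuffer object itself: an offscreen displayable image area.
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, colorTex, 0);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                              GL_RENDERBUFFER, depthRb);
    // In production code, glCheckFramebufferStatus() would be verified here.

    glBindFramebuffer(GL_FRAMEBUFFER, 0);   // unbind until render time
    *colorTexOut = colorTex;
    return fbo;
}
```

One framebuffer per eye would then be bound each frame, the GLSL shaders would draw the 6,912 points into it, and the resulting color texture would be handed to the OVR compositor for display on the HMD.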

2.3 Estimating HMD system lag

The system was controlled by custom software written in Microsoft Visual C++ 2010 running under the Windows 10 operating system on an ASUS H270 PRO motherboard with an Intel i7-7700 CPU and 16 GB RAM. The video card was an NVIDIA GeForce GTX 1050 Ti graphics adapter with 4 GB RAM. We configured the system to work with the Oculus Rift CV1 HMD without the touch remotes. The initial calibration and positioning of equipment were performed according to the procedures prescribed in the Oculus setup manuals.

After setting up the system, we created a simple display to estimate the baseline HMD system lag using the same procedure as Kim et al. (2015). The scene was configured to render the optic flow display described above, but with the dots rendered invisible (the dots and the background were both rendered as mid-gray). Two additional dark spots were added to the scene (both 0.5° in diameter). The dark spot on the left (the reference spot) remained fixed on the left-third central axis of the display, irrespective of HMD movement. The dark spot on the right (the calibration spot) was configured so that its vertical position was altered by yaw HMD movement, which helped minimize cross-talk during recording. A Blackfly S digital USB camera (effective frame rate of 400 fps) was used to track the image positions of the dark spots on one of the Oculus Rift’s displays. We oscillated the HMD in yaw (about ± 8°), taking care to ensure that image capture was maintained. A custom gimbal ensured that these HMD rotations were centered on the camera’s image plane. The tracked vertical position of the reference spot was subtracted from that of the calibration spot to further minimize cross-talk generated by unintentional vertical displacements of the HMD during yaw oscillation. The HMD system lag was then estimated from the temporal cross-correlation between the peak positions of the calibration and reference spots (after cubic spline fitting to 1000 points to increase precision).
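The core of this estimate is a lagged cross-correlation between the two spot traces. The following C++ sketch (our illustration; names are assumptions) shows the basic computation, assuming both traces have already been extracted from the camera frames and mean-centred:

```cpp
// Minimal sketch of the lag estimate: find the shift (in camera samples) that
// maximizes the cross-correlation between the reference-spot trace and the
// yaw-modulated calibration-spot trace. Assumes both traces were sampled at
// the camera rate (400 fps) and mean-centred.
#include <cstddef>
#include <vector>

std::size_t EstimateLagSamples(const std::vector<double>& reference,
                               const std::vector<double>& delayed,
                               std::size_t maxShift) {
    std::size_t bestShift = 0;
    double bestCorr = -1e300;
    for (std::size_t shift = 0; shift <= maxShift; ++shift) {
        double corr = 0.0;
        for (std::size_t t = 0; t + shift < reference.size(); ++t)
            corr += reference[t] * delayed[t + shift];  // delayed trace lags reference
        if (corr > bestCorr) { bestCorr = corr; bestShift = shift; }
    }
    return bestShift;  // lag in ms = bestShift * (1000.0 / 400.0) = 2.5 ms per sample
}
```

At 400 fps each camera sample spans 2.5 ms, so the cubic-spline interpolation described above (resampling to 1000 points) is what allows lags to be resolved with sub-sample precision.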

Since Wu et al. (2013) have shown that tracker-based system lags (like those associated with HMDs) can vary markedly over time, we examined how consistent our lag estimates were over time with the Oculus Rift CV1 HMD. We estimated display lags based on 40 s recordings of both the baseline lag condition and the condition with the maximum imposed lag. These time-series data were then broken into 20 × 2 s windows in order to compute the average head-display lag and its variation (95% confidence intervals; CIs) for these two conditions.
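As a simple illustration of this windowed analysis (assumed names; one lag estimate per 2 s window), the mean lag and its 95% CI could be computed as follows:

```cpp
// Sketch: summarize 20 per-window lag estimates (one per 2 s window of a
// 40 s recording) as a mean and 95% confidence interval half-width.
#include <cmath>
#include <vector>

struct LagSummary { double meanMs, ci95HalfWidthMs; };

LagSummary SummarizeLags(const std::vector<double>& windowLagsMs) {
    const double n = static_cast<double>(windowLagsMs.size());   // 20 windows
    double mean = 0, ss = 0;
    for (double lag : windowLagsMs) mean += lag;
    mean /= n;
    for (double lag : windowLagsMs) ss += (lag - mean) * (lag - mean);
    const double sem = std::sqrt(ss / (n - 1.0)) / std::sqrt(n);
    // 2.093 is the two-tailed t critical value for df = 19 at alpha = .05.
    return { mean, 2.093 * sem };
}
```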

2.4 Technique for increasing head-display lag

We modified head-to-display lag by shifting head movement data in time using a 1D memory array of finite length. As shown in Fig. 1, the observer’s head position in 6DOF was stored in a single block of memory at the current index (denoted t_i). Once written, the current index of the memory array was incremented to the next location (t_i → t_i+1). The memory at this new location was then read and used to overwrite the current head sensor data. By increasing the length of the array (n) beyond 1, we progressively imposed temporal lag on our system. An array of one element would generate no added system lag, because the sensor data written to the current element would be the same data read back and used to update the perspective views for the two eyes (a minimal code sketch of this scheme follows Fig. 1).

Fig. 1 The memory buffer method used to impose head-display lag. The contents of the HMD sensor were written to the memory block at the current time index (t_i). The current index was then incremented to read the contents of the next element, which were used to overwrite the current HMD sensor data. Incrementing beyond the last element in the array (i.e., n − 1) resets the index to 0, ensuring continuity of the write/read operations. Note that increasing the total number of elements in the array (n) above 1 will increase system lag above the baseline benchmark
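Below is a minimal C++ sketch of this ring-buffer scheme (the pose structure and names are our assumptions; the original implementation may differ in detail):

```cpp
// Minimal sketch of the ring-buffer delay shown in Fig. 1. The buffer length
// n sets the imposed lag: n == 1 adds no lag beyond the system baseline.
#include <cstddef>
#include <vector>

struct Pose6DOF { float x, y, z, yaw, pitch, roll; };  // assumed pose layout

class LagBuffer {
public:
    explicit LagBuffer(std::size_t n) : buffer_(n) {}

    // Write the freshly sampled pose at index t_i, advance the index (wrapping
    // past the last element back to 0), then read back the pose stored there.
    // The returned (older) pose is what is used to render the two eye views.
    Pose6DOF Delay(const Pose6DOF& current) {
        buffer_[index_] = current;               // write at t_i
        index_ = (index_ + 1) % buffer_.size();  // t_i -> t_i+1 (wraps to 0)
        return buffer_[index_];                  // read the oldest stored pose
    }

private:
    std::vector<Pose6DOF> buffer_;   // n elements, zero-initialized
    std::size_t index_ = 0;
};
```

With n = 1 the write and read touch the same element, so the current sample is returned unchanged; each additional element delays the rendered pose by one further sensor update (the first n − 1 reads after start-up return zero-initialized poses, a brief transient).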

2.5 Experimental design

The experiment used a 6 × 2 repeated-measures design: Display lag (6 levels) × Head oscillation frequency (2 levels). Trials were presented in a randomized order, and each participant performed a total of 12 trials in a block.

2.6 Procedure

Participants were initially briefed on the requirements of the study. They were seated in a chair without arm supports and presented with audible metronome tones (at 1.0 Hz and 0.5 Hz) to gain practice at making yaw head movements. They were instructed to rotate their head from side to side in whatever way was most “natural” and comfortable for them within the confines of their seating arrangements. The experimenter emphasized the importance of keeping the amplitude (i.e., the leftwards-rightwards extent) of these head movements consistent with the metronome across all conditions; the metronome was presented in both the training and experimental sessions. To optimize participant comfort, no constraint on torso rotation was imposed in the current study.

Following sufficient practice in making these head movements (about 2–3 min), the participant was reminded of the task. They were instructed to oscillate their head continually from side to side throughout each 40 s presentation of radially expanding optic flow, and to attend to their overall experience of self-motion in depth (i.e., the illusion of forwards self-motion induced by viewing the display). Each trial concluded with the participant performing two psychophysical judgment tasks and reporting whether they felt sick or well.

The first of these tasks was to provide an overall estimate of vection strength for the trial on a 101-point scale (0–100) using a rating bar (e.g., Seno 2013). We only obtained overall ratings of vection strength (and not vection onset latencies) so as to maximize participant attention to the optic flow throughout each stimulus presentation (such overall ratings have been shown to correlate well with real-time indices of the vection time-course; see Seno et al. 2017). The second judgment task was to provide an overall estimate of the subjective experience of presence (i.e., of “being there” in the simulated environment), as in previous studies (e.g., IJsselsteijn et al. 2001; Bouchard et al. 2001; Clifton and Palmisano 2020; Kim et al. 2020). Although some of these studies used a 0–10 rating scale for presence judgments (Clifton and Palmisano 2020; Bouchard et al. 2001), others used a 21-point scale ranging from 0 to 20 (Kim et al. 2020). We used this 21-point scale to allow for greater resolution in presence ratings. When judging spatial presence, participants were instructed to consider how much they felt spatially present in the virtual environment. This judgment was based on the question: “To what extent do you feel present in the virtual environment, as if you were really there?” (as used previously by Bouchard et al. 2001). Ratings could range from 0 (feeling completely “not there” in the display) to 20 (feeling “completely there” in the display).

After completing these two judgment tasks for the trial, participants were required to answer yes/no to the question: “Do you feel sick?”. This check on participant well-being during the experiment was similar to those used in recent studies of HMD-based cybersickness (e.g., Munafo et al. 2017). Participants were encouraged to report any experience of dizziness, nausea or ocular discomfort (e.g., eye strain).

Based on the results reported below, the estimated latencies for the six levels of display lag were approximately: 6 ms (baseline), 47 ms, 88 ms, 130 ms, 171 ms and 212 ms (i.e., lag increments of approximately 41.2 ms). Participation in this study required a total of 30 min to complete briefing, training and the experimental trials. Perceptual judgments and real-time sensor outputs from the HMD were logged to separate data files.

3 Results

3.1 Head movement data

We first checked whether our participants had followed the instructions about making yaw head movements. In yaw, the mean peak-to-peak head oscillation range across all participants was 55.8° for 0.5 Hz rotations (SD = 34.6°) and 54.0° for 1.0 Hz rotations (SD = 30.0°). In pitch, the mean peak-to-peak range was 0.4° for 0.5 Hz rotations (SD = 3.3°) and 0.6° for 1.0 Hz rotations (SD = 2.7°). In roll, the mean peak-to-peak range was 5.7° for 0.5 Hz rotations (SD = 5.6°) and 5.4° for 1.0 Hz rotations (SD = 5.6°). Three-dimensional changes in the angular orientation of the head during (0.5 Hz and 1.0 Hz) yaw head rotations are shown in Fig. 2 for one representative participant.

Fig. 2 Angular head position during oscillation of the head in yaw at 0.5 Hz (A) and 1.0 Hz (B). Separate traces show orientation in yaw (thick red trace), pitch (thinner light green trace) and roll (thinnest dark blue trace). Vertical dashed and dotted lines indicate the points in time of peaks and troughs in the yaw head position, respectively (color figure online)

We next examined how participants’ head movements were affected by the imposed display lag and the frequency of the metronome. A two-way repeated-measures ANOVA found no main effect of display lag (F(5,110) = 0.27, p = 0.93) or yaw head oscillation frequency (F(1,22) = 0.44, p = 0.51) on the peak-to-peak amplitude of yaw head oscillations. There was also no interaction between display lag and oscillation frequency on the amplitude of yaw head oscillations (F(5,110) = 1.52, p = 0.19).

3.2 Estimating head-display lag

Results of our system latency benchmarking are shown in Fig. 3. This figure shows correlations between the reference target’s horizontal motion and the yaw-modulated target’s vertical motion under two different imposed system lags (i.e., the baseline lag and maximum imposed lag conditions). The peak in the correlation for the baseline lag condition indicates that the average benchmark system latency for our Oculus Rift HMD when presenting optic flow was 5.3 ms (± 1.2 ms 95% CI). The peak in the correlation for the largest display lag condition is also shown in Fig. 3; the average lag for this condition was estimated to be 212.8 ms (± 1.3 ms 95% CI). Thus, the (average) lag imposed on the Oculus Rift based optic flow displays in this experiment ranged from 5.3 ms to 212.8 ms. The baseline lag corresponds to approximately half a frame for the Oculus Rift CV1, which has a refresh rate of 90 Hz (i.e., ~11.1 ms per frame). The reported variability (95% CIs) was highly consistent across the full range of imposed display lags.

Fig. 3 Cross-correlations in the time domain of the targets used to estimate head-display lag. Mean cross-correlations are plotted as a function of the temporal offset of the yaw-modulated calibration signal relative to the change in position of the reference target. Separate curves and 95% confidence bands show data for the baseline lag condition (solid line) and maximum lag condition (dashed line). The dotted vertical lines show the locations of the peaks in cross-correlation at baseline (−5.3 ms ± 1.2 ms 95% CI) and maximum latency (−212.8 ms ± 1.3 ms 95% CI)

3.3 Effect of varying head-display lag on perceptual judgment tasks

Figure 4 plots the means and standard errors for the two dependent variables examined in this study (vection strength and presence). The results of mixed-design ANOVAs are reported separately for each of the two outcome measures in the paragraphs that follow.

Fig. 4 Outcome measures plotted as a function of display lag and head oscillation frequency. Separate plots show (A) vection strength ratings and (B) presence ratings. Hollow points show data for the 0.5 Hz conditions, and solid points show data for the 1.0 Hz conditions. Error bars are standard errors of the mean

We used the recorded YES/NO cybersickness data to construct a between-subjects grouping variable, comparing vection strength and presence across participants who reported cybersickness on at least one trial and those who remained well throughout. This criterion classified 13 participants as sick and 10 as well.

Vection strength ratings are plotted in Fig. 4A as a function of display lag and the two frequencies of head oscillation. A mixed-model ANOVA found no significant between-subjects effect of reported cybersickness on vection strength (F(1,21) = 0.11, p = 0.75). There was a significant main effect of display lag on vection strength (F(5,105) = 4.08, p < 0.005), as well as a significant main effect of oscillation frequency (F(1,21) = 16.05, p < 0.001). There were no significant interactions between display lag, oscillation frequency and reported cybersickness on vection strength.

Spatial presence ratings are plotted in Fig. 4B as a function of display lag for the two frequencies of head oscillation. A mixed-model ANOVA found no significant between-subjects effect of reported cybersickness on presence (F(1,21) = 2.51, p = 0.13). There was a significant main effect of display lag on presence ratings (F(5,105) = 2.89, p < 0.05), as well as a significant main effect of oscillation frequency (F(1,22) = 13.25, p < 0.005). There were no significant interactions between display lag, oscillation frequency and reported cybersickness on presence ratings.

3.4 Relationships between vection and presence

Like regression (Lorch and Myers 1990), correlational analyses assume that the data represent independent samples. We therefore first obtained the average vection strength and presence ratings for each participant. There was a significant correlation between vection strength and presence for the 0.5 Hz head oscillation conditions (r = +0.61, p < 0.005), as well as for the 1.0 Hz head oscillation conditions (r = +0.62, p < 0.005). Figure 5 plots vection strength as a function of presence for the two frequencies of head oscillation; the thick solid line shows the line of best fit for the data averaged across head oscillation frequency. Because each participant contributed multiple points to this plot (repeat data from different lag levels), Lorch and Myers (1990) recommend fitting a linear least-squares model to each participant’s data and then using one-sample t-tests to assess the significance of the slope and intercept parameters. We performed this analysis on the relationship between vection strength and presence (a minimal sketch of the fitting procedure is given below Fig. 5). For the 0.5 Hz condition, the average linear model was: Vection strength = 1.48 × Presence + 51.8. One-sample t-tests found that the model slope was significantly different from zero (t(22) = 4.48, p < 0.0005), as was the model intercept (t(22) = 4.00, p < 0.001). For the 1.0 Hz condition, the average linear model was: Vection strength = 2.35 × Presence + 47.2. Again, one-sample t-tests found that the slope (t(22) = 4.46, p < 0.0005) and the intercept (t(22) = 5.95, p < 0.00001) were both significantly different from zero. Note that the positive intercepts imply that participants could experience some vection even when they did not feel at all present in the display.

Fig. 5 Vection strength plotted as a function of presence. Hollow points and the dotted line show data points and the line of best fit for the 0.5 Hz condition. Solid points and the dashed line show data points and the line of best fit for the 1.0 Hz condition. The thick solid line shows the line of best fit for data averaged across head oscillation frequency
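To make the Lorch and Myers (1990) procedure concrete, the following C++ sketch (our illustration; function and variable names are assumptions, not the study’s actual analysis code) fits a least-squares line to one participant’s (presence, vection) pairs and computes the one-sample t statistic used to test the per-participant slopes (or intercepts) against zero:

```cpp
// Sketch of the Lorch and Myers (1990) style analysis: fit a least-squares
// line to each participant's (presence, vection) pairs, then test the set of
// per-participant slopes (or intercepts) against zero with a one-sample t-test.
#include <cmath>
#include <cstddef>
#include <vector>

struct Line { double slope, intercept; };

// Ordinary least-squares fit of y = slope * x + intercept.
Line FitLeastSquares(const std::vector<double>& x, const std::vector<double>& y) {
    const double n = static_cast<double>(x.size());
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; sxy += x[i] * y[i];
    }
    const double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    return { slope, (sy - slope * sx) / n };
}

// One-sample t statistic for H0: the population mean of 'values' is zero.
double OneSampleT(const std::vector<double>& values) {
    const double n = static_cast<double>(values.size());
    double mean = 0, ss = 0;
    for (double v : values) mean += v;
    mean /= n;
    for (double v : values) ss += (v - mean) * (v - mean);
    const double sem = std::sqrt(ss / (n - 1.0)) / std::sqrt(n);
    return mean / sem;
}
```

Applied to the 23 participants here, one line is fit per participant, the 23 slopes (and intercepts) are collected, and the resulting t statistic is evaluated against a t distribution with 22 degrees of freedom, matching the tests reported above.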

4 Discussion

We presented our optic flow displays (simulating self-motion) to participants wearing an Oculus Rift CV1 HMD. As the system used in this study was found to have a very low baseline lag (~5.3 ms), we were able to assess the effects of systematically increasing head-to-display lag on both vection strength and presence. We found that increasing head-to-display lag significantly decreased both vection strength and reported feelings of presence. As predicted, we also found moderate to strong correlations between vection strength and presence in HMD-VR (for both the 0.5 Hz and the 1.0 Hz head movement conditions). These significant relationships between vection and presence were similar to those previously reported by studies using large external displays viewed during passively generated self-rotations (e.g., Riecke et al. 2006; Keshavarz et al. 2018). For example, Keshavarz et al. (2018) examined the relationship between vection and presence when simulated viewpoint oscillations were added to displays simulating self-motion in depth. They found significant correlations between vection and presence for conditions of passive rotation in pitch and roll, but not for yaw. By contrast, the results of our study reveal a strong relationship between vection and presence for actively generated yaw head rotations. Our measures of head rotation confirmed that these rotations were predominantly in yaw, with very little pitch or roll. It is possible that this relationship arose because the increased head-to-display lag actively reduced presence, which in turn reduced the potential for the simulation to induce vection in depth (on the assumption that it is easier to perceive that you are moving through a virtual environment if you already feel present in that environment). However, contrary to this notion, we found that some vection could still be induced when observers had presence ratings of zero (see the positive model intercepts in Fig. 5).

Some of the variability in responses across participants in this study may have been due to eye movements. While we instructed observers to look ahead into the distance, no fixation point was provided to suppress their eye movements. Previous work has shown that active central fixation can impair the vection generated by angular viewpoint oscillation (similar to that used here), but not by linear viewpoint oscillation (Kim et al. 2012). It is likely that if observers had made eccentric eye movements, vection strength would have increased rather than decreased. Indeed, Palmisano and Kim (2009) showed that vection could be increased significantly by periodic eccentric fixations relative to the expanding flow field. More recently, Moroz et al. (2018) found that active ego-centric fixation reduced sensitivity to detecting modulations in head-to-display gain during both passive and active yaw rotations. They also found that world-centric fixation increased sensitivity to these modulations, which may have increased susceptibility to cybersickness.

It has been suggested that undesired perceptual effects (such as cybersickness) could be caused by ‘variability’ in latency, rather than by the latency of HMDs per se (Moss et al. 2011; St Pierre et al. 2015; Kinsella et al. 2016). We performed calibrations at baseline and at the maximum imposed lag to ascertain how consistent the measured latency of the Oculus Rift CV1 was over our 40 s trials. The estimated variability in sampled latencies was similar across the baseline (low latency) and highest imposed system latency conditions. This finding suggests that the effects of head-display lag on our vection and presence measures cannot be explained by variance in system latency over time. Rather, we propose that the perceptual effects we observed were caused by differences between the simulated and physical head orientations achieved over time. This proposal is supported by the finding that vection and presence measures were lower when participants made faster head movements (i.e., 1.0 Hz compared with 0.5 Hz head oscillations).

The findings of the present study suggest that head-to-display lag affects both vection and presence. There was a strong relationship between vection and presence, suggesting that these two percepts may be related. Recent studies have shown that vection is strongest in conditions where the simulated head orientation matches the physical head orientation (Kim et al. 2015; Palmisano et al. 2017). Future work will hopefully determine whether differences between physical and virtually perceived head orientations can account for declines in vection like those observed in the present study.

The potential effects of HMD constraints on perceptual experience should also be considered in future studies. Riecke and Jordan (2015) found that reducing the field of view (using either an external display or an HMD) reduced the latency of vection onset, and this effect was consistent across display types. However, they found no difference in vection strength with changes in field of view. By contrast, Basting et al. (2017) found a positive relationship between vection strength and the HMD’s field of view. These discrepancies may be attributable to differences in the type of simulated display motion and in the structure of the scene; the amount of physical head movement could also contribute. For example, viewpoint perspective changes associated with lateral linear head displacement can prime the onset of vection (Palmisano and Riecke 2018). Future work should therefore consider the potential effects of an HMD’s field of view and other constraints on outcome measures. Fortunately, field of view was unlikely to have influenced the results of the present study, as display size was not varied when presenting our virtual environment.

Combining linear and angular perspective changes may also help to mitigate cybersickness. For example, the findings of the current study may help us to understand the physiological mechanisms underlying certain types of redirected walking (RDW), a technique that aims to extend the space users can explore on foot beyond the confines of their physical environment (Razzaque et al. 2018). To this end, angular shifts in the simulated viewing direction are imposed as a function of the distance walked in meters, altering the user’s physical walking path. This causes users to perceive that they are walking along a straight path in the virtual environment when they are really traversing a curvilinear path within a small, finite area of physical floor space. Evidence in the literature suggests that 20°/m RDW can minimize adverse experiences like cybersickness, even though this gain is four times the detection threshold (Razzaque et al. 2018). The severity of cybersickness experienced during RDW also tends to adapt within a few minutes (Hildebrandt et al. 2018). Perceived scene instability in these situations may have a lower impact on cybersickness because: (1) the resulting angular rate is low (20°/m equates to 40°/s when walking at 2 m/s, below the ~45°/s threshold at which Allison et al. (2001) observed perceived display destabilization); (2) users may not make active angular head rotations when walking along a straight path in depth; (3) considerably lower levels of display lag would be imposed in those studies; and (4) linear visual-vestibular conflicts do not generate strong cybersickness, even when large amounts of scene instability are perceived (Kim et al. 2021). Together, these findings suggest that adaptation, combined with angular and linear gain control in the absence of lag, might be an effective strategy for imposing sensorimotor changes without generating significant side effects like cybersickness.

A further consideration for future research is the role that display lag may play in cybersickness. We monitored cybersickness using an insensitive YES/NO report and found no difference in the effects of system lag on vection or presence between participants classified as sick and those classified as well. Feng et al. (2019) found that increasing display lag increased cybersickness severity, but a small amount of cybersickness was reported even at very low latencies. They proposed that constraints inherent in the presentation of content on the display itself may generate cybersickness. In a follow-up study, Kim et al. (2020) found that increasing display lag increased perceived scene instability in addition to cybersickness, and that the magnitude of perceived scene instability was predictive of cybersickness. In earlier work, Prothero (1998) proposed that background motion is important for cybersickness, finding that introducing a stable visual background behind the virtual scene using a half-silvered mirror was sufficient to mitigate cybersickness while preserving vection. More recent research has shown that cybersickness in HMD-VR appears to depend on stereoscopic disparity (Palmisano et al. 2019) and restrictions on the simulated depth of field (Carnegie and Rhee 2015). It would therefore appear that stabilizing image content at different simulated depths might be critical for minimizing cybersickness and its severity. Fortunately, exciting approaches are also being implemented in augmented and mixed reality for optimizing the fidelity of display alignment, so as to minimize perceptual incompatibility when mixing virtual content with real-world visual information (Yokokohji et al. 2000; Freiwald et al. 2018).

Ultimately, measuring changes in perceived self-orientation should provide critical insight into the multisensory conflicts that underlie the perceptual advantages users enjoy at extremely low system latencies (e.g., Riecke et al. 2015). These benefits will be ensured not by future advances in adaptive latency-reducing algorithms (e.g., time warp) alone, but by the ongoing psychophysical research needed to validate their effectiveness.