Introduction

Illusions of self-motion perception can be induced by presenting global optic flow to stationary observers—these purely visual illusions are commonly referred to as “vection” (Dichgans and Brandt 1978).

The influence of a large variety of factors on vection has been explored, including field size (e.g., Held et al. 1975; Lestienne et al. 1977), stimulus speed (Brandt et al. 1973), distance to stimulus (Delmore and Martin 1986; Ohmi et al. 1987; Ohmi and Howard 1988; Howard and Heckman 1989; Telford et al. 1992; Ito and Shibata 2005), spatial frequency (Sauvan and Bonnet 1993, 1995), eccentricity in the visual field (Palmisano and Gillam 1998), acceleration of motion (Melcher and Henn 1981), and luminance (Gurnsey et al. 1998). To the best of our knowledge, one important property that has not been studied is the smoothness of motion.

When successive images in a sequence (each image is termed a “frame”) are presented to the viewer, differences among frames are perceived as motion. This is the underlying phenomenon that makes motion pictures possible. The perceived smoothness of motion of a stroboscopic image sequence depends on the frame rate, which is the number of frames presented per unit time [typically expressed as frames per second (fps)]. When the frame rate of a movie is too low, motion is not perceived at all. At low frame rates, flicker is also an issue. The frame rate of conventional movies (24–30 fps) is lower than the flicker fusion threshold (Hecht and Shlaer 1936; Davis et al. 2015). Techniques such as multi-flash protocols (projecting the same frame multiple times) can reduce or eliminate flicker but do not enhance motion smoothness at all. However, recent developments in digital technology allow us to capture higher frame rate movies and to project them directly without multi-flash protocols. Therefore, in this study, we focused on perceived smoothness rather than flicker.

Smoothness also affects the perception of quality and realism in a movie (Wilcox et al. 2015). When capturing (or rendering) a movie at a limited frame rate, artifacts (for example, motion blur and judder) can arise due to spatiotemporal sampling of the continuous (‘infinite frame rate’) real world. Increasing the frame rate can reduce the occurrence or impact of these artifacts (Banks et al. 2012; Watson 2013). Frame rates for conventional film and TV were standardized at 24–30 fps, but recent developments in digital technology enable the use of higher frame rates.

Subjective smoothness of motion reportedly affects the perceived quality of movies and the spatiotemporal processing underlying motion perception. For example, the magnitude of the flash lag effect was found to decrease with increasing inter-frame displacements of a moving target (Rizk et al. 2009). Because they used a constant velocity stimulus, the increase in inter-frame displacement in this experiment was accompanied by a reciprocal reduction in frame rate. They found a correlation between the displacement at which the rate of decrease in the flash lag effect was largest and the transition from the subjective appearance of smooth motion to discontinuous motion.

The appearance of smooth motion is related to the adequacy of the spatiotemporal sampling of the stimulus and can be predicted from appropriate models of motion processing. Adelson and Bergen (1985) developed a model of motion detection in which spatiotemporal filtering is used to detect luminance-defined motion and argued that this model could be implemented by neurons in V1 (Emerson et al. 1992). These models are thought to detect luminance-defined motion information, i.e., “motion energy”, from motion stimuli.

Based on these models, smoother motion in high frame rate content should activate V1 more effectively because it produces more motion energy than low frame rate movies. We ran a simulation of a moving bar at various frame rates based on the motion energy model and calculated the motion energy (see “Appendix 1” for detailed information). The results of the simulation are shown in Fig. 1 and demonstrate the impact of frame rate on motion energy.

Fig. 1

Results of simulation based on the motion energy model: horizontal axis indicates relative frame rate (the maximum frame rate = 1) and vertical axis indicates relative motion energy (motion energy at the maximum frame rate = 1)

In turn, enhancement of low-level motion processing could enhance other perceptual phenomena (e.g., the flash lag effect). Consistent with this proposal, Miura et al. (2010) reported that the cortical networks of motion-sensitive visual areas showed increased activity correlated with increases in motion smoothness.

The motion energy model has also been used to explain vection strength, for example, the difference in vection strength between first- and second-order motion. Gurnsey et al. (1998) compared the effectiveness of contrast-defined motion (second-order motion) and luminance-defined motion (first-order motion) in inducing vection. Their results showed that luminance-defined motion induced stronger vection than the equivalent contrast-defined motion. This result suggests that the amount of “motion energy” is important for producing vection. In this account, the differences in vection strength reflect the differences in activation levels in V1 between first- and second-order motion, i.e., the difference in the amount of motion energy.

Snowden and Braddick (1989) investigated the relationship between frame rate and the signal-to-noise ratio threshold for coherent motion in random dot kinematograms. They found that thresholds decreased as frame rate increased and, using a motion processing model similar to the motion energy model, indicated that processing at high frame rates can differ from processing at low frame rates.

Based on these findings, we hypothesized that smoother motion could induce stronger vection because vection strength is closely related to motion perception. When the smoothness of motion is high, the stimulus should contain more motion energy and activate V1 more effectively. As a result, vection should be stronger owing to the activation of neurons sensitive to first-order motion, in the same manner as for the luminance-defined motion in Gurnsey et al. (1998).

Experiment 1

Method

To investigate the effect of smoothness of motion, we controlled the frame rate of self-motion stimulus movies and measured the strength of vection.

Stimuli

Stimuli were movies of horizontal gratings defined by sinusoidal luminance modulation drifting downward (Fig. 2; stimuli can be downloaded from http://www.senotake.jp/stimulus/2016/). The grating wavelength was 22° (0.045 cycles/°), maximum luminance was 16.2 cd/m², minimum luminance was less than 0.01 cd/m² (which was the minimum sensitivity of the photometer), and the Michelson contrast of the gratings was 100%. The grating moved downward (inducing upward vection) at a speed of 22°/s (i.e., the temporal period of the sinusoidal wave was 1 s) and the stimulus duration was 40 s. The stimulus subtended 100° (horizontal) by 81° (vertical) and the viewing distance was 57 cm. There was no fixation point, but participants were instructed to gaze at the center of the screen. In some of our previous studies, similar stimuli on the same apparatus induced substantial vection (e.g., Seno et al. 2014).

Fig. 2

The downward grating stimulus used in Experiment 1

The main experimental manipulation was the frame rate of the motion stimulus, which produced variations in motion smoothness. Seven frame rate conditions were used: 3, 4, 6, 12, 20, 30 and 60 fps. Higher frame rates corresponded to smoother motion, and lower frame rates produced juddered or discontinuous motion. The refresh rate of the display was constant at 60 Hz, so a multi-flash protocol was used to vary the frame rate. That is, to produce frame rates lower than 60 fps, the same frame image was presented multiple times in a row. For example, each image was repeated 20 times in the 3 fps condition, 15 times in the 4 fps condition and so on.
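
To make the multi-flash arithmetic concrete, the following Python sketch computes, for each frame rate, how many 60 Hz refreshes each image is held for and how far the grating jumps at each image update. It is an illustration based only on the parameters stated above, not the Matlab/Psychtoolbox code used in the experiment; the function name `multiflash_schedule` and the constant names are hypothetical.

```python
# Illustrative sketch of the multi-flash schedule (not the original stimulus code).
REFRESH_HZ = 60          # display refresh rate, from the text
SPEED_DEG_PER_S = 22.0   # grating speed, from the text
WAVELENGTH_DEG = 22.0    # grating wavelength, from the text
FRAME_RATES = [3, 4, 6, 12, 20, 30, 60]


def multiflash_schedule(fps, refresh_hz=REFRESH_HZ):
    """Return (repeats, step_deg, step_cycles) for one frame rate condition.

    repeats     -- number of consecutive refreshes showing the same image
    step_deg    -- grating displacement at each image update, in degrees
    step_cycles -- the same displacement as a fraction of one grating cycle
    """
    if refresh_hz % fps != 0:
        raise ValueError("frame rate must divide the refresh rate evenly")
    repeats = refresh_hz // fps
    step_deg = SPEED_DEG_PER_S / fps
    step_cycles = step_deg / WAVELENGTH_DEG
    return repeats, step_deg, step_cycles


schedule = {fps: multiflash_schedule(fps) for fps in FRAME_RATES}

for fps, (repeats, step_deg, step_cycles) in schedule.items():
    print(f"{fps:2d} fps: {repeats:2d} repeats, "
          f"step {step_deg:.2f} deg ({step_cycles:.3f} cycle)")
```

Under these assumptions, the largest jump occurs in the 3 fps condition and equals one third of a grating cycle (about 7.3°), which is below the half-cycle limit beyond which the direction of a sampled sinusoidal grating becomes ambiguous.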

Importantly, all experimental parameters other than frame rate (speed, spatial frequency, direction of motion, refresh rate and so on) were identical across the seven conditions.

Apparatus

Stimuli were generated and controlled with Matlab (R2014b, Mathworks, USA) and Psychtoolbox-3 (Brainard 1997; Pelli 1997; Kleiner et al. 2007) on a computer (ALIENWARE M18x, Dell, USA) and presented on a plasma display (3D VIERA TH-65AX800, Panasonic, Japan; 65-inch (165 cm) screen and 1920 × 1080 pixel resolution at 60 Hz refresh rate). The experiment was conducted in a dark room and participants sat on a rocking chair to enhance induction of vection. To promote vection, neither a chin rest nor a head rest was used. Viewing distance was 57 cm.

Procedure

Participants were instructed to gaze at the center of the stimulus while it was presented. They were also instructed to press and hold a response button whenever they perceived self-motion (i.e., vection); these button presses were used to measure the latency and duration of vection. After each trial, vection strength was measured with the method of magnitude estimation. The participants rated the subjective vection strength using a 101-point rating scale ranging from 0 (no vection) to 100 (very strong vection). Very strong vection meant that participants perceived self-motion very naturally, as if they were moving throughout the stimulus presentation. This methodology has been used in several other studies (Seno et al. 2010a, 2014) and has been confirmed as a valid means to evaluate vection strength. Each participant completed all seven smoothness conditions in random order, with four repetitions of each condition.

As described above, we used three indices to measure the strength of vection: the latency of vection (the onset delay between the start of the stimulus and the button press indicating the first experience of vection in a trial), the duration of vection (the total time that the button was pressed during a trial), and the magnitude of vection (the subjective strength of vection by magnitude estimation). These indices have been commonly used in previous studies and are generally believed to give consistent results. However, there are reports that they can be inconsistent when vection is weak (e.g., Seno et al. 2013) or when instructions introduce cognitive biases (e.g., Palmisano and Chan 2004). To avoid and check for these effects, we measured all three indices simultaneously.
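
The two button-based indices can be derived from the press and release times recorded in a trial. The Python sketch below shows one way to do this under stated assumptions: times are in seconds from stimulus onset, intervals do not overlap (a single button), and a trial with no button press yields no latency (`None`). The function name `vection_indices`, the handling of trials without vection, and the example data are hypothetical and are not taken from the original analysis.

```python
# Illustrative sketch: latency and duration of vection from button-press intervals.
TRIAL_DURATION_S = 40.0  # stimulus duration, from the text


def vection_indices(press_intervals, trial_duration=TRIAL_DURATION_S):
    """Return (latency, duration) in seconds for one trial.

    press_intervals -- list of (press_time, release_time) tuples, measured
                       from stimulus onset; assumed not to overlap.
    latency  -- time of the first button press, or None if never pressed
                (an assumption; the text does not say how such trials
                were scored).
    duration -- total time the button was held during the trial.
    """
    # Clip each interval to the trial window and drop empty intervals.
    clipped = []
    for start, end in press_intervals:
        start = max(0.0, start)
        end = min(trial_duration, end)
        if end > start:
            clipped.append((start, end))

    if not clipped:
        return None, 0.0

    latency = min(start for start, _ in clipped)
    duration = sum(end - start for start, end in clipped)
    return latency, duration


# Hypothetical trial: vection from 8.5 s to 20 s, then from 24 s to the end.
example = vection_indices([(8.5, 20.0), (24.0, 40.0)])
print(example)
```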

Participants

Ten adult volunteers (students and clerical staff of Kyushu University) and the first author of this article (mean age 28.18 years, SD 12.78, three females) participated in this experiment (see Footnote 1). All participants were healthy and had normal or corrected-to-normal vision and no history of vestibular system diseases. None of them (except for the first author) was aware of the purpose of the experiment.

Ethics statements

Both experiments were pre-approved by the Ethics Committee of Kyushu University, and were conducted following the Declaration of Helsinki. Written informed consent was obtained from each participant before starting experiments.

Results and discussion

The results are shown in Fig. 3. The three graphs plot mean values averaged across participants for the three indices of vection strength as a function of frame rate. The results clearly showed that vection strength increased with increasing frame rate, and one-way ANOVAs indicated that the main effects of frame rate were significant (Magnitude: F(6,60) = 34.75, p < 0.001; Latency: F(6,60) = 18.73, p < 0.001; Duration: F(6,60) = 27.65, p < 0.001). This pattern is consistent with our hypothesis that the smoothness of motion affects vection strength and that judder disturbs induction of vection.
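
The degrees of freedom reported above are those expected for a within-subject (repeated-measures) design with seven frame rate conditions and eleven participants. The short Python sketch below checks this arithmetic; the function name is hypothetical, and the within-subject design is inferred from the reported values rather than stated explicitly in the text.

```python
# Illustrative check of the ANOVA degrees of freedom (design inferred, see lead-in).
def rm_anova_df(n_participants, n_conditions):
    """Degrees of freedom for a one-way repeated-measures ANOVA."""
    df_effect = n_conditions - 1
    df_error = (n_conditions - 1) * (n_participants - 1)
    return df_effect, df_error


df_exp1 = rm_anova_df(n_participants=11, n_conditions=7)
print(f"F({df_exp1[0]},{df_exp1[1]})")
```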

Fig. 3

The results of Experiment 1: strength of vection for the three indices of vection strength, i.e., magnitude, latency and duration. The dashed lines indicate the asymptotes of the fitted curves, the dotted lines indicate ± 5% of the value estimated at infinite frame rate, and the error bars indicate SEs

However, the rate of increase in vection strength with frame rate was not constant: vection strength increased rapidly in the low range of frame rates and appeared to asymptote in the high range.

These characteristics are very similar to the simulation results in Fig. 1. This supports the prediction that increasing the frame rate produces more motion energy and, in turn, stronger vection.

The asymptote is expected because a real scene can induce vection and it has limited vection strength even though it has an infinite frame rate. To characterize the relationship between vection strength and frame rate, we fit the following function to the results using the least-squares method.

$$y = a \times \exp(-bx) + c$$

where x is the frame rate (fps), y is the vection index (magnitude, latency or duration), and a, b and c are fitting parameters.

The function was selected because it is a simple exponential function that captures the observed saturating behavior: it changes rapidly in the low range and asymptotically approaches a value (c in the formula) at infinity. The fitted curves are drawn in Fig. 3. The fitted parameters (a, b, c) are (−68.2, 0.0612, 69.7) for magnitude, (26.3, 0.0907, 8.92) for latency and (−28.5, 0.0729, 25.7) for duration. R² values were 0.987 (magnitude), 0.997 (latency) and 0.997 (duration). Note that the a parameter for latency is opposite in sign because latency decreases rather than increases with increasing vection strength.

The fitted functions indicate that for frame rates exceeding 48.5 fps for magnitude, 45.5 fps for latency and 42.5 fps for duration, the fitted strengths are within ± 5% of the value predicted at infinite frame rate (the c parameter of the fit; i.e., greater than \(0.95 \times c\) for magnitude and duration, and less than \(1.05 \times c\) for latency). We refer to these values as the economical frame rates because further increases in frame rate produce little improvement in vection, whereas higher frame rates incur costs in storage, bandwidth and other aspects of image transmission and display. These values also indicate that the traditional 24 or 30 fps of ordinary movies is not sufficient to induce maximal vection.
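
The economical frame rate follows directly from the fitted function: the fitted value lies within 5% of c once |a| exp(−bx) = 0.05 c, that is, at x = ln(|a| / (0.05 c)) / b. The Python sketch below applies this to the rounded parameters reported above. It is an illustration rather than the original analysis code, and the names `FITS_EXP1`, `fitted_value` and `economical_frame_rate` are hypothetical.

```python
import math

# Rounded fit parameters (a, b, c) for Experiment 1, as reported in the text.
FITS_EXP1 = {
    "magnitude": (-68.2, 0.0612, 69.7),
    "latency": (26.3, 0.0907, 8.92),
    "duration": (-28.5, 0.0729, 25.7),
}


def fitted_value(x, a, b, c):
    """Evaluate y = a * exp(-b * x) + c at frame rate x (fps)."""
    return a * math.exp(-b * x) + c


def economical_frame_rate(a, b, c, tolerance=0.05):
    """Frame rate at which the fit comes within `tolerance` of its asymptote c.

    Solves |a| * exp(-b * x) = tolerance * c for x (assumes b > 0 and c > 0).
    """
    return math.log(abs(a) / (tolerance * c)) / b


economical = {name: economical_frame_rate(*params)
              for name, params in FITS_EXP1.items()}

for name, rate in economical.items():
    print(f"{name}: {rate:.1f} fps")
```

With the rounded parameters this gives roughly 48.6, 45 and 42.5 fps, close to the reported 48.5, 45.5 and 42.5 fps; the small differences presumably reflect rounding of the published parameters.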

There was high consistency among the three vection indices. This indicates that the effect of the smoothness of motion on vection was very strong and reliable. When the smoothness of the motion stimulus is high and it contains more motion energy, vection can be facilitated.

Experiment 2

In Experiment 2, we investigated the effect of smoothness with expanding optic flow because vection in depth has different properties compared to parallel translation. For example, extended forward self-motion occurs more frequently in our daily lives than upward self-motion. This difference in ecological validity between forward and upward self-motion could be reflected in the economical frame rate that characterizes the effect of frame rate on vection strength. Furthermore, unidirectional stimuli can be detected by V1, but expanding optic flow requires integrating motion information over space and needs the involvement of higher visual processing areas such as MT (middle temporal area) and MST (medial superior temporal area) (Tanaka et al. 1986; Tanaka and Saito 1989). This difference in the level of motion processing might be reflected in the effect of frame rate on vection strength. To test these possibilities, we repeated Experiment 1 using expanding circular grating stimuli to induce forward vection.

Method

Stimuli

An expanding circular grating was presented as a stimulus to induce forward vection (Fig. 4; stimuli can be downloaded at http://www.senotake.jp/stimulus/2016/demo.html). The stimulus simulated looking and travelling forward through a cylinder with a sinusoidal grating texture. The simulated dimensions of the tunnel were 266 cm for the radius of the cylinder and 444 cm for the wavelength of the texture. In the image projected on the display plane, the grating had the correct perspective depth cue for a constant-radius tunnel extending in depth; that is, the period of the grating increased with eccentricity in the projected image, as seen in Fig. 4. The luminance of the grating was the same as in Experiment 1. The speed of simulated self-motion was about 444 cm/s (i.e., the temporal period of the grating was 1 s). The mathematical projection of the simulated surfaces onto the displayed image is described in “Appendix 2”. Because the grating texture becomes infinitely small in the image at infinite distance, a gray disc was placed at the center of the screen to occlude the far end of the tunnel and mask spatial aliasing artifacts. The size of the disc was 7.2° (it corresponds to a simulated distance of 75 m farther than the display distance). The duration of the movie was 40 s, as in Experiment 1. The same seven frame rates (3, 4, 6, 12, 20, 30, and 60 fps) were used as in Experiment 1.

Fig. 4

The expanding circular grating stimulus used in Experiment 2

Participants

Eleven naïve volunteers and the first author of this article participated in this experiment (mean age 30.41 years, SD 11.78, four females). Three of them (two naïve volunteers and the author) also participated in Experiment 1. All participants were healthy and had normal or corrected-to-normal vision and no history of vestibular system diseases or disorders.

Results and discussion

The results are shown in Fig. 5 (see Footnote 2). The results show a clear effect of smoothness on vection strength, as in Experiment 1. One-way ANOVAs also showed significant main effects of frame rate (Magnitude: F(6,66) = 27.70, p < 0.001; Latency: F(6,66) = 10.98, p < 0.001; Duration: F(6,66) = 26.74, p < 0.001).

Fig. 5

The results of Experiment 2: strength of vection for the three indices of vection strength, i.e., magnitude, latency and duration. The dashed lines indicate the asymptotes of the fitted curves, the dotted lines indicate ± 5% of the value estimated at infinite frame rate, and the error bars indicate SEs

We fit the exponential function to the results as in Experiment 1. The fitted curves are drawn in Fig. 5, and the fitted parameters (a, b, c) are (−65.3, 0.138, 63.0) for magnitude, (36.2, 0.356, 7.90) for latency and (−30.7, 0.174, 26.4) for duration. R² values were 0.985 (magnitude), 0.966 (latency) and 0.974 (duration).

The economical frame rates are 22.0 fps for magnitude, 12.7 fps for latency and 18.1 fps for duration. These are much lower than those of Experiment 1. On the other hand, the strength of vection in the high frame rate range was similar in the two experiments.
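
One way to see what this difference means for conventional content is to evaluate the fitted magnitude curves of both experiments at 24 and 30 fps and express the result as a fraction of the asymptote c. The Python sketch below does this with the rounded parameters reported in the text; it is illustrative only, and the names `MAGNITUDE_FITS` and `fraction_of_asymptote` are hypothetical.

```python
import math

# Rounded magnitude fit parameters (a, b, c) reported for each experiment.
MAGNITUDE_FITS = {
    "Experiment 1 (downward grating)": (-68.2, 0.0612, 69.7),
    "Experiment 2 (expanding grating)": (-65.3, 0.138, 63.0),
}
CONVENTIONAL_FPS = [24, 30]


def fraction_of_asymptote(fps, a, b, c):
    """Fitted vection magnitude at `fps`, relative to the asymptote c."""
    return (a * math.exp(-b * fps) + c) / c


ratios = {
    name: {fps: fraction_of_asymptote(fps, *params) for fps in CONVENTIONAL_FPS}
    for name, params in MAGNITUDE_FITS.items()
}

for name, by_fps in ratios.items():
    summary = ", ".join(f"{fps} fps: {ratio:.0%}" for fps, ratio in by_fps.items())
    print(f"{name}: {summary}")
```

With these parameters, the fitted magnitude at 24 and 30 fps is about 77% and 84% of the asymptote for the downward grating, but about 96% and 98% for the expanding grating.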

General discussion

We investigated the effect of the smoothness of stimulus motion on the strength of both upward and forward vection. In both cases, the results clearly showed that stronger vection was induced by smoother motion. These results support our hypothesis that smoother motion induces stronger vection.

Additionally, the two sets of results were qualitatively very similar in that vection strength increased rapidly with frame rate at low frame rates and asymptotically approached a limiting value at high frame rates. This similarity could suggest a common vection process that does not depend on direction. The fitted function was also similar to the results of the simple simulation shown in Fig. 1.

The motion energy model helps us to understand the effect of frame rate. According to the model, a large displacement on the retina can produce strong motion energy when the frame changes, even in a low frame rate condition. However, between frame updates, the image displacement is zero because of the multi-flash protocol (even if the multi-flash protocol were not used, a dark blanking or filling gap would act similarly). This lack of displacement reduces motion energy and consequently also vection.
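
The effect of holding each image between updates can be illustrated with a much simpler calculation than the motion energy simulation of “Appendix 1”. The Python sketch below is not that simulation and not the Adelson and Bergen model; it swaps in a standard Fourier argument for a sample-and-hold display. For a grating drifting at temporal frequency f0 and updated at F fps, the fraction of stimulus power that remains at the true temporal frequency is sinc²(f0/F), with the remainder spread over alias replicas at f0 + nF. We assume the 1 Hz drift rate of Experiment 1; all function names are hypothetical.

```python
import math

TEMPORAL_FREQ_HZ = 1.0  # 22 deg/s over a 22 deg wavelength (Experiment 1)
FRAME_RATES = [3, 4, 6, 12, 20, 30, 60]


def sinc(x):
    """Normalized sinc function, sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def retained_power_fraction(fps, f0=TEMPORAL_FREQ_HZ):
    """Fraction of grating power left at the true temporal frequency f0
    after sample-and-hold presentation at `fps`."""
    return sinc(f0 / fps) ** 2


def replica_power_fraction(fps, n, f0=TEMPORAL_FREQ_HZ):
    """Fraction of grating power in the alias replica at f0 + n * fps."""
    return sinc(f0 / fps + n) ** 2


retained = {fps: retained_power_fraction(fps) for fps in FRAME_RATES}

for fps, fraction in retained.items():
    print(f"{fps:2d} fps: {fraction:.3f} of power at the true drift frequency")
```

Under this simplified proxy, the retained fraction rises from about 0.68 at 3 fps toward 1 at high frame rates, which has the same qualitative saturating shape as the relationship discussed here, although it should not be read as a quantitative prediction of motion energy or of vection strength.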

Differences in perceived realism might also be a factor. Wilcox et al. (2015) reported that higher frame rate movies are rated higher in perceptual quality, including realism. Other studies have suggested that increases in the plausibility of a portrayed scene enhance vection (Nakamura 2010; Seno et al. 2010). That is, in this account, higher frame rate movies enhanced not only motion energy but also the realism of self-motion, and the higher realism in turn induced stronger vection. Consistent with this proposal, some studies have indicated that the cognitive state of the participants, or cognitive instructions given to them, can modulate vection strength (Mast et al. 2001; Ogawa and Seno 2014; Palmisano and Chan 2004), and higher realism should enhance recognition. More generally, realism could be a key factor determining vection strength because many factors that facilitate vection, such as visual field and viewing distance, also influence perceived realism. Further research is needed to distinguish the effects of these cognitive factors from low-level factors such as motion energy.

The experimental results also revealed differences in sensitivity to frame rate between upward and forward vection stimuli. The results of Experiment 1 indicated that more than 40 fps were required to saturate vection strength. This suggests a perceptual advantage of high frame rate movies over traditional 24–30 fps movie content. However, results of Experiment 2 may indicate that traditional frame rates are enough for vection in some kinds of movies.

The difference in the effect of frame rate on vection strength between upward and forward vection could reflect the fact that, ecologically, extended forward self-motion occurs much more frequently than upward self-motion. Another possibility is that the process of integrating motion information is much more complex for the expanding grating than for the downward grating. In the expanding grating condition, low-level features such as spatial frequency, motion direction and speed vary across the display, whereas they are uniform across the display in the downward grating. The wide range of variation in these basic parameters also stimulates a broader set of motion detectors and requires the involvement of higher cortical areas and more complex integration in the former than in the latter case. The local grating motion can be detected by local motion detectors in primary visual cortex (V1) (Snowden et al. 1991; Emerson et al. 1992). Therefore, the downward motion can induce upward self-motion directly (or after an easier integration process than that for the expanding grating), but the complex integration of expanding optic flow relies heavily on higher motion processing, i.e., MT and MST (Tanaka et al. 1986; Tanaka and Saito 1989). Thus, the relationship between motion energy and vection might be more direct in the upward vection condition than in the forward vection condition, consistent with the greater influence of frame rate in the former condition. This hypothesis is consistent with the finding that contrast-defined second-order motion induced weaker vection than luminance-defined first-order motion (Gurnsey et al. 1998), which suggests that low-level motion detectors are more efficient at producing the motion signal driving vection in lamellar optic flow displays.

The difference in the economical frame rate between the expanding grating and the downward grating might be explained in terms of the relationship between complexity of motion and processing time. Burr and Santoro (2001) found that complex motion stimuli, such as radial or circular motion in a random dot kinematogram, take longer to process than simple stimuli (translational motion of random dots). Their study suggests that the expanding grating would take longer to process than the downward grating. It is conceivable that the longer processing time could be associated with reduced temporal resolution of motion processing and a lower economical frame rate in radial compared to translational stimuli.

We should also note that in the looming case, some parts of the stimulus (the periphery) were moving very fast and others (the centre) were moving very slowly. This gradient of motion speed could result in variation in the effect of frame rate across the looming stimulus, which could explain the difference between the two stimulus conditions.

It is necessary to point out that we used only one grating stimulus for each direction of motion, i.e., a single speed, spatial frequency, luminance and so on. Thus, it is possible that the economical frame rates could be different under other conditions, and it is also possible that the directional difference in economical frame rates that we found may depend on other parameters that were not varied in the current study. Therefore, we cannot definitively conclude that the economical frame rate is always lower for forward than for upward vection. Future experiments could reveal the influence of these parameters on the frame rate required for robust vection. However, the current study clearly shows that the economical frame rates depend critically on the stimulus and that the smoothness of motion is an important factor.

Conclusion

Motion smoothness affects the perception of vection, and higher frame rates can induce stronger vection. However, the increase in vection strength with higher frame rate saturates to an asymptotic limit, so that further increases in frame rate yield diminishing returns.