Introduction

It is generally accepted that visual information is processed in two separate, but interacting, neural pathways (Milner and Goodale 2008) that arise from primary visual cortex: the vision-for-perception system (ventral stream, projecting to inferior temporal cortex) and the vision-for-action system (dorsal stream, projecting to superior parietal cortex and intraparietal sulcus). The two systems use visual information in different ways: the ventral stream is concerned mainly with object recognition (‘what’ system) while the dorsal stream processes visuo-spatial information for control of visually guided actions (‘how’ system). Evidence in favor of this double route comes from neuropsychological studies of patients with selective cortical lesions, from human neuroimaging and from classic neuroanatomical and behavioral studies on non-human primates (see Ungerleider and Mishkin 1982). Furthermore, important evidence has been provided by studies of visual illusions in normal observers. In a pioneering experiment, Aglioti et al. (1995) used a 3-D version of the Ebbinghaus–Titchener illusion in which a target disk surrounded by small circles appears to be larger than an identical disk surrounded by large circles. They demonstrated that even though size judgments are affected by the illusion, the scaling of the grip aperture reflects the true size of the disk. A similar dissociation has been observed in other experiments using the Ebbinghaus–Titchener illusion (Amazeen and DeSilva 2005; Fischer 2001; Haffenden and Goodale 1998; Kwok and Braddick 2003), as well as in experiments using the Ponzo illusion (Brenner and Smeets 1996; Jackson and Shaw 2000), the horizontal–vertical illusion (Servos et al. 2000; but see Vishton et al. 1999 and Wolfe et al. 2005 for examples of controversial evidence on this illusion), the Müller–Lyer illusion (Otto-de Haart et al. 1999; Dewar and Carey 2006), and the rod-and-frame illusion (Dyde and Milner 2002).
These studies found that even if perceptual judgment is deceived by pictorial illusions, grasping is refractory to them. However, other studies have shown that both perception and action can be misled by illusions (Donkelaar 1999; Franz et al. 2000, 2001, 2003; Pavani et al. 1999; Smeets and Brenner 2006). There are many possible explanations for these discrepant results, such as the kind of motor task and illusion employed, the timing of the movement (real-time vs. delayed movement), and learning and attentional factors (Goodale et al. 2008; Bruno et al. 2008; Bruno and Franz 2009).

We would like to point out, however, that the aim of the present experiments was not to cast more light on this rather controversial issue but to clarify the functional characteristics of the two visual systems. We were interested in finding out whether, in contrast to grasping and reaching, simple manual reaction time (RT) to visual stimuli is affected by visual illusions. The reason for suspecting a difference is that, while in grasping and reaching the kinematics of the motor response reflect the spatial location and the size or shape of the target, simple RT represents a stereotyped speeded response in which participants report their conscious perception of the presence or onset of the target. Thus, even though reacting to a visual stimulus obviously entails a motor action, the ventral rather than the dorsal system might be involved, given its crucial role in perceptual awareness.

We employed the Ponzo and the Ebbinghaus–Titchener illusions, which share, albeit as a consequence of different perceptual processes (Gregory 2005), a misperception of size. The rationale for employing a simple RT paradigm is based on the classical finding that RT decreases as stimulus size increases (e.g., Marzi et al. 2006; Osaka 1976; Payne 1967) and that this relationship reflects the perceived rather than the retinal size of the image (Sperandio et al. 2009). If RT is sensitive to visual illusions, then stimuli perceived as bigger should be responded to faster than those perceived as smaller despite an identical retinal size.

Experiment 1: Ponzo illusion

In this experiment, we employed a simple manual RT paradigm and a naturalistic version of the Ponzo perspective illusion in which converging railway lines give a vivid impression of depth (Fig. 1a).

Fig. 1

Experiment 1: a upper/lower lines with a Ponzo background; b upper/lower lines with a 2D aerial photograph as background. The length of the horizontal lines is the same. c Mean RT to upper and lower lines as a function of background

According to Emmert’s law, the perceived size of an object depends on the retinal angle subtended multiplied by its perceived distance. Therefore, when one of two identical horizontal lines is perceived as more distant, it looks illusorily bigger than the other and, as a consequence, RT to it should be faster.
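The size-distance scaling described above can be illustrated with a minimal sketch. The helper name and the two perceived distances are hypothetical values chosen for illustration; only the 5.5° line length is taken from the Methods. The sketch uses the geometric relation size = 2 · d · tan(angle / 2), which for small angles reduces to the product of angle and distance.

```python
import math

def perceived_size(visual_angle_deg, perceived_distance_cm):
    """Linear size implied by a visual angle at a given perceived distance.

    Hypothetical helper: size = 2 * d * tan(angle / 2), a geometric form of
    the size-distance scaling summarized by Emmert's law.
    """
    half_angle = math.radians(visual_angle_deg) / 2.0
    return 2.0 * perceived_distance_cm * math.tan(half_angle)

# Same retinal angle (5.5 deg) at two assumed perceived distances.
# The distances are illustrative, not measured in the experiment.
near = perceived_size(5.5, 57.0)
far = perceived_size(5.5, 63.0)

# With the angle fixed, perceived size scales in proportion to perceived distance.
print(round(far / near, 3))  # 1.105
```

With the retinal angle held constant, the ratio of the two sizes equals the ratio of the two assumed distances, which is the sense in which the line seen as farther away should look bigger.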

Methods

Sixteen right-handed participants (seven males) with normal or corrected-to-normal visual acuity took part in the experiment. Their age ranged between 20 and 31 years (mean 23.44). Each participant was seated in front of a PC monitor (Sony Trinitron Multiscan E530) at a distance of 57 cm in a dimly lit room. Participants were asked to keep their gaze on the fixation point at the centre of the screen and to respond to the onset of the stimuli as quickly as possible by pressing the space bar of the PC keyboard with their right index finger. An acoustic warning stimulus (250 ms duration) signaled participants to start fixating steadily and warned them of the incoming stimulus. The interval between acoustic warning and visual stimulus onset was randomized within the temporal window of 500–700 ms.

Two displays were used. One, which emphasized depth cues and gave a vivid Ponzo illusion, consisted of two identical red geometric lines located one above the other with a colour photograph of converging railway tracks as background (Fig. 1a). The other display was intended to minimize perspective depth cues and consisted of two lines identical to those of the first display with a colour aerial photograph of a flat landscape as background (Fig. 1b). Clearly, only in the Ponzo condition did the upper line appear illusorily longer than the lower line. The background was present throughout the whole trial and was therefore already visible when the stimulus appeared on the screen. It is important to specify that in the RT experiment only a single line was presented together with one or the other background, in order to single out RT to either line and thus assess the effect of the illusion by comparing RT to the upper versus the lower line presented with the same background. Four kinds of target stimuli were presented with an exposure of 120 ms, namely, the upper or the lower line with the railway as background (inducing the illusion) and the upper or the lower line with the aerial photograph as background (not inducing the illusion). The participant was to respond to the appearance of the line (length: 5.5°) presented 5° either above or below the fixation point. Line luminance was 22 cd/m² and that of the two backgrounds was about 20 cd/m². Half of the participants began the experiment with the “Illusion Display” and the other half with the “No Illusion Display” in a blocked sequence. There were 60 trials for each of the four conditions of stimulus presentation plus 32 catch trials in which, after the warning signal, only the background was presented and the participant was to refrain from responding. Thus, the overall number of presentations for each participant was 272 (4 × 60 + 32).
The range of accepted RTs was 140–650 ms; the total percentage of trials with shorter or longer RT or misses was 0.8%. Since the proportion of omission errors was negligible, it was not statistically analyzed. In addition, at the beginning and end of the experiment participants performed a size-matching task to assess the presence and extent of the illusion. They had to adjust the length of the lower line to match that of the upper line in the display by pressing two different keys. No time limit was applied in this task. The strength of the illusion was measured as the number of pixels that had to be added to the lower line for it to be perceived as equal to the upper line.
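The trial bookkeeping and the RT acceptance window described in this section can be sketched as follows. This is only an illustration under stated assumptions: the function names and the example RT list are hypothetical, and treating the 140 and 650 ms bounds as inclusive is an assumption, since the text does not specify it.

```python
RT_MIN_MS = 140  # lower bound of accepted RTs (from the Methods)
RT_MAX_MS = 650  # upper bound of accepted RTs (from the Methods)

def total_presentations(n_conditions, trials_per_condition, n_catch):
    """Number of presentations per participant, catch trials included."""
    return n_conditions * trials_per_condition + n_catch

def split_trials(rts_ms):
    """Separate accepted RTs from rejected trials.

    rts_ms holds one RT in ms per target trial, or None for a miss.
    Bounds are treated as inclusive (an assumption).
    """
    accepted = [rt for rt in rts_ms
                if rt is not None and RT_MIN_MS <= rt <= RT_MAX_MS]
    n_rejected = len(rts_ms) - len(accepted)
    return accepted, n_rejected

# Hypothetical RTs: one anticipation (120), one miss (None), one slow response (702).
example_rts = [120, 285, 301, None, 276, 702, 290]
accepted, n_rejected = split_trials(example_rts)

print(total_presentations(4, 60, 32))  # 272
print(accepted, n_rejected)            # [285, 301, 276, 290] 3
```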

Results and discussion

A two-way ANOVA was carried out on RT data with Line position (upper vs. lower) and Background (illusion vs. no illusion) as main factors. The main effect of Line position was not significant (F(1,15) = 1.392, p = 0.256), nor was that of Background (F(1,15) = 3.45, p = 0.083). However, their interaction was significant (F(1,15) = 6.86, p < 0.02). Post hoc t tests showed that the difference between the upper (284.52 ms) and lower (290.44 ms) line was significant (t(15) = 2.48, p < 0.03) only in the illusion condition (Fig. 1c). Thirteen of the 16 participants showed this effect.
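The logic of the interaction and of the post hoc comparison can be sketched with per-participant difference scores. The sketch assumes a fully within-participant 2 × 2 design, which the reported degrees of freedom (1,15 with 16 participants) suggest but the text does not state. Under that assumption, the interaction F equals the squared one-sample t of the difference of differences. The data below are hypothetical and are not the study's data, and the helper name is illustrative.

```python
import math
import statistics

def one_sample_t(values):
    """t statistic for testing whether the mean of `values` differs from zero."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return mean / (sd / math.sqrt(n))

# Hypothetical per-participant mean RTs in ms as (upper, lower) pairs.
illusion = [(280, 288), (291, 295), (275, 284), (302, 304), (288, 297)]
no_illusion = [(286, 284), (296, 293), (281, 282), (305, 301), (294, 292)]

# Simple effect within the illusion background: lower minus upper.
simple_effect = [lower - upper for upper, lower in illusion]

# Interaction contrast: (lower - upper | illusion) - (lower - upper | no illusion).
interaction = [
    (il_low - il_up) - (no_low - no_up)
    for (il_up, il_low), (no_up, no_low) in zip(illusion, no_illusion)
]

t_simple = one_sample_t(simple_effect)  # paired comparison, upper vs. lower
t_inter = one_sample_t(interaction)
f_inter = t_inter ** 2                  # with 1 numerator df, F equals t squared
df_error = len(illusion) - 1            # n - 1 error degrees of freedom

print(interaction, df_error)
```

A positive interaction contrast means that the upper-line advantage is larger with the illusion background than with the no-illusion background, which is the pattern the analysis above tests.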

This result indicates that RT is sensitive to the Ponzo illusion, since the upper line, which appeared illusorily longer than the lower line, was responded to more quickly. Of course, given that ours is a simple RT paradigm, it has to be made clear that when participants press the space bar they are indicating their detection or awareness of the presence of the target rather than their explicit perception of its size. Perceived size certainly affects RT but it is not an element of the task. Depth cues in a naturalistic version of the Ponzo illusion induced a misperception of size sufficient to affect RT. In contrast, when an aerial photograph providing no depth cues was used as background, a non-significant advantage for the line in the lower visual field was observed. This is in keeping with the well-documented RT advantage of the lower with respect to the upper visual field (for a review see Previc 1990). Clearly, this advantage was not sufficient to counterbalance the illusory effect of size induced by the Ponzo figure, and in the illusory depth condition the upper line yielded faster RTs. It is interesting that the perceptual effect of the illusion as measured by the adjustment procedure described in the “Methods” was 9.4%, a value somewhat lower than, but compatible with, the 10–15% found on average (Carlson 1962; Leibowitz et al. 1969). The perceptual strength of the illusion in our study was clearly larger than that assessed with RT, which amounted to 2.1%.
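The size of the RT effect can be recomputed from the reported means. The text does not say which mean serves as the denominator; the sketch below assumes the difference is expressed relative to the faster (upper-line) mean, which reproduces the reported 2.1%. The function name is hypothetical.

```python
def percent_difference(slower_ms, faster_ms):
    """RT difference as a percentage of the faster mean RT (assumed denominator)."""
    return 100.0 * (slower_ms - faster_ms) / faster_ms

upper_ms = 284.52  # mean RT to the upper line, illusion background (reported)
lower_ms = 290.44  # mean RT to the lower line, illusion background (reported)

effect_ms = lower_ms - upper_ms
print(round(effect_ms, 2))                               # 5.92
print(round(percent_difference(lower_ms, upper_ms), 1))  # 2.1
```

Dividing by the slower mean instead gives a value that rounds to 2.0%, so the choice of denominator matters slightly at this precision.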

It is worth pointing out that in a pilot experiment we used a simpler background and obtained a similar, albeit slightly smaller, effect. Thus, the naturalistic background reinforced an illusion already present in a simpler version.

Experiment 2: Ebbinghaus–Titchener illusion

To confirm that RT is influenced by visual illusions, we carried out another study using the Ebbinghaus–Titchener circles illusion, which is based on a different perceptual phenomenon from that underlying the Ponzo illusion, namely size contrast (Gregory 2005). Participants were to respond manually as quickly as possible to a filled circle that was surrounded by an annulus of either small or large circles. If RT is sensitive to the illusion, the target circle surrounded by the annulus of small circles, which appears bigger than an identical one surrounded by large circles, should be responded to more quickly.

Methods

Twelve right-handed participants (seven males), different from those of the previous experiment, with normal or corrected-to-normal visual acuity took part in the experiment. Their age ranged between 21 and 30 years (mean 25.46).

The procedure was similar to that of the previous experiment. Two kinds of black-and-white visual stimuli were presented, namely, a circle surrounded by an annulus of large circles or an identical circle surrounded by an annulus of small circles (Fig. 2a, b). The target circle, to which participants were to respond independently of the surrounding annulus, subtended 2° in diameter and was presented in the centre of the monitor with an exposure duration of 80 ms. The luminance of the target circles, as well as that of the annulus circles, was 2.04 cd/m² and that of the background was 0.001 cd/m². The two kinds of display were presented in two different experimental blocks. The number of elements was identical for both annuli (6). Half of the participants began the experiment with the “Large Annulus” and the other half with the “Small Annulus” as background.

Fig. 2

Experiment 2: a Circle surrounded by large annulus, b circle surrounded by small annulus. The two circles in the centre are physically identical. c Mean RT to the target circle as a function of surrounding annulus size

An acoustic warning stimulus (250 ms duration) prompted the participants to maintain steady fixation. The interval between acoustic warning and visual stimulus onset was randomized within the temporal window of 500–700 ms. There were 60 trials for each of the two backgrounds plus 16 catch trials, giving an overall number of 136 presentations for each participant. As in “Experiment 1”, the range of accepted RTs was 140–650 ms. The overall percentage of anticipations, delayed responses and misses was 1.5%. The very small number of omissions was not statistically analyzed.

Results and discussion

A one-way ANOVA with annulus size (large vs. small) as factor yielded a significant effect (F(1,11) = 6.975, p < 0.02), with the small annulus display yielding faster RT (287.82 ms) than the large annulus display (294.82 ms) (Fig. 2c). Ten of the 12 participants showed this effect.

This result is in keeping with that of “Experiment 1” in that RT reflected the illusory perception of the circle surrounded by the smaller annulus as bigger than the other. It is worth pointing out that the faster response to the small annulus display shows that participants did indeed respond to the target circle rather than to the whole display. Had they responded to the whole display, RT should have been faster for the bigger rather than the smaller overall display, given the well-known inverse relationship between stimulus size and RT (Marzi et al. 2006; Osaka 1976; Payne 1967). Finally, it is interesting to note that the strength of the illusory effect as measured with RT was 2.4%, a value which is very close to that found in the previous experiment for the Ponzo illusion but much smaller than that found in the psychophysical assessment. In “Experiment 2” we did not use a psychophysical estimation at the beginning or end of RT testing as in “Experiment 1”. This rules out the possibility that the psychophysical assessment of the illusion carried out in “Experiment 1” might have biased the RT correlates of the illusion in that experiment.

Conclusions

A general point emerging from the results of the present study is that simple RT to visual stimuli reflects perception rather than the mere retinal image. This is in keeping with a recent study on size constancy (Sperandio et al. 2009) in which it was found that RT varies as a function of perceived rather than retinal stimulus size. In the present study, we found that this principle can be generalized to two different visual illusions: the Ponzo illusion, which is a consequence of size constancy, and the Ebbinghaus–Titchener illusion, which results from size contrast. Clearly then, speed of response in a simple RT paradigm is controlled by perceptual rather than by physical parameters of the stimulus. Sometimes the two parameters go against each other. For example, in “Experiment 1” we found a non-significant RT advantage for the lower line when a no-illusion background was used, reflecting a well-documented superiority of the lower hemifield in speed of RT (Previc 1990). This effect is likely to have diminished the speed advantage of the upper line with the illusion background and might be responsible for the smaller effect of the Ponzo illusion on RT with respect to the effect found with a matching procedure. Why is simple RT affected by visual illusions of depth and size contrast while grasping is not? As far as illusions induced by depth are concerned, they fail to generate effects in grasping experiments probably because the grasping procedure itself disambiguates depth, since one knows how far the arm is extended when reaching. In contrast, RT measures an action that does not disambiguate depth, and as such preserves the illusion. Moreover, the pressing of the space bar in this experiment reports conscious perception of the presence or onset of the target stimulus.
Indeed, according to Milner and Goodale (2008), a target must be detected (and sometimes identified) by the ventral stream before an action, such as grasping, can be directed at that target on the basis of visuomotor transformations mediated by the dorsal stream. Thus, a general conclusion, which applies to size contrast illusions as well, is that a speeded motor response such as simple RT, which is sensitive to illusions, is likely to be subserved by connections between the ventral stream and the motor system rather than by visuomotor networks in the dorsal ‘vision-for-action’ system. This ‘exception’ to the rule enables one to better characterize the differences and the similarities between the two streams of visual processing. Finally, as far as an interpretation of the perceptual bases of visual illusions is concerned, it remains to be ascertained at what cognitive level the visual features of illusory stimuli are processed. Important clues have been provided by recent functional magnetic resonance imaging (fMRI) experiments (Fang et al. 2008; Murray et al. 2006) indicating that size constancy and related visual illusions may be subserved by the primary visual cortex, i.e., at early levels of visual processing. These findings are broadly in keeping with the effects of illusions on simple RT found in the present study. However, the possibility of an important role of recurrent projections from higher order areas should be taken into consideration and verified in future electrophysiological studies.