Introduction

The brain’s ability to integrate information from separate sensory estimates is critical for creating a unified and coherent representation of the environment. The integration of cross-modal (Spence and Driver 2004; Stein and Stanford 2008; Meredith and Stein 1986) and within-modal (Schröter et al. 2007; Murray et al. 2001; Savazzi and Marzi 2002, 2008) stimuli offers many benefits, such as enhanced discrimination and accelerated reactions to objects. Surprisingly, only a few studies have explored how the beneficial effects obtained in multisensory conditions differ from those obtained when combining redundant stimuli of the same sensory modality (Forster et al. 2002; Laurienti et al. 2006; Gingras et al. 2009).

Whereas inputs derived from different senses provide independent estimates of the same event, inputs from the same modality can exhibit substantial covariance in the information they provide. Thus, it might be expected that two spatio-temporally concordant stimuli from two different modalities will produce a greater gain in performance than the combination of two concordant stimuli from the same modality (Ernst and Banks 2002; Stein et al. 2009). In contrast, one might assume that multisensory and unisensory integration would yield equivalent results because multisensory response enhancement simply reflects the presence of more environmental energy or multiple, redundant stimuli (Miller 1982; Stein et al. 2009). Supporting the first assumption, a recent study contrasting the behavioral outcomes of cross-modal and within-modal integration in cats performing a localization task demonstrated that cross-modal stimuli led to enhanced performance compared to within-modal pairs (Gingras et al. 2009). Neurophysiological studies in cats also point to major distinctions between multisensory and unisensory integration. When integrating multiple cues from different senses, neurons of the superior colliculus (SC), a primary site for multisensory integration (Stein and Meredith 1993), show additive or super-additive responses that are equal to or greater than the sum of the responses to the individual components. In contrast, multiple inputs from the same modality presented to the same SC neurons produce responses that are typically lower than the sum of their components (Alvarado et al. 2007). Further demonstrating that different neurophysiological mechanisms underlie unisensory and multisensory integration, reversibly deactivating cortico-collicular inputs from the anterior ectosylvian sulcus (AES) disrupts the multisensory integrative capabilities of their target neurons in the SC, but not their capacity to integrate within-modal stimuli (Jiang et al. 2001; Alvarado et al. 2007). These animal studies have received support from a network model that accounts for the computations underlying cross-modal and within-modal integration (Cuppini et al. 2010). Altogether, these animal studies suggest that within-modal and cross-modal stimulations produce different neurophysiological responses and behavioral outcomes, with cross-modal stimulations usually leading to enhanced gains.

In humans, the behavioral outcome of sensory integration can notably be investigated through simple reaction time (SRT) paradigms, which show a significant decrease in reaction time (RT) when two or more stimuli are presented simultaneously rather than individually (Todd 1912). This effect is classically referred to as the redundancy gain (RG) (Hershenson 1962; Raab 1962). Different explanations have been put forward to account for the RG, the most common being the race model and the coactivation model. The race model proposes that each individual stimulus elicits an independent detection process. On a given trial, the fastest process determines the observable RT. On average, the time to detect the fastest of several redundant signals is shorter than the detection time for a single signal, so the speeding up of RTs is attributable to statistical facilitation. When the race model's prediction is violated, the speedup of RTs cannot be attributed to a statistical effect alone but some kind of coactivation must have occurred (Colonius and Diederich 2004, 2006). To account for violations of the race model's prediction, the coactivation model (Miller 1982) proposes that the neural activations elicited by both stimuli combine to induce faster responses. Testing the race model inequality is widely used as an indirect behavioral measure of the neurophysiological integrative processes underlying RT facilitation (Murray et al. 2001; Forster et al. 2002; Zampini et al. 2007; Tajadura-Jiménez et al. 2008; Molholm et al. 2002; Gielen et al. 1983).
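For illustration, the following minimal simulation (ours, not taken from any of the cited studies; the RT distributions are illustrative assumptions) shows how an independent race shortens mean RTs without any coactivation, and how Miller's inequality bounds the facilitation a race can produce:

```python
# Minimal sketch of statistical facilitation under the race model.
# The ex-Gaussian-like RT distributions are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

rt_v = 280 + rng.exponential(70, n)  # hypothetical visual-alone RTs (ms)
rt_t = 300 + rng.exponential(70, n)  # hypothetical tactile-alone RTs (ms)

# Race model: on each redundant trial the faster process wins,
# so mean RT drops with no neural interaction at all.
rt_race = np.minimum(rt_v, rt_t)
print(f"single: {rt_v.mean():.0f} / {rt_t.mean():.0f} ms, "
      f"redundant (race): {rt_race.mean():.0f} ms")

# Miller's inequality: F_red(t) <= F_v(t) + F_t(t) for every t.
# A pure race never exceeds this bound; empirical redundant RTs that
# do exceed it at some latency indicate coactivation.
for t in (300, 350, 400):
    f_red = np.mean(rt_race <= t)
    bound = min(1.0, np.mean(rt_v <= t) + np.mean(rt_t <= t))
    print(f"t = {t} ms: F_red = {f_red:.3f} <= bound = {bound:.3f}")
```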

To date, the only SRT study directly comparing the RG produced by cross-modal and within-modal stimulations demonstrated that cross-modal stimuli violated the race model prediction over a substantial proportion of the reaction time distribution, whereas the gain associated with redundant unimodal targets could be accounted for by statistical facilitation (Forster et al. 2002). In that experiment, however, the redundant unisensory stimuli were only presented bilaterally, leaving unanswered how the spatial configuration of the stimuli differentially affects cross-modal and within-modal integration.

SRT studies investigating the impact of the stimuli's spatial configuration on the RG typically show comparable facilitative effects for spatially aligned and spatially misaligned stimuli presented across hemispaces for visuo-tactile (Forster et al. 2002; Girard et al. 2011), audio-tactile (Murray et al. 2005; Zampini et al. 2007) and audio-visual stimuli (Teder-Sälejärvi et al. 2005). It is worth noting here that the processing of spatially aligned and misaligned stimuli is equivalent in terms of multisensory integration when spatial information is irrelevant to the task, as in SRT paradigms (but see Gondan et al. 2005 for small spatial congruency effects in a task that does not require spatial discrimination). However, when the spatial information is made relevant to the task, cross-modal stimuli presented at different locations result in a smaller RG than stimuli presented at the same spatial location (Diederich et al. 2003; Harrington and Peck 1998; Bolognini et al. 2005). Indeed, we recently emphasized that task requirements are crucial in triggering spatial congruency effects on multisensory integration (Girard et al. 2011).

In comparison with the cross-modal literature, the impact of spatial configuration on redundant unimodal targets has been scarcely investigated. In vision, SRT studies investigating within-modal interactions have yielded inconsistent results. Some studies showed comparable RGs for redundant stimuli in unilateral and bilateral configurations (Murray et al. 2001; Ouimet et al. 2009), while others demonstrated larger RGs for bilateral than unilateral pairs of visual stimuli (Miniussi et al. 1998; Corballis et al. 2002). To date, evidence of coactivation has been observed only for bilateral presentations (Corballis et al. 2002; Hughes et al. 1994; Savazzi and Marzi 2002). In touch, the effect of double tactile stimulation within or between hands has mainly been used to investigate tactile identification and discrimination (Craig 1985; Evans and Craig 1991; Haggard et al. 2006; Tamè et al. 2011) and temporal order judgment (Craig and Baihua 1990; Clark and Geffen 1990). While these studies suggest a clear advantage of stimulating both hands in identification and discrimination tasks, no study has compared the RG for within- and between-hand stimulations in a SRT experiment.

To the best of our knowledge, no study has directly investigated how the spatial congruence of redundant targets might differentially influence cross-modal and within-modal integration. This is of major importance since the relative advantage of cross-modal over within-modal integration might vary across specific spatial configurations. Since it has been suggested that a RG only occurs when each stimulus of a pair produces its own independent percept (Mordkoff and Yantis 1993; Schröter et al. 2011), it could be hypothesized that the integration of spatially adjacent and physically identical stimuli from the same modality has minimal impact on RT. In other words, such stimuli would produce overlapping internal or neural representations, which might covary up to the point that their integration is not beneficial for behavior. A RG would only emerge when the stimuli possess a distinguishing characteristic such as a spatial discrepancy. With cross-modal stimuli, however, a RG might be observed regardless of the spatial congruency of the targets because they originate from different sensory systems. Testing the extent of cross-modal and within-modal integration under different spatial configurations should provide further insight into whether the spatial alignment of the redundant targets (and therefore the independence of the sensory estimates of this event) has the same effect on the behavioral outcome of both types of integration. We therefore designed the present experiment to compare the RG and violations of the race model inequality yielded by both cross-modal and within-modal combinations when visual and tactile targets are presented in spatially congruent or incongruent configurations.

Methods

Participants

Sixteen right-handed (Oldfield 1971) participants (8 males; mean age of 24 years, SD = 2.3 years; range from 20 to 29 years) were recruited to take part in the experiment. None of the participants reported a history of neurological or psychiatric problems. They all reported normal tactile sensitivity and normal or corrected-to-normal vision. The study was approved by the “Comité d’Éthique de la Recherche de la Faculté des Arts et des Sciences” (CÉRFAS) of the Université de Montréal and all subjects gave their written informed consent prior to inclusion in the study.

Apparatus and stimuli

Somatosensory stimuli were delivered for 100 ms using a pneumatic tactile stimulator (Institute for Biomagnetism and Biosignal Analysis, University of Muenster, Germany). A plastic membrane (1 cm in diameter) was attached to the distal volar part of the index and middle fingers and was inflated by a pulse of air pressure delivered through a rigid plastic tube. The plastic tubes connecting the stimulator to the participants’ fingertips were inserted into the testing room through a hole padded with sound-attenuating foam to ensure that tactile stimulations were completely silent from inside the room. Due to large interindividual differences in sensitivity to somatosensory stimuli, intensity was individually calibrated to obtain reliable but weakly salient stimulations. This procedure resulted in a mean pressure of 13.99 kPa (range approximately 9.99–25.03 kPa). Participants’ hands were positioned at a distance of approximately 56 cm from their head, and their fingertips were placed at 7.5 (index) and 9.5 (middle finger) degrees of visual eccentricity to the right and left of a central fixation cross (see Fig. 1). Since tactile stimulations could produce small but perceptible finger movements, participants’ hands were placed under a white plastic board.

Fig. 1
figure 1

Experimental setup. Schematic view of the experimental setup and stimulation conditions. Tactile stimuli were delivered to the index and middle fingers of each hand, and visual stimuli were projected on a surface above the stimulated fingers. All conditions including two stimuli were presented either in an aligned configuration (both stimuli in the same hemispace) or misaligned configuration (both stimuli presented in different hemispaces)

Visual stimuli consisted of white circles subtending 1 degree of visual angle presented against a gray background for 100 ms. These visual stimuli were delivered to the right or left of the central fixation cross at 7.5 and 9.5 degrees of eccentricity. This procedure ensured that the initial neural representation in the visual cortex was lateralized (Sereno et al. 1995). Visual stimuli were presented by a projector fixed to the room's ceiling onto a plastic board located 105–155 mm above the stimulated fingertips. Stimuli were displayed and reaction times were recorded using Presentation software (Neurobehavioral Systems, Inc., Albany, US).

Procedures

Participants sat in a silent and dimly lit room with their head on a chinrest. They were instructed to respond as fast as possible to the onset of any stimulus by pressing a button fixed on a small box with their right or left thumb. After each block, participants switched the hand they used to respond, and the hand used in the first block was counterbalanced across participants. Breaks were encouraged between blocks to maintain a high concentration level and prevent mental fatigue. Participants’ gaze was monitored throughout the experiment via a camera to ensure that they maintained central fixation.

Participants were presented with (1) a tactile stimulus alone, (2) aligned double tactile stimuli, (3) misaligned double tactile stimuli, (4) a visual stimulus alone, (5) aligned double visual stimuli, (6) misaligned double visual stimuli, (7) aligned visuo-tactile stimuli and (8) misaligned visuo-tactile stimuli. This yielded 24 stimulus configurations (4 tactile alone, 4 visual alone, 2 aligned double tactile, 2 aligned double visual, 2 misaligned double tactile, 2 misaligned double visual, 4 aligned visuo-tactile, 4 misaligned visuo-tactile). Aligned conditions consisted of two stimuli presented in the same hemifield, whereas misaligned conditions consisted of two stimuli presented in different hemifields. All stimulation conditions are presented in the schematic view of the experimental setup (Fig. 1). Whenever double unimodal or cross-modal stimuli were presented in the same hemifield, one stimulus was delivered to the index finger (tactile) or above it (visual), and the second to the middle finger or above it. The same logic was applied to misaligned stimuli: all misaligned stimuli were presented to the left at 7.5° and simultaneously to the right at 9.5°, or vice versa.

Participants completed six blocks of 260 trials, with each of the 24 stimulus configurations presented 10 times per block. Each block also contained 20 catch trials (8 %) in which no stimulus was presented; these were included to discourage anticipatory responses. A total of 60 trials per condition were recorded. The intertrial interval varied randomly between 1,600 and 3,600 ms (mean ITI = 2,600 ms). The fixation cross was displayed throughout the experiment. Each block lasted approximately 11 min.
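For concreteness, one block of this design could be generated along the following lines (a sketch under our assumptions; the condition labels are hypothetical placeholders, and the actual experiment was scripted in Presentation):

```python
# Sketch of one block: 24 stimulus configurations x 10 repetitions
# plus 20 catch trials = 260 trials, with a jittered intertrial interval.
import random

configurations = [f"config_{i:02d}" for i in range(24)]  # hypothetical labels
trials = configurations * 10 + ["catch"] * 20
random.shuffle(trials)

itis_ms = [random.uniform(1600, 3600) for _ in trials]  # mean ~2,600 ms
print(len(trials), "trials; mean ITI =",
      round(sum(itis_ms) / len(itis_ms)), "ms")
```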

Data analysis

Only RTs between 100 and 1,000 ms were considered for analysis. As a result, less than 1 % of trials per condition were discarded. Since there was no main effect of the responding hand in the RT data, RTs obtained with both hands were averaged. Furthermore, RTs obtained for each redundant condition (either within-modal or cross-modal) were averaged separately as aligned (both stimuli presented in the same hemifield) or misaligned (each stimulus presented in opposite hemifields) depending on their spatial locations.

The RG was computed as the percentage decrease in the mean RT obtained in a redundant condition relative to the mean RT obtained for the best single condition (Stein and Meredith 1993). For each condition and each participant separately, the mean RT of the redundant condition was subtracted from the mean RT of its fastest constituent stimulus, and the difference was divided by the mean RT of that fastest constituent, yielding the percentage decrease in RT between the redundant condition and its best constituent [(RT best stimulation − RT redundant)/RT best stimulation]. The RGs were then submitted to repeated measures analysis of variance (ANOVA). Post hoc analyses with Bonferroni correction were used when appropriate.
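For illustration, the computation reduces to the following (a sketch with hypothetical variable names; the inputs are per-participant mean RTs in milliseconds):

```python
# Redundancy gain: percent decrease in RT for the redundant condition
# relative to its best (fastest) constituent stimulus.
def redundancy_gain(rt_single_a: float, rt_single_b: float,
                    rt_redundant: float) -> float:
    """RG = (RT_best - RT_redundant) / RT_best * 100."""
    rt_best = min(rt_single_a, rt_single_b)
    return (rt_best - rt_redundant) / rt_best * 100

# Illustrative values only:
print(f"{redundancy_gain(310.0, 335.0, 285.0):.1f} %")  # -> 8.1 %
```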

To further investigate the RG differences obtained for within-modal and cross-modal conditions, the race model inequality was analyzed using the RMITest software, which implements the algorithm described at length in Ulrich et al. (2007). This procedure involves several steps. First, cumulative distribution functions (CDFs) of the RT distributions are estimated for every participant and every condition (i.e., visual alone, tactile alone, redundant unimodal and cross-modal conditions). Second, the bounding sum of the two CDFs obtained from the two unimodal conditions (visual and tactile) is computed for each participant. This measure provides an estimate of the boundary at which the race model inequality is violated. Third, percentile points are determined for every RT distribution, including the estimated bound, for each participant. In the present study, the race model inequality was evaluated at the 5th, 15th, 25th… 95th percentile points of the RT distributions. Fourth, for each percentile, the mean RTs for the redundant conditions and the bound are compared using two-tailed one-sample t tests with Bonferroni correction to control for Type I errors due to multiple comparisons (Ulrich et al. 2007). If any percentile shows significantly faster RTs in the redundant condition relative to the bound, it can be concluded that the race model cannot account for the facilitation of the redundant signal conditions, supporting the existence of an integrative process.
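The core of the procedure can be sketched as follows (our paraphrase of its logic, not the RMITest implementation; array and function names are hypothetical):

```python
# Sketch of the race model inequality test for one participant:
# rt_v, rt_t and rt_red are arrays of single-trial RTs (ms).
import numpy as np

probs = np.arange(5, 100, 10)          # 5th, 15th, ..., 95th percentiles
grid = np.arange(100.0, 1000.0, 1.0)   # latency grid spanning valid RTs

def percentile_points(rts):
    # Steps 1 and 3: percentile points of an empirical RT distribution.
    return np.percentile(rts, probs)

def bound_points(rt_v, rt_t):
    # Step 2: bounding sum min(1, F_v + F_t) of the two unisensory CDFs,
    # then (Step 3) inverted to percentile points: first t with F(t) >= p.
    f_v = np.searchsorted(np.sort(rt_v), grid, side="right") / rt_v.size
    f_t = np.searchsorted(np.sort(rt_t), grid, side="right") / rt_t.size
    f_bound = np.minimum(1.0, f_v + f_t)
    return grid[np.searchsorted(f_bound, probs / 100.0)]

# Step 4 (group level): stack percentile_points(rt_red) and
# bound_points(rt_v, rt_t) across participants; at each percentile,
# a Bonferroni-corrected two-tailed one-sample t test on the difference
# (redundant - bound) tests for violation (significantly negative
# differences mean the redundant RTs beat the race model bound).
```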

Results

On average, participants detected 97.8 % of all tactile stimuli (range from 94.8 to 99.8 %), 98.4 % of visual stimuli (range from 97.7 to 99.1 %) and 99.0 % of multisensory pairs (range from 98.8 to 99.4 %). Participants responded to <1 % of catch trials throughout the experiment. Mean RTs obtained for single, within-modal and cross-modal conditions can be found in Fig. 2.

Fig. 2
figure 2

Reaction times. Mean reaction time (in milliseconds) and standard errors of the mean (SEM) for pooled single, within-modal and cross-modal conditions. Capital letters refer to the modality (V visual, T tactile) and spatial configuration (A aligned, M misaligned) for each combination. The error bars represent the SEM for within-subject designs, following Loftus and Masson (1994)

RGs (in percent; Fig. 3) were submitted to a 3 [Modality: visual, tactile and visuo-tactile] × 2 [Alignment: aligned or misaligned] repeated measures ANOVA. The results showed a main effect of “Modality” [F(2,30) = 72.53, p ≤ 0.001], demonstrating that cross-modal visuo-tactile stimuli produced greater RT facilitation than both double tactile (p ≤ 0.001) and double visual stimuli (p ≤ 0.001). However, the RGs of double tactile and double visual stimuli did not differ significantly (p = 1). There was also a main effect of “Alignment” [F(1,15) = 47.72, p ≤ 0.001], demonstrating that RGs were greater for the misaligned conditions than for the aligned conditions. There was a significant interaction between “Modality” and “Alignment” [F(2,30) = 5.92, p ≤ 0.007]. Follow-up comparisons showed that the RGs of the misaligned conditions were larger than the RGs of the aligned conditions for double visual stimuli (p ≤ 0.001) and double tactile stimuli (p ≤ 0.041). However, there was no spatial alignment difference in RGs for the cross-modal conditions (p = 0.47). As assessed with separate one-sample Student’s t tests, the RG for cross-modal combinations was significantly different from zero for aligned [t(15) = 13.71, p ≤ 0.001] and misaligned configurations [t(15) = 12.65, p ≤ 0.001], whereas only the misaligned configuration yielded RGs significantly different from zero for within-modal pairs of visual [t(15) = 4.84, p ≤ 0.001] and tactile stimuli [t(15) = 4.38, p ≤ 0.001].

Fig. 3
figure 3

Redundancy gain. Mean RGs for within-modal and cross-modal pairs obtained under aligned and misaligned spatial configurations. RGs were calculated as the decrease (in percent) in the mean RT obtained in a redundant condition compared with the mean RT obtained for its best constituent stimulus. The X axis refers to sensory combinations (V visual, T tactile) and spatial alignment (“A” for aligned and “M” for misaligned). Asterisks indicate that the RGs were significantly (p < 0.05) different from zero as assessed by one-sample Student’s t tests. Cross-modal stimuli produced greater enhancement than within-modal stimulus combinations, supporting the advantage of combining multiple sensory cues for behavioral performance. Moreover, a RG was observed for within-modal pairs of both modalities only when the stimuli were presented in a misaligned configuration. The error bars represent the SEM for within-subject designs, following Loftus and Masson (1994)

To further test the advantage of cross-modal over within-modal integration, we investigated whether the RTs obtained in the redundant conditions exceeded the statistical facilitation predicted by the race model, as assessed with the race model inequality (Miller 1982). For cross-modal stimuli, the race model inequality was significantly violated up to the 40th percentile of the RT distribution in the aligned (all p ≤ 0.001) and in the misaligned (all p ≤ 0.004) conditions. No significant violation of the race model inequality was found for any redundant visual or tactile condition, suggesting that the faster RTs in these conditions could be explained by simple probability summation (Fig. 4).

Fig. 4
figure 4

Race model inequality. Test for violation of the race model inequality (Miller 1982; Ulrich et al. 2007). The graph represents the difference in milliseconds (on the Y axis) between the model prediction computed from the RTs of each unisensory counterpart (the model bound) and the RTs obtained in the redundant conditions. Positive values on the graph refer to RTs that were faster than the race model prediction. RTs that were significantly faster than the race model prediction are marked with an asterisk, which indicates a race model inequality violation. Negative values on the graph refer to RTs that were slower than the race model prediction. The difference between the bound and the RTs of the redundant condition is computed for each percentile of the RT distribution (on the X axis). Cross-modal stimuli significantly violated the race model inequality irrespective of their alignment, whereas both double visual and double tactile stimuli were consistent with simple probability summation

Control experiment

In the main experiment, intrahemispheric (aligned) stimuli were always presented closer to each other in Euclidean (external) space than stimuli presented interhemispherically (misaligned). To test whether the greater RG observed for within-modal misaligned conditions depends on interhemispheric stimulation or on the external spatial separation between stimuli, we conducted a control experiment in which the spatial separation between stimuli was held constant in external space for redundant intrahemispheric and interhemispheric conditions.

Methods

Participants

Thirteen right-handed (Oldfield 1971) participants (6 males; mean age of 25 years, SD = 2.3 years; range from 20 to 29 years) were recruited to take part in the control experiment. None of the participants reported a history of neurological or psychiatric problems. They all reported normal tactile sensitivity and normal or corrected-to-normal vision.

Procedures and stimuli were the same as in the main experiment, except that the intrahemispheric and interhemispheric within-modal conditions were presented with a constant Euclidean distance between the two stimuli. For the tactile experiment, participants’ hands were positioned at a distance of approximately 56 cm from their head and their fingertips were positioned parallel to the horizontal meridian to form an imaginary rectangle (Fig. 5). Both index fingers were placed 1 visual degree below the fixation cross and both middle fingers 1 visual degree above it. Left and right fingertips were positioned as close as possible to the vertical midline in order to maintain an equal distance between stimuli for intrahemispheric and interhemispheric conditions.

Fig. 5
figure 5

Schematic view of the experimental setup and stimulation conditions for the control experiment. Visual stimuli a were projected on a surface above the stimulated hands, and tactile stimuli b were delivered to the index and middle fingers of each hand. The distance between redundant stimuli was held constant when presented either in the same hemispace or in different hemispaces

For the visual experiment, all visual stimuli were presented at 2.5° of visual angle to the right and left of a central fixation cross. This ensured that visual stimuli were presented outside the naso-temporal retinal overlap region and that their initial representations were lateralized (Sereno et al. 1995). Briefly, for intrahemispheric conditions, one stimulus was presented 2.9 visual degrees above, and a second stimulus 2.9 visual degrees below, the vertical coordinate of the fixation cross. For interhemispheric conditions, one stimulus was presented to the left 1.5 visual degrees above the fixation cross, while the second stimulus was presented to the right 1.5 visual degrees below it, and vice versa. Hence, there was a constant 5.83 visual degree separation between redundant stimuli for both intrahemispheric and interhemispheric conditions (Fig. 5). There were 8 single visual conditions and 4 redundant visual conditions. Participants completed 4 blocks in the visual condition and 4 blocks in the tactile condition, for a total of 60 trials per condition. The modality of the first block and the responding hand were counterbalanced across participants.

Results

On average, participants detected 99.0 % of all tactile stimuli (range from 96.6 to 100 %) and 99.5 % of visual stimuli (range from 98.9 to 100 %). Mean RGs obtained for the double tactile and double visual conditions can be found in Fig. 6.

Fig. 6
figure 6

Redundancy gain obtained in the control experiment. Mean RGs and SEM for within-modal pairs obtained under intrahemispheric and interhemispheric spatial configurations. The X axis refers to sensory combinations (V visual, T tactile) and spatial configurations (“SH” for same hemispace and “DH” for different hemispaces). Asterisks indicate that the RGs were significantly (p < 0.05) different from zero as assessed by one-sample Student’s t test. For both modalities, stimuli presented in different hemispaces produced greater enhancement than stimuli presented in the same hemispace. The error bars represent the SEM for within-subject designs, following Loftus and Masson (1994)

RGs (in percent; Fig. 6) were submitted to a 2 [Modality: visual, tactile] × 2 [Alignment: intrahemispheric or interhemispheric] repeated measures ANOVA. First, no main effect of “Modality” was found [F(1,12) = 2.36, p = 0.150], indicating that the RGs of double tactile and double visual stimuli did not differ significantly. The results showed a main effect of “Alignment,” indicating that the RGs for interhemispheric conditions were greater than for intrahemispheric conditions [F(1,12) = 7.96, p ≤ 0.015]. Finally, no interaction was found between “Modality” and “Alignment” [F(1,12) = 0.014, p = 0.907]. As assessed with separate one-sample Student’s t tests, the RG for visual stimuli was significantly different from zero for interhemispheric [t(12) = 5.27, p ≤ 0.001] and intrahemispheric [t(12) = 2.329, p ≤ 0.038] conditions. For tactile stimuli, only the interhemispheric condition yielded a RG that was significantly different from zero [t(12) = 3.574, p ≤ 0.004].

These results are consistent with those of the main experiment (Fig. 6). First, they suggest that even with a constant distance between the stimuli, the RGs are greater for interhemispheric than for intrahemispheric conditions in both modalities. Second, a small increase in the distance between visual stimuli produced a small increase in the RG for the intrahemispheric condition, suggesting that the RG might be influenced by the stimuli’s spatial separation under specific circumstances. Further studies including parametric variations of the distance between stimuli are needed to better understand the relative impact of external spatial distance and the stimulation of the same or separate hemispheres on the within-modal RG.

Discussion

We compared the RGs yielded by cross-modal and within-modal combinations when targets were presented within or across hemispaces. The aim was to test how the spatial proximity of redundant stimuli affects cross-modal and within-modal integration. Our results compellingly demonstrate that the RG was far greater for combinations of cross-modal stimuli than for combinations of within-modal stimuli. These results extend previous findings obtained in simple (Forster et al. 2002) and choice (Laurienti et al. 2006; Bernstein et al. 1972) RT paradigms and parallel those obtained in animals showing a behavioral advantage of integrating cross-modal over within-modal stimuli (Gingras et al. 2009). Whereas statistical facilitation could account for within-modal RTs, all cross-modal conditions violated the race model inequality. Such an RT advantage of multisensory over unisensory integration might appear surprising since the former relies on the integration of different kinds of energy captured by different sensory organs and transmitted to separate sensory regions of the brain. However, several studies have demonstrated that multisensory interactions can occur at low-level stages in the cortical hierarchy of perception and at very early latencies after stimulus presentation (Molholm et al. 2002; Ghazanfar and Schroeder 2006; Giard and Peronnet 1999; Foxe et al. 2000). Crucially, recent findings suggest that such early-latency and low-level interactions of sensory information from different modalities are functionally linked to both reaction time facilitation (Sperdin et al. 2009, 2010) and detection accuracy (Van der Burg et al. 2011).

The main results of the present study relate to how within-modal and cross-modal integration are modulated by the spatial congruency of the stimuli. We observed that RTs to multisensory stimuli were significantly and equally facilitated in both aligned and misaligned conditions (Fig. 3) and that the race model inequality was violated over the same range of the RT distributions in both conditions (Fig. 4). These results contrast with studies showing enhanced behavioral gains for spatially congruent over incongruent multisensory conditions (Diederich et al. 2003; Harrington and Peck 1998; Hughes et al. 1994; Kitagawa and Spence 2006; Sambo and Forster 2009; Kitagawa et al. 2005; Spence and Driver 1994; Bolognini et al. 2005; Frassinetti et al. 2002). However, unlike the present study, most of these studies required explicit processing of the spatial position of the targets. A critical aspect of the present study is that we used a SRT paradigm in which no explicit processing of the target’s spatial location was required. This is consistent with other SRT studies showing no modulation of the RG according to the spatial position of the stimuli (Murray et al. 2005; Zampini et al. 2007; Tajadura-Jiménez et al. 2008; Teder-Sälejärvi et al. 2005; Forster et al. 2002; Girard et al. 2011). Therefore, when no explicit spatial discrimination is required, there is no effect of the spatial congruence of multisensory targets on the RG (see also Sperdin et al. 2010). Indeed, it has been shown that the same misaligned stimuli can produce a RG when presented in a SRT task but a performance cost when presented in a spatial discrimination task (Girard et al. 2011). These results are consistent with the proposition that higher-order cognitive or attentional processes, which are task-dependent, might have a top-down influence on multisensory interactions (Spence and Driver 2004; Talsma et al. 2007, 2010; Spence and MacDonald 2004; Hecht et al. 2008). This influence might involve dynamic shifts of spatial representations or strategies that emphasize the stimulus’ temporal aspect over its spatial location (Murray et al. 2005). More generally, this hypothesis is consistent with the idea that different computational goals might dictate different multisensory integrative principles (Stein and Stanford 2008).

In contrast to what was observed for cross-modal combinations, the RG for double tactile or double visual stimuli was greater when the unisensory targets were delivered in separate hemispaces than when they were delivered in the same hemispace. These results are supported by our control experiment, which demonstrated a greater RG for interhemispheric than intrahemispheric presentations even when the distance between the stimuli was held constant. This indicates that the greater RG for interhemispheric and misaligned conditions likely depends on the simultaneous stimulation of both hemispheres rather than on the physical distance separating the stimuli. In the control experiment, intrahemispheric visual targets generated a significant RG, which was not observed in the main experiment, suggesting that the distance between visual stimuli might also influence the RG under specific conditions. To ensure an initial lateralization and a constant distance between visual targets, the stimuli in the intrahemispheric conditions of the control experiment were slightly more distant than in the aligned conditions of the main experiment.

The results for visual stimuli are consistent with previous findings showing that the RG for bilateral pairs was larger than for unilateral pairs and that this effect is present for both symmetric and diagonal arrangements (Corballis et al. 2002). However, other studies have reported a redundant target effect that was independent of the spatial configuration (unilateral, bilateral or vertical midline) of the stimuli (Murray et al. 2001; Ouimet et al. 2009). These studies used SRT tasks with similar spatial configurations and distances. Hence, the discrepancies regarding spatial congruency effects for visual stimuli are presumably related to methodological factors such as the type of stimuli, response method and experimental settings. The results obtained with tactile stimuli are consistent with several studies showing the advantage of delivering stimuli to both hands rather than to adjacent fingers for identification or discrimination (Craig 1985; Evans and Craig 1991; Haggard et al. 2006; Tamè et al. 2011). The current experiment therefore extends this bilateral tactile advantage to the RG observed in SRT paradigms.

In the aligned conditions, the representations of the stimuli from the same modality may largely overlap, resulting in similar internal or neural representations for single and redundant trials. Because the two stimuli are not processed independently, such overlapping representations would contribute to the reduced or absent RG observed in these conditions. Interhemispheric stimuli, on the other hand, are initially processed independently by each hemisphere, resulting in distinct, non-overlapping internal representations that produce an enhanced RG. In line with this interpretation, a recent study showed that the fusion of redundant targets into a single visual percept failed to produce a RG (Schröter et al. 2011). Using stereoscopic presentation in which double visual stimuli could elicit either a single percept or two distinct percepts, the authors demonstrated that the redundant target effect emerged only when the stimulations produced two distinct percepts. Similar findings have been reported in the auditory domain (Schröter et al. 2007), suggesting that the number of percepts drives the appearance of the RG.

Along similar lines, we hypothesize that cross-modal stimuli produce greater RGs than within-modal combinations because they originate from different sensory systems that provide independent and non-redundant estimates of the same external event. To some extent, this relates to the probabilistic “Bayesian” view of sensory integration, which states that individuals take the reliability of sensory estimates into account when making behavioral decisions, weighting each modality according to its reliability to improve discrimination and localization (Ernst and Banks 2002; Alais and Burr 2004). Accordingly, combinations that do not provide more accurate information to the nervous system are less likely to improve behavior. The combined information from two stimuli of different modalities should have lower variance because they are processed by independent sensory systems and are not influenced by the same noise source (Hillis et al. 2002). However, two identical sensory stimuli from the same modality presented at the same time and at approximately the same place might covary up to the point that their integration is only minimally beneficial (Gingras et al. 2009).
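In the maximum-likelihood formulation of this scheme (our notation, summarizing the standard result discussed by Ernst and Banks 2002), each cue is weighted by its inverse variance, and the independence of the noise sources is precisely what guarantees the variance reduction:

```latex
\hat{S}_{VT} = w_V\,\hat{S}_V + w_T\,\hat{S}_T, \qquad
w_i = \frac{1/\sigma_i^{2}}{1/\sigma_V^{2} + 1/\sigma_T^{2}}, \qquad
\sigma_{VT}^{2} = \frac{\sigma_V^{2}\,\sigma_T^{2}}{\sigma_V^{2} + \sigma_T^{2}}
\;\le\; \min\left(\sigma_V^{2},\, \sigma_T^{2}\right)
```

If the two estimates instead share a common noise source, a covariance term reappears in the combined variance and the benefit shrinks; in the limit of fully correlated estimates, which spatially adjacent within-modal stimuli may approximate, no variance reduction remains.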

Since the physical features of a stimulus such as intensity can influence RTs (Piéron 1952; Bonnet et al. 1992; Bell et al. 2006), it may appear surprising that increasing stimulus energy through double unimodal stimulation had only a marginal effect, or none at all, on the RTs to unilateral pairs of stimuli in our study. Since the amount of energy was the same for within-modal pairs in aligned and misaligned configurations, our results suggest that the redundant target effect does not depend on stimulus energy but rather on the stimuli’s spatial locations, with interhemispheric stimuli yielding an enhanced gain compared to intrahemispheric stimulation even when the absolute spatial distance between the targets is equal (Fig. 6). Nevertheless, it is still unclear whether increasing the intensity of a single stimulus and increasing stimulus intensity through double stimulation produce similar neurophysiological responses. For example, in a study investigating the effect of stimulus intensity on saccadic RTs and response onset latency in SC neurons of monkeys, Bell and colleagues (2006) reported that increasing single stimulus intensity shortened both the latency of neuronal responses and saccadic RTs to visual targets. On the other hand, Alvarado et al. (2007) showed in the same structure that unisensory integration of within-modal (visual) pairs yielded responses that were similar to those evoked by their best component stimulus. Although they also varied the intensity of the visual stimuli, they did not report any response latency effect on SC neuronal activity. Thus, increasing stimulus energy by presenting multiple stimuli does not seem to invariably produce the neurophysiological or behavioral response enhancement that appears to follow increases in the intensity of a single stimulus.

In summary, in addition to our observation that cross-modal stimuli produce a far greater RG than combinations of within-modal stimuli under every condition of stimulus presentation, the results of the present experiment also demonstrate that the spatial locations from which the sensory inputs arise had differential impacts on cross-modal and within-modal integration. Whereas aligned and misaligned cross-modal stimuli yielded identical enhancements, performance was affected by the spatial location of within-modal stimuli: behavioral facilitation for redundant visual and redundant tactile stimuli was greater when the stimuli were presented in a misaligned or interhemispheric configuration (see Figs. 3, 6). These results provide novel insights into the impact of the spatial congruence of redundant targets on within-modal and cross-modal integration and support the notion that more independent estimates of a single event produce greater behavioral benefits.