Introduction

In typical naturalistic environments our various senses are often overwhelmed with a multitude of incoming sensory stimuli (Calvert et al. 2004). In order to enable the efficient processing of information and to allow coherent behavior, attention therefore has to be coordinated across the different sensory modalities (Spence and Driver 2004). However, it has for many years been claimed that not all of the senses contribute equally to our perception and, in particular, that vision is the dominant sense (e.g., Posner et al. 1976; Spence et al. 2001a). One of the most impressive demonstrations of visual dominance was first reported by Colavita (1974; though see also Osborne et al. 1963). In studies investigating this phenomenon, usually referred to as the Colavita effect, participants are instructed to respond whenever they perceive a light or a tone. On a small number of trials both the light and tone are presented at the same time. On these bimodal trials participants typically exhibit a decreased ability to perceive (or at least to respond to) the auditory stimulus. More specifically, participants make significantly more visual-only than auditory-only errors on the bimodal trials, and often report being unaware of the tone when probed afterward (see, e.g., Colavita 1974). By now, the Colavita visual dominance effect has proved to be robust to a number of experimental manipulations (Colavita 1974, 1982; Colavita et al. 1976; Colavita and Weisberg 1979; Koppen and Spence 2007a, b, c).

Traditionally, the Colavita visual dominance effect has been explained in terms of attention. For instance, Posner et al. (1976) suggested that the visual system simply has poorer alerting properties than the auditory system (see also Klein 1977), and hence that people may tend to endogenously direct their attention toward visual events in order to compensate for this supposed weakness. Any such biasing of a person’s endogenous attention toward the visual modality may have resulted in the failure by participants to respond to the auditory stimuli on some proportion of the bimodal trials in previous Colavita-type visual dominance experiments. This argument has generally been supported by the results of subsequent research (Egeth and Sager 1977; Sinnett et al. 2007; though see Koppen and Spence 2007b). However, it is important to note that the idea that visual stimuli necessarily have a poorer alerting ability than auditory stimuli has been questioned by more recent research showing that visual stimuli actually have a greater capacity to capture attention exogenously (involuntarily) than auditory stimuli (Spence et al. 2001b; Turatto et al. 2002). Koppen and Spence (2007c) have argued that this may also contribute to the emergence of the Colavita visual dominance effect.

The present study was designed to investigate the Colavita visual dominance effect in the presence of threatening information. In this context, it is worth describing an early study by Shapiro et al. (1984) in which they examined the effects of arousal on visual dominance using a typical Colavita experimental design. Participants were instructed to press the tone key as soon as they heard a tone, and the light key as soon as they saw a light. On (infrequent) bimodal trials, the participants were instructed to press the key corresponding to the signal that they perceived first. While performing the Colavita task, the participants were threatened or actually presented with aversive electrical stimuli. In the control conditions, there was neither the threat nor the presentation of electrical stimuli. Shapiro et al. reported that the threat of electrical stimulation attenuated the visual dominance effect. The proportion of bimodal trials in which the visual response rather than the auditory response was made was 59% in the threat condition (no different from chance) and 73% in the control condition (significantly above chance level). The presentation of electrical stimuli reversed the Colavita visual dominance effect, i.e., auditory dominance was observed. The proportion of bimodal trials in which vision was perceived before audition was 38% in the threat condition (significantly below chance) and 76% in the control condition (significantly above chance level).

Shapiro et al. (1984) explained this switch in sensory dominance in terms of the evolutionary advantage of audition over vision in aversive situations, i.e., the auditory system has a 360° detection capability whereas the detection properties of the visual system are localized in (frontal) space (cf. Heffner and Heffner 1992a, b). Related to this issue, it has been reported in the animal literature that when an aversive shock is paired with a compound auditory-visual stimulus, conditioned responses are better controlled by the auditory stimulus than by the visual stimulus (Shapiro et al. 1980). Although Shapiro et al.’s (1984) results show how visual dominance is affected when a person’s arousal level is increased, they are not particularly informative with regard to the question of what might happen when the visual or auditory stimuli themselves constitute threat signals. The aim of the present study was therefore to investigate this important question.

Three groups of participants performed an experiment in which either the visual stimulus (visual threat condition), the auditory stimulus (auditory threat condition), or neither modality (control condition) was paired with an aversive electrocutaneous stimulus by means of classical conditioning. Note that, unlike Shapiro et al.’s (1984) study, the electrocutaneous stimuli were exclusively paired with either the unimodal visual stimuli or with just the unimodal auditory stimuli (and never with the bimodal stimuli), to make sure that one sensory modality would acquire threat value through its signaling of aversive stimulation. Pairing the conditioned stimulus with an aversive event should make it better at capturing attention and hence, if attention is an important component of the Colavita visual dominance effect, should modulate that effect. Two different hypotheses can be put forward. First, following on from Shapiro et al.’s (1984) study, it could be argued that fear-conditioning, regardless of which modality is conditioned, would reduce visual dominance in comparison with the control condition. Second, based on the assumption that threat-related stimuli are more likely to capture attention than neutral stimuli (e.g., Bar-Haim et al. 2007; Van Damme et al. 2006b), it can be hypothesized that an enhanced Colavita visual dominance effect would be observed in the visual threat condition while a reduced effect should be observed in the auditory threat condition, as compared with performance in the control condition.

Methods

Participants

The sample consisted of 57 undergraduates from Ghent University (8 males and 49 females; mean age of 19.5 years, ranging from 17 to 40 years), who participated in the study in order to fulfill their course requirements. All except ten of the participants were right-handed, with normal or corrected-to-normal vision, and normal hearing. The experiment lasted for approximately 30 min. The study was approved by the Ethics Committee of the Faculty of Psychology and Educational Sciences of Ghent University and conformed to the Declaration of Helsinki. All of the participants gave their informed consent and were free to terminate the experiment at any time. None of the students refused to participate in the study.

Apparatus and materials

The task, which involved the presentation of visual and auditory stimuli, was controlled by the INQUISIT Millisecond software package. INQUISIT measures response times with millisecond (ms) accuracy (De Clercq et al. 2003). The visual stimulus consisted of the illumination of a green light emitting diode (LED) with a luminance of 1.9 cd/m² for 50 ms. The LED was placed on the table in front of the participant at a distance of approximately 60 cm. The auditory stimulus consisted of a 4,000 Hz pure tone presented for 50 ms from a loudspeaker cone placed directly behind the LED, so that visual and auditory stimuli came from the same spatial position (cf. Koppen and Spence 2007c). The tones were presented at 65 dB(A), as measured from the participant’s head position.

For the purpose of fear-conditioning, a 300 ms low-intensity unpleasant electrocutaneous stimulus was administered, which was delivered by an AC stimulator with an internal frequency of 50 Hz. This stimulus was presented to the median nerve of the non-dominant hand by means of two standard Ag/AgCl electrodes (1 cm diameter) filled with ECG conductance gel (MedCat Supplies). The skin at the electrode sites was first abraded with a peeling cream (Nihon Kohden) in order to reduce the resistance of the skin. The intensity of the stimulus was then individually determined in order to ensure the delivery of a tolerable but unpleasant sensation, with an initial value of 1.5 mA. The mean intensity used was 2.14 mA (SD = 0.57, ranging between 1.50 and 3.50 mA), and the mean unpleasantness rating was 7.25 (SD = 0.74, range between 5 and 9) on a scale from 0 (not aversive at all) to 10 (extremely aversive).

Design

The experiment consisted of two blocks of 100 trials. The order of stimulus presentation was randomized within each block of trials. There were 40 visual, 40 auditory, and 20 bimodal trials in each block of trials. In the second block, visual stimuli (visual threat condition; N = 20), auditory stimuli (auditory threat condition; N = 20), or no stimuli (control condition; N = 17) were fear-conditioned. Fear-conditioning of a particular stimulus modality occurred with a reinforcement ratio of 25% (10 trials). In these trials, an electrocutaneous stimulus was administered immediately after the target stimulus. The reinforced trials were excluded from all analyses in order to prevent any interruption effects. A block of 20 practice trials (without fear-conditioning) was presented before the two main experimental blocks of trials. The trials were the same as in the experimental blocks, but they were not analyzed.
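The block composition described above can be sketched in code. The following is a minimal illustration only and is not the INQUISIT script used in the study: the function name make_block, the dictionary layout, the seed argument, and the random choice of which trials are reinforced are all assumptions made for the example. It builds one block of 40 visual, 40 auditory, and 20 bimodal trials in random order and, for a threat condition, marks 10 of the 40 trials of the conditioned modality (25%) as reinforced.

```python
import random


def make_block(conditioned=None, seed=None):
    """Return a shuffled list of 100 trial dicts for one block.

    conditioned: None (no fear-conditioning), "visual", or "auditory".
    Hypothetical helper; which trials get reinforced is chosen at random here,
    which is an assumption (the paper only states the 25% ratio).
    """
    if conditioned not in (None, "visual", "auditory"):
        raise ValueError("conditioned must be None, 'visual', or 'auditory'")

    rng = random.Random(seed)
    counts = {"visual": 40, "auditory": 40, "bimodal": 20}
    trials = [
        {"target": target, "reinforced": False}
        for target, n in counts.items()
        for _ in range(n)
    ]

    if conditioned is not None:
        # 25% of the 40 unimodal trials of the conditioned modality = 10 trials.
        candidates = [t for t in trials if t["target"] == conditioned]
        for trial in rng.sample(candidates, 10):
            trial["reinforced"] = True

    rng.shuffle(trials)  # randomized order within the block
    return trials


def analyzable(trials):
    """Drop reinforced trials, mirroring their exclusion from all analyses."""
    return [t for t in trials if not t["reinforced"]]
```

Under these assumptions, make_block("visual") would correspond to block 2 of the visual threat condition and make_block() to block 1 or to either block of the control condition; analyzable() then leaves 90 trials in a conditioned block.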

Procedure

The participants were tested individually in a dimly-illuminated testing room. First, they were informed about the task and the stimuli that would be used. It was made clear in both threat conditions that an electrocutaneous stimulus would be used during one block of trials. The experimenter stated that “most people find this kind of stimulation unpleasant”.

After the participants had given their informed consent to take part in the study, the electrodes were attached. The task was then explained to the participants. They were told that a visual, auditory, or bimodal target would be presented on each trial, and that they had to determine as quickly and accurately as possible which type of target had been presented. Participants responded using the thumb, index finger, and middle finger of the dominant hand. Responses were collected using a Cedrus RB-730 response box (Cedrus Corporation, San Pedro, CA) which was placed on the table directly in front of the participant. The participants were instructed to press one button in response to unimodal visual targets, another button in response to unimodal auditory targets, and a third button in response to the bimodal targets. Note that in line with recent work in this area (see Hartcher-O’Brien et al. 2008; Koppen and Spence 2007a; Sinnett et al. 2008) a 3-button response procedure was used in order to rule out response selection confounds. More specifically, we wanted to make sure that any failure to respond correctly on the bimodal target trials did not simply reflect a problem with initiating two responses (one to the auditory target and the other to the visual target). Targets were presented at the start of the trial for 50 ms, followed by a response window of 1,950 ms. Each target was presented 2,000 ms after the onset of the preceding target.

At the start of the second block of trials, participants in the visual (auditory) threat condition were explicitly told that a number of electrocutaneous stimuli would be administered, but that this would only occur in a proportion of trials in which a light (tone) was presented. They were also told that the administration of electrocutaneous stimuli would be completely independent from the response they happened to give. In the control condition, no specific instructions were given between blocks 1 and 2.

After the experiment, the participants rated the extent to which they were expecting that the electrical stimulus would follow the visual and the auditory stimulus, respectively, on a rating scale going from 0 (not at all) to 10 (all the time). This allowed us to confirm the efficacy of the fear-conditioning procedure that was utilized in the present study.

Results

Participants in the visual threat condition expected the electrocutaneous stimulus significantly more after the visual stimulus (M = 5.70, SD = 2.13) than after the auditory stimulus (M = 0.75, SD = 1.25), t(19) = 11.15, P < 0.001. By contrast, the participants in the auditory threat condition expected the electrocutaneous stimulus significantly more after the auditory stimulus (M = 6.30, SD = 1.84) than after the visual stimulus (M = 0.85, SD = 1.31), t(19) = 12.27, P < 0.001. These results therefore show that participants were aware of the contingency between the sensory modality of the target and the presentation of the electrocutaneous stimuli.

The participants failed to make any response on fewer than 1% of trials overall, and these trials were excluded from the data analyses. The results are shown in Table 1.

Table 1 Means and standard errors of error rates and RTs as a function of the block (1, 2) and condition (visual threat, auditory threat, and neutral control)

Analysis of unimodal target performance

Participants’ performance on the unimodal trials was analyzed using a 2 (Target: visual, auditory) × 2 (Block: 1, 2) × 3 (Condition: visual threat, auditory threat, control) analysis of variance (ANOVA). Analysis of the error data revealed a significant main effect of Target [F(1,54) = 14.14, P < 0.001], with participants making more mistakes on the unimodal visual trials (M = 5.8%, SE = 0.6) than on the unimodal auditory trials (M = 3.5%, SE = 0.4). None of the other effects reached significance.

A similar analysis of the RT data revealed a significant main effect of Target [F(1,54) = 38.12, P < 0.001], with participants responding more rapidly to the auditory targets (M = 510 ms, SE = 11) than to the visual targets (M = 550 ms, SE = 10). The analysis also revealed a significant interaction between Target and Block [F(1,54) = 11.95, P = 0.001], indicating that the difference in RTs to the visual and auditory targets was larger in the second block of trials [t(56) = 7.17, P < 0.001] than in the first [t(56) = 3.33, P < 0.01]. None of the other terms in this analysis reached significance.

Analysis of bimodal trial performance

Errors on the bimodal trials (i.e., those trials in which the participants pressed the auditory or visual target response keys rather than the bimodal response key) were analyzed using a 2 (Response: visual, auditory) × 2 (Block: 1, 2) × 3 (Condition: visual threat, auditory threat, control) ANOVA. This analysis revealed a main effect of Response [F(1,54) = 116.15, P < 0.001], with participants making significantly more visual (M = 15.2%, SE = 1.2) than auditory responses (M = 3.8%, SE = 0.5). None of the other main effects were significant. The significant interaction between Block and Condition [F(2,54) = 9.96, P < 0.05] revealed that the participants made more errors in the second block of trials in both threat groups, while they made fewer errors in the control group. Of particular interest here was the significant interaction between Response, Block, and Condition [F(2,54) = 7.53, P = 0.001]. In order to interpret this three-way interaction, we calculated the difference in the magnitude of the visual dominance effect (VD = percentage of visual-only errors minus the percentage of auditory-only errors) between blocks 1 and 2 (VDchange = VD2 − VD1) for each condition separately (see Fig. 1). A positive VDchange indicates an increase in VD from block 1 to block 2, whereas a negative VDchange indicates a decline in VD over time. Post-hoc comparisons (see Footnote 1) between the conditions showed that the VDchange was significantly larger in the visual threat condition (M = 8.25, SD = 12.70) than in both the auditory threat condition (M = −1.00, SD = 13.73) (P < 0.05) and the control condition (M = −8.53, SD = 13.08) (P < 0.001). Furthermore, there was a non-significant trend for a larger VDchange in the auditory threat condition than in the control condition (P < 0.10).
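The VD and VDchange measures defined above amount to simple differences of percentages. The sketch below spells out that arithmetic; the function names are illustrative, and the per-block percentages in the second usage line are made-up numbers chosen only to demonstrate the sign convention, not data from the study.

```python
def visual_dominance(visual_only_pct, auditory_only_pct):
    """VD = percentage of visual-only errors minus percentage of auditory-only errors."""
    return visual_only_pct - auditory_only_pct


def vd_change(block1, block2):
    """VDchange = VD2 - VD1.

    block1, block2: (visual_only_pct, auditory_only_pct) tuples for one participant.
    Positive values mean visual dominance grew from block 1 to block 2.
    """
    return visual_dominance(*block2) - visual_dominance(*block1)


# Overall means reported in the text: 15.2% visual-only vs 3.8% auditory-only errors.
print(round(visual_dominance(15.2, 3.8), 1))  # 11.4 percentage points

# Hypothetical participant (illustrative numbers only).
print(vd_change(block1=(15.0, 5.0), block2=(20.0, 5.0)))  # 5.0
```

In this scheme, VDchange would be computed per participant and then compared between conditions, which is how the group means and standard deviations reported above are to be read.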

Fig. 1 The graph shows the effects of both threat manipulations on the magnitude of the Colavita visual dominance effect as compared to the pattern of performance observed in the control condition. The values presented reflect the mean (with standard errors) visual dominance effect (percentage of visual minus percentage of auditory responses on bimodal trials) in the two blocks

Reaction times (RTs) were analyzed using a 2 (Block: 1, 2) × 3 (Condition: visual threat, auditory threat, control) ANOVA. This analysis revealed a significant main effect of Block [F(1,54) = 8.24, P < 0.01], with the participants responding more slowly in the second block of trials (M = 655 ms, SE = 14) than in the first (M = 624 ms, SE = 12). The analysis also revealed a significant interaction between Block and Condition [F(2,54) = 4.92, P < 0.05]. In the visual threat condition, the participants responded more slowly in the second block of trials than in the first, t(19) = 3.13, P < 0.01. Similarly in the auditory threat condition, RTs were slower in the second block than in the first, t(19) = 2.68, P < 0.05. By contrast, there was no difference in RTs between the two blocks of trials in the control condition, t(16) = 1.13, ns.

Discussion

In this study, we investigated whether the Colavita visual dominance effect (the phenomenon whereby the simultaneous presentation of a visual and an auditory (or tactile) stimulus leads to a decreased ability to perceive or respond to the auditory (or tactile) stimulus; Hartcher-O’Brien et al. 2008) would be modulated by threat. The participants in the present study were assigned to a visual threat condition (in which the visual modality was fear-conditioned), an auditory threat condition (in which the auditory modality was fear-conditioned), or to a control condition (in which neither modality was fear-conditioned).

The most important results to emerge from our study relate to the pattern of performance observed on the bimodal target trials, and can be readily summarized: First, a robust Colavita visual dominance effect was observed: that is, whenever a visual and an auditory stimulus were presented at the same time, the participants were significantly more likely to make visual-only responses than to make auditory-only responses. Second, when the visual stimulus acquired a threat value (by means of aversive conditioning), the magnitude of the Colavita visual dominance effect increased significantly. The participants also responded more slowly on the bimodal trials in the visual threat condition. Third, when the auditory stimulus was made threatening by means of aversive conditioning, the visual dominance effect was not reduced, although RTs on the bimodal trials did become somewhat slower. In fact, only the control condition showed any reduction in the magnitude of the Colavita visual dominance effect over time (i.e., when performance in Block 2 was compared to that reported in Block 1), whereas the RTs to the bimodal targets remained stable across the two blocks in that condition.

These findings conflict somewhat with the results reported by Shapiro et al. (1984) nearly a quarter of a century ago. In particular, they found that the threat (or actual presentation) of aversive shocks resulted in a reversal of the Colavita visual dominance effect (i.e., their participants started making more auditory-only responses than visual-only responses on the bimodal target trials). However, there is an important difference between their study and the experiment reported here that might help to account for the contrasting findings. Namely, Shapiro et al. informed their participants that they would receive a shock whenever their responses were either too fast or too slow. In the present study, participants were told that they would receive shocks after the presentation of a fixed proportion of the visual or auditory stimuli, irrespective of their response to those stimuli. Shapiro et al.’s results might therefore reflect the effect of arousal on strategic shifts of attention to the modality (i.e., audition) to which participants believed they found it most difficult to respond, whereas the results of the present study are more likely to reflect the consequence of either the visual or auditory modality signaling threat.

Another important difference between the two studies is that in the present study the auditory and visual stimuli were presented from exactly the same spatial position, whereas in Shapiro et al.’s (1984) study, the visual and auditory stimuli were presented from different spatial locations. The visual dominance effect reported in their study might have been less stable and easier to reduce (or even reverse) than in our study (perhaps due to the focusing of participants’ attention on the location where stimuli from one modality were presented; cf. Spence and Driver 1997). It should also be noted that the participants in the present study used a separate response key in order to respond to the bimodal targets, whereas Shapiro and his colleagues instructed their participants to respond to the modality that they perceived first. This important difference in task instructions might be another reason for the contradictory findings of the two studies.

In order to further clarify the pattern of results reported here, it is worth considering the effects of fear-conditioning on participants’ performance in the unimodal trials. Analysis of the data from these trials revealed that the participants both responded more rapidly and made fewer errors on the unimodal auditory trials than on the unimodal visual trials across all conditions. This result is somewhat surprising given the fact that in the bimodal trials we found visual dominance over auditory processing (i.e., a Colavita visual dominance effect was observed). This pattern of results is, however, in line with the results of recent studies of the Colavita visual dominance effect in which the participants responded to the bimodal targets using a separate (third) response key (Koppen and Spence 2007c; Sinnett et al. 2007). The RT advantage for the unimodal auditory trials over the unimodal visual trials indicates that the visual dominance effect cannot simply be explained by participants responding more rapidly to visual stimuli than to auditory stimuli. It would appear instead that the visual modality is only prioritized when there is some form of competition between stimuli presented at the same time in the two modalities, thus suggesting that the Colavita visual dominance effect is indeed an attentional phenomenon (see Colavita and Weisberg 1979; Egeth and Sager 1977; Koppen and Spence 2007c; Posner et al. 1976).

The results of the visual threat condition are consistent with this attentional account. Fear-conditioning of the visual modality is assumed to lead to an increase in the amount of attention being devoted to visual stimuli (e.g., Armony and Dolan 2002; Koster et al. 2004; Stormark et al. 1999; Van Damme et al. 2004, 2006a). Consequently, when presented concurrently, visual stimuli that signal threat are prioritized over neutral auditory stimuli, thus resulting in an increased visual dominance effect. Importantly, however, the results of the auditory threat condition do not seem to fit easily with the attentional account of the Colavita visual dominance effect. Fear-conditioning of the auditory modality should have been expected to lead to an increase in the amount of attention being devoted by participants to the auditory stimuli. Unexpectedly, this did not result in the prioritization of auditory stimuli signaling threat relative to neutral visual stimuli when presented concurrently. The Colavita visual dominance effect was neither reversed, nor reduced, but rather increased in magnitude relative to the neutral control condition.

The data from the control condition indicate that the visual dominance effect typically declines over time. Comparison of both threat conditions with this control condition suggests that the induction of fear and arousal increases visual dominance, but that this increase is more pronounced in the context of visual threat as compared to auditory threat. These findings seem to suggest that the induction of threat leads to a visual hypervigilance, no matter which sensory modality the threat happens to be related to. This suggestion runs counter to that put forward by Shapiro et al. (1984), who argued that audition has an evolutionary advantage over vision in aversive situations. This contradiction might be caused by methodological differences between the two studies. For example, in our study, auditory stimuli were presented through a loudspeaker placed in exactly the same position as the LED delivering the visual stimuli. In Shapiro et al.’s study, by contrast, auditory stimuli were presented via headphones, as a result of which there was no spatial concordance between visual and auditory information (see Spence and Driver 1997). As has already been demonstrated, the visual dominance effect is most robust when visual and auditory stimuli are presented from the same spatial position (see Hartcher-O’Brien et al. 2008; Koppen and Spence 2007c). Therefore, the threat manipulation in our study might have been insufficient to produce a shift from visual to auditory dominance as reported by Shapiro et al. (1984). Obviously, more research will be needed in order to determine the exact mechanisms responsible for these contradictory results.

A number of issues need further consideration. First, it should be noted that the visual dominance effect found in the present study was substantially smaller than that reported in early studies (Colavita 1974; Colavita et al. 1976; Colavita and Weisberg 1979; Shapiro et al. 1984). One plausible explanation for this is the use of a 3-button response procedure, with a separate button for bimodal targets (see Koppen and Spence 2007a). The use of the bimodal response category in our study as opposed to Shapiro et al.’s instruction to respond to the stimulus that was perceived first might have induced certain response strategies such as slowing down responses and waiting to see which stimuli will be presented. Second, although our self-report ratings showed that participants in both threat conditions were aware of the contingency between stimulus modality and electrocutaneous stimulation, this does not necessarily imply that the conditioned stimulus has acquired aversive properties. Perhaps future studies might benefit from including more direct measures of fear-conditioning such as galvanic skin responses.

Third, contrary to both hypotheses, the visual dominance effect was not attenuated by auditory fear-conditioning. One potential explanation for this is that auditory stimuli (or, at least, the auditory stimuli presented in the present study) might simply have been less capable of alerting participants to the potential administration of aversive electrocutaneous stimuli than the visual stimuli that were used. However, auditory fear-conditioning had some effect, as RTs on the bimodal trials increased in the auditory threat condition but not in the control condition. Furthermore, self-report ratings revealed that the expectation of an electrocutaneous stimulus was equally strong after the presentation of a visual stimulus in the visual threat condition as after the presentation of an auditory stimulus in the auditory threat condition.

In conclusion, the present study shows that fear-conditioning increases the Colavita visual dominance effect. This is particularly apparent when a visual stimulus is paired with an aversive electrocutaneous stimulus (fear-conditioning the visual stimulus), which gives rise to an increase in the magnitude of this particular form of visual dominance. The increase in the magnitude of the visual dominance effect by pairing auditory stimuli with aversive electrocutaneous stimuli, as opposed to the reduction or reversal that had been expected, suggests that in a threatening context visual information is strongly prioritized over auditory information.