Attending points in time and space

It is well known that visual (Posner 1980), auditory (Spence and Driver 1994; Mondor and Zatorre 1995), and tactile (Spence et al. 2000) stimuli within the focus of spatial attention are processed faster and more accurately than stimuli outside the focus of attention. Recently, similar behavioral gains have been observed when visual attention was oriented to a point in time (Coull and Nobre 1998; Miniussi et al. 1999; Griffin et al. 2001; Correa et al. 2004; for reviews see Nobre 2001, 2004). These authors used a cuing paradigm analogous to the task introduced by Posner for investigating spatial attention (Posner 1980): A symbolic cue was presented that indicated the duration of the cue target interval (e.g. 600 vs. 1,400 ms; Miniussi et al. 1999) either correctly (valid trials, P=0.80) or incorrectly (invalid trials, P=0.10). In the remaining trials no target was presented. Participants had to respond to each target, irrespective of the validity of the cue. Targets that occurred at the indicated point in time were detected faster than targets that occurred unexpectedly early. The temporal cuing effect was evident for the short but not for the long intervals. The authors attributed this asymmetry to a re-orientation of attention in long intervals when the expected target was missing at the end of the short interval (Coull and Nobre 1998; Miniussi et al. 1999; Griffin et al. 2001).

Although it is possible to direct visual attention to other non-spatial features such as the color or orientation of objects, event-related potential (ERP) studies have provided evidence that selection by location often precedes selection by these features and leads to qualitatively different attention effects. For example, there is evidence that visual spatial attention affects the visually evoked P1 (about 80 ms post-stimulus) and N1 (e.g. Van Voorhis and Hillyard 1977; Eason 1981; see also Mangun 1995), suggesting a relatively enhanced processing of attended as compared to unattended stimuli at the stage of perceptual analysis. By contrast, attending to the color or orientation of visual stimuli is associated with a negativity starting not earlier than 120–220 ms (selection negativity, e.g. Harter and Guido 1980; Hillyard and Münte 1984; see also Harter and Aine 1984; Heslenfeld et al. 1997). Hillyard and Münte (1984) showed that ERP effects related to attending a color followed spatial attention effects provided that the spatial locations could be easily discriminated. Moreover, color attention effects have been found to be contingent on prior selection by space (see also Anllo-Vento and Hillyard 1996; Eimer 1995). However, when colors were easier to discriminate than the two positions, color attention effects preceded spatial attention effects (Hillyard and Münte 1984). These results suggest a flexible, task-dependent rather than fixed selection hierarchy within the visual system. Nevertheless, the earliest ERP signs of color-attention (N150-350) have never been observed earlier than the earliest effects of spatial attention (P122; Hillyard and Münte 1984).

Spatially selective attention seems to affect the processing of visual stimuli earlier than temporally selective attention as well: In a recent study by Griffin et al. (2002) the effects of temporal and spatial attention on visually evoked potentials were compared within the same participants. Even though a modulation of the visual N1 (120–200 ms) by temporal attention was reported, spatial attention still affected earlier visually evoked potentials (P1, 80–120 ms; Griffin et al. 2002, experiment 1). It might be speculated that spatial information is particularly early and easily accessible in vision, because the visual system is spatially organized at its initial cortical processing stages. Additionally, it may be speculated that spatial cues including the fixation cross and the monitor frame, which were continuously available during the experiment, have made selecting by space easier than selecting by time in the study of Griffin et al. (2002).

At its first cortical processing stages, the auditory system is tonotopically organized. Azimuthal locations have to be computed on the basis of interaural differences in the arriving time, phase, and intensity. Therefore, spatial information is less directly available in audition than in vision. Nevertheless, it has repeatedly been shown that spatial attention improves the processing of attended compared to unattended stimuli in the auditory modality as well (Spence and Driver 1994; Mondor and Zatorre 1995). Starting around 60–70 ms, sounds elicit a long lasting negativity, called the “processing negativity” (e.g. Näätänen et al. 1978; Näätänen 1982). This processing negativity is more pronounced for attended than for unattended stimuli, resulting in the so-called “negative difference” between the attended and unattended stimuli (Hansen and Hillyard 1980). Moreover, there is evidence that the negative difference comprises a modulation of the partly exogenous auditory N1 (e.g. Hillyard et al. 1973; Giard et al. 1988), suggesting that behavioral effects of auditory spatial attention are mediated by early, sensory processing stages. Similar early attention effects have been observed when attention is focused on a particular pitch as well. Interestingly, when auditory stimuli have to be selected, participants select first by pitch rather than by spatial position (e.g. Woods et al. 1998, 2001, experiment 2; but see Näätänen et al. 1980 for contrary results) or use both pitch and position for stimulus selection (e.g. Mondor et al. 1998).

Lange et al. (2003) recently reported effects of temporal attention on auditory ERPs that were comparable to early effects of spatial attention. Short (600 ms) and long (1,200 ms) empty intervals, marked by bursts of white noise, were presented. The task was an adaptation of the sustained spatial attention paradigm developed by Hillyard et al. (1973). In alternating blocks, participants had to attend to the offset of either the short or the long intervals. Their task was to detect infrequent offset markers (which differed in intensity from the frequent standard offset markers) that were presented at the attended point in time. Standard stimuli presented at the attended compared to the unattended point in time elicited an enhanced N1 (100–140 ms) over the anterior scalp. In addition, there was a late posterior positivity to the attended compared to unattended stimuli. The modulation of the auditory N1 by temporal attention resembled the N1 effect of spatial attention as described e.g. by Hillyard et al. (1973).

Based on the available findings it may be hypothesized that the visual but not the auditory system favors spatial features for initial stimulus selection. Within a hierarchical selection model, spatial selection therefore might precede temporal selection in the visual modality. By contrast the selection sequence in the auditory system remains unclear.

To explore the relative onset of temporal and spatial selection, effects of temporal and spatial attention have to be investigated within the same paradigm. The present study therefore manipulated temporal and spatial attention simultaneously. Short and long intervals (600 and 1,200 ms, respectively), marked by a centrally presented sine tone (S1) and a white noise burst presented on the left or right side (S2), were presented to the participants. Participants had to attend to a combination of one interval and one position, e.g. the short interval terminated by an offset marker on the left side (Fig. 1). They had to respond to infrequent, deviant S2 stimuli (less intense than the frequent standard S2 stimuli) presented at the attended time and at the attended position. In different blocks, different combinations of interval and position had to be attended. The present experiment therefore allowed a direct comparison of physically identical stimuli under different attention conditions. For example, when an offset marker of a short interval was presented on the left side in the attend short left condition, it was classified as “Attended time, Attended position” (T+P+). In the attend long left condition, however, the same stimulus was labeled “Unattended time, Attended position” (T−P+), whereas in the attend short right condition, it was termed “Attended time, Unattended position” (T+P−). Finally, in the attend long right condition, it was classified as “Unattended time, Unattended position” (T−P−).
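The labeling scheme described above can be illustrated with a minimal sketch. The function name `label_stimulus` and the string encoding of intervals and sides are hypothetical choices made for illustration and are not taken from the software used in the study; an ASCII hyphen stands in for the minus sign in the labels.

```python
# Hypothetical helper for illustration; not the study's actual software.
def label_stimulus(stim_interval, stim_side, attend_interval, attend_side):
    """Return the attention label (e.g. 'T+P-') for one S2 stimulus."""
    time_tag = "T+" if stim_interval == attend_interval else "T-"
    pos_tag = "P+" if stim_side == attend_side else "P-"
    return time_tag + pos_tag

# The same physical stimulus (short interval, left S2) under all four
# attention conditions.
conditions = [("short", "left"), ("long", "left"),
              ("short", "right"), ("long", "right")]
labels = {cond: label_stimulus("short", "left", *cond) for cond in conditions}

for (interval, side), label in labels.items():
    print(f"attend {interval} {side}: {label}")
```

The point of the sketch is that the label depends only on the match between stimulus and instruction, so each physical stimulus contributes to all four cells of the design across blocks.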

Fig. 1

Schematic illustration of the task outline, exemplary for condition attend short left. Participants heard on- and offset markers (S1 and S2, respectively; indexed by note symbols) presented by loudspeakers. The SOA between S1 and S2 was 600 ms for the short intervals and 1,200 ms for the long intervals. S1 appeared in the center, whereas S2 was presented either from the right or from the left loudspeaker. Participants had to respond when a deviant stimulus (indicated by the gray note symbol) terminated the attended interval on the attended side. Attended stimuli (for condition attend short left) are marked by arrows. Intervals were separated by 1,000 ms, on average

The first question of the present study was whether temporal and spatial attention modulate the same stages of early auditory processing. We hypothesized that both temporal attention and spatial attention enhance the amplitude of the auditory N1. The second question was whether temporal and spatial attention modulate early auditory processing independently or in conjunction. If temporal and spatial attention influence the processing of auditory stimuli independently, effects of temporal attention should be evident at both the attended and the unattended position, and effects of spatial attention should be found at both the attended and the unattended time point. By contrast, if temporal and spatial attention influence the processing of auditory stimuli in conjunction, effects of temporal and spatial attention should depend on whether the other dimension is attended or not. Third, we asked if and how temporal and spatial attention affect later processing stages. Based on the findings reported in the literature, we expected to find a sustained negative effect of spatial attention over the frontal scalp (e.g. Hansen and Hillyard 1980; Näätänen 1982) and a positive effect of temporal attention over posterior areas (Lange et al. 2003).

Methods

Participants

Fourteen healthy university students participated (three male). Two female participants were excluded from further analysis because of extensive artifacts in the EEG recordings (see Data analyses: ERP data). The final sample comprised data of 12 participants (three male) with a mean age of 23 years (range 20–26 years). All except one were right handed and all reported normal hearing. Participants received a monetary compensation for taking part. Written informed consent was obtained. The experiment followed the ethical standards laid down in the Declaration of Helsinki (1964).

Stimuli and apparatus

Auditory stimuli were presented from two loudspeakers, located at a distance of 60 cm to the participant and separated by 15 cm (see Fig. 1). The stimuli consisted of short (600 ms) and long (1,200 ms) empty intervals that were delimited by auditory on- and offset markers (S1 and S2, respectively). S1 was a 500 Hz sine-wave tone (56 dB(A) at the participants’ ears, duration 50 ms) presented simultaneously from both speakers, which created the impression that the tone originated directly in front of the participants. S2 was a 50 ms long white noise burst that was presented either from the left or from the right speaker. Standard (P=0.66) and deviant intervals (P=0.33) differed in the intensity of their S2. The S2 of a deviant interval was slightly less intense than the S2 of a standard interval (63 dB(A) vs. 68 dB(A) at the participants’ ears). In the following, we refer to the S2 of standard intervals as standards and to the S2 of deviant intervals as deviants. To restrict the time for decision and responding and thus increase task demands, a relatively short inter-trial interval was used (900–1,100 ms, mean 1,000 ms, rectangular distribution). The experiment was controlled by a PC using Presentation® software (http://www.neurobs.com).

Procedure

The experiment was conducted in an electrically shielded, anechoic, darkened room. The participants sat in an adjustable chair with their heads immobilized with a chin-rest. Participants received written instructions and were familiarized with the different interval types (short vs. long, left vs. right, standard vs. deviant). Thereafter, they practiced the experimental task in six blocks. During the experiment proper, which consisted of a total of 32 blocks, the participants were blindfolded. In each block 120 intervals were presented, half of which were short and half of which were long. Half of the intervals ended with a left offset marker and the other half ended with a right offset marker. In separate blocks the participants had to attend either to the short or to the long intervals and either to the left or to the right speaker, yielding four conditions: (1) attend short left, (2) attend short right, (3) attend long left, and (4) attend long right. This procedure made it possible to analyze the processing of the same stimulus directly as a function of temporal (attend short vs. attend long conditions) and spatial attention (attend left vs. attend right conditions). Within each block, ten intervals of each category ended with a deviant (overall probability of deviant: P=0.3333). Deviants presented at the attended point in time and at the attended position were targets (P=0.0833). The participants were asked to respond as quickly and as accurately as possible to targets by lifting the index finger out of a light gate. The hand (left or right) used for responding was systematically varied within each participant so that participants responded equally often with the left and with the right hand in each condition. The attended position alternated between blocks, while the attended interval changed every four blocks. One half of the participants began with the attend short condition, the other half started with the attend long condition.
After each block, the participants had the opportunity to have a short rest. Feedback about the number of correct responses of the last block was provided.
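The block composition described above can be checked with a short worked computation. This is only an arithmetic sketch based on the numbers given in the text (120 intervals per block, four interval-by-position categories, ten deviants per category); the variable names are illustrative.

```python
from fractions import Fraction

INTERVALS_PER_BLOCK = 120     # stated in the text
CATEGORIES = 4                # short/long x left/right
DEVIANTS_PER_CATEGORY = 10    # stated in the text

# Each category contributes the same number of intervals to a block.
per_category = INTERVALS_PER_BLOCK // CATEGORIES

# Deviants occur in all four categories; targets are the deviants of the
# single attended category.
p_deviant = Fraction(CATEGORIES * DEVIANTS_PER_CATEGORY, INTERVALS_PER_BLOCK)
p_target = Fraction(DEVIANTS_PER_CATEGORY, INTERVALS_PER_BLOCK)

print(per_category, p_deviant, p_target)
```

Under these numbers each category holds 30 intervals, one third of all intervals end with a deviant, and one twelfth (roughly 0.083) of all intervals end with a target.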

ERP recording

The Electroencephalogram (EEG) was recorded from 61 scalp electrodes (non-polarizable Ag/AgCl electrodes), which were mounted with equal distances in a triangular arrangement into an elastic cap (Easy Cap, FMS). All scalp electrodes were referenced to the right earlobe. An additional electrode at the left earlobe was recorded and served for an off-line re-referencing of the data to a linked-earlobe reference. The electrode impedance was kept at 5 kΩ or below by preparing the skin with Every (Gelimed) and alcohol. For all electrodes, ECI Electrogel (Electrocap International, Inc.) was used as electrolyte. The band pass of the amplifiers (Synamps, Neuroscan) was DC to 100 Hz, and the digitization rate was 500 Hz with a resolution of 16 bit. Horizontal eye movements were monitored with a bipolar recording of two electrodes attached to the outer canthi of the eyes. Vertical eye movements were measured with an electrode placed below the right eye recorded against the right earlobe. The electrode impedances of the Electrooculogram (EOG) electrodes were kept at 10 kΩ or below.

Data analyses

Behavioral data

Only reaction times between 200 and 900 ms after S2 were included in the analyses. Responses to targets (deviants of the designated interval and position) were considered as correct responses (hits). Responses to deviants of the incorrect interval but the correct position were considered as temporal false alarms. Responses to deviants of the incorrect position but correct interval were considered as spatial false alarms. Responses to standards of the designated interval and position were considered as standard false alarms. d′ values (d′ = z[p(hit)] − z[p(false alarm)]; Green and Swets 1966) were calculated separately for these different error types. To compare the acuity of temporal and spatial discrimination, the corresponding d′ values were compared with a paired-samples t test.
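As a minimal sketch of the sensitivity measure, d′ can be computed as the difference between the z-transformed hit rate and the z-transformed false alarm rate, where z is the inverse of the standard normal cumulative distribution function. The function name `d_prime` and the example rates are illustrative assumptions, not values or code from the study.

```python
from statistics import NormalDist

def d_prime(p_hit, p_false_alarm):
    """d' = z(p_hit) - z(p_false_alarm); z is the inverse standard normal CDF."""
    z = NormalDist().inv_cdf
    return z(p_hit) - z(p_false_alarm)

# Made-up rates for illustration only (not data from the study).
example = d_prime(0.85, 0.10)
print(round(example, 2))
```

Rates of exactly 0 or 1 have no finite z value, so `inv_cdf` raises an error for them; how such cases were handled in the study is not stated in the text, and any correction would be an additional assumption.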

ERP data

Off-line, a linked earlobe reference was computed. The EOG recordings served for off-line rejection of trials with eye movements. Segments were removed whenever (1) a voltage change between two consecutive sampling points exceeded 50 μV, (2) the maximum absolute voltage difference between two sampling points in the segment exceeded 100 μV at any electrode within the 700 ms long time epoch following S2 (eye movement artifacts), or (3) activity was less than 0.10 μV for a time epoch longer than 100 ms (amplifier saturation). For two participants some noisy channels (never more than three) were replaced by the average amplitudes of the five or six surrounding electrodes. The raw data of three participants were digitally filtered with a low pass filter (Butterworth Zero Phase; 3 dB attenuation at 40 Hz; slope: 12 dB/oct) to eliminate high frequency noise.
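The three rejection criteria can be sketched for a single channel as follows. This is an illustration under stated assumptions rather than the procedure actually implemented: the function name `reject_segment` is hypothetical, the segment is assumed to be a list of voltages in μV sampled at 500 Hz, and criterion (3) is interpreted here as a sliding window longer than 100 ms whose voltage range stays below 0.10 μV.

```python
import math

def reject_segment(samples_uv, srate_hz=500, step_uv=50.0, range_uv=100.0,
                   flat_uv=0.10, flat_ms=100):
    """Return True if a single-channel segment (microvolts) should be rejected."""
    # (1) Voltage jump between two consecutive sampling points.
    steps = [abs(b - a) for a, b in zip(samples_uv, samples_uv[1:])]
    if any(s > step_uv for s in steps):
        return True
    # (2) Maximum absolute difference between any two samples in the segment.
    if max(samples_uv) - min(samples_uv) > range_uv:
        return True
    # (3) Flat line (assumed reading): any window longer than flat_ms whose
    #     voltage range is below flat_uv.
    win = int(flat_ms * srate_hz / 1000) + 1
    for start in range(len(samples_uv) - win + 1):
        window = samples_uv[start:start + win]
        if max(window) - min(window) < flat_uv:
            return True
    return False

# Synthetic 700 ms segments at 500 Hz (350 samples), for illustration only.
clean = [10.0 * math.sin(i / 5) for i in range(350)]
jump = list(clean)
jump[100] += 60.0                       # violates criterion (1)
drift = [0.4 * i for i in range(350)]   # violates criterion (2)
flat = [0.0] * 350                      # violates criterion (3)

print([reject_segment(s) for s in (clean, jump, drift, flat)])
```

Because the text applies the criteria "at any electrode", a multi-channel segment would be dropped as soon as this check returns True for one of its channels.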

To increase the signal-to-noise ratio for ERPs, the electrodes were remapped to ipsi- and contralateral recording sites with respect to the hemifield of stimulation, and ERPs to left and right stimuli of identical conditions were collapsed (see Footnote 1). Stimuli were re-labeled as attended time (T+) or unattended time (T−) and attended position (P+) or unattended position (P−), yielding the four conditions (1) attended time, attended position (T+P+), (2) unattended time, attended position (T−P+), (3) attended time, unattended position (T+P−), and (4) unattended time, unattended position (T−P−). The subset of electrodes used for statistical analysis was combined into clusters of three electrodes each, which were specified by one level of factor hemisphere (contra- vs. ipsilateral to stimulation) and one level of factor cluster (medial frontal, central, central anterior, lateral temporal, central posterior, medial posterior, lateral posterior; see also Fig. 2).

Fig. 2

Outline of the electrode montage. The clusters used for analysis [medial frontal (MF), central (C), central anterior (CA), lateral temporal (LT), central posterior (CP), medial posterior (MP), and lateral posterior (LP)] are marked in the montage. Electrodes ipsilateral to stimulation are shown on the left, electrodes contralateral to stimulation are shown on the right

In attention research it is essential that the participants cannot predict which stimulus will be presented next. In the present task, as in most temporal attention paradigms, this criterion is violated for the long intervals. Once the short interval has elapsed, the remainder of the trial is fully predictable: if S2 occurred at the end of the short interval, no stimulus will follow at the end of the long interval; if it did not, it is certain that, and when, S2 will occur (see Lange et al. 2003; Coull and Nobre 1998). Therefore, ERPs to offset markers of the long intervals cannot be interpreted unequivocally, and we report data of the short intervals only (analyses of the long interval data, however, yielded overall the same results; see Footnote 2 and also Lange et al. 2003).

Temporal attention effects were assessed by the comparison of the ERPs to physically identical stimuli at the attended versus unattended point in time ([ERP(T+P+) + ERP(T+P−)] minus [ERP(T−P+) + ERP(T−P−)]). Correspondingly, spatial attention effects were examined by comparing the ERPs to stimuli at the attended versus unattended position ([ERP(T+P+) + ERP(T−P+)] minus [ERP(T+P−) + ERP(T−P−)]). Moreover, effects of temporal attention were analyzed separately for the attended [ERP(T+P+) vs. ERP(T−P+)] and for the unattended position [ERP(T+P−) vs. ERP(T−P−)], and effects of spatial attention were analyzed separately for the attended [ERP(T+P+) vs. ERP(T+P−)] and for the unattended time point [ERP(T−P+) vs. ERP(T−P−)].
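The contrasts defined above can be written out as a small sketch. The function name `attention_effects`, the dictionary keys (with an ASCII hyphen standing in for the minus sign), and the example amplitudes are illustrative assumptions; the amplitudes are made up and are not values from the study.

```python
def attention_effects(erp):
    """Compute the contrasts from mean amplitudes (microvolts) per condition.

    erp maps the labels 'T+P+', 'T-P+', 'T+P-', 'T-P-' to amplitudes.
    """
    temporal = (erp["T+P+"] + erp["T+P-"]) - (erp["T-P+"] + erp["T-P-"])
    spatial = (erp["T+P+"] + erp["T-P+"]) - (erp["T+P-"] + erp["T-P-"])
    return {
        "temporal_main": temporal,
        "spatial_main": spatial,
        "temporal_at_attended_position": erp["T+P+"] - erp["T-P+"],
        "temporal_at_unattended_position": erp["T+P-"] - erp["T-P-"],
        "spatial_at_attended_time": erp["T+P+"] - erp["T+P-"],
        "spatial_at_unattended_time": erp["T-P+"] - erp["T-P-"],
    }

# Made-up amplitudes for illustration only (not values from the study).
demo = attention_effects({"T+P+": -3.0, "T-P+": -3.0, "T+P-": -3.0, "T-P-": -2.0})
for name, value in demo.items():
    print(f"{name}: {value:+.1f}")
```

In this formulation each main contrast is the sum of its two simple contrasts, so a main effect can be carried entirely by one level of the other factor, which is the situation the separate analyses are designed to detect.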

Event-related potentials to S2 of short intervals were separately averaged for the two levels of attended time (attended vs. unattended) and the two levels of attended position (attended vs. unattended). A 100-ms pre-S2 baseline was used in order to eliminate possible baseline differences between conditions elicited by S1. Based on an inspection of the grand averages, both temporal and spatial attention effects were assessed by calculating the mean amplitude of time epoch 90–130 ms (N1). Moreover, the mean amplitudes between 150 and 400 ms were analyzed for effects of spatial attention and the mean amplitudes between 300 and 380 ms were analyzed for effects of temporal attention. For each of these time epochs, a repeated-measures ANOVA with the factors attended time (attended vs. unattended), attended position (attended vs. unattended), hemisphere (contralateral vs. ipsilateral) and cluster (seven levels) was conducted. Only effects involving the experimental factors (attended time and/or attended position) are reported. Higher order interactions were examined with appropriate sub-ANOVAs (O’Brien and Kaiser 1985). All statistics were calculated with the SAS software. The Huynh–Feldt correction was applied in order to compensate for violations of the sphericity assumption (Huynh and Feldt 1976). The corrected probabilities together with the corresponding ε-values are reported.
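The mean amplitude measure with a 100-ms pre-S2 baseline can be sketched as follows. The function name `mean_amplitude`, the epoch layout (samples starting 100 ms before S2 onset), and the inclusive window boundaries are assumptions made for illustration; only the 500 Hz digitization rate, the baseline length, and the time windows come from the text.

```python
def mean_amplitude(epoch_uv, start_ms, end_ms, srate_hz=500, baseline_ms=100):
    """Baseline-corrected mean amplitude between start_ms and end_ms after S2.

    epoch_uv: samples (microvolts) of one channel, beginning baseline_ms
    before S2 onset. Window boundaries are treated as inclusive (assumption).
    """
    ms_per_sample = 1000 / srate_hz
    n_base = int(baseline_ms / ms_per_sample)      # samples before S2 onset
    baseline = sum(epoch_uv[:n_base]) / n_base
    first = n_base + int(start_ms / ms_per_sample)
    last = n_base + int(end_ms / ms_per_sample)
    window = epoch_uv[first:last + 1]
    return sum(window) / len(window) - baseline

# Synthetic epoch for illustration: 100 ms baseline plus 700 ms post-S2,
# constant at 1 microvolt except for a deflection between 90 and 130 ms.
epoch = [1.0] * 400
for i in range(95, 116):                           # samples at 90..130 ms
    epoch[i] = -4.0

n1 = mean_amplitude(epoch, 90, 130)
late = mean_amplitude(epoch, 150, 400)
print(n1, late)
```

One such value per participant, condition, hemisphere, and cluster would then form the input to the repeated-measures ANOVA described above.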

Results

Behavioral data

Participants were able to discriminate the two intervals and the two positions very well (d′(Time) = 2.07, SE = 0.20; d′(Position) = 2.95, SE = 0.14). However, the accuracy of spatial discrimination was higher than that of temporal discrimination (t(11) = −7.43, P<0.0001). Standards and targets were also well distinguishable (d′(Standards) = 2.44, SE = 0.24). Participants responded to less than 0.01% of standard stimuli with no or only one attended feature and deviant stimuli with no attended feature.

ERP data

Event-related potentials were characterized by an N1 (peaking around 120 ms; larger over the hemisphere contralateral to stimulation) and a P2 (peaking around 200 ms, Fig. 3). In the time range of the N1 (90–130 ms), an enhanced negativity was observed both for stimuli at the attended as compared to the unattended point in time and for stimuli at the attended as compared to the unattended position. Subsequent spatial attention effects consisted of a long-lasting negativity over the frontal scalp (150–400 ms). Late effects of temporal attention were observed in an enhancement of a posterior positivity, maximally pronounced between 300 and 380 ms.

Fig. 3

Grand average ERPs elicited by Standards as a function of attended time and attended position (T+ versus T-: solid lines vs. dashed lines, P+ versus P-: thick lines vs. thin lines). The depicted clusters (central and medial posterior) are marked black in the electrode montage shown in the middle. Electrodes ipsilateral to stimulation are shown on the left, electrodes contralateral to stimulation are shown on the right. The analyzed effects are indicated by arrows. All traces are aligned with respect to a 100-ms-long pre-S2 baseline. The onset of S2 is indicated with a dashed vertical line

Time epoch 90–130 ms

In the time range of the auditory N1 (between 90 and 130 ms), both a temporal attention effect and a spatial attention effect were observed (see Table 1, panel A; Fig. 3). The attended time by attended position interaction indicated that effects of temporal and spatial attention were not independent. Therefore, the temporal attention effect was separately analyzed for the attended and the unattended position, and the spatial attention effect was separately analyzed for the attended and the unattended time (see below). Topographic differences between the temporal and spatial attention effects were not observed (Interactions between attended time and attended position and cluster and/or hemisphere: all P>0.20).

Table 1 ANOVAs on mean amplitudes between 90 and 130 ms

The temporal attention effect

The temporal attention effect was significant only for the unattended position (Table 1, panel B; Fig. 4, panel A) and was most pronounced over the central scalp (main effect attended time for central, anterior central and temporal clusters: P<0.01, for frontal and central posterior clusters: P<0.05).

Fig. 4

Grand average ERPs elicited by Standards. a T+ (solid lines) versus T- (dashed lines), for the unattended side. b P+ (thick lines) versus P- (thin lines), for the unattended time point. The ipsilateral and contralateral central clusters shown are marked black in the electrode montage in the middle of the figure. The clusters, for which the main effect of both temporal and spatial attention was significant, are shaded gray. The arrows point towards the N1-attention effects. All traces are aligned with respect to a 100-ms-long pre-S2 baseline. The onset of S2 is indicated with a dashed vertical line. On the right, the normalized topographies (Stanine values, mean = 5, SD = 2) of the temporal and the spatial attention effects are shown (top view). Smaller values (darker shading) indicate a relatively more negative amplitude for the attended than the unattended condition

The spatial attention effect

The spatial attention effect was significant only for the unattended point in time (Table 1, panel C; Fig. 4, panel B) and was most pronounced over the fronto-central scalp (main effect attended position for frontal, central, and anterior central clusters: P<0.01, for lateral temporal and central posterior clusters: P<0.05).

Topographic comparison of the effects of temporal and spatial attention

To compare the scalp topography of the temporal N1 attention effect at the unattended position with the scalp topography of the spatial N1 attention effect at the unattended time, the difference potentials ERP(T+P−) minus ERP(T−P−) and ERP(T−P+) minus ERP(T−P−) were calculated. These scores were submitted to repeated measures ANOVA with factors Attention Type (temporal vs. spatial), hemisphere (contralateral vs. ipsilateral) and cluster (seven levels). Analyses were conducted both on the raw (Urbach and Kutas 2002) and on the normalized difference values (normalized separately for each level of Attention Type and Participant; mean = 5, SD = 2; McCarthy and Wood 1985). Neither analysis revealed significant differences between the topographies of the temporal attention effect and the spatial attention effect (raw scores: all F<1; normalized scores: all P>0.31).
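The normalization to mean = 5 and SD = 2 can be sketched as a linear rescaling of the difference values across recording sites, applied separately for each participant and attention type. The function name `normalize_topography`, the use of the population standard deviation, the absence of rounding to integer categories, and the example values are assumptions for illustration; the text does not specify these details.

```python
from statistics import mean, pstdev

def normalize_topography(values, target_mean=5.0, target_sd=2.0):
    """Rescale difference amplitudes across sites to a fixed mean and SD.

    values: difference amplitudes (microvolts) of one participant and one
    attention type, one value per hemisphere-by-cluster site.
    Uses the population SD (assumption); fails if all values are equal.
    """
    m = mean(values)
    sd = pstdev(values)
    return [target_mean + target_sd * (v - m) / sd for v in values]

# Made-up difference values for illustration only (not data from the study).
effect = [-2.0, -1.0, 0.0, 1.0]
doubled = [2 * v for v in effect]      # same shape, twice the amplitude

norm_effect = normalize_topography(effect)
norm_doubled = normalize_topography(doubled)
print([round(v, 3) for v in norm_effect])
```

Two topographies that differ only in overall amplitude map onto the same normalized profile, which is why the normalized scores isolate differences in the shape of the scalp distribution.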

Spatial attention effects between 150 and 400 ms

Following the N1 attention effects, spatial attention was associated with an enlarged negativity for the attended as compared to the unattended position (Table 2, panel A; Fig. 3, top). The four-way interaction of factors attended time, attended position, cluster, and hemisphere indicated that the spatial attention effect was not identical for the attended and unattended time. Separate analyses for the attended and the unattended time nevertheless demonstrated that a spatial attention effect was reliably measured in both conditions (Table 2, panels B and C). For both the attended and the unattended time, the negative difference had a fronto-central maximum and was significant for the medial frontal, the central, central anterior, lateral temporal, and central posterior clusters (attended time: all P<0.0004; unattended time: all P<0.0009). For the lateral posterior cluster, a significant main effect of attended position was observed for the attended time (P=0.0216) but just failed to reach significance for the unattended time (P=0.0647). Differences between the attended and the unattended point in time were due to a larger ipsilateral than contralateral effect of spatial attention in the medial frontal and central anterior clusters for the attended (attended position × hemisphere in medial frontal and central anterior clusters: P<0.0001 and P=0.0194, respectively) but not for the unattended point in time (attended position × hemisphere in single clusters: all P>0.1479).

Table 2. ANOVAs on mean amplitudes between 150 and 400 ms

Temporal attention effects between 300 and 380 ms

Between 300 and 380 ms, a posterior-parietal positivity was enhanced for stimuli at the attended as compared to the unattended point in time (Table 3, panel A; Fig. 3, bottom). Interactions involving factors attended time and attended position indicated that the effect was differently pronounced for the attended and unattended position. Separate analyses for the attended and unattended position confirmed, however, that the posterior temporal attention positivity was significant for both (Table 3, panels B and C). For the unattended position, main effects of temporal attention were reliable for the medial posterior (P<0.01) and for the central posterior and lateral posterior clusters (P<0.05), whereas for the attended position, there was only a tendency for a temporal attention effect for the medial posterior cluster (F(1, 11)=4.00, P=0.0707).

Table 3. ANOVAs on mean amplitudes between 300 and 380 ms

Discussion

The present study investigated whether temporal or spatial or both features are used for stimulus selection at auditory processing stages commonly associated with perceptual analysis. Both temporal and spatial attention were associated with a relative enhancement of the N1, suggesting that both initially modulate the same early processing stage. Moreover, the temporal and spatial N1 attention effects were not independent. Later ERP effects, however, suggest that temporal and spatial information activate separate processing streams: Spatial attention elicited a sustained fronto-central negativity, whereas temporal attention was associated with an enhanced posterior positivity.

Do temporal and spatial attention modulate the same stages of early auditory processing?

Both temporal and spatial attention were associated with an enhancement of the auditory N1, which has been associated with early, possibly sensory, processing stages. It can therefore be concluded that temporal and spatial attention modulate the same stage of early auditory processing. The temporal and spatial N1 attention effects were maximally pronounced over the central and fronto-central scalp, respectively, resembling the scalp distributions of temporal and spatial N1 attention effects known from the earlier studies (temporal: Lange et al. 2003; spatial: e.g. Giard et al. 1988; Woldorff and Hillyard 1991; Näätänen et al. 1992; see also Woods 1990). The scalp topographies of both the temporal and the spatial attention effect were symmetrical in the present study. By contrast, several studies have reported that spatial attention effects are more pronounced over the hemisphere contralateral to stimulation (e.g. Teder-Sälejärvi et al. 1999; Woldorff and Hillyard 1991; Woods and Alain 2001). These earlier studies used either headphones or eccentricities of 80°. The lack of a contralateral scalp distribution of the spatial attention effect in the present study may be due to the rather small eccentricities of the speakers (7° from the midline). Given the enhancement of the auditory N1 by both temporal and spatial attention and the similar scalp distributions of the temporal and the spatial N1 effect, it might be speculated that at least partly overlapping mechanisms are involved in temporal and spatial attention.

The scalp topographies of both attention effects are in line with the assumption that they originate in auditory brain areas. The spatial N1 attention effect in the auditory modality has been associated with a relative increase of excitability of neural networks coding the attended as compared to the unattended channel (e.g. Hillyard et al. 1973; see also Giard et al. 1988; Woldorff et al. 1993; Woods et al. 1994). By contrast, Näätänen (1982, 1990) has attributed the early attention related negativity to a matching process, which compares the incoming stimulus with an actively maintained representation of the to-be-attended stimulus presumably in auditory cortex (attentional trace hypothesis). Näätänen (1982) even discussed the possibility that the attentional trace “can be activated and deactivated according to the temporal expectancies the subject has developed in the situation” (Näätänen 1982, p.630), suggesting a common attention mechanism that uses both spatial and temporal information.

In contrast to the analog representation of space in visual cortex, it has been proposed that different locations in auditory space are coded along clusters (e.g. Cohen and Knudsen 1999) or in distributed neural populations (Middlebrooks 2000). Little is known about the representation of temporal information in the brain, but it has been suggested that a population code may be used (for review see e.g. Buonomano and Karmarkar 2002; Mauk and Buonomano 2004). In a recent study, Leon and Shadlen (2003) found that the firing rates of neurons in the parietal cortex of awake monkeys were sensitive to elapsed time relative to the remembered duration of a visual stimulus. Behavioral accuracy of the monkeys was correlated with the duration-specific activity of parietal neural ensembles rather than with single-neuron activity. It might be speculated that this or other timing mechanisms (e.g. Gibbon et al. 1984; Buonomano and Karmarkar 2002) are used by attentional control mechanisms that in turn up- and down-regulate the excitability in sensory areas.

Do temporal and spatial attention modulate early auditory processing independently or in conjunction?

Auditory N1 effects of temporal and spatial attention were not independent, suggesting that temporal and spatial attention modulate early auditory processing in conjunction. Strikingly, effects of temporal attention were reliably recorded only for the unattended position and effects of spatial attention were reliably recorded only for the unattended point in time. It may be speculated that the processing of attended standards meeting both selection criteria is not further improved as compared to the processing of stimuli with just one attended feature (ceiling effect). Nevertheless, this finding is consistent with the notion that temporal and spatial features are simultaneously used to orient attention and to modulate early, perceptual analyses.

In a recent study with visual stimuli, Doherty et al. (2005) induced spatial and temporal expectancies by presenting a moving ball that followed either a constant or a random spatial trajectory and moved at either a constant or a random temporal rate. ERPs to visual target stimuli showed an enhancement of the P1 when only spatial attention was induced. The earliest effects of pure temporal attention were observed later and started with an amplitude modulation of the N1 (Doherty et al. 2005). However, when both temporal and spatial information about the upcoming visual stimulus was provided, the amplitude of the visual P1 was larger than with spatial attention alone. Thus, temporal attention seems to modulate the visual P1 in conjunction with spatial attention. It may be suggested that in vision, spatial information is used for early stimulus selection by default. Other (e.g. temporal) information may be associated with later stages of stimulus selection, but might be flexibly included in early selection processes if beneficial for the task.

Do temporal and spatial attention affect later processing stages?

Consistent with earlier findings on spatial and temporal attention, a negative difference and a late positivity were observed, respectively. Thus, temporal and spatial attention affect processing stages following the N1 time range differently, possibly indicating that after initial selection temporal and spatial features are processed by different neuronal systems.

Spatial attention was associated with a large, fronto-central negativity to stimuli at the attended as compared to the unattended position, which was evoked by stimuli at the attended and at the unattended point in time. Such a negative difference is well known for spatial attention in the auditory modality (e.g. Näätänen et al. 1978; Hansen and Hillyard 1980) and has been interpreted as an index of further processing of relevant stimuli after an initial selection (Hansen and Hillyard 1980, 1983). It might be speculated that further processing of stimuli at the attended spatial position is associated with an updating of the representation of spatial features of the attended stimuli, which starts with the presentation of S2.

By contrast, the updating of the representation of the temporal features of the attended stimulus might be by and large finished with the presentation of S2, because the relevant temporal information is provided by the duration of the S1–S2 interval. Accordingly, similar to our earlier study (Lange et al. 2003), temporal attention was associated with a late posterior positivity between 300 and 380 ms. Miniussi et al. (1999) proposed that the P300-like temporal attention effect indicates improved motor preparation (see also Nobre 2001). However, this explanation does not hold for the positivity observed in the present study since, in contrast to the study of Miniussi et al. (1999), participants did not respond to the analyzed stimuli. It has been suggested that the P300 reflects processes associated with the termination of a perceptual epoch (“context closure”; Verleger 1988). Consistent with this, it might be speculated that the presentation of S2 at the attended point in time terminates the maintenance and the updating of the relevant time interval.

Conclusion

Temporal and spatial features seem to be used simultaneously at early stages of the auditory selection sequence since both temporal and spatial attention modulated the N1 and since these effects were mutually dependent. By contrast, effects of temporal and spatial attention were qualitatively different at later stages of auditory processing, possibly due to the different updating processes necessary for the temporal and the spatial representations.