Introduction

This study explored the spatio-temporal characteristics of the cortical response to emotional visual stimuli, using experimental conditions that have not been previously employed, and which may favor the elicitation of neural processes probably present in natural conditions. Due to their high temporal resolution and their capability for providing information on the origin of the recorded activity through source localization algorithms, event-related potentials (ERPs) are particularly well suited to study the spatio-temporal characteristics of emotional processes. Two experimental conditions are usually employed in ERP research on affective processes.

First, emotional pictures are briefly presented (typically, for less than 500 ms). This is the traditional strategy in ERP methodology, and has also been applied to studies of affective processes. Although brief presentations provide important data on emotional processing, particularly with respect to the initial phases of attention- and evaluation-related processes, complementary data obtained through longer presentations appear to be necessary. In fact, in natural conditions, we are often exposed to long-duration affective events. Emotion- or motivation-related reactions elicited by short-duration stimuli are obviously not as intense or as complete as those elicited by longer-lasting emotional events (more than 1 s). The consequence is that, using brief presentations, it is difficult to study important affective processes (both attraction- and rejection-related) which need effective and intense affective reactions. One example is the defense response (Sokolov 1963), which needs rejection of the stimulation to be produced. However, there are some exceptions to the “norm” of brief presentations, and a number of recent ERP experiments employ sustained affective stimulation (i.e., more than 1 s: Schupp et al. 2000, 2003; Cuthbert et al. 2000; Keil et al. 2002). These studies indeed show relevant information with respect to late phases of the response to emotional pictures.

Secondly, the ERP research on affective responses, rather than using indirect or implicit tasks, typically employs direct or explicit tasks (i.e., participants are asked to categorize emotional stimuli as positive, neutral or negative during the recording phase itself) or requires no active task at all (i.e., subjects simply have to look at emotional and non-emotional pictures). Non-indirect procedures may lead subjects to consider that some stimuli are more important than others (i.e., emotional, salient stimuli, more important for an experiment on emotional reactions than neutral ones). This ‘relevance-for-task effect’ has often been described in previous studies (e.g., Duncan-Johnson and Donchin 1977), which showed that stimuli considered by subjects to be highly relevant for the task elicit higher amplitudes in certain endogenous components than less relevant stimuli. This cognitive interference overlaps with affective effects. Indeed, the ERPs elicited by emotional stimuli in direct tasks are different from those elicited in indirect tasks (e.g., Carretié et al. 1997).

No previous studies exist, to the best of our knowledge, combining sustained stimulation and indirect tasks. These ‘natural’ conditions may provide complementary data to those obtained in previous ERP studies, and the present experiment will employ them in response to pleasant, unpleasant and neutral visual stimuli. Our main interest is to analyze the temporal and spatial characteristics of the neural response to these types of sustained stimuli (presented for 1.5 s) in conditions of low cognitive interference.

Materials and methods

Participants

Thirty students (21 women) from the Universidad Autónoma de Madrid, with an age range of 19 to 28 years (mean = 21.7) voluntarily took part in this experiment. They gave their informed consent to participate, reporting normal or corrected-to-normal visual acuity. The experiment was approved by the research ethics committee of the Instituto Pluridisciplinar (Universidad Complutense de Madrid), where the recording phase took place.

Stimuli and procedure

Twenty-four different visual stimuli were presented to subjects. In all cases, image sizes were between 48.5° (high) × 35.8° (wide) and 25.6° × 35.8°. These 24 stimuli were of three types (n = 8 in each case): arousing positive (A+), arousing negative (A-) and Neutral (N). They were all taken from the International Affective Picture System (IAPS; Lang et al. 2001). These images were selected on the basis of their scores in Arousal and Valence (two affective dimensions widely considered to explain the principal variance of emotional meaning: Lang et al. 1993; Osgood et al. 1957; Russell 1979; Smith and Ellsworth 1985), in such a way that A+ and A- were symmetrical in Valence and similar in Arousal, and N stimuli were neutral in both dimensions (Appendix). In each of the three types of stimuli (A+, A- and N), half of the images (4) contained a person, and the other half did not. Each set of eight stimuli (A+, A- and N) was presented five times, so that the total number of presentations was 120. Participants were instructed to look continuously at a small mark located in the center of the screen. Each of the images was presented for 1,500 ms. This long exposure period was particularly important for the scope of the study, as explained in the Introduction. Order of presentation was random.

Subjects were instructed to press one button if a person appeared in the image, and a different button if there was no person. Subjects had to wait until a 100-ms beep sounded, 1,600 ms after image onset, to respond. The aim of this indirect task was twofold. First, it was designed to help maintain the level of attention, and indeed, none of the 30 subjects who completed the experiment failed in more than 5% of the trials (those in which subjects responded erroneously were eliminated, as explained subsequently). The second and principal objective of the indirect task was to avoid making it easy for subjects to consider that some of the stimuli were more important than others (e.g., emotional stimuli more important than neutral), in order to prevent the ‘relevance-for-task effect’ described in the Introduction.

Recording

Electroencephalographic data were recorded using an electrode cap (ElectroCap International) with tin electrodes. A total of 58 scalp locations homogeneously distributed all over the scalp were used. All scalp electrodes, as well as one electrode at the left mastoid (M1), were originally referenced to one electrode at the right mastoid (M2). For the entire sample of electrodes, the originally M2-referenced data were re-referenced offline to the average of the two mastoids (M1 and M2).
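The offline re-referencing step can be illustrated with a short sketch. It is not the authors' software; it only assumes that every stored channel equals the site potential minus the M2 potential, so that subtracting half of the recorded M1 channel yields an average-mastoid reference. The function name, array shapes and random test potentials are hypothetical.

```python
import numpy as np

def rereference_to_average_mastoids(data_m2_ref, m1_m2_ref):
    """Convert M2-referenced recordings to an average-mastoid reference.

    data_m2_ref: (n_channels, n_samples), each row is V_channel - V_M2.
    m1_m2_ref:   (n_samples,), the M1 channel recorded as V_M1 - V_M2.
    """
    # V_ch - (V_M1 + V_M2) / 2 == (V_ch - V_M2) - (V_M1 - V_M2) / 2
    return data_m2_ref - 0.5 * m1_m2_ref

# Hypothetical "true" potentials, used only to check the identity above.
rng = np.random.default_rng(0)
v_scalp = rng.normal(size=(58, 10))      # 58 scalp sites, 10 samples
v_m1 = rng.normal(size=10)
v_m2 = rng.normal(size=10)

recorded = v_scalp - v_m2                # what is stored with an M2 reference
recorded_m1 = v_m1 - v_m2
reref = rereference_to_average_mastoids(recorded, recorded_m1)
expected = v_scalp - 0.5 * (v_m1 + v_m2)
```

In this toy check, `reref` matches `expected` up to floating-point error, which is the algebraic identity noted in the comment.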

Electrooculographic (EOG) data were recorded supra- and infra-orbitally (vertical EOG) and from the left versus right orbital rim (horizontal EOG). Electrode impedances were always kept below 3 kΩ. A bandpass of 0.1 to 50 Hz (3 dB points for -6 dB/octave roll-off) was used for the recording amplifiers. The channels were digitized continuously at a sampling rate of 125 Hz for the entire duration of the recording session. The continuous recording was divided into 1,700 ms epochs for each trial, beginning 200 ms before stimulus onset. Trials in which subjects responded erroneously were eliminated. In addition, epochs containing eye movements or blinks were identified by visual inspection and eliminated. This artifact and error rejection led to the average elimination of 8.23 A+ trials (standard error of the mean, SEM: 0.63), 7.73 A- trials (SEM: 0.57) and 8.20 N trials (SEM: 0.60). The ERP averages were categorized according to each type of stimulus (A+, A-, N).
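As an illustration of the epoching arithmetic, the sketch below cuts one epoch from a continuous recording and subtracts the prestimulus mean (the baseline subtraction mentioned in the Results). Two assumptions are made here and are not stated in the text: sample counts are obtained by integer truncation (1,700 ms at 125 Hz is 212.5 samples, taken as 212), and the baseline is the mean of the 25 prestimulus samples. The function names and test data are hypothetical.

```python
import numpy as np

SFREQ_HZ = 125                 # sampling rate reported in the text
PRE_MS, EPOCH_MS = 200, 1700   # prestimulus interval and total epoch length

def ms_to_samples(ms, sfreq_hz=SFREQ_HZ):
    # Integer truncation is an assumption of this sketch.
    return ms * sfreq_hz // 1000

def extract_epoch(continuous, onset_sample):
    """Cut one baseline-corrected epoch from a (n_channels, n_samples) array."""
    n_pre = ms_to_samples(PRE_MS)        # 25 samples
    n_total = ms_to_samples(EPOCH_MS)    # 212 samples
    start = onset_sample - n_pre
    epoch = continuous[:, start:start + n_total]
    # Subtract each channel's mean over the prestimulus interval.
    baseline = epoch[:, :n_pre].mean(axis=1, keepdims=True)
    return epoch - baseline

# Hypothetical continuous data with a DC offset on every channel.
rng = np.random.default_rng(2)
continuous = rng.normal(size=(58, 1000)) + 5.0
epoch = extract_epoch(continuous, onset_sample=300)
```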

Data analysis

Detection and quantification of ERP components

The waveform recorded at a site on the head over a period of several hundreds of milliseconds represents a complex superposition of different overlapping potentials or components. Such recordings, especially when high-density montages are employed, can stymie visual inspection. The complex superpositions involved can also complicate efforts of source location. In the present study, components explaining most ERP variance were detected and quantified through covariance-matrix-based temporal principal component analysis (tPCA), a strategy that has been repeatedly recommended for these tasks, since the exclusive use of traditional visual inspection of grand averages and voltage computation may lead to several types of misinterpretation (e.g., Chapman and McCrary 1995; Coles et al. 1986; Dien et al. 2005; Donchin and Heffley 1978; Fabiani et al. 1987).

The main advantage of tPCA is that it presents each ERP component with its ‘clean’ shape, extracting and quantifying it free of the influences of adjacent or subjacent components (traditional grand averages often show components in a distorted way, and may even fail to show some of them). In brief, the tPCA computes the covariance between all ERP time points, which tends to be high between those time points involved in the same component, and low between those belonging to different components. The solution is therefore a set of independent factors made up of highly covarying time points, which ideally correspond to ERP components. Temporal factor score, the tPCA-derived parameter in which extracted temporal factors may be quantified, is equivalent to amplitude. In this study, the decision on the number of factors to select was based on the scree test (Cliff 1987). Extracted factors were submitted to varimax rotation.
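The logic of a covariance-based tPCA followed by varimax rotation can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: the number of factors is passed in directly rather than chosen with a scree test, factor scores are obtained by least-squares projection onto the rotated loadings, and the data are synthetic (two non-overlapping Gaussian "components" with independent random amplitudes). All function names are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=200, tol=1e-10):
    """Orthogonal varimax rotation of a (n_timepoints, n_factors) matrix."""
    n_rows, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        target = rotated**3 - rotated * (rotated**2).sum(axis=0) / n_rows
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt
        current = s.sum()
        if previous > 0 and current / previous < 1 + tol:
            break
        previous = current
    return loadings @ rotation

def temporal_pca(erps, n_factors):
    """erps: (n_observations, n_timepoints); one row per recorded waveform."""
    centered = erps - erps.mean(axis=0)
    cov = np.cov(centered, rowvar=False)            # time x time covariance
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    keep = np.argsort(eigvals)[::-1][:n_factors]    # largest first
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    rotated = varimax(loadings)
    # Least-squares factor scores (one score per observation and factor).
    scores = centered @ np.linalg.pinv(rotated).T
    return rotated, scores

# Synthetic example: two components centered on time points 20 and 60.
rng = np.random.default_rng(0)
t = np.arange(100)
early = np.exp(-0.5 * ((t - 20) / 5.0) ** 2)
late = np.exp(-0.5 * ((t - 60) / 5.0) ** 2)
amps = rng.normal(size=(500, 2))
erps = amps[:, [0]] * early + amps[:, [1]] * late
erps = erps + 0.01 * rng.normal(size=(500, 100))

loadings, scores = temporal_pca(erps, n_factors=2)
peaks = sorted(int(i) for i in np.abs(loadings).argmax(axis=0))
```

In this toy case the rotated loadings should peak at the two time points where the simulated components were placed, and each column of `scores` plays the role of the amplitude of one temporal factor.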

Analysis of the experimental effects

The analysis of experimental effects required the ERPs, recorded at 58 globally distributed scalp points, to be grouped into different scalp regions, since the ERP components frequently behave differently in some scalp areas than in others (e.g., present opposite polarity or react differently to experimental manipulations). This regional grouping was determined through a covariance matrix-based spatial PCA (sPCA). The sPCAs were carried out, for each of the temporal factors, on their temporal factor scores (directly related to amplitudes, as earlier explained), in order to delimit and quantify these regions. This system is preferable to the traditional, a priori subdivision into geometrically defined scalp regions, since sPCA demarcates them according to the real behavior of each scalp point recording (Carretié et al. 2003; Spencer et al. 1999): basically, each region or spatial factor is formed with the scalp points where recordings tend to covary. As a result, the shape of the sPCA-configured regions is functionally based, and scarcely resembles the shape of the traditional, geometrically configured regions. Moreover, each spatial factor can be quantified through the spatial factor score, a single parameter that reflects the amplitude of the whole spatial factor.

Repeated-measures ANOVAs on these spatial factors with respect to stimuli (three levels: A+, A- and N) were carried out. The Greenhouse–Geisser (GG) epsilon correction was applied to adjust the degrees of freedom of the F ratios where necessary, and post hoc comparisons to determine the significance of pairwise contrasts were performed using the Bonferroni procedure (alpha = 0.05).
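For readers unfamiliar with the correction, the following sketch computes a one-way repeated-measures F ratio, the Greenhouse–Geisser epsilon and the epsilon-adjusted degrees of freedom with NumPy alone. It is an illustration on made-up numbers, not the statistical software used in the study; p-values are omitted because they require an F distribution routine, and the per-comparison Bonferroni threshold assumes the three pairwise contrasts among A+, A- and N.

```python
import numpy as np

def rm_anova_gg(data):
    """One-way repeated-measures ANOVA with Greenhouse-Geisser epsilon.

    data: (n_subjects, n_levels), e.g. spatial factor scores for A+, A-, N.
    Returns F, (df1, df2), epsilon, and the epsilon-corrected (df1, df2).
    """
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    grand = data.mean()
    ss_cond = n * ((data.mean(axis=0) - grand) ** 2).sum()
    ss_subj = k * ((data.mean(axis=1) - grand) ** 2).sum()
    ss_error = ((data - grand) ** 2).sum() - ss_cond - ss_subj
    df1, df2 = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df1) / (ss_error / df2)

    # Epsilon from the double-centered covariance matrix of the levels.
    cov = np.cov(data, rowvar=False)
    centered = (cov - cov.mean(axis=0)
                - cov.mean(axis=1, keepdims=True) + cov.mean())
    epsilon = np.trace(centered) ** 2 / ((k - 1) * (centered ** 2).sum())
    return f_value, (df1, df2), epsilon, (epsilon * df1, epsilon * df2)

# Made-up scores for three subjects (rows) and three conditions (columns).
toy_scores = [[1, 2, 3],
              [2, 3, 5],
              [3, 4, 4]]
f_value, dfs, epsilon, corrected_dfs = rm_anova_gg(toy_scores)

# Bonferroni: divide alpha by the number of pairwise contrasts.
n_pairs = 3
alpha_per_test = 0.05 / n_pairs
```

Epsilon is bounded between 1/(k - 1) and 1, so with three levels the corrected degrees of freedom lie between half and all of the uncorrected ones.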

Source-location

The next step was to three-dimensionally locate the cortical regions that are sensitive to the experimental effects. Low-resolution brain electromagnetic tomography (LORETA) was applied to temporal factor scores. LORETA is a 3D, discrete linear solution for the EEG inverse problem (Pascual-Marqui 1999; Pascual-Marqui et al. 1994). Although, in general, solutions provided by the EEG-based source-location algorithms should be interpreted with caution due to their potential error margins, LORETA solutions have shown significant correspondence with those provided by haemodynamic procedures in the same tasks (Dierks et al. 2000; Vitacco et al. 2002), their error margin being modest (14 mm: Pascual-Marqui 1999; Pascual-Marqui et al. 1994). Moreover, the large sample size employed in the present study (n = 30), the use of tPCA-derived factor scores instead of direct voltages (which leads to more accurate source-localization analyses: Carretié et al. 2004b), and the estimation of convergence with sPCA data, contribute to reducing this error margin. In its current version, LORETA refers to a three-shell spherical model registered to the Talairach human brain atlas (Talairach and Tournoux 1988).

Results

Experimental effects

Figure 1 shows six grand averages for each stimulus condition once the baseline value (prestimulus recording) had been subtracted from each ERP. These grand averages correspond to a selection of scalp sites where experimental effects and relevant components (described subsequently) are clearly appreciable. As a consequence of the application of the tPCA, seven components were extracted from the ERPs (Fig. 2). As can be seen in Table 1, the sPCAs subsequently applied to temporal factor scores extracted a frontal and a posterior spatial factor (or region) for each temporal factor, and an additional central region in factors 1 and 7 (see also Fig. 3).

Fig. 1

Mean responses to the three types of stimuli (A- arousing negative; N neutral; A+ arousing positive) at a selection of medial prefrontal, medial central and lateral parieto-occipital sites where the experimental effects (described in the text) are easily distinguishable

Fig. 2

tPCA: Factor loadings after Varimax rotation. Temporal factors 2 (P400), 3 (P680) and 4 (P160), which were sensitive to the experimental effects, are drawn in black

Table 1 Statistical details of contrasts in which differences between A- and the rest of stimuli, and/or between A+ and the rest of stimuli, were significant

Fig. 3

Spatial factors (extracted for P160, P400 and P680) which were sensitive to the experimental treatment (SF spatial factor)

As earlier indicated, repeated-measures ANOVAs on these spatial factors with respect to stimuli (three levels: A+, A- and N) were carried out, together with Bonferroni post hoc tests. Our aim was to detect those components in which A- elicited higher amplitudes than the rest of the stimuli (A+ and N), which would indicate a processing bias towards negative stimuli, and those components in which A+ elicited higher amplitudes than the rest (A- and N), which would indicate a bias towards positive stimuli. Table 1 shows the statistical details resulting from these analyses, which revealed at least one of these two patterns of differentiation in temporal factors 2, 3 and 4. Factor peak-latency and topography characteristics (Figs. 1, 3) associate factor 2 (peaking at 392 ms) with the wave labeled P400 in grand averages (Fig. 1), factor 3 (680 ms) with P680, and factor 4 (160 ms) with P160. These labels will be employed hereafter to make the results easier to understand. Specifically, and as can be seen in Table 1, A- elicited significantly higher amplitudes than the rest of the stimuli in the frontal spatial factor of P160, and in both the frontal and posterior factors of P680. Meanwhile, A+ elicited higher amplitudes than the rest of the stimuli in the posterior factor of P400. Figure 3 shows the distribution of these spatial factors.

Source location

Ninety LORETA solutions were calculated (30 subjects × 3 conditions: A+, A- and N) for each of the three components mentioned earlier (P160, P400 and P680). In order to detect the foci responsible for the experimental effects, solutions were calculated, on a voxel-by-voxel basis, and according to ANOVA results (Table 1), for the differences between A+ and the average of A- and N in the case of P400, and between A- and the average of A+ and N in the cases of P160 and P680. The average of these differences is shown in Fig. 4, and Table 2 shows the main focus for each ERP component. As can be appreciated, the main foci are the ventromedial prefrontal cortex for P160, the middle temporal gyrus for P400 and the precentral gyrus for P680.
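The voxel-by-voxel contrast described above amounts to simple arithmetic on the per-subject solutions, as the following sketch shows. It does not call any LORETA software: the inputs are random arrays standing in for current-density values, the voxel count of 50 is an arbitrary toy size, and the "focus" at voxel 7 is planted by hand purely to make the example checkable.

```python
import numpy as np

def contrast_vs_rest(target, other_a, other_b):
    """Voxel-wise difference: target condition minus the mean of the others.

    Each input: (n_subjects, n_voxels). Returns the across-subject average
    of the per-subject differences, one value per voxel.
    """
    diff = target - 0.5 * (other_a + other_b)
    return diff.mean(axis=0)

rng = np.random.default_rng(1)
n_subjects, n_voxels = 30, 50            # 50 voxels is a toy size
a_pos = rng.random((n_subjects, n_voxels))
a_neg = rng.random((n_subjects, n_voxels))
neutral = rng.random((n_subjects, n_voxels))
a_neg[:, 7] += 2.0                       # hypothetical focus for A-

# Contrast used for P160 and P680: A- versus the average of A+ and N.
neg_map = contrast_vs_rest(a_neg, a_pos, neutral)
focus = int(np.argmax(neg_map))
```

For P400 the same function would be called with the A+ solutions as the target and the A- and N solutions as the other two conditions.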

Fig. 4

Images of neural activity computed with LORETA for the differences between A- and the average of A+ and N, in the case of P160 and P680 factor scores, and between A+ and the average of A- and N in the case of P400 factor scores (A- arousing negative; N neutral; A+ arousing positive; details in the text). Three orthogonal brain views in Talairach space, sliced through the region of the maximum activity, are illustrated. Left slice axial, seen from above, nose up; center slice sagittal, seen from the left; right slice coronal, seen from the rear

Table 2 Main foci provided by LORETA for the experimental effects observed in P160, P400 and P680

Discussion

Sustained unpleasant stimulation elicits a pattern of neural reaction that has not been observed, in the complete form seen here, in previous ERP literature using short stimulation and/or non-indirect tasks. Specifically, the present study has revealed a three-phase response (P160, P400 and P680) in which a ‘negative-positive-negative’ pattern is observed (i.e., negative stimuli elicit the highest amplitudes in the first and last phases, and positive stimuli elicit the highest amplitude in the second phase). Source-location algorithms indicate the ventromedial prefrontal cortex, middle temporal gyrus and precentral gyrus as responsible for the effects observed in the first, second and third phase, respectively. There follows a step-by-step description of the process, together with a functional interpretation of the data.

The initial amplitude increase in response to unpleasant pictures, reflected in frontal P160, is in agreement with previous data showing a rapid response to negative stimuli that are briefly presented (Carretié et al. 2001, 2004a; Kawasaki et al. 2001; Northoff et al. 2000; Pourtois et al. 2004) or even unconsciously perceived (Carretié et al. 2005). This group of studies shows, through different tasks and methodologies, that, between 100 and 200 ms, prefrontal regions of the brain respond maximally to unpleasant emotional stimuli. This initial and rapid response to negative stimuli may reflect, as explained in the Introduction, an adaptive neural mechanism that aids survival by facilitating rapid coping with aversive or dangerous events (Cacioppo and Gardner 1999; Taylor 1991). According to recent findings, the prefrontal cortex receives early visual inputs that may enable the organization of a rapid response and trigger, in some cases, a top–down attentional regulation (see Bar 2003; Bar et al. 2006). In relation to this, anatomical studies on ventromedial prefrontal cortex (orbitofrontal) connections show that this cortical area sends projections to ‘executive’ structures responsible for the motor and autonomic responses to aversive events, such as the amygdala and the premotor cortex (e.g., Cavada et al. 2000), and to posterior areas of the brain involved in visual attention and object recognition (Armony and Dolan 2002; Bar 2003; Bar et al. 2006; Hopfinger et al. 2000; Sarter et al. 2001). It is therefore reasonable to associate this frontal P160 activation in response to negative stimuli with initial processes related to a top–down prefrontal regulation of attentional and/or executive processes.

The second step, reflected in P400, consists of an increase in the response to pleasant pictures (and a decrease in response to unpleasant ones, which elicit a similar amplitude to that elicited by non-emotional stimuli). It should be pointed out that P400-like components (late positive potentials) are usually observed in experiments employing sustained emotional visual stimulation (e.g., Cuthbert et al. 2000; Schupp et al. 2000, 2003). Interestingly, the configuration of this component depends on the stimulus duration (e.g., Cuthbert et al. 2000). This P400 dependence on stimulus exposure period may reflect a sort of recurrent analysis of the stimulation that permits ‘deep processing’ of the environment. In the present case, P400 originated in the visual cortex. In relation to our argumentation, recent fMRI data show that the activation of diverse visual cortical areas increases as duration of visual stimulation increases (Huettel et al. 2004). The bias of P400-like components towards appetitive or positive events has also been observed in previous ERP studies using indirect tasks (Carretié et al. 2001; Delplanque et al. 2004; direct tasks elicit different trends: see Cuthbert et al. 2000; Schupp et al. 2000, 2003). In line with what we presented earlier, this bias may reflect recurrent analyses directed to deep processing of those events about which subjects are motivated. In fact, recent haemodynamic data reveal that visual cortex activation shows significant increases in response to motivating stimuli such as those employed in the present study (Simmons et al. 2005).

The third phase is reflected in P680, and we have found no references to a similar process in previous ERP literature. This phase consists, again, of an amplitude increase in response to negative stimuli, and the effect is observed over the entire scalp, being particularly strong in statistical terms. Source-location solutions suggest that the main generator for this phase is the left precentral gyrus (BA4), which is the location of the left primary motor cortex (M1). The precise point of activation within the left M1 corresponds to the right hand/fingers (i.e., the hand/fingers with which participants had to respond: Fig. 4). This result suggests a bias, in motor-related neural processes, towards negative stimuli. Why is the left M1 activated at 680 ms from stimulus onset if motor response is not required until 1,600 ms? It is important to note at this point that M1 is activated not only during motor execution, but also during anticipatory motor processes such as motor planning or motor preparation. Thus, both fMRI (Rodríguez et al. 2004) and ERP (Caldara et al. 2004) studies indicate that planning or preparing hand movements activates M1 to the same extent as performing them. Indeed, at this stage, subjects have already perceived the stimulus, and are probably anticipating the forthcoming button-press movement (1 or 2) corresponding to that stimulus. Several experimental findings suggest that this motor bias towards negative stimuli is an adaptive pattern associated with the flight–fight response typically elicited by aversive stimuli, and that pleasant or neutral stimuli do not need such intense motor reactions (e.g., Pahlavan et al. 2000). In fact, behavioral responses have been found to be biased towards negative stimuli (Coombes et al. 2006; Fox et al. 2000; Hansen and Hansen 1988; Öhman et al. 2001).

By way of conclusion, it can be stated that cortical response to sustained emotional visual stimulation presented within implicit tasks shows peculiar characteristics that complement data obtained under other experimental conditions. Thus, while the present results concerning P160 are similar to those observed in other studies using shorter stimulus durations and/or explicit tasks, the subsequent motivation-related processes reflected in P400 and, particularly, the motor-related biases associated with P680, appear to be better observed under the present experimental conditions. We consider, therefore, that the combination of sustained stimulation and indirect tasks constitutes a useful strategy in the study of affective processes. Controlling some potential limitations of the present experiment, such as the repetition effects that may have been elicited by presenting each picture more than once (five times each), would be an advisable strategy in future experiments within this line of research.