7.1 Introduction

Humans are social creatures. They need to interact with others, and communication plays a significant role in living comfortably together. Broadly, communication can be divided into two groups: verbal communication, the process of conveying and expressing meaning through spoken language, and non-verbal communication, which, as its name suggests, conveys meaning without words.

Looking across daily life, humans can never be free from this second type of communication. Even while speaking, they combine their words with arm movements, facial expressions, and so on, to help the listener understand the information more easily. The first major scientific study of non-verbal communication, especially facial communication, was published by Darwin in 1872. Darwin concluded that many expressions and their meanings (e.g., for astonishment, shame, fear, horror, pride, hatred, wrath, love, joy, guilt, anxiety, shyness, and modesty) are universal: ‘I have endeavoured to show in considerable detail that all the chief expressions exhibited by man are the same throughout the world’ (Darwin 1872). Researchers such as Silvan Tomkins, Carrol Izard, and Paul Ekman supported the universality of facial expressions. However, many other researchers did not. Margaret Mead, Gregory Bateson, Edward Hall, Ray Birdwhistell, and Charles Osgood argued that expressions and gestures vary across cultures and are learnt through social interaction; they noted that in many cultures people keep smiling even when they have suffered a loss. Birdwhistell concluded that everything that is socially important, such as the expression of emotions, must be the product of learning and will therefore differ among cultures (Ekman 2003).

For our purposes, whether facial expressions are products of culture or universal is not so important. The two groups viewed emotional expression from different points of view, so their conclusions naturally differed. The group claiming that facial expressions are products of culture examined them in the context of social interaction, whereas the universality group examined individual perception of single expressions without the bias of social display. Our concern is how the brain processes emotional information, especially facial expressions; therefore, we adopt the universality point of view, an individual perception without the bias of social faces. A happy expression is seen as happy and an angry one as angry, without any social disturbance.

As is well known, humans can distinguish both the type of a facial expression and its emotional level. For example, we can easily distinguish happiness from anger, and we can also evaluate the emotional level of an expression, whether it reflects total joy or mania or just ordinary happiness. Many researchers have shown that the brain responds more strongly to emotionally charged stimuli than to neutrally rated ones (Fredikson et al. 1995; Adolphs and Tranel 2003; Balconi and Pozzoli 2003), but few, if any, have explored the effect of emotional strength on brain signals. In this chapter, we would like to answer the fundamental questions ‘How does the brain distinguish the type of emotion?’ and ‘How does the brain understand the strength of the emotion from facial expressions?’, that is, when and in which areas the brain processes these kinds of emotional information.

7.2 Selected Research on Emotional Effects in the Brain

When we see agony or sadness in someone's face, we instinctively try to help, to support, to do something to relieve it. Agony or sadness in a face is a kind of sign, a call for help from others. Happiness, surprise, and the other facial expressions likewise have their own special meanings to be responded to. In general, humans react differently depending on the facial expression of the person they face.

Recognizing facial expressions is one of the most highly developed human skills; even babies precociously respond to different facial expressions (Field et al. 1982). Clinical studies show that even people with Möbius syndrome, a disorder producing facial paralysis, are able to recognize facial expressions (Philipps et al. 1998; Calder et al. 2000). Experimental studies on normal subjects showed that when subjects were asked to make quick judgments of emotional expressions, their reaction times in judging the emotional content of the presented expression were equal for familiar and unfamiliar faces. From these results, we can conclude that recognizing a facial expression is a special process, independent of structural recognition and face identity.

Cells within the temporal visual cortices have long been known to show robust responses to faces, which are modulated by two factors: attention and emotion. Yet the large number of psychological studies of recognizing emotion from facial expressions conducted over the past decades shows sheer diversity in their findings. The wealth of neurobiological findings from experiments involving lesions, event-related potentials (ERP), magnetoencephalography (MEG), positron emission tomography (PET), and functional magnetic resonance imaging (fMRI) precludes any simple summary and argues against isolating only a few structures. Instead, it is becoming clear that recognizing facial emotion draws on multiple strategies sub-served by a large array of different brain structures. The neuropsychology of emotion has stressed the left–right brain dimension as fundamental for emotional valence (Heller et al. 1998), with right-hemispheric superiority when processing negative connotations of incoming information and left superiority for positive connotations (Schwartz et al. 1975; Reuter-Lorenz and Davidson 1981; Natale et al. 1983). Clinical studies showed the corresponding lateralization: depression is associated predominantly with left-hemispheric lesions, and inappropriate cheerfulness with right-hemispheric lesions (Robinson 1995). Placing less emphasis on the left hemisphere for emotional judgments, other researchers concluded that structures in the right hemisphere appear to be important for the normal processing of emotional and social information (Benowitz et al. 1983; Borod et al. 1985; Bowers et al. 1985). In other lesion studies, damage to the right-hemisphere somatosensory cortices (RSS) caused impairment in recognizing the six basic emotions from facial expressions (Heberlein et al. 2003).

Recent ERP studies have supported the hypothesis that the process of facial expression recognition starts very early in the brain. Around 120 ms post-stimulus onset is assumed to be the first perceptive stage, in which the subject completes the ‘structural code’ of the face, which is thought to be processed separately from complex facial information such as emotional meaning (Lane et al. 1998; Pizzagalli et al. 1999a; Junghofer et al. 2001; Utama et al. 2009). In addition to the ‘structural code,’ an ‘expression code’ implicated in the decoding of emotional facial expressions is supposed to exist at around 170 ms post-stimulus onset (Bruce and Young 1998; Ellis and Young 1998). Faces with emotions elicited a larger negative peak at approximately 270 ms than neutral faces over the posterior visual area (Sato et al. 2001), and the differences in the peak amplitude of the brain potential were affected by the experienced emotional intensity, related to the arousal and unpleasantness value of the stimulus (Balconi and Pozzoli 2003; Utama et al. 2009).

7.3 Psychophysics Experiment

Action and reaction are two things that occur naturally for humans, and so for the brain. A human response can be regarded as a response of the brain, especially when it corresponds to actions related to higher-level mechanisms such as cognitive and emotional tasks. A psychophysics experiment explores the actions and reactions of humans, or of the brain, under manipulated situations and conditions in order to study their relation scientifically. This concept can be applied in many experimental settings, including the elucidation of how emotional information is processed in the brain.

ERP is a method that investigates changes in brain signals corresponding to presented stimuli. In this chapter, we combine ERP with a psychophysics experiment using facial images that were subjectively rated by normal healthy humans on two dimensions: the type and the emotional level. By type of emotion, we mean that each image was rated according to which type of facial emotion it belonged to, such as happiness, sadness, disgust, fear, surprise, anger, or neutral; by emotional level, we mean that each image was rated on ten levels, from one (1) to ten (10), describing the lowest to the highest emotional level of the facial expression, respectively. For images of the neutral facial expression, the emotional level was set to zero (0) by default.
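
For concreteness, the following is a minimal sketch, with hypothetical field and variable names, of how a single subjective rating could be represented in code; it simply enforces the scheme described above (seven emotion types, levels 1–10, and level 0 for neutral faces).

```python
from dataclasses import dataclass

EMOTION_TYPES = ["happiness", "sadness", "disgust", "fear",
                 "surprise", "anger", "neutral"]

@dataclass
class SubjectiveRating:
    stimulus_id: str        # identifier of the presented image (hypothetical)
    emotion_type: str       # one of EMOTION_TYPES as judged by the subject
    intensity_level: int    # 1 (lowest) .. 10 (highest); 0 for neutral

    def __post_init__(self):
        if self.emotion_type not in EMOTION_TYPES:
            raise ValueError(f"unknown emotion type: {self.emotion_type}")
        if self.emotion_type == "neutral":
            # Neutral faces are assigned level 0 by definition.
            self.intensity_level = 0
        elif not 1 <= self.intensity_level <= 10:
            raise ValueError("intensity level must be in 1..10")

# Example: a face judged as 'happiness' at level 7.
rating = SubjectiveRating("actor1_happy_60pct", "happiness", 7)
```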

For this subjective rating, the original images of facial expressions were taken from Ekman and Friesen (1976). The images were black-and-white pictures of three (3) male actors presenting happy, disgusted, fearful, surprised, angry, sad, and neutral faces. Each emotional image was morphed from the neutral face toward each emotional type in nine steps of 10 % increments. The total number of stimuli for this experiment was 198 photos, comprising 18 neutral faces and 180 emotional faces from the three (3) different actors. Samples of the morphed images used as stimuli can be seen in Fig. 7.1.
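
The sketch below illustrates the idea of generating transitional images in 10 % increments. It uses a simple pixel-wise cross-dissolve between placeholder arrays; the actual stimuli were produced with a proper feature-based morphing procedure, so this illustrates only the incremental blending.

```python
import numpy as np

def linear_morph(neutral: np.ndarray, emotional: np.ndarray, alpha: float) -> np.ndarray:
    """Blend a neutral face toward a full-emotion face.

    alpha = 0.0 returns the neutral face, alpha = 1.0 the 100 % emotional
    face; intermediate values give the transitional images.
    """
    return (1.0 - alpha) * neutral + alpha * emotional

# Nine transitional images in 10 % increments (10 %, 20 %, ..., 90 %).
neutral = np.zeros((256, 256))      # placeholder grayscale images
emotional = np.ones((256, 256))
morph_series = [linear_morph(neutral, emotional, a / 100.0)
                for a in range(10, 100, 10)]
```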

Fig. 7.1

Sample stimuli. a Sample of original images derived from the Ekman and Friesen collections: the neutral face and the facial emotions of anger, disgust, fear, happiness, sadness, and surprise. The images were cropped with the same outline (6° × 8°). Numbers designate the morph intensity, with 0 % as the neutral face and 100 % as the original image of the facial emotion. b Transitional images from the neutral face to the emotion disgusted, shown in 10 % increments

Besides recording the subjective ratings of the experimental subjects, their brain signals were also recorded throughout the experiments. These two kinds of data, subjective ratings and brain signals, are then inter-correlated to elucidate the temporal effect of facial expressions on the human brain. The task of subjectively rating the presented facial expression is also one way of keeping the subject awake and alert during the experiment.

We presented facial expressions as still photo images at the center of a 21′′ CRT monitor (1,280 × 1,024, 100 Hz) placed approximately 70 cm from the subjects, with a horizontal visual angle of 4° and a vertical angle of 6°. To fixate the subjects’ gaze, a white dot was presented at the center of the display during the transitions between image presentations and between trials. The subject was told to stare at the fixation point, to watch the stimulus carefully, and to perform the press-button tasks as quickly as possible after the instruction was shown. Details of the experimental design are shown in Fig. 7.2.

Fig. 7.2

Experimental design of the psychophysics experiment. A trial started with the presentation of a blank gray screen with a white dot at the center as the fixation point. A neutral face was then presented, followed by a randomly selected transitional image of one of the emotional facial expressions from the same actor. The subject was required to identify the type of emotion and to assess the intensity of each image. The presentation time [s] is shown at the left side of each panel
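
The following is a minimal sketch of one trial of this design in Python using PsychoPy. The durations, key assignments, and image paths are placeholders (the actual presentation times are those indicated in Fig. 7.2), so it illustrates the trial structure rather than the exact protocol.

```python
from psychopy import visual, core, event

win = visual.Window(size=(1280, 1024), color='gray', units='deg')
fixation = visual.Circle(win, radius=0.1, fillColor='white', lineColor='white')

def run_trial(neutral_path, morph_path):
    # Fixation point on a blank gray screen.
    fixation.draw(); win.flip(); core.wait(1.0)            # assumed duration
    # Neutral face of the actor.
    visual.ImageStim(win, image=neutral_path, size=(4, 6)).draw()
    win.flip(); core.wait(0.5)                             # assumed duration
    # Randomly selected transitional (morphed) image of the same actor.
    visual.ImageStim(win, image=morph_path, size=(4, 6)).draw()
    win.flip(); core.wait(0.5)                             # assumed duration
    # Press-button tasks: emotion type (DET) and intensity level (INT).
    det = event.waitKeys(keyList=['1', '2', '3', '4', '5', '6', '7'])
    intensity = event.waitKeys(keyList=[str(k) for k in range(10)])
    return det, intensity
```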

7.4 Brain Signal Recordings

Although many medical imaging techniques can be used to record brain activity, electroencephalography is still the least dangerous, the least expensive, and yet the most temporally accurate method for elucidating brain activity from its signals. Electroencephalography is the science of recording and analyzing the electrical activity of the brain. The spontaneous electrical activity of the brain is recorded in the time domain from several electrodes placed on the scalp. The electrodes are linked to an electroencephalograph, an amplifier connected to a mechanism that converts the electrical impulses into digital data displayed on a computer screen. The digital data, or the printout, of the spontaneous brain activity is called an electroencephalogram (EEG) (see Fig. 7.3).

Fig. 7.3

a EEG system. b Samples of raw EEG

7.4.1 Brief History of Electroencephalography

The first demonstration that information in the nervous system may be transmitted electrically was an experiment in which electricity applied to the nerve trunks of dead frogs induced movements of their legs. This experiment was performed by the Italian physiologist Luigi Galvani during the eighteenth century. In 1870, two Prussian (now German) physicians, Gustav Theodor Fritsch and Eduard Hitzig, confirmed Galvani’s work by electrically stimulating areas of the motor cortex of a dog’s brain, which caused involuntary muscular contractions of specific parts of its body. It was not until 1875, however, that the Liverpool physician Richard Caton became the first person to record electrical activity in the brain, by placing electrodes directly on the brains of vivisected rabbits and monkeys. Using a primitive measuring device known as a mirror galvanometer, in which a moving mirror amplifies very small voltages, he reported finding feeble currents in the cerebral cortex, the outermost layer of the brain (Finger 1994).

Electrophysiological recordings became much more fashionable after Hans Berger (1873–1941) published the first human EEG in 1929. The first human EEG was recorded using electrodes (made of lead, zinc, platinum, etc.) attached to the intact skull and connected to an oscillograph. Berger made 73 EEG recordings from his fifteen-year-old son, Klaus. The first frequency he encountered was the 10-Hz range (8–12 Hz), which was at first called the Berger rhythm and is now called the alpha rhythm. He reported that the brain generates electrical impulses, or ‘brain waves.’ The brain waves changed dramatically when the subject simply shifted from sitting quietly with eyes closed (alpha waves) to sitting quietly with eyes open (beta waves). Furthermore, the brain waves also changed when the subject sat quietly with eyes closed while ‘focusing’ on solving a math problem (beta waves). That is, the electrical brain wave pattern shifts with attention (O’Leary 1970). Hans Berger’s 1929 publication changed neurophysiology forever, and because of it he earned the title ‘Father of Electroencephalography.’

7.4.2 The Electroencephalograph

The system for recording the spontaneous activity of the brain is the electroencephalograph. It is safe, and very few risks are associated with it. Some locations on the scalp are cleaned by removing dead cells and oils before the electrodes are attached. The placement of the electrodes depends on the purpose of the study. In this chapter, we base the electrode positioning on the American 10–10 system of electrode nomenclature and placement. It contains 73 recording electrodes plus one ground electrode and another for the system reference at the nose tip. All of these electrode positions are embedded in a cap. In addition to those electrodes, we attached two pairs of electrodes to record horizontal and vertical movements of the eyes. We placed these electrodes on the outer canthi of the two eyes for detecting horizontal eye movements and on the infra- and supra-orbital ridges of the left eye for vertical eye movements (see Fig. 7.4).

Fig. 7.4

Electrode nomenclature and placement

In this work, we used amplifiers of the SynAmps system (Neuroscan) with an amplification of 25,000 times the voltage between the active electrode and the reference. The amplified signal is digitized via an analog-to-digital converter after being passed through an anti-aliasing filter. Electroencephalograms (EEGs) and electrooculograms (EOGs) for the eyes were recorded continuously with a band-pass filter of 0.1–100 Hz and a digitization rate of 1,000 Hz.
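
An offline approximation of this 0.1–100 Hz band-pass can be sketched in Python with SciPy as follows; the filter order and the zero-phase implementation are assumptions for illustration, not the SynAmps hardware characteristics.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 1000.0  # digitization rate [Hz]

def bandpass(eeg: np.ndarray, low=0.1, high=100.0, order=4) -> np.ndarray:
    """Zero-phase Butterworth band-pass applied along the time axis."""
    b, a = butter(order, [low / (FS / 2), high / (FS / 2)], btype='band')
    return filtfilt(b, a, eeg, axis=-1)

# Example: 73 channels x 10 s of simulated data.
eeg = np.random.randn(73, int(10 * FS))
filtered = bandpass(eeg)
```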

7.4.3 The EEG Artifacts

Although EEG is designed to record cerebral activity, it also records electrical activity arising from sites other than the brain. Recorded activity that is not of cerebral origin is termed artifact and can be divided into physiological and non-physiological artifacts. Physiological artifacts are generated by the subjects themselves and include cardiac, glossokinetic, muscle, eye movement, respiratory, and pulse artifacts, among many others. The EEG recording can also be contaminated by numerous non-physiological artifacts generated in the immediate surroundings of the subject. Common non-physiological artifacts include those generated by monitoring devices, infusion pumps, the electric power system, and electrode pops; spikes originating from a momentary change in the impedance of a given electrode may also contaminate the EEG record.

Severe contamination of EEG activity by artifacts is a serious problem for EEG interpretation and analysis. The easiest way to remove artifacts is simply to reject contaminated EEG epochs, but this causes a considerable loss of collected information. In this study, we apply independent component analysis (ICA) to the multi-channel EEG recordings and remove a wide variety of artifacts from the EEG records by eliminating the contributions of artifactual sources to the scalp sensors (Jung et al. 2000a, b). ICA-based artifact correction can separate and remove a wide variety of artifacts from EEG data by linear decomposition. The ICA method is based on three assumptions: first, the time series recorded on the scalp are spatially stable mixtures of the activities of temporally independent cerebral and artifactual sources; second, the summation of potentials arising from different parts of the brain, scalp, and body is linear at the electrodes; and third, the propagation delays from the sources to the electrodes are negligible. The second and third assumptions are quite reasonable for EEG data, and given enough input data, the first assumption is reasonable as well. The method uses spatial filters derived by the ICA algorithm and does not require a reference channel for each artifact source. Once the independent time courses of the different brain and artifact sources are extracted from the data, artifact-corrected EEG signals can be derived by eliminating the contributions of the artifactual sources.
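
A minimal sketch of the ICA-based correction idea is given below, using scikit-learn's FastICA as a stand-in for the decomposition used in EEGLAB or similar tools; the selection of artifactual components (here passed in by hand) is the step that in practice requires inspection of the component topographies and time courses.

```python
import numpy as np
from sklearn.decomposition import FastICA

def remove_artifact_components(eeg: np.ndarray, bad_components: list[int]) -> np.ndarray:
    """Decompose a channels x samples EEG array, zero out artifactual
    sources, and project the remaining sources back to the electrodes."""
    n_channels, _ = eeg.shape
    ica = FastICA(n_components=n_channels, random_state=0)
    sources = ica.fit_transform(eeg.T)          # samples x components
    sources[:, bad_components] = 0.0            # eliminate artifactual sources
    return ica.inverse_transform(sources).T     # back to channels x samples

# Example: remove the first component from simulated 73-channel data.
cleaned = remove_artifact_components(np.random.randn(73, 5000), bad_components=[0])
```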

7.4.4 Reducing the Non-physiological Artifacts

Reducing the artifacts in the EEG record is an important step. Before the experiment, it is highly recommended to clear the experimental room of any electrical devices that are not related to the experiment, and it is better to perform the experiment in an electrically shielded room. For artifacts from the electric power supply or the display monitor, applying a specific digital notch filter before or after EEG recording is probably the easiest and best way to reduce their effect. For electrode-pop and electrostatic artifacts, treatment after EEG recording was the only way to reduce their effect on the EEG record in this study. These artifacts originate at the electrodes: the electrode-pop artifact is caused by a drying electrode or a slight mechanical instability that changes the area of the electrode surface in contact with the skin, whereas the electrostatic artifact is caused by movement of the electrode wires between the electrodes on the head and the electrode board, or by other objects moving relative to the input electrode leads. We cannot prevent these artifacts from happening, but we may be able to reduce their effect on the EEG by always using clean electrodes and by asking subjects to sit still and move as little as possible during the recording. Stable electrodes are also a necessity. In this study, ICA decomposition was applied to detect the ICA components containing electrode-pop and electrostatic artifacts, and those components were rejected to reduce the effect of these artifacts on the EEG record. The elimination of these ICA components was carried out with the EEGLAB toolbox in Matlab (Delorme and Makeig 2004).
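
A digital notch filter of the kind mentioned above can be sketched as follows; the mains frequency (50 or 60 Hz) and the quality factor are assumptions that depend on the recording environment.

```python
import numpy as np
from scipy.signal import iirnotch, filtfilt

FS = 1000.0  # sampling rate [Hz]

def notch_power_line(eeg: np.ndarray, line_freq=50.0, quality=30.0) -> np.ndarray:
    """Apply a digital notch filter at the power-line frequency.

    Use 50 or 60 Hz depending on the local mains supply, and repeat at
    harmonics if they are visible in the spectrum.
    """
    b, a = iirnotch(line_freq, quality, fs=FS)
    return filtfilt(b, a, eeg, axis=-1)

cleaned = notch_power_line(np.random.randn(73, 5000))
```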

7.4.5 Reducing the Physiological Artifacts

Myogenic, or muscle, potentials are the most common artifacts in EEG recordings. The frontalis and temporalis muscles (e.g., during clenching of the jaw) are common sources. Generally, the potentials generated in the muscles are of shorter duration than those generated in the brain and are easily identified on the basis of duration, morphology, and rate of firing (i.e., frequency). Another common physiological artifact is eye movement, which can simulate a plausible EEG slow wave of eyeball origin. Eyeball artifacts are caused by the potential difference between the cornea and the retina, which is quite large compared with cerebral potentials. When the eye is completely still, this is not a problem, but there are nearly always small or large reflexive eye movements, which generate a potential that is picked up by the frontopolar and frontal leads. Involuntary eye movements, known as saccades, are caused by the ocular muscles, which also generate electromyographic potentials. Purposeful or reflexive eye blinking also generates electromyographic potentials, but more importantly, the reflexive movement of the eyeball during blinking gives a characteristic artifactual appearance to the EEG.

These two artifacts are the focus of the physiological artifact treatment in this study. Preventive actions such as asking subjects to control their blinking and taking breaks between recordings may reduce the occurrence of these artifacts. Applying ICA-based artifact correction can separate and remove these artifacts from the EEG data (Jung et al. 2000a, b). Several useful heuristics can be used to discriminate them, as sketched below: eye movements project mainly to frontal sites with a low-pass time course; eye blinks also project to frontal sites but have large punctate activations; and muscle artifacts usually project to temporal sites with a spectral peak above 20 Hz. Samples of the specific topographies of these artifacts can be seen in Fig. 7.5.
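
These heuristics can be turned into a rough, illustrative classifier of ICA components as below; the thresholds and the set of frontal electrode indices are assumptions for illustration, not values used in the study.

```python
import numpy as np
from scipy.signal import welch

FS = 1000.0  # sampling rate [Hz]

def label_component(topography: np.ndarray, time_course: np.ndarray,
                    frontal_idx: np.ndarray) -> str:
    """Heuristically label one ICA component.

    topography  : mixing weights, one value per electrode
    time_course : activation of the component over time
    frontal_idx : indices of frontal electrodes in the montage (assumed known)
    """
    frontal_weight = np.mean(np.abs(topography[frontal_idx])) / np.mean(np.abs(topography))
    freqs, psd = welch(time_course, fs=FS, nperseg=1024)
    peak_freq = freqs[np.argmax(psd)]

    if peak_freq > 20.0:
        return "muscle"                    # spectral peak above 20 Hz
    if frontal_weight > 1.5 and peak_freq < 5.0:
        return "eye movement / blink"      # strong low-frequency frontal projection
    return "brain (keep)"

# Example with simulated data and assumed frontal electrode indices 0-7.
print(label_component(np.random.randn(73), np.random.randn(10000), np.arange(8)))
```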

Fig. 7.5

Topography of EEG artifacts. a Eye artifact. b Muscle artifact. c Electrode-pop artifact

7.4.6 The Windowing

Recordings of brain electrical activity consist of time-varying measurements of the scalp electric potential field, performed for spontaneous activity (EEG) or for ERPs. In ERP studies, these recordings are interpreted as being formed by a sequence of components. Each component appears as a peak or trough in the voltage-versus-time plot, characterized by a certain amplitude and latency. The different components are assumed to reflect different functional states of the brain, corresponding to different stages of information processing. Therefore, the determination of these functional states and their time sequencing constitutes an important problem in electrophysiology.

7.4.7 K-Means Clustering

In ERP studies, the brain signals are interpreted as being formed by a sequence of components. Instead of viewing these sequences of components in the waveforms, the k-means clustering technique tries to segment the brain activity into a sequence of momentary potential distribution maps (microstates). Put simply, the k-means clustering technique views the multi-channel EEG record as a sequence of microstates. A microstate is a stable topographical scalp field that persists during an extended epoch or time segment (Lehmann and Skrandies 1984). Each microstate presumably reflects a different step, mode, or content of information processing (Michel et al. 1992). We have to be careful, however, because the successive occurrence of microstates does not imply that brain information processing is strictly sequential: the underlying mechanism by which the brain enters a microstate with a given neuronal generator distribution may be composed of any number of sequential or parallel physiological sub-processes.

The scalp electromagnetic field reflects the source distribution in the brain. Due to the non-uniqueness of the electromagnetic inverse problem, different source distributions may produce exactly the same microstate; however, changes in the microstate are undoubtedly due to changes in the source distribution. Brain electrical activity can therefore be seen as a sequence of non-overlapping microstates with variable duration and variable intensity dynamics (Pascual-Marqui et al. 1995). In this technique, the EEG data are assumed to be reference free, and the entire data set at all time points is examined simultaneously. Therefore, the technique should be applied to averaged EEG (or ERP) data after re-referencing to an average reference. The mathematical and statistical details of this method can be found in Pascual-Marqui et al. (1995). The sequential process of this segmentation, or windowing, with the k-means clustering technique is given in Table 7.1, and its result can be seen in Fig. 7.6; a minimal sketch of the clustering step follows the figure.

Table 7.1 Sequential process of windowing with k-means clustering technique
Fig. 7.6

Result of windowing with k-means clustering technique
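
The following is a minimal sketch of the segmentation step, using the standard k-means of scikit-learn as a stand-in for the modified, polarity-invariant microstate algorithm described by Pascual-Marqui et al. (1995); it assumes an average-referenced, channels × time ERP array.

```python
import numpy as np
from sklearn.cluster import KMeans

def segment_microstates(erp: np.ndarray, n_states: int = 4):
    """Segment an averaged ERP into microstates with k-means.

    `erp` is a channels x time array.  Each time point's scalp map is
    average-referenced and normalized, then clustered; the resulting label
    sequence defines the time-range windows.
    """
    maps = erp - erp.mean(axis=0, keepdims=True)     # average reference
    gfp = maps.std(axis=0, keepdims=True)            # global field power
    maps = maps / (gfp + 1e-12)                      # unit-strength maps
    km = KMeans(n_clusters=n_states, n_init=10, random_state=0)
    labels = km.fit_predict(maps.T)                  # one label per time point
    return labels, km.cluster_centers_

# Example: a simulated 73-channel ERP with 300 time points.
labels, centers = segment_microstates(np.random.randn(73, 300))
```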

7.4.8 Source Localization

Source localization is one issue to be addressed in EEG studies, and it is well known that EEG measurements do not contain enough information for a unique estimation of the electric neuronal generators. Therefore, many possible solutions exist for estimating the neuronal generators, and standardized low-resolution brain electromagnetic tomography (sLORETA) is one of them (Pascual-Marqui 2002). The advantage of sLORETA is that it localizes sources exactly under ideal conditions, although its spatial resolution decreases with depth. Because of this ‘zero error’ in estimating the sources, noisy measurements will produce noisy images in the sLORETA estimation. In this study, on the basis of the scalp-recorded electrical potential distribution, sLORETA was used to compute the cortical three-dimensional distribution of scalp current density. Computations were made in a realistic head model (Fuchs et al. 2002) using the MNI152 template (Mazziotta et al. 2001), with the three-dimensional solution space restricted to the cortical gray matter. Anatomical labels, including Brodmann areas, are reported using an appropriate correction from MNI to Talairach space (Talairach and Tournoux 1988; Lancaster et al. 2000). The sLORETA software package can be downloaded, for academic purposes only, from the Web site of the KEY Institute for Brain-Mind Research, University Hospital of Psychiatry, Zurich (http://www.uzh.ch/keyinst/loreta.htm). Details on using the software are also available from the same link.

7.5 Temporal Characteristics in the Recognition of Emotional Content of Facial Expressions

Many studies have been performed to answer the question of how the brain processes the emotional content of facial expressions, but there are still disagreements about the temporal characteristics of facial emotion processing. To address these disagreements, we examined the brain signals evoked by visual stimuli of facial expressions using EEG. As noted above, facial expressions carry information about the type of the emotion as well as its intensity level. Therefore, to answer how the type and intensity level of emotions affect brain activity, we parametrically controlled the intensity level, as well as the type of emotional content, in the facial expressions. To elucidate the neural mechanisms related to the processing of these two parameters, we adopted morphed images between the neutral and emotional faces (0, 10, 20, 40, 60, and 100 %). These percentages correspond to the morphing level toward the emotional image, where 0 % means the neutral facial expression and 100 % means the full-scale emotional expression, whereas 10, 20, 40, and 60 % are artificial images morphed by 10, 20, 40, and 60 % toward the full emotional expression (see Fig. 7.1). The subjective ratings classifying the type and the intensity level of the emotional content in the presented facial expressions were the parameters used to categorize the temporal activity of the brain.

The strength of EEG analysis in this ERP study is the precise temporal detection of brain activity with respect to the presented stimuli. Through the data analyses, clear responses were observed in the posterior (occipital) and anterior (frontal) regions (see Fig. 7.7). At the posterior electrode locations, there was a positive deflection at around 100 ms followed by a negative deflection at around 170 ms (inset of Fig. 7.7). The response patterns at the anterior locations counterbalanced those at the posterior locations. For positive valence, we focus on the ERP data corresponding to the happy facial expression, and for negative valence, on the ERP data for the disgusted facial expression. Using the k-means clustering analysis on the data of 30 subjects, we found four time-range windows of interest for each type of facial emotion (see Fig. 7.6). The time-range windows (post-stimulus onset) for happiness were 90–110, 138–180, 182–204, and 206–230 ms, and those for disgust were 86–120, 142–188, 190–210, and 212–258 ms. It is natural for the time ranges to differ between happiness and disgust, because the two emotions are processed differently by the brain. We designated these four time-range windows as Window-1, Window-2, Window-3, and Window-4, respectively. To identify the effect of the five emotion intensity levels on the ERP response, grand-averaged waveforms were determined by subtracting the ERP response to the unchanged face from those to the happy and disgusted faces. With changes in intensity, the ERP signal changed its magnitude around 100 and 170 ms post-stimulus (see Fig. 7.8).
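
The subtraction and window analysis described above can be sketched as follows; the variable names and the simulated arrays are placeholders, and the actual grand averages come from the recorded data.

```python
import numpy as np

FS = 1000.0  # sampling rate [Hz]; epochs assumed to start at stimulus onset

def window_mean(erp: np.ndarray, t_start_ms: float, t_end_ms: float) -> np.ndarray:
    """Mean voltage per electrode within one time-range window."""
    i0 = int(t_start_ms * FS / 1000.0)
    i1 = int(t_end_ms * FS / 1000.0)
    return erp[:, i0:i1].mean(axis=1)

# `erp_happy_40` and `erp_neutral` stand for channels x time grand averages
# (hypothetical names, simulated here).  The intensity effect is examined on
# the difference wave: emotional minus unchanged (neutral) face.
erp_happy_40 = np.random.randn(73, 300)
erp_neutral = np.random.randn(73, 300)
difference = erp_happy_40 - erp_neutral

# Window-1 for the emotion happy (90-110 ms post-stimulus).
p100_amplitude = window_mean(difference, 90, 110)
```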

Fig. 7.7

Grand-averaged ERP waveform for the emotions happy (red) and disgusted (blue) and the unchanged face (gray) from 73 electrodes (n = 30). The arrangement of electrodes is based on a top view, so the top corresponds to the anterior and the bottom to the posterior. Insets are enlarged graphs at the PO7 and PO8 locations (left and right, respectively)

Fig. 7.8

Subjective rating data and the effects of emotion intensity on the ERP response. Averaged DET and INT scores during the experiment (top left). Grand-averaged waveforms at the posterior location PO8, calculated by subtracting the ERP response to unchanged faces from that to happy faces with five different intensity levels (10, 20, 40, 60, and 100 %, designated M1–M5, respectively) (right). Change in the mean voltage value depending on the intensity of the facial emotion (bottom left). The location of PO8 is shown in the central inset. a Data for happy facial expressions. b Data for disgusted facial expressions. DET correct detection of facial emotion; INT assessment of its intensity

7.5.1 Window-1 (P100)

The ERP component within the first time-range, Window-1, included the 100-ms post-stimulus onset and was designated P100. To represent these components, the mean voltage values during Window-1 and Window-2 were calculated. The mean values were collected from all electrode locations, and the change in the value was compared with the change in the correct detection of the type of emotional content (DET) and in the rated intensity level of emotional content (INT). By comparing the peak value at each electrode location with the ratings (DET and INT), we found that the peak value of P100 followed the subjective rating of DET more closely than that of INT, especially in the frontal and posterior areas. For example, the peak value at the PO8 electrode location increased with the emotional level of the presented stimuli. However, when comparing this increase with that of the subjective ratings (DET and INT), we found that the peak value at PO8 was significantly and strongly correlated with DET (r = 0.97, p = 0.01) but not with INT (r = 0.86, p = 0.10). Similar results were obtained for the emotion disgusted (solid line, lower left of Fig. 7.8b; DET, r = 0.98, p = 0.01; INT, r = 0.84, p = 0.10).
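
The correlation analysis behind these r and p values can be sketched as follows; the numbers in the arrays are purely illustrative placeholders, not the measured data.

```python
import numpy as np
from scipy.stats import pearsonr

# Illustrative values only: P100 mean voltage at PO8 for the five morph
# levels (10, 20, 40, 60, 100 %) and the corresponding mean DET and INT ratings.
p100_po8 = np.array([1.2, 2.0, 2.4, 2.5, 2.6])   # microvolts (illustrative)
det = np.array([0.40, 0.75, 0.92, 0.96, 0.98])   # proportion correct (illustrative)
int_ = np.array([1.5, 2.8, 4.5, 6.0, 9.0])       # mean intensity rating (illustrative)

r_det, p_det = pearsonr(p100_po8, det)
r_int, p_int = pearsonr(p100_po8, int_)
print(f"P100 vs DET: r = {r_det:.2f}, p = {p_det:.3f}")
print(f"P100 vs INT: r = {r_int:.2f}, p = {p_int:.3f}")
```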

To examine the between-subject variability, we calculated the number of subjects exhibiting a significant correlation between the ERP components and the subjects’ performance (p < 0.05) at each electrode location and then made a frequency map of the significant correlations. The right occipito-parietal locations showed a significant and consistent positive correlation between P100 and DET for both the emotions happy and disgusted (see Fig. 7.9, P100). Several other locations showed a significant correlation between P100 and DET or INT, but with less consistency among subjects (see Fig. 7.9). We further compared the correlation coefficient between P100 and DET with that between P100 and INT for all 73 electrode locations. For the emotion happy, the DET correlation was higher than the INT correlation in seven of nine subjects (Wilcoxon signed rank test, p < 0.05). For the emotion disgusted, the DET correlation was higher than the INT correlation in three of nine subjects (p < 0.05). These data suggest that P100 was more strongly correlated with DET than with INT.
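
The across-subject comparison of the two correlation coefficients can be sketched with a Wilcoxon signed-rank test as below; again, the per-subject coefficients shown are illustrative placeholders.

```python
import numpy as np
from scipy.stats import wilcoxon

# Per-subject correlation coefficients of P100 with DET and with INT
# (illustrative numbers).  The signed-rank test asks whether the DET
# correlations are systematically higher than the INT correlations.
r_det = np.array([0.95, 0.91, 0.97, 0.88, 0.93, 0.90, 0.96, 0.85, 0.92])
r_int = np.array([0.80, 0.84, 0.82, 0.86, 0.79, 0.83, 0.88, 0.87, 0.81])

stat, p_value = wilcoxon(r_det, r_int)
print(f"Wilcoxon signed-rank: statistic = {stat}, p = {p_value:.3f}")
```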

Fig. 7.9

Topographical maps of the electrodes showing the statistically significant correlation among subjects for each time-range window. The consistency in the significant correlation (p < 0.05) among subjects is represented by the number of subjects (n), indicated by color in the color bar. W1–W4 are the first to fourth time-range windows of interest for the emotion happy, (W1, 90–110 ms), (W2, 138–180 ms), (W3, 182–204 ms), and (W4, 206–230 ms) and for the emotion disgusted (W1, 86–120 ms), (W2, 142–188 ms), (W3, 190–210 ms), and (W4, 212–258 ms). DET correct detection of facial emotion; INT assessment of intensity

7.5.2 Window-2 (N170)

The ERP component within the second time-range, Window-2, included the 170-ms post-stimulus onset and was designated N170. The change in the mean voltage value was compared with the change in DET and INT. The N170 value at the PO8 location increased linearly with emotional intensity (dotted line, Fig. 7.8a). For the emotion happy, the PO8 value was significantly correlated with INT (r = −0.99, p = 0.01) but not with DET (r = −0.80, p = 0.10). Similar results were obtained for the emotion disgusted (dotted line, Fig. 7.8b; DET, r = −0.86, p = 0.10; INT, r = −0.91, p = 0.05).

As shown in Fig. 7.9 (N170), the right occipito-parietal locations showed a significant and consistent correlation between N170 and INT for both the emotions happy and disgusted. In addition, for the emotion happy, bilateral frontal locations showed a significant correlation between N170 and INT, and the left frontal locations showed a significant correlation between N170 and DET. For the emotion disgusted, the frontal locations also showed a significant correlation between N170 and INT but with less consistency among subjects. When we compared the correlation coefficient between N170 and DET with that between N170 and INT, the INT value was higher than that of DET (p < 0.05) in seven and eight of nine subjects for the emotions happy and disgusted, respectively. These data suggest that N170 was correlated more with INT than DET.

Both DET and INT affected the ERP components. The magnitude of P100 increased sharply as the intensity of the facial emotion increased and reached a plateau at less than half of the strongest intensity level. We demonstrated that the P100 magnitude was significantly correlated with DET. On the other hand, we failed to find any significant difference in the magnitude of P100 between the emotions happy and disgusted. These data suggest that P100 is closely associated with the correct detection of facial emotion. Previous studies have reported that facial emotion evokes brain activity at a very early stage of processing (Pizzagalli et al. 1999b; Eger et al. 2003). In agreement with these studies, our data suggest that the brain activity evoked by the happy and disgusted faces occurred very early (100 ms) in the processing stage. Our data suggest that the P100 magnitude represents detection accuracy, but not the ability to distinguish between these facial emotions. The detection of a facial emotion is probably a more primitive process than identification and requires less perceptual demand; different neural mechanisms probably underlie these two processes.

However, in the present study, the subjects were required to report the type of emotion, such as happy, angry, or disgusted, not just ‘something emotional.’ In this sense, the subjects’ rating performance, DET, was not ‘detection of facial emotion’ but ‘identification of the type of facial emotion.’ Because we adopted stimuli exhibiting a similar DET–INT discrepancy, i.e., the emotions happy and disgusted, the similar profile of the psychological data may explain our failure to find any significant difference in P100 magnitude between the emotions happy and disgusted. We might detect a significant correlation between P100 and the identification of the type of emotion if we adopted stimuli exhibiting clearly different psychological profiles, such as the emotions happy and fearful.

7.5.3 Window-3 and Window-4

Similarly, the mean voltage values during Window-3 and Window-4 were calculated, and the change in the voltage value was compared with the change in DET and INT (see Fig. 7.9, W3 and W4). Similar to N170, the Window-3 value for the emotion happy at the occipital and parieto-frontal locations was significantly and consistently correlated with INT but not with DET. For the emotion disgusted, the value at the right posterior and left frontal locations was correlated with INT but not DET. The Window-4 value at the occipito-temporal location was correlated with INT for the emotion happy, and the values at the occipital and parieto-frontal locations showed significant correlations with DET or INT but with less consistency. Because of this lower consistency among subjects, we did not analyze the Window-4 data extensively. We compared the correlation coefficient between the Window-3 value and DET with that between the Window-3 value and INT. For the emotion happy, the correlation with INT was higher than that with DET in 27 subjects. For the emotion disgusted, five subjects showed similar correlations with DET and INT, fifteen showed a higher correlation with DET than with INT, and the remaining ten showed a higher correlation with INT than with DET. From this finding, we can conclude that the Window-3 value was correlated with INT for the emotion happy but not for the emotion disgusted.

Based on these findings, we determined that DET occurs before INT. The system that detects facial emotions could be connected to the ‘early-warning’ system that helps animals survive by detecting potentially dangerous or threatening signals, such as fear and disgust (Morris et al. 1996; Phillips et al. 1997). Phased or serial-processing mechanisms in the brain may thus operate in the processing of facial emotion (Adolphs 2002). Initially, the detection of facial emotion begins as early as 100 ms, and the detail is sufficiently constructed by around 170 ms post-stimulus to create distinguishing information. This phased mechanism enriches the functional-level model proposed by Bruce and Young (1986), as modified by Haxby et al. (2000) (see Fig. 7.10). Others have found processing components with latencies of more than 200 ms (Bobes et al. 2000; Carretié and Iglesias 1995). These components might be related to conceptual knowledge of the emotion signaled by the face, such as an interaction with gaze (Klucharev and Sams 2004) or face familiarity (Schweinberger et al. 2002). These later components have been localized by intra-cerebral electrode recording (McCarthy et al. 1999). Brain imaging studies have identified the inferior occipital gyrus, the posterior fusiform gyrus, and the temporal poles as involved in facial processing (Haxby et al. 2000; Kanwisher et al. 1997; Nakamura et al. 2000). These findings suggest that a variety of brain regions function cooperatively to process facial emotion and that the activity in these regions is modulated by top–down and bottom–up signals. Further studies are necessary to elucidate the functions of the later components in the processing of facial emotion.

Fig. 7.10

Two-phase model of emotion recognition. Phase 1 represents the initial monitoring process that codes the saliency of incoming (structurally encoded) facial information. In phase 2, the specific emotional content of the face (the type) is analyzed in emotion-specific recognition systems, followed by the coding of the saliency of facial emotions. The gray box represents the greatly simplified identity recognition system. White boxes represent the early steps of the facial emotion recognition systems, or the expression-dependent description systems