In this chapter we focus on the neural processes that occur in the mature healthy human brain in response to evaluating another’s social attention. We first examine the brain’s sensitivity to gaze direction of others, social attention (as typically indicated by gaze contact), and joint attention. Brain regions such as the superior temporal sulcus (STS), the amygdala and the fusiform gyrus have been previously demonstrated to be sensitive to gaze changes, most frequently with functional magnetic resonance imaging (fMRI). Neurophysiological investigations using electroencephalography (EEG) and magnetoencephalography (MEG) have identified event-related potentials (ERPs) such as the N170 that are sensitive to changes in gaze direction and head direction. We advance a putative model that explains findings relating to the neurophysiology of social attention, based mainly on our studies. This model proposes two brain modes of social information processing: a nonsocial “Default” mode and a social mode that we have named “Socially Aware”. In Default mode, there is an internal focus on executing actions to achieve our goals, as evident in studies in which passive viewing or tasks involving nonsocial judgments have been used. In contrast, Socially Aware mode is active when making explicit social judgments. Switching between these two modes is rapid and can occur via either top-down or bottom-up routes. Most of the literature, including our own studies, has focused on social attention phenomena as experienced from the first-person perspective, i.e., gaze changes or social attention directed at, or away from, the observer. However, in daily life we also actively observe social interactions between others, where their social attention focus may not include us, or their gaze may not meet ours. In these situations, changes in eye gaze and social attention are experienced from the third-person perspective. This area of research is still small but nevertheless important in the study of social and joint attention, and we discuss it briefly at the end of the chapter. We conclude the chapter with some outstanding questions aimed at the main knowledge gaps in the literature.

4.1 Sensitivity to Eye Gaze and Social Attention: Active Brain Loci

As noted above, from the first-person perspective, changes in gaze direction in someone’s face are typically associated with a change in their social attention. Hence, for the purposes of this chapter, we discuss the existing literature by treating changed gaze direction and changed social attention as equivalent.

In the late 1990s, neuroimaging studies began to identify brain regions that were sensitive to viewing gaze changes in others (Puce, Allison, Bentin, Gore, & McCarthy, 1998; Wicker, Michel, Henaff, & Decety, 1998). Wicker et al. (1998) used a video in which gaze was changed to a number of different positions. There was a “mutual” condition where the stimulus face looked directly at the observer and then averted its gaze and vice versa. A similar dynamic gaze change sequence occurred in an “averted” condition, where the stimulus face altered its gaze to look at two different points in visual space (but never at the observer). Control conditions included a “no gaze” condition where the stimulus face looked down at paper on a table, and a “rest” condition, where subjects had their own eyes closed. Positron emission tomography (PET) activation was largest in the right inferior temporal and fusiform gyri (FG), and right parietal lobule, as well as in the posterior superior temporal sulcus (pSTS)/middle temporal gyrus (MTG) bilaterally, to viewing the “mutual” and “averted” conditions (Wicker et al., 1998). In another study, dynamic eye gaze changes alternating between averted and direct gaze produced strong functional magnetic resonance imaging (fMRI) activation in the bilateral STS, in bilateral hMT+, and to a lesser extent the left intraparietal sulcus (IPS) (Puce et al., 1998). These two early studies used dynamic gaze changes, unlike subsequent studies in which stimuli consisted of the onset of static faces where the gaze could be either direct or averted. Meta-analyses of studies using mainly static faces have also implicated these same brain regions as being sensitive to gaze direction/social attention. 
To a lesser extent, regions such as the medial prefrontal and orbitofrontal cortices, amygdala, the frontal eye fields, and a small area on the postcentral sulcus can also be activated by gaze changes (Allison, Puce, & McCarthy, 2000; Nummenmaa & Calder, 2009; Senju & Johnson, 2009).

The seemingly disparate arrays of brain regions noted to be sensitive to direction of gaze/social attention are thought to be essential components of four separate brain subsystems (Fig. 4.1) associated with processing different aspects of social information, which together form a single entity known as the “social brain” (Frith, 2007). The four subsystems correspond to the following brain networks (see Stanley & Adolphs, 2013):

  i. A mentalizing network

  ii. A motor simulation/action perception network (mirror system)

  iii. An empathy network

  iv. An amygdala network that supports the processing of directed and relevant emotional information and its retrieval.

Fig. 4.1

The social brain: four brain networks and associated component brain structures known to be active in human studies of social cognition. Each of the four networks is depicted in a separate color, whose legend appears at the bottom right of the figure. (Modified, with permission, from Stanley & Adolphs, 2013).

The mentalizing network activates when making sense of the goals and intentions of others and consists of the pSTS (as part of the temporoparietal junction), temporal pole, precuneus, and medial prefrontal cortex (Bahnemann, Dziobek, Prehn, Wolf, & Heekeren, 2010; Frith & Frith, 2006; Stanley & Adolphs, 2013). The mirror/simulation/action–perception network activates when an individual executes an action, or observes another making that same action. It is thought to support action understanding and to enable crucial abilities such as imitation and motor learning (Fogassi et al., 2005; Rizzolatti & Craighero, 2004; Rizzolatti & Fabbri-Destro, 2008), and is composed of regions of parietal and frontal cortex (Stanley & Adolphs, 2013). The empathy network encompasses the cortex of the anterior insula and a region bounding posterior anterior cingulate and anterior medial cingulate cortex (Engen & Singer, 2013; Fan, Duncan, de Greck, & Northoff, 2011; Lamm, Decety, & Singer, 2011; Stanley & Adolphs, 2013), whereas the so-called amygdala network includes the amygdala, orbitofrontal cortex, and anterior aspects of ventral temporal cortex including the FG (Olson, McCoy, Klobusicky, & Ross, 2013; Stanley & Adolphs, 2013). These brain regions and their network membership are schematically represented in Fig. 4.1.

With respect to evaluating gaze direction and social attention, are some regions in the social brain more critical than others? One clue comes from a rare neuropsychological lesion study that documents deficits in gaze processing in a patient with a circumscribed lesion involving the right superior temporal gyrus (STG) as a result of a cerebral hemorrhage. The patient could not correctly recognize left averted gaze or direct gaze, interpreting these as being direct gaze and right averted gaze, respectively. These difficulties could not be attributed to issues with visuospatial processing, as other stimuli that signaled direction, e.g., arrows, produced relatively unimpaired task performance. In the acute poststroke period, the patient had initially experienced neglect, which had recovered by the time she was tested chronically for her ongoing gaze-processing issues (Akiyama et al., 2006a, b). When tested many years later on a visual cueing paradigm, the patient was impaired only when the visual cue was provided by gaze, and performed normally when the visual cue was an arrow (Akiyama et al., 2006b). An identical behavioral dissociation in cueing across gaze and arrows was also demonstrated in five patients with amygdala lesions (Akiyama et al., 2007). Although it is tempting to speculate that these gray matter regions of the brain are critical for processing information relating to social attention, it is also possible that injury to these regions disrupted white matter pathways that carry this important social information. Future studies examining structural and functional connectivity in both healthy subjects and individuals with lesions will be needed to disentangle these issues.

A study of epilepsy surgery patients with depth electrodes implanted in the STS has demonstrated sensitivity to eye gaze stimuli within this brain region (Caruana et al., 2014). A patient who was cortically blind (with no viable bilateral primary visual cortex) showed greater right amygdala activation to faces with direct gaze relative to faces with averted gaze (Burra et al., 2013). These two studies indicate how critical social cues can be routed to the social brain in the absence of viable striate input, likely via extrastriate and extrageniculate routes, traveling between critical regions such as the amygdala and the superior temporal cortex.

It is tempting to speculate that the cortex of the STG/pSTS is devoted to evaluating changes in social attention/gaze in others; however, the pSTS is also selectively active to different types of mouth movements (Puce et al., 1998), as well as hand and leg motion (Thompson, Clarke, Stewart, & Puce, 2005; Wheaton, Thompson, Syngeniotis, Abbott, & Puce, 2004). The pSTS is known to be sensitive to biological motion in general; however, it is particularly sensitive to changes in gaze (see Allison et al., 2000 Fig. 3; Van Overwalle & Baetens, 2009 Fig. 3a and b; Itier & Batty, 2009). Importantly, the pSTS has been found to be equally active to observing pointing with the finger or the eyes (Materna, Dicke, & Thier, 2008). Overall, the data suggest that the STG/pSTS region is sensitive to the actions of others in general, rather than being only sensitive to behaviors signaling changed social attention.

In daily life the head is rarely still. Both head and eye movements are used to explore one’s visual space and to direct one’s own gaze to novel or relevant stimuli. Equal sensitivity to head and gaze movements has been reported in both the right pSTS and the FG (Laube, Kamphuis, Dicke, & Thier, 2010). However, the relationship of activation elicited to head versus eye movements is complicated, with interaction effects in processing head and gaze direction information having been reported. Specifically, the largest activation was observed in the right STS to a full-on face relative to an angled face, irrespective of gaze condition. Additionally, bilateral FG activation was largest to a full-on face with direct gaze (Pageler et al., 2003). In another study (George, Driver, & Dolan, 2001), greater FG activation occurred to direct gaze relative to averted gaze, irrespective of head orientation. Direct gaze also produced a greater correlation between FG and amygdala activation, whereas averted gaze produced larger correlations in activity between the FG and IPS. These activity profiles occurred irrespective of head orientation. These data indicate that direct versus averted gaze may selectively activate different subsystems within the social brain. Averted gaze has been previously associated with engaging systems related to the visual periphery, whereas direct gaze stimulates systems that deal with emotionally salient stimuli (George et al., 2001). Given that there is a redeployment of an observer’s visuospatial attention in response to observing a gaze aversion in another individual, it might be expected that dorsal structures in the visual pathways would also be activated (Corbetta & Shulman, 2002), even though these structures might not be involved in the processing of gaze as such. Performing experiments where task requirements are explicitly social or nonsocial might help to further disentangle the functional neuroanatomy of a gaze aversion (e.g., see Latinus et al., 2015).

So far we have discussed activation profiles in the amygdala and the pSTS with respect to social attention stimuli. However, the middle part of the bilateral STS (mSTS) and the left anterior part of the STS (aSTS) can also show interaction effects with respect to changes in head and gaze direction. These regions have been reported to reduce their activation more when subjects followed eye-gaze direction relative to head-gaze direction (Laube et al., 2010). Laube et al. (2010) attributed the reduced activation to the “active suppression of information arising from the distracting other directional cue, i.e., head-gaze direction in the eye-gaze direction task and eye-gaze direction in the head-gaze direction task.” Consistent with these data, Carlin and colleagues demonstrated involvement of the aSTS/MTG when subjects were presented with head turns in either direction, irrespective of gaze manipulation (Carlin, Rowe, Kriegeskorte, Thompson, & Calder, 2012). The various sections of the STS (aSTS, pSTS) are parts of the mentalizing network, and it appears that as one moves in an anterior direction along the STS axis, the processing of social information becomes progressively more complex (Frith & Frith, 2003).

Studies with no head direction manipulations and only gaze changes appear to show conflicting findings to viewing direct versus averted gaze: augmented activation in the pSTS has been reported to averted gaze relative to direct gaze (Engell & Haxby, 2007) and to direct versus averted gaze (Pelphrey, Viola, & McCarthy, 2004). In the former case, emotional expressions could be present on the stimulus faces, whereas in the latter case a neutral, approaching avatar was used. Indeed, changes in visually expressed social attention usually do not occur in isolation in daily life—they are often accompanied by emotional expressions that clearly indicate who, or what, the expression is being directed at. Emotional expressions themselves can produce augmented activation in the pSTS relative to neutral faces when faces are presented with direct gaze only (Engell & Haxby, 2007). Within the limbic system, increased right hippocampal activation has been found in response to faces with direct gaze, and amygdala activity has been observed in response to faces with angry expressions or direct gaze in a task requiring identity judgments. Notably, better behavioral performance (recall) of individual facial identities was associated with presented direct gaze and angry emotional expressions (Conty & Grezes, 2012).

Although the pSTS is sensitive to changes in gaze direction/social attention, the medial PFC has been found to activate to increased (direct) gaze duration (Kuzmanovic et al., 2009), suggesting that different parts of the mentalizing network may be involved in detecting changes in gaze/social attention versus evaluating the potential significance of the directed gaze. These data raise the question of connectivity within, and between, the different networks that make up the social brain. Recent developments in structural and functional imaging acquisition and analysis are allowing some of these relationships to be investigated. Ethofer, Gschwind, and Vuilleumier (2011) investigated the connectivity of the right pSTS with other brain regions while subjects performed a gender classification task on faces that changed with direct and averted gaze. Gaze shifts towards the observer resulted in increased functional connectivity between the right pSTS, FG, and anterior insula. Activation in the FG was equally large for faces with either directed or averted gaze (Ethofer et al., 2011). Increased functional connectivity between pSTS, MT/V5, IPS, frontal eye fields, STG, supramarginal gyrus, and middle frontal gyrus has also been demonstrated for gaze shifts relative to eye-opening and -closing movements (Nummenmaa, Passamonti, Rowe, Engell, & Calder, 2010).

The functional connectivity data allow active brain networks to be identified, but cannot speak to the underlying direct structural connections in the brain. Diffusion tensor imaging (DTI) data can allow these direct structural connections to be visualized. Interestingly, direct white matter connections have been described between pSTS and the anterior insula, but not between the pSTS and the FG (Ethofer et al., 2011). Future studies assessing both structural and functional connectivity within, and between, networks comprising the social brain will be necessary to identify which connections in the system are direct, permanent connections, and which are fleeting (and made for the purposes of achieving a current goal via indirect routes of connectivity). These analyses might also shed some light on why there is so much variability in the literature for viewing averted versus direct gaze, and will be particularly pertinent for studies examining the deployment of social attention in different contexts.

Neuroimaging studies examining brain activation to viewed gaze changes have been informative, as they have identified active brain systems that are sensitive to the eye and gaze cues of others. However, they do not easily speak to the underlying neural dynamics of processing changes in another’s gaze direction and social attention.

4.2 Evoked Neurophysiological Activity Associated with Evaluating Eye Gaze and Social Attention

Electroencephalography (EEG) and magnetoencephalography (MEG) allow the dynamics of neural processing to be studied with high temporal resolution (millisecond accuracy). To this end, neurophysiological activity that is phase-locked to the gaze/social attention stimulus can be readily identified as event-related potentials (ERPs), where multiple trials of activity within a stimulus condition have been averaged to visualize activity with a consistent temporal relationship to the stimulus. A typical visual ERP that is elicited to a gaze stimulus is a triphasic complex consisting of P100, N170, and P350 components (their nominal latencies in milliseconds are denoted by the numbers and voltage polarity by (P)ositive or (N)egative). These ERP components are typically maximal over the posterior scalp, with P100 and N170 seen over the occipitotemporal scalp and P350 occurring more dorsally over the parietal scalp (see Allison et al., 2000; Itier & Batty, 2009). Neurophysiological activity that is related to stimulus delivery, but that is not exactly phase-locked to stimulus onset, can also be elicited to a gaze stimulus. This type of activity consists of changes in oscillatory activity in certain EEG frequencies and requires the analysis of single-trial EEG/MEG data (Tallon-Baudry, Bertrand, Delpuech, & Pernier, 1996). The bulk of existing EEG/MEG studies examining the neural correlates of viewing changes in eye gaze/social attention have reported ERP activity only; however, studies examining oscillatory changes in EEG activity across all EEG frequency bands are beginning to appear in the literature (reviewed in the second half of this section).
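The distinction between phase-locked and non-phase-locked activity can be illustrated with a small simulation. The sketch below is not taken from any of the cited studies: the sampling rate, amplitudes, the 10 Hz oscillation, the trial count, and the function names `simulate_trial` and `average_trials` are all assumptions chosen purely for illustration. It builds synthetic single-channel trials containing a negative deflection near 170 ms that is time-locked to stimulus onset, plus an oscillation whose phase is random on every trial, and then averages across trials.

```python
import math
import random

SFREQ = 1000.0  # assumed sampling rate in Hz (1 sample = 1 ms)


def simulate_trial(rng, n_samples=300):
    """Return one synthetic trial (arbitrary units), illustrative only.

    The trial contains a phase-locked negative deflection peaking near
    170 ms, a 10 Hz oscillation with a random phase on each trial, and
    Gaussian noise.
    """
    phase = rng.uniform(0.0, 2.0 * math.pi)  # random phase: not phase-locked
    trial = []
    for i in range(n_samples):
        t = i / SFREQ  # seconds after stimulus onset
        evoked = -5.0 * math.exp(-((t - 0.170) ** 2) / (2 * 0.015 ** 2))
        induced = 3.0 * math.sin(2.0 * math.pi * 10.0 * t + phase)
        noise = rng.gauss(0.0, 1.0)
        trial.append(evoked + induced + noise)
    return trial


def average_trials(trials):
    """Sample-by-sample mean across trials, i.e., the ERP."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


rng = random.Random(0)
trials = [simulate_trial(rng) for _ in range(200)]
erp = average_trials(trials)

# Latency of the most negative point in the averaged waveform.
peak_index = min(range(len(erp)), key=lambda i: erp[i])
peak_latency_ms = peak_index / SFREQ * 1000.0
print(f"Most negative ERP point at about {peak_latency_ms:.0f} ms")
```

In this toy example the time-locked deflection survives averaging, whereas the random-phase oscillation is attenuated by roughly the square root of the number of trials. This is one way to see why oscillatory activity that is not phase-locked has to be quantified on single trials before any averaging takes place.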

4.2.1 Scalp ERPs and MEG Responses Elicited to Changes in Gaze/Social Attention Viewed Without Making Social Judgments

In the first neurophysiological study to examine the effects of viewing dynamic gaze changes, we used passive viewing tasks where subjects viewed an apparent motion stimulus consisting of either a full face or isolated eyes (Puce, Smith, & Allison, 2000). The N170 ERP was significantly larger to averted gaze, irrespective of whether a full face or isolated eyes were viewed, with earlier N170s to averted gaze being seen at the right temporal scalp (Puce et al., 2000). We have replicated the N170 amplitude effect using apparent motion paradigms using full faces and tasks requiring subjects to respond to non-gaze relevant or nonsocial aspects of the viewed stimuli (Latinus et al., 2015; Rossi, Parada, Kolchinsky, & Puce, 2014; Rossi, Parada, Latinus, & Puce, 2015). Therefore, it appears that when subjects are not actively engaged in making social judgments related to face and gaze stimuli, there is modulation of N170 amplitude by the type of gaze transition (shown schematically in the top panel of Fig. 4.2).

Fig. 4.2

Schematic representation of N170 changes for the brain in Default mode. The results of a number of different experiments are shown, with stimulus conditions that showed differences in N170 amplitudes being depicted on the left side of the figure. A schematic N170 is depicted on the right as showing significant amplitude differences between conditions. N170 is consistently larger to averted versus direct gaze in isolated face or eye stimuli (top panel), and a similar effect occurs when two faces look away from one another relative to a mutual gaze condition (middle panel). These N170 effects have been documented in experiments where nonsocial task requirements have been imposed. Black arrows between example stimuli indicate apparent motion transitions. Solid and broken lines depicting N170 waveforms are associated with particular stimuli, identified with the same line type. White arrows on images on the lower panel schematically depict the direction of the gaze change and were not present in the experiment.

Gaze changes in a single face viewed from a first-person perspective, such as in our experiments described earlier, limit our understanding of the overall functional significance of the neurophysiological findings. We have performed an experiment to investigate social context from the point of a noninvolved observer, i.e., a third-person perspective (Ulloa, Puce, Hugueville, & George, 2014). Subjects viewed two avatar faces that were initially displayed with downcast eyes (and hence shared no “interaction” with the observer). After 1 s, the avatars changed their gaze to either look at one another in a mutual gaze situation, or look away from one another (and the observer) to a common point to one side (see middle panel, Fig. 4.2). MEG activity was recorded in response to the viewed videos (which also displayed subsequent dynamic facial emotions) while subjects looked for a cross at the center of the display to change color on a random and infrequent basis—a gaze- and emotion-irrelevant target. Significantly larger M170s (the MEG counterpart of N170) occurred when the avatars looked away to a common point relative to when they exchanged (direct) mutual gaze. These data indicate that the increased neural sensitivity to viewed averted gaze is not necessarily driven by direct engagement with, or involvement of, the observer (Ulloa et al., 2014). Critically, we have observed similar neurophysiological effects with respect to gaze aversion using both real images of faces and those of avatars, as well as recording neurophysiological activity across two different methods (EEG and MEG) (e.g., compare Puce et al., 2000 to Ulloa et al., 2014).

Is the gaze aversion effect modulated by the format of the face stimulus being viewed? In addition to demonstrating larger N170s to averted versus direct gaze, we have previously reported larger N170s in a passive viewing paradigm to mouth opening relative to closing movements in both real and line-drawn faces (see Fig. 4.3, top panel) (Puce et al., 2000, 2003). Similarly, fMRI activation in the pSTS did not differ between movements of the real and impoverished face (Puce et al., 2003), leading us to conclude at the time that the hemodynamic and neurophysiological response to mouth movements likely reflected a biological motion response where motion and form are integrated, similar to that observed with point-light displays of human walkers (for reviews see Blake & Shiffrar, 2007; Giese & Poggio, 2003; Puce & Perrett, 2003; Puce et al., 2015).

Fig. 4.3

Schematic representation of N170 changes to different types of facial movements. Mouth movements elicit N170 amplitude changes, irrespective of what type of face stimulus depicts the facial motion: larger N170s are seen to mouth opening relative to mouth closing movements (top panel). Eye aversion elicits larger N170s relative to direct gaze in real faces only. N170 amplitudes do not differ when gaze transitions are represented by line-drawn faces (lower panel). These N170 effects have been documented in experiments where nonsocial task requirements have been imposed. Black arrows between example stimuli indicate apparent motion transitions. Solid and broken lines depicting N170 waveforms are associated with particular stimuli, identified with the same line type.

More recently, we investigated eye movements in parallel with mouth movements in impoverished (line-drawn) faces and replicated the N170 amplitude effect for viewing mouth opening versus closing movements, but saw no significant differences in N170 between averted and directed gaze to line-drawn faces (see Fig. 4.3, lower panel; Rossi et al., 2014). One potential reason for the lack of N170 differentiation across impoverished eye movements could have been that these effects were dependent on local visuospatial changes in stimulus luminance/contrast, given that the human eye consists of a high-contrast iris–sclera complex (Rossi et al., 2014). An alternative possibility could be an effect of experimental context, where the presence of real faces in the previous experiment (i.e., Puce et al., 2003) may have influenced the N170 to the impoverished faces (Rossi et al., 2015). Strong stimulus context effects for N170 have previously been reported for face and fragmented face stimuli (Bentin & Golland, 2002; Jemel, Pisani, Calabria, Crommelinck, & Bruyer, 2003; Latinus & Taylor, 2006).

Additionally, our line-drawn face motion experiment also produced different patterns of neural activity depending on whether the baseline stimulus (of a direct gazing face with mouth closed) was preceded by a gaze aversion or a mouth opening movement (Rossi et al., 2014). So as to disentangle these potentially different explanations for our data, we presented gaze changes in stimulus blocks using real and impoverished faces (Rossi et al., 2015), in a similar design to what we had used for mouth movements in real and line-drawn faces (Puce et al., 2003). N170s to real faces were larger to averted gaze relative to direct gaze (replicating Puce et al., 2000 and Latinus et al., 2015). However, N170s to impoverished faces did not differ in amplitude across gaze conditions (replicating Rossi et al., 2014) (see Fig. 4.3, bottom panel). Hence, experimental context (with respect to impoverished and real faces) does not appear to drive the modulation of N170 to dynamic face transitions. Taken together, our ERP data across these multiple studies indicate that N170s that differentiate between types of eye and mouth movements are probably being generated by two very different neural mechanisms. Specifically, we are making the claim (Puce et al., 2015; Rossi et al., 2015) that:

  1. The differential N170 elicited to mouth movements likely represents a biological motion response, elicited to viewing articulated human motion. Mouth opening/closing movements are produced by the action of an articulated mandible. Despite the changing contrast between the teeth and lips when the mouth opens and closes, this response is clearly not entirely dependent on stimulus luminance/contrast since it is also elicited to mouth movements in line-drawn faces.

  2. The differential N170 elicited to direct versus averted gaze in a real face is produced by a high luminance/contrast change in visual space produced by the movement of the human iris–sclera complex. This effect is abolished when eye gaze is represented by schematic eyes in line-drawn faces with overall low luminance/contrast. Eye movements (and generally movements of the upper face) are not an articulated form of human motion, and therefore elicit a neural response that is different from that of an articulated motion stimulus. Experiments varying luminance and contrast in schematic eye stimuli would be needed to verify these claims.

Consistent with the idea that N170 is affected by changes to high-contrast eyes are data from a study in which we investigated the neural consequences of viewing another’s gaze changes compared with eye closure and eye blinks (Brefczynski-Lewis, Berrebi, McNeely, Prostko, & Puce, 2011). Subjects responded to a target stimulus consisting of a checkerboard pattern superimposed on the continuously displayed face. We had originally predicted that, given the potential social significance of gaze transitions, N170s to gaze aversions would be significantly larger than those to eye blinks and eye closure. To our surprise, N170 did not differ as a function of these conditions (Brefczynski-Lewis et al., 2011). However, in all of these stimulus conditions the high-contrast direct gaze was replaced by a stimulus condition with altered local visuospatial contrast. Specifically, direct gaze could change to either averted gaze or closed eyes (depicting either a blink [brief] or eye closure [a longer interval]). For a given pixel in the region of the iris/pupil of the stimulus image, there is a luminance change in the transition from eye opening to eye closure.

Does the size of the gaze transition or the physical direction influence the observed neural response? Human observers can reliably detect 1–3° changes in another’s gaze (Anderson, Risko, & Kingstone, 2011). Given that we have found that low-level factors affect the neural response to viewed gaze movements, it is conceivable that the N170 might also be sensitive to the size of the gaze transition. Extreme gaze aversions, e.g., 30° from the direct gaze position, might generate larger N170s than smaller gaze transitions, e.g., of 15°. In a recent experiment we included stimuli with different sizes of gaze transitions and explicitly looked for modulation of N170 as a function of size of gaze excursion. However, N170 did not differ with size of gaze transition (Latinus et al., 2015). In our earliest study investigating N170 to eye gaze changes, we also explicitly examined our data for gaze transitions occurring to the left and right. N170 was not affected by the physical direction of the gaze movement: N170 amplitudes were not significantly different to viewing gaze changes to the left or the right of the observer (Puce et al., 2000). From these two studies we conclude that, although the N170 is likely generated by a local visuospatial change in luminance/contrast, the physical direction and the size of the gaze transition, as seen in a real face, do not modulate this neurophysiological response. If this is the case, then what does an N170 ERP signal reflect when it is elicited to a gaze change? The previously described experiments cannot address this question (with the exception of Latinus et al., 2015), as they all were either passive viewing paradigms or used tasks where target stimuli were gaze-direction irrelevant.

4.2.2 Scalp ERPs and MEG Responses Elicited to Changes in Gaze/Social Attention Viewed While Making Social Judgments

The previous sections have focused on the effects of gaze changes in situations where social judgments were not required. However, as the studies reviewed subsequently indicate, neural activity will be quite different depending on the type of judgment that is made on the gaze stimulus.

Conty et al. (2007) performed an experiment in which subjects made explicit social judgments related to the direction of the observed gaze change. Their experiment had a trial structure that is shown in Fig. 4.4. A stimulus pair was presented on each trial, producing an apparent motion gaze transition, with the first stimulus always showing an averted gaze at an intermediate position (15°). The subject was asked to indicate with a button press whether the gaze transition induced by the presentation of the second stimulus moved towards them or further away from them. Hence, the subject made a social judgment regarding the gaze change in the observed faces. The subject could not predict whether the next stimulus would be a direct gaze or an even further (30°) gaze aversion. Head position was also varied in the experiment, resulting in a 2 × 2 design for head (full-on, ¾ view) and eye (averted, direct) position. Interestingly, N170 to direct gaze transitions was significantly larger relative to transitions where the gaze aversion became more extreme, irrespective of head position. These data were consistent with an interpretation that N170 signals change in social attention. In the case where the gaze is already averted and then becomes more extremely averted, there is no net change in social attention with respect to the observer, and therefore there would be no differences in N170 amplitude (Conty, N'Diaye, Tijus, & George, 2007). These data are extremely interesting and appear to be at odds with the ERP data that we have reported using extreme gaze aversions and direct gaze in a series of studies (Latinus et al., 2015; Puce et al., 2000; Rossi et al., 2014).

Fig. 4.4

Experimental trial structure (a) and stimulus conditions (b) from a social attention experiment. a Subjects viewed a display where a central fixation cross was replaced by a face (with varied positions of gaze). After a short interval the face changed its gaze and subjects were required to press one of two buttons to evaluate the gaze change. In a nonsocial task, subjects judged if the gaze change occurred to their left or their right, whereas in a social task subjects indicated if the gaze change moved towards them or away from them. b Stimulus conditions consisted of gaze changes previously studied by Puce et al. (2000) and Conty et al. (2007), and are displayed as red arrows between the grey circles in each of the 6 tested gaze transitions. So as to have a balanced design with respect to gaze changes, two new (previously untested) conditions were also included in the experiment. The gaze change can be regarded as becoming “more averted” or “less averted”, as shown by the thick black arrows identifying the respective stimulus conditions where this is the case.

To try to reconcile the differences in N170 data between our two laboratories, we performed two experiments using a subset of stimuli from Conty et al. (2007) consisting of gaze transitions in full-on faces (see Latinus et al., 2015). We opted to run two experiments (with counterbalanced order) in the same subjects using identical stimuli and two different types of judgments: an overtly social one and a nonsocial one. The nonsocial task consisted of subjects indicating with a button press whether the gaze in the stimulus face moved to the left or right of them. In the social task, subjects indicated whether the gaze moved away or towards them (identical to the task used by Conty et al., 2007). (One could make the argument that all stimuli involving faces are inherently social. We, however, are making the distinction here with regard to the type of judgment that the subject has to make on the incoming stimulus.) When subjects made a nonsocial judgment, N170s to any gaze transition where gaze became more averted were significantly larger relative to gaze transitions moving towards the subject. This occurred for stimuli depicting both direct gaze and intermediately averted gaze. These changes were observed in the bilateral occipitotemporal scalp. Notably, when subjects made social judgments, N170s were no longer significantly different across gaze conditions in the right occipitotemporal scalp (see Fig. 4.5, bottom panel). In contrast, N170s in homologous sites in the left hemisphere showed the same pattern irrespective of the type of judgment: more extreme gaze aversions produced larger N170s relative to gaze transitions whose gaze was direct or less averted. These data strongly indicate that the right hemisphere is selectively engaged while making explicit social judgments of another’s altered social attention. Hence we were able to replicate our previous studies (Puce et al., 2000; Rossi et al., 2014), which examined extreme gaze changes in real faces (Fig. 4.5, top panel).

Fig. 4.5

Group data from a social attention experiment where N170 ERP modulation occurred as a result of a nonsocial versus social decision from stimulus conditions shown in Fig. 4.4. ERP data were obtained from a nine-electrode cluster overlying the right occipitotemporal scalp. N170 amplitude modulation as a function of more averted gaze occurs in all tested conditions in the right hemisphere on the nonsocial task. This N170 difference is abolished when subjects engage in an explicit social judgment in the social task. In the left hemisphere (not shown), N170 amplitude modulations occurred for more averted gaze positions for both nonsocial and social decisions. (Modified from Latinus et al., 2015).
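The kind of cluster-based N170 amplitude comparison summarized in Fig. 4.5 can be illustrated with a minimal sketch. This is not the analysis pipeline of Latinus et al. (2015): the 130–200 ms search window, the function names, and the synthetic waveforms below are all assumptions chosen for illustration only.

```python
import math

def cluster_average(channels):
    """Average ERP waveforms (lists of microvolt samples) across electrodes."""
    n = len(channels)
    return [sum(ch[t] for ch in channels) / n for t in range(len(channels[0]))]

def negative_peak(waveform, times_ms, window=(130, 200)):
    """Return (latency_ms, amplitude) of the most negative sample in the window.

    The 130-200 ms window is an assumed N170 search range, not a published value.
    """
    candidates = [(amp, t) for amp, t in zip(waveform, times_ms)
                  if window[0] <= t <= window[1]]
    amp, t = min(candidates)
    return t, amp

def synthetic_cluster(peak_uv, times_ms, n_channels=9):
    """Hypothetical nine-electrode cluster: a negative Gaussian deflection at 170 ms,
    scaled slightly differently on each channel."""
    channels = []
    for i in range(n_channels):
        scale = 1.0 + 0.05 * (i - n_channels // 2)
        channels.append([-peak_uv * scale * math.exp(-((t - 170) / 20.0) ** 2)
                         for t in times_ms])
    return channels

times_ms = list(range(0, 400, 2))  # 2 ms sampling step (assumed)
more_averted = cluster_average(synthetic_cluster(6.0, times_ms))
less_averted = cluster_average(synthetic_cluster(4.0, times_ms))

lat_more, amp_more = negative_peak(more_averted, times_ms)
lat_less, amp_less = negative_peak(less_averted, times_ms)
print(lat_more, round(amp_more, 2), lat_less, round(amp_less, 2))
```

In this toy case the "more averted" condition yields the more negative peak, mirroring the direction of the effect described for the nonsocial task; in real data the comparison would of course be made statistically across subjects.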

In this study we did not exactly replicate the findings of Conty et al. (2007), who found that N170s to direct gaze were larger relative to extreme aversions when made from an intermediate averted gaze position. Given that interactions in head and gaze position are known to occur in both fMRI and neurophysiological studies (as discussed earlier), it is possible that there may have been some additional N170 modulation as a function of head position in the original 2007 study of Conty et al. (Latinus et al., 2015). Itier and colleagues (2007) have noted a complex set of interactions in N170 amplitude data between head and eye gaze positions when subjects had to make judgments related to either head or gaze position. Interestingly, N170 activity to viewing static eyes in faces is also modulated by where the viewer’s gaze falls on the face: if the viewer fixates their gaze on the eye, N170 amplitude will be larger than if another area on the face is viewed (see Nemrodov, Anderson, Preston, & Itier, 2014). There appears to be a very complex relationship between the focus of one’s own social attention and point of gaze on another’s face, which may be additionally modulated by the viewed face’s head and eye positions. An additional important source of variation may come from the reflexive alteration of an individual’s visuospatial attention when they observe a gaze change. To disentangle these relationships would likely require a series of experiments where these variables were manipulated parametrically using face and non-face stimuli.

4.2.3 Two Different Modes for Processing Another’s Gaze Direction: A Proposed Model

The data from Latinus et al. (2015) and the other studies reviewed here argue for the existence of potentially different modes of processing of social information in the brain. We would like to make the claim that our brains have two modes: a “Default” and “Socially Aware” mode. It would be possible to switch rapidly between one mode and the other—with an active mode at a particular instant being activated in response to one’s current goals and actions. We describe these two modes below.

In Default mode, the subject is not explicitly focusing on, or may not even be aware of, the social meaning of the stimulus. Experiments featuring tasks with passive viewing, or depicting facial movements as irrelevant targets, would fall into this category (e.g., Puce et al., 2000; Ulloa et al., 2014). Similarly, in everyday life we go about our business with an internal focus on our own goals and future actions, irrespective of what others around us might be doing. As we have already discussed in detail earlier, sensory neural responses, e.g., N170, will differ across types of facial movements because of low-level characteristics such as changes in local luminance and contrast (iris/sclera movements) and biological motion (from articulated mouth movements) (see Fig. 4.2) in the Default mode.

In contrast, a Socially Aware mode would occur as a result of having to make overt social judgments, such as where another’s gaze direction must be explicitly evaluated by the observer relative to himself or herself. In everyday life, we might attend to the feelings and emotional state of another, where their facial movements serve as important cues. In Socially Aware mode, our sensory systems are maximally primed, allowing incoming sensory information to be optimally processed. It is as if the gain in the sensory system has been increased to allow more complete social evaluations of incoming stimuli, which would be indexed by ERPs that follow the N170. Neurophysiologically, this would manifest as sensory ERP components, i.e., N170, with equal amplitudes across conditions (see Fig. 4.6), enabling better subsequent processing and differentiation in later (endogenous) ERPs. Socially Aware mode would be particularly important in reading situations where multiple individuals share an interaction. In one of our previous studies (Carrick, Thompson, Epling, & Puce, 2007), subjects made explicit social judgments from sets of face triads with dynamic gaze changes producing one of three different social situations (see the lower panel, Fig. 4.6). The dynamic gaze transition produced N170s of identical amplitude across all conditions—consistent with increased gain in visual pathways—while subsequent ERP activity beyond 350 ms poststimulus differentiated between social conditions.

Fig. 4.6

Schematic representation of N170 changes for the brain in Socially Aware mode. The results of two experiments are shown. Isolated faces with gaze changes produce N170 amplitudes of equal magnitude when subjects are required to make social judgments relating to the direction of gaze (top panel). Similarly, when subjects make judgments on the type of social interaction that is taking place when a central face changes its gaze in a triad of faces, N170 amplitude is equal across conditions. Black arrows between example stimuli indicate apparent motion transitions. Solid and broken lines depicting N170 waveforms are associated with particular stimuli, identified with the same line type. White arrows on images on the lower panel schematically depict the gaze interaction of the central face with flankers and were not present in the experiment.

The switch from one mode to another could be made effortlessly and rapidly by top-down or bottom-up mechanisms. Bottom-up mechanisms operating from signals in areas such as the amygdala might be involuntary and may not be available to conscious awareness (e.g., Hardee, Thompson, & Puce, 2008). Top-down mechanisms, on the other hand, would be voluntary and governed by current intended goals and task demands. What exactly leads to a switch to a socially aware mode remains an open question. Although it seems obvious that explicitly asking participants to make social judgments would put subjects into this mode, other less explicit instructions or task requirements may well have the same effect. For instance, Ponkanen, Alhoniemi, Leppanen, and Hietanen (2011) reported larger N170 to direct than averted gaze with live faces, but not with pictures of faces. This may suggest that just seeing real faces rather than pictures may be sufficient to induce the Socially Aware mode. Another way to switch to a Socially Aware mode might simply occur by seeing a face that conveys emotion.

These different modes of information processing are likely not restricted to facial motion, but would extend to movements of the hands and body. Indeed, in a very early study, we demonstrated significant differences in early ERPs (including not only N170 but also other components that occur at around 200 ms, or earlier, post-motion onset) to hand opening and closing movements as well as leg movements. Specifically, hand closing movements, i.e., making a fist, generated an ERP at around 200 ms (N170) post-motion onset from mainly the left temporal scalp, which was significantly larger than that elicited to hand opening movements (Wheaton, Pipingas, Silberstein, & Puce, 2001). Interestingly, in the same study we also noted significantly larger ERPs from the central scalp (a positive potential at 130 ms post-motion onset, and another positivity at around 270 ms) to viewing a leg stepping forward (i.e., an approach behavior) relative to a leg stepping back (Wheaton et al., 2001). In Default mode, our brain systems are not socially engaged, but nevertheless could be sensitive to incoming stimuli that are potentially threatening. The enhanced N170s observed to hand closure (a fist), a step towards us, a gaze aversion, or an opening mouth relative to other movements of the same body parts might be generated with the assistance of (subcortical) systems that detect potential threat (Bishop, 2008; Mulckhuyse & Theeuwes, 2010; Porges, 1997). The differentiation of the earlier neural responses to these types of important stimuli would enable us to potentially pay more attention to our surroundings and force us to evaluate our environment.

Where does the salience of the gaze stimulus fit into this picture? Others have argued that direct gaze is a much more socially salient stimulus relative to averted gaze (Conty et al., 2007; Ethofer et al., 2011; Itier & Batty, 2009). Direct gaze is a cue that informs an observer that there is a desire to communicate (Kleinke, 1986). As such, one might expect early ERPs to be modulated in the direction favoring direct gaze given this consideration. On the other hand, as argued earlier, when it comes to threat detection an averted gaze stimulus may also have increased salience (producing altered visuospatial attention and a subsequent reevaluation of the visual environment). To date, there are relatively few studies examining the neurophysiological dynamics that occur to viewing the movements of others under different social manipulations, and more studies are needed to try and disentangle what might be multiple neural mechanisms (social, nonsocial) that might operate in parallel.

When in Default mode, the subject is typically not overtly and explicitly focusing on evaluating social information. This does not imply that it is not possible for this to occur in this mode: later (endogenous) ERP activity is still elicited and can potentially show differences between stimulus conditions, but this activity might not be actively used in the current situation. The fact that late ERP activity has been elicited would be optimal should a sudden switch to Socially Aware mode be required, where what was seen could be reevaluated, i.e., generating an internal social type of “double-take.” Below we provide some examples of later ERPs elicited to situations of social attention.

In two of the studies we described earlier (Carrick et al., 2007; Latinus et al., 2015), subjects made explicit social judgments, i.e., operated in Socially Aware mode. In Carrick et al. (2007), a central face averted its gaze from the viewer while two flanker faces were depicted with unchanged averted gaze, and subjects pressed a button to indicate whether the central face “shared an interaction” with none, one, or both flanker faces. We recorded two later ERPs in this paradigm. We observed a P350 with a prominent midline central scalp distribution, and also a subsequent P500 that showed a midline parietal scalp topography. Importantly, these later ERPs were sensitive to social situation: P350 was larger to the two conditions where a social interaction was taking place (relative to a situation where the face ignored the two others). P500, on the other hand, was significantly larger to the condition where the central face “ignored” the two others (Carrick et al., 2007). In Latinus et al. (2015), subjects made both social and nonsocial judgments. Here too, we were able to elicit reliable late ERPs that showed main effects as a function of task (social, nonsocial), gaze direction (averted, direct), and their interactions that were seen over large areas of the scalp. This was particularly true for gaze direction, with the largest changes occurring between conditions at latencies of around 375 ms post-gaze change (Latinus et al., 2015).

Future studies evaluating social attention changes in stimuli would potentially be more informative if two types of task were used in the same subjects using the same stimulus set in a single experimental session. In an explicit task where a social judgment is required, it is likely that the later endogenous ERPs would be informative and show changes that are consistent with social dimensions in stimulus conditions. It would be expected that sensory ERPs would show equal amplitudes across conditions. In implicit tasks with nonsocial task demands, sensory ERPs (e.g., N170) would be driven by low-level stimulus differences, whereas later endogenous ERPs would not differentiate as strongly across this passive dimension. By running implicit and explicit social tasks in the same experimental session, some of the variable differences in the social cognition literature might be reconciled. This multi-task approach is yielding interesting results in the areas of emotion processing and intentionality (Rellecke, Sommer, & Schacht, 2012) in that modulation in ERP components is observed only when subjects engage in gender and emotion discrimination tasks, but not in passive viewing. In this study, P100, N170, and slow-going and diffuse ERPs such as the late positive complex (LPC) were studied. Similarly, Wronka and Walentowska (2011) have observed N170 differences between faces depicting emotions relative to neutral, but these differences were not present when subjects performed a gender discrimination task (Wronka & Walentowska, 2011). If more of these multi-task studies were performed, then we might be able to gain a better understanding of the functional significance of various neurophysiological components.

4.2.4 Neural Responses Elicited to Changes in Gaze/Social Attention in the Presence of Emotional Facial Expressions

So far we have discussed changes in neurophysiological activity to eye gaze/social attention manipulations that have occurred in faces without associated emotions being presented. Facial expressions are usually directed at specific individuals, so changes in their gaze/social attention send a clear signal to others as to who is the target of the directed emotion. Therefore, it would not be unexpected to find interactions between gaze direction and associated facial emotion. Similarly, there may be differences in interaction effects elicited to gaze change/emotion pairings that reflect social stimuli that are likely in real life to produce approach versus withdrawal behaviors in the observer. Quite different neural responses might be elicited between averted gaze in a fearful face versus a direct gaze in an angry face. Both stimuli signal a potentially threatening situation, but likely have different contexts, despite eliciting likely withdrawal behaviors.

A number of fMRI studies have examined the neural processing underlying gaze aversions and displays of emotional facial expressions. Boll et al. (2011) found that angry faces with direct gaze elicited stronger amygdala activation relative to angry faces with averted gaze, i.e., anger targeted at another person. They demonstrated the opposite pattern with fearful faces, in that fearful faces with averted gaze elicited greater amygdala activation relative to fearful faces with direct gaze (Boll, Gamer, Kalisch, & Buchel, 2011). Similar to angry faces with direct gaze, happy faces with direct gaze also elicit more robust activation relative to the same emotional expressions presented with averted gaze (Sato, Kochiyama, Uono, & Yoshikawa, 2010). Indeed, direct gaze in faces that are rated as being attractive can also produce greater activation in the amygdala, relative to averted gaze from those same attractive faces (Kampe, Frith, Dolan, & Frith, 2001). Taken together, the findings of these studies and those of George et al. (2001) discussed earlier suggest that the amygdala maintains a sensitivity to the most salient combination of gaze–emotion signals that are related to explicit approach/avoidance behaviors (Adams & Kleck, 2005; Hietanen, Leppanen, Peltola, Linna-Aho, & Ruuhiala, 2008).

It appears that individual differences in anxiety may modulate the amygdala response to salient gaze–emotion stimuli: individuals who were high on the anxiety scale showed the greatest activation to angry faces with direct gaze, but did not differ in their response in the gaze manipulation of fearful faces (Ewbank, Fox, & Calder, 2010). It should be noted that selective amygdala activation can be elicited by isolated eyes depicting fear with direct gaze: selective activation occurred in the right amygdala in an experiment in which these stimuli were task-irrelevant. In contrast, the left amygdala in the same study was sensitive to all types of changes in the eyes, be it gaze direction, eye widening or narrowing, or change in spatial position of the eyes (Hardee et al., 2008). From these data, it appears that our amygdalae are responsive to changes in gaze, or changes in the eyes that occur when producing emotional expressions, irrespective of whether these are being actively attended to, or whether they are task-relevant.

Neurophysiological studies have the potential advantage over fMRI, as they have the ability to temporally isolate the neural response related to the gaze change from activity generated to the viewed emotion. Importantly, the observed effects from viewing these compound types of stimulation may differ depending on the order in which the dynamic changes in the face occur—as two studies we review below suggest.

Dumas et al. (2013) recorded MEG activity elicited to the onset of isolated static faces with direct gaze showing either a fearful or neutral expression from a gray background. The experiment was set up as a 2 × 2 stimulus design where Expression (Fearful, Neutral) and Gaze (Direct, Averted) were manipulated, and ERP activity to the onset of each static face could be recorded. Rather than measuring ERP peak amplitudes and latencies, changes in evoked activity were expressed as significant differences between ERP waveforms at various time intervals. Subjects’ anxiety levels were assessed and used as a covariate in the data analyses. Subjects responded to a gaze/emotion-irrelevant target, in the form of an infrequently presented blue dot that would appear after the offset of either face stimulus, ensuring that target-related ERP activity would not impinge on the effects of interest. Neural source modeling generated time courses of putative neural activity in neocortex (ventral and lateral superior temporal cortex) and amygdala. Putative amygdala activity for fearful relative to neutral faces was enhanced between 130–170 and 310–350 ms, and that to direct versus averted gaze was enhanced between 190 and 350 ms. This latter activity was selective for fearful faces in the right amygdala. Activity in neocortical sources occurred in parallel with that of the amygdala in the M170 range. The ventral cortical responses were also modulated by emotion, with greater activity to fearful relative to neutral faces (Dumas et al., 2013). This study indicates how complex potential interactions between gaze and emotion can be. Given that the manipulation of emotion and gaze direction was concurrent in this study, it is difficult to separate out neural effects to gaze changes from those to emotion.

Earlier we discussed the data of Ulloa et al. (2014) with respect to neural activity elicited to gaze changes. Unlike in the experiment of Dumas et al., Ulloa et al. presented a gaze change in two flanking neutral avatar faces 1 s prior to the onset of a dynamic emotion in both faces that evolved and waned over a further 4 s period, allowing neural activity elicited to gaze changes and to viewed emotions (happy and angry) to be separated. Gaze change conditions included a mutual gaze condition and a condition where the avatars looked away from one another (and the observer) to a point to the side of the screen. As noted earlier, irrespective of the subsequent emotion, gaze changes elicited larger M170 activity when the avatars averted their gaze from one another (and the observer) relative to the mutual gaze condition. To examine neural activity to the dynamic emotion, it was necessary to evaluate changes in mean MEG activity over time, as effectively no ERP activity would be observed to a continuously changing face depicting an emotion over a 4 s period. Main effects of emotion were observed in two MEG sensor clusters: one over the occipital scalp and the other anteriorly over the right frontotemporal scalp. In the posterior cluster these effects ranged from 400 to 1300 ms, with activity in the right cluster being more prolonged. There was no main effect of gaze condition when the emotions were unfolding (these effects had occurred earlier to the initial gaze change). Interestingly, there was a three-way interaction between gaze condition, emotion, and hemisphere of recording that occurred at two time intervals: 100–400 and 1000–1900 ms post-expression onset. Post hoc comparisons indicated that these effects in the later range were driven mainly by activity in the left hemisphere for the mutual gaze condition for both emotions.
In contrast, activity differences in the right anterior MEG sensor cluster were quite complex, with the earliest main effects occurring for gaze condition in the 100–400 ms post-expression onset time range, and effectively persisting until the end of the presented emotion (until 2500 ms). An interaction effect between gaze condition and emotion began at around 700 ms, also effectively persisting until the end of the presented emotion. Further testing identified the mutual gaze condition, and also the angry expression as being the drivers for these differences (Ulloa et al., 2014). The data from this study demonstrate how complex the interactions in gaze and emotional expression can be, and that they can also be separated in time from the original gaze change.

The data from this study indicate a clear set of changes in brain activity that emerge over time: the response to the initial gaze change in a neutral face was diminished when the two faces were engaged in mutual gaze (larger M170 to averted conditions) in the posterior scalp. As the emotion unfolded (and gaze remained constant) bilateral activity played out across the posterior sensors until about halfway through the depicted emotion (i.e., at its peak). Effects in the right anterior sensors, once active, persisted for the presentation of the whole emotion. Notably, there was an interaction effect between gaze condition and emotion, with the mutual gazing faces with angry expressions showing the greatest prolonged MEG activity. The data from this study raise many questions. One main question that cannot be answered from this study relates to the frequency composition of the increased MEG activity elicited to the stimulus manipulations.

4.2.5 Evoked Intracranial EEG Activity to Viewing Changes in Gaze/Social Attention

Scalp EEG and MEG studies cannot localize the sources of neural activity with certainty, although neural source modeling is performed on these types of data (for a review, see Michel et al., 2004). On rare occasions neuroscientists have the ability to record neurophysiological activity directly from the human brain in neurosurgical patients who are undergoing invasive investigations for the amelioration of drug-resistant seizures, often of temporal lobe origin. Although there is always the question of how this activity might be affected by either an underlying tissue abnormality or anticonvulsant medication, nevertheless these types of recordings provide a window onto the brain. As discussed in Sect. 4.1, fMRI studies have identified active brain loci for processing information related to gaze and social attention changes. This has led some investigators to study neurophysiological activity in these brain regions in neurosurgical patients whose seizure semiology dictates the placement of intracranial recordings in these brain regions.

Caruana et al. (2014) examined intracranial ERPs, as well as oscillatory gamma band EEG activity to viewing gaze changes produced by apparent motion, similar to our previous studies. Epilepsy surgery patients viewed the stimuli and pressed a button whenever the stimulus face closed its eyes. Over 200 recording sites from depth electrodes penetrating all gyri of posterior temporal cortex (superior, middle, and inferior) and angular gyrus were studied. Notably, significantly greater neurophysiological activity was observed to averted gaze relative to direct gaze or to a side switch gaze change, where gaze changed from extreme left to extreme right, or vice versa. Both intracranial N170 amplitudes and high gamma band power (50–150 Hz) were significantly increased to the gaze aversions that followed from a direct gaze position, and these changes were seen on depth electrode contacts centered on the MTG. According to the authors the “crucial aspect of gaze aversion is the prior presence of the eye contact and its interruption” and that this was the likely reason for the resulting augmented neurophysiological activity as shown by both intracranial N170 ERP and high gamma band activity (Caruana et al., 2014).

Increased intracranial ERP activity has also been reported in recordings made from ventral temporal cortex, i.e., FG, to averted versus direct gaze. N200 ERP amplitude was increased to averted versus direct gaze, in an experiment where head position was also manipulated (Pourtois, Spinelli, Seeck, & Vuilleumier, 2010) similar to that performed originally by Conty et al. (2007). Unlike in Conty et al. (2007), the only main effect that was observed was for gaze—no significant differences were observed in head position. Pourtois et al. also reported a late ERP effect in the FG, where larger activity beginning at around 400 ms and lasting for around 600 ms was observed to averted versus direct gaze in a task where the patients were required to perform a gender discrimination task. Similar to the scalp ERP data, the intracranial data show an initial early effect of gaze transition (at around 200 ms) followed by later ERP effects that begin after 300 ms (Pourtois et al., 2010).

Given our Default/Socially Aware information-processing model outlined in the previous section, it will be interesting to perform more invasive studies from these brain regions that compare neurophysiological changes to social and nonsocial tasks in the same individuals. Scalp EEG studies have poor localization value, and invasive EEG studies (despite having limited placement that is dictated by clinical demands) can identify local neurophysiological activity from the presence of large local amplitude gradients and polarity reversals in neural activity.

4.3 Oscillatory EEG Changes Elicited to Viewing Changes in Gaze/Social Attention

A growing number of laboratories, including our own, are beginning to investigate the frequency composition of EEG/MEG activity elicited to viewing changes in social attention. Averaged ERP activity tells only part of the story when examining neurophysiological effects that are produced by any incoming stimulus. Very few studies to date examine ERP and oscillatory activity side by side, so currently it is difficult to get a sense of how brain activity changes overall with respect to viewing changes in gaze/social attention. This is an important issue, because fMRI activation is likely to be a composite of both types of neurophysiological activity (e.g., Logothetis et al., 2001; Puce et al., 1995, 1997), potentially producing different results across assessment modalities. At this stage, we still lack an understanding of the true functional significance of neurophysiological activity elicited to changes in the social attention of others. Similarly, the functional neuroanatomy of social attention needs to be placed explicitly within the context of known networks that make up the social brain, i.e., mentalizing network, amygdala network, mirror neuron network, and empathy network (Stanley & Adolphs, 2013). As seen in the previous sections of this chapter, the literature to date implicates mainly the mentalizing and amygdala networks as being crucial to evaluating another’s social attention.

As we noted in the previous section, Caruana et al. (2014) documented increased high gamma band power to gaze aversions that occurred from direct gaze transitions, in addition to their increased intracranial N170s to averted gaze. What is not clear is whether there were changes in other frequencies of oscillatory EEG activity in this study, e.g., in the alpha and beta ranges in the temporal cortex.

In two studies, we recorded ERPs and oscillatory EEG activity in response to viewing faces depicting eye gaze changes in a nonsocial task. In one experiment, stimuli consisting of only line-drawn faces were presented (depicting eye and mouth movements) (Rossi et al., 2014), and in the second experiment real images of faces and line-drawn faces were presented in the same experiment (Rossi et al., 2015). We have already described the ERP features in detail to these experiments above where N170 increases to gaze aversions were observed only to images of real faces. Relevant to the current discussion, we evaluated oscillatory EEG activity over a 5–50 Hz range, segregating the activity into alpha (8–12 Hz), beta (12–30 Hz), and low gamma (30–50 Hz) frequency bands. We looked for significant differences between direct gaze and averted eye conditions in both studies. In the study in which only line-drawn faces were presented, changes in the beta and gamma band were observed. Beta band (12–30 Hz) power increases are thought to reflect maintenance of current behaviorally relevant sensorimotor or cognitive states (Engel & Fries, 2010). Gamma band (> 30 Hz) power increases have been associated with facilitation in cortical processing in situations requiring cognitive control and perceptual awareness (Engel & Fries, 2010; Grossmann, Johnson, Farroni, & Csibra, 2007; Ray & Cole, 1985; Tallon-Baudry & Bertrand, 1999, but see Sedley & Cunningham, 2013). In our recent studies, averted gaze relative to direct gaze elicited suppressed beta activity at two discrete time points: around 150 and 350–450 ms post-gaze transition in the left occipitotemporal scalp. Additionally, beta activity increased at this latter time interval over the right occipitotemporal scalp for the averted relative to direct gaze comparison. In the left hemisphere, a relative increase in low gamma activity was noted to direct gaze at around 450 ms. 
These changes in oscillatory activity to the eye gaze stimuli were very different to those observed to mouth movements and to movements of scrambled control stimuli. For mouth movements, reduced activity at around 500 ms was seen in the gamma range for mouth closing versus opening movements in both hemispheres, with an increase in beta activity at around 380 ms in the right occipitotemporal scalp occurring to mouth closing versus opening. Motion control stimuli produced a different pattern to either eye or mouth movements, with brief changes in activity occurring only in the left occipitotemporal scalp, in the beta range at ~425 ms and in the gamma range at ~100 and 380 ms (and with a higher frequency in the lower gamma range). In this study, participants were asked to respond on each experimental trial with a button press to indicate whether the current line-drawn stimulus was white or red (Rossi et al., 2014).
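To make the band segregation described above concrete, the following is a minimal sketch of how power in one epoch could be summed within the alpha (8–12 Hz), beta (12–30 Hz), and low gamma (30–50 Hz) bands. It is not the analysis pipeline used in the studies discussed here: the plain discrete Fourier transform, the half-open band boundaries, the 200 Hz sampling rate, and the synthetic test signal are all assumptions made for illustration.

```python
import math

# Band edges in Hz as given in the text. Each band is treated as
# half-open [low, high) so that 12 Hz and 30 Hz fall in one band only
# (this boundary convention is an assumption).
BANDS = {"alpha": (8.0, 12.0), "beta": (12.0, 30.0), "low_gamma": (30.0, 50.0)}


def band_power(signal, sample_rate_hz):
    """Sum DFT power within each band for a single epoch."""
    n = len(signal)
    power = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        freq = k * sample_rate_hz / n
        # Real and imaginary parts of the k-th DFT coefficient.
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        p = (re * re + im * im) / (n * n)
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                power[name] += p
    return power


# Synthetic 1 s epoch sampled at 200 Hz: a 20 Hz (beta) sinusoid plus a
# weaker 40 Hz (low gamma) sinusoid. These values are illustrative only.
FS = 200
epoch = [
    math.sin(2 * math.pi * 20 * i / FS) + 0.5 * math.sin(2 * math.pi * 40 * i / FS)
    for i in range(FS)
]
result = band_power(epoch, FS)
print(result)
```

In this synthetic example nearly all power falls in the beta band, a smaller amount in the low gamma band, and essentially none in the alpha band. Real analyses of gaze-related activity would typically use a time-frequency decomposition so that changes can be localized in time as well as frequency.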

In our second study examining oscillatory EEG changes to both real and line-drawn faces, participants detected an infrequent target stimulus that could be a photonegative image of each of the different stimulus types (Rossi et al., 2015). We presented blocks of real and line-drawn faces depicting gaze aversions and direct gaze transitions to look for effects of experimental context on neurophysiological activity. Although this did not occur with ERPs, our oscillatory EEG data showed some differences to those described earlier. In this experiment oscillatory activity to real faces and line faces showed changes only in the gamma range, at similar time points around 200–300 ms after the gaze change. These patterns of activity were quite different to those for control motion stimulation, where changes in beta and gamma activity occurred at different time points relative to the changes observed with faces (Rossi et al., 2015). It is possible that the differences in oscillatory profiles of activity across the two experiments were driven by the different task requirements: a color change detection task with a required response on each trial, as opposed to the detection of an infrequent target stimulus consisting of a photonegative of any presented stimulus type. The other possibility is that the context in which the stimulus was presented may have affected the type of elicited neurophysiological activity. At this point in time we cannot distinguish between these two possibilities. Having said that, there is a clear difference in the behavior of ERP activity (phase-locked to the stimulus and hence visible in an averaged ERP) relative to oscillatory EEG activity (not necessarily phase-locked to stimulus delivery, but still elicited to the gaze change). From our studies with line-drawn faces it is clear that context/task does not influence N170 ERP activity, but that is not the case for oscillatory EEG activity, at least when the brain is working in Default mode.
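The distinction between phase-locked and non-phase-locked activity can be illustrated with a small simulation. The sketch below is an assumption-laden toy example, not a reanalysis of any data discussed here: the 40 Hz frequency, 200 Hz sampling rate, trial count, and uniform random phase jitter are all hypothetical choices. It shows that an oscillation whose phase varies from trial to trial largely cancels in the trial average (and so is nearly invisible in an averaged ERP), whereas averaging single-trial power preserves it.

```python
import math
import random

FS = 200          # sampling rate in Hz (illustrative)
N_SAMPLES = 200   # 1 s epoch
N_TRIALS = 200    # number of simulated trials (illustrative)
FREQ = 40         # oscillation frequency in Hz (illustrative)


def make_trial(phase):
    """One epoch containing a unit-amplitude oscillation with the given phase."""
    return [math.sin(2 * math.pi * FREQ * i / FS + phase) for i in range(N_SAMPLES)]


def mean_power(x):
    """Mean squared amplitude of a waveform."""
    return sum(v * v for v in x) / len(x)


def average_trials(trials):
    """Sample-by-sample average across trials, as in ERP averaging."""
    return [sum(col) / len(trials) for col in zip(*trials)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
locked = [make_trial(0.0) for _ in range(N_TRIALS)]
jittered = [make_trial(rng.uniform(0, 2 * math.pi)) for _ in range(N_TRIALS)]

# Power of the trial average: only phase-locked activity survives.
evoked_locked = mean_power(average_trials(locked))
evoked_jittered = mean_power(average_trials(jittered))

# Average of single-trial power: survives regardless of phase.
total_locked = sum(mean_power(t) for t in locked) / N_TRIALS
total_jittered = sum(mean_power(t) for t in jittered) / N_TRIALS

print(evoked_locked, evoked_jittered, total_locked, total_jittered)
```

With phase-locked trials, the power of the average equals the average single-trial power. With phase-jittered trials, the average single-trial power is unchanged but the power of the average collapses toward zero, which is why the two measures can behave differently under the same experimental manipulation.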

Amygdala activity is impossible to detect with scalp EEG. It is also difficult to detect with MEG sensors (with the ability to detect activity in deep sources depending in part on detector type and sensitivity). On occasions, intracranial recording electrodes for seizure detection are implanted in this region. In a study on six epilepsy surgery patients, Sato et al. (2011) reported changes in oscillatory EEG activity to viewing changes in gaze stimuli and in control stimuli. Patients viewed isolated eye stimuli that changed their gaze position and were required to respond to an infrequent change in the color of a centrally presented fixation cross on a white background that occurred between presented eye stimuli. Oscillatory EEG activity in the range 4–60 Hz was examined, and statistical testing revealed a significant differential broadband gamma burst of activity that occurred at around 200 ms after the gaze transition when the eye conditions were compared with the control (dynamic mosaics) (Sato et al., 2011). Unfortunately, ERP activity was not evaluated in this study. It would be interesting to see if parallel changes in ERP activity and gamma activity would have been observed, as was seen in the study by Caruana et al. (2014).

At this very early stage of investigation of oscillatory EEG/MEG activity elicited to gaze/social attention changes, it appears that gamma activity may play an important role in processing this salient visual stimulus. The intracranial investigations indicate that gamma activity is augmented in the lateral temporal cortex to gaze aversion (Caruana et al., 2014) and that this type of activity is clearly larger than that observed to non-eye/face controls (Sato et al., 2011) in nonsocial tasks. In nonsocial tasks, changes in gamma activity recorded in scalp EEG are also affected by the type of gaze transition. What remains unknown at this point in time is how social versus nonsocial judgments are likely to influence elicited gamma activity, and how experimental context also modulates these data. Unfortunately, scalp EEG studies cannot reliably record higher frequency gamma activity, because of the low-pass filtering effects of the skull (Srinivasan, Nunez, Tucker, Silberstein, & Cadusch, 1996). Hence, more intracranial EEG studies and perhaps MEG studies will be required to gain a better understanding of the functional significance of these changes in high-frequency oscillatory activity.

Using an interaction task between two individuals, Iwaki (2013) recorded MEG activity in a subject as they observed another person and altered their gaze relative to the eye movements of the observed individual every couple of seconds. Direct and averted gaze were alternated. Interestingly, significant changes in MEG activity in response to viewing direct versus averted gaze were seen in the gamma range (35–45 Hz) at a large number of MEG sensors that were located over bilateral aspects of the posterior temporal, parietal, and frontal aspects of the head. These effects occurred at isolated intervals during the 2 s recording epoch (Iwaki, 2013). In this study only one subject was studied at a time, and it may be that the presence of a real (live) person drove these effects, as found by Ponkanen et al. (2008), where direct gaze produced stronger frontal EEG changes in the alpha band (discussed in the next section).

In a fascinating dual-interactive EEG study, Lachat et al. (2012) recorded scalp EEG from a pair of subjects engaging in a task manipulating gaze direction and joint attention. In an ingenious experimental design, gaze direction was cued by light-emitting diodes (LEDs) in a semi-arc between the two subjects, where one could also be instructed to follow the gaze of the other. In one manipulation of joint attention, one subject would follow the gaze of the other to look at an illuminated LED (whose onset had cued the first subject’s gaze transition). A condition in which the subjects both looked at the same LED served as a control: here their gaze was on the same target but had been initiated under different conditions. To examine the effects of gaze direction, subjects could look at different LEDs, but could be cued to this either by the gaze of one of the subjects or alternatively by LED onset. These various conditions resulted in a 2 × 2 design for Joint Attention (present, not present) and Instruction (social, i.e., gaze dependent; nonsocial, i.e., LED color dependent). The experimenters specifically investigated oscillatory EEG activity in the alpha range across the entire scalp in these experimental manipulations, examining EEG activity during an epoch immediately following each gaze transition. Significant changes in alpha range activity across the left centro-parietal-occipital scalp were noted as a main effect of joint attention. No main effect for instruction or interaction effects for joint attention/instruction were documented (Lachat, Hugueville, Lemarechal, Conty, & George, 2012). Activity in other EEG frequency bands was not investigated in this study, so it is not clear how these data fit with the other studies we have discussed.

So far, the existing oscillatory EEG findings appear to be somewhat at odds with one another. Investigators have not typically examined the whole EEG frequency range, or the whole scalp, so it is unclear if changes in EEG are spatially localized and confined to a narrow frequency band or are more extensive. Intracranial data show increased gamma to averted gaze at a time corresponding to the occurrence of the N170 (Caruana et al., 2014). Scalp EEG studies show very brief differential gamma effects for direct and averted gaze across the occipitotemporal scalp (Rossi et al., 2015), as well as changes in the beta band (Rossi et al., 2014) and alpha band (Hietanen et al., 2008; Lachat et al., 2012). These varied data indicate a clear need to systematically investigate oscillatory EEG activity across the entire frequency spectrum in the same subjects, under a series of experiments that compare social versus nonsocial judgments and examine potential within-experiment stimulus context effects. By performing these studies and also examining ERP activity concurrently, a more coherent neurophysiological profile of activity elicited to gaze/social attention changes will emerge. Currently, the functional significance of the observed oscillatory changes with respect to social attention remains unknown.

4.4 Naturalistic Tasks and Ecological Validity of Experimental Stimuli

Ecological validity and stimulus type also need to be considered in tasks evaluating gaze perception and social attention. As noted earlier, most studies have used the onset/offset of static images of full-on gray scale faces whose gaze may appear as direct or averted, a somewhat unrealistic representation relative to what we experience in our daily lives. N170 ERP activity has been found to be significantly larger to gaze changes and eye closure performed by a real (live) actor, relative to the face of the same actor presented as a static two-dimensional image in a passive viewing task. Interestingly, however, N170 modulation as a function of gaze was reported only to the real actor and not to the presented video of the same individual. Specifically, the largest N170s were reported to the direct gaze condition (Ponkanen et al., 2011). Hietanen et al. (2008) have also investigated oscillatory EEG changes in the alpha band (8–13 Hz) in a similar passive viewing task. Specifically, an asymmetry in alpha band activity across the frontal scalp occurred for viewing a live actor changing their gaze, and not for viewing images of the same actor performing the same action. Alpha activity was relatively larger in the left relative to the right frontal scalp when direct gaze was viewed, and was larger in the right frontal scalp (relative to left) when averted gaze was viewed. These hemispherically selective changes in frontal EEG asymmetry were interpreted as engaging approach and avoidance systems in the brain, respectively. Additionally, measurements of autonomic activity, as assessed by skin conductance, showed greater galvanic skin responses to viewing the live actor, particularly when the actor directed their gaze at the observer (Hietanen et al., 2008).
Increased N170 amplitudes and autonomic responses were attributed to direct gaze being more arousing to the subject (Hietanen et al., 2008; Ponkanen et al., 2008), and potentially being more socially salient. The effects of direct gaze do not appear to be affected by culture: similar effects of direct gaze occur for observers in Western and East Asian cultures (Akechi et al., 2013), despite prolonged direct gaze being regarded as rude behavior in some of these East Asian cultures (Knapp, 1972; Sue & Sue, 1977). Interestingly, individuals from East Asian cultures tend to fixate more on the eyes when making judgments of emotion, as opposed to individuals from Western cultures who show a tendency to focus more on the mouth region (Jack, Blais, Scheepers, Schyns, & Caldara, 2009; Yuki, 2007).

The studies of Ponkanen et al. (2008, 2011) and Hietanen et al. (2008) underscore the need for studies of social attention (and social cognition in general) to use live actors in ecologically valid contexts: the observed experimental effects to viewing real actors in three dimensions are clearly augmented relative to those seen in their two-dimensional image counterparts. Therefore, three-dimensional stimuli might be more likely to elicit significant differences between experimental conditions. Interestingly, the data of Ponkanen et al. did not show differences between gaze conditions for gaze stimuli presented on a monitor, unlike our own multiple studies that demonstrate clear differences between gaze conditions in apparent motion of face stimuli presented on a monitor. Although one can always use task demands and stimulation conditions as a convenient reason to explain divergent findings between different studies, it may well be that monitor refresh rates and the resolution/frequency of presented digital video affect elicited neurophysiological activity, because the gaze transition might not appear as “sharp” or rapid when presented on some monitors. Similarly, gaze transitions generated in an apparent motion paradigm, where two still images (one with direct gaze and the other with averted gaze) are presented successively, may also produce a sharper motion transition than a gaze transition viewed in a real actor. This more instantaneous transition in the apparent motion task might have produced ERPs that are larger and less widely dispersed than those to a real motion transition.

Potential differences in the robustness of elicited ERPs to apparent motion versus real video stimulation may be bolstered by our own data: our previous studies using stimuli presented on a monitor have mostly used apparent motion transitions that have shown systematic differences in N170 amplitude between mouth opening and closing conditions (Puce et al., 2000, 2003). We have previously also evaluated ERPs to videos of real mouth motion in a study that also tested the responses to viewing hand and body motion. Notably, although N170 appeared larger for mouth opening movements, the differences between mouth conditions in this study were not significant (Wheaton et al., 2001), in videos that were presented at a 30 Hz digitization rate. It could be that videos captured at this rate cannot fully depict the face movement, which is typically rapid, and that this results in an ERP that is not elicited under optimal stimulation conditions. Indeed, one could argue that video displays with low refresh rates themselves actually constitute apparent motion stimuli. Future studies comparing real versus video stimulation to different types of facial and bodily movements, as well as video stimulation compared with apparent motion studied in the same experimental session, will be needed to disentangle the effects of these visual stimulation parameters on neurophysiological activity.
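The temporal limits of video stimulation can be put in simple numerical terms. The sketch below computes the duration of one frame at a given frame rate and the number of complete frames that fit within a movement of a given duration. The 30 Hz rate comes from the text; the 100 ms movement duration and the 60 Hz comparison rate are hypothetical values chosen purely for illustration, not measurements from any study discussed here.

```python
def frame_interval_ms(frame_rate_hz):
    """Duration of a single video frame in milliseconds."""
    return 1000.0 / frame_rate_hz


def whole_frames_in_movement(movement_ms, frame_rate_hz):
    """Number of complete frame intervals that fit within a movement.

    Integer arithmetic is used to avoid floating-point rounding at
    exact multiples of the frame interval.
    """
    return (movement_ms * frame_rate_hz) // 1000


# 30 Hz is the digitization rate mentioned in the text; the 100 ms
# movement duration is a hypothetical value for illustration only.
HYPOTHETICAL_MOVEMENT_MS = 100
for rate in (30, 60):
    print(
        rate,
        round(frame_interval_ms(rate), 1),
        whole_frames_in_movement(HYPOTHETICAL_MOVEMENT_MS, rate),
    )
```

Under these assumptions, each frame at 30 Hz lasts about 33 ms, so a movement of roughly 100 ms would be rendered in only about three frames. This is consistent with the suggestion that low-rate video may itself behave like an apparent motion stimulus.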

Naturalistic experimentation involving multiple subjects engaging in social interactions poses many technical challenges, but in principle, could be studied using ambulatory EEG recordings (Gramann et al., 2011; Sipp, Gwin, Makeig, & Ferris, 2013). If artifacts could be reliably detected and removed, then it might be possible to evaluate changes in oscillatory EEG that will occur as an individual approaches another or a facial expression slowly unfolds. Laboratories that have the capability to examine these types of interactions are relatively few (e.g., Lachat et al., 2012; Sipp et al., 2013) but have the potential to evaluate the brain in a situation that is much more ecologically valid than that reported in earlier studies. This would include the ability to record simultaneous EEG from multiple individuals while they engage in a social interaction.

4.5 Joint Attention and Gaze-Cueing Experiments

Very few studies have explicitly manipulated joint attention in the naturalistic manner described earlier (Iwaki, 2013; Lachat et al., 2012). The change in another’s social attention is thought to be automatic and reflexive and to reflect a reorienting of attention in space. Therefore, gaze-cueing experiments have been typically used to study processes related to joint attention (Frischen, Bayliss, & Tipper, 2007). These experiments have evolved largely out of Posner-style paradigms (Posner, 1980) that have cued a subject’s covert visuospatial attention to a location in space. These studies are beyond the scope of this chapter.

4.6 Some Outstanding Questions

Over the course of the chapter we have alluded to a number of knowledge gaps in the social attention area. The largest gaps to be filled, in our opinion, are listed below:

How does the functional connectivity of activity elicited in the four social brain networks change as a function of social context and required behavior in tasks investigating social attention? Which networks (and brain structures) are critical to this process, and drive other parts of the network?

How does ERP activity relate to oscillatory EEG/MEG changes in social attention tasks?

What is the functional significance of oscillatory EEG/MEG changes in social attention tasks, and how does this relate to proposed roles for different types of oscillatory EEG/MEG activity changes in other perceptual/cognitive/affective manipulations?

Related to our proposed dual processing mode for social information in the brain (Default/Socially Aware):

Are there consistent neurophysiological correlates of these two states (in both ERP and oscillatory EEG activity)?

Are these two modes associated with different profiles of functional connectivity in the brain’s social networks?

The discussion in this chapter has related only to the healthy human brain: Are these two information-processing modes affected to different extents in social cognition disorders?

4.7 Conclusions

Given the above outstanding questions, there is clearly a lot of work to be performed in generating a more complete understanding of the neural processes underlying the evaluation of social/joint attention in the healthy adult brain. The use of multiple assessment methods and the search for converging evidence across EEG/MEG, functional MRI, neuropsychological lesion studies, as well as studies of structural connectivity will be required to disentangle a number of different issues in social attention. What is known, however, is that the brain has a set of networks that activate selectively to social stimuli and situations. For social attention, networks such as the mentalizing and amygdala networks are important, with areas of the brain such as the pSTS, amygdala, and FG being particularly important for evaluating another’s social attention. Neurophysiologically, it is clear that at around 200 ms (a fifth of a second), a social attention stimulus is differentiated by the brain, with subsequent neural activity being modulated by the social context of the social attention stimulus. An emerging set of studies has indicated that the use of live human models and naturalistic stimulation will enhance and change the neural activity that is elicited to social stimuli, stressing the importance of using ecologically valid stimuli to evaluate the neural basis of human social interactions. Nonetheless, static images and dynamic videos depicting eye gaze and mouth movements will continue to be used when using live actors is not methodologically feasible.

On the basis of our neurophysiological investigations, we propose a model in which incoming social information can be processed in one of two modes: a “Default” (or nonsocial) mode and a “Socially Aware” mode. The latter mode is active when making explicit social judgments, whereas the former will be active in most other contexts. Rapid switching from one mode to the other can occur by way of either top-down or bottom-up mechanisms. The nature of this switching and the characteristics of each mode remain to be clarified by future studies, which will require the use of both naturalistic stimulation and more controlled laboratory studies, as well as first- and third-person contexts.