1 Background

Social anxiety disorder (SAD) is a debilitating mental illness primarily characterized by an excessive fear of negative social evaluation. Cognitive models of SAD (e.g. Rapee and Heimberg 1997) suggest that this condition is maintained and exacerbated by various information processing biases, which may occur from early automatic stages of processing, through to later interpretive stages of processing.

With regard to the former, a strong body of literature suggests that SAD is associated with an attentional bias to threat (Bar-Haim et al. 2007), that is, the preferential attentional processing of stimuli which may indicate feared negative social evaluation. Such stimuli may include disapproving or angry facial expressions exhibited by other individuals in a given social situation. This threat bias may maintain and exacerbate social anxiety by increasing arousal, falsely confirming various negative beliefs, and provoking safety behaviours and avoidance (Hofmann 2007; Rapee and Heimberg 1997).

Although the relationship between social anxiety and attentional bias to threat is well established, the attentional components which underpin such a bias require further empirical inquiry. For instance, theoretical distinctions have been made between attentional engagement with a stimulus and subsequent disengagement from that stimulus. An attentional bias to threat may therefore reflect either facilitated engagement with the threat stimulus, or some form of disruption to disengagement from the threat. Studies employing reaction time (RT)-based measures of attentional engagement and disengagement, commonly the modified Posner cuing task (Fox et al. 2001), have consistently observed RT data interpreted as disrupted disengagement from threat in high-anxious individuals. However, the validity of this measure has been questioned. It has alternatively been suggested that the same RT data may reflect an additive effect of facilitated engagement and a general slowing in the presence of threat for high-anxious individuals (Mogg et al. 2008). The modified Posner cuing task cannot differentiate between these two accounts.

Recently, there has been a growing interest in the use of eye tracking to assess attentional bias, conceivably due, in part, to the limitations of RT-based measures. For instance, when a stimulus is presented for a duration of 500 ms, it is entirely possible that multiple shifts of attention occur during this interval. However, RT is not sensitive to such shifts, providing only a snapshot of attention (Bradley et al. 2000). In contrast, eye movement (EM) may provide a continuous and relatively direct measure of visual attention. Previous studies using EM assessments have found facilitated engagement with threat (Mogg et al. 2000) or with both threat and positive stimuli (Calvo and Avero 2005; Garner et al. 2006). In addition, facilitated disengagement from positive stimuli, relative to threat, has been observed in clinically socially anxious individuals (Chen et al. 2012).

In addition to engagement and disengagement biases, avoidant attentional styles in high-anxious individuals have been observed in EM studies. That is, when social stimuli are presented for a long duration, such as 3 s, anxious individuals have shown reductions over time in sustained attention to either threat stimuli (Calvo et al. 2005; Rohner 2002) or both threat and positive stimuli (Chen et al. 2012; Garner et al. 2006). It has been suggested that while anxiety-linked attentional bias to threat may reflect an automatic process, attentional avoidance may represent a strategic effort, in which a socially anxious individual avoids attending to emotional social stimuli in an attempt to regulate their emotional state (Cisler and Koster 2010).

While recent research has incorporated eye tracking to assess attentional selectivity, a further advantage of eye tracking is its capacity to unobtrusively assess visual attention during realistic simulations. For instance, in the field of human-computer interaction, eye tracking has been utilized in aviation and driving simulations to assess information selection and management, and situational awareness (Duchowski 2002). Such simulations possess high ecological validity. In contrast, the majority of anxiety and attentional bias research has been conducted in contrived laboratory settings. While such settings allow for rigorous experimental control, their artificial nature raises the question of whether the findings generalize to real-world situations. In clinical research, however, the use of realistic simulations combined with eye tracking remains a novel methodology.

2 Proposed Paradigm: Eye-Tracking Speech Simulation

Public speaking is a common fear for socially anxious individuals, as it requires them to engage in social performance and exposes them to potential negative social evaluation from the audience. Given this, a public speaking simulation, combined with eye tracking, may provide critical insights into the attentional processes that occur during conditions of psychosocial stress for individuals with SAD, in comparison to low-anxious control participants.

2.1 Design

Participants are required to give a brief, 5-minute speech on a topic of their own choice, in front of a large display playing a pre-recorded video of an audience, while EM at the display is continuously recorded. The audience consists of eight confederates, assigned to either express socially positive or negative gestures, or remain neutral throughout the speech (see Fig. 1). Socially positive gestures may include a smile, or an agreeing nod, whereas socially negative gestures may include a disagreeing shake of the head, or a sigh of boredom. Participants receive one of two counterbalanced audience videos, in which the emotional confederates are switched for valence.

Fig. 1 Layout of audience display

An initial 40-s neutral period is presented, in which all confederates are neutral. Following this, a number of trials are presented, one trial every 10 s. Each trial begins with a flashing cross cue presented for 1 s on either an emotional face or a neutral face. Immediately following this, a gesture pair consisting of one positive and one negative confederate makes a dynamic social gesture for 6 s, and then returns to neutral. Participants are asked to look at the cue whenever it appears. This method of presentation derives from the recent development of an engagement disengagement cuing (EDC) task, which uses a cue to secure attention in order to assess attentional engagement and disengagement (Chen et al. 2012).
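The timing described above can be laid out as a simple schedule. The following is a minimal sketch only: the function name, the tuple layout, and the `n_trials` parameter are illustrative assumptions, since the text does not state how many trials are presented.

```python
from typing import List, Tuple

# Timing constants taken from the design description (in seconds).
NEUTRAL_PERIOD_S = 40.0   # initial period with all confederates neutral
TRIAL_INTERVAL_S = 10.0   # one trial begins every 10 s
CUE_DURATION_S = 1.0      # flashing cross cue
GESTURE_DURATION_S = 6.0  # dynamic social gesture


def trial_schedule(n_trials: int) -> List[Tuple[float, float, float]]:
    """Return (cue_onset, gesture_onset, gesture_offset) for each trial.

    n_trials is a hypothetical parameter; the source only says
    "a number of trials".
    """
    schedule = []
    for k in range(n_trials):
        cue_onset = NEUTRAL_PERIOD_S + k * TRIAL_INTERVAL_S
        gesture_onset = cue_onset + CUE_DURATION_S
        gesture_offset = gesture_onset + GESTURE_DURATION_S
        schedule.append((cue_onset, gesture_onset, gesture_offset))
    return schedule
```

Under these timings, each trial leaves 3 s of neutral display between gesture offset and the next cue.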

For trials assessing engagement (see Fig. 2a), the cross cue is presented on the neutral face. The participant's attention is, therefore, secured in between and equidistant from the positive and negative gestures. Following gesture onset, the propensity and speed of initial EM orientation to positive and negative stimuli may be assessed to provide measures of engagement propensity and engagement speed, respectively.

Fig. 2 Example of trials assessing engagement (a, left) and disengagement from threat (b, right)

For trials assessing disengagement (see Fig. 2b), the cross cue appears on either the positive or the negative face. The participant’s attention is, therefore, secured at the location of the emotional face. Following gesture onset, the latency to saccade away from the face may be assessed to provide a measure of disengagement speed from positive and negative stimuli.

2.2 Data preparation

Given that speaking may cause momentary disruptions to the EM samples, a two-sample noise reduction filter is first applied to the EM data (Stampe 1993), followed by an interpolation filter to smooth out brief signal loss for gaps smaller than 100 ms where EM is held within 1° visual angle (VA) immediately before and after the gap. Subsequently, fixations are defined as EM samples held within 1° VA for a minimum duration of 100 ms. Due to the noise introduced by concurrent speaking, fixation detection algorithms are preferable to velocity- and acceleration-based saccade detection algorithms.
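The fixation definition above can be illustrated with a dispersion-based detector. This is a minimal sketch under stated assumptions, not the authors' implementation: samples are assumed to be already filtered and expressed in degrees of visual angle, "held within 1° VA" is interpreted as the horizontal and vertical extent of the window each staying within 1°, and all names are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float, float]             # (time_ms, x_deg, y_deg)
Fixation = Tuple[float, float, float, float]    # (start_ms, end_ms, mean_x, mean_y)


def detect_fixations(samples: List[Sample],
                     max_dispersion_deg: float = 1.0,
                     min_duration_ms: float = 100.0) -> List[Fixation]:
    """Group consecutive samples into fixations by spatial dispersion."""
    fixations: List[Fixation] = []
    i, n = 0, len(samples)
    while i < n:
        min_x = max_x = samples[i][1]
        min_y = max_y = samples[i][2]
        j = i
        # Grow the window while its extent stays within the dispersion limit.
        while j + 1 < n:
            _, x, y = samples[j + 1]
            lo_x, hi_x = min(min_x, x), max(max_x, x)
            lo_y, hi_y = min(min_y, y), max(max_y, y)
            if (hi_x - lo_x > max_dispersion_deg
                    or hi_y - lo_y > max_dispersion_deg):
                break
            min_x, max_x, min_y, max_y = lo_x, hi_x, lo_y, hi_y
            j += 1
        duration = samples[j][0] - samples[i][0]
        if duration >= min_duration_ms:
            window = samples[i:j + 1]
            mean_x = sum(s[1] for s in window) / len(window)
            mean_y = sum(s[2] for s in window) / len(window)
            fixations.append((samples[i][0], samples[j][0], mean_x, mean_y))
            i = j + 1   # continue after the accepted fixation
        else:
            i += 1      # too short: slide the window start forward
    return fixations
```

A window shorter than 100 ms is discarded rather than merged with its neighbours, so brief excursions between two stable gaze positions are not counted as fixations.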

2.3 Engagement Propensity and Engagement Speed

Trials assessing engagement are included for analysis if (a) fixation is present at the location of the cross cue immediately prior to gesture onset, (b) fixation occurs on at least one of the emotional faces before offset, and (c) this critical fixation occurs at least 100 ms following gesture onset (to remove anticipatory saccades). From the included trials, engagement propensity may be calculated as the relative likelihood to initially orient to positive and negative stimuli, and engagement speed as the mean latency of initial orientation to positive and negative stimuli.
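The inclusion criteria and the two engagement measures can be sketched as follows. This is an illustration under assumptions: the `EngagementTrial` record and its field names are hypothetical, latencies are measured from gesture onset, the 6-s gesture duration is taken from the design, and propensity is operationalized here as the proportion of included trials on which the first fixation lands on a given valence.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass
class EngagementTrial:
    cue_fixated: bool               # (a) fixation on the cue just before gesture onset
    first_valence: Optional[str]    # "positive", "negative", or None if no emotional face fixated
    latency_ms: Optional[float]     # latency of that first fixation from gesture onset


def is_included(trial: EngagementTrial,
                min_latency_ms: float = 100.0,
                gesture_ms: float = 6000.0) -> bool:
    """Apply criteria (a) to (c) to a single engagement trial."""
    return (trial.cue_fixated
            and trial.first_valence is not None
            and trial.latency_ms is not None
            and min_latency_ms <= trial.latency_ms < gesture_ms)


def engagement_measures(trials: List[EngagementTrial]) -> Dict[str, Dict[str, Optional[float]]]:
    """Return propensity and mean latency of initial orientation per valence."""
    kept = [t for t in trials if is_included(t)]
    result: Dict[str, Dict[str, Optional[float]]] = {}
    for valence in ("positive", "negative"):
        latencies = [t.latency_ms for t in kept if t.first_valence == valence]
        result[valence] = {
            "propensity": len(latencies) / len(kept) if kept else None,
            "speed_ms": mean(latencies) if latencies else None,
        }
    return result
```

Returning `None` when no trials survive the criteria keeps empty cells distinguishable from a genuine propensity of zero.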

2.4 Disengagement speed

Trials assessing disengagement are included for analysis if (a) fixation is present at the location of the cross cue immediately prior to gesture onset, (b) a saccade away from the emotional face is made before offset, and (c) this critical saccade occurs at least 100 ms following gesture onset. From the included trials, disengagement speed may be calculated as the mean latency to saccade away from positive and negative stimuli.
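A fixation-based sketch of this measure is given below. Because the data preparation favours fixation detection over saccade detection, the latency to saccade away is approximated here by the onset of the first fixation that lands off the cued face; that approximation, the region labels, and the function names are assumptions made for illustration. Times are in ms relative to gesture onset.

```python
from statistics import mean
from typing import Dict, List, Optional, Tuple

RegionFixation = Tuple[float, float, str]  # (start_ms, end_ms, region label)


def disengagement_latency(fixations: List[RegionFixation],
                          cued_region: str,
                          min_latency_ms: float = 100.0,
                          gesture_ms: float = 6000.0) -> Optional[float]:
    """Return the latency to leave the cued face, or None if the trial is excluded."""
    # (a) a fixation on the cued face must span gesture onset (time 0)
    on_cue = any(start <= 0.0 <= end and region == cued_region
                 for start, end, region in fixations)
    if not on_cue:
        return None
    for start, _end, region in sorted(fixations):
        if start > 0.0 and region != cued_region:
            # (b) must occur before gesture offset; (c) must not be anticipatory
            return start if min_latency_ms <= start < gesture_ms else None
    return None


def disengagement_speed(trials: List[Tuple[str, List[RegionFixation], str]]
                        ) -> Dict[str, Optional[float]]:
    """Mean latency per cued valence; each trial is (valence, fixations, cued_region)."""
    by_valence: Dict[str, List[float]] = {"positive": [], "negative": []}
    for valence, fixations, cued_region in trials:
        latency = disengagement_latency(fixations, cued_region)
        if latency is not None:
            by_valence[valence].append(latency)
    return {v: (mean(l) if l else None) for v, l in by_valence.items()}
```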

2.5 Sustained attentional processing

To assess the manner in which attention is sustained throughout the speech, total fixation time to each of the positive, negative, neutral and non-face regions of the display is summed over the duration of the speech. The non-face region refers to the display area in between and around the confederate faces.
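The summation can be sketched as below, assuming each fixation has already been assigned to one of the four region categories. The category labels and the additional proportion output are illustrative assumptions rather than part of the described method.

```python
from typing import Dict, List, Tuple

CategorizedFixation = Tuple[float, float, str]  # (start_ms, end_ms, category)
CATEGORIES = ("positive", "negative", "neutral", "non_face")


def total_fixation_time(fixations: List[CategorizedFixation]) -> Dict[str, float]:
    """Sum fixation durations (ms) per region category over the whole speech."""
    totals = {category: 0.0 for category in CATEGORIES}
    for start, end, category in fixations:
        totals[category] += end - start
    return totals


def fixation_proportions(totals: Dict[str, float]) -> Dict[str, float]:
    """Express each total as a share of all fixation time (0.0 if there is none)."""
    grand_total = sum(totals.values())
    if grand_total == 0.0:
        return {category: 0.0 for category in totals}
    return {category: t / grand_total for category, t in totals.items()}
```

Reporting proportions alongside raw totals is one way to compare participants whose overall amount of valid fixation time differs, for example because of signal loss while speaking.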

3 Expected Outcomes and Implications

If SAD is associated with an engagement bias to threat, it is anticipated that socially anxious individuals, relative to controls, will exhibit a greater propensity to initially orient to threat, possibly at a faster speed. If SAD is associated with a disengagement bias from threat, then it is expected that socially anxious individuals, relative to controls, will be slower to disengage from threat gestures. In addition, if socially anxious individuals employ avoidance strategies in an attempt to self-regulate, it is likely that this will be reflected in reduced total fixation time to emotional social stimuli, and a relative increase in total fixation time towards non-face regions in between and around the confederate faces. Such anxiety-linked findings would validate previous empirical and theoretical literature using a novel simulation-based methodology, making it possible to establish whether those findings generalize to realistic settings. Moreover, given the consistent and evidently causal relationship between social anxiety and attention (e.g. Amir et al. 2009; MacLeod et al. 2002), the online assessment of attentional selectivity under conditions of psychosocial stress may be useful as a potential marker for SAD, and as a quantitative index of symptom reduction in the clinical treatment of SAD.