Introduction

People with autism spectrum disorder (ASD) show pervasive impairments in visual attention, including reduced attention towards social stimuli but enhanced attention towards non-social stimuli (Dawson et al. 2005; Sasson et al. 2011). Such atypical preferences are evident early in infancy (Osterling and Dawson 1994), and circumscribed attentional patterns in eye tracking data can be found in 2–5 year-olds (Sasson et al. 2011) as well as in older children and adolescents (Sasson et al. 2008). To better understand atypical social attention in autism, there is a growing trend to employ more natural stimuli (e.g., complex scenes taken with a natural background), which have greater ecological validity and likely provide a better understanding of how people with ASD deploy attention in the real world (Ames and Fletcher-Watson 2010). Using natural scene stimuli, people with ASD have demonstrated reduced attention to social scenes (Birmingham et al. 2011; Chawarska et al. 2013) and socially salient aspects of the scenes (Rice et al. 2012; Shic et al. 2011), and reduced attentional bias toward threat-related scenes when presented with pairs of emotional or neutral images (Santos et al. 2012). In particular, our recent study provided a comprehensive investigation of eye tracking in autism using over 5000 regions annotated in 700 images and a computational saliency model (Wang et al. 2015a).

One important open question, however, is where in the brain such abnormal social attention arises. A specific neural structure hypothesized to underlie such deficits in ASD is the amygdala (Baron-Cohen et al. 2000; Bauman and Kemper 1985). The idea of amygdala abnormalities in autism is supported by a substantial literature showing structural abnormalities (Amaral et al. 2008; Bauman and Kemper 1985; Ecker et al. 2012; Schumann and Amaral 2006; Schumann et al. 2004) and atypical activation (Gotts et al. 2012; Philip et al. 2012). The contribution of amygdala dysfunction to social deficits in autism is further supported by similar patterns of deficits seen in patients with amygdala damage. In particular, an amygdala lesion patient unable to recognize fearful faces did not fixate on the eyes in faces (Adolphs et al. 2005). Furthermore, single-neuron recordings in the human amygdala revealed a weaker response to eyes in people with ASD (Rutishauser et al. 2013), and a neuroimaging study showed that amygdala-mediated orientation towards the eyes is dysfunctional in ASD (Kliemann et al. 2012).

Although the majority of prior studies focused on investigating whether the abnormal facial scanning patterns in people with ASD could be attributed to the amygdala, it still remains largely unknown whether amygdala dysfunction directly results in impaired visual attention and exploration when viewing complex natural scene stimuli. In this study, gaze patterns were directly compared between people with ASD and a rare patient with bilateral amygdala lesions using an effective free viewing task and a set of well characterized natural scene stimuli that showed systematic differences between people with ASD and matched controls in visual attention (Wang et al. 2015a).

Methods

Participants

Twenty high-functioning adults with ASD (3 female; mean age ± SD: 30.8 ± 11.1 years), 19 neurologically and psychiatrically healthy participants with no family history of ASD (3 female; 32.3 ± 10.4 years), and a rare patient with bilateral amygdala lesions (female; 43 years) were recruited. The detailed demographic information of participants is provided in Table S1. Participants gave written informed consent and the experiments were approved by the Caltech Institutional Review Board. We verbally confirmed with all participants that they were able to see all fine details of the pictures on the screen. To the best of our knowledge, no ASD or control participant was taking medication (including neuroleptic or psychotropic medication) during the study, and all were free from epilepsy and had suffered no encephalic trauma. Furthermore, we confirmed that all participants were in good medical health during the testing.

All ASD participants met DSM-IV/ICD-10 diagnostic criteria for autism, and all met the cutoff scores for ASD on the Autism Diagnostic Observation Schedule (ADOS) (Lord et al. 1989) (Table S1). We assessed Intelligence Quotient (IQ) for participants using the Wechsler Abbreviated Scale of Intelligence (WASI™). The ASD group had a full scale IQ of 108.0 ± 15.6 (mean ± SD) and controls had a comparable full scale IQ of 108.2 ± 9.6 (t-test, P = 0.95). The groups were also matched for age (t-test, P = 0.66), gender, race and education. As expected, the ASD group had higher scores than controls on the Social Responsiveness Scale-2 Adult Form Self Report (SRS-AR) (ASD: 83.8 ± 18.5; control: 34.8 ± 16.4; P = 8.46 × 10⁻⁷) and Autism Spectrum Quotient (AQ) (ASD: 29.6 ± 7.1; control: 15.2 ± 4.8; P = 5.45 × 10⁻⁸). The amygdala lesion patient, BG, has Urbach-Wiethe disease (UWD) (Hofer 1973), a condition that caused complete bilateral destruction of the basolateral amygdala and variable lesions of the remaining amygdala while sparing the hippocampus and all neocortical structures. BG was diagnosed with UWD late in childhood, following a grand mal seizure at age 12 (Patin and Hurlemann 2016). Recent genetic findings in BG show a novel homozygous missense mutation in exon 7 and a resulting switch from tryptophan 237 to arginine (c.709T > C; p.W237R). The p.Trp237Arg extracellular matrix protein 1 gene (ECM1) mutation is likely an underlying source of pathological changes in BG’s phenotype (Becker et al. 2012; Patin and Hurlemann 2016).

Stimuli and Task

A free-viewing task with natural scene images from the OSIE dataset was employed (see Fig. 1 for examples). This dataset has been characterized and described in detail previously (Xu et al. 2014). Briefly, the dataset contains 700 images, and each image contains multiple dominant objects in a scene. Participants freely viewed the 700 images for 3 s each, presented in random order. Images were randomly grouped into seven blocks of 100 images each. It is worth noting that the task and stimuli used in this study have been shown to be effective in detecting atypical visual attention in people with ASD across multiple levels and categories of objects (Wang et al. 2015a): (1) people with ASD look more at image centers, even when there is no object at the center; (2) people with ASD fixate more on regions with pixel-level saliency and less on regions with object-level and semantic-level saliency; and (3) people with ASD are slower to fixate on faces, but faster to fixate on mechanical and manipulable objects. More information can be found in Supplementary Methods.

Fig. 1

Example stimuli with gaze densities from the amygdala lesion patient, people with ASD, and controls. Heat maps represent the gaze density. The similarity metrics are shown between the images. a–c Images that are more similar between the amygdala lesion patient and controls. d–f Images that are more similar between the amygdala lesion patient and people with ASD

In this study, the data from people with ASD and controls reported in Wang et al. (2015a) were re-analyzed, and new data from the amygdala lesion patient, collected using the same stimuli and task, were included. It is also worth noting that Wang et al. (2015a) employed a saliency modeling approach and focused on comparing people with ASD and controls, whereas the present study employed a gaze similarity approach (Kennedy et al. 2017) (see below) and focused on comparing the amygdala lesion patient with people with ASD or controls.

Gaze Similarity Analyses

Gaze density maps were created to assess gaze similarity (Kennedy et al. 2017). Gaze density maps were derived by smoothing gaze locations using a 2D Gaussian kernel (size = 15 pixels, SD = 7.5 pixels) and were then normalized within each image. Gaze density maps represent the likelihood of looking at a particular location of the stimulus and are shown in arbitrary units. Similarity in gaze pattern for an image was then assessed by correlating two gaze density maps for that image. To calculate correlation coefficients between density maps, the maps were vectorized (i.e., converted to a single column of values), and Pearson correlation coefficients between these vectors were then calculated. Pearson correlations are a straightforward method to capture similarity and dissimilarity, and match closely with human intuition (Kennedy et al. 2017). To calculate the similarity between the amygdala lesion patient and people with ASD or controls, the gaze density map from the amygdala lesion patient was first correlated with the corresponding gaze density map from each individual participant with ASD or control participant. The resulting correlation coefficients were then averaged across participants for each image.
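The procedure above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, gaze locations are assumed to be integer pixel coordinates, "normalized within each image" is assumed to mean scaling the map to sum to 1 (the Pearson correlation is unaffected by this choice), and only the Python standard library is used so that the example runs without dependencies.

```python
import math


def gaussian_kernel(size=15, sd=7.5):
    """Return a size x size 2D Gaussian kernel as nested lists (unnormalized)."""
    half = size // 2
    return [[math.exp(-(dx * dx + dy * dy) / (2.0 * sd * sd))
             for dx in range(-half, half + 1)]
            for dy in range(-half, half + 1)]


def gaze_density_map(gaze_xy, width, height, size=15, sd=7.5):
    """Stamp a Gaussian kernel at each gaze location, then normalize the map.

    Assumption: normalization scales the map so that it sums to 1.
    """
    kernel = gaussian_kernel(size, sd)
    half = size // 2
    density = [[0.0] * width for _ in range(height)]
    for x, y in gaze_xy:
        for ky, row in enumerate(kernel):
            for kx, weight in enumerate(row):
                px, py = x + kx - half, y + ky - half
                if 0 <= px < width and 0 <= py < height:
                    density[py][px] += weight
    total = sum(map(sum, density))
    if total == 0:
        return density
    return [[v / total for v in row] for row in density]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def gaze_similarity(map_a, map_b):
    """Vectorize two density maps and correlate them."""
    flat_a = [v for row in map_a for v in row]
    flat_b = [v for row in map_b for v in row]
    return pearson(flat_a, flat_b)


def mean_similarity_to_group(patient_map, group_maps):
    """Correlate the patient's map with each participant's map, then average."""
    rs = [gaze_similarity(patient_map, m) for m in group_maps]
    return sum(rs) / len(rs)
```

For one image, `mean_similarity_to_group(patient_map, control_maps)` would give the similarity with controls, and the same call with the ASD participants' maps would give the similarity with the ASD group.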

Further methods of eye tracking, data validity, and permutation analysis are described in Supplementary Methods.

Results

Similarity Between Gaze Pattern

A similarity metric (see “Methods” section) was used to directly compare the gaze pattern of the amygdala lesion patient with that of people with ASD or controls (Fig. 1). The amygdala lesion patient’s gaze pattern was more similar to that of controls (similarity metric in correlation coefficient r: 0.50 ± 0.13 (mean ± SD)) than to that of the ASD group (0.45 ± 0.12; Fig. 2a, b; two-tailed paired t-test on similarity metric: t(699) = 25.0, P = 8.13 × 10⁻⁹⁹, effect size in Hedges’ g (standardized mean difference): g = 0.44, permutation P < 0.001), and the patient’s gaze pattern was more similar to that of controls for most of the images (580 for controls vs. 119 for ASD). This was the case for both social images (i.e., images containing faces; Fig. 2c; similarity with controls: 0.53 ± 0.13, similarity with ASD: 0.47 ± 0.12; t(520) = 22.7, P = 5.60 × 10⁻⁸⁰, g = 0.47, permutation P < 0.001) and non-social images (Fig. 2d; similarity with controls: 0.45 ± 0.12, similarity with ASD: 0.40 ± 0.11; t(178) = 10.7, P = 5.65 × 10⁻²¹, g = 0.39, permutation P < 0.001), but the difference in similarity metric was greater for social images (social: 0.059 ± 0.059, non-social: 0.046 ± 0.057; two-tailed unpaired t-test: t(698) = 2.57, P = 0.010, g = 0.22, permutation P = 0.010). Overall, this suggests that the amygdala lesion patient’s greater similarity to controls was more pronounced when viewing social images.
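As a rough illustration of the statistics reported above, the sketch below computes a paired t statistic and Hedges’ g from two per-image similarity vectors. The function names are hypothetical. The effect size formula is an assumption: the text describes g only as a standardized mean difference, so the sketch uses the common pooled-SD form with the usual small-sample correction, which may differ from the exact variant used in the study. P values are omitted because they would require a t distribution (e.g., from SciPy).

```python
import math
from statistics import mean, stdev


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for per-image values a and b."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


def hedges_g(a, b):
    """Hedges' g: pooled-SD standardized mean difference with bias correction.

    Assumption: pooled SD of the two sets of values (not the SD of the
    paired differences), multiplied by J = 1 - 3 / (4 * (na + nb) - 9).
    """
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    d = (mean(a) - mean(b)) / math.sqrt(pooled_var)
    correction = 1.0 - 3.0 / (4.0 * (na + nb) - 9.0)
    return d * correction


# Toy per-image similarities (illustrative values, not data from the study).
sim_controls = [0.5, 0.6, 0.4, 0.7]
sim_asd = [0.4, 0.55, 0.3, 0.6]
t_value, dof = paired_t(sim_controls, sim_asd)
g_value = hedges_g(sim_controls, sim_asd)
```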

Fig. 2

Summary of similarity metric. a All images. Each dot represents the similarity metric of an image. Red: similarity between the amygdala lesion patient and the ASD group. Blue: similarity between the amygdala lesion patient and controls. Individual values are shown on the left and average values are shown on the right. Error bars denote one SEM across images. Asterisks indicate significant difference using two-tailed paired t-test. ***P < 0.001. b Histogram of similarity metric of all images. c Social images. d Non-social images. e Images with more objects. f Images with fewer objects

In addition, the amygdala lesion patient’s gaze pattern was more similar to that of controls than to that of people with ASD both for images with more semantic objects (more than the median of 7 objects; Fig. 2e; similarity with controls: 0.48 ± 0.13, similarity with ASD: 0.43 ± 0.12; t(304) = 13.9, P = 2.05 × 10⁻³⁴, g = 0.35, permutation P < 0.001) and for images with fewer semantic objects (Fig. 2f; similarity with controls: 0.53 ± 0.13, similarity with ASD: 0.46 ± 0.12; t(394) = 21.3, P = 1.50 × 10⁻⁶⁷, g = 0.52, permutation P < 0.001), although the difference in similarity metric was greater for images with fewer objects (more: 0.044 ± 0.055, fewer: 0.064 ± 0.060; two-tailed unpaired t-test: t(698) = 4.58, P = 5.42 × 10⁻⁶, g = 0.35, permutation P < 0.001), suggesting that the amygdala lesion patient’s greater similarity to controls was more pronounced when viewing images with fewer objects.

It is worth noting that the chance similarity obtained by shuffling image order was 0.12 ± 0.10 for controls and 0.11 ± 0.10 for ASD, and the similarity with people with ASD was even slightly higher (t(699) = 8.66, P = 3.24 × 10⁻¹⁷, g = 0.11, permutation P = 0.040). This might be because all participants started viewing from the image center, given the preceding central fixation cross, and people with ASD have previously been shown to have a stronger image center bias (Wang et al. 2015a).
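One way to obtain such a chance level is sketched below: the patient’s density map for each image is correlated with a comparison map from a different image. This is only an illustration under stated assumptions. The function names and the fixed seed are hypothetical, the maps are assumed to be already vectorized, and re-shuffling until no image is paired with itself is a choice made here for clarity, not a detail given in the text.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def chance_similarity(patient_maps, group_maps, seed=0):
    """Mean similarity after shuffling which image each patient map is paired with.

    patient_maps[i] and group_maps[i] are vectorized density maps for image i.
    Assumption: the shuffle is repeated until no image keeps its own index.
    """
    n = len(patient_maps)
    if n < 2:
        raise ValueError("need at least two images to shuffle")
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    while any(i == j for i, j in enumerate(order)):
        rng.shuffle(order)
    rs = [pearson(patient_maps[i], group_maps[j]) for i, j in enumerate(order)]
    return sum(rs) / n
```

Comparing the value returned here with the matched-image similarity indicates how much of the observed similarity is specific to image content rather than to shared viewing biases such as the center bias.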

Taken together, the amygdala lesion patient’s gaze pattern was more similar to that of controls than to that of people with ASD, suggesting that the amygdala lesion did not result in the abnormal gaze pattern observed in people with ASD. The comparison between people with ASD and controls using this similarity measure is further shown in Supplementary Results.

Temporal Evolution of Gaze Similarity

The temporal dynamics of gaze similarity were characterized by analyzing the similarity in gaze density in six consecutive 500 ms bins. Similarity significantly dropped with time (Fig. 3a; two-way repeated-measures ANOVA of time bin × participant category; main effect of time bin: F(5,6984) = 1864, P < 10⁻⁵⁰, η² = 0.53), suggesting that the amygdala lesion patient had a different temporal scan path compared with people with ASD or controls. However, the amygdala lesion patient was still more similar to controls than to people with ASD for each bin (Fig. 3a; main effect of participant category: F(1,6984) = 25.3, P = 5.44 × 10⁻⁷, η² = 0.0029; comparison for each time bin: all Ps < 0.001). Similar results were found for social images only (Fig. 3b), non-social images only (Fig. 3c), images with more objects (Fig. 3d), and images with fewer objects (Fig. 3e). Together, the present results suggest that although the amygdala lesion patient had a different visual scan path, the gaze pattern was still more similar to that of controls, and this was independent of the image content.
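The binning step can be sketched as follows, assuming each gaze sample carries a timestamp in milliseconds relative to image onset. The function name and the `(t_ms, x, y)` sample format are hypothetical. A density map would then be built from each bin separately and correlated as described in the Methods.

```python
def bin_gaze_samples(samples, bin_ms=500, n_bins=6):
    """Group (t_ms, x, y) gaze samples into consecutive time bins.

    t_ms is the time since image onset. Samples outside the window
    [0, bin_ms * n_bins) are dropped.
    """
    bins = [[] for _ in range(n_bins)]
    for t_ms, x, y in samples:
        index = int(t_ms // bin_ms)
        if t_ms >= 0 and index < n_bins:
            bins[index].append((x, y))
    return bins


# Toy samples covering a 3 s trial (illustrative values only).
toy_samples = [(0, 400, 300), (499, 410, 305), (500, 200, 150), (2999, 600, 450)]
toy_bins = bin_gaze_samples(toy_samples)
```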

Fig. 3

Temporal evolution of similarity metric. a All images. b Social images. c Non-social images. d Images with more objects. e Images with fewer objects. Shaded area denotes one SEM across images. Asterisks indicate significant difference using two-tailed paired t-test. *P < 0.05, **P < 0.01, and ***P < 0.001

Region of Interest (ROI) Analysis

Was the difference in similarity specific to certain semantic categories? One major advantage of the natural scene stimuli used in this study was that there was a broad range of different semantic categories that could be compared. An ROI analysis was conducted to examine the percentage of gazes for each semantic category. The semantic categories include face, emotion, touched, gazed, motion, sound, smell, taste, touch, text, watchability, and operability.
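A simplified sketch of such an ROI analysis is given below. It makes several assumptions not stated in the text: ROIs are represented as axis-aligned bounding boxes (the actual annotations may be arbitrary regions), gaze is represented as discrete sample points rather than a density map, and the function name is hypothetical. A gaze sample is counted once per category even if it falls in several ROIs of that category.

```python
def roi_gaze_percentages(gaze_xy, rois):
    """Percentage of gaze samples that fall inside ROIs of each semantic category.

    rois is a list of (category, (x_min, y_min, x_max, y_max)) tuples.
    Categories may overlap, so the percentages need not sum to 100.
    """
    counts = {category: 0 for category, _ in rois}
    for x, y in gaze_xy:
        hit = set()
        for category, (x0, y0, x1, y1) in rois:
            if x0 <= x <= x1 and y0 <= y <= y1:
                hit.add(category)
        for category in hit:
            counts[category] += 1
    total = len(gaze_xy)
    if total == 0:
        return {category: 0.0 for category in counts}
    return {category: 100.0 * k / total for category, k in counts.items()}


# Toy example: "emotion" overlaps "face", as emotional faces are a subset of faces.
toy_rois = [("face", (10, 10, 30, 30)), ("emotion", (10, 10, 30, 30)), ("text", (50, 50, 60, 60))]
toy_gaze = [(15, 15), (20, 25), (55, 55), (90, 90)]
toy_percentages = roi_gaze_percentages(toy_gaze, toy_rois)
```

In this toy case the percentages for "face", "emotion", and "text" sum to more than 100, which is expected when categories overlap.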

The amygdala lesion patient did not have a significant difference compared with controls for the majority of semantic categories (Fig. 4; except motion, watchability, and touched), but showed a higher percentage of gazes onto semantic ROIs compared with people with ASD (Fig. 4), again confirming the result that the amygdala lesion patient had a more similar gaze pattern compared with controls rather than people with ASD. When comparing between people with ASD and controls (Fig. 4), people with ASD had reduced saliency for semantic features, confirming the previous finding (Wang et al. 2015a).

Fig. 4

Percentage of gaze density in each ROI. Note that because ROI categories might overlap with each other (e.g., “emotion” was a subset of “face”), the percentages did not add up to 100%. Error bars denote one SEM across images. Asterisks indicate significant difference using two-tailed paired t-test. +P < 0.1, *P < 0.05, **P < 0.01, and ***P < 0.001. ns not significant

Discussion

In this study, gaze patterns were directly compared between an amygdala lesion patient and people with ASD when participants freely viewed static images of complex natural scenes. Although the amygdala lesion did not result in atypical visual saliency similar to that observed in people with ASD, it is still possible that more subtle malfunction of the amygdala could contribute to ASD, even though a bona fide lesion of the amygdala has no effect that bears similarity to ASD [see also (Paul et al. 2010)]. Autism spectrum disorders are well known to be highly heterogeneous at the biological and behavioral levels, and it is likely that there will be no single cause for the diverse symptoms defining autism (Happe et al. 2006). It is also important to note that a multitude of factors can fundamentally alter the behavioral manifestations of a lesion, including the etiology and developmental time course of the lesion, the extent of damage, the brain’s compensation following the damage, and the unique personality and set of life experiences of each individual lesion case (Feinstein et al. 2016). Although amygdala lesion patients in general show consistent responses [see (Terburg et al. 2012; Wang et al. 2015b, 2017)], the present findings will need to be replicated in additional amygdala lesion patients.

Human neuroimaging studies show that the typically developing amygdala continues to undergo substantial growth throughout childhood and well into adolescence (Schumann et al. 2004); and the amygdala may play a more critical role early in the developmental neurobiology of social and emotional orienting, but a less relevant role in this process in adulthood (Schumann et al. 2011). It is worth noting that our patient’s amygdala lesion is developmental in nature—calcification of the amygdala in UWD seems to start early in childhood and possibly even at birth, although this may depend on the type of ECM1 mutation, and slowly progresses over the course of adolescence and adulthood (van Honk et al. 2016). Due to the critical involvement of the amygdala in emotional and social learning, the behavioral presentation of a developmental lesion may differ from that of an adult-onset lesion (Feinstein et al. 2016; Hamann et al. 1996). In addition, the amygdala lesion patient might have compensatory function provided by other brain regions over time [e.g., (Becker et al. 2012)]. These may account for the present findings; and an important future direction is to replicate the present findings in patients with acute-onset amygdala lesions.

Faces are among the most commonly perceived visual stimuli and people preferentially attend to faces (Wang and Adolphs 2017). Abnormal face processing in autism has been hypothesized to arise from amygdala dysfunction (Baron-Cohen et al. 2000), and direct evidence supporting this hypothesis comes from both single-neuron recordings in the human amygdala (Rutishauser et al. 2013) and neuroimaging studies (Dalton et al. 2005; Kliemann et al. 2012). However, the present findings showed that an amygdala lesion resulted neither in the abnormal gaze patterns seen in people with ASD when viewing images containing faces (Fig. 2c) nor in reduced gaze onto faces (Fig. 4). One possible reason is that previous studies used isolated faces whereas the faces used in the present study were embedded in complex scenes with other competing visual information. Although the present study cannot distinguish whether reduced attention to faces in people with ASD is due to competing visual information in the scene or reduced saliency of eyes as shown in the literature (Wang and Adolphs 2017), it offers a more ecologically relevant condition to study visual attention and exploration in autism and thus provides a platform to further investigate these possibilities.