Introduction

Individuals with autism spectrum disorder (ASD) display deficits in three main areas: communication, social interaction, and restrictive and repetitive behaviours and interests (American Psychiatric Association 2000). The social deficits that are characteristic of ASD are perhaps the most devastating (Volkmar et al. 1987; Wing and Gould 1979). Disturbances in emotion understanding and perception exemplify such social deficits. Understanding the perception of facial expressions in individuals with ASD thus helps us understand the primary deficits in ASD. The current study focuses on the strategies used in the perception of emotional facial expressions.

There is not currently a consensus regarding strengths and deficits in emotional face processing in autism (see Jemel et al. 2006, for review). Some studies have reported deficits in matching facial expressions (Celani et al. 1999) or intermodal perception of expressions (Loveland et al. 1995). Individuals with ASD appear to scan expressive faces differently than typical individuals (Pelphrey et al. 2002) and may focus more on individual features than typical individuals do (Klin et al. 2002). Gross (2004) demonstrated that children with ASD focused on the lower portions of the face when processing facial expressions, and performance of children with ASD did not improve when the whole face, rather than isolated parts, was presented, as it did for typical children. Spezio et al. (2007) reported that although high-functioning adults with ASD correctly identified facial expressions in a “Bubbles” task, which presents faces with only randomly selected visible parts, they used an abnormal strategy to complete the task. As a whole, this evidence suggests that individuals with ASD may rely on the individual features to a greater extent than typical individuals when processing emotional facial expressions.

Others have not found any group differences in expression processing (e.g., Ozonoff et al. 1990; Gepner et al. 2001). One possibility for the discrepancies in results across studies is that while typical individuals use specialized, automatic perceptual processes, those with ASD may employ an alternative, more deliberate strategy for processing faces and emotional facial expressions (see Harms et al. 2010, for review). McPartland et al. (2004) reported that individuals with ASD showed longer latencies for the face sensitive N170 ERP component for faces compared to objects, indicating slower information processing speed specific to faces. Behrmann et al. (2005) reported that while there were no group differences in accuracy between ASD participants and controls for identity and gender face processing tasks, the ASD participants had slower reaction times even though the authors employed several methods to control for overall group differences in processing speed. Capps et al. (1992) reported that although children with ASD did not demonstrate difficulties on tasks involving labeling emotions, they took longer to answer and their answers sounded rehearsed. Taken together, these studies suggest that individuals with ASD may use an atypical, less automatic strategy when processing facial expressions.

If typical emotion perception involves specialized perceptual processes such as an intuitive and automatic prototype matching strategy, while those with ASD lack this intuitive and automatic perception of facial expressions, those with ASD might rely on an alternative compensatory strategy to process facial expressions. For example, those with ASD might use a rule-based strategy whereby memorized lists of characteristics define emotional expressions, leading them to look for the presence of these specific characteristics when performing emotional perception tasks. Under such a strategy, if the “rules” for sadness (corners of the mouth turned down, lowered eyebrows) are present in a face display, then the face is labeled sad. The more of these rules that are present, or the more intense they are, the more likely the face is to be perceived as a good representation of a sad face.

An experimental design that manipulates the level of exaggeration in emotional facial expressions has been used to test the idea that individuals with ASD employ a rule-based strategy when processing emotional expressions, while typical individuals do not. Rutherford and McIntosh (2007) presented participants with stylistic drawings of faces displaying each of the six basic emotions (happy, sad, fear, anger, disgust, surprise). The faces within each emotion category varied on level of intensity, and included unnaturally exaggerated expressions. Participants were presented with pairs of faces displaying the same emotion at different levels of intensity and asked to choose the face that looked like a real person would look if they were really feeling that emotion. The results indicated that participants with ASD were more likely than typical participants to choose the most exaggerated faces. The authors suggested that participants with ASD were employing a deliberate rule-based emotion perception strategy whereas the typical participants were employing an intuitive prototype matching strategy. For participants with ASD, the more exaggerated faces better exemplified the “rules” or feature characteristics for each emotion and therefore were likely accepted as a better representation of the given emotion.

The current experiment is designed to replicate the findings of Rutherford and McIntosh (2007) using more ecologically valid stimuli and to provide stronger evidence that individuals with ASD use an alternative strategy when processing facial expressions. Specifically, we used a task similar to that of Rutherford and McIntosh (2007), here called the Emotions Task, to examine whether typical individuals and those with ASD employ the same types of strategies when perceiving emotional expressions in photographic images of faces. In addition, the current study has two new control tasks that are designed to further test the hypothesis that the rule-based strategy used by those with ASD to perceive emotional facial expressions is specific to categorizing expressions, and not used in face processing generally. In the Realism Task, we use the same photographs and procedure as the Emotions Task, but we ask participants to judge how realistic the faces are. We expect results in the Realism Task to differ from those in the Emotions Task, because the rule-based strategy, by hypothesis, is used to categorize emotions and would not be relevant to the Realism Task despite the fact that the faces show emotional expressions. We also added a Discrimination Task to ensure that all participants, particularly the ASD group, are able to discriminate between all the faces used in the experiment. Results from this control task will allow us to ensure that any group differences found in the Emotions and Realism Tasks are not a result of ASD participants being unable to discriminate among the levels of expression exaggeration.

Methods

Materials

Photographs were taken of 3 male and 3 female undergraduate students displaying happy, sad, and neutral facial expressions. Using PsychoMorph face morphing software (Tiddeman et al. 2001, 2005), composite faces composed of all six faces were created for each facial expression. The composite faces were created by first placing many individual point-landmarks on each individual face, then averaging the position of each landmark across the individual faces to create an average face shape, and finally averaging the colour content across the individual faces. From these composite, or 100 % expression, photographs, a continuum of varying intensity levels was created for each emotion. These were created by calculating the physical difference between the neutral and happy/sad composite faces and then extrapolating beyond the 100 % face in 50 % increments to create the different levels of exaggerated faces (Calder et al. 2000). Each continuum included 5 images: 100, 150, 200, 250, and 300 %. The eye and mouth areas from these faces were pasted onto the same neutral face outline to create the face images used in the experiment (see Fig. 1), so that the contour of the face did not change between images. The final set of stimuli comprised 10 greyscale images, cropped to 950 by 450 pixels. All tasks were presented and data were compiled in MATLAB Student Version 7.4 on the same 17-inch desktop Macintosh Dual 2.7 GHz PowerPC G5 computer with OS X operating system. Participants sat with their chin in a chin rest 60 inches from the monitor.
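The extrapolation step can be illustrated with a minimal sketch. It covers landmark positions only, assumes simple linear extrapolation from the neutral composite, and uses hypothetical coordinates; it is not the PsychoMorph implementation, which also transforms image colour.

```python
def extrapolate_landmarks(neutral, expressive, level_percent):
    """Return landmark positions at a given intensity level.

    100 reproduces the composite expression; values above 100
    extrapolate beyond it along the neutral-to-expression direction.
    """
    scale = level_percent / 100.0
    return [
        (nx + scale * (ex - nx), ny + scale * (ey - ny))
        for (nx, ny), (ex, ey) in zip(neutral, expressive)
    ]


# Hypothetical landmarks (x, y): two mouth corners on the neutral
# and happy composites. Real faces would use many more points.
neutral = [(40.0, 80.0), (60.0, 80.0)]
happy = [(38.0, 78.0), (62.0, 78.0)]

levels = [100, 150, 200, 250, 300]
continuum = {
    level: extrapolate_landmarks(neutral, happy, level) for level in levels
}
```

In this sketch each 50 % step moves every landmark by half of its neutral-to-expression displacement, so the 300 % face lies three times as far from neutral as the 100 % composite.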

Fig. 1 Happy and sad faces at the 100, 150, 200, 250, and 300 % levels

Participants

Participants were 20 high-functioning adults (13 male, average age 26.85 years, range 18–39) with a diagnosis of autism or Asperger’s syndrome and 19 typical adults (16 male, average age 28.95 years, range 20–40). The groups did not differ in chronological age or IQ (see Table 1 for demographic information).

Table 1 Chronological age and IQ of participants

Participants with ASD were recruited from a local assisted living group home as well as from a database of individuals who had previously participated in research studies in our lab. The typical participants were recruited off-campus, via online advertising. Those with ASD were free from other medical conditions. The participants with ASD had been given a diagnosis of autism or Asperger’s syndrome by outside agencies, and were also evaluated using the ADOS-G (Lord et al. 2000) Module 4 to confirm diagnosis and group membership for this study (see Table 2). All participants had normal or corrected-to-normal vision. Participants were given a small honorarium for their participation in the study.

Table 2 ADOS scores for ASD participants

Procedure

Each participant completed the following tasks, in the following order: the Emotions Task, the Realism Task, and the Discrimination Task. For the Emotions Task, participants completed a forced-choice key press task for two side-by-side face images depicting two different levels of intensity of the same emotion. For the test trials, happy and sad faces were presented in separate blocks and the order of blocks was counterbalanced across participants. Participants saw each possible pairing of the 100, 150, 200, 250, and 300 % images twice, to balance the side on which each level of exaggeration was presented. In total there were 20 trials presented in a pseudo-random order for each block. For each trial, two images displaying the same expression were displayed side-by-side and the question “Which one looks like a REAL person looks if they feel happy/sad?” appeared at the top of the computer screen. Participants pressed the “A” key, which was labeled “Left” with a removable paper label, for the face on the left and the “L” key, which was labeled “Right”, for the face on the right. The images remained on the screen until the participant made their response, and then the next trial immediately began. Participants were told that all the faces were created by the computer, so none of them were actual real people’s faces, but that they should choose the one that was closest to what real people look like when they feel that emotion.
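The trial count for one block follows from the pairing scheme: 5 intensity levels give 10 unordered pairs, and presenting each pair once per side assignment gives 20 trials. The sketch below shows this under the assumption that the pseudo-random order is a plain shuffle; the function name and the seeding are hypothetical, and any ordering constraints the authors may have applied are not reproduced.

```python
import itertools
import random

LEVELS = [100, 150, 200, 250, 300]


def build_pair_trials(levels, rng):
    """Return a shuffled list of (left_level, right_level) trials.

    Each unordered pair of levels appears twice, once with the more
    exaggerated face on the right and once with it on the left.
    """
    trials = []
    for low, high in itertools.combinations(levels, 2):
        trials.append((low, high))  # more exaggerated face on the right
        trials.append((high, low))  # more exaggerated face on the left
    rng.shuffle(trials)  # assumption: unconstrained shuffle
    return trials


trials = build_pair_trials(LEVELS, random.Random(0))
```

Under this scheme every level appears in 8 of the 20 trials, 4 times on each side.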

Prior to the test trials, participants completed 16 practice trials consisting of happy, sad, and neutral faces each paired with each other and paired with each test question. Participants were required to obtain 75 % accuracy before moving on to the test trials and were allowed up to three attempts to reach this criterion. These trials were included to ensure that participants understood the question being asked and the key press procedure. All participants in both groups met the criterion on the first or second attempt.

The Realism Task was identical to the Emotions Task except that participants were asked, “Which face is the most realistic?”

The Discrimination Task was designed to evaluate whether participants were able to discriminate between the various levels of exaggeration. Participants completed a forced-choice key press task for three images that were displayed in a triangle configuration. The three faces displayed the same emotion; however, two images were the same intensity level (e.g., 100 %) and one target image was a different level of intensity (e.g., 200 %). The location of the target image (the different exaggeration level) was randomly selected. The question “Which one is different from the other two?” appeared at the top of the computer screen. Participants pressed 1, 2, or 3 on the number pad. The face images remained on the screen until the participant made their response. The next trial began immediately after the participant made their response. Each intensity level was paired with each other level twice, once as the target face and once as the distractor, for a total of 20 trials for each block. Happy and sad images were presented in separate blocks; the order of blocks was counterbalanced across participants and, for each participant, matched the order in which they had completed the Emotions and Realism Tasks. The entire session lasted approximately 30 min.
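The Discrimination Task design can be sketched in the same way: 5 levels give 20 ordered (target, distractor) pairs, and the target position is drawn at random on each trial. The function name, the dictionary layout, and the mapping of positions to the keys 1, 2, and 3 are assumptions made for illustration only.

```python
import itertools
import random

LEVELS = [100, 150, 200, 250, 300]


def build_oddity_trials(levels, rng):
    """Return shuffled three-face trials with one odd-one-out target.

    Each ordered (target, distractor) pair of levels is used once, so
    every pair of levels appears twice with the roles swapped.
    """
    trials = []
    for target, distractor in itertools.permutations(levels, 2):
        position = rng.randrange(3)  # random target location: 0, 1, or 2
        faces = [distractor] * 3
        faces[position] = target
        # Assumption: positions 0, 1, 2 map to number-pad keys 1, 2, 3.
        trials.append({"faces": faces, "correct_key": position + 1})
    rng.shuffle(trials)
    return trials


oddity_trials = build_oddity_trials(LEVELS, random.Random(0))
chance_level = 1 / 3  # three response options per trial
```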

Results

We analyzed responses to happy and sad facial expressions separately for the Emotions and Realism Tasks. The analysis was conducted on the number of trials on which participants chose each of the exaggerated faces (150, 200, 250, and 300 %), across all trials on which each face was presented. Although our main focus was the influence of exaggeration of facial expressions on participants’ responses, we also analyzed participants’ selection of the non-exaggerated faces (100 %) separately. This analysis was done to test our prediction that the ASD group would be more tolerant of exaggerated facial expressions in the Emotions Task, but not the Realism Task. We predicted that there would be group differences in the selection of the 100 % faces in the Emotions Task, but not the Realism Task. Specifically, we predicted that the typical participants would choose the 100 % face more often than the ASD participants. Because of the forced-choice design, in which every level was paired with every other level, the measures were not independent, and we therefore could not conduct a single analysis including all levels.

Emotions Task

A 2 (emotion) × 4 (exaggeration level) × 2 (group) repeated measures mixed-model ANOVA was conducted on the number of times participants chose each exaggerated face (150, 200, 250, and 300 %) during the Emotions Task (a maximum of 8 trials). The results indicated a significant main effect of exaggeration level, F(1, 37) = 63.67, p < .001, as well as a significant interaction between exaggeration level and group, F(1, 37) = 9.32, p = .004; see Fig. 2. Follow-up independent-samples t tests were conducted between groups, collapsed across emotions, with a Bonferroni correction for multiple comparisons (α = .0125). These comparisons revealed significant group differences for the 150 % faces [t(37) = −3.159, p = .003], where typical participants chose this face on more trials (M = 6.40, SD = .86) than the ASD participants did (M = 5.13, SD = 1.54), as well as for the 300 % faces [t(37) = 2.979, p = .005], where the ASD participants chose these faces on more trials (M = 2.45, SD = 2.81) than the typical participants did (M = .45, SD = .83).
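The dependent measure and the corrected threshold can be illustrated with a short sketch. The trial format and the simulated participant are hypothetical; the only details taken from the text are the key mapping (“A” for left, “L” for right), the maximum of 8 selections per level, and the four comparisons behind α = .0125.

```python
import itertools
from collections import Counter

LEVELS = [100, 150, 200, 250, 300]


def count_choices(trials, levels=(150, 200, 250, 300)):
    """Count how often each exaggerated level was chosen.

    trials: iterable of (left_level, right_level, key), where key is
    "A" for the left face and "L" for the right face.
    """
    counts = Counter()
    for left, right, key in trials:
        chosen = left if key == "A" else right
        counts[chosen] += 1
    return {level: counts[level] for level in levels}


# Hypothetical participant who always picks the less exaggerated face.
trials = []
for low, high in itertools.combinations(LEVELS, 2):
    trials.append((low, high, "A"))  # less exaggerated on left, chose left
    trials.append((high, low, "L"))  # less exaggerated on right, chose right

counts = count_choices(trials)

# Bonferroni threshold for four follow-up comparisons (one per level).
bonferroni_alpha = 0.05 / 4
```

For this simulated participant the counts fall from 6 at the 150 % level to 0 at the 300 % level, the pattern expected when exaggeration is never tolerated.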

Fig. 2 Number of trials participants chose each exaggerated face during Emotions Task, collapsed across happy and sad emotions. Error bars represent standard error of the mean

We conducted a linear trend analysis on participants’ responses for the exaggerated faces. We tested the hypothesis that participants’ responses followed a linear trend, decreasing as exaggeration level increased, which would indicate intolerance for exaggeration in the facial expressions. After confirming that both groups’ linear contrasts were significantly different from zero, i.e., changing across exaggeration levels (all p < .05), we conducted a 2 (expression) × 2 (group) mixed-model repeated-measures ANOVA on the contrast scores associated with the linear comparison. The results indicated a significant effect of group, such that the typical group had significantly higher contrast scores (M = 9.88, SD = 2.78) compared to the ASD group (M = 4.60, SD = 7.04), t(37) = −3.05, p < .005, indicating that the typical group’s responses were a closer fit to the expected linear trend. All other main effects and interactions were not significant.
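A contrast score of this kind can be sketched as a weighted sum of the selection counts. The weights below are the standard linear trend coefficients for four equally spaced levels, signed so that a decreasing pattern gives a positive score; the text does not state the coefficients or scaling actually used, so scores from this sketch are not necessarily on the same scale as the reported means.

```python
# Assumed linear trend weights for four equally spaced levels.
# Positive scores indicate choices that decrease with exaggeration.
WEIGHTS = {150: 3, 200: 1, 250: -1, 300: -3}


def linear_contrast_score(counts):
    """Weighted sum of per-level selection counts for one participant."""
    return sum(WEIGHTS[level] * counts[level] for level in WEIGHTS)


# Hypothetical participants.
decreasing = {150: 6, 200: 4, 250: 2, 300: 0}  # intolerant of exaggeration
flat = {150: 4, 200: 4, 250: 4, 300: 4}        # indifferent to exaggeration

score_decreasing = linear_contrast_score(decreasing)
score_flat = linear_contrast_score(flat)
```

A participant whose choices do not change across levels scores zero, which is why the authors first confirmed that both groups’ contrasts differed from zero before comparing them.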

We also examined the proportion of trials that participants chose the more exaggerated face of the pair across all trials. A 2 (emotion) × 2 (group) repeated measures mixed-model ANOVA revealed a significant main effect of group, F(1, 37) = 7.726, p = .009, but not emotion, F(1, 37) = .646, ns. Across emotions, ASD participants chose the more exaggerated face on more trials (M = .36, SD = .36) compared to the typical participants (M = .11, SD = .13); see Fig. 3. The interaction between group and emotion was not significant.

Fig. 3 Proportion of trials participants chose the more exaggerated face during the Emotions Task. Error bars represent standard error of the mean

Additionally, we examined whether the number of participants in each group who were using a rule-based strategy more exclusively during the Emotions Task (i.e., choosing the more exaggerated face on 80 % or more of the trials) for either happy or sad faces differed between the ASD and typical groups. Seven out of 20 ASD participants chose the more exaggerated face on 80 % or more of the trials for either the happy or sad task compared to one out of 19 typical participants. A difference of proportions test (Blalock 1972) showed a significant difference between the two groups (z = −2.10, ϕ = .37, p = .01).
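A difference of proportions test on these counts can be sketched as follows. The variant used by the authors is not specified; this sketch assumes the pooled-variance z test without continuity correction, so its z value need not match the reported one exactly. The function name is hypothetical.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z test; returns (z, phi).

    phi is the effect size |z| / sqrt(N), with N the total sample size.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    phi = abs(z) / math.sqrt(n_a + n_b)
    return z, phi


# Counts from the text: 1 of 19 typical and 7 of 20 ASD participants
# chose the more exaggerated face on 80 % or more of the trials.
z, phi = two_proportion_z(1, 19, 7, 20)
```

Entering the typical group first gives a negative z, matching the sign convention of the reported statistic.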

We also examined possible group differences in the number of times participants chose the non-exaggerated (100 %) faces. A 2 (expression) × 2 (group) mixed model repeated-measures ANOVA revealed a significant effect of group, F(1, 37) = 5.62, p = .023. Typical participants chose the 100 % face on more trials (M = 6.45, SD = 1.69) compared to the ASD participants (M = 4.45, SD = 3.28).

Realism Task

Parallel analyses were conducted on data from the Realism Task. A 2 (emotion) × 4 (exaggeration level) × 2 (group) repeated measures mixed-model ANOVA was conducted on the number of times participants chose each exaggerated face during the Realism Task (maximum of 8 trials). The analysis revealed a significant main effect of level of exaggeration [F(1, 37) = 11,491.06, p < .0001]; see Fig. 4. All other main effects and interactions were not significant.

Fig. 4 Number of trials participants chose each exaggerated face during Realism Task, collapsed across happy and sad emotions. Error bars represent standard error of the mean

We conducted a linear trend analysis on participants’ responses for the exaggerated faces. We tested the hypothesis that participants’ responses followed a linear trend, decreasing as exaggeration level increased, which would indicate intolerance for exaggeration in the facial expressions. After confirming that both groups’ linear contrasts were significantly different from zero (all p < .05), we conducted a 2 (expression) × 2 (group) mixed-model repeated-measures ANOVA on participants’ contrast scores. The results indicated no significant main effects or interactions, all p > .364.

We examined possible group differences in the number of times participants chose the non-exaggerated (100 %) faces. A 2 (expression) × 2 (group) mixed-model repeated-measures ANOVA revealed no significant main effects or interactions (all p > .149).

Additionally, we examined whether the number of participants in each group who were using a rule-based strategy more exclusively during the Realism Task (i.e., choosing the more exaggerated face on 80 % or more of the trials) for either happy or sad faces differed between the ASD and typical groups. No participants in either group chose the more exaggerated face on 80 % or more of the trials. In fact, the highest proportion of trials on which any participant in either group chose the more exaggerated face during the Realism Task was .5.

We also examined the proportion of trials that participants chose the more exaggerated face of the pair across the 20 trials (including those with the 100 % faces). A 2 (emotion) × 2 (group) repeated measures mixed-model ANOVA revealed no significant main effects or interactions.

Reaction Times on Emotions and Realism Tasks

To examine possible group differences in reaction time on the Emotions and Realism Tasks, we compared participants’ median response times. One participant from the ASD group was excluded from this analysis for having median response times more than 3 SDs above the group mean, leaving 18 ASD participants and 19 typical participants in this analysis. A 2 (task type) × 2 (emotion) × 2 (group) repeated measures mixed-model ANOVA revealed a significant main effect of task type, F(1, 35) = 7.98, p = .008; see Fig. 5. Across the two groups and emotions, participants were significantly faster at the Realism Task (M = 1.78, SD = .768) compared to the Emotions Task (M = 2.82, SD = 1.06). The main effects of emotion and group were not significant, nor was the interaction between task type, emotion, and group (all p > .328).

Fig. 5 Median response times, collapsed across emotions. Error bars represent standard error of the mean

Discrimination Task

Participants’ accuracy scores on the Discrimination Task were the proportion of trials on which they correctly chose the target image. A 2 (emotion) × 2 (group) repeated measures mixed-model ANOVA was conducted on participants’ accuracy scores. This analysis revealed no significant main effect of emotion, F(1, 37) = 3.46, p = .071, or group, F(1, 37) = .026, p = .872. The interaction between emotion and group was also not significant, F(1, 37) = 1.594, p = .215.

We collapsed participants’ accuracy scores across groups and conducted separate one-sample t tests for happy and sad faces to see whether participants performed above chance, which would have been a proportion of .333. Participants’ accuracy scores were significantly above chance for both happy [M = .956, SD = .075; t(38) = 52.26, p < .001] and sad [M = .94, SD = .078; t(38) = 48.01, p < .001] faces.
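The comparison against chance can be approximated from the summary statistics alone. The sketch below assumes a sample of 39 (consistent with the 38 degrees of freedom) and uses the rounded means and standard deviations given in the text, so it yields values close to, but not identical with, the reported t statistics. The function name is hypothetical.

```python
import math


def one_sample_t(mean, sd, n, reference):
    """One-sample t statistic computed from summary statistics."""
    standard_error = sd / math.sqrt(n)
    return (mean - reference) / standard_error


CHANCE = 1 / 3  # three response options in the Discrimination Task
N = 39          # assumption: all participants included, df = N - 1 = 38

t_happy = one_sample_t(0.956, 0.075, N, CHANCE)
t_sad = one_sample_t(0.94, 0.078, N, CHANCE)
```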

Discussion

The results of this study show that individuals with ASD are more tolerant of exaggeration when asked to identify happy and sad facial expressions. In the Emotions Task, when participants were asked which face was a better representation of what people actually look like when they feel happy or sad, individuals with ASD chose the most exaggerated (300 %) face on more trials than the typical individuals and were also more likely to choose the more exaggerated face across all test trials than typical individuals. These results were present for happy and sad facial expressions. The linear trend analysis of participants’ responses for the exaggerated faces (150–300 %) also revealed group differences in tolerance for exaggeration. Typical participants’ responses showed a stronger linear trend compared to ASD participants’ responses, indicating that the typical participants’ responses decreased more with increased exaggeration than did those of the ASD group. The results from the difference of proportions analysis demonstrated that while not every participant in the ASD group chose the more exaggerated face on every trial, a greater number of participants in the ASD group chose the exaggerated face on 80 % or more trials, consistent with the idea that they were employing a more rule-based strategy, and doing so reliably. Taken together, the results of the current study are consistent with Rutherford and McIntosh’s (2007) results with schematic drawings of facial expressions.

There is at least one important difference between our results from the Emotions Task and the results reported by Rutherford and McIntosh (2007), who found that ASD participants’ responses consistently increased with increased exaggeration of facial expressions. In the current study, ASD participants’ responses decreased with exaggeration, though significantly less than controls’ responses decreased. Although ASD participants chose the more extremely exaggerated faces on more trials compared to typical participants, they did not exclusively choose the more exaggerated face as a better representation of the emotional expression. The differences between the two studies are most likely due to differences in stimuli: Rutherford and McIntosh used schematic line drawings, while we used photographs of real people, which are much more realistic than schematic line drawings.

The increased tolerance of exaggeration in happy and sad faces in the ASD group is consistent with the idea that these individuals rely to a greater extent than typical individuals on an explicit, rule-based strategy for processing these expressions. Such a strategy could lead an individual to base their judgments on the extent to which a particular defining feature, or rule, is present in the expression (e.g., the corners of the lips turned upward for happy expressions). Exaggerating the facial expression amplifies the presence of these defining characteristics, and would therefore make it a better example of that expression. If ASD participants are employing this type of strategy we would expect them to be more tolerant of the exaggerated facial expressions, choosing them more often than the control group does. Although the ASD group displayed this pattern of results for the Emotions Task, it is important to note that there were individual differences within the ASD group; not every participant chose the more exaggerated face on the majority of trials. However, more individuals in the ASD group appeared to be relying on this strategy, as revealed by the difference of proportion analysis, compared to typical individuals.

In contrast, if one is using a prototype matching strategy, then the more exaggerated faces will be further from the prototype and not consistent with what a typical happy or sad face would look like. After a certain level of exaggeration, the participants would reject the faces as being an appropriate representation of what people actually look like when they feel that particular emotion, because the exaggerated face is not compatible with their prototype for that expression. The typical participants appeared to use a prototype matching strategy, as they were less likely to choose the more exaggerated face of a trial pair in the Emotions Task.

Importantly, the current study revealed no group differences in performance in the Realism Task. The Realism Task was designed to examine whether individuals with ASD use an alternative rule-based strategy to process faces in a face perception task that does not require emotion perception, or if instead the rule-based processing is specific to expression processing. This task included the same faces as the Emotions Task, but required participants to choose the more realistic face. In this task, both groups were similarly intolerant of exaggeration, suggesting that the ASD individuals recognize that the more exaggerated faces are not realistic. The contrasting results of the Emotions and Realism Tasks further support the hypothesis that more individuals with ASD are using a rule-based strategy when processing the emotional content of faces, but not other aspects of the face such as how realistic it is.

The results of the Discrimination Task provide further evidence that the group differences in the Emotions Task are not an artifact of the autism group not being able to perceive the physical differences between the levels of exaggeration. For both happy and sad faces, the ASD and typical participants were able to successfully discriminate between each level of exaggeration. These results give further support that the results of the Emotions Task are due to each group employing qualitatively different strategies rather than an inability to discriminate the stimuli.

The notion that individuals with ASD develop an alternative compensatory strategy for processing facial expressions is in line with the current models of the social deficits characteristic of ASD, such as the social orienting hypothesis (Dawson et al. 1998). The social orienting hypothesis suggests that individuals developing with ASD fail to attend to social stimuli in their environment (e.g., faces) from an early age. This leads to these individuals missing crucial social information necessary for more complex social cognitive processes that emerge later in development, such as the perception of facial expressions. If individuals with ASD lack the early social experience that leads to the ability to automatically process facial expressions as typical individuals do, they may be able to develop an alternative strategy that relies on explicitly learning the “rules” for each facial expression. For example, over time one might learn that happy faces always have the corners of the lips turned up, whereas sad faces always have the corners of the lips turned down. Although the current study did not directly examine why individuals with ASD use a rule-based strategy or why it develops in individuals with ASD, the results clearly show that they use this alternative strategy when processing the emotional content of faces, but not other information, such as how realistic a face is. These findings may be useful for those developing treatment and intervention focused on improving facial expression processing skills, as they suggest that explicit teaching of the characteristics of each facial expression may be necessary for those with ASD.

The idea that individuals with ASD use a different type of strategy to process facial expressions may explain, at least to some extent, why some studies have found group differences in emotional perception (Celani et al. 1999) and others have not (Castelli 2005; Spezio et al. 2007). In their review of dozens of studies examining facial expression processing in individuals with ASD, Harms et al. (2010) noted that along with demographic factors, differences in task demands across studies might account for the majority of discrepancies in results. Experimental paradigms that make expression processing more difficult (e.g., short presentation time, or presenting inconsistent information) may limit individuals’ ability to use more practiced, cognitively based strategies, which may exaggerate group differences in performance. In contrast, other paradigms may facilitate ASD participants’ use of their alternative strategy and yield performance similar to that of typical individuals.

One might expect that if individuals with ASD rely to a greater extent than typical individuals on a rule-based strategy to process facial expressions, the greater cognitive demands of such a strategy would lead to greater response times in the ASD group. However, we observed no group differences in response times in the current study. It may be the case that with sufficient experience applying this strategy to process facial expressions, high-functioning adults with ASD become very efficient at employing this strategy. One possible limitation in the current study is that although the stimuli used in this study were created from photographs of real people, the final stimuli only displayed the internal features of the face, which may not match the way faces are typically viewed in the real world. Future research could repeat the current study using faces that include the hair and external contour of the faces in order to more closely match the way faces are seen in the real world. Although we would not predict that excluding these features would influence individuals’ processing of the facial expressions, it would be an important follow-up to ensure the results truly generalize to how faces are processed in the real world. Also, the faces in the current study were static photographs. To further improve ecological validity, future studies should examine whether the same results are found with dynamic faces.

In summary, the current study demonstrated that individuals with ASD are more tolerant of exaggeration in happy and sad facial expressions of naturalistic faces. Participants with ASD were more likely than typical participants to choose the most extremely exaggerated facial expression (300 %) as an accurate representation of what people actually look like when they feel happy or sad. The ASD group’s tolerance for exaggeration could result from greater reliance on a rule-based strategy to complete the task compared to typical participants. Interestingly, when asked about how realistic the same faces were, there were no group differences; both groups were intolerant of exaggeration of the facial expressions. This result suggests that group differences in strategies employed during face perception are limited to tasks that focus on the emotional content of faces rather than other physical characteristics.