Introduction

From infancy and throughout development, typically developing children monitor others’ attention and behaviour towards objects, and respond to others’ gaze and motor behaviour directed towards objects by proactively looking to the area or object that the person is likely to act upon (Bruner 1995; Flanagan and Johansson 2003; Falck-Ytter et al. 2006; Biro 2012). According to many scholars, this behaviour reflects an appreciation of the goal-directedness or ‘aboutness’ of the agent’s behaviour, thus being an important organizer of social cognition, and a foundation to the development of more sophisticated processes underlying social understanding and social reciprocity (Frith and Frith 2003; Csibra and Gergely 2007; Poulin-Dubois et al. 2009; Cannon and Woodward 2012; Cannon et al. 2012).

Against this background, a number of research studies have focused on the ability to monitor and understand the goal-directed nature of others’ behaviour in autism spectrum disorders (ASD), a group of conditions characterized by impaired social communication and behavioural rigidity (American Psychiatric Association 1994; Lord and Jones 2012). Overall, it appears that individuals with ASD have difficulties interpreting the goals of an agent’s actions in tasks involving an appreciation of his or her mental state (e.g. understanding that the agent intends to grasp a block because he or she wants to build a tower) (Cattaneo et al. 2007; Boria et al. 2009; Vivanti et al. 2011). However, several studies report intact performance in tasks in which individuals with ASD are asked to understand or predict the immediate outcome of a motor action (e.g. understanding that when the agent is moving the hand towards a block, he or she is going to grasp the block). To illustrate, two studies have documented that children with ASD, just like typically developing controls, can reproduce an agent’s goal-directed action when observing his or her failed attempt to perform an action (Aldridge et al. 2000; Carpenter et al. 2001). Moreover, a recent study using an eye-tracking paradigm showed that young children with ASD predict upcoming goals of motor actions by proactively looking at the goal sites before the action is completed (Falck-Ytter 2010).

However, a number of studies have reported that individuals with ASD are impaired in their ability to monitor others’ gaze and motor behaviour to detect goals and intentions (Vivanti et al. 2011) as well as in the ability to simply follow an agent’s gaze towards the target (e.g. Baron-Cohen et al. 1997; Leekam et al. 2000; Leekam and Ramsden 2006). It remains unclear from these studies whether these abnormalities reflect a lack of understanding of the goal-directed nature of gaze, or just a general lack of attention/interest in faces and in others’ behaviour. To date, no studies have examined the impact that reduced attention to social cues (e.g. the agent’s gaze direction, gestures and actions) may have on children’s ability to appreciate the goal-directed nature of others’ actions.

This issue is relevant, as there is evidence that individuals with ASD, in particular those at the higher functioning end of the spectrum, might use atypical/compensatory strategies when processing others’ actions, including diminished reliance on social cues such as changes in gaze direction and increased reliance on non-social information (e.g. objects’ standard use; Vivanti et al. 2011; Boria et al. 2009). The appreciation of social cues, in particular gaze direction, is considered to be crucial for interpreting others’ behaviours (Phillips et al. 2002) and is very likely to play an important role in understanding the immediate goal of observed actions in everyday life (e.g. indicating which one of the two items an agent is going to grasp). As previous studies have not addressed this issue, it is important to determine whether attention to changes in gaze direction, or lack thereof, in children with ASD, affects the ability to understand an agent’s action goals. Moreover, as previous research has involved higher functioning individuals on the autism spectrum, it is important to investigate goal understanding and its link to social attention in a representative sample of the ASD population, as some of the ASD-specific difficulties might be masked by the use of compensatory strategies in the higher functioning population (Rutherford and Troje 2012).

In this study, we investigated the ability to monitor and respond to changes in the agent’s behaviour that are relevant to understanding the goal of her action, using a task in which appreciation of the agent’s gaze direction was necessary to understand the immediate goal of her action. Young children with ASD were compared to chronological and developmental age-matched children without ASD using an eye-tracking paradigm involving the observation of an uncompleted grasping action directed to one of the two visible items. In one condition (head-turning condition), the agent turned her head towards the target of the action as she moved her hands, while in another condition (neutral condition), she did not. We hypothesized that (1) participants without ASD would look proactively at the action’s target in the head-turning condition but not in the neutral condition; (2) participants with ASD would not look proactively at the action’s target in either condition; and (3) proactive looking to the target would be related to the observation of the agent’s head turning across the samples.

Methods

Participants

The participants were 24 preschoolers with ASD (ASD group) with mixed cognitive abilities. A further 24 preschoolers without ASD were tested, consisting of 17 children with a global developmental delay (GDD) and 7 with typical development (non-ASD group). Participants’ developmental age was measured with the Mullen Scales of Early Learning (MSEL) (Mullen 1992), which provide age equivalent scores in four subscales: visual reception, fine motor, receptive language and expressive language. Overall developmental age was calculated as the mean age equivalent score on the four MSEL subscales. A pairwise matching procedure was used, with each participant with ASD individually matched to a comparison participant on developmental age. At the group level, the ASD and the non-ASD group were also matched for chronological age and for developmental age on each subscale of the MSEL (see Table 1). Participants with ASD were recruited through the Victorian Autism Specific Early Learning and Care Centre (Victorian ASELCC), an autism-specific programme located at the La Trobe University Community Children’s Centre. Participants in the control group with GDD were recruited at Kalparrin Early Intervention Centre, a community Early Childhood Intervention Program serving young children with a developmental disability. The seven participants in the control group with typical development were recruited through the La Trobe University Community Children’s Centre.
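The developmental-age calculation and the pairwise matching described above can be illustrated with a minimal sketch. The averaging step follows the text directly; the matching rule (greedy nearest neighbour on developmental age) is only an assumption made for illustration, since the exact matching algorithm is not specified here. All function names, identifiers and scores below are hypothetical.

```python
from statistics import mean

# The four MSEL subscales named in the text.
MSEL_SUBSCALES = (
    "visual_reception",
    "fine_motor",
    "receptive_language",
    "expressive_language",
)


def developmental_age(age_equivalents):
    """Overall developmental age: mean age-equivalent score (in months)
    across the four MSEL subscales."""
    return mean(age_equivalents[s] for s in MSEL_SUBSCALES)


def pairwise_match(asd_ages, comparison_ages):
    """Pair each ASD participant with the closest unused comparison
    participant on developmental age.

    ASSUMPTION: greedy nearest-neighbour matching; the study does not
    state how pairs were actually selected.
    """
    available = dict(comparison_ages)
    pairs = []
    for asd_id, asd_age in sorted(asd_ages.items(), key=lambda kv: kv[1]):
        best = min(available, key=lambda cid: abs(available[cid] - asd_age))
        pairs.append((asd_id, best))
        del available[best]  # each comparison child is used at most once
    return pairs


# Invented example scores, not data from the study.
example_child = {
    "visual_reception": 24,
    "fine_motor": 26,
    "receptive_language": 18,
    "expressive_language": 20,
}
example_age = developmental_age(example_child)
example_pairs = pairwise_match(
    {"A1": 22, "A2": 30},
    {"C1": 31, "C2": 21, "C3": 40},
)
```

With these invented values, the example child has a developmental age of 22 months, and each ASD participant is paired with the comparison child whose developmental age is nearest.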

Table 1 Participants’ characteristics

The diagnoses of ASD were previously made by community-based health care professionals and were confirmed for the study using the Social Communication Questionnaire (SCQ) (Rutter et al. 2003) completed by a parent and the Autism Diagnostic Observation Schedule (ADOS) (Lord et al. 2000) administered by a clinician with demonstrated reliability in the use of this measure. Three children met ADOS criteria for ASD and 21 met criteria for Autistic Disorder. Exclusionary criteria for the ASD group included the presence of a genetic or metabolic disorder known to cause autistic-like features (e.g. fragile X syndrome or tuberous sclerosis), and the presence of a major medical problem. Participants in the control group with GDD had also all been assessed by professionals in the community and deemed eligible for early intervention services on the basis of presenting with GDD, defined as ‘significant delay in two or more of the following developmental domains: gross/fine motor, speech/language, cognition, social/personal, and activities of daily living’ (Shevell et al. 2003). Exclusionary criteria for these participants included the presence of autistic features as assessed through the SCQ. Exclusion criteria for participants with typical development in the non-ASD group included a known history of developmental or medical conditions. Participants’ characteristics are presented in Table 1.

Apparatus and stimuli

We tested our hypotheses using a modified version of the predictive gaze paradigm (Falck-Ytter et al. 2006), which involved measuring whether participants show anticipatory gaze to the target of observed actions. A series of six video stimuli were shown on a 60-Hz Tobii 1750 binocular eye-tracker monitor with an embedded camera (1,024 pixels resolution, average precision of 0.5° of visual angle). Each stimulus was a 6-s video of an agent performing an incomplete action. All videos featured the same female actor seated on a couch, with her arms behind her head. Two objects, similar in shape and dimensions, were located one on the right and one on the left, within the actor’s reach. There were two conditions: in each condition, after one second, the actor simultaneously moved her arms from behind her head towards the sides, where the two objects were positioned. The videos ended as the actor’s hands moved close to the objects. Importantly, the actor’s hands were not pre-shaped in a grasping action, and there was no contact between hands and objects, so that the observer could not use the grip information or the interaction between hands and objects to predict the goal of the ongoing action. Three different pairs of objects were used in different trials (two books, two cones and two balls).

In the neutral condition, the actor’s head remained still, fixating on a point above the camera (see Fig. 1) while she moved her hands towards the objects. In the head-turning condition, the video was exactly the same as in the neutral condition except that the actor turned her head towards one of the two objects as she moved her hands (see Fig. 2). We reasoned that in the neutral condition, the actor’s movements (hands moving towards two objects) would not be interpreted as goal-directed (as no information was provided on whether she intended to touch either of the objects). In the head-turning condition, the actor’s head turned towards one of the objects indicating that one of the objects was the target of her hand movement.

Fig. 1 Neutral condition. The actor moves her arms towards the two objects. Her head remains still. Areas of interest are marked

Fig. 2 Head-turning condition. The actor moves her arms towards the two objects. Her head turns towards one of the objects. Areas of interest are marked

Procedure

The study was approved by the La Trobe University Human Ethics Committee, and informed consent was obtained from all participants’ parent/s. All participants were tested in a quiet room at their respective Centres. The length of experimental testing was approximately 10 min; the current experiment was part of a longer session of experimental testing.

Participants were seated in a comfortable chair 60 cm from the monitor. No specific instruction was given. The session began with a 5-point calibration procedure that was saved and used for the entire protocol. After calibration was obtained, participants passively viewed the video clips. The trials were presented in one of two fixed random orders, and videos were interspersed with filler stimuli to maintain attention.

During observation of the video clips, participants’ eye movements were recorded to determine whether they were gazing to the target versus the other object (competitor object). Data were analysed using frame-by-frame-defined areas of interest using Tobii Studio analysis software. Fixation criteria were set to Tobii Studio defaults of a 30-pixel dispersion threshold for 100 ms. Since all the videos stopped before the grasping action was accomplished, the proportion of fixations to the target versus the competitor object (during observation of the actions) was used as a measure of goal understanding. Visual attention to the actor’s face and to her action directed to the target was also measured (see Figs. 1, 2). The average proportion of fixations to the areas of interest across the trials was calculated to obtain an average proportion of gaze to the target, to the actor’s face and to her actions for each condition.
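The dependent measure described above can be sketched as follows. This is only an illustration under stated assumptions: each trial is represented as a list of area-of-interest labels, one per detected fixation, and the proportion is taken over all fixations in the trial. The text does not specify the denominator or the data layout, and the labels and function names are hypothetical rather than Tobii Studio output.

```python
def aoi_proportion(fixations, aoi):
    """Proportion of a trial's fixations that landed on the given AOI.

    ASSUMPTION: the denominator is every fixation recorded in the trial.
    Returns None for a trial with no fixations.
    """
    if not fixations:
        return None
    return fixations.count(aoi) / len(fixations)


def mean_aoi_proportion(trials, aoi):
    """Average the per-trial proportions across the trials of a condition,
    skipping trials that contain no fixations."""
    values = [p for p in (aoi_proportion(t, aoi) for t in trials) if p is not None]
    return sum(values) / len(values) if values else None


# Invented fixation sequences for two trials of one condition.
example_trials = [
    ["face", "target", "action", "target"],
    ["face", "competitor", "target", "face"],
]
example_target = mean_aoi_proportion(example_trials, "target")
example_face = mean_aoi_proportion(example_trials, "face")
```

Running the same function with the labels for the face and the action areas gives the other two attention measures for a condition.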

Results

Deviations in kurtosis and skewness from the normal distribution curve were tested for all variables following guidelines set by Tabachnick and Fidell (1996), and no violation of normality was identified. Therefore, study hypotheses were tested via parametric analyses.
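One common way to screen for such deviations is to divide sample skewness and excess kurtosis by their approximate standard errors, sqrt(6/N) and sqrt(24/N), and inspect the resulting z values. The sketch below shows that calculation; whether this exact criterion (or which cut-off) was applied in the study is an assumption, and the function name and example data are hypothetical.

```python
import math


def skewness_kurtosis_z(values):
    """Return (skewness, excess kurtosis, z_skew, z_kurt) for a sample.

    Moments are population (biased) moments; the standard errors
    sqrt(6/N) and sqrt(24/N) are large-sample approximations.
    """
    n = len(values)
    m = sum(values) / n
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    m4 = sum((x - m) ** 4 for x in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3
    z_skew = skew / math.sqrt(6 / n)
    z_kurt = excess_kurt / math.sqrt(24 / n)
    return skew, excess_kurt, z_skew, z_kurt


# Invented, perfectly symmetric sample: skewness is zero by construction.
example_skew, example_kurt, example_z_skew, example_z_kurt = skewness_kurtosis_z(
    [1, 2, 3, 4, 5]
)
```

Large absolute z values would flag a variable for closer inspection or transformation before running parametric tests.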

Each participant’s proportion of time looking at the target object is presented in Fig. 3. Results of a 2 (Group) × 2 (Condition) ANOVA show a main effect of Condition, F(1, 46) = 15.31, p < .001, η² = .25, and no main effect for Group (F = .35, p = .55). There was also a significant Group × Condition interaction (F = 4.81, p < .05, η² = .09). Pairwise comparisons (using Bonferroni correction) show that the non-ASD group gazed proactively to the target significantly more in the head-turning condition compared to the neutral condition (p < .001, η² = .29), while this was not the case in the ASD group (p > .1, η² = .05). Moreover, attention to the target in the neutral condition was similar in the two groups (p > .1, η² = .02), while in the head-turning condition, participants without ASD looked more at the target compared to those in the ASD group (p = .05, η² = .08).
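For readers who want to relate the reported F ratios to the effect sizes, the following sketch recovers partial eta squared from an F value and its degrees of freedom. The text does not say whether the reported effect sizes are partial eta squared; that is an assumption here, although the values for the Condition effect and the interaction on attention to the target are consistent with it. The function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio:
    (F * df_effect) / (F * df_effect + df_error)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F ratios reported for attention to the target, with df = (1, 46).
eta_condition = partial_eta_squared(15.31, 1, 46)
eta_interaction = partial_eta_squared(4.81, 1, 46)
```

Rounded to two decimals, these give .25 and .09, matching the values reported for the main effect of Condition and the Group × Condition interaction.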

Fig. 3 Proportion of visual attention to the target. Participants in the non-ASD group, unlike those in the ASD group, significantly increase their attention to the target in the head-turning condition compared to the neutral condition. *p < .05

The proportion of visual attention to the actor’s face increased in the head-turning condition compared to the neutral condition in the non-ASD group only, as seen in Fig. 4, with the 2 (Group) × 2 (Condition) ANOVA indicating a significant Group × Condition interaction, F(1, 46) = 6.06, p = .01, η² = .11. There was no main effect of Condition (F = 1.03, p = .31) or Group (F = .75, p = .39). Pairwise comparisons (using Bonferroni correction) show that the non-ASD group gazed at the actor’s face significantly more in the head-turning condition compared to the neutral condition (p = .01, η² = .11), while this was not the case in the ASD group (p > .1, η² = .02). Moreover, attention to the actor’s face in the neutral condition was similar in the two groups (p > .1, η² = .01), while in the head-turning condition, participants without ASD looked more at the actor’s face compared to those in the ASD group (p < .05, η² = .09).

Fig. 4 Proportion of visual attention to the actor’s face. Participants in the non-ASD group, unlike those in the ASD group, significantly increase their attention to the face in the head-turning condition compared to the neutral condition. *p < .05

Finally, the proportion of visual attention to the actor’s grasping action was submitted to a 2 (Group) × 2 (Condition) ANOVA, with results showing a significant effect of Condition [F(1, 46) = 5.47, p < .05, η² = .10]. The main effect of Group was not reliable (F = 3.38, p = .07). There was a Group × Condition interaction (F = 4.50, p < .05, η² = .09), suggesting that attention to the actor’s action decreased in the head-turning condition, compared to the neutral condition, in the non-ASD group only (see Fig. 5).

Fig. 5 Proportion of visual attention to the actor’s action. Participants in the non-ASD group decreased their attention to the action directed to the target in the head-turning condition compared to the neutral condition

No correlations were found between predictive gaze to the target in the head-turning condition and attention to the actor’s face and to her action, or between the visual fixation measures and the Mullen or SCQ scores in either group, or with the ADOS scores in the ASD group.

Discussion

In this study, we investigated whether children with ASD and those without ASD (matched for chronological and mental age) differ in their ability to monitor and respond to an agent’s goal-directed gaze behaviour by proactively looking at the target of her action. We found that children without ASD increased their attention to an agent’s face, decreased their attention to her action and looked more often to the most likely target of her action when her gaze behaviour signalled ‘goal-directedness’ and her gaze direction was critical to determine the target. Conversely, participants with ASD failed to show changes in their attention pattern in response to the agent’s goal-directed gaze behaviour. Importantly, in our study, the actor’s goal (which object she was likely to act upon) could not be predicted on the basis of her motor actions alone, because she moved her two hands simultaneously towards two different objects. Furthermore, the actor’s goal could not be predicted on the basis of interaction between her and the target object, because each video ended before the grasping action was completed. The only available information for predicting the actor’s goal was her head turning towards the target. Consistent with our hypotheses, in the neutral condition (in which the actor did not turn her head towards either object), participants in both groups looked at the scene in a similar way. In the head-turning condition, however, children without ASD responded to the actor’s goal-directed behaviour by increasing their attention to the actor’s face, decreasing their attention to her action and gazing to the correct target of her action. This suggests that, in children without ASD, the appreciation of the goal-directed nature of the agent’s behaviour resulted in processing her behaviour predictively rather than reactively (as reflected in the increased focus on the future target of her action and the decreased focus on the action in itself). This appeared not to be the case in the ASD group.

We believe that these results are important for several reasons. First, this study adds to current literature indicating that difficulties in various processes that are foundational to social understanding and social learning in ASD might originate from differences in social attention. Previous studies indicated that lack of attention to relevant social stimuli in ASD affects sophisticated social understanding processes (e.g. understanding others’ intentions and emotions; Vivanti et al. 2011; Nuske et al. 2013), and the current study suggests that this might also be the case for a basic appreciation of goal-directedness in others’ behaviour. Given the relevance of a goal-directed interpretation of others’ behaviour for social cognitive development and learning (Tomasello et al. 2005; Csibra and Gergely 2007; Carpenter 2010; Vivanti et al. 2013a, b), understanding the mechanisms associated with difficulties in this type of task is of utmost relevance. The discrepancy with previous results indicating normative predictive eye movements and goal understanding in ASD might be explained by the ability of this population to derive action goals from different information, such as the kinematics of hand–object interaction. It is therefore possible that difficulties in goal understanding in ASD emerge when social cues (such as gaze direction) become relevant to interpret or disambiguate observed actions. In everyday life, people’s motor actions are often ambiguous and additional information is needed to infer the goals (Beer and Ochsner 2006). Previous research shows that children, from infancy on, in the face of incomplete information, use social cues to resolve ambiguity and make sense of what is happening (Striano and Vaish 2006). Our results suggest that this mechanism might be impaired in ASD.

Secondly, the study hypotheses were tested in a representative sample of young children with ASD involving both severely affected and mildly affected children. While previous research in this area has been almost entirely focused on higher functioning individuals with ASD, there is an increasing recognition from the scientific community of the necessity to involve samples that reflect the spectrum of severity of the ASD population (Dyckens and Lense 2011; Vivanti et al. 2013a, b). We addressed this issue by recruiting children with different levels of abilities and children with no history or current symptoms of ASD who were carefully matched on verbal and non-verbal cognitive level. The use of eye-tracking technology enabled us to include children who would otherwise have been untestable with most paradigms.

In conclusion, this study suggests that children with ASD, compared to children without ASD, show abnormalities in monitoring and responding to others’ goal-directed behaviour (gaze and actions directed towards an object). These difficulties appear to be related to a diminished attention to relevant social cues.

Given the nature of our task, it is possible that the increased attention to the target in the head-turning condition shown by participants without ASD merely reflects the tendency to orient in the same direction in which the actor’s head is oriented, without any appreciation of the action’s goal. However, if this were the case, then participants would have increased their attention to the first item on their scan path (see Moll and Tomasello 2004). Our data instead show that attention to the actor’s hand (a stimulus congruent with the actor’s gaze direction) was decreased, rather than increased, in the head-turning condition, suggesting that our results reflect an appreciation of goal-directedness, rather than a purely ‘geometrical’ orientation in the same direction as the head turn. Nevertheless, more research is needed to further investigate the goal-directed versus reflexive/geometrical nature of gaze following in children with and without ASD.

A limitation of the current study is that the sample size was relatively small. This reflects the logistical difficulties of conducting eye-tracking research with severely disabled children: recruiting and working with this population involves additional costs and accommodations. Thus, it remains important that our results are replicated in other samples of children with ASD. Future research should focus on understanding the mechanisms underlying diminished social attention in large samples of children with and without ASD, and the impact of such difficulties on early emerging processes supporting social understanding and social learning.