Introduction

The etiology of the neurodevelopmental disorder, Autism Spectrum Disorder (ASD), likely involves multiple genes interacting with each other rather than a chromosomal abnormality of a single gene (Abrahams and Geschwind 2008; Geschwind and Alarcón 2006). According to this model, it is predicted that individual genetic risk factors are more closely related to specific features or components of ASD (i.e., endophenotypes) rather than ASD itself. Defining and measuring relevant endophenotypes is an important step in advancing our understanding of the etiology of ASD. Ideally, such endophenotypes should (1) be associated with known behavioral characteristics of ASD, (2) be closely related to neural mechanisms that underlie these behavioral characteristics, (3) be more prevalent in family members of individuals with ASD than individuals without a family history of ASD, and (4) be on a continuum with such traits in the general population (Geschwind and Alarcón 2006; Abrahams and Geschwind 2008; Pelphrey et al. 2011). Finally, to enable family-wide genetic linkage studies, endophenotype measures should be valid across the life span, for both individuals with and without a diagnosis of ASD.

Several endophenotypes have been proposed for ASD. Examples include: increased brain volume (Courchesne and Pierce 2005; Palmen et al. 2005), language delay or impairment (Spence et al. 2006), impaired facial expression identification (Spencer et al. 2011), deficits in executive function (Happé et al. 2001), and deficits in social cognitive skills like theory of mind. Theory of mind is the ability to attribute mental states to others, a developmental skill that has been found to be a deficit in children with ASD (Baron-Cohen et al. 1985; Heerey et al. 2003; Tager-Flusberg 1999). Lower theory of mind scores have also been reported in some parents of children with ASD (Losh et al. 2009), siblings of children with ASD (Dorris et al. 2004) (although another study did not report deficits in siblings of children with ASD; Shaked et al. 2006), and typically developing children who display subclinical traits associated with ASD (Ronald et al. 2006). The current study utilizes a candidate endophenotype measure that takes advantage of modern eye-tracking technology to evaluate how individuals allocate their attention when viewing social video clips that elicit the coordination of attention with another person.

An important milestone in children’s early social-communicative development is the ability to coordinate interest in external objects or events with other people. Initiations of Joint Attention (IJA) emerge typically between 9 and 15 months and include child behaviors used to direct others’ attention to external objects or events (Bakeman and Adamson 1984; Carpenter et al. 1998). Early IJA behaviors include pointing, showing, or alternating gaze between an object and another person’s eyes. The origin of children’s ability to respond to others’ bids for joint attention (RJA) typically precedes the origin of IJA. At 3 months of age infants are able to follow the gaze direction of an adult 60 % of the time when the adult is in close proximity and gazes at one of two objects directly in front of the infant (D’Entremont 2000). Performance during this task increases to over 80 % when infants reach 5 months of age (D’Entremont 2000). Research by Striano and Stahl (2005) found that 3-month-old infants looked longer at an object while an adult was also looking at the object in comparison to periods of face-to-face interaction, but it was not until 9 months that the infants were able to follow the gaze of the adult during periods of coordinated joint attention (where the adult looked at the object, then regained eye contact with the infant before looking to the object again). Other research has found that around 9 months of age infants will follow the gaze direction of an adult when the target is outside the infant’s immediate visual field (Moore 2008).

Interestingly, it is not until 12 months of age that infants will follow an adult’s gaze to objects that the infant cannot readily see, including objects behind the infant (Deák et al. 2000) and objects behind a barrier (Moll and Tomasello 2004). Tomasello (1995) proposes that it is around this time that infants begin to realize that other people are intentional agents that have their own goals. It is this new ‘mind reading’ ability that sets the stage for development of more complex social skills like language learning and theory of mind. Experimental studies have found that infants as young as 15–18 months (Houston-Price et al. 2006; Baldwin 1993a, b) use gaze cues to learn novel object labels. Further, longitudinal studies have found that joint attention around 20 months of age predicts theory of mind abilities at 27–30 months (Nelson et al. 2008) and 44 months (Charman et al. 2000).

Spontaneous gaze following in typically developing preschool and school aged children has received little attention in the literature. However, preschoolers with ASD show a characteristic deficit in RJA (Sigman et al. 2004; Mundy et al. 1986; Loveland and Landry 1986). Moreover, several prospective longitudinal studies of children with ASD have shown that individual differences in RJA reliably predict children’s subsequent language acquisition (Mundy et al. 1990; Sigman and Ruskin 1999; Sigman and McGovern 2005; Anderson et al. 2007; Siller and Sigman 2008; Sullivan et al. 2007).

Research from recent prospective longitudinal studies of baby-siblings of children with ASD (Sibs-ASD) has shown that deficits in RJA emerge very early in development. Sibs-ASD can be distinguished from baby siblings of typically developing children (Sibs-TD) as early as 12 months on a response to name measure (Nadig et al. 2007), and 18 months of age on a point following measure (Cassel et al. 2007). In a study comparing Sibs-ASD who later received a diagnosis of ASD to unaffected Sibs-ASD, RJA performance as early as 14 months predicted ASD outcome at 3 years (Sullivan et al. 2007). In a similar study of Sibs-ASD, Yoder et al. (2009) reported measures of RJA collected as early as 15 months reliably predicted diagnostic outcomes at 3 years. Despite this growing body of research on RJA, it remains unclear whether alterations in spontaneous gaze following are found in older typically developing children who display traits of the broad autism phenotype.

The present study introduces an eye-tracking paradigm that evaluates how individuals allocate their visual attention when viewing social video clips that display an adult model who is gazing at a series of targets that appear and disappear in the four corners of the screen (congruent condition). Gaze allocation in the experimental condition is compared to a set of control stimuli where the model’s gaze moves equally as often but is not directed at the appearing/disappearing targets (incongruent condition). When evaluating children’s eye gaze behavior, two dependent variables were considered: gaze time allocation to the model’s face and gaze time allocation to the target.

Gaze following abilities in older, typically developing individuals are often studied by evaluating reaction times in the context of a spatial cuing paradigm (Posner 1980; Frischen et al. 2007). In this research, participants are shown a centrally located face that gazes either left or right; after a time lag a target stimulus appears on either side of the face and participants are tasked with pressing a key once they have visually located the target stimulus. Studies have found that reaction times are faster when the target appears in a gazed-at location when compared to a non-gazed-at location (Friesen and Kingstone 1998; Driver et al. 1999). Importantly, several recent studies involving clinical samples have failed to identify a gaze cueing deficit in individuals with ASD (Swettenham et al. 2003; Senju et al. 2004; Kylliäinen and Hietanen 2004). One possible interpretation for the lack of significant group differences between individuals with ASD and matched controls is that most stimuli used as part of spatial cuing paradigms are gray scale photographs or computer generated faces. In a recent eye-tracking study that systematically varied the nature of the stimuli, Speer et al. (2007) reported that the differences in fixation duration between individuals with ASD and matched controls were only significant for dynamic videos of social scenes, but not for static photos of isolated individuals.

In addition to using dynamic videos as stimuli, and participant eye gaze (rather than reaction time) as response, the current eye-tracking paradigm differs from a spatial cuing paradigm in the timing of specific events. That is, in our eye-tracking paradigm, the sequence of events starts with the appearance of a target object followed by the model’s gaze switch. A similar timing sequence to the current study has been used in two separate fMRI studies, and results indicated that brain areas associated with social cognition (e.g., ventral medial frontal cortex; Williams et al. 2005) and the processing of gaze shifts (e.g., superior temporal sulcus; Pelphrey et al. 2003) were activated when the model viewed the target, but not when she gazed elsewhere. In contrast, during a spatial cuing paradigm, the gaze switch occurs first, followed by the appearance of the target.

By administering our novel eye-tracking measure to a sample of typically developing children, we will test the feasibility of administering this measure to children across a broad age spectrum. We will also test two specific hypotheses. First, we hypothesize specific differences in gaze time allocation between the experimental and control conditions. Specifically, we predict that children allocate more gaze time to the model’s face in the incongruent as compared to the congruent condition. Similarly, we predict that children allocate more gaze time to the target in the congruent as compared to the incongruent condition. Second, we hypothesize that these differences in gaze time allocation between the experimental and control conditions are moderated by the presence or absence of subclinical symptoms of the broad autism phenotype (BAP). Specifically, we predict that differences in gaze time allocation between the experimental and control conditions are more pronounced for children with low levels of BAP, and less pronounced for children with high levels of BAP.

Methods

Participants

Between mid-2010 and mid-2011, 50 typically developing children (27 boys and 23 girls) between the ages of 3 and 9 years participated in this study. Families were recruited through local print and electronic advertisement. Participants were tested during a single laboratory visit at Hunter College, City University of New York. Parental informed consent and child assent were obtained from all participants, and the study was conducted in accordance with the requirements of the Institutional Review Board at Hunter College, City University of New York. Mothers ranged in age between 24 and 49 years (M = 33.39, SD = 5.97) and varied considerably in their educational attainment: 19.6 % had an advanced degree (e.g., master’s or doctoral degree), 17.6 % had completed a standard college degree (e.g., bachelor’s degree), 43.2 % had a high school diploma, and 7.8 % had completed 11th grade. Educational information was unavailable for 11.8 % of the mothers in our sample. The sample was predominantly Hispanic (33 %) and African American (33 %) but included groups of children with Asian (16 %), European American (12 %), and mixed (6 %) ethnic and racial origin. Finally, 25 % of the sample reported an annual household income of less than $20,000/year, 50 % of the sample reported incomes of less than $40,000/year, and 75 % of the sample reported incomes of less than $70,000/year. The diversity within this sample in terms of children’s ethnicity/race as well as the families’ annual household income is largely representative of the local community (New York City, NY, USA).

Procedures

Assessment sessions lasted 120 min and included the administration of our eye-tracking measure of joint attention and a series of developmental and language assessments. In addition, parents completed a questionnaire about sub-clinical behaviors associated with the broad autism phenotype (Social Responsiveness Scale, SRS; Constantino 2002). Finally, parents completed a medical history questionnaire intended to screen for a broad range of neurodevelopmental disorders. All of the children in this sample were free of genetic conditions (i.e. Fragile X Syndrome, Down Syndrome, Tuberous Sclerosis, Angelman Syndrome) and developmental disorders (i.e. autism spectrum disorder, pervasive developmental disorder-not otherwise specified, Asperger’s syndrome). One child had previously received a diagnosis of attention deficit disorder. Two children (4 %) had a first-degree relative, and six children (12 %) had a second-degree relative with an autism spectrum disorder. Likewise, seven children (14 %) had a first-degree relative, and six children (12 %) had a second-degree relative with a learning disability or speech delay requiring therapy.

Assessment of Developmental and Language Skills

Children participated in a series of standardized developmental assessments. Non-verbal cognitive skills were evaluated with the Differential Ability Scales II Special Nonverbal Composite (DAS II; Elliott 1990). The Special Nonverbal Composite is derived from four subscales (Picture Similarities, Matrices, Pattern Construction, and Copying) that each yield an age equivalent and a standard score, from which we calculated a composite non-verbal mental age and a non-verbal mental standard score. Receptive language was measured using the Peabody Picture Vocabulary Test-4 (PPVT-4; Dunn and Dunn 2007). In this untimed standardized test, children are shown four color illustrations per page; the examiner then delivers a stimulus word, and the child indicates which picture corresponds to the stimulus word by pointing or verbally responding with the picture number. Expressive language was evaluated using the Expressive One-Word Picture Vocabulary Test, 4th Edition (EOWPVT-4; Brownell 2000). The EOWPVT-4 is a standardized test of vocabulary in which children are presented with a series of full color illustrations depicting objects, actions, or concepts and are asked to name each illustration (Brownell 2000). Both the PPVT-4 and the EOWPVT-4 produce standard scores as well as age equivalents. For detailed participant characteristics, please see Table 1.

Table 1 Participant characteristics (N = 50)

Assessment of the Broad Autism Phenotype

The Social Responsiveness Scale (SRS; Constantino 2002) is a 65-item parent-report questionnaire designed to highlight behaviors that are characteristic of ASD. Parents report on a 4-point scale, with responses ranging from not true to almost always true. The SRS has been standardized, and thus yields both raw scores and T-scores (Constantino 2002). Items are grouped into 5 subscales (social awareness, social cognition, social communication, social motivation, autistic mannerisms). Examples include “Is aware of what others are thinking or feeling” from the social awareness subscale, “Is able to understand the meaning of other people’s tone of voice and facial expressions” from the social cognition subscale, “Knows when he or she is too close to someone or is invading someone’s space” from the social communication subscale, “Does not join group activities unless told to do so” from the social motivation subscale, and “Does extremely well at a few tasks, but does not do as well at most other tasks” from the autistic mannerisms subscale. For the current study, data analyses focused on the SRS Total T-scores and the social awareness subscale, since this subscale taps into behaviors associated with response to joint attention. Table 2 outlines descriptive information for SRS scores in the current sample.

Table 2 Descriptive statistics for Social Responsiveness Scale total and subscale scores (N = 50)

Psychometric properties of the SRS were previously evaluated (Constantino 2002), demonstrating excellent internal consistency; Cronbach’s α scores were as follows: 0.94 for the male parent rating, 0.93 for the female parent rating, and 0.97 for the clinical rating. The SRS has also been shown to have high discriminant validity when children with an ASD were compared to children with other psychiatric disorders, as well as high concurrent validity when a sample of children with ASD was administered the SRS as well as the Autism Diagnostic Interview-Revised (ADI-R; Lord et al. 1994) (Constantino et al. 2003; Bölte et al. 2008). Further, the SRS has successfully been used to measure BAP symptoms in siblings of children with ASD (Constantino et al. 2006) as well as children in the general population (Constantino and Todd 2003).

Total raw score cut points have been established for males and females in the general population as a way of screening for an ASD (a cut point of 70 and 65 is recommended for males and females, respectively). Use of these cut points results in a sensitivity of 0.77, indicating an identification rate of 77 % of children where a previous diagnosis has been established. Further, this measure has a specificity of 0.75, indicating that 75 % of children who scored above the cut point qualified for a diagnosis in a follow up assessment (Constantino 2002). Although none of our participants had received a diagnosis of an ASD, 10 % of our participants, all of whom were male, received scores above this cut point (n = 5; M = 75.2, SD = 4.76, range = 71–82).

Eye-Tracking Measure of Joint Attention

Eye-Tracking Stimuli

Participants were presented with a series of 8 color videos that were each 34 s long and included 4 gaze-shift trials (see Fig. 1 for a detailed timeline). At the center of the screen, each video displayed the head and face of a model, while targets appeared and disappeared in each of the four corners. Half of the videos were congruent (i.e., the model’s gaze followed the target on the screen) and the other half of the videos were incongruent (i.e., the model’s gaze was directed elsewhere). Videos began with a model looking straight into the camera for 4 s (simulating the model making eye contact with the participant). Next, a unique target for each video (i.e., a popular cartoon character, © 2011 Nintendo/Pokémon) appeared in one of the four corners of the screen and 0.5 s later the model shifted her gaze to a corner for an average of 3.8 s, either congruent or incongruent. In both conditions, the model’s gaze returned to the center of the screen 0.5 s after the target disappeared, after which the next trial was presented.

Fig. 1

On the left is an example of a Congruent Condition video, on the right is an example of an Incongruent Condition video. In both conditions videos include 4 phases. Phase 1 (4 s): The model looks straight into the camera (simulating the model making eye contact with the participant). Phase 2 (0.5 s): One of eight targets (i.e., popular cartoon characters, © 2011 Nintendo/Pokémon) appears in one of the four corners of the screen. Phase 3 (3.8 s): The model shifts her gaze to a corner and holds her gaze, either congruent or incongruent. Phase 4 (0.5 s): The target disappears while the model continues gazing at the corner. After this last phase, the model shifts her gaze back to the center and the next trial begins

Data Acquisition

Stimulus presentation and participant eye gaze behavior were recorded using a free standing Tobii T-60 eye tracker with infrared cameras integrated under a 17” LCD display monitor (TOBII Technology AB, Tobii T-60). Eye movements from both eyes were collected at a rate of 60 recordings per second (60 Hz), from an average distance of 60 cm, with an accuracy of 0.5°. Videos covered a rectangular space on the screen of 21.5 cm by 30.5 cm. Viewed at a distance of 60.0 cm, videos corresponded to a 20.3° by 28.5° visual angle. In terms of average size, Faces were circular and measured 8.4 cm, while Targets were polygonal and measured 7.0 cm. Viewed at a distance of 60.0 cm, Faces and Targets corresponded to an 8.0° and 6.7° visual angle, respectively.
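The visual angles reported above follow from the standard relation θ = 2·arctan(size / (2·distance)). The short Python sketch below reproduces them from the reported sizes and viewing distance. The function and variable names are illustrative, and labeling the two video dimensions as the short and long side is an assumption, since the text does not say which dimension is the width.

```python
import math


def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Visual angle (in degrees) subtended by an object of the given size
    viewed head-on from the given distance: 2 * atan(size / (2 * distance))."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))


# Average viewing distance reported in the text (cm).
VIEWING_DISTANCE_CM = 60.0

# Stimulus sizes reported in the text (cm); the key names are illustrative.
sizes_cm = {
    "video_short_side": 21.5,
    "video_long_side": 30.5,
    "face": 8.4,
    "target": 7.0,
}

# Round to one decimal place, matching the precision used in the text.
angles_deg = {
    name: round(visual_angle_deg(size, VIEWING_DISTANCE_CM), 1)
    for name, size in sizes_cm.items()
}
print(angles_deg)
```

Because the relation is nonlinear, the angle should be recomputed rather than rescaled if a participant sits noticeably closer to or farther from the screen than 60 cm.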

Data acquisition focused on specific time intervals (scenes). The start of each scene coincided with the first frame in which the target appeared on the screen; similarly, the end of each scene coincided with the moment the target disappeared from the screen (see dotted line in Fig. 1). Thus, for each participant, we created 32 scenes (8 videos with 4 trials each), 16 of which were congruent and 16 incongruent. Gaze time allocation during each scene was evaluated by specifying two Areas of Interest (AOIs; Boraston and Blakemore 2007): (1) Target AOI, or gaze to the target, and (2) Face AOI, or gaze to the model’s face (see Fig. 2). In addition, for each scene, we quantified the amount of time during which the participant was ‘on-task’, defined as the participant gazing at the screen. Thus, data analysis for the current project included three dependent variables (DVs); each DV was measured 32 times per participant (i.e., one measurement per scene). The first two DVs were defined as the percentage of gaze time allocation to the Face/Target AOIs, relative to the participants’ on-task time. The third DV was defined as the percentage of on-task time, relative to the scene duration.
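The three DVs can be illustrated with a minimal Python sketch. It assumes a hypothetical data format in which each 60 Hz gaze sample within a scene has already been labeled as "face", "target", "other" (on screen but outside both AOIs), or None (no valid on-screen gaze). This labeling scheme and the function name are illustrative assumptions, not the export format of the eye-tracking software.

```python
from typing import Optional, Sequence


def scene_dvs(samples: Sequence[Optional[str]]) -> dict:
    """Compute the three per-scene DVs from one scene's gaze samples.

    Each sample is a hypothetical label: "face", "target", "other"
    (on screen but outside both AOIs), or None (no valid on-screen gaze).
    At a fixed sampling rate, sample counts are proportional to time.
    """
    total = len(samples)
    on_task = sum(1 for s in samples if s is not None)
    face = sum(1 for s in samples if s == "face")
    target = sum(1 for s in samples if s == "target")

    # On-task percentage is relative to the full scene duration.
    on_task_pct = 100.0 * on_task / total if total else float("nan")
    # AOI percentages are relative to on-task time, not scene duration.
    face_pct = 100.0 * face / on_task if on_task else float("nan")
    target_pct = 100.0 * target / on_task if on_task else float("nan")
    return {
        "face_pct": face_pct,
        "target_pct": target_pct,
        "on_task_pct": on_task_pct,
    }


# Toy scene with 10 samples, 8 of which are on screen.
example = ["face"] * 2 + ["target"] * 5 + ["other"] + [None] * 2
dvs = scene_dvs(example)
print(dvs)
```

Normalizing the AOI measures by on-task time rather than scene duration keeps periods of looking away from the screen from being counted against either AOI.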

Fig. 2

Sample stimuli with an illustration of the Face (red) and Target (green) Areas of Interest (AOI) used in analysis for the congruent condition (a), and the incongruent condition (b)

Statistical Analysis Plan

This research included three kinds of data analyses. The first set of analyses was preliminary and evaluated possible confounds such as (1) individual variation in on-task behavior, (2) variation in the size of the target stimuli, and (3) participant characteristics such as gender, age, as well as various indicators of global development (e.g., language and nonverbal cognitive abilities). The second set of analyses concerned our key hypotheses and evaluated differences in gaze time allocation between the congruent and incongruent conditions. Specifically, we predicted that differences in gaze time allocation between the two experimental conditions were moderated by the presence or absence of subclinical symptoms associated with the broad autism phenotype. All analyses were completed by fitting mixed models for repeated measures using SAS Proc Mixed (SAS software, Version 9.2 Copyright © 2002–2008 SAS Institute Inc., Cary, NC, USA). The three DVs included in this analysis (i.e., percentage of gaze time allocation to Face/Target AOIs, on-task percentage) are described above. All models were specified with a random intercept, which is equivalent to a repeated-measures ANOVA. Finally, when significant moderation was observed (i.e., significant interaction effects between experimental condition and measures of the broad autism phenotype), the method recommended by Preacher et al. (2006) was used to identify upper and/or lower regions of significance. These empirically derived regions of significance specify the upper and/or lower values of the moderator at which the congruent and incongruent conditions differed on the dependent variable of interest. This method is statistically conservative and practically more meaningful than other methods used to analyze continuous moderators (e.g., splitting participants into quartiles).
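The logic behind regions of significance can be sketched in Python. The sketch assumes a model of the form y = b0 + b_cond·condition + b_mod·moderator + b_int·condition·moderator, so that the effect of condition at moderator value z is w(z) = b_cond + b_int·z, with variance var_cond + 2·z·cov + z²·var_int. The boundaries are the values of z at which w(z)² equals t_crit² times that variance. All coefficient and (co)variance values below are hypothetical and chosen only for illustration; they are not estimates from this study, and the function names are not part of any published tool.

```python
import math


def regions_of_significance(b_cond, b_int, var_cond, var_int, cov, t_crit):
    """Moderator values at which the simple effect of condition sits exactly
    at the significance threshold.

    Solves (b_cond + b_int*z)**2 = t_crit**2 * (var_cond + 2*z*cov + z**2*var_int),
    which is a quadratic in z. Returns the sorted real roots (possibly empty).
    """
    t2 = t_crit ** 2
    a = b_int ** 2 - t2 * var_int
    b = 2.0 * (b_cond * b_int - t2 * cov)
    c = b_cond ** 2 - t2 * var_cond
    if a == 0:
        raise ValueError("degenerate case: the equation is not quadratic")
    disc = b ** 2 - 4.0 * a * c
    if disc < 0:
        return []  # no real boundary exists
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


def simple_effect_t(z, b_cond, b_int, var_cond, var_int, cov):
    """t statistic for the effect of condition at moderator value z."""
    effect = b_cond + b_int * z
    se = math.sqrt(var_cond + 2.0 * z * cov + z ** 2 * var_int)
    return effect / se


# Hypothetical estimates (NOT from this study): condition effect, interaction,
# their sampling variances and covariance, and a two-sided critical value.
params = dict(b_cond=20.0, b_int=-0.3, var_cond=36.0, var_int=0.01, cov=-0.55)
T_CRIT = 1.96

bounds = regions_of_significance(t_crit=T_CRIT, **params)
print("boundaries:", bounds)
print("t at z = 40:", simple_effect_t(40.0, **params))
print("t at z = 60:", simple_effect_t(60.0, **params))
```

With these illustrative values, the condition effect is significant for moderator values below the lower boundary and non-significant just above it, which mirrors the way the upper-limit values are interpreted in the Results.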

Results

Preliminary Data Analyses

The first set of preliminary analyses focused on the on-task percentage and aimed to describe the quality of the collected eye-tracking data and evaluate whether missing eye-tracking data systematically biased our statistical analyses. Descriptive analyses revealed that the mean on-task percentage for each participant ranged between 32.3 % and 99.5 % (M = 79.0 %, SD = 31.7). Four participants were excluded from all subsequent analyses due to poor eye-tracking data. That is, all four participants showed on-task percentage scores below 50 % on at least 17 of the 32 gaze trials. To evaluate whether on-task percentage scores were predicted by experimental condition (congruent vs. incongruent) and/or global measures of child development, we specified a series of mixed models for repeated measures with on-task percentage as outcome. Results showed that the mean on-task percentage did not differ significantly between the congruent and incongruent conditions, p > .05. However, results revealed a marginally significant main effect for chronological age, F(1, 44), p = .07, and significant main effects for receptive language age, F(1, 44) = 9.98, p < .01, receptive language standard scores, F(1, 44) = 5.67, p < .05, expressive language age, F(1, 44) = 9.74, p < .01, and nonverbal mental age, F(1, 44) = 7.67, p < .01. Results did not reveal a significant main effect for SRS total or subscale scores (p > .05). Finally, we did not find any evidence suggesting a significant interaction effect between experimental condition and global measures of child development. In summary, these findings suggest that children with younger developmental ages found the demands of viewing our experimental stimuli somewhat more strenuous than children with older developmental ages.
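The description of the excluded participants can be expressed as a simple check, sketched here in Python. Treating the description (on-task percentage below 50 % on at least 17 of the 32 scenes) as a formal exclusion rule is an assumption made for illustration, and the function name and toy values are hypothetical.

```python
def should_exclude(on_task_pcts, pct_threshold=50.0, min_low_scenes=17):
    """Return True if a participant has on-task percentages below
    `pct_threshold` on at least `min_low_scenes` scenes.

    The defaults mirror the description in the text; using them as a
    formal exclusion rule is an assumption of this sketch.
    """
    low_scenes = sum(1 for pct in on_task_pcts if pct < pct_threshold)
    return low_scenes >= min_low_scenes


# Toy participants with 32 scenes each (values are illustrative only).
poor_tracking = [40.0] * 17 + [90.0] * 15   # 17 low scenes
borderline = [40.0] * 16 + [90.0] * 16      # 16 low scenes

print(should_exclude(poor_tracking))
print(should_exclude(borderline))
```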

The second set of preliminary analyses evaluated associations between global developmental measures and gaze time allocation to the Face and Target AOIs. Throughout this analysis, we found no evidence of child characteristic-by-condition (congruent, incongruent) interaction effects (p > .05). With regard to the Face AOI, results showed that fixations to the face were significantly predicted by children’s expressive language standard scores, F(1, 44) = 5.76, p < .05. With regard to the Target AOI, results showed that fixations to the target were significantly predicted by children’s nonverbal mental age, F(1, 44) = 4.62, p < .05. Other indicators of global development such as chronological age, language age, receptive language standard scores, or nonverbal cognitive standard scores were not associated with fixations to the AOIs as main effects (p > .05). Since we found no evidence to suggest that the association between child characteristics and total fixations to the viewing screen or AOIs differed by condition (child characteristic-by-condition interaction effect), we ruled out the possibility that children’s global development confounded our subsequent hypothesis testing.

Lastly, we investigated whether target size predicted differences in on-task percentage, as well as gaze allocation to the Target and Face AOIs. Across the 8 targets used in this research, targets covered on average 4.02 % of the screen (SD = 0.75 %; range 2.84–4.78 %). Results from fitting a mixed model showed that there was a significant main effect of target size for gaze time to the Target AOI, F(1, 1370) = 14.63, p < .01, and to the Face AOI, F(1, 1370) = 4.50, p < .01, but not for on-task percentage, p > .05. On average, during trials with larger targets, participants allocated more gaze to the Target AOI, and less gaze to the Face AOI, as compared to trials with smaller targets. For all the above models, target size was not significantly associated with gaze time as an interaction effect with experimental condition (p > .44). In order to control for possible confounds associated with variation in target size, all subsequent models included target size as a main effect.

Differences in Gaze Time Allocation Between the Congruent and Incongruent Conditions

We predicted that participants would find the targets more interesting, and hence look longer, if the model was gazing at the targets (congruent condition) than when she gazed elsewhere (incongruent condition). Conversely, we predicted that when compared to the congruent condition, participants would look more to the face throughout the incongruent condition because their expectation of gaze following was violated. To test these hypotheses, we fit two mixed models with gaze time to each AOI (Face, Target) as outcome. Experimental condition (i.e., congruent vs. incongruent) was entered as a fixed effect. Significance tests confirmed both of our hypotheses. As illustrated in Fig. 3, typically developing children allocated less gaze time to the Face AOI in the congruent (estimated marginal mean = 30.04 %, SE = 1.83) than in the incongruent (estimated marginal mean = 32.89 %, SE = 2.41) condition, F(1, 1369) = 4.70, p < .05. Conversely, participants allocated more gaze time to the Target in the congruent (estimated marginal mean = 62.28 %, SE = 1.98) than in the incongruent (estimated marginal mean = 59.09 %, SE = 2.72) condition, F(1, 1369) = 5.92, p < .05.

Fig. 3

Percent gaze time allocation to the Face and Target AOIs during the congruent and incongruent conditions. *p < .05. Error bars represent SE calculated from estimated marginal means

Association Between Features of the Broad Autism Phenotype and Gaze Time to AOIs

We predicted that the broad autism phenotype, as measured by the Social Responsiveness Scale, would moderate gaze time allocation between the congruent and incongruent conditions. Specifically, we predicted that differences in gaze time would be more pronounced for participants with low SRS scores and less pronounced for individuals with high SRS scores. To evaluate this hypothesis, we fit a series of mixed models with Target AOI and Face AOI as outcome. All models included two main effects (i.e., SRS scores, experimental condition) and one interaction effect (i.e., SRS*condition). We first specified our models using the SRS Total T-scores. Subsequently, we also specified models using T-scores from the five SRS subscales (Social Awareness, Social Communication, Social Motivation, Social Cognition, and Autistic Mannerisms). Results are presented in Table 3.

Table 3 Associations between gaze time allocation and features of the broad autism phenotype, n = 46

Table 3 also reports the Regions of Significance (RoS) values (upper limit) for all significant SRS*condition interaction effects. In addition, Fig. 4a displays a graphic illustration of the condition* SRS Social Awareness Subscale T-score interaction effect for the Target AOI. For example, with regard to the Target AOI, participants with Social Awareness scores below 54.29 reliably differed in their gaze time to the Target AOI between the congruent and incongruent conditions. In contrast, participants scoring above this value (n = 19, 41.30 %) did not reliably differ in their gaze time to the Target AOI between the congruent and incongruent conditions. With regard to the Face AOI, the RoS analysis revealed that participants with Social Awareness scores below 53.52 reliably differed in their gaze allocation to the Face AOI between the congruent and incongruent conditions, whereas participants scoring above this cutoff (n = 19, 41.30 %) allocated gaze indistinguishably across the conditions (Fig. 4b).

Fig. 4

Graphs depicting the condition* SRS Social Awareness Subscale T-score interaction effect for a Target AOI, and b Face AOI. Regions of Significance (RoS) are indicated with a dotted line

Previous studies using representative samples have found that within the normal range of cognitive functioning, SRS scores are not associated with IQ (Constantino et al. 2000). In order to rule out participant characteristics as a potential confound, we computed correlations to examine possible associations between the broad autism phenotype and participant characteristics. Results showed non-significant trends suggesting that SRS T-scores may be associated with individual differences in receptive, r = −0.28, p = .06, and expressive language standard scores, r = −0.27, p = .07. Other indicators of global development such as chronological age, language age, nonverbal mental age, or nonverbal cognitive standard scores were not associated with SRS Total T-scores, p > .20. SRS Social Awareness subscale scores were not significantly related to measures of global development, p > .15. To test whether the SRS-by-condition interaction effects identified above can be attributed to individual differences in children’s language abilities, expressive and receptive language standard scores were added to the mixed models testing our hypotheses. Variables were simultaneously added as a main effect and a language-by-condition interaction effect. When predicting the Target AOI, the SRS Total T-score-by-condition interaction effect remained significant even when individual differences in children’s language abilities were statistically controlled. The corresponding model with Face AOI was not retested since the SRS-by-condition interaction effect was not significant. In predicting the Target and Face AOI, the SRS Social Awareness T-score-by-condition interaction effect remained significant even when language abilities were statistically controlled (see Table 3 for all other SRS scales).

Next, to ensure that the significant interaction effects were not driven by outliers in our sample, we removed participants with first- or second-degree relatives with ASD (n = 8). The SRS Social Awareness Subscale T-score-by-condition interaction effect for gaze time to the Target and Face AOIs was retested and remained significant. Likewise, participants with SRS total raw scores above 70 (n = 5) were removed, and the SRS Social Awareness Subscale T-score-by-condition interaction effect was retested for both AOIs and again remained significant. The SRS Total T-score-by-condition interaction effect for the Target AOI was tested in the same manner. It remained significant after removing participants with SRS raw scores above 70, as well as after removing participants with a family history of ASD.

Discussion

The present study contributes to a research program that aims to develop and validate an endophenotype measure of autism spectrum disorder (ASD). This measure utilizes eye-tracking technology to evaluate how individuals allocate their visual attention while watching social video clips. These videos display an adult model who gazes at a series of targets that appear and disappear in the four corners of the screen (congruent condition). Gaze allocation in this experimental condition is compared to a set of control stimuli in which the model’s gaze moves equally often but is not directed at the appearing and disappearing targets (incongruent condition). The current project demonstrated the feasibility and validity of administering our paradigm to typically developing children across a broad age range. Results revealed two major findings. First, gaze allocation differed significantly between the experimental and control conditions. Second, the Social Responsiveness Scale, a parent-report measure evaluating features of the broad autism phenotype, significantly predicted gaze allocation during our experimental stimuli.

In an effort to make our stimuli valid across the lifespan, we chose color movies over static images. We also chose to use a variety of popular cartoon characters as targets. Despite the attentional limitations of young children, we successfully collected data from 46 children (92 % of the total sample) aged 3 to 9 years. Of the four excluded children, two were boys and two were girls; together they averaged 4 years, 9 months in age (range: 3 years, 2 months to 6 years, 8 months). Three of these four children were among the youngest third of our sample, possibly indicating that the demands of our experimental paradigm are too strenuous for some younger children.

Results from this research revealed that typically developing children showed several significant differences in gaze allocation between the two conditions. First, gaze time to the target was longer in the congruent condition than in the incongruent condition. This finding suggests that the target was inherently more interesting to the participant if the model’s gaze was directed at the target as well, and less interesting if the adult model gazed elsewhere. A similar cuing effect has been reported in infants as young as 4 months of age (Reid and Striano 2005). Second, gaze allocation to the model’s face was greater in the incongruent than the congruent condition. One possible interpretation of this finding is that children were more likely to check the model’s face in the incongruent than the congruent condition because the participants’ expectations were violated.

This pattern of differences in gaze allocation between the congruent and incongruent conditions was most pronounced for children with low levels of parent-reported subclinical symptoms of the broad autism phenotype. In contrast, differences in gaze allocation between the two conditions were substantially attenuated in children with high levels of parent-reported subclinical symptoms. Differences in gaze time allocation between the congruent and incongruent conditions were most strongly moderated by the SRS Social Awareness Subscale. Importantly, this finding was robust: it remained significant after (1) statistically controlling for children’s language abilities, (2) removing children (n = 5) who scored above established cut points for ruling out ASD in general populations, or (3) removing children with first- or second-degree relatives with ASD (n = 8). Finally, post hoc analyses utilized the ‘regions of significance approach’ to identify the range of SRS Social Awareness subscale scores for which differences between the congruent and incongruent conditions were significant. Differences in gaze allocation to the Target AOI were significant for children with Social Awareness scores below 54.29, which corresponded to 58.69 % of our sample. Similarly, differences in gaze allocation to the Face AOI were significant for children with Social Awareness scores below 53.52, which also corresponded to 58.69 % of our sample.
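The regions of significance approach can be illustrated with a minimal sketch. It assumes a model in which the condition effect at moderator value m is b1 + b3·m, where b1 is the condition coefficient and b3 is the condition-by-moderator interaction coefficient. The boundaries are the moderator values at which the ratio of this simple slope to its standard error equals the critical value. All numbers and function names below are hypothetical and are not taken from the study.

```python
import math

# Hypothetical fixed-effect estimates and (co)variances; these are NOT values
# from the study. The moderator is assumed to be centred (e.g. T-score - 50).
B1 = 0.5      # condition effect (congruent minus incongruent) at moderator = 0
B3 = -0.05    # change in the condition effect per moderator point
V11 = 0.01    # sampling variance of B1
V13 = 0.0     # sampling covariance of B1 and B3
V33 = 0.0004  # sampling variance of B3
T_CRIT = 2.0  # assumed two-sided critical value


def simple_slope(b1, b3, m):
    """Condition effect at moderator value m."""
    return b1 + b3 * m


def simple_slope_se(v11, v13, v33, m):
    """Standard error of the condition effect at moderator value m."""
    return math.sqrt(v11 + 2.0 * m * v13 + m * m * v33)


def ros_boundaries(b1, b3, v11, v13, v33, t_crit):
    """Moderator values at which |slope / SE| equals t_crit.

    Squaring slope = +/- t_crit * SE gives a quadratic in m; its real
    roots are the region-of-significance boundaries.
    """
    a = b3 ** 2 - t_crit ** 2 * v33
    b = 2.0 * (b1 * b3 - t_crit ** 2 * v13)
    c = b1 ** 2 - t_crit ** 2 * v11
    if a == 0:
        return [] if b == 0 else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return []  # no boundary: significance never changes
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])


def is_significant(m):
    """True if the condition effect differs from zero at moderator value m."""
    t = simple_slope(B1, B3, m) / simple_slope_se(V11, V13, V33, m)
    return abs(t) > T_CRIT


bounds = ros_boundaries(B1, B3, V11, V13, V33, T_CRIT)
for m in bounds:
    print(f"boundary at centred moderator value {m:.2f}")
```

In a sketch like this, a reported cut-off such as the Social Awareness scores above would correspond to one of the roots, expressed on the T-score scale. The share of the sample falling inside the region is then a simple count of children whose scores lie on the significant side of that boundary.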

To our knowledge, this is the first report of the BAP predicting eye gaze patterns in a community sample of typically developing children. However, there is a growing body of eye-tracking studies examining the broad autism phenotype in first-degree relatives of children with ASD. Merin and colleagues used eye-tracking during the Still Face paradigm and found that a subgroup of 6-month-old sibs-ASD showed reduced gaze to the mother’s eyes and increased gaze to the mother’s mouth, whereas only one sibs-TD infant displayed this viewing pattern (Merin et al. 2007). However, in a follow-up study, reduced gaze to the eyes at 6 months was not related to symptom severity or symptom frequency at 24 months (Young et al. 2009). Fewer studies have focused on school-aged typically developing siblings of children with ASD. In a study of 13- and 14-year-old children, unaffected sibs-ASD spent significantly less time fixating on the eye region of photographed faces than sibs-TD (Dalton et al. 2007). Finally, when asked to discriminate emotions, parents of children with ASD have been shown to spend less time viewing the eyes of a face than parents of typically developing children (Adolphs et al. 2008). All of these studies included groups of typically developing individuals, but none tested the broad autism phenotype in these control groups.

One limitation of the current study is that we did not administer an observational measure of the broad autism phenotype. Currently, the number of valid observational measures of traits associated with ASD is limited. The most well-established observational measure in ASD research is the Autism Diagnostic Observation Schedule-Generic (ADOS; Lord et al. 2000), but the ADOS has not been widely tested in typically developing children, so it remains unclear whether the measure is sensitive enough to be used in the typical population. Dawson and colleagues developed the Broader Phenotype Autism Symptom Scale (BPASS; Dawson et al. 2007), a clinical interview that can be used with both adults and children. The BPASS has shown good internal consistency, but has yet to be standardized and tested for test-retest reliability.

A second limitation of the current study is that gaze time allocation was the only metric used to capture individual differences in children’s eye gaze patterns. Even though findings from this research suggest that this measure captures meaningful variation, it does not capture qualitative differences in children’s eye gaze patterns, such as ‘sticky attention’ to the target (which could be evaluated using the duration of the first fixation to the Target AOI) or the frequency with which children checked the model’s face while the target was present (which could be evaluated based on the frequency of fixations to the Face AOI). Thus, the eye-tracking paradigm presented in the current research should be interpreted merely as a marker for individual variation associated with gaze following, rather than as a direct measure of RJA.

In conclusion, the current study is a modest step in the validation of an endophenotype measure for autism spectrum disorder. Future research is still needed to fulfill the validation criteria outlined earlier. First, research still needs to determine whether impaired performance on this measure is more common in first-degree relatives of children with ASD. Second, the neural mechanisms involved in the social information processing elicited by this paradigm still need to be evaluated. Third, behavioral characteristics associated with performance on this measure still need to be evaluated in a sample of children with autism spectrum disorder.