Imagine attending a baby’s first birthday party with friends, family, and toys all vying for her attention. Amidst the noise and confusion, you want the birthday girl to notice your gift. You call her name, saying “Look at this!” She looks at you, sees the present, then grins at you and claps her hands. This baby has demonstrated a sophisticated skill that emerges around 10–12 months of age and develops throughout the second year of life (Carpenter, Nagell, & Tomasello, 1998). The ability to follow or respond to another person’s attention and to participate in the sharing of attention and affect sets the stage for children to learn about the physical and social environment. Responding to joint attention (RJA) has been associated with several social and communicative behaviors considered to be milestones of development in the second year of life, such as language acquisition and social cognitive development (Baldwin, 1995; Tomasello, 1995). For example, RJA is positively related to vocabulary comprehension and production, concurrently and longitudinally (Carpenter et al., 1998; Mundy, Kasari, Sigman, & Ruskin, 1995). Thus, joint attention appears to play a pivotal role in development, and failure to attend or respond to cues from social partners may impact children’s learning about the world and others’ experiences (Corkum & Moore, 1998b).

RJA can be accomplished by following the direction of a social partner’s gaze, head turns, verbal cues, or communicative gestures such as points (Corkum & Moore, 1998b). Important developments in RJA occur during the second year of life, as joint attention engagements evolve beyond episodes of simple coordinated attention (e.g., looking where someone else is looking) and children begin to experience a shared awareness of the mutual focus of attention (Bruner, 1995). Disturbances in RJA have been reliably observed in young children with autism (e.g., Mundy, Sigman, Ungerer, & Sherman, 1986; Stone, 1997), and failure to follow an adult’s gaze or point is considered a critical feature for early screening and detection of autism (Stone, Coonrod, Turner, & Pozdol, 2004). In addition to deficits in RJA, children with autism often have impaired language and social-communicative skills, including play, requesting, directing attention, and motor imitation (Stone, Coonrod, & Ousley, 2000). Moreover, associations between RJA and social-communicative development have been reported for children with autism. Specifically, RJA is linked concurrently with receptive language (Mundy et al., 1986) and both concurrently and predictively with expressive language (e.g., Mundy et al.; Sigman & Ruskin, 1999).

In the present study, we investigated RJA in children 12–23 months of age who are at elevated risk for autism or a related behavioral phenotype by virtue of having an older sibling diagnosed with an autism spectrum disorder (ASD). Not only are younger siblings of children with ASD more likely than the general population to receive a diagnosis of autism, they are also more likely to demonstrate similar behavioral symptoms, such as social impairments, language delays, or repetitive behaviors, to a milder, subclinical degree (Folstein, Bisson, Santangelo, & Piven, 1998; Rutter, Bailey, Simonoff, & Pickles, 1997). Thus, we investigated RJA in children at risk for ASD, as well as the relation of RJA to expressive and receptive language and to social-communication development.

Studies of younger siblings of children with ASD have shown impairments in several social-communicative behaviors that are known to be problematic for children with autism. Compared to low-risk infants, high-risk siblings demonstrate lower levels of language development (Yirmiya et al., 2006; Zwaigenbaum et al., 2005), produce fewer gestures (Goldberg et al., 2005; Yirmiya et al.; Zwaigenbaum et al.), and engage in less frequent eye contact and turn-taking (Goldberg et al.). Despite these indications of social-communicative difficulties, two studies found no impairment in RJA among younger siblings of children with ASD (Goldberg et al.; Yirmiya et al.). Both studies used the abridged version of the Early Social Communication Scales (ESCS; Mundy, Hogan, & Doehring, 1996), which measures RJA in response to multiple redundant cues that elicit and direct the child’s attention to a target (i.e., calling the child’s name, waiting until the child looks at the examiner’s face, shifting eye gaze, and pointing to the target). The use of multiple cues has been found to facilitate responsiveness to attention-directing bids. For example, children respond more often to an adult’s head/gaze shift when it is accompanied by verbal and gestural cues (as in the ESCS) than to a silent head/gaze shift (Leekam, Hunnisett, & Moore, 1998; Walden, Deák, Yale, & Lewis, 2001). Moreover, most parents of children with autism report that their children follow gaze only when it is accompanied by verbal and gestural cues (Leekam et al., 1998). Thus, measuring RJA only in response to “rich” directives could mask important differences between children at high and low risk for ASD. High-risk children may show impairments in RJA in response to less redundant attention-specifying cues, which may be more difficult to follow.

In the current study, RJA was assessed in response to different combinations of verbal and nonverbal cues that varied in redundancy of attention-specifying information. We investigated the effect of increasing the number of attentional cues, such as verbalizations and pointing gestures, which were presented in combination with head turns and shifts in eye gaze. In addition, we measured RJA in a setting where multiple objects or events competed for the child’s attention. Attentional cues were given while children were playing with toys, and therefore involved eliciting and redirecting children’s attention.

Previous research in “busy” experimental settings containing objects and toys has shown that gaze shifts accompanied by either eliciting (e.g., “Chris, Chris!”) or directing verbalizations (e.g., “Look at that!” or “Look at the dog!”) were easier to follow than silent gaze shifts (Walden et al., 2001). Typically developing year-old children were more likely to follow gaze accompanied by directing rather than eliciting verbalizations; however, there was no further benefit when gaze was accompanied by both eliciting and directing verbalizations (“Chris, Chris! Look at the dog!”). Pointing increased attention-following compared to both silent gaze shifts and gaze shifts with eliciting, but not directing, verbalizations (Walden et al.). Thus, verbalizations and pointing gestures may help children follow others’ gaze shifts by providing redundant attentional cues.

This study had three aims. The first aim was to compare RJA in younger siblings of children with ASD to typically developing children of the same age. We hypothesized that the at-risk siblings would be less likely to follow an adult’s attention than typically developing children. Furthermore, we expected this effect to be most pronounced in response to cues containing moderate levels of redundancy. Thus, the second aim was to determine whether specific sets of prompts were particularly problematic for at-risk younger siblings. Consistent with previous research, we did not expect to find group differences in RJA for highly redundant cues (i.e., those consisting of head turns and gaze shifts with a verbal cue and point), because these rich cues may help compensate for impairments in RJA. We also did not expect to detect group differences in RJA in response to the most subtle and least redundant cue (i.e., silent head turns and gaze shifts), because these directives are difficult to follow for all children of this age. The third aim was to determine whether individual differences in RJA were correlated with language abilities and social-communicative behaviors. We hypothesized that RJA would be correlated with language and social-communicative skills for children in both groups.

Method

Participants

Eighty-one children between 12 and 23 months of age (inclusive) participated: 46 younger siblings of children with autism spectrum disorder (SIBS-ASD; 26 males, 20 females) and 35 younger siblings of typically developing children (SIBS-TD; 24 males, 11 females). Seventy children were Caucasian, 6 African-American, 2 Hispanic, 1 American Indian, 1 Asian, and 1 child was multi-racial. English was the primary language spoken in the household. Informed consent was obtained from parents prior to participation.

SIBS-ASD were recruited from regional multidisciplinary evaluation and speech-language centers, a statewide birth-to-three service network, autism parent groups, and a university-based autism-specialized service and outreach program. Eligibility requirements were (1) An older sibling with a diagnosis of autism or PDD-NOS, determined by clinical diagnosis and Autism Diagnostic Observation Schedule (ADOS) classification (Lord et al., 2000); (2) Absence of severe sensory or motor impairments; and (3) Absence of identified metabolic, genetic, or progressive neurological disorders. Of the 46 probands, 29 were diagnosed with autism, 15 with PDD-NOS, and 2 with Asperger’s syndrome; chronological ages ranged from 2–12 years (M = 4.8, SD = 2.2).

SIBS-TD were recruited from birth records. Eligibility required (1) A typically developing older sibling; (2) No family history of autism or mental retardation in first-degree relatives; (3) Absence of severe sensory or motor impairments; and (4) Absence of identified metabolic, genetic, or progressive neurological disorders. Three eligible 23-month-old children (1 male, 2 females) were excluded to equate the chronological age (CA) means between the groups (see Table 1).

Table 1 Sample characteristics

Measures

Responding to Joint Attention (RJA) Task

RJA was assessed in a 2.4 × 8.1 m room; target stimuli were displayed on individual shelves arranged in three columns (left, middle, right) across one wall (see Fig. 1). Three brackets were mounted vertically on the wall (spaced 2.7 m apart) and supported three rows of clear shelves (spaced 0, 1, and 1.9 m above the floor), resembling a 3 × 3 matrix. The objects were placed on 8 of the 9 shelves, leaving the middle section of the bottom row empty. In the middle column, a video camera with zoom lens was mounted approximately 90 cm above the floor to record the child’s face at eye-level. Children were seated in a Rifton chair at a child-sized table (61 cm²) placed 2 m from the stimulus wall in the center of the stimulus display. The child was given age-appropriate toys to play with; toys were replaced, as needed, to maintain the child’s interest. The primary experimenter sat on a short stool on either the child’s right or left.

Fig. 1 Configuration of experimental room

Two additional miniature cameras were mounted at the far left and far right of the stimulus array positioned level with the middle row of stimuli to record the child’s head and upper body movements. Images from the three cameras were integrated to produce one videotape, allowing coders to view images from the three cameras at once.

Because delayed language development has been observed in younger siblings of children with ASD (Yirmiya et al., 2006; Zwaigenbaum et al., 2005), novel objects served as the target stimuli, and novel object labels were used during the verbal attention-directing cues (see Table 2 for examples of novel labels). This procedure was employed to reduce bias that might result from one group being more familiar with the object labels used in the verbal cues. That is, children with larger receptive vocabularies (i.e., siblings of typically developing children) might more easily locate the referent of the verbal cue because of increased understanding of the verbal label. Pilot testing indicated that none of the novel objects resembled real objects that could be readily labeled by children or adults and that none of the novel labels sounded similar to English words.

Table 2 Verbal and nonverbal cues by prompt set

Ten different types of attention-specifying prompts were used, each containing a different combination of physical and verbal cues (see Table 2). The timing of the gaze and gestural cues depended on the verbal content in the prompt. If the prompt type did not include an eliciting verbalization, the experimenter delivered the directing verbalization and gestural cues simultaneously while shifting gaze to one object on the target wall. For example, the experimenter would say, “Look at the blicket!” while simultaneously shifting head/gaze and pointing to the target. If the prompt type involved an eliciting verbalization, the experimenter looked directly at the child and repeated the child’s name twice, followed by the nonverbal cues (i.e., gaze shift or gaze shift + point). For example, the experimenter would call the child’s name (e.g., “Chris, Chris!”) while looking at the child, then shift head/gaze to the target.

Each child received all 10 RJA prompt types, with order randomized and counterbalanced across participants. Each prompt type was repeated twice, once with the experimenter on the child’s left and once on the right, yielding two trials for each of the 10 RJA prompt types. The experimenter delivered the cues when the child was visually engaged with the toys at the table. Each RJA trial lasted 10 s while the experimenter held the physical position and facial expression constant. Four mutually exclusive variables were coded: number of trials during which children (1) remained visually engaged with the toys on the table, (2) looked toward the stimulus wall, (3) fixated on a specific location on the stimulus wall (both correct and incorrect locations), and (4) engaged in another visual response (e.g., looks to experimenter or ceiling). In addition, the accuracy with which children located the target was evaluated for each trial during which children fixated on a location on the stimulus wall (see below).

Mullen Scales of Early Learning (MSEL; Mullen, 1995)

The MSEL measures cognitive function on a gross motor scale and four cognitive scales assessing nonverbal problem-solving (Visual Reception), fine motor skills, receptive language, and expressive language. We administered the four cognitive scales only. An estimate of mental age was obtained by averaging the mental age equivalents for the four scales. Mean mental age and age equivalent scores for the four scales are in Table 1. Raw scores for receptive and expressive language were used in correlational analyses with RJA.

Screening Tool for Autism in 2-year-olds (STAT; Stone et al., 2000, 2004)

The STAT is a 20-min interactive, play-based measure that provides a standard context for eliciting and observing early social-communicative behaviors. It consists of 12 items assessing behaviors in 4 social-communicative domains: Play, Requesting, Directing Attention, and Motor Imitation. Items in each domain are scored as pass (0) or fail (1) and a domain average is calculated. The total STAT score is the sum of the domain averages and ranges from 0 to 4, with higher scores reflecting greater impairment. When used as a screening tool, the total score is compared to the established cutoff score for autism risk. The present study used the total STAT score for correlational analyses.
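The scoring rule just described can be illustrated with a short sketch. The function name, the dictionary layout, and the item counts per domain in the example are hypothetical and chosen only for illustration; the sketch assumes nothing beyond what is stated above (items scored 0 for pass or 1 for fail, one average per domain, total equal to the sum of the four domain averages).

```python
from statistics import mean

# The four STAT domains named in the text.
DOMAINS = ("Play", "Requesting", "Directing Attention", "Motor Imitation")


def stat_total(item_scores):
    """Return the sum of the four domain averages (range 0 to 4).

    item_scores maps each domain name to a list of item scores,
    where 0 = pass and 1 = fail. Higher totals mean greater impairment.
    """
    for domain in DOMAINS:
        items = item_scores.get(domain)
        if not items:
            raise ValueError(f"no item scores for domain: {domain}")
        if any(score not in (0, 1) for score in items):
            raise ValueError(f"item scores must be 0 or 1 in: {domain}")
    return float(sum(mean(item_scores[domain]) for domain in DOMAINS))


# Hypothetical child; the number of items per domain is illustrative only.
example = {
    "Play": [0, 1],
    "Requesting": [0, 0],
    "Directing Attention": [1, 1, 0, 0],
    "Motor Imitation": [1, 1, 1, 1],
}
print(stat_total(example))  # 0.5 + 0.0 + 0.5 + 1.0
```

Averaging within each domain before summing gives every domain equal weight in the total, regardless of how many items it contains.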

Procedure

The RJA task was administered first. When families arrived, the procedures were explained to parents while the primary experimenter played with the child in the experimental room. After a brief warm-up period, the child was seated at a table. Parent(s) watched from an adjacent observation room. If the child was unable to separate from his/her parents, one parent was present during the procedure (SIBS-ASD: 7 out of 46; SIBS-TD: 7 out of 35) and the child was seated on the parent’s lap at the same position and height as those without a parent present. Parents closed their eyes and remained silent to ensure that the child’s responses were not influenced by verbal or nonverbal cues from the parent.

Coding

To ensure that coders remained unaware of the correct target location and cues, cameras were positioned such that the experimenter was not visible during the RJA task and coding occurred without sound. Videotapes were converted to digital format and coded using ProcoderDV software (Tapp, 2003), allowing the onset and offset of each RJA trial to be recorded with single-frame accuracy. RJA was coded by trained observers blind to sibling group membership, with a partial interval coding system. Coders watched the 20 RJA trials and designated 1 of the 8 target locations or an alternate looking pattern, such as visual scanning of the stimulus wall, as the child’s primary focus in each trial. If the child looked to a location on the wall, the target location code (1–8) was determined by the child’s initial visual fixation unless the child clearly referred back to the experimenter and then visually oriented to a new target during the interval.

Correct looks to targets (accuracy) were determined by comparing codes to the actual target location using the following criteria. If the code matched the target location, a score of 1 was given. If the coded location was vertically adjacent to the target location, a score of 0.5 was given (e.g., child looked at the top row position of the left column, but the target location was the middle row position of the left column). This procedure compensated for a fairly small visual angle between vertically adjacent target locations, which made it difficult for coders to distinguish them. If the coded location did not match or was not vertically adjacent to the target, a score of 0 was given. Possible RJA scores were 0, 0.5, and 1 for each trial; across the 20 RJA trials accuracy ranged from 0 to 20, with higher scores reflecting increased accuracy.
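The per-trial scoring criteria can be sketched as follows. The numbering of the location codes (1–8) is not specified above, so this sketch represents each shelf by a hypothetical (row, column) pair instead; the function names are also illustrative. Trials coded as an alternate looking pattern (represented here as `None`) are assumed to receive a score of 0.

```python
def score_trial(coded, target):
    """Score one RJA trial as 1, 0.5, or 0.

    coded  -- (row, column) of the child's fixation, or None when the coder
              recorded an alternate looking pattern (assumed to score 0)
    target -- (row, column) of the actual target shelf
    """
    if coded is None:
        return 0.0
    if coded == target:
        return 1.0
    same_column = coded[1] == target[1]
    adjacent_rows = abs(coded[0] - target[0]) == 1
    # Vertically adjacent shelves earn partial credit.
    return 0.5 if same_column and adjacent_rows else 0.0


def rja_accuracy(trials):
    """Sum trial scores across a session; trials is a list of (coded, target)."""
    return sum(score_trial(coded, target) for coded, target in trials)


# Hypothetical session fragment: exact match, vertical neighbor, miss, no fixation.
session = [
    ((0, 0), (0, 0)),
    ((0, 0), (1, 0)),
    ((0, 0), (0, 1)),
    (None, (2, 2)),
]
print(rja_accuracy(session))  # 1.0 + 0.5 + 0.0 + 0.0
```

With 20 trials per child, the summed score falls between 0 and 20, matching the range given above.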

Reliability

Coders were trained to an established standard (κ > .80). Twenty percent of the files were randomly selected to be coded by a second observer (SIBS-ASD: 10 out of 46; SIBS-TD: 8 out of 35). Agreement was estimated using weighted kappas calculated at the participant level; disagreements between vertically adjacent codes were considered less serious than other disagreements. Average agreement between coders for the SIBS-ASD group was .83 (SD = .12) and for the SIBS-TD group .92 (SD = .09). The intra-class correlation coefficient for the overall RJA score was .99. Intra-class correlation coefficients for the five sets of prompt types (G, G + E, G + D, G + E + D, G + V + P) were as follows: 1.0, 1.0, .75, .78, and .97, respectively.
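A weighted kappa of the kind described can be sketched as below. The actual weights are not reported here, so the disagreement weights in this sketch (0 for agreement, 0.5 for vertically adjacent locations, 1 otherwise) are an assumption that simply mirrors the accuracy scoring; the (row, column) location codes and function names are likewise hypothetical.

```python
from collections import Counter


def disagreement(a, b):
    """Assumed weights: 0 = same code, 0.5 = vertically adjacent, 1 = otherwise."""
    if a == b:
        return 0.0
    if isinstance(a, tuple) and isinstance(b, tuple):
        if a[1] == b[1] and abs(a[0] - b[0]) == 1:
            return 0.5
    return 1.0


def weighted_kappa(codes_1, codes_2):
    """Weighted kappa for two coders' trial-by-trial codes for one participant."""
    if len(codes_1) != len(codes_2) or not codes_1:
        raise ValueError("both coders must code the same non-empty set of trials")
    n = len(codes_1)
    # Mean weighted disagreement actually observed.
    observed = sum(disagreement(a, b) for a, b in zip(codes_1, codes_2)) / n
    # Mean weighted disagreement expected from each coder's marginal frequencies.
    m1, m2 = Counter(codes_1), Counter(codes_2)
    expected = sum(
        m1[a] * m2[b] * disagreement(a, b) for a in m1 for b in m2
    ) / (n * n)
    if expected == 0:
        return float("nan")  # undefined when both coders used one identical code
    return 1.0 - observed / expected


# Hypothetical codes for four trials; "other" stands for an alternate looking pattern.
coder_1 = [(0, 0), (1, 0), (2, 1), "other"]
coder_2 = [(0, 0), (0, 0), (2, 1), "other"]
print(round(weighted_kappa(coder_1, coder_2), 3))
```

Computing the statistic per participant and then averaging, as described above, yields a group mean and standard deviation of agreement.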

Results

Overview

The RJA accuracy score was aggregated across performance on the 20 RJA trials; this score was expected to be the strongest and most reliable measure of overall RJA ability. To address Hypothesis 1, group differences in overall RJA accuracy were examined. To understand how the different attentional cues influenced responding, theoretically driven inter-group and intra-group contrasts were conducted for five sets of prompt types (Hypothesis 2). Finally, we examined correlations between individual differences in RJA and receptive language, expressive language, and social-communicative behavior (Hypothesis 3).

Did SIBS-ASD Follow Attention Less Accurately Than SIBS-TD? (Hypothesis 1)

To determine whether RJA skills were weaker in SIBS-ASD than SIBS-TD, a 2 × 2 (Group × Gender) ANOVA was performed with correct looks to targets as the dependent variable. There was a significant main effect for group, F(1,77) = 5.58, P < .05, Cohen’s d = .54, with SIBS-ASD obtaining lower RJA scores than SIBS-TD (see Table 3). The main effect for gender was not significant, F(1,77) = 1.93, ns, d = .32; neither was the interaction between group and gender, F(1,77) = 0.04, ns, d = .05. These results indicate that SIBS-ASD were less able to follow the experimenter’s attention than SIBS-TD across a variety of attentional cues.

Table 3 Proportion of total trials in which each behavioral response occurred

A child might fail to follow attention either because the child did not look away from the toys to notice the cue, or the child was less accurate in locating the target, even when disengaged from the toys. We investigated the extent to which each of these possibilities accounted for reduced RJA scores in the SIBS-ASD group.

A one-way ANOVA indicated that children in both groups looked away from the toys during similar proportions of trials, F(1,79) = 1.19, ns, d = .25. Children in both groups looked away from the toys during about 2/3 of the trials (see Table 3). Moreover, the proportion of trials in which children looked to a target on the stimulus wall did not differ between groups, F(1,79) = 2.29, ns, d = .34. These results indicate that the reduced RJA accuracy for SIBS-ASD cannot be attributed either to looking away from the toys less often or to a reduced tendency to fixate some target on the stimulus wall.

To ensure that group differences in correct looks to the target could not be attributed to difficulty disengaging from the toys, data were re-analyzed for only those trials in which the child looked away from the toys. A one-way ANOVA indicated a significant main effect of group, F(1,79) = 6.16, P < .05, d = .56, with SIBS-ASD looking to the correct target (M = .30, SD = .19) significantly less often than SIBS-TD (M = .41, SD = .18). Thus, the SIBS-ASD were less accurate in following attention than the SIBS-TD, even after controlling for trials in which children did not look away from the toys.
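The method used to obtain Cohen’s d is not stated. As a reading aid, a common approximation for an F test with one numerator degree of freedom, d ≈ 2√(F / df_error), reproduces the effect sizes reported for the analyses above when rounded to two decimals. The sketch below is offered under that assumption; the function name is hypothetical.

```python
import math


def d_from_f(f_value, df_error):
    """Approximate Cohen's d from a 1-df F statistic: d = 2 * sqrt(F / df_error)."""
    if f_value < 0 or df_error <= 0:
        raise ValueError("F must be non-negative and df_error positive")
    return 2.0 * math.sqrt(f_value / df_error)


# (F, error df, d as reported in the text)
reported = [
    (5.58, 77, 0.54),  # group main effect, 2 x 2 ANOVA
    (1.93, 77, 0.32),  # gender main effect
    (0.04, 77, 0.05),  # group x gender interaction
    (1.19, 79, 0.25),  # looks away from toys
    (2.29, 79, 0.34),  # looks to a target on the wall
    (6.16, 79, 0.56),  # correct looks, given looks away
]
for f_value, df_error, d_reported in reported:
    print(f_value, df_error, round(d_from_f(f_value, df_error), 2), d_reported)
```

The approximation is exact only for equal group sizes, so small discrepancies would be expected in other samples.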

To investigate whether the group differences were produced by a few children with extreme scores, we examined the distributions of RJA scores. Table 4 presents the number of children in each group for each decile of RJA scores. Sixty-one percent of the children in the SIBS-ASD group (28 out of 46) obtained accuracy scores within the lowest two deciles (i.e., 20% or less of the trials), compared to 34% of the children in the SIBS-TD group (12 out of 35), χ² = 5.62, P < .05. Thus, the lower RJA scores of SIBS-ASD were not due to a few children who did not follow the experimenter’s attention; rather, the performance of the majority of the SIBS-ASD fell in the lowest two deciles, whereas the majority of SIBS-TD were in the highest four deciles.
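The chi-square comparison can be checked from the counts given above. Whether a continuity correction was applied is not stated; the uncorrected Pearson statistic computed in this sketch matches the reported value when rounded to two decimals. The function name is hypothetical, and the counts of children outside the lowest two deciles (18 and 23) are obtained by subtraction from the group sizes.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must contain at least one case")
    return n * (a * d - b * c) ** 2 / denominator


# Rows: SIBS-ASD, SIBS-TD. Columns: lowest two deciles, all other deciles.
chi_square = pearson_chi_square_2x2(28, 46 - 28, 12, 35 - 12)
print(round(chi_square, 2))

# Percentages of each group falling in the lowest two deciles.
print(round(100 * 28 / 46), round(100 * 12 / 35))
```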

Table 4 Number of children in each group by percentage of correct looks to targets

Were Certain Types of Prompt Sets More Difficult to Follow for SIBS-ASD than for SIBS-TD? (Hypothesis 2)

Inter-Group Comparisons

We hypothesized that SIBS-ASD would be less accurate in following the experimenter’s attention than SIBS-TD when presented with moderately redundant attentional cues (i.e., those involving gaze shifts plus verbalizations: (G + E), (G + D), (G + E + D); see Table 2). However, we did not expect group differences in responding to highly redundant cues (i.e., those involving gaze shifts, verbalizations [either directing or eliciting and directing], and points (G + V + P)) or differences in responding to impoverished cues with less attention-redirecting information (i.e., silent gaze shifts (G)). Planned inter-group and intra-group contrasts were conducted for one prompt type involving a silent gaze shift ((G), two trials), one prompt type involving a gaze shift with an eliciting verbal cue ((G + E), two trials), two prompt types involving gaze shifts with directing verbal cues ((G + D), four trials), two prompt types involving gaze shifts with eliciting and directing verbal cues ((G + E + D), four trials), and two prompt types involving gaze shifts with pointing gestures accompanied by either directing or eliciting and directing verbal cues ((G + V + P), four trials).

A repeated measures 2 × 5 (Group × Prompt Set) ANOVA was conducted with group as the between-subjects variable and prompt set as the within-subjects variable. Results are in Table 5. Results supported the hypothesis that the SIBS-ASD would have fewer correct looks than SIBS-TD for two of the three moderately redundant prompt sets (i.e., those involving gaze shifts with verbal cues): gaze shifts with directing verbalizations (G + D), d = .25, and gaze shifts with eliciting and directing verbalizations (G + E + D), d = .32. There was no group difference in response to gaze shifts with eliciting verbalizations (G + E), d = .13. As predicted, there were no group differences in response to silent gaze shifts (G) or gaze shifts + verbalizations + points (G + V + P). Furthermore, for both groups, the mean accuracy for silent gaze shifts (G) did not differ from zero. Thus, responding to silent head/gaze shifts was difficult for all children in this age range.

Table 5 Correct looks (to targets), looks away (from toys), & correct looks given looks away for different prompt sets

Again, lower accuracy scores could result from a failure to look away from the toys, or failure to accurately locate the target of the adult’s attention. To examine whether SIBS-ASD looked away from the toys less often than SIBS-TD during each set of prompts, a 2 × 5 (Group × Prompt Set) repeated measures ANOVA was conducted; the dependent variable was the proportion of trials during which children looked away from the toys. Results are in Table 5. There was a significant group difference only in response to gaze shifts with directing verbalizations (G + D), d = .23; SIBS-ASD looked away from the toys during these trials significantly less often than SIBS-TD, which may have contributed to the lower RJA scores in the SIBS-ASD group for prompts involving gaze shifts with directing verbalizations. Interestingly, this was not true for the gaze shifts with eliciting and directing verbalizations (G + E + D), d = .18; for these cues, the SIBS-ASD had significantly lower RJA accuracy even though they looked away from the toys during similar proportions of these trials as the SIBS-TD.

To further explore whether the group differences in correct looks to targets observed for the two types of prompts involving gaze shifts with verbalizations (G + D, G + E + D) could be attributed to difficulty looking away from the toys, we re-analyzed the data for only those trials in which the child looked away from the toys. A 2 × 5 (Group × Prompt Set) repeated measures ANOVA was conducted on correct looks to targets, controlling for the number of trials during which children looked away from the toys. Results are in Table 5. There was only a marginally significant group difference in response to gaze shifts with directing verbalizations (G + D; P < .10, d = .20), indicating that the lower RJA accuracy scores in SIBS-ASD in response to these types of prompts could be due to their failure to look away from the toys. Thus, for the G + D trials in which children looked away, the accuracy of the SIBS-ASD in locating the target was similar to that of the SIBS-TD. In contrast, the group differences in response to prompts involving gaze shifts with eliciting and directing verbalizations (G + E + D) remained significant after controlling for the number of trials during which children looked away from the toys, d = .26, further indicating that the lower accuracy for this prompt set in the SIBS-ASD reflected a failure to locate the target of the experimenter’s attention.

In the ESCS, the examiner directs attention by calling the child’s name (an eliciting verbalization), waiting until the child looks at the examiner’s face, shifting head direction and eye gaze, and pointing to the target. In the present study we included a type of prompt similar to that used in the ESCS: an eliciting verbalization (e.g., “Chris, Chris!”) followed by a head turn and gaze shift with a point. However, in the ESCS procedure the examiner waits until the child has established eye contact before delivering the cues, whereas our procedure did not require eye contact before delivering the cues. In fact, children had to be visually engaged with the toys prior to receiving cues. Thus, we analyzed group differences in response to this particular prompt type because it is similar to the cue used in the ESCS. Consistent with previous sibling studies using the ESCS (Goldberg et al., 2005; Yirmiya et al., 2006), we found no significant difference between the groups in response to this type of prompt, F(1,66) = 1.53, ns.

To explore whether the group differences in RJA could be explained by group differences in visual spatial abilities, we examined whether visual reception was significantly correlated with RJA and whether group predicted RJA when visual reception was controlled using multiple regression. Visual reception was significantly correlated with correct looks to targets and correct looks after controlling for looks away from the toys (r = .34, P < .01 for both). The standardized regression coefficient for group without visual reception was β = .25, P < .05; with visual reception the coefficient for group was β = .19, P < .10. After controlling for the number of trials during which children looked away from the toys, the standardized regression coefficient for group without visual reception was β = .29, P < .01; with visual reception the coefficient for group was β = .23, P < .05. Thus, although group differences in visual reception did account for some of the variation in RJA, group status continued to predict RJA accuracy after controlling for visual reception and the number of trials during which children looked away from the toys.

Intra-Group Comparisons

Results are presented in Table 6. Certain patterns were common to both groups. For example, gaze shifts (G) were more effective cues for both groups when combined with any of the following: a directing verbalization (G + D; SIBS-ASD d = .40, SIBS-TD d = .66), an eliciting and directing verbalization (G + E + D; SIBS-ASD d = .56, SIBS-TD d = .90), and pointing gestures with directing or eliciting and directing verbal cues (G + V + P; SIBS-ASD d = .86, SIBS-TD d = .93). Similarly, the addition of a point (G + V + P) increased RJA accuracy over gaze shifts with eliciting verbalizations (G + E; SIBS-ASD d = .58, SIBS-TD d = .52) and gaze shifts with directing verbalizations (G + D; SIBS-ASD d = .56, SIBS-TD d = .34). Gaze shifts were more effective cues for SIBS-TD, but not SIBS-ASD, when combined with an eliciting verbalization (G + E; d = .36) or eliciting and directing verbalizations (G + E + D; d = .49). Adding a point to gaze shifts with eliciting and directing verbal cues (G + E + D) increased accuracy only for the SIBS-ASD, d = .37.

Table 6 Intra-group comparisons of correct looks to targets between prompt sets

Were Individual Differences in RJA Correlated with Language and Social-Communicative Behaviors? (Hypothesis 3)

Correlations between RJA and language and social-communication skills were significant for the SIBS-ASD only (see Table 7); pair-wise comparisons for each correlation indicated that no correlation differed between groups. After controlling for chronological age, the correlations remained significant for the SIBS-ASD.

Table 7 Correlations of language and social communicative behaviors with RJA

Discussion

In everyday situations young children are often surrounded by objects, events, and people that compete for their attention. To follow another person’s attention, children must monitor their social partners and notice cues to their attentional focus. Social partners may use a variety of different verbal and nonverbal cues to elicit and direct children’s attention. Noticing and following those cues lets the child in on the internal world of others’ attention and intentions.

Our experimental setting provided competition for attention, and prompts ranged from those expected to be difficult (e.g., head and gaze shifts only) to those expected to be easily followed by children in this age range (e.g., redundant prompts with head/gaze shifts, verbalizations, and pointing). Because the experimenter gave attention cues only when children were occupied with toys, children had to shift their attention from the toys to the social partner in order to notice the cues and locate the new target. In addition, because novel stimuli and object labels were used, verbal cues did not specify the target. Even the verbalizations that contained labels (e.g., “Look at the koba!”) did not use real words; real words might have allowed more able language learners to scan the stimulus display and identify the designated object. To follow the experimenter’s attention, children had to use gaze and pointing cues.

Hypothesis 1 proposed that SIBS-ASD would demonstrate impaired RJA relative to SIBS-TD across the range of prompt types. This hypothesis was supported by the finding that SIBS-ASD were less accurate in locating the targets than SIBS-TD. This finding is inconsistent with previous studies that used the ESCS to measure RJA. Yirmiya et al. (2006) found no differences between SIBS-ASD and SIBS-TD in response to ESCS cues in 14-month-old infants, nor did Goldberg et al. (2005) find differences in 14- to 17-month-old infants. Methodological differences between the studies are a likely source of the discrepant findings. The aggregate measure of RJA employed in this study differed from the ESCS in several ways, including the variety of attentional cues presented (10 different prompt types vs. 1 prompt type) and the number of trials used to assess RJA (20 vs. 6 trials). In the present study, RJA was assessed using prompts ranging in type and difficulty, whereas the ESCS uses only one type of prompt.

The weaker RJA performance evidenced by SIBS-ASD was not due to a small number of children who performed especially poorly; rather, it was characteristic of a majority of SIBS-ASD. The distributions of scores differed for the two groups (Table 4). Whereas the majority of the SIBS-TD scored in the highest four deciles, the majority of the SIBS-ASD scored in the lowest two deciles. Thus, it appears that SIBS-ASD experience difficulty responding to joint attention, although the extent to which these group differences reflect an early indicator of a broader autism phenotype awaits confirmation through longitudinal follow-up. We would not expect all younger siblings of children with ASD to demonstrate the broader phenotype, but those with the most severe impairments in RJA and other skills may be at higher risk than less impaired siblings.

Of interest were findings that SIBS-ASD appeared to monitor the attention of the social partner to a similar degree as SIBS-TD, in that they showed comparable rates of disengaging attention from toys and looking toward the experimenter. These findings are not consistent with previous reports of deficits in attention disengagement for young children with autism (Landry & Bryson, 2004) and for a subgroup of high-risk siblings who receive a later diagnosis of autism (Zwaigenbaum et al., 2005). Differences in the nature of the samples as well as the disengagement tasks employed may account for these discrepant findings. The earlier studies employed a computer-based task using geometric shapes as stimuli, whereas the present study used a more naturalistic, socially based task. Performance differences in these two situations would not be unexpected. In addition, the Zwaigenbaum et al. study reported attention disengagement difficulties only for the subset of siblings who received a later diagnosis of autism. In the present study, follow-up diagnoses were not yet available; thus it is possible that continued follow-up would reveal a bimodal distribution for disengagement in our sibling sample, as well. Nevertheless, our results did suggest that difficulty in locating the target of another person’s attention, rather than an inability to shift attention, was present in a substantial proportion of high-risk siblings.

The ability to locate the target of another’s attention involves the development of social cognitive processes involved in understanding social gaze shifts and gestures, as well as maturation of basic spatial analytic processes involved in encoding locations relative to the child (Mundy, Card, & Fox, 2000). We found that accurately locating the target of another’s attention is related to visual spatial ability. However, although visual spatial abilities are a component of RJA, group status contributed to RJA accuracy above and beyond spatial ability.

Hypothesis 2 proposed that SIBS-ASD would show the most pronounced RJA impairments to cues containing moderate—rather than high or low—levels of redundancy. This hypothesis was supported. The prompt set comparisons indicated that SIBS-ASD were less accurate than SIBS-TD when following attention in response to two types of moderately redundant cues: gaze shifts accompanied either by directing verbalizations or by directing and eliciting verbalizations. In contrast, there were no group differences in RJA in response to silent head/gaze shifts (which were difficult for all children in this age range) or to prompts involving a combination of gaze shifts, verbalizations, and points (which were expected to be easier because of redundancy of cues). The latter prompt type is similar to that used in previous studies of RJA in SIBS-ASD using the ESCS, which have reported no group differences (Goldberg et al., 2005; Yirmiya et al., 2006). Thus, despite a general deficit in following attentional cues for the SIBS-ASD, under some conditions the SIBS-ASD performed as well (or as poorly) as the SIBS-TD (i.e., on trials in which highly redundant cues or few redundant cues were used).

The moderately redundant cues (i.e., gaze shifts with verbalizations) proved more difficult for the young SIBS-ASD than for the SIBS-TD. For the SIBS-TD, calling the child’s name before shifting attentional focus resulted in a significant increase in RJA (though still fairly low, it rose to almost one-third from less than one-fifth of the trials), suggesting that, without the name call, these children were not spontaneously monitoring the partner’s attention and so missed the shifts. When their names were called, however, the children were more likely to notice and follow the head/gaze shift, and their RJA scores rose. When nonspecific directing instructions were added to the name calling and gaze shifting (“Look at the blicket!” or “Look at that!”), RJA accuracy improved further, rising to almost half the trials. The addition of a pointing gesture to gaze shifts with eliciting and directing verbalizations did not significantly improve RJA. Thus, 3 of the 5 cue combinations were equally effective in eliciting good performance from SIBS-TD.

For the SIBS-ASD, silent head/gaze shifts were difficult to follow, as was true for the SIBS-TD. Calling the child’s name prior to the shift did not significantly help, even though it did result in children looking away more often from objects that had previously occupied their attention. Nonspecific directing verbalizations did help the SIBS-ASD and adding pointing gestures to these cues helped even more. With the easiest, most redundant cue combinations, which included pointing in addition to head/gaze shifts and verbal directives, the accuracy of the SIBS-ASD was no lower than the accuracy of the SIBS-TD, suggesting that with sufficient redundancy in attentional cues, younger siblings of children with ASD do as well as their low-risk counterparts.

The importance of responding to others’ attentional cues is perhaps most compelling when considered in the context of language learning. Several studies have found that attention-following in typically developing infants is associated both concurrently and predictively with vocabulary development (e.g., Carpenter et al., 1998; Mundy & Gomes, 1998). A similar pattern has been reported for children with ASD, in that concurrent and predictive relations between RJA and vocabulary development have been found (McDuffie, Yoder, & Stone, 2005; Sigman & Ungerer, 1984). Relative weaknesses in language development have also been found among siblings of children with ASD (Yirmiya et al., 2006; Zwaigenbaum et al., 2005). It is possible that early deficits in RJA may contribute to these language difficulties.

Hypothesis 3 addressed the relation between RJA and language and social-communicative skills. Results indicated that RJA correlated with language ability and social-communicative behavior for the SIBS-ASD only, even after controlling for the effects of chronological age. That is, although there were no significant correlations between RJA and language or social-communicative skills in the SIBS-TD group, RJA was significantly correlated with receptive and expressive language, as well as with social-communicative behaviors, in the SIBS-ASD group. These results add to the evidence supporting the relation of RJA with language and social-communicative development by extending the findings to include younger siblings of children with ASD.

There is evidence that RJA may be amenable to change over a relatively short period of time through structured experiences. Corkum and Moore (1998a) found that typically developing infants showed improvement in RJA in the context of an assessment that provided external reinforcement for following the examiner’s line of visual regard. Similar findings were obtained by Leekam et al. (1998) for school-aged children with autism. In addition, studies with young children with autism have found generalized improvements in RJA after participating in focused treatments (Kasari, Freeman, & Paparella, 2006; Whalen & Schreibman, 2003).

Future research on high-risk siblings should include longitudinal analyses of RJA, as well as other social-communicative behaviors. In the current study, children under the age of two had substantial room for improvement for even the easiest cue combinations. We would expect that as children develop, they become better able to follow even the most difficult cues. We do not know whether siblings at risk for ASD continue to lag behind typically developing children in RJA, especially when faced with less redundant attentional cues such as silent shifts in gaze. In addition, the inclusion of clinical control groups (e.g., siblings of children with learning disabilities or mental retardation) will inform research as to whether impaired RJA is specific to children at risk for ASD. Conclusions from the present study are limited to at-risk siblings who may or may not receive a diagnosis of ASD; prospective longitudinal studies of at-risk siblings (e.g., Landa & Garrett-Mayer, 2006; Zwaigenbaum et al., 2005) will not only provide critical information about the early development of ASD, but will also allow for comparisons between high-risk unaffected siblings and high-risk siblings later diagnosed with ASD.

For at-risk younger siblings, RJA deficits may have implications for social, cognitive, and communicative development, regardless of the siblings’ final clinical diagnostic status. Early-emerging deficits in RJA may lead to impoverished social input early in life (Mundy & Burnette, 2005). We do not yet know how early attenuation of social information may affect the course of development; however, it could explain the weaker performance on language measures that has been observed in younger siblings of children with ASD. The ability to accurately follow a social partner’s attention provides children with unique opportunities to learn about the environment and others. For children at risk for ASD, failure to follow others’ attention may result in reduced learning opportunities.