1 Introduction

An 8-month-old is sitting on the floor playing with toys. He looks intently at a shiny red car and vocalizes, and his mother says, “there’s the car!” An 11-month-old is sitting in her highchair eating a snack while her father watches. Looking at him, she reaches for her cup and holds it up for him to see. When he responds, “That’s your cup,” she resumes eating. A 16-month-old visiting the zoo spots a lion, points excitedly and vocalizes. His father says, “Do you see the lion over there?” A 20-month-old, playing in the clean laundry, picks up her father’s t-shirt and holds it up for her mother to see while saying “Daddy.” Her mother says, “Yes, honey, that’s daddy’s old black t-shirt.”

These examples are illustrative of two crowning developmental achievements of the first 2 years of life: the emergence of intentional (i.e., directed at a communicative partner) and symbolic (e.g., using gesture, sign, or word to stand for a specific referent) communication. Both have been widely discussed in the literature because they represent major advances in social communicative and cognitive development (e.g., Bates, 1976; Bates, Benigni, Bretherton, Camaioni, & Volterra, 1979; Bloom, 1993). However, as illustrated by the parent responses in these examples, the emergence of intentional and symbolic communication is also remarkable because it impacts the communicative environment in which very young children are immersed and the individuals with whom they interact.

While typically developing (TD) infants produce the sorts of communicative acts described above frequently and seemingly without effort, many individuals with autism spectrum disorder (ASD) struggle to communicate with others. For some, intentional and symbolic communication eventually emerges on a delayed timetable. For others, both types of communication may be relatively limited or very infrequent. Delays and atypicalities in the development of intentional and symbolic communication are a hallmark of ASD.

Discussions of developmental delay typically take the perspective that delay is a characteristic of the individual; a great deal of research effort has been devoted to identifying earlier-appearing individual factors that predict subsequent delay, along with relations between delayed development and the emergence of more sophisticated behaviors at later time-points. While these are worthwhile endeavors, they result in a quite limited picture of the way in which delay emerges over time and impacts subsequent development. Our central thesis is that delays and atypicalities in the development and use of intentional and symbolic communication have far-reaching, cascading effects that extend beyond the individual to impact the behavior of social partners and the communicative environment more broadly. Over time, these effects may fundamentally alter both the nature of the input that the communicator receives and the availability of opportunities for learning that may support future advances.

Our defense of this proposal will proceed in the following way. We begin by reviewing research on TD infants and toddlers indicating that the emergence of new forms of intentional communication impacts caregiver responding, and that these alterations occur in ways that support the development of more advanced communication skills. Following a brief discussion of the impact of the emergence of symbolic communication on the communicative environment, we provide a general overview of the delays and challenges in communicative development that are generally characteristic of individuals with ASD. We then use this overview as a starting point for discussing a conceptual framework that identifies ways in which delays in a set of basic, early emerging communicative behaviors – eye contact, gesture, and vocalization – can impact the social and communicative environment and thus the development of intentional and symbolic communication. Finally, we conclude with some recommendations for research and clinical practice suggested by this framework.

Before proceeding with this discussion, however, we would like to note that although many of the examples to be discussed in what follows will come from childhood, we recognize that individuals with ASD of all ages face challenges in using intentional and symbolic communication. Although a substantial portion of the content presented here is taken from research on children, and the conceptual framework that we present is grounded in early development, we believe that the principles of cascading developmental effects are relevant for individuals across a wide range of ages.

2 Current Research on the Topic

2.1 Intentional and Symbolic Communication in Typical Development

2.1.1 How Does the Transition to Intentional Communication Impact the Communicative Environment?

Communication is said to be intentional when there is clear behavioral evidence that the message being conveyed is directed toward a communicative partner. In preverbal individuals, the behavioral evidence is typically of two types. The first type involves the pairing of a communicative behavior (e.g., a gesture, a vocalization) with eye contact with the partner (or alternating gaze between the referent of the communicative act and the partner). The second type involves the communicator’s behavior following the communicative act. Intentional communication is typically followed by a pause, during which the communicator waits for a response or acknowledgement from the social partner. If the partner fails to respond, the signal may be repeated, this time supplemented with additional behavioral cues (e.g., vocalization) to ensure that it is recognized as a communicative signal (Iverson & Thal, 1998).

It is important to note here that although evidence of intentionality need not necessarily come from the presence of eye contact with a communicative behavior, eye contact has become the sine qua non of intentional communication, such that it is often required in order for communicative behaviors to be considered acts of communication. However, this criterion may underestimate the communicative abilities of preverbal individuals with ASD, for whom eye contact occurs significantly less frequently and may be more effortful than for neurotypical peers (Akhtar & Gernsbacher, 2008). This is an issue to which we will return below.

Vocalizations

From the first moments of life, infants vocalize. They cry; they also produce a wide variety of non-cry sounds that are considered to be precursors to the sounds of spoken language (e.g., Oller, 2000). Although these early pre-speech vocalizations are not intentionally communicative according to the above criteria, caregivers and adults respond to them as though they are (e.g., Snow, 1977). It is perhaps for this reason that TD infants appear to have expectations about the social value of their vocalizations from a relatively young age.

One demonstration of this expectation comes from research using the face-to-face-still-face (FFSF) paradigm (e.g., Tronick, Als, Adamson, Wise, & Brazelton, 1978). In this classic methodology, infants and caregivers are seated facing one another and caregivers are instructed to interact as they typically would, usually for a period of 2 min. Next, caregivers are asked to stop responding to the infant and to assume an expressionless face. This manipulation disrupts the reciprocity of the interaction, and numerous studies have examined changes in infants’ social behaviors (e.g., smiling, eye contact) over the course of the still-face period, reporting that initially, infants increase efforts to re-engage the caregiver, and then gradually begin to spend more time looking away and fussing. Results such as these have been interpreted as indicating that infants have expectations about the inherently reciprocal nature of social interactions (e.g., Adamson & Frick, 2003; Moore, Cohn, & Campbell, 2001; Striano, 2004; Tarabulsy et al., 2003).

In a recent study, Goldstein, Schwade, and Bornstein (2009) examined 5-month-old infants’ rate of production of non-cry vocalizations in the FFSF paradigm. While vocalizations provide an opportunity for infants to receive a response from a caregiver, the contingency between infant vocalization and caregiver responses is imperfect (i.e., not every vocalization that infants produce receives a response). They thus hypothesized that if infants have learned about the contingency between their own vocal behavior and caregiver responses and appreciate the value of their vocalizations as social signals, they should exhibit an extinction burst (a hallmark of learning from imperfect contingencies) at the beginning of the still-face period, with rate of vocalization initially increasing relative to the prior face-to-face interaction phase and then declining over time. Data were consistent with this prediction: overall, vocalizations peaked after 75 s and then declined across the rest of the still-face episode, and this pattern was evident in the production of 37 of 38 infants in the sample. Thus, by 5 months of age, infants appear to have learned that their vocalizations elicit reactions from others and have social value.

At around 8 months, TD infants begin to integrate eye gaze with vocalizations (e.g., Bates et al., 1979; Golinkoff, 1986), which some authors have termed directed vocalizations. One type of directed vocalization involves the infant vocalizing while looking at an object that is either held or within reach. These object-directed vocalizations (ODVs) appear to provide valuable opportunities for interactions that advance word learning. The best evidence for this relationship comes from experimental work conducted by Goldstein and colleagues (Goldstein, Schwade, Briesch, & Syal, 2010). In a pair of experiments, they recorded vocalizations produced by infants as they explored novel objects. Results indicated that: (a) 12-month-old infants’ learning of the visual features varied in relation to ODV production, with features being learned for objects that elicited the most ODVs but not for those that elicited the fewest ODVs; and (b) 11.5-month-old infants successfully learned object-word associations when the label was paired contingently with an ODV. Learning did not occur when the label was paired with a look alone. In a subsequent study, Goldstein and Schwade (2010) demonstrated that adult responsiveness to the ODVs of 9-month-old infants predicted vocabulary size at 15 months. Overall, these findings suggest that ODVs may be indicative that an infant’s attention is focused on a particular object and serve as a salient index of interest to an adult, who is likely to respond with timely input about the object (i.e., its label). This type of input may contribute to infants’ growing awareness of sound-object links.

A second type of directed vocalization involves the coupling of a vocalization with looking at the caregiver. There is surprisingly little research on caregiver-directed vocalizations, but the existing findings suggest that for caregivers, eye gaze is a powerful cue for interpreting infants’ intentions, and that this information shapes their responses to these vocalizations (e.g., Golinkoff, 1986). Consistent with this view, Gros-Louis, West, and King (2014) studied caregiver-directed vocalizations longitudinally in a sample of 12 mother-infant dyads observed every 2 weeks from 8 to 14 months. Although ODVs occurred more frequently than mother-directed vocalizations, they found that mothers were more likely to respond to mother-directed vocalizations (range .55-.68 across sessions) than to ODVs (range .38-.52 across sessions). This simple difference in relative frequency of responding may be sufficient to provide infants with valuable information about the impact of their vocalizations on caregiver behavior. This possibility is supported by the finding that the likelihood of providing a contingent response focusing on an object currently in the infant’s visual line of regard predicted growth in infants’ mother-directed vocalizations in subsequent months.

Gros-Louis et al. (2014) also asked whether mother-directed vocalizations were related to developmental change in infant vocal complexity and to word production at 15 months. Interestingly, while mother-directed vocalizations were not related to word production at 15 months, maternal responses to mother-directed vocalizations were positively and significantly associated with an increase in infant production of vocalizations containing consonant-vowel (CV) clusters. Thus, infants who received proportionately more responses to their mother-directed vocalizations exhibited a larger increase in production of CV vocalizations from 8 to 14 months. This is important because CV vocalizations are considered to be more developmentally advanced and “speech-like” than those containing only vowel sounds, and prior research has indicated that caregivers respond differentially to CV vocalizations, providing more imitations and expansions than they do to vowel-only vocalizations (Gros-Louis, West, Goldstein, & King, 2006).

In sum, the research reviewed above indicates that there is a dynamic developmental cascade unfolding over time in the interplay between infant vocalization and caregiver response and suggests the operation of powerful social learning mechanisms. By the end of the first 6 months of life, infants appear to appreciate that their vocalizations have social value, presumably because active, attentive caregivers frequently attribute intentionality to those vocalizations. Once infants begin to combine vocalizations with eye gaze toward an object or a caregiver, attentional focus can provide caregivers with additional information regarding the potential function and meaning underlying the vocalizations, information that may guide the responses caregivers provide. Differences in both the frequency and nature of responses to ODVs and caregiver-directed vocalizations may then influence patterns of developmental change in the two types of vocalizations, and changes of this sort are highly likely to influence subsequent patterns of caregiver responding.

Gestures

As noted previously (see Chap. 2), first gestures generally appear in TD infants between the ages of 8–14 months (see also Bates, 1976; Bates et al., 1979). The emergence of gestures marks a key transition in the development of intentional communication because gestures provide a more explicit means for establishing reference. Gestures such as giving, showing, requesting, and pointing (collectively termed deictic gestures) are the first to emerge, with pointing generally the last to appear (Bates et al., 1979). Collectively, these gestures serve to indicate the object of an infant’s interest and to draw another’s attention to it.

While the appearance of deictic gestures represents a significant advance in communicative development, these gestures enjoy a long developmental history prior to their emergence as communicative signals. Thus, for example, requesting initially occurs as a response to adult behavior (e.g., reaching for a toy that is being extended by the adult), but gradually it becomes less tightly linked to the specific contexts and action patterns in which it occurs. An early form of the reaching gesture might consist of an exaggerated reaching movement toward an inaccessible object accompanied by fussing or intense vocalization. Over time, infants begin to produce a more abbreviated reach toward the desired object while looking at the caregiver (e.g., Bruner, 1977). Reaching therefore changes in both form and function, progressing from being a signal of difficulty in obtaining an object to one that indicates a particular interest in that object.

Similarly, components of the pointing gesture are observed in the spontaneous behavior of very young infants. Two-month-olds extend their index fingers reliably during social interaction, although the movement is not object-directed, nor is it paired with arm extension or eye gaze (Fogel & Hannan, 1985). Six-month-olds will spontaneously point toward an object that attracts their attention in a social context (without extending the arm or looking at the caregiver); older infants will point at an object while inspecting it closely (e.g., see the lovely series of detailed observations of pointing-for-self reported in Bates, 1976). It is not until around the first birthday that pointing shifts from a self-directing attentional device that appears to help infants highlight their current focus of attention for themselves to a social gesture used to direct the attention of others to an object of interest. Evidence of this shift comes from the coordination of pointing with eye contact: infants will point to an object while looking back at an adult, as though to check that their social partner has located the referent of the gesture and is now attending to it (e.g., Bates, 1976; Masur, 1990).

Not only does the emergence of gestures impact infants as communicators; it also affects the language-learning environment. Deictic gestures provide caregivers with clear, salient, and relatively precise cues as to the child’s current focus of interest to which they can provide a well-tailored response. Such responses can in turn provide rich opportunities for word learning because the child is already focused on the object while the caregiver is speaking, conditions that are known to be prime for acquiring a new word (e.g., Tomasello & Farrar, 1986).

One way in which adults can tailor their responses to infants’ gestures is by translating the referent of the gesture (Golinkoff, 1986; Masur, 1982). For example, when an infant points to a dog, a caregiver might translate the referent of the pointing gesture by saying, “Yes, do you see the dog? I see it too.” In a longitudinal study of ten children, Goldin-Meadow, Goodrich, Sauer, and Iverson (2007) identified all referents that infants referred to only in gesture and never in speech (e.g., infant points to a ball but never says the word “ball”) and classified them according to whether mothers translated (e.g., “let’s go get your ball!”) or never translated the gestures into speech. To determine whether these translation responses affected word learning, they then examined the likelihood that the verbal equivalents of the gestures in these two categories entered children’s word vocabularies. Data indicated that verbal equivalents of child gestures were significantly more likely to enter children’s word vocabularies when mothers provided translations of the gesture than when they did not. Gestures thus appear to provide valuable signals to adults about a child’s current state of interest, and this information allows calibration of adult input to the young language learner in ways that appear to support word learning.

2.1.2 How Does the Transition to Symbolic Communication Impact the Communicative Environment?

Communication is said to be symbolic when it involves the use of a particular form (e.g., gesture, sign, word) to refer to a specific referent. The relation between form and referent can vary along a continuum of complexity, ranging from relatively transparent (e.g., holding the hand to the ear as though talking on the telephone) to highly abstract (e.g., the relation between most words and their referents). In addition, the form-referent relation remains constant despite variation in the characteristics of the referent and across changing contexts (e.g., the word “cat” refers to all cats regardless of their size or color and whether they are in the kitchen, sleeping, or lying on the windowsill).

Most TD infants demonstrate a newly emerging symbolic ability at around the age of 12 months, when they begin to say their first words (Bates et al., 1979). However, these early words do not have fully symbolic status because they are usually only produced in highly specific contexts. For instance, a child might say the word “byebye,” but only when his older sibling leaves for school in the morning. These early word-like productions co-exist with non-word vocalizations and gestures. Over time, however, words become decontextualized and used in a more flexible manner to refer to a variety of different exemplars of the referent and in multiple contexts (e.g., Werner & Kaplan, 1963).

Despite the importance of first words as an index of cognitive advance and for the impact that they have on proud parents, to our knowledge there is no existing research that has examined the impact of first words on the communicative environment. This may be due at least in part to the methodological difficulties inherent in reliably identifying first words and distinguishing them from other non-word vocalizations (e.g., see Vihman & McCune, 1994) and to the fact that, at least initially, they occur relatively infrequently.

Indirect evidence that the transition to symbolic communication influences the communicative and linguistic environment comes from studies examining the ways in which very young children combine single words with gestures. Gesture-word combinations are widely observed among one-word speakers (e.g., Capirci, Iverson, Pizzuto, & Volterra, 1996; Iverson & Goldin-Meadow, 2005; Özçalışkan & Goldin-Meadow, 2005). When children verbally label an object to which they are simultaneously gesturing (e.g., pointing at a car while saying “car”), they reinforce the meaning conveyed by their gesture. Relative to gestures produced alone or with a non-word vocalization, the addition of a word to a gesture may provide caregivers with an even clearer and more salient cue as to the child’s current focus of attention; this may in turn enhance the richness of the linguistic response.

Children also combine words and gestures that convey distinct but related meaning about the referent (e.g., pointing at the car while saying “byebye”). These supplementary combinations appear in children’s production just prior to the transition to two-word speech and reliably predict onset of two-word combinations (e.g., Iverson & Goldin-Meadow, 2005). From the caregiver’s perspective, however, supplementary combinations convey more information (car and byebye) than do reinforcing combinations (car), and they may therefore provide adults with opportunities for producing more complex responses that may be especially beneficial for learning. Work by Goldin-Meadow et al. (2007) supports this possibility. They compared mean length of utterance for sentences mothers produced in response to supplementary versus reinforcing conditions and found that sentences produced in response to supplementary combinations were significantly longer than those produced in response to reinforcing combinations. In addition, mothers’ sentences were longest when they incorporated information from the child’s word and gesture. In sum, these results suggest that the incorporation of a symbol (a word) into an act of intentional communication (a gesture), particularly one that adds meaning to that conveyed by the gesture, impacts the communicative environment in ways that further enrich the quality and complexity of caregiver response.

2.2 Intentional and Symbolic Communication in ASD

Unfortunately, there is very little research in the ASD literature directly addressing the impact of the child's changing communicative abilities on the communicative environment. There is, however, evidence in individuals with ASD for the existence of developmental delays and atypicalities in the behaviors (vocalizations, gestures) and behavioral coordinations (e.g., vocalization with gesture, gesture with eye gaze) that signal intentional communication. Given the likelihood, as discussed above, that these delays and atypicalities alter the nature of the communicative environment and, therefore, exert an impact on the emergence of symbolic behavior, we will review the nature of the research findings on vocalization, gesture, and vocalization-gesture coordinations (gesture-eye gaze coordinations, which are presumed to index states of joint attention, are discussed elsewhere in this book). This will provide the basis for a schematic process account of the way in which these early delays and atypicalities can exert an impact on the communicative environment and through that impact lead in turn to a cascading series of developmental effects.

Vocalizations

The few studies that exist on vocalization in ASD fall, roughly speaking, into three categories. The first consists of studies focusing on the frequency of vocal production (i.e., volubility); the second on atypicalities in vocal quality; and the third on the frequency of communicative coordinations involving vocalization. Results from studies of all three types provide evidence for delays and atypicalities in vocalization of individuals with ASD. With regard to the first, for example, Patten et al. (2014) retrospectively examined vocalization during home videos taken at 9–12 and 15–17 months in 23 children later diagnosed with ASD. In comparison to 14 infants with no such diagnosis, vocalization rates of the infants with ASD were significantly reduced. In addition, vocal quality, specifically low rates of canonical babbling (which is usually well in place in typical development by 10 months), was atypical in the infants with ASD.

The finding of reduced frequency of canonical babbling is consistent with other research showing that older children with ASD exhibit deficits in the production of well-formed syllables and frequent production of unusual sounds. Thus, for example, two studies of preverbal children with autism have reported excessive production of atypical vocalizations (e.g., trills, clicks, growls; Wetherby, Cain, Yonclas, & Walker, 1988) and vocalization with atypical phonation (e.g., falsetto, breathy voice; Sheinkopf, Mundy, Oller, & Steffens, 2000), accompanied by significantly lower rates of occurrence of well-formed syllables and marginally higher proportions of syllables with overlong vowels. Similar difficulties with syllable production have been noted in a case study from birth to 2 years of an infant later diagnosed with autism. Dawson, Osterling, Meltzoff, and Kuhl (2000) reported that at 9 months, the infant’s vocal responses were “…primarily limited to guttural sounds with few, if any, recognizable consonant or labial sounds…” (p. 302). Although these data are taken from a single infant, the relative absence of these sounds is clearly deviant from patterns reported for typically-developing infants in this age range, for whom labial sounds (e.g., [b], [m]) tend to be among the most frequently produced (e.g., Davis & MacNeilage, 1995).

With regard to the frequency of communicative coordinations involving vocalization, data come primarily from three studies of infants who are at heightened biological risk for ASD (Heightened Risk; HR; because they have an older sibling with an autism diagnosis) and who also eventually receive an ASD diagnosis themselves (HR/ASD). Ozonoff et al. (2010) examined the co-occurrence of vocalization with eye gaze to the experimenter's face during longitudinal administration of the Mullen Scales of Early Learning (MSEL) (Mullen, 1995) when children were 6, 12, 18, 24, and 36 months. Results indicated that the HR/ASD infants coordinated vocalization with eye gaze at levels comparable to a comparison group of children with no known ASD risk (Low Risk; LR; and no follow-up ASD diagnosis) only at the earliest age. From 12 months on, frequency of vocalization-gaze coordinations was lower in the ASD group than for TD comparison infants and while this frequency increased significantly over time for the TD infants, it decreased sharply for those in the ASD group.

In a second study of HR infants, Winder, Wozniak, Parladé, and Iverson (2013) coded the spontaneous production of vocalization coordinated with either eye contact or a gesture as these were produced by 15 HR and 15 LR infants at both 13 and 18 months during in-home naturalistic interaction. Although these data should be interpreted with caution since only three children in their HR sample received an eventual ASD diagnosis, at both 13 and 18 months, these three children coordinated non-word vocalizations with eye gaze and gesture at far lower rates than did either LR infants or those HR children who did not eventually receive an ASD diagnosis. Finally, Parladé and Iverson (2015) compared communicative coordinations in nine HR infants later diagnosed with ASD, to those of 13 HR infants with language delay, 28 HR infants with no diagnosis, and 30 LR infants. Hierarchical linear modeling analyses indicated that HR/ASD infants exhibited significantly slower growth in coordinations overall and in gestures coordinated with vocalizations than children in the other groups, even relative to HR infants with eventual language delay.

In summary, although there is only a small body of research on vocalization in ASD, findings have been generally consistent. Whether researchers have examined frequencies of vocal production, atypicalities in vocal quality, or frequencies of communicative coordinations involving vocalization, they have generally reported delays and/or atypicalities in the vocal behavior of individuals with ASD.

Gesture

Since publication of the DSM-III-R (American Psychiatric Association, 1987), impaired gesture (failure to gesture, abnormal gesture use in initiating or modulating social interaction, deficits in understanding and use of gestures) has been among the central diagnostic criteria for ASD. In addition, items assessing gesture atypicalities figure prominently in major diagnostic and screening instruments such as the ADOS-G (Lord et al., 2000), ADI-R (Lord, Rutter, & Le Couteur, 1994), and M-CHAT (Robins, Fein, Barton, & Green, 2001). It is surprising, therefore, that research to date on gesture production in individuals with ASD has been somewhat limited. Several factors may account for this. First, many studies have focused solely on differences between ASD and other clinical groups in the frequency of gesture production. Second, ASD gesture research has often been situated within the context of interest in joint attentional impairments in autism and has, therefore, been heavily and sometimes solely focused on pointing. Third, studies have varied widely in the ages and severity levels of participants, in methods of data collection (e.g., retrospective video analysis, online interaction coding), and in coding schemes and terminology.

Nonetheless, the preponderance of the evidence suggests that across a wide variety of ages, individuals with autism produce fewer gestures overall than various typical and clinical comparison groups (e.g., Pedersen & Schelde, 1997; Töret & Acarlar, 2011; Winder et al., 2013; but see also Attwood, Frith, & Hermelin, 1988; and Capps, Kehres, & Sigman, 1998 for failure to find overall frequency differences) and their gesture repertoires are less varied than those of their peers (Colgan et al., 2006; Winder et al., 2013). Individuals with autism are relatively more likely to produce gestures to regulate the behavior of others (e.g., “reaching” to have someone provide a desired object) than for purposes of social interaction (e.g., waving “hi,” or “bye bye,” shaking head “yes” or “no”) or joint attention (e.g., pointing while making eye contact with the interlocutor to share interest in an object or event, Carpenter, Pennington, & Rogers, 2002; Töret & Acarlar, 2011). Indeed, pointing to establish joint attention is often found to be virtually or completely absent (e.g., Camaioni, Perucchini, Muratori, Parrini, & Cesari, 2003; Curcio, 1978; Pedersen & Schelde, 1997; Wetherby & Prutting, 1984), somewhat rare even for requesting (Töret & Acarlar, 2011), or atypical in form (e.g., “taking aim with one eye closed”; Hobson, García-Pérez, & Lee, 2010). Furthermore, at varying ages, gestures subserving all three functions but especially joint attention have been found to be less common in children with ASD than comparison peers (Landry & Loveland, 1989; Watson, Crais, Baranek, Dykstra, & Wilson, 2013). Evidence for joint attention deficits is discussed in detail elsewhere in this book.

In summary, research on gesture in ASD has, like research on vocalization, been somewhat limited. In addition, results in this area have not always been consistent. Nonetheless, the weight of the evidence suggests that in comparison to TD peers, individuals with ASD produce fewer, less varied gestures overall and are more likely to employ these gestures for purposes of behavior regulation than for social interaction or to establish joint attention.

3 Challenges

Thus far, we have seen that advances in the development of intentional and symbolic communication engender changes in the learning environment that appear to support further advances in these skills. We have also seen that delays and atypicalities in the development of intentional and symbolic communication are characteristic of individuals with ASD. Although, as indicated earlier, there is little research directly addressing the impact of delays and atypicalities in children’s communicative behavior on the learning environment, it seems likely that such effects exist, that they may occur in ways that do not support further development, that they may be magnified over time, and that they may impact development in domains removed from communicative behavior. In other words, early-appearing disruptions in the emergence of intentional and symbolic communication may have far-reaching, cascading effects on development. A schematic illustrating such a developmental cascade is depicted in Fig. 4.1.

Fig. 4.1 Cascading developmental effects of early communicative delays on the learning environment

The fact that from early in development, individuals with ASD demonstrate clear disruptions in the emergence of three primary communicative behaviors – eye gaze, gesture, and vocalization – is depicted on the left side of Fig. 4.1. Because joint attention as it is currently conceptualized involves the coordination of eye gaze with either a gesture or a vocalization, and because disruptions in any of the component behaviors will obviously impact the likelihood with which they will be coordinated with one another (e.g., Iverson & Thelen, 1999; Parladé & Iverson, 2011), joint attention behaviors will be impaired as well. Infrequent initiation of joint attention will in turn have significant implications not only for opportunities that social partners have for responding, but also for their perceptions of the communicator. These factors are illustrated on the right side of Fig. 4.1.

Thus, communicators who initiate interactions and shared moments of attention less frequently than same-aged peers are likely to be perceived as delayed by caregivers and social partners. This perception can influence the social partner’s expectations of and behavior toward the communicator. One way in which this effect may be manifested is in a reduction in the range of potential shared topics for communication. For example, in dyads with a TD child, control of conversational topics appears to shift as children become more sophisticated communicators. When children are very young and relatively less skilled, adults initiate most topics of conversation. Over time, as their language abilities become more sophisticated, children begin to initiate topics more frequently, and these child-initiated topics are then continued in adults’ speech (e.g., Hoff-Ginsberg, 1987). However, some research indicates that in dyads with a child with early language difficulties (i.e., Developmental Language Delay, or late talkers), proportions of topic initiations by caregivers are significantly higher than those for caregivers of TD children and do not show a comparable developmental shift (van Balkom, Verhoeven, & van Weerdenburg, 2010).

Communicative interactions are by definition bidirectional, and successful communication requires reciprocity between participants. When reciprocity is compromised because one participant initiates communication and shared attention only infrequently, the burden of maintaining the interaction falls on the other participant (e.g., see Rescorla, Bascome, Lampard, & Feeny, 2001, for an example from caregivers of late talkers). The consequence of this is a reduction in shared topics for communication; with one partner constantly taking the lead and receiving relatively few communicative initiations from the other participant, topic choice is primarily left to the leader of the interaction, and topics may therefore not be shared.

Reductions in initiation of joint attention and shared communication topics likely impact the nature of the input received by individuals with ASD as well as opportunities for learning more broadly. This could happen in at least two ways. First, fewer initiated communicative acts on the part of the communicator give social partners fewer opportunities to provide responses, and responses are important because the meaning conveyed is often related to that expressed by the communicator. Work reviewed above and that of others provides strong evidence that caregiver responses (particularly contingent responses) scaffold prelinguistic skills (e.g., growth in caregiver-directed vocalizations; Gros-Louis et al., 2014) and relate to later advances in language (e.g., vocabulary growth; Tamis-LeMonda, Bornstein, & Baumwell, 2001). Reductions in opportunities to respond could therefore negatively impact the development of these skills.

Second, a hallmark of caregiver responses to joint attention episodes initiated by the communicator is that they typically provide input that is well tailored to the communicator’s current focus of attention (e.g., Goldin-Meadow et al., 2007). Moments such as these are “magic moments” for language learning: as the communicator’s attention is focused on an object of interest, the caregiver labels the object. Work with TD infants indicates that they are more successful at learning new words under these conditions than when a label is provided for an object to which they are not currently attending (Tomasello & Farrar, 1986). Although this effect has not been directly assessed in children with ASD, Siller and Sigman (2008) have provided indirect evidence to suggest that a similar mechanism may be operating. In a longitudinal study designed to examine predictors of language growth in children with ASD, these researchers found parent communication responsive to the child’s attention and ongoing activities (i.e., synchrony) during early play sessions to be positively related to the child’s rate of language growth.

Thus, vocalizations and gestures accompanied by eye gaze (i.e., intentional communication) create opportunities for caregivers to respond, and to respond in ways that are beneficial for learning. Consider now the case of an individual (child or adult) who does not produce communicative bids of this sort, or who does so relatively infrequently. Opportunities for caregiver responses would be much less frequent overall, and over time, this could significantly limit access to input that is linked in time and content with the referent. For the communicator who is already disadvantaged due to delays and vulnerability in communication and language development, this type of alteration in communicative input – which reflects environmental and caregiver adaptation to the communicator’s skill set and perceived developmental level – may not be optimal for advancing development.

In a recent study of caregiver responses to infant gestures, Leezenbaum, Campbell, Butler, and Iverson (2014) demonstrated just such a cascading effect. They studied two groups of infants who were observed in free play at home with a primary caregiver at ages 13 and 18 months. The first group included infants who had an older sibling with ASD (HR infants) but who did not themselves receive an ASD diagnosis at 36 months. HR infants were the focus of the study because of the extensive variability observed in communicative and language development among HR infants as a group, with many exhibiting significant delays in both of these domains (e.g., Jones, Gliga, Bedford, Charman, & Johnson, 2014). The second was a group of infants who had a typically developing older sibling (LR infants). Overall, HR infants were delayed relative to their LR peers in the production of showing and pointing gestures, producing significantly fewer of these gestures even by 18 months. Examination of caregiver responses to infant gestures revealed that mothers of HR and LR infants were equally responsive to their infants’ gestures, and that they were more likely to translate the referent of the infant’s gesture when the gesture was a show or a point, rather than a request or give. Thus, because HR infants produced significantly fewer show and point gestures that were most likely to elicit a translation response, they received fewer translations, which are precisely the type of response that is effective for promoting word learning.

Returning now to the schematic presented in Fig. 4.1, it is important to consider the implications of the notion of cascading developmental effects on how we conceptualize communicative and language delay. This will in turn affect our agendas for research and practice (see below). There is a great deal of research aimed at identifying early predictors of communication and language disorder, and while this is an important endeavor, it has set the stage for models of the emergence of delay that are entirely focused on the communicator (e.g., delayed joint attention is a characteristic of the individual and, therefore, so are language difficulties). While it is certainly of value to know that delays in joint attention are a reliable predictor of delayed and/or disordered language development, the communicator-centered model ignores the dynamic interplay between the communicator, the communicator’s current social and communicative/linguistic abilities, and the environment and individuals who interact with the communicator. It also does not account for the potential cascading effects of delays in early-appearing skills on the subsequent emergence and development of more complex abilities both within and beyond the communicative and linguistic domains (see Iverson, 2010, for additional discussion and examples).

4 Implications for Research and Practice

The illustration in Fig. 4.1 highlights the dynamic nature of the relationship between the communicator and the social environment and underscores the fact that communicative behavior is a joint product of an individual’s available skills and what the environment provides at a particular moment in time. This conceptual framework has several implications for assessment and treatment. Two brief examples must suffice here.

With regard to assessment of individuals with communication and language challenges, it is of paramount importance to create a supportive context within which to elicit communication. If the environment does not provide presses for communication that are interesting and salient to the communicator, the likelihood of occurrence of a communicative behavior in response to the press will be quite low. Currently, there are several widely used observational measures of nonverbal social communication that have been developed for toddlers and young children (e.g., Early Social Communication Scales, Mundy et al., 2003; Communication and Symbolic Behavior Scales, Wetherby & Prizant, 2002) and involve the use of items such as bubbles and windup toys that appeal to this age group. However, normed observational tools that permit a detailed, systematic assessment of communication skills that are developmentally appropriate for older individuals are virtually nonexistent. One exception to date is the Communication Complexity Scale (Brady et al., 2012), which permits substantial flexibility in the choice of objects/events that can be used as opportunities for communication. This flexibility enhances the likelihood of providing a supportive communicative environment, and therefore of obtaining a representative sample of the communicative repertoire and the ways in which it is utilized by the communicator.

With regard to treatment, we began this chapter with a review of research on TD infants indicating that although caregivers initially respond to virtually any signal produced by their infant (even burps and sneezes) as though it is intentional, over time and with the emergence of increasingly sophisticated infant behaviors, adults gradually become more selective in the types of behaviors to which they respond and in the types of responses that they provide to these behaviors. The implication of this growing selectivity is that over time, communicative forms that are earlier emerging and less advanced may begin to receive progressively fewer responses, particularly those of the sort that can be beneficial for development.

For individuals who are delayed in the emergence of intentional and/or symbolic communication and for whom the window for use of earlier-emerging communicative forms (e.g., eye contact alone, vocalization alone) may be temporally extended, such changes in caregiver responding could create a further disadvantage for an already vulnerable communicative system. Recall, for example, Leezenbaum et al.’s (2014) findings that mothers were significantly more likely to translate their children’s show and point gestures than they were to translate give and request gestures and that even at 18 months, HR children produced four times as many gives and requests as they did shows and points. The implication of these findings is that although HR children were communicating intentionally, because they were doing so in a way that was less developmentally advanced, they were much less likely to receive translation responses. From a treatment perspective, it may be worth encouraging the caregivers of individuals with communication delays and challenges to broaden their patterns of responding so that they respond consistently and contingently to communicators’ gestures and non-word vocalizations, regardless of their developmental level or social salience.

The framework illustrated in Fig. 4.1 also has at least two major implications for research on intentional and symbolic communication in ASD. In particular, it suggests a need for modifications to our current definition of intentional communication and to the paradigms and measures we use for studying developmental transitions and the emergence of new skills. With regard to the first of these, as noted earlier, eye contact is generally considered to be the sine qua non of intentional communication. In much of the existing literature, children are not credited with producing an act of intentional communication unless they combine a communicative behavior (gesture or vocalization) with eye gaze directed to the social partner. It is widely assumed that TD children spend a great deal of time looking at the social partner while communicating in social interactions. However, recent research has called this assumption into question. Using head-mounted eyetracking in a naturalistic parent-child play session, Yu and Smith (2013) reported that 12-month-old infants rarely looked at their parent’s face (only about 11% of the time), and that hand actions were actually more effective in eliciting a partner’s looking than was direct gaze following. This finding strongly suggests that while gaze to the social partner may be sufficient for establishing intentional communication, it may not be necessary (see Akhtar & Gernsbacher, 2008, for additional discussion).

Along these lines, Gernsbacher and colleagues (2008) have reviewed evidence indicating that when individuals with ASD are not required to perform an overt response such as turning the head to make eye contact, but can instead attend covertly (i.e., use peripheral vision, or “look out of the corner of their eye”), they readily attend to social stimuli, performing as well as children who do not have ASD on tasks that require, for example, following the direction of another’s gaze. Gernsbacher and colleagues propose an intriguing hypothesis, namely that individuals with ASD may utilize other behaviors (e.g., peripheral eye gaze) to initiate intentional communication, albeit in atypical and unconventional ways. To date, however, this hypothesis remains unexamined. It is worth noting that the idea that a broad variety of behavioral forms could be utilized for purposes of intentional communication is not new. Indeed, research on very young congenitally blind children has documented a wide range of ways in which behaviors other than eye contact are employed for intentional communicative purposes (e.g., Bigelow, 2003; Iverson, Tencer, Lany, & Goldin-Meadow, 2000). To our knowledge, this type of descriptive, observational approach has not been taken in ASD research. Work of this sort would take the field beyond the by now well-replicated findings of group differences in frequency and quality of intentional communication; it would permit the identification of cues that signal intentionality and provide us with new and valuable insights into how and under what circumstances intentional communication is achieved by individuals with ASD.

Finally, studying the emergence of new skills at developmental transitions and understanding their impact on the broader communicative and social environment requires a methodological approach that goes beyond assessments of the communicator’s behavior alone averaged across an observation period. Understanding how transitions to more sophisticated forms of communication impact the environment requires dense, longitudinal sampling of behavior prior to, at, and following the emergence of the new skills, ideally at frequent intervals. Observation schedules of this sort permit the precise identification of the first appearances of new skills and the detailed description of ways in which they change over time.

Understanding how developmental transitions impact the larger social and communicative environment also requires broadening our lens to include a focus on the social unit participating in the interaction (e.g., a dyad) and the inclusion of measures that permit rigorous examination of the communicative interplay between participants, rather than focusing exclusively on the behavior of the communicator and/or the responses of the interlocutor individually. For instance, Northrup and Iverson (2015) examined dyadic vocal interactions during a free play observation recorded when HR and LR infants were 9 months old and found that individual measures of vocal behavior (infant or caregiver) were not predictive of later language development. The only significant predictor of expressive language in the third year was a variable measuring the extent to which members of the dyad coordinated their response latencies (i.e., the intervals between the offset of one participant’s vocalization and the onset of the other participant’s subsequent vocalization). Children from dyads with larger differences in response latency tended to have lower expressive language scores in the third year of life. Thus, examining an individual’s ability to coordinate intentional or symbolic behavior with a social partner may provide information about the stability and flexibility of the skill that is not provided by simple frequency counts alone.
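To make the notion of a response latency concrete, the following minimal sketch computes, for a coded stream of vocalizations, the interval between the offset of one participant's vocalization and the onset of the other participant's next vocalization. It is an illustration only and is not the measure used by Northrup and Iverson (2015): the function names, the (onset, offset) input format in seconds, the decision to skip overlapping vocalizations, and the use of an absolute difference in mean latencies as a rough index of coordination are all assumptions made for this example.

```python
from statistics import mean


def response_latencies(infant, caregiver):
    """Return each partner's response latencies in seconds.

    `infant` and `caregiver` are lists of (onset, offset) tuples, a
    hypothetical coding format. A latency is recorded whenever the speaker
    changes between two consecutive vocalizations and the second one starts
    at or after the first one ends (overlaps are skipped in this sketch).
    """
    events = sorted(
        [(on, off, "infant") for on, off in infant]
        + [(on, off, "caregiver") for on, off in caregiver]
    )
    latencies = {"infant": [], "caregiver": []}
    for (_, prev_off, prev_who), (next_on, _, next_who) in zip(events, events[1:]):
        if next_who != prev_who and next_on >= prev_off:
            # The latency belongs to the partner who responds.
            latencies[next_who].append(next_on - prev_off)
    return latencies


def latency_difference(latencies):
    """Absolute difference between partners' mean latencies (assumed index).

    Returns None when either partner never responded.
    """
    if not latencies["infant"] or not latencies["caregiver"]:
        return None
    return abs(mean(latencies["infant"]) - mean(latencies["caregiver"]))


# Made-up example data: vocalization onsets and offsets in seconds.
infant_vocs = [(0.0, 1.0), (5.0, 6.0)]
caregiver_vocs = [(1.5, 3.0), (7.0, 8.0)]

lat = response_latencies(infant_vocs, caregiver_vocs)
print(lat)                      # per-partner latencies
print(latency_difference(lat))  # smaller values = more similar latencies
```

Under these assumptions, a smaller difference indicates that the two partners respond to one another with more similar timing, which is the kind of dyad-level information that frequency counts for either participant alone cannot provide.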

5 Conclusion

We began this chapter with the proposal that delays and atypicalities in the development and use of intentional and symbolic communication have far-reaching, cascading effects in development that extend beyond the individual to impact the behavior of social partners and the communicative environment more broadly. In typical development, the emergence of intentional and symbolic communication impacts caregiver responding in ways that support the development of more advanced skills. The conceptual framework that we have presented suggests that when these behaviors fail to emerge, emerge on a delayed timetable, or appear in atypical form, as in individuals with ASD, the environment may respond in ways that may negatively impact the development of communicative skills. Although future research is needed to characterize the nature of this environmental response and the ways in which it plays out developmentally, it is clear that improving our understanding of communicative delays of the sort observed in ASD and developing effective intervention methods requires an approach that goes beyond the individual to consider the constant, complex interplay between the developing communicator and the social communicative environment.