Introduction

Gesture serves as a forerunner of developmental change in language acquisition. Typically developing (TD) children use gesture to substitute for words (e.g., point at dog) before they can produce similar words (Bates 1976). After the onset of first words, children combine gesture with speech as a building block for sentences (e.g., “pat” + point at dog), before expressing such sentences exclusively in speech (e.g., “pat the dog”; Özçalışkan and Goldin-Meadow 2005a). Parents provide models for their children, producing similar types of gestures and gesture-speech combinations as their children (Özçalışkan and Goldin-Meadow 2005b). Compared to their TD peers, children with developmental disorders, such as autism spectrum disorder (ASD) or Down syndrome (DS), may experience delay in achieving early language milestones (i.e., first words, first sentences; Chapman 2003; Tager-Flusberg 2007). Furthermore, diagnosis-specific differences in gesture production have been observed. Most notably, children with ASD frequently exhibit difficulties in gesture production compared to TD children, particularly in early pointing gestures (Mundy et al. 1990). In contrast, children with DS gesture at rates similar to mental-age matched TD peers during early parent–child interactions (Iverson et al. 2003). However, we know relatively little about the gestures parents produce when interacting with their young children with ASD or with DS. We ask, in particular, whether the differences that we observe in the gesture production of children with developmental disorders as compared to their TD peers reflect the gestures children receive as input from their parents.

Gesture-Speech System in TD Children and Children with Developmental Disorders

TD children communicate with gestures before they do so with words (e.g., Bates 1976; Bates et al. 1979). They use several different types of gestures, including deictic (e.g., point at ball) and give (e.g., extend empty palm toward ball) gestures to indicate or request referents, iconic gestures to characterize them (e.g., hold cupped hands in air to convey roundness of a ball), and conventional gestures to convey culturally-prescribed meanings (e.g., nodding head for yes; Iverson et al. 1994; Özçalışkan and Goldin-Meadow 2005a, 2011). At this early stage, children’s communicative repertoire includes a greater variety of referential meanings conveyed in gesture than in words, with minimal overlap between the two modalities (Iverson et al. 1999). Moreover, many of these referents initially conveyed in gesture eventually enter children’s spoken vocabularies as words, with an average time lag of 3 months (Iverson and Goldin-Meadow 2005).

Children with ASD and with DS differ from TD children in the overall number of gestures they produce. Children with ASD gesture less than TD children, a difference that is particularly pronounced for deictic gestures (Mundy et al. 1990; Özçalışkan et al. 2016a). Children with DS, in turn, show large variability in how much they gesture across different communicative contexts. They gesture as often as TD children in their early spontaneous interactions with parents (Iverson et al. 2003) but gesture more than TD children in experimental contexts that are specifically designed to elicit gestures (Stefanini et al. 2007, 2008).

However, despite differences in the amount of gesture production, children with ASD and with DS show similarities to TD children in the types of gestures that they produce and in the way their early gestures relate to their emerging vocabularies in speech. Similar to TD children, children with ASD and with DS produce predominantly three gesture types, including deictic, give, and conventional gestures, in their early one-on-one interactions with their parents (Özçalışkan et al. 2016a; Özçalışkan et al. 2016b). More importantly, their early repertoires include a greater variety of referents in gesture than in speech, with very little overlap between the two modalities. And not surprisingly, many of the referents that they initially convey in gesture eventually become part of their vocabularies in speech as words—with a slightly more extended time lag, compared to their TD peers (Özçalışkan et al. 2017).

TD children continue to gesture even after they begin to use words, producing different types of gesture-speech combinations. They initially produce complementary combinations, in which gesture conveys the same information as the accompanying speech (e.g., “dog” + point at dog; Greenfield and Smith 1976). The early complementary combinations are shortly followed by supplementary combinations, in which gesture adds new semantic information to the accompanying speech (e.g., “bite” + point at dog; Masur 1983). These early supplementary combinations allow children to convey a diverse set of semantic relations across modalities, at a point when they cannot convey such relations exclusively in speech. More impressive, the supplementary combinations precede and predict the emergence of not only first sentences (Goldin-Meadow and Butcher 2003; Iverson and Goldin-Meadow 2005), but also of increasingly more varied and complex sentence constructions in children’s speech (Özçalışkan and Goldin-Meadow 2005b, 2006a, 2010).

Compared to TD children, we know very little about the early gesture-speech combinations that children with ASD or with DS produce. One of the few existing studies (Iverson et al. 2003) examined the gesture-speech combinations produced by 5 children with DS (ages 3;1–4;8) in parent–child interactions and found that all of the children produced gesture-speech combinations at rates comparable to mental-age matched TD children. Interestingly, the gestures in most of these combinations conveyed the same information as the speech they accompanied (e.g., “cookie” + point at cookie; Iverson et al. 2003). Another study (Sowden et al. 2008) focused on 2 children with ASD (ages 2;4 and 2;8) and showed that both children produced gesture-speech combinations at a time when they were not yet producing word–word combinations. Thus, a few studies with small samples suggest that children with developmental disorders show a pattern akin to TD children in combining gestures with speech. These studies also show that children with ASD and with DS use gesture to complement what is already conveyed in their speech rather than to convey supplementary information. Moreover, existing research suggests that the gesture-speech system remains relatively intact in children with developmental disorders. Children with ASD and with DS use gesture and gesture-speech combinations at a time when they do not have similar speech-only expressions in their verbal repertoires, thus following a pattern similar to TD children.

Gesture-Speech System in Parents of Children with or without Developmental Disorders

Parents of TD children adjust their verbal input to the communicative needs of their young children—a finding that has been replicated across numerous studies over several decades. For example, parents fine-tune the speech that they produce to their children’s level of prosody (e.g., Cooper et al. 1997), lexicon (e.g., Vibbert and Bornstein 1989), and syntax (e.g., Furrow et al. 1979) during one-to-one interactions. They produce shorter sentences with simpler words and with more enunciated intonation, thereby accommodating to the communicative needs of their children (see Snow 1995).

Parents of TD children not only talk but also gesture when communicating with their young children. More importantly, they also fine-tune their gestures to their children’s communicative needs. For example, English-speaking mothers gesture less frequently, using conceptually simpler gestures (i.e., deictic points) when talking to their 1;6 year-old children than when talking to other adults (Bekken 1989). The simplification of gestural input also becomes evident across several other studies with children across a broader age range (ages 1;2–2;10; Özçalışkan and Goldin-Meadow 2005a, 2006b, 2011) and with children learning other languages. Italian mothers, despite living in a culture known for its rich iconic gestural repertoires, also produce greater numbers of deictic and conventional gestures compared to the relatively more complex iconic gestures when interacting with their young children (ages 1;4–1;8; Iverson et al. 1999). Similarly, mothers further simplify the gestural input that they provide to their young children by using predominantly complementary gesture-speech combinations and relying less often on the more complex supplementary combinations during interactions (Iverson et al. 1999; Özçalışkan and Goldin-Meadow 2005a). Importantly, in addition to simplifying their gestural input, parents also provide models to their children for the different types of gestures and gesture-speech combinations. As shown in previous work (Iverson et al. 1999; Özçalışkan and Goldin-Meadow 2005a, 2006b, 2011), the types of gestures and gesture-speech combinations young TD children produce reflect the types of gestures and gesture-speech combinations that their parents produce. Both parents and children use deictic gestures the most, followed by conventional gestures. Parents also rarely produce iconic gestures—a pattern that also becomes evident in children’s gestures (Özçalışkan and Goldin-Meadow 2011).

The same pattern appears in the production of the different types of gesture-speech combinations. Both parents and children produce three distinct gesture-speech combination types, in which gesture either complements (e.g., “bottle” + point at bottle), disambiguates (e.g., “hold it” + point at bottle), or supplements (e.g., “thirsty” + point at bottle) the information conveyed in speech (Özçalışkan and Goldin-Meadow 2005a, 2006b; see also Özçalışkan and Dimitrova 2013, for a review).

Turning next to children with developmental disorders, most of the previous work on parental input focused solely on verbal input. It showed fine-tuning of parents’ speech to the communicative needs of their children, akin to the fine-tuning observed for the verbal input directed to TD children. For example, studies focusing on TD children, compared to either children with ASD (Bang and Nadig 2015; Talbott et al. 2015) or with DS (Mundy et al. 1988; Rondal 1988; Zampini and D’Odorico 2011), matched in language ability, showed no group differences in the amount, diversity, and complexity of the verbal input that parents provided to their children.

In contrast to the numerous studies on the verbal input directed to children with ASD and with DS, less is known about the nonverbal input these two groups of children receive from their parents. The few existing studies show that parents do not differ in the types of gestures (i.e., deictic, conventional, iconic) and/or gesture-speech combinations (i.e., complementary, disambiguating, supplementary) that they produce when interacting with their children with different developmental disorders. For example, Mitchell (2013) examined the gestures produced by mothers of 9 children with low risk versus 8 children with high risk for ASD when interacting with their 1;3-year-old infants. She found no differences between the two groups in either the amount or the types of gestures or gesture-speech combinations that the mothers produced. A few studies, however, show that even if parents do not differ in the types of gestures that they produce, they do show some differences in how often they produce the different types of gesture. For example, Talbott et al. (2015), similar to Mitchell, found no evidence of a difference between mothers of 1;0 year-old children at high risk for ASD (n = 38) and mothers of children at low risk for ASD (n = 27) in the types of gestures that they produced. Parents produced deictic gestures most frequently, followed by conventional and then iconic gestures, and at similar proportions in both groups. But different from Mitchell’s (2013) findings, Talbott et al. (2015) found that mothers of children with high risk for ASD gestured more than mothers of children with low risk for ASD, thus producing more of each of the three gesture types.

Gestural input to children with DS has been shown to be similar to that observed in studies of children with ASD. Iverson and colleagues (Iverson et al. 2006) compared the parent gesture input provided to 5 children with DS with that provided to TD children. They found that parents produced similar types of gestures and gesture-speech combinations and at similar distributions, with deictic gestures and complementary gesture-speech combinations being expressed most frequently in the two groups. However, there were also differences in how often parents gestured: Parents of children with DS produced more deictic gestures than parents of TD children, a pattern that was reversed for conventional gestures. Overall, the few studies on parental gestural input to children with developmental disorders suggest that parents show strong similarities to parents of TD children in the types of gestures and gesture-speech combinations that they produce, but they may differ in how often they produce each type of gesture.

Current Study

Most of the earlier work on gesturing in children with developmental disorders and their parents focused on either the amount or the types of gestures produced, leaving the informational relation gesture holds to accompanying speech (i.e., gesture-speech combinations) relatively unexplored. Similarly, the few previous studies examining parental gesture input to children with developmental disorders focused on a single disorder (ASD or DS), mostly with small sample sizes, making it difficult to draw broader conclusions about variability evident in patterns of gesture use within and across children with different developmental disorders and their parents.

In this study, we aim to fill in these gaps by providing a comprehensive account of the gesture-speech system of a larger sample of children with different developmental disorders and by comparing children’s gesture-speech system to that of their parents. More specifically, we study the early gesture-speech system of two groups of children with developmental disorders in comparison to a group of TD children, similar in expressive vocabulary. We first ask whether children with developmental disorders show patterns akin to TD children in the amount and types of gestures and gesture-speech combinations that they produce. We next ask whether diagnosis-specific patterns observed in children’s gestures reflect the gestures and gesture-speech combinations produced by their parents.

We predict that children with developmental disorders will differ from TD children in their amount of production of the different types of gestures and gesture-speech combinations (e.g., Iverson et al. 2003; Mundy et al. 1990). We also predict that parents might provide models for their children for the different types of gestures and gesture-speech combinations (e.g., Özçalışkan and Goldin-Meadow 2005a, b; Talbott et al. 2015). But, given the scarcity of existing research, it remains a distinct possibility that any differences observed in the gesture-speech system of children with ASD and with DS might mirror differences in their parents’ gesture-speech system. By systematically observing the relation between the parents’ and the children’s gesture-speech system, we hope to learn more about the source of diagnosis-specific communication differences. Moreover, these observations may shed new light on gesture’s potential as an intervention tool that parents, clinicians, and educators may use to help children with developmental disorders overcome difficulties in communicative development.

Methods

Sample

The participants came from a larger project on the development of joint engagement (Adamson et al. 2004, 2009, 2010, 2012). To study gesture, we selected 69 child-parent dyads, including 23 TD children (18 boys, Mage = 1;6, range 1;6–1;6), 23 children with ASD (20 boys, Mage = 2;6, range 1;9–3;1), and 23 children with DS (17 boys, Mage = 2;6, range 1;8–3;4), along with their parents. The 23 children in each group were selected so that the groups did not differ significantly in their word use during interactions, either for word tokens (i.e., number of words; MTD = 51.91 [SD = 59.68, range 3–247], MASD = 74.43 [SD = 116.01, range 0–392], MDS = 25.26 [SD = 39.39, range 0–190]), Kruskal–Wallis χ²(2) = 3.39, p = .18, or for word types (i.e., number of different words; MTD = 18.48 [SD = 20.51, range 3–95], MASD = 24.74 [SD = 32.98, range 0–106], MDS = 11.22 [SD = 18.87, range 0–93]), χ²(2) = 3.58, p = .17 [Footnote 1]. The child-parent dyads were predominantly Caucasian (TD: 74%, ASD: 83%, DS: 83%) and included mostly mothers, with the exception of two father-child dyads within the DS group. The parents in each group were comparable in age (MTD = 32.39 [SD = 4.92], MASD = 33.35 [SD = 3.34], MDS = 37.95 [SD = 5.09]) and education; the majority of the parents of TD children (78%), children with ASD (65%), and children with DS (78%) had at least a college degree at the time of our initial observation. All children were learning English as their native language [Footnote 2].

As part of the inclusion criteria in the larger project (Adamson et al. 2009), the children in the ASD group were referred by one of the three clinicians who had previously diagnosed the children with autism according to the DSM-IV-TR criteria for autistic disorder (American Psychiatric Association 2000). All three clinicians held doctoral degrees in clinical psychology and had extensive experience working with children with ASD. Once the parent consented to participate in our project, we confirmed the clinician’s diagnosis using the Autism Diagnostic Interview-Revised (ADI-R; Lord et al. 1994), which was administered by staff with MA degrees who were research-reliable on the ADI-R. ADI-R results were consistent with the clinicians’ diagnoses in all cases. All of the children scored above cut-off for autism on the social interaction, restricted or repetitive behavior, and communication scales, with the sole exception of one child who scored one point below cut-off on the nonverbal communication scale. None of the TD children had developmental problems, as reported by their parents.

Procedure

Child-parent dyads were observed in a laboratory setting, using a play protocol designed to elicit semi-naturalistic observations of parent–child communication, applicable to a diverse group of young children (Communication Play Protocol: CPP; Adamson et al. 2004). We used four 5-min CPP scenes: two that encourage requesting (getting toys from a high shelf, playing with complex toys) and two that encourage commenting (discussing pictures, discussing objects in container), for a total observation time of approximately 20 min per child.

Transcription and Coding

All observations were previously transcribed for child and parent speech (Adamson and Bakeman 2006). Sounds referring to entities, events, or properties, along with onomatopoeic sounds (e.g., “woof”, “choo-choo”) and conventionalized evaluative sounds (e.g., “oopsie”, “yay”), were counted as words. We further coded the parent–child interactions for gesture, using trained coders who were blind to the study’s hypotheses. We defined gesture as a communicative hand or body movement, directed to an interlocutor. Only hand movements that did not directly manipulate objects and were not part of a ritualized game were coded as gestures, with the exception of show gestures in which the gesturer held up an object to bring it to the attention of the interlocutor. These gestures, which functioned like pointing gestures, were coded as deictic gestures, following earlier work (Özçalışkan and Goldin-Meadow 2005a, b). We divided transcripts into communicative acts, defined as a sequence of words and/or gestures that was preceded and followed by a short pause or a change in conversational turn between child and parent. Each communicative act corresponded roughly to a sentence or a phrase, reflecting a single thought; its boundaries were typically marked with a falling or rising intonation or a short pause.

Speech

Word types, word tokens, and mean length of utterance (MLU; i.e., mean number of words per intelligible utterance) were calculated for each child and each parent using Systematic Analysis of Language Transcripts (SALT; Miller and Iglesias 2015).
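As a rough illustration of how these three speech measures relate to one another, the sketch below computes them from a list of intelligible utterances. It is not the SALT implementation; the function name, the lowercasing of words, and the toy input are assumptions made only for the example.

```python
def speech_measures(utterances):
    """Return (word tokens, word types, MLU in words).

    `utterances` is a list of intelligible utterances, each given as a
    list of word strings. Lowercasing before counting types is an
    assumption of this sketch, not a documented SALT rule.
    """
    tokens = [word.lower() for utterance in utterances for word in utterance]
    n_tokens = len(tokens)        # total number of words produced
    n_types = len(set(tokens))    # number of different words produced
    # MLU in words: total words divided by number of utterances
    mlu = n_tokens / len(utterances) if utterances else 0.0
    return n_tokens, n_types, mlu


if __name__ == "__main__":
    sample = [["pat", "the", "dog"], ["dog"]]
    print(speech_measures(sample))
```

In the toy sample, "dog" occurs twice, so it contributes two tokens but only one type.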

Gesture

We coded each gesture produced by children and their parents for its type, following earlier work (Özçalışkan and Goldin-Meadow 2005a). Gesture types consisted of (1) deictic gestures that indicated referents (e.g., pointing to or holding up ball to indicate ball), (2) give gestures that requested referents (e.g., extending open palm toward ball to request ball), (3) conventional gestures that expressed culturally-shared meanings with prescribed gesture forms (e.g., nodding head to mean yes), and (4) iconic gestures that conveyed attributes and actions associated with objects (e.g., thrusting arm forward to convey throwing). Iconic gesture use by the children in our study was extremely rare, accounting for a total of 14 instances across the three groups; we therefore excluded iconic gestures from our gesture type analysis for the children. In addition, we did not observe any incidence of a child or a parent producing a beat gesture, which was defined as a meaningless hand movement that was rhythmically related to speech but did not convey any semantic information (McNeill 1992).

Gesture + Speech

We coded each communicative act in which gestures were produced along with speech (i.e., gesture + speech) for the informational relation gesture held to the accompanying speech, following earlier work (Özçalışkan and Goldin-Meadow 2005b). The gesture-speech combination types included (1) complementary combinations, in which gesture and speech conveyed the same information (e.g., “bike” + point at bike), (2) disambiguating combinations, in which gesture clarified a pronominal referent in speech (e.g., “this” + point at bike), and (3) supplementary combinations, in which gesture added semantic information not found in the accompanying speech (e.g., “ride” + point at bike). Given the important role supplementary gesture-speech combinations play in the onset of particular sentence constructions in speech in TD children (Özçalışkan and Goldin-Meadow 2005b, 2006a, 2010), we also coded the supplementary gesture-speech combinations further in order to perform a qualitative analysis of the types of semantic relations conveyed. Following earlier work (Özçalışkan and Goldin-Meadow 2005b), supplementary gesture-speech combinations were classified into one of four types: (1) multiple arguments without a predicate (e.g., “mommy” + point at toy), (2) a predicate with at least one argument (e.g., “book” + give gesture), (3) an argument with an adjective or a filler (e.g., “silly” + point at toy; “yeah” + point at mirror), and (4) multiple predicates with or without arguments (e.g., “You want to air it again?” + give gesture).

Reliability

Inter-coder agreement was assessed on a randomly selected 15% of the video recorded sessions in each group by an independent coder, who was blind to the hypotheses of the study, separately for children and parents. For the children, inter-coder agreement was 89%, κ = .87 (TD: 91%, ASD: 86%, DS: 86%) for identifying presence of a gesture independent of its type, 92%, κ = .91 (TD: 93%, ASD: 93%, DS: 89%) for assigning meaning to gestures, 95%, κ = .93 (TD: 96%, ASD: 98%, DS: 91%) for classifying gestures into types as deictic, conventional, give, and iconic, and 96%, κ = .93 (TD: 98%, ASD: 96%, DS: 96%) for classifying gesture-speech combinations according to the informational relation gesture held to the accompanying speech as complementary, disambiguating, and supplementary. For the parents, agreement between coders was 92%, κ = .94 (TD: 95%, ASD: 94%, DS: 90%) for identifying gestures, 97%, κ = .98 (TD: 96%, ASD: 98%, DS: 96%) for assigning meaning to gestures, 97%, κ = .94 (TD: 97%, ASD: 99%, DS: 96%) for identifying gesture types, and 88%, κ = .82 (TD: 93%, ASD: 94%, DS: 84%) for classifying gesture-speech combinations according to the informational relation gesture held to the accompanying speech.
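For readers less familiar with the two agreement statistics reported above, the following is a minimal sketch of how percent agreement and Cohen's κ are computed for two coders who label the same items. The function name and the toy labels are hypothetical and are not taken from the study's coding files.

```python
from collections import Counter


def agreement_and_kappa(coder_a, coder_b):
    """Return (proportion agreement, Cohen's kappa) for two label lists."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty item set")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of the coders' marginal label proportions
    count_a, count_b = Counter(coder_a), Counter(coder_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / n**2
    if expected == 1.0:
        # Both coders used a single identical label; kappa is undefined,
        # and this sketch returns 1.0 by convention (an assumption).
        return observed, 1.0
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


if __name__ == "__main__":
    a = ["deictic", "deictic", "give", "conventional"]
    b = ["deictic", "deictic", "give", "give"]
    print(agreement_and_kappa(a, b))
```

Because κ discounts the agreement expected by chance, it is normally lower than the raw proportion of agreement computed on the same items.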

Analysis

The total number of words (tokens and types), gestures, gesture-speech combinations, and MLU were computed and analyzed separately for children and parents, using one-way ANOVAs, or Kruskal–Wallis tests where the assumption of homogeneity of variance or normality was violated, with group (TD, ASD, DS) as a between-subjects factor. We next computed the amount of each gesture type (deictic, give, conventional, iconic) and each gesture-speech combination type (complementary, disambiguating, supplementary) produced by each child and parent. The amount of children’s gestures and gesture-speech combinations showed large group differences. We therefore converted all raw frequencies into proportions, separately for the children and the parents, and arcsine-transformed the proportions for analysis. We analyzed differences with two-way ANOVAs, with group (TD, ASD, DS) as the between-subjects factor and either type of gesture or type of gesture-speech combination as the within-subject factor, separately for the children and their parents. The number of children who produced deictic gestures and supplementary gesture-speech combinations varied to some extent across groups. However, because we did not find such variability across groups in other types of gestures (i.e., give, conventional) or gesture-speech combinations (i.e., complementary, disambiguating), we included all children in all our analyses, using children’s relative production of each type of gesture or gesture-speech combination as our unit of analysis, as described above.
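The conversion from raw frequencies to transformed proportions can be illustrated with a short sketch. It assumes the common arcsine square-root form, asin(√p); the text does not state which variant was applied, and the handling of a speaker who produced no gestures at all is likewise an assumption made only so the example runs. The function name and the counts are hypothetical.

```python
import math


def arcsine_proportions(counts):
    """Convert raw counts per category into arcsine-transformed proportions.

    `counts` maps a category (e.g., a gesture type) to a raw frequency for
    one speaker. Uses asin(sqrt(p)), an assumed variant of the transform.
    """
    total = sum(counts.values())
    if total == 0:
        # Assumption of this sketch: a speaker with no gestures
        # contributes a proportion of zero to every category.
        return {category: 0.0 for category in counts}
    return {
        category: math.asin(math.sqrt(n / total))
        for category, n in counts.items()
    }


if __name__ == "__main__":
    child = {"deictic": 2, "give": 1, "conventional": 1}
    for category, value in arcsine_proportions(child).items():
        print(category, round(value, 3))
```

The transform stretches proportions near 0 and 1, which is why it is often applied before running ANOVAs on proportional data.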

Results

Speech

Children did not show group differences in either the amount (i.e., word tokens), Kruskal–Wallis χ²(2) = 3.39, p = .18, or the diversity (i.e., word types), χ²(2) = 3.58, p = .17, of the words that they produced (see Table 1, upper half). The lack of a significant difference among the three groups in word production (types and tokens) reflects our participant selection criteria, which yielded groups that did not differ significantly in the amount and variety of word production (see sample description). The three groups also did not differ in the complexity of their speech (MLU), χ²(2) = 1.04, p = .59. Most of the children (64/69) produced at least a few single words during the interaction; fewer than half of them (32/69) produced multi-word combinations.

Table 1 Children’s and their parents’ production of speech

The parents were also similar in their speech production, showing no group differences in the number of word tokens, F(2, 66) = 1.56, p = .22, and word types, F(2, 66) = 0.81, p = .45, that they produced. It is noteworthy that unlike child word tokens and word types, parent word tokens and word types were not used as criteria for participant selection. Parents, similar to their children, did not show a group difference in the complexity of the speech that they produced (i.e., MLU; see Table 1, lower half), F(2, 66) = 0.95, p = .39.

In summary, children—with or without developmental disorders—were similar in their amount and diversity of speech production (which was by design), and in the complexity of their speech. Parents were also similar across the three groups in terms of the amount, diversity and complexity of the speech that they produced in communicating with their children.

Gesture

We first looked at children’s overall gesture production and observed group differences, χ²(2) = 14.86, p = .001. As can be seen in Table 2 (top row), children with DS and with ASD were comparable in the number of gestures that they produced (Bonferroni, p = 1.0), but both groups produced significantly fewer gestures than TD children (Bonferroni, ps ≤ .01), a finding that we also reported in earlier work (Özçalışkan et al. 2016a, b). We next looked at children’s proportional use of different gesture types (deictic, give, conventional). As can be seen in Fig. 1A, children’s gesture production showed an effect of group, F(2, 66) = 4.27, p = .02, ηp² = .12 (with a significant difference between children with DS and with ASD in follow-up pairwise comparisons, Mdifference DS–ASD = .03; Bonferroni, p = .02), an effect of gesture type, F(2, 132) = 10.24, p < .001, ηp² = .13, but no interaction between gesture type and group, F(4, 132) = 1.79, p = .14. Across all three groups, children produced a higher proportion of deictic and give gestures than conventional gestures (Bonferroni, ps < .001). Iconic gestures were extremely rare but comparable across groups and were not included in the analysis.

Table 2 Children’s and their parents’ production of gesture
Fig. 1 Mean proportion of deictic, give, conventional, and iconic gestures that (A) children with typical development (TD), autism spectrum disorder (ASD) and Down syndrome (DS) and (B) their parents produced at Mage = 1;6 for TD children and Mage = 2;6 for children with ASD and with DS. The error bars represent standard error

Unlike that of their children, parents’ overall production of gesture did not show group differences, χ²(2) = 0.04, p = .98, with similar numbers of gestures produced by parents of children in each group (see Table 2, lower half). This pattern was also evident in the parents’ proportional use of each gesture type, which showed no effect of group, F(2, 66) = 1.43, p = .25, and no interaction between group and gesture type, F(6, 198) = 2.05, p = .06. In contrast, there was a main effect of gesture type, F(3, 198) = 589.83, p < .001, ηp² = .90, with parents producing a higher proportion of deictic than conventional, iconic and give gestures, and a higher proportion of conventional than give and iconic gestures (Bonferroni, all ps < .001; see Fig. 1B).

In summary, children, but not their parents, showed group differences in their overall production of gesture and in how often they produced each type of gesture. Yet the types of gestures and the relative distribution of each type were similar: Both children and parents were most likely to produce deictic gestures and least likely to produce iconic gestures.

Gesture + Speech

First looking at children’s overall production of gesture-speech combinations, we once again observed group differences, χ²(2) = 13.39, p = .001. Children with DS produced significantly fewer gesture-speech combinations than TD children (Bonferroni, p = .001), while children with ASD did not differ from TD children (Bonferroni, p = .11) or children with DS (Bonferroni, p = .37; see Table 3, upper half). The group differences also became evident at the individual level. Almost all of the TD children were producing gesture + speech combinations (22/23), while about two-thirds of the children with ASD (15/23) and about half of the children with DS (11/23) used gesture in combination with speech during our observation. Turning next to children’s proportional use of each gesture-speech combination type, we also found an effect of group, F(2, 66) = 8.35, p = .001, ηp² = .20, with significant differences between TD children and children with ASD and with DS (Bonferroni, ps ≤ .04), an effect of gesture-speech combination type, F(2, 132) = 12.27, p < .001, ηp² = .16, but no interaction between group and combination type, F(4, 132) = 1.33, p = .26. Across all three groups, children produced a greater proportion of complementary and supplementary than disambiguating gesture-speech combinations (Bonferroni, ps < .001; see Fig. 2A).

Table 3 Children’s and their parents’ production of gesture-speech combinations
Fig. 2 Mean proportion of complementary, disambiguating, and supplementary gesture + speech combinations that (A) children with typical development (TD), autism spectrum disorder (ASD) and Down syndrome (DS) and (B) their parents produced at Mage = 1;6 for TD children and Mage = 2;6 for children with ASD and with DS. The error bars represent standard error

The parents, unlike their children, did not show group differences in the number of gesture-speech combinations that they produced, χ²(2) = 0.48, p = .78 (see Table 3, lower half), and all parents, regardless of group, produced at least several gesture-speech combinations at the time of our observations. This pattern was also evident in parents’ proportional use of each gesture-speech combination type, with no effect of group, F(2, 66) = 0.94, p = .40, and no interaction between group and gesture-speech combination type, F(4, 132) = 1.31, p = .27, but an effect of gesture-speech combination type, F(2, 132) = 201.58, p < .001, partial η² = .75. Parents produced a greater proportion of complementary than disambiguating (Bonferroni, p < .001), and a greater proportion of disambiguating than supplementary (Bonferroni, p = .03), gesture-speech combinations (see Fig. 2B).
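The partial eta squared values reported for children and parents in this section can be recovered from the corresponding F statistics and degrees of freedom through the standard identity partial η² = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below is only an illustrative consistency check on the reported values, not part of the original analysis; the function name and the labels are hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    explained = f_value * df_effect
    return explained / (explained + df_error)


# (label, F, df_effect, df_error, reported effect size), copied from the text above.
REPORTED = [
    ("children: group", 8.35, 2, 66, 0.20),
    ("children: combination type", 12.27, 2, 132, 0.16),
    ("parents: combination type", 201.58, 2, 132, 0.75),
]

for label, f_value, df_effect, df_error, reported in REPORTED:
    recovered = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: recovered {recovered:.2f}, reported {reported:.2f}")
```

Under this identity, each recovered value rounds to the effect size given in the text, which suggests the reported F statistics and effect sizes are internally consistent.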

In summary, children showed group differences in their overall production of gesture-speech combinations as well as each combination type, while their parents did not. At the same time, the children and their parents showed similar patterns in the types of gesture-speech combinations that they produced, but produced them at different rates. While parents frequently used gesture (59%) to further complement what they already conveyed in speech (e.g., “bottle” + point at bottle), children used gesture not only to complement but also to further supplement (42%) what they conveyed in speech (e.g., “baby” + point to bottle). This pattern suggests that gesture may serve a different function in the early interactions for children with or without developmental disorders than for their parents (see Footnote 3).

Types of Supplementary Gesture-Speech Combinations

Unlike other types of combinations, supplementary gesture-speech combinations act like sentences as they convey propositional information across gesture and speech (Iverson and Goldin-Meadow 2005; Özçalışkan and Goldin-Meadow 2005b). As such, they might serve as an important communicative tool for children to expand their repertoire of sentence-like constructions at a time when they have more limited abilities to convey such semantic relations in speech alone. We next examined whether children with ASD and with DS would show similarities to TD children in the types of supplementary gesture-speech combinations that they produce, and if so, whether parents might be providing models to their children for the different supplementary gesture-speech combination types.

Looking first at children, we found that they produced three types of supplementary gesture-speech combinations, including (1) argument + argument, which conveys relations between two or more arguments, without a predicate (e.g., “Mommy” + point to bubble, “Daddy door” + point to car), (2) argument(s) + predicate, which conveys relations between a predicate and one or more arguments (e.g., “book” + give gesture; “I want to hold” + point to clock; “mama” + lift gesture), and (3) argument + adjective/filler, which adds an argument to either a filler expression or an adjective in speech (e.g., “yeah” + point to car; “uhoh” + hold up balloon; “pretty” + point to flower). The number of children producing at least one of the three supplementary gesture-speech combination types was similar in the ASD (n = 12) and the TD (n = 15) groups, but substantially lower in the DS group (n = 4). Children in all groups also occasionally used gesture to convey information that conflicted with their speech (e.g., “dinosaur” + hold up lion; “dog” + point to lion; “it is a car” + hold up truck); see Table 4 for additional examples of children’s supplementary gesture-speech combinations.

Table 4 Examples of the types of semantic relations children with autism spectrum disorder (ASD), Down syndrome (DS) and typical development (TD) conveyed in supplementary gesture-speech combinations

Turning next to parents, we observed similar types of supplementary gesture-speech combinations in their repertoire, including (1) argument + argument(s): e.g., “[Child’s name]” + point to car; “a butterfly?” + point to flower, (2) argument(s) + predicate: e.g., “splash” + point to water; “she can sit” + point to truck; “put the elephant in” + point to container; “mommy” + give gesture; and (3) argument + adjective/filler: e.g., “nice and soft” + hold up stuffed toy; “bye” + hold up monkey; “uhoh” + point to toy. Importantly, however, the parents’ supplementary gesture-speech combinations also differed from their children’s in several interesting ways. One such difference was that parents used such combinations to ask a question in speech and provide the answer in gesture (e.g., “Where did the apple go?” + point to floor; “What do you want me to do?” + hold up bubble wand; “More?” + hold up balloon), a pattern that we never observed in children. Another interesting difference was that parents used supplementary combinations to add a predicate in gesture to an existing predicate in speech (predicate + predicate; e.g., “What do you want?” + come here gesture; “Let’s not bite it” + give gesture requesting balloon; “You have to blow into it” + give gesture requesting bubble wand). Such predicate + predicate combinations, which are akin to complex sentences in speech, were never observed in children’s communications, with the exception of one child with ASD who produced a single instance of such a supplementary combination (i.e., “I want to throw it” + give gesture requesting cat picture); see Table 5 for additional examples of parents’ supplementary gesture-speech combinations. Overall, however, the types of supplementary gesture-speech combinations children produced resembled the ones that their parents produced, suggesting that parents might be providing models for their children for the types of supplementary gesture-speech combinations.

Table 5 Examples of the types of semantic relations parents of children with autism spectrum disorder (ASD), Down syndrome (DS) and typical development (TD) conveyed in their supplementary gesture-speech combinations

Discussion

In this study, we asked whether children with developmental disorders and TD children differ in gesture production during parent–child interactions, and whether differences in children’s gesture production appear to mirror their parents’ gestural input. We addressed these questions by systematically observing the types of gestures and gesture-speech combinations produced by children and parents in three groups of 23 dyads that were defined by the child’s diagnosis (TD, ASD, and DS) but similar in terms of child’s expressive language use. We found that parents and their children produced similar types of gestures (deictic, conventional, iconic) and gesture-speech combinations (complementary, disambiguating, supplementary) across the three groups. However, only children, but not their parents, showed diagnosis-specific group differences in how often they produced each type of gesture and gesture-speech combination. These results suggest that even though parents in the three groups provide their children with similar models of the types of gestures and gesture-speech combinations, diagnosis-specific variability in how often children produce each type of gesture is not related to parental gesture input.

Do Children with Developmental Disorders Differ from TD Children in the Types of Gestures and Gesture-Speech Combinations that they Produce?

Children showed the expected diagnosis-specific differences in gesture production, with lower frequency of gesture production in children with ASD and with DS compared to TD children, a finding also reported in earlier work (Özçalışkan et al. 2016a, b). Importantly, however, group differences observed in children’s production of gesture did not vary as a function of gesture type. Across groups, children produced a greater proportion of deictic and give gestures than conventional and iconic gestures. The robust effect of gesture type that cuts across groups might be explained by the relative complexity of what the gesture represents. The mapping between a deictic or a give gesture and its referent is more transparent than the mapping between a conventional or an iconic gesture and its referent, because deictic and give gestures both relate to the world more directly by indicating or requesting perceptually cohesive entities. In contrast, both conventional and iconic gestures convey relational concepts, such as actions (e.g., thrusting empty palm forcefully forward to convey throwing; placing index finger in front of lips to convey being quiet) or features (e.g., hold pinched fingers in air to convey size of a window), rendering them cognitively more challenging for children at the younger ages (Özçalışkan et al. 2014).

The complexity of the form of the gesture itself may further heighten the cognitive demand associated with each gesture type. Iconic and conventional gestures involve representing referents with particular symbols and might require more complex representational abilities that do not begin to emerge until ages 2 to 3 (e.g., DeLoache 2004; Lillard 1993). More specifically, the form of either deictic or give gestures does not vary as a function of its referent, while the form of a conventional or iconic gesture does vary considerably across different referents. As such, it might be easier for young children to produce deictic and give gestures, each of which uses a single gesture form for all referents that they represent (see Özçalışkan et al. 2014, for further discussion).

Our findings also show a lower rate of gesture production, particularly in children with ASD. One plausible explanation is that the lower rate of gesture production in the ASD group is due largely to difficulties producing deictic gestures. Indeed, several children in the ASD group, but not in the other two groups, did not produce any deictic gestures. The diminished use of deictic gesture can be viewed as one aspect of a broader constellation of difficulties young children with ASD have developing joint attention skills (Dawson et al. 2004; Mundy et al. 1990) and sustaining joint engagement during interactions (Adamson et al. 2009).

Interestingly, the lower rate of gesture production was also evident in children with DS in our study. This finding contrasts with earlier studies that showed relative strengths in gesture use within this group (e.g., Caselli et al. 1998; Franco and Wishart 1995; Iverson et al. 2003; Stefanini et al. 2007). Methodological differences may account for the differences in findings. Previous work relied on smaller sample sizes (e.g., Iverson et al. 2003), more indirect measures of child gesture production (e.g., parental checklists; Singer Harris et al. 1997), and/or a broader definition of gesture that included other nonverbal behaviors (e.g., baby signs; Caselli et al. 1998). Our study, in contrast, relied on systematic observations of young children interacting with their parents in a semi-naturalistic setting, used a relatively larger sample size, and applied a more precise definition of gesture. Thus we may have obtained a more representative picture of how young children with DS produce gestures.

The children in our study, even though they were comparable in expressive language use, differed in chronological age, raising the possibility that chronological age, rather than diagnosis-specific variability, might be driving the differences in gesture use. Interestingly, however, gesture production was not correlated with children’s chronological age in any of the three groups (TD: rₛ = .01, p = .96; ASD: rₛ = .37, p = .08; DS: rₛ = .27, p = .21). In fact, children’s expressive language use (i.e., word tokens) was a better predictor of children’s gesture production in the TD group (rₛ = .51, p = .01) and the ASD group (rₛ = .84, p < .001), but not in the DS group (rₛ = .10, p = .66). The lack of a relation between chronological age and gesture production thus suggests that diagnosis-specific variability in gesture use might be more closely linked to children’s spoken language abilities.

Children also showed diagnosis-specific differences in their production of gesture-speech combinations, with fewer instances of gesture-speech combinations observed in children with ASD and with DS compared to TD children. One explanation could be that differences in the overall production of gesture-speech combinations are driven largely by differences in the production of gestures. This is particularly plausible given that, by design, the children in our study did not differ significantly in the amount of speech that they produced (i.e., word tokens). Thus, even though the children were speaking at similar levels, children with ASD and with DS were less likely to gesture when speaking compared to TD children, thereby producing fewer gesture-speech combinations. Furthermore, children with DS were at the lower end of the distribution with respect to their word production, compared to TD children and children with ASD, a difference that might have also led to the lower number of gesture-speech combinations observed within this group.

More importantly, however, group differences in children’s overall production of gesture-speech combinations did not vary as a function of gesture-speech combination type—a pattern akin to the one observed for gesture type. Across all three groups, children produced a greater proportion of gesture-speech combinations in which gesture either conveyed the same information as speech (i.e., complementary gesture + speech) or additional information not found in speech (i.e., supplementary gesture + speech) than combinations in which gesture further clarified a proform (i.e., disambiguating gesture-speech). These findings follow earlier work with young TD children (ages 1;6 − 1;10) who also showed more frequent incidence of complementary and supplementary gesture-speech combinations than disambiguating ones in their communications with their parents (Iverson et al. 1999; Özçalışkan and Goldin-Meadow 2005b).

The occurrence of supplementary gesture-speech combinations is noteworthy given that this type of gesture-speech combination has been shown to play a particularly important role in language development because it allows children to convey different semantic relations across gesture and speech before they have the ability to express such relations as sentences in speech (Iverson and Goldin-Meadow 2005; Özçalışkan and Goldin-Meadow 2005b, 2006a, 2009). Importantly, the kinds of meanings children conveyed in using supplementary combinations were similar across the three groups (see Table 4), allowing the child to either add an additional argument to a sentence with a predicate or a predicate to a sentence with argument(s). These findings further highlight the commonalities across different diagnostic groups in the way they utilized these gesture-speech combinations to express different elements of a sentence (one in gesture and one in speech) at a time when the child may not be able to express those elements within a single spoken utterance.

However, it is also important to note that although many of the TD children (17/23) and children with ASD (14/23) produced supplementary gesture-speech combinations, only 5 of the 23 children with DS did. This finding, coupled with the finding that fewer than half of the children with DS were combining gestures and speech, suggests that children with DS might not have been advancing as quickly as children in the other two groups in the way they used gesture in relation to speech, still utilizing gesture predominantly to complement what they already conveyed in speech.

In this study, we approached the question about diagnosis-specific variability in patterns of gesture production at a single period in development, one that precedes the production of sentences in speech; and we found evidence of emerging sentence-construction abilities in children’s early supplementary gesture-speech combinations. Children conveyed different semantic relations across gesture + speech at a period when they cannot yet produce such semantic relations exclusively in speech. However, we still do not know whether children with ASD or with DS follow a trajectory akin to TD children from gesture-speech combinations to speech-only utterances in conveying the different semantic relations—a question that requires future longitudinal studies that follow children from producing their first words to their first complex sentences.

Are Differences that We Observe in the Early Gesture-Speech System of Children with Developmental Disorders a Reflection of the Gestural Input That They Receive from Their Parents?

Unlike their children, parents of children with ASD and with DS did not differ from parents of TD children in the number of gestures that they produced, suggesting that their amount of gesture use did not parallel the production levels exhibited by their children. Importantly, however, the relative distribution of each gesture type showed strong similarities across the three groups of parents, and followed a pattern akin to their children, with a greater proportion of deictic gestures than conventional and iconic gestures.

These findings suggest that children might have learned not only the different types of gestures from their parents, but also how often to use them. The only exception to this pattern was ‘give’ gestures, which children but not parents produced quite frequently. This difference might be due in part to our use of the Communication Play Protocol that included scenes, such as one where enticing objects were placed on a high shelf, that were designed to observe how the child made requests. Thus the study design may have encouraged the child to produce a greater proportion of give gestures than their parents.

Nevertheless, even if children were learning about gesture from their parents’ input and some of them were beginning to combine gestures and speech, the overall use of gesture as a means of communication was markedly different. Gesture accounted for a small proportion of parents’ communicative acts addressed to their children; only 24% of parental communicative acts contained a gesture across the three groups (TD: 25%, ASD: 26%, DS: 22%), and almost all of these gestures were produced with speech. In contrast, across groups, 64% of the communicative acts children produced contained gesture (TD: 65%, ASD: 57%, DS: 69%), and the majority of these gestures were produced without speech. Thus it appears that gesture was serving a different function for the children, compared to their parents, providing them with a way to indicate or request referents that they could not yet express using words alone.

Turning next to gesture-speech combinations, parents of children with ASD and with DS did not differ from parents of TD children in the number of gesture-speech combinations that they produced, exhibiting a pattern different from their children, who did show such group differences. Perhaps more importantly, the production rate of each gesture-speech combination type did not differ across the three groups of parents, but did differ from that of their children. The majority of the gesture-speech combinations parents produced were complementary (59%), in which gesture further reinforced what they conveyed in speech. In contrast, children, across groups, were as likely to produce complementary combinations as supplementary ones, in which gesture conveyed additional information not found in speech.

Why are supplementary gesture-speech combinations more prevalent in children’s, but not in parents’ communications? As shown across numerous studies, children’s initial grasp of a concept, be it linguistic or cognitive, becomes evident first in gesture. Moreover, children who are at the cusp of mastering a new concept gesture differently than children who are not at that stage (Goldin-Meadow 2003). More specifically, children are apt to use gesture to convey additional information not found in their speech (i.e., mismatches, akin to supplementary combinations in our study) when explaining concepts that they are in the midst of learning, thus indexing their readiness to take the next developmental step (e.g., Church and Goldin-Meadow 1986; Goldin-Meadow and Singer 2003; see Goldin-Meadow 2014; Özçalışkan and Hodges 2016, for reviews). Many of the children in our study were on the verge of producing their first sentences. Thus, they might be showing their readiness to take the next step of expressing sentences in speech first in their supplementary gesture-speech combinations. Unlike their parents, who were experts in expressing sentences exclusively in speech, children were still novices, using gesture + speech as a segue to convey their burgeoning knowledge of expressing propositional information. Parents, across groups, also produced a greater proportion of disambiguating gesture-speech combinations than their children, a difference that is likely an outcome of their greater use of pronominal referents.

It is important to note that both parents and their children, across groups, produced all three types of gesture-speech combinations (complementary, disambiguating, supplementary), suggesting that children who were using gesture-speech combinations might have learned the particular ways to combine gesture and speech from their parents. Parents also showed similarities to their children in the types of supplementary combinations that they produced, using gesture to add arguments or predicates to their speech, thus providing detailed models in conveying different kinds of sentence-like constructions across modalities. There is mounting evidence that suggests that young children, with or without developmental disorders, are extremely good at gleaning information from both gesture and speech input when presented with supplementary gesture-speech communications (Dimitrova et al. 2017; Morford and Goldin-Meadow 1992). For example, by age 1;3, TD children can successfully act on an object that was uniquely identified in a deictic gesture-speech combination (e.g., “open” + point at bag), and a few months later (1;8), can even do so when presented with a conventional gesture that requests a similar referent (“ball” + give gesture; Morford and Goldin-Meadow 1992). At a later age, when given an iconic co-speech gesture that expresses object information not found in speech (e.g., “I am eating” + move empty cupped hands in parallel as if holding a sandwich; “I have this toy” + flapping downward-facing open palms as if a bird flying), 2;8- to 3;5-year-olds can even correctly choose the picture of the referent expressed uniquely in an iconic gesture-speech combination (e.g., a sandwich, a bird, respectively; Hodges et al. 2017; Stanfield et al. 2014). Input containing gestures conveying different information from speech has also been shown to be a powerful teaching tool to promote learning on other cognitive tasks with older children. School-age children, when exposed to multiple types of new information (e.g., multiple strategies in solving a math equation), show better learning of the information when the different types are conveyed across gesture + speech than when they are presented only in speech (Singer and Goldin-Meadow 2005). As such, the types of supplementary gesture-speech combinations produced by the parents in our study might serve as the right target input for children with or without developmental disorders who are at the cusp of learning to express propositional information in the spoken modality.

Our study showed some similarities between parents and their children, at the group level, in their patterns of gesture production, as well as some marked differences. The question still remains, however, whether parents play a causal role in teaching their children what types of gestures or gesture-speech combinations to produce. There is some observational evidence that young children who live in linguistic communities that show richer use of iconic gestures, such as Italy (Iverson et al. 2008) or Turkey (Furman et al. 2014), go on to produce iconic gestures at an earlier age and at greater frequencies compared to children who live in cultures that do not show such abundant use of iconic gestures (e.g., North America; Özçalışkan et al. 2014). Future experimental studies that systematically vary the amount and type of gestural input children receive from an adult might shed further light on the link between input and child gesture in children with different developmental profiles.

Our study also relied on a relatively modest sample size (n = 23/group), a size further reduced by the smaller number of children producing deictic gestures within the ASD group and supplementary gesture-speech combinations within the DS group. Future larger-scale studies examining diagnosis-specific variability in the types of gestures and gesture-speech combinations children produce can further confirm the robustness of the nonverbal communicative strategies that the young children with ASD and with DS in our study employed.

In summary, our study showed that children with developmental disorders resemble TD children in the types of gestures and gesture-speech combinations that they produce, but also differ from them in how often they produce each type. Parents of children with developmental disorders also resemble their children in the types of gestures and gesture-speech combinations that they produce, but, in contrast to their children, they do not differ from parents of TD children in their amount of production of each type of gestural communication, a finding that further extends earlier work that showed no group difference in parental verbal input addressed to children (e.g., Mundy et al. 1988; Talbott et al. 2015). Even if parents were providing similar gestural input, the child’s uptake of this input might have been different in the three groups (as argued for speech input; Arunachalam and Luyster 2015), particularly in the way children garnered and incorporated this information into their existing understanding of referents and semantic relations, as reflected in the group differences observed in children’s gestures and gesture-speech combinations. At the same time, however, the largely intact gesture-speech system in children with developmental disorders suggests that gesture might be serving a similar role in early language development for these children by providing a scaffold for emerging spoken language abilities in learning new words and sentences, a scaffold that might be further aided by the parents’ modeling of different gesture types and gesture-speech combinations.