Introduction

Gesturing is a widespread phenomenon in the animal kingdom, as well as an important facet of human language. Indeed, before children start to speak, they produce a variety of gestures, which pave the way for their spoken language development (Bates et al. 1979; Carpenter et al. 1998; Iverson and Goldin-Meadow 2005). Moreover, adults continue to use gestures to accompany spoken and signed languages (McNeill 1985; Goldin-Meadow 2002, 2003). While the evolutionary emergence of language is still debated, studying non-human primate gestures can inform evolutionary models about the commonalities of forms, functions, and cognitive and neurobiological underpinnings of its gestural components. Communicative gestures of our close phylogenetic relatives have been relatively little studied compared to vocalisations. However, it is now well acknowledged that studies of the gestural system are of primary interest for reconstructing a coherent evolutionary scenario of language considered as a multimodal communication system (Call and Tomasello 2007; Arbib et al. 2008; Waller et al. 2013). Notably, there is increasing evidence that humans and apes share some intentional communicative abilities likely to have evolved through gestural communication (Arbib et al. 2008; Liebal and Call 2012). However, it is not well established whether the gestural communication of non-ape primates possesses similar properties, including the forms, functions, flexibility, and intentionality of gestures.

One of the main characteristics of human language is its incredible flexibility in acquisition and usage. Recent studies have shown that gestures in apes are also used flexibly. This flexibility is determined by the so-called ‘means–ends dissociation’. This criterion, originating from developmental psychology through the investigation of communication in human infants, is characterised by the flexible relation between forms and functions of communicative signals, where different gestures can be used for the same goal and the same gesture can be used for different goals (e.g. Call and Tomasello 2007; Pollick and de Waal 2007). In non-human primates, this is usually assessed by analysing the range of functional contexts (such as play or agonistic) in which a gesture occurs, and the diversity of gestures which occur within a single context. A means–ends dissociation between gesture type and context has been found in several species of apes, both in captive and wild populations (chimpanzees, Pan troglodytes: Tomasello et al. 1994; Call and Tomasello 2007; Hobaiter and Byrne 2011b; Roberts et al. 2012; bonobos, Pan paniscus: Pika et al. 2005b; Genty et al. 2015; Graham et al. 2017; gorillas, Gorilla gorilla: Pika et al. 2003; Pika 2007; Genty et al. 2009; orangutans, Pongo pygmaeus: Liebal et al. 2006; and siamangs, Symphalangus syndactylus: Liebal et al. 2004b). Apes are also able to learn new gestures taught by humans such as sign language, often through extensive training including moulding of hands to form signs, but also through no more extensive moulding of the signs than can be seen in human mothers of deaf children (Patterson and Linden 1981; Fouts et al. 1989; Gardner et al. 1989; Miles 1990; Tomasello and Camaioni 1997). Their gestural communicative system is also variable, as not all individuals produce the same set of gestures (e.g. Tomasello et al. 1994; Liebal et al. 2004b; Pika et al. 2005b; Liebal et al. 2006). Individual repertoire size varies particularly across age classes, with juveniles using a larger variety of gestures than adults (Tomasello et al. 1994, 1997; Liebal et al. 2004b, 2006; Genty et al. 2009; Hobaiter and Byrne 2011a, b). However, sex differences are scarcer and often limited to sexual context (e.g. Liebal et al. 2004b; Scott 2013).

More importantly, there is increasing evidence that the production of gestures in apes possesses the main criteria of intentionality, especially in terms of directing gestures toward recipients, waiting for a response, and taking into account the attentional state of the recipient (e.g. Call and Tomasello 2007; Byrne et al. 2017). Indeed, gestures are directed toward an audience and the signaller waits briefly after gesturing to monitor the recipient for behavioural response (i.e., response waiting; Tomasello et al. 1985, 1994; Call and Tomasello 2007). Intentionality criteria have notably been used as baseline conditions to select which type of gesture to record or not when investigating the repertoire of gestural communication of apes (e.g. Pika et al. 2003; Liebal et al. 2006; Hobaiter and Byrne 2011a; Roberts et al. 2014). Furthermore, the signaller takes into account the attentional state of the recipient when producing a gesture. This is the so-called ‘audience effect’, characterised by a sensitivity to the presence/absence of a potential recipient and by the differential use of gestures as a function of the attentional state of the recipient (Call and Tomasello 2007; Leavens et al. 2004a). Particularly, gestures can vary in modality, and silent, visual gestures (i.e., gestures that create no sound and no contact with the recipient) can only be effective if produced toward a recipient that is visually attending, whereas tactile (i.e. gestures that create a contact with the recipient) or audible gestures (i.e., gestures that create a sound while being performed) can potentially be effective even if the recipient is not visually attending. For example, it has been shown experimentally that chimpanzees are able to adapt their visual and auditory communicative behaviours in accordance with the attentional and intentional status of a human observer when begging for food (Hostetter et al. 2001; Povinelli et al. 2003; Leavens et al. 2004b, 2010; Poss et al. 2006). However, individuals that have the opportunity to move in front of the recipient before producing visually based gestures seem to favour this option (Liebal et al. 2004a, b). Observational studies of spontaneous communicative behaviours also indicated that apes use more visual gestures when the recipient is already attending, and can, to some extent, modify their use of tactile or audible gestures when the recipient is not attending (Tomasello et al. 1994; Pika et al. 2003, 2005a, b; Liebal et al. 2004b, 2006; Genty et al. 2009; Hobaiter and Byrne 2011a; but see also Tempelmann and Liebal 2012). In addition, in the absence of a response from the recipient, or when the response is apparently unsatisfactory, apes either persist with using the same gesture, or elaborate using another gesture or signal until they are satisfied by the response (e.g., towards humans: Leavens et al. 2005; Cartmill and Byrne 2007; towards conspecifics: Liebal et al. 2004a; Hobaiter and Byrne 2011b; Roberts et al. 2013). Moreover, intentional communication may be more widespread in the animal kingdom than originally thought, as suggested by recent evidence of intentional production of gestures in fishes, birds, dogs, and ungulates (Gaunet and Deputte 2011; Vail et al. 2013; Malavasi and Huber 2016; Nawroth et al. 2016; Smith 2017; Townsend et al. 2017). For example, horses (Equus caballus) were able to take into account the attentional state of the human recipient when communicating about a desired out-of-reach reward (Malavasi and Huber 2016). These pieces of evidence indicate that intentional communication may have provided adaptive benefits in the course of evolution.

In contrast with such extensive knowledge of apes, little is known about monkey gestural communication. Some studies have investigated the repertoire of gestures used by several species of macaques, by looking notably at the effect of social structure on the type of gestures performed and the context of use (Hinde and Rowell 1962; Maestripieri 1996, 1997, 1999, 2005; Hesler and Fischer 2007). Macaque species displaying higher levels of tolerance and relaxed dominance might possess a wider range of communicative signals than less tolerant species (e.g. Maestripieri 2005). In baboons, some gestural behaviours have been described in olive baboons (Papio anubis, e.g., Smuts 2002) as well as in hamadryas baboons (Papio hamadryas) within the ethogram provided by Kummer (1968). Some studies have also shown that mandrills (Mandrillus sphinx) were able to spontaneously invent new gestures (Laidre 2008, 2011). However, compared to apes, there is a real lack of systematic and comparable studies on the gestural communication of monkeys. Notably, most studies showing that the production of gestures by monkeys was intentional have been done in experimental settings using trained gestures to request food from humans (e.g. Hattori et al. 2010; Maille et al. 2012; Meunier et al. 2013; Bourjade et al. 2014; Canteloup et al. 2014, 2015; but see also Gupta and Sinha 2016). For example, when begging for food, olive baboons gestured more often when the experimenter could see them and adjusted their visual and auditory gestures to the visual attention of the human recipient (Bourjade et al. 2014). This raises the question of whether these skills have been learned during the experiments or whether monkeys possess a preexisting ability to discriminate recipients’ attention. Consequently, it remains unclear which types of intra-specific gestures are used by monkeys, and whether they possess the same advanced properties as ape gestures (Pika et al. 2005a).

It is worth noting that the gestural communication of both baboons and chimpanzees involves cerebral areas located in the left hemisphere which appear similar to the areas involved in human language (Meguerditchian et al. 2011a; Meguerditchian and Vauclair 2014; Marie et al. 2018). Recent studies further showed that olive baboons, like apes and humans (e.g. Knecht et al. 2000; Hopkins et al. 2012; Meguerditchian et al. 2012, 2013), were mostly right-handed for gesturing (i.e., the ground slapping gesture: slapping of the hand on the ground), and those hand preferences were very consistent over time and across populations (Meguerditchian and Vauclair 2006; Meguerditchian et al. 2011b; Molesti et al. 2016). Baboons seem to share interesting neurobiological underpinnings with chimpanzees and humans and, therefore, emerge as an excellent model to investigate the communicative and socio-cognitive precursors of language (e.g., Fagot et al. 2018).

Therefore, the present study investigated whether the abilities shown by olive baboons expanded beyond the experimental context and applied to intra-specific communicative interactions. Using a methodology closely modelled after ape studies, we established the first naturalistic repertoire of gestural communication in olive baboons, based on observations of three groups of captive baboons. Then, we examined the flexibility, variability, and intentionality of gesture use to determine if an old-world monkey species would possess similar communicative properties to human and non-human apes. By providing the first quantitative description of monkey gestures, this study will help further document baboon communication as well as the evolution of complex communication and sociality within the primate lineage.

Method

Subjects

This study was conducted on three social groups of captive-born olive baboons (P. anubis) living at the Station de Primatologie of the Centre National de la Recherche Scientifique (CNRS, UPS 846, Rousset, France). In total, 47 subjects were systematically observed in this study: 13 males and 34 females; 4 infants (0–1 year), 9 juveniles (1–4 years), 7 subadults (4–7 years), and 27 adults (from 7 years). The subjects were aged from 0 to 25 years, and were housed in large cages or parks ranging from 15 to 650 m². They received monkey pellets twice per day, as well as fresh fruits, vegetables, and grains. Water was available ad libitum. Groups 1, 2, and 3 were composed of 32, 6, and 9 individuals, respectively.

Procedure

Data were collected during 1 year, from October 2015 to October 2016. A communicative gesture was defined as a movement of the body or part of the body, directed to a specific partner or audience. This definition thus included actions of the whole body, of parts of the body (e.g. limb, head), and facial expressions (i.e. movements of parts of the face). A gesture could be directed to a partner via eye gaze, body orientation, or physical contact (e.g. Liebal et al. 2004b). In contrast with ape studies (e.g. Liebal et al. 2006; Hobaiter and Byrne 2011a), the methodological approach was to record any behaviour corresponding to this definition without screening gestures a priori with intentionality criteria. Instead, we tested every single criterion of intentionality on our gestural data set, leaving the case for non-intentional communicative gestures open throughout.

Focal animal sampling was used (Altmann 1974) to observe each subject for a total of 5 h. For this, each focal monkey was randomly selected and followed for 60 sessions of 5 min. In total, 80% of the focal sessions were collected live using a voice recorder, and 20% of the focal sessions were videotaped using a digital video camera (SANYO Xacti®) recording at 30 fps (1920 × 1080 Full-SQH). The data were then transferred to Excel spreadsheets while listening to the recordings and scanning the videos (see details of data collection in Online Resource 1). If the focal subject moved out of sight for more than 1 min, the recording was deleted, and the session was started again once the subject became available. Each monkey was observed only once per day, and the focal sessions were balanced between the morning and afternoon periods and spread over seasons. All gestures produced by the focal monkey were recorded to extract the following information:

  1. The ID of the recipient.

  2. The type of gesture produced (see details in Online Resource 1).

  3. The orientation of the signaller (Liebal et al. 2004b): (a) ‘looking’ was defined as the signaller having its eyes and/or face directed toward the recipient; (b) ‘not looking’ was defined as the signaller having its head turned away from the recipient with no eye contact.

  4. Response waiting (Hobaiter and Byrne 2011a): (a) ‘response waiting’ was recorded when the signaller maintained its related recipient-directed posture beyond the end of the gesture and/or some visual contact with the recipient; (b) ‘no response waiting’ was recorded otherwise.

  5. The recipient attention (Liebal et al. 2004b): (a) ‘attending’ was defined as the recipient having its eyes and/or face directed toward the signaller; (b) ‘not attending’ was defined as the recipient having its head turned away from the signaller or having its attention distracted by another individual or event in its environment.

  6. Behavioural context, as judged qualitatively by the available pre- and post- information that accompanied the signaller’s gesture (Schneider et al. 2012): (a) parental care (behaviours involving the care of a mother toward her infant), (b) agonistic (aggressive behaviours such as chasing, biting, or threatening), (c) submissive (submissive behaviours such as fleeing, usually following an aggressive behaviour received), (d) play (play behaviours such as play-wrestle and rough-and-tumble play), (e) sexual (behaviours accompanying mating interaction), (f) allo-grooming (a monkey grooms a partner, i.e., goes through the fur of another monkey with its fingers, removing dirt and/or parasites), (g) affiliative (friendly approaches toward other individuals such as greeting, excluding allo-grooming), and (h) other (i.e. gesture that could not be categorised in a particular context).

  7. Combination (Liebal et al. 2004b, 2006): gestures were either produced in isolation and recorded as ‘single’ or simultaneously with others and recorded as ‘combined’.
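The seven items coded for each gesture can be pictured as one record per gesture. The following Python sketch is only an illustration of that coding scheme: the class and field names (`GestureRecord`, `signaller_looking`, and so on) and the example values are hypothetical and are not taken from the authors' spreadsheets.

```python
from dataclasses import dataclass
from enum import Enum


class Context(Enum):
    """The eight behavioural contexts listed in the coding scheme."""
    PARENTAL_CARE = "parental care"
    AGONISTIC = "agonistic"
    SUBMISSIVE = "submissive"
    PLAY = "play"
    SEXUAL = "sexual"
    ALLO_GROOMING = "allo-grooming"
    AFFILIATIVE = "affiliative"
    OTHER = "other"


@dataclass(frozen=True)
class GestureRecord:
    """One coded gesture (field names are illustrative assumptions)."""
    signaller_id: str          # focal subject producing the gesture
    recipient_id: str          # item 1: ID of the recipient
    gesture_type: str          # item 2: type of gesture
    signaller_looking: bool    # item 3: 'looking' vs 'not looking'
    response_waiting: bool     # item 4: 'response waiting' vs not
    recipient_attending: bool  # item 5: 'attending' vs 'not attending'
    context: Context           # item 6: behavioural context
    combined: bool             # item 7: 'combined' vs 'single'


# Hypothetical example: a single 'lip smack' during play.
example = GestureRecord(
    signaller_id="subject_01",
    recipient_id="subject_02",
    gesture_type="lip smack",
    signaller_looking=True,
    response_waiting=True,
    recipient_attending=True,
    context=Context.PLAY,
    combined=False,
)
```

Storing the three intentionality indicators as separate boolean fields mirrors the decision to test each criterion on the data rather than use it to screen gestures beforehand.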

Data analysis

A total of 2820 focal sessions were collected (i.e. 235 h of focal observation), corresponding to 60 sessions (i.e. 5 h) for each of the 47 subjects. Following the data collection, a total of 2256 audio sessions were transcribed (i.e. 80% of the sessions) and 564 videos were coded (i.e. 20% of the sessions) to extract all the information on the gestures produced by the subjects. To assess the reliability of the behavioural sampling, 75% of the videos (i.e. 15% of the total focal sessions) were coded by a second observer blind to the hypotheses of the study. Consistency between observers was excellent (Cohen’s Kappa, k = 0.94 for gesture type, k = 0.90 for the orientation of the signaller, and k = 0.89 for response waiting; see Online Resource 1 for details). According to their intrinsic structure, each gesture was classified (e.g. Pika et al. 2003) as either visual, audible, or tactile. While all gestures had a visual component, a gesture was classified as audible if it generated some sound while being performed, as tactile if it included physical contact with the recipient, or as visual in all other cases. All gestures that were observed at least two times were included in the repertoire and in the analyses. All gestures were treated as independent gestures in the analyses.
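For readers unfamiliar with the agreement statistic reported above, Cohen's kappa compares the observed agreement between two coders with the agreement expected by chance from each coder's label frequencies. The sketch below uses the standard formula; the function name and the toy labels are hypothetical, and it is not the authors' actual reliability script.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two equally long sequences of categorical codes."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from the product of each coder's marginal frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy example with made-up orientation codes for four gestures.
first = ["looking", "looking", "not looking", "looking"]
second = ["looking", "looking", "not looking", "not looking"]
kappa = cohens_kappa(first, second)
```

In this toy case the coders agree on three of four items (0.75) while chance agreement is 0.5, which gives a kappa of 0.5.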

Flexibility

Flexibility refers to the so-called ‘means–ends dissociation’ between gesture form and function. It was assessed by counting the number of different gesture types used within the same context and the number of contexts in which one gesture type was used (Pika et al. 2005b; Liebal et al. 2006; Call and Tomasello 2007; Genty et al. 2009). We analysed whether the proportions of gesture types used in several contexts or in only one context differed statistically from a uniform distribution using a binomial test. For this, the proportions observed in our data set were compared to a theoretical uniform distribution where the proportion of gestures used in several contexts was equal to the proportion of gestures used in one context.
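As an illustration of this comparison, the sketch below computes an exact two-sided binomial p-value against a proportion of 0.5 using only the Python standard library. The counts (56 of 67 gesture types used in several contexts) follow from the percentages reported in the Results; the function name and the two-sided convention (summing all outcomes no more likely than the observed one) are assumptions, since the original test was run in SPSS.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count under the null proportion p.
    """
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance so that ties are counted despite float rounding.
    total = sum(prob for prob in pmf if prob <= observed * (1 + 1e-9))
    return min(1.0, total)


# 56 of the 67 gesture types were used in more than one context.
p_value = binomial_two_sided_p(56, 67, p=0.5)
```

With these counts the p-value falls well below 0.001, in line with the result reported for this test.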

Variability

We ran a series of analyses to investigate whether gestural communication was variable across individuals, ages, and sexes. Particularly, we investigated whether the repertoire size, the rate of production of gestures, and the use of the modalities were variable. First, we calculated the repertoire size of each individual (i.e. the number of different gesture types that the individual produced at least once). As it followed a normal distribution, we compared repertoire sizes across age classes using a one-way ANOVA, and across sex classes using a T test. Then, for each individual, we calculated the rate of gesture production (i.e. the number of gestures produced per hour). This variable was not normally distributed, and the rates across age classes and between sexes were compared using a Kruskal–Wallis and a Mann–Whitney U test, respectively. Finally, we investigated whether the modalities of the gestures used were affected by the age or sex of the individuals, using GLMMs with a Poisson distribution and a log-link function (see Online Resource 2, Table S1). The number of gestures produced was the dependent variable, whereas the type of modality (i.e. audible, tactile, or visual), age class (i.e. infant, juvenile, subadult, or adult), and sex (i.e. male or female) were the categorical test variables.
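The two per-individual measures defined above reduce to simple counts. The sketch below assumes that each subject's gestures are available as a list of gesture-type labels (a hypothetical data layout) and uses the observation time stated in the Procedure (60 sessions of 5 min, i.e. 5 h per subject).

```python
def repertoire_size(gesture_types):
    """Number of different gesture types produced at least once."""
    return len(set(gesture_types))


def gesture_rate_per_hour(n_gestures, sessions=60, minutes_per_session=5):
    """Gestures per hour of focal observation (defaults: 60 x 5 min = 5 h)."""
    hours = sessions * minutes_per_session / 60
    return n_gestures / hours


# Hypothetical subject: three gestures recorded, two distinct types.
size = repertoire_size(["lip smack", "peer", "lip smack"])

# A subject producing 188 gestures over 5 h of observation.
rate = gesture_rate_per_hour(188)
```

The second call gives 37.6 gestures per hour, which is the order of magnitude of the rates reported in the Results.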

Intentionality

To assess whether the gestural communication of olive baboons was intentional, we investigated three indicators of intentionality: the orientation of the signaller while producing the gesture, whether the signaller waited for a response from the recipient, and the attentional state of the recipient. Each indicator of intentionality was investigated separately on the total gestural output. In addition, percentages of gestures on which these criteria were observed are reported for each single gesture type in Table 1.

Table 1 Detailed repertoire of communicative gestures in olive baboons

First, we ran GLMMs with a Poisson distribution and a log-link function to assess whether subjects produced more gestures: (a) when looking at the recipient than when not looking, (b) when waiting for a response of the recipient than when not waiting, and (c) when the recipient was attending than when the recipient was not attending (see Online Resource 2, Table S2). The number of gestures produced was the dependent variable, and the variable orientation (i.e. looking vs. not looking), response waiting (i.e. waiting vs. not waiting), and attention (i.e. attending vs. not attending) were the categorical test variables entered in each corresponding model. Sex (i.e. male or female) and age class (i.e. infant, juvenile, subadult, or adult) were the categorical control variables.

Second, we evaluated the effect of age on intentionality. For this, for each individual, we calculated the percentage of gestures produced when they looked at the recipient, when they waited for a response of the recipient, and when the recipient was attending, and we compared the percentages across age classes using Kruskal–Wallis tests.

Finally, we further examined the effect of the recipient’s visual attention on the gesture modality used by the signaller. For this, we investigated whether baboons actively adjusted their gesture modality to the recipient’s attention, using the method described by Hobaiter and Byrne (2011a). Thus, we calculated the variation in the choice of audible, tactile, and visual gestures, according to whether the recipient was attending or not attending. First, for each individual, we calculated the proportions of all gestures produced that involved audible, tactile, or visual gestures. Then, we divided this individual’s data set in two subsets depending on whether the recipient was attending or not attending, and we recalculated the proportions of each modality for each subset. Finally, we calculated the percentage of variation, which corresponded to the variation in the use of each modality according to the attentional state of the recipient, based on the formula (β/α – 1) × 100, where, for example, α represented the proportion of visual gestures produced in the overall corpus, while β represented the proportion of visual gestures produced when the recipient was attending. These percentages of variation, which could be positive or negative, indicated active adjustment of the modality to the attention of the recipient. We analysed whether the choice of different modalities varied according to the attentional state of the recipient with a Friedman test. As we could not disentangle the link between attention and modality when several gestures of different modalities were produced at the same time, these analyses were run only on single gestures.
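The percentage of variation (β/α – 1) × 100 can be sketched for one individual as follows. The input layout (a list of `(modality, recipient_attending)` pairs for single gestures), the function names, and the example counts are hypothetical; only the formula itself comes from the text.

```python
from collections import Counter

MODALITIES = ("audible", "tactile", "visual")


def proportions(modalities):
    """Proportion of each modality in a non-empty list of modality labels."""
    if not modalities:
        raise ValueError("no gestures in this subset")
    counts = Counter(modalities)
    return {m: counts[m] / len(modalities) for m in MODALITIES}


def variation_by_attention(records, attending):
    """Percentage of variation (beta / alpha - 1) * 100 for each modality.

    alpha: proportion of the modality among all of the individual's gestures.
    beta: proportion of the modality in the subset where the recipient's
          attentional state equals `attending`.
    Modalities the individual never used (alpha = 0) are skipped.
    """
    alpha = proportions([m for m, _ in records])
    beta = proportions([m for m, att in records if att == attending])
    return {m: (beta[m] / alpha[m] - 1) * 100
            for m in MODALITIES if alpha[m] > 0}


# Made-up individual: 10 single gestures, 6 toward an attending recipient.
records = (
    [("visual", True)] * 4 + [("visual", False)]
    + [("tactile", True)] * 2 + [("tactile", False)] * 2
    + [("audible", False)]
)
variation = variation_by_attention(records, attending=True)
```

In this toy data set, visual gestures make up 50% of the overall output but about 67% of the gestures produced toward an attending recipient, so the visual modality shows a variation of about +33%, whereas tactile gestures show about −17% and audible gestures −100%.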

All tests were two-tailed, and the level of significance was set at 0.05. We used parametric statistics when the data followed a normal distribution and used non-parametric statistics otherwise. GLMMs were run in Stata v12.1 (StataCorp 2011), while all the other tests were run in IBM SPSS v21 (IBM Corp 2012). For the GLMMs, we used a statistical model selection approach to determine which models best fitted our data (see details in Online Resource 2). We followed a three-step procedure: (1) we fitted several models with the test and/or control variables as fixed effects; (2) we selected the models that best fitted the observed data on the basis of the lowest AICc (i.e., corrected Akaike information criterion; Burnham and Anderson 2004; Symonds and Moussalli 2011); and (3) we performed tests of significance on the retained models using χ² tests of the log-likelihood ratios (Brown and Prescott 2006). For each GLMM, the ID of the subject as well as the ID of the group were entered as random factors (Pinheiro and Bates 2000; Rabe-Hesketh and Skrondal 2008). Only results of the retained models are presented in the results section below. Further information is available in Online Resource 2. Note that supplementary analyses were conducted to evaluate the potential effect of using two distinct methods of data collection on some of the results presented hereafter. We obtained exactly the same results on (1) the complete data set, (2) the subset of data collected on videos, and (3) the subset of data from live coding, indicating that our results are not affected by the data collection method (see Online Resource 2, Table S4).
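Step (2) of this procedure relies on the small-sample correction of the Akaike information criterion, AICc = AIC + 2k(k + 1)/(n − k − 1), where k is the number of estimated parameters and n the number of observations. The sketch below applies that standard formula to two candidate models; the model names, log-likelihoods, parameter counts, and sample size are all invented for illustration, and what counts as n in a mixed model is itself a modelling choice that the text does not specify.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Corrected Akaike information criterion (standard formula)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc is undefined when n_obs <= n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    correction = (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
    return aic + correction


# Hypothetical candidates: name -> (log-likelihood, number of parameters).
candidates = {
    "modality only": (-900.0, 3),
    "modality x age class": (-840.0, 12),
}
n_obs = 141  # hypothetical number of observations

scores = {name: aicc(ll, k, n_obs) for name, (ll, k) in candidates.items()}
best_model = min(scores, key=scores.get)  # lowest AICc is retained
```

Here the richer model is retained because its gain in log-likelihood outweighs the penalty for its additional parameters.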

Results

Repertoire

In total, 8855 occurrences of gesture were recorded. This allowed us to establish the first repertoire of gestural communication in olive baboons with a list of 67 gestures produced, which included facial expressions, manual gestures, and bodily gestures (Table 1). Among all these gestures, four were audible gestures, 24 were tactile gestures, and 39 were visual gestures. Examining the cumulative number of new gestures recorded for all subjects indicated that our observation time was sufficient to reach the repertoire size of olive baboons, as an asymptote was reached at 117.5 h of observation (i.e. 30 sessions of 5 min, so 2.5 h, for each of the 47 subjects; Fig. 1). It can be noted that we described most of the gestures based on action (Table 1). The gestures ‘presentation’, ‘lip smack’, and ‘give ground’ were observed the most often (i.e. more than 600 occurrences), whereas the gestures ‘headstand’, ‘invite young’, and ‘kiss’ were observed the least often (i.e. up to 5 occurrences). There were two idiosyncratic gestures (i.e. gestures that are exclusively produced by one individual): ‘hand own genitals’ was produced only by a subadult female, and ‘elephant’ was produced only by a juvenile male. The gestures ‘headstand’ and ‘roll’ were used by fewer than 5% of the subjects. While the gestures ‘groom present’, ‘give ground’, ‘grooming intention’, and ‘lip smack’ were used by more than 94% of the subjects, the gesture ‘hand–body touch’ was used by all subjects. Each subject produced around 188 gestures (mean ± SE = 188.4 ± 11.2). Among the 8855 gestures recorded, 6549 (74%) were performed as single gestures and 2306 (26%) were combined with another gesture at the same time.

Fig. 1 Cumulative record of olive baboons’ gestural repertoire. The cumulative number of new gestures recorded (i.e. repertoire size) is plotted against the number of hours of observations of all baboons. Asymptote was reached after 117.5 h of observation
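A cumulative record of this kind can be built by counting, block by block, how many distinct gesture types have been seen so far. The sketch below is a minimal illustration: the input layout (a chronologically ordered list of observation blocks, each holding the gesture types recorded in it) and the operational definition of the plateau (the first block at which the curve reaches its final value) are assumptions, not the authors' stated procedure.

```python
def cumulative_repertoire(blocks):
    """Cumulative number of distinct gesture types after each block."""
    seen = set()
    curve = []
    for gestures in blocks:
        seen.update(gestures)
        curve.append(len(seen))
    return curve


def plateau_index(curve):
    """Index of the first block at which the curve reaches its final value."""
    return curve.index(curve[-1])


# Made-up observation blocks in chronological order.
blocks = [
    ["lip smack", "peer"],
    ["peer", "mock bite"],
    ["lip smack"],
    [],
]
curve = cumulative_repertoire(blocks)
first_plateau_block = plateau_index(curve)
```

In this toy example no new gesture type appears after the second block, so the curve is flat from that point onward.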

Flexibility

Several gestures in one context

Several gesture types were systematically recorded for each of the eight contexts (from 16 to 56 gestures, Fig. 2a) emphasizing the diversity of the gestural lexicon used by baboons to fulfil social functions. On average 31 different gesture types were used in each context (mean ± SE = 31.1 ± 5.5). Most of the gesture types were used in the affiliative (83.6% of the repertoire), play (74.6%), and agonistic (61.2%) contexts. On average a third of the gesture types were used in the submissive (35.8%), sexual (34.3%), and other (31.3%) contexts. A smaller number of different gesture types were used in the context of parental care (25.4%) and grooming (23.9%).

Fig. 2 Flexibility of the repertoire: (a) number of gesture types recorded in each context; (b) number of gesture types as a function of the number of contexts in which they were recorded

Same gesture in several contexts

If gestures were bound to specific contexts, we would observe specific gestures used in single social contexts. However, most of the gesture types were used in more than one context (from 1 to 8 contexts, Fig. 2b), with each gesture type being used on average in about four different contexts (mean ± SE = 3.7 ± 0.2). While 83.6% of the gesture types of the repertoire were used in several contexts, only a small proportion of gestures was actually used in only one social context (16.4%, Table 1), which statistically differed from a uniform distribution (Binomial test compared to the proportion 0.5, p < 0.001, N = 67). Among the 11 gesture types that were recorded in only one context, seven were observed fewer than 15 times (see Table 1). The gestures ‘lip smack’, ‘mock bite’, and ‘peer’ were used in all eight contexts.

Variability

When looking at individual repertoire size, there was high variability across individuals (from 15 to 45 gestures), with on average 31 gesture types per subject (mean ± SE = 31.1 ± 1; Fig. 3a). None of the 47 subjects showed the entirety of the 67 gesture types observed.

Fig. 3 (a) Distribution of individual repertoire size. (b) Mean ± SE individual repertoire size across age classes. (c) Individual repertoire size as a function of age (in years). (d) Mean ± SE percentage of gestures produced for each modality and for each age class

Across age classes

The size of individual repertoires differed significantly across age classes (One-way ANOVA, F(3, 43) = 8.5, p < 0.001; Fig. 3b). Bonferroni post hoc analyses indicated that juveniles had a significantly larger repertoire than infants (p = 0.006), subadults (p = 0.005), and adults (p < 0.001). There was no statistically significant difference between the other age classes (p > 0.05 in all other cases). Furthermore, the size of the repertoire significantly decreased when age (in years) increased (Spearman correlation, r(45) = −0.35, p = 0.016, Fig. 3c). The gesture ‘invite young’ was produced only by adults, while the gesture ‘somersault’ was produced only by juveniles and the gesture ‘headstand’ was produced only by two infants.

The rate of gestures produced by each individual differed across age classes (Kruskal–Wallis test, H(3) = 13.2, p = 0.004, N = 47). Dunn–Bonferroni pairwise comparisons indicated that juveniles (mean ± SE = 55.4 ± 4.8 gestures/h) produced more gestures than adults (mean ± SE = 33.5 ± 2.2 gestures/h; p = 0.004) and subadults (mean ± SE = 34.5 ± 7.3 gestures/h; p = 0.018). There was no significant difference between the other age classes (mean ± SE = 36 ± 8.3 gestures/h for infants; p > 0.05 for all the other comparisons). The rate of production of gestures significantly decreased when age (in years) increased (Spearman correlation, r(45) = −0.38, p = 0.008).

The best fitting model revealed an interaction effect between the modalities of the gestures produced and the age class of the individuals (Wald test: χ² = 374.28, p < 0.0001, N = 47; Best fitting model: AICc = 1697.68; Fig. 3d; Online Resource 2, Table S1). Indeed, the proportion of tactile gestures produced decreased significantly with the increase of age (in years; Spearman correlation, r(45) = −0.35, p = 0.016), whereas the proportion of audible gestures increased significantly (Spearman correlation, r(45) = 0.55, p < 0.001). There was no significant correlation between the age and the proportion of visual gestures (Spearman correlation, r(45) = 0.08, p = 0.58). However, when adults were removed from the sample, the proportion of visual gestures produced increased significantly with age (Spearman correlation, r(18) = 0.66, p = 0.001).

Between sexes

No significant difference was found between the repertoire size of males (mean ± SE = 33.7 ± 2.2 gestures) and females (mean ± SE = 30.1 ± 1.1 gestures; T test, t(45) = 1.6, p = 0.11). The gestures ‘pursed lips’, ‘spread leg touch’, ‘pelvic thrusts’, and ‘invite young’ were produced only by females, while ‘mating intention’ was, following our definition of this specific behaviour, only produced by males. There was no significant difference between the rate of gestures produced by males (mean ± SE = 43.5 ± 5.8 gestures/h) and females (mean ± SE = 36 ± 2.3 gestures/h; Mann–Whitney U test, U = 179, z = −1, p = 0.33, N = 47).

Intentionality

Orientation

Subjects produced significantly more gestures when looking at the recipient than when not looking (Table 2a, Fig. 4a; Online Resource 2, Table S2). On average, the subjects produced 90.5% of the gestures (± 0.9) when looking at the recipient (Fig. 4a). The percentage of gestures produced when the subjects were looking at the recipient differed significantly across age classes (Kruskal–Wallis test, H(3) = 12, p = 0.008, N = 47). Dunn–Bonferroni pairwise comparisons indicated that infants (mean ± SE = 75.8% ± 1) produced fewer gestures when looking at the recipient than adults (mean ± SE = 91.6% ± 0.7; p = 0.014) and juveniles (mean ± SE = 93.2% ± 1.1; p = 0.004). There was no significant difference between the other age classes (mean ± SE = 90.9% ± 2.1 for subadults; p = 0.06 for infants vs. subadults and p > 0.05 for all the other comparisons).

Table 2 Coefficients and significance of the variables entered in the GLMMs with a Poisson distribution to analyse whether subjects (N = 47) produced (a) more gestures when looking at the recipient than when not, (b) more gestures when waiting for a response from the recipient than when not, and (c) more gestures when the recipient was attending than not attending
Fig. 4 a Mean ± SE percentage of gestures produced when the subject was looking or not looking at the recipient, when the subject was waiting or not waiting for a response from the recipient, and when the recipient was attending or not attending. b Mean ± SE percentage of variation in the use of each modality according to whether the recipient was attending or not attending. The deviation above and below the zero line indicates the direction of the signaller’s adjustment to the recipient’s attention

Response waiting

Subjects produced significantly more gestures followed by response waiting than gestures that were not (Table 2b, Fig. 4a; Online Resource 2, Table S2). On average, the subjects produced 87% of the gestures (± 1) when waiting for a response (Fig. 4a). The percentage of gestures followed by response waiting differed significantly across age classes (Kruskal–Wallis test, H₃ = 13.4, p = 0.004, N = 47). Dunn–Bonferroni pairwise comparisons indicated that infants (mean ± SE = 72.6% ± 1.4) produced significantly fewer gestures followed by response waiting than adults (mean ± SE = 87.4% ± 1.3; p = 0.028) and juveniles (mean ± SE = 91.4% ± 1.2; p = 0.002). There was no significant difference between the other age classes (mean ± SE = 87.8% ± 1.4 for subadults; p > 0.05 for all the other comparisons).

Attention

Subjects produced significantly more gestures when the recipient was attending than not attending (Table 2c, Fig. 4a; Online Resource 2, Table S2). On average, the subjects produced 81.2% of the gestures (± 1.2) when the recipient was attending (Fig. 4a). There was no significant difference across age classes in the percentage of gestures produced when the recipient was attending (Kruskal–Wallis test, H₃ = 5.3, p = 0.15, N = 47).

The choice of different modalities varied significantly according to the attentional state of the recipient (Friedman test, χ²₅ = 137.5, p < 0.001, N = 47, Fig. 4b). Specifically, the use of audible and visual gestures decreased when the recipient was not attending (Wilcoxon signed-rank test, audible: z = −4.3, p < 0.001, visual: z = −5.6, p < 0.001), whereas the use of tactile gestures increased (Wilcoxon signed-rank test, z = −5.8, p < 0.001).

Discussion

This study is the first comprehensive and quantitative description of the types and properties of the gestural communication of an Old World monkey species. Over 1 year of observation, 67 gestures were consistently recorded; together they compose the gestural repertoire of olive baboons. This repertoire may serve as a tool for researchers, as it can notably be used to select which types of gesture are of interest for further studies, based on criteria of variability (i.e. across individuals, sexes, and ages), flexibility across contexts, and intentionality (i.e. signaller’s orientation, response waiting, and recipient’s attention). Olive baboons used a variety of audible, tactile, and visual gestures, which were produced by movements of the whole body, parts of the body, and parts of the face. This repertoire included a majority of visual gestures (58% of the repertoire), which is consistent with the hypothesis that the type of gestures used by a species may be related to its degree of terrestriality (Marler 1965; Liebal and Pika 2005). Indeed, the nature of the communication of a species depends notably on its ecology, social structure, and cognitive skills (e.g. Maestripieri 2005; Pika et al. 2005a; Parr et al. 2015). Specifically, it has been suggested that more terrestrial species such as olive baboons (e.g. Patel and Wunderlich 2010), which do not live under dense vegetation, unlike more arboreal species (e.g. siamangs; Liebal et al. 2004b), could rely on the visual modality of communication, because their environment does not constrain the perception of this type of communication (Marler 1965; Liebal and Pika 2005; Pika et al. 2005a; Parr et al. 2015). In this regard, baboons have evolved in an environment comparable to the paleo-environment of early humans (Cerling et al. 2011), and they also form multi-tiered societies that closely resemble human societies (Smuts et al. 2008). Hence, they offer a valuable model to study the evolutionary pathways from intentional communication to language.

In this regard, our results provide some of the first evidence of intentional gesture use towards conspecifics by monkeys. When producing a gesture, olive baboons looked at the recipient, waited for a response, and took into account the attentional state of the recipient. Moreover, we also found evidence for means–ends dissociation, as baboons flexibly selected among different gestures to achieve one function, while the same gesture could be used for different ends. Our results also indicate variations in the use of gestures by baboons which are comparable to the variability reported in apes. Indeed, individuals did not all produce the same set of gestures. In addition, gesture types, rates, and modalities changed with the individual’s age, but not with sex.

Repertoire size varied considerably across individuals: individual baboons used around 46% of all gesture types, and none of the 47 subjects used the entire repertoire, which is consistent with what has been found in apes (Tomasello et al. 1994, 1997; Liebal and Pika 2005; Hobaiter and Byrne 2011a; Roberts et al. 2014). Moreover, juveniles showed the largest repertoire and the highest rate of gestures produced, and these values decreased with age. In apes, the active repertoire of juveniles is also larger than those of adults and infants (Tomasello et al. 1989; Liebal et al. 2004b, 2006; Genty et al. 2009; Hobaiter and Byrne 2011b). It has been suggested that young individuals first explore the variety of gestures available, using a large number of gestures in a variety of interactions, before retaining the ones that have proved to be the most effective in their social interactions and group (Hobaiter and Byrne 2011b; Byrne et al. 2017). In chimpanzees, the likelihood of choosing an effective gesture increases with age (Hobaiter and Byrne 2011b). In baboons, the use of tactile gestures decreased with age, in contrast to audible and visual gestures. This corroborates the observation made in ape species where gestures that are potentially effective over distance (i.e. audible and visual gestures) increase with age, while gestures that involve contact with the recipient (i.e. tactile gestures) decrease (Schneider et al. 2012; Fröhlich et al. 2016; Liebal et al. 2019). One explanation may be that young individuals use more tactile gestures because of their close proximity to their mother, and reliance on this modality may decrease as independence increases (Liebal et al. 2019). No difference in repertoire size or production of gestures was found between males and females. In non-human primates, differences between sexes are scarce and often limited to sexual contexts (e.g. Liebal et al. 2004b; Hesler and Fischer 2007; Scott 2013).

It is difficult to directly compare the repertoire size between species because of the variation in sampling methods across studies. Indeed, there are noticeable discrepancies (1) in the definition of gesture (e.g. some studies only considered as ‘gestures’ the movements of the hand(s); Hobaiter and Byrne 2017; Liebal et al. 2019), (2) in the level of detail used to define and categorise each gesture type (i.e. granularity of description; Cartmill and Byrne 2011; Byrne et al. 2017; Hobaiter and Byrne 2017), as well as (3) in how gestures are described (e.g. action-based or meaning-based; Hobaiter and Byrne 2017). However, the repertoire size of olive baboons is large and quite similar to those reported in apes such as bonobos (e.g. 68 gestures, Graham et al. 2017), chimpanzees (e.g. 66 gestures, Hobaiter and Byrne 2011a), and orangutans (e.g. 64 gestures, Cartmill and Byrne 2010). Two idiosyncratic gestures were found in this study, which may indicate that olive baboons are able to invent new gestures. However, this result should be taken with caution, because recent studies have shown that increasing sampling effort or comparing gesture categorisation choices could dismiss the hypothesis of idiosyncrasy (Genty et al. 2009; Hobaiter and Byrne 2011a; Byrne et al. 2017; Graham et al. 2017). Qualitative differences with some great ape gestural repertoires include the absence of gesturing with detachable objects in baboons. This latter difference requires further investigation to specify whether the absence of this behaviour in baboons is species-specific or related to their captive environment, which offered very limited opportunities to use detachable objects. While chimpanzees, orangutans, and gorillas often incorporate objects when producing gestures (e.g. throwing an object or hitting the recipient with an object), this is less the case in bonobos, siamangs, and Barbary macaques (Liebal et al. 2004b, 2006; Hesler and Fischer 2007; Genty et al. 2009; Liebal and Call 2012; Byrne et al. 2017). In apes, repertoires have been found to overlap across species, despite differences in body shape and locomotion (Hobaiter and Byrne 2011b; Byrne and Cochet 2017; Byrne et al. 2017; Graham et al. 2017). It has been suggested that because gestures overlapped between apes, and because these gestures were only a part of all possible gestures that an ape body could perform (Hobaiter and Byrne 2017), these gestures may have a common descent and the gestural repertoire may be inherited (Byrne et al. 2017). It is worth noting that some gestures described in olive baboons (e.g. embrace, grab, presentation, hand–body touch, stretch arm, bared-teeth, and lip smack) seem to overlap not only with gestures described in other monkeys such as macaques (Maestripieri 1996, 1997, 2005; Hesler and Fischer 2007) and other baboon species (Rowell 1967; Kummer 1968), but also with gestures described in ape species (Liebal et al. 2004b, 2006; Parr et al. 2015; Byrne et al. 2017). If some gestures are actually shared by apes, macaques, and baboons, this may imply that their phylogenetic origin is relatively old, going back to the ancestor of catarrhine primates. Overall, an effort to increase consistency between studies is still necessary to provide a solid basis for comparing gestural repertoires across species (Byrne et al. 2017; Hobaiter and Byrne 2017; Graham et al. 2018; Liebal et al. 2019; Pika and Fröhlich 2019).

Flexibility assessment showed that approximately 31 gestures were used within the same context, and each gesture was on average used in 4 different contexts. Such a level of flexibility is similar to that found in the gestural communication of apes (e.g. Tomasello et al. 1994; Pika et al. 2005b; Liebal et al. 2006; Genty et al. 2009), but also in macaques (Maestripieri 1996, 1997; Hesler and Fischer 2007). This emphasises the diversity of the gestural lexicon used by baboons to fulfil social functions. Note, however, that the diversity of gestures used across contexts may depend on how the contexts had previously been defined and classified in the study (e.g. the broader the definition of a context, the more behaviours can potentially be included in it). Byrne and colleagues investigated the flexibility of the gestural lexicon of apes using a different approach. Instead of looking at functional contexts, they used the meaning of gestures to assess flexible use (Cartmill and Byrne 2010; Hobaiter and Byrne 2014, 2017; Byrne et al. 2017). Using this approach, it has been found that in chimpanzees and bonobos, flexibility resides mostly in the use of several gestures for a specific meaning, and one gesture can have several meanings which are disambiguated by the social context, as in human pragmatics (Roberts et al. 2012; Hobaiter and Byrne 2014; Byrne et al. 2017). Further investigation is required to explore the flexibility of meaning in baboon gestures, by relying on the behavioural response of the recipient and whether the signaller is apparently satisfied by this response or not (e.g. Apparently Satisfactory Outcome, ASO, Hobaiter and Byrne 2017).

Importantly, our study provides comprehensive evidence of the ability of a monkey species to communicate intentionally with conspecifics, outside experimental designs involving human–monkey communication. Specifically, our study showed that olive baboons looked at their communication partner in 90.5% of cases, waited for a response, and actively adjusted the modality of their gestures to the attentional state of the recipient. Indeed, they increased the production of tactile gestures while decreasing the production of audible and visual gestures when the recipient was not visually attending. Tactile gestures involve physical contact with the recipient and can thus be effective without the recipient attending. Some tactile gestures may also serve as attention getters, to trigger the attention of an inattentive recipient (e.g. Tomasello et al. 1989, 1994; Liebal and Call 2012). For example, young chimpanzees poke their recipient to initiate play when the latter is not attending (Tomasello et al. 1989). Note that here, baboons favoured tactile gestures over audible gestures when the recipient was not visually attending. It can be noted that their repertoire includes only four audible gestures and that these gestures also have a strong visual component. Thus, it is possible that olive baboons use this type of modality more as a visual signal, with the audible component remaining secondary. Overall, these findings are consistent with the evidence of signallers’ sensitivity to the recipient’s attention in apes (Genty et al. 2009; Hobaiter and Byrne 2011a; Roberts et al. 2014; Waller et al. 2015), as well as in monkeys’ gesturing to humans in experimental settings (Hattori et al. 2010; Maille et al. 2012; Meunier et al. 2013; Bourjade et al. 2014; Canteloup et al. 2014, 2015). Therefore, the gestural communication of olive baboons fulfils the main criteria of intentional communication, which means that olive baboons gesture in a goal-directed way to influence specific target audiences. To go further in analysing the intentionality of gesture production in olive baboons, future studies may also look at whether they persist in using the same gesture, or whether they elaborate using another gesture, when the response which they received is unsatisfactory (Liebal et al. 2004a; Leavens et al. 2005; Hobaiter and Byrne 2011b; Roberts et al. 2013).

In addition, in spite of the very small sample size for infants, our results in infant baboons deserve emphasis for at least one reason: they suggest that intentional communication might not be present from birth. Infants were less likely to look at the recipient when producing a gesture, and they were also less likely to wait for a response. Thus, the intentional use of gestures may develop over the lifetime. Similar patterns are observed in infant chimpanzees, where markers of intentional communication increase with age (Bard et al. 2014; Fröhlich et al. 2018), as well as in human infants. Indeed, children within their first year go through a pre-intentional stage where their communication is not directed to communicative partners but seems to reflect their internal states (Bates et al. 1979; Harding 1984). Through repeated interactions with their caregiver, who answers appropriately to these behaviours, children develop intentional communication in which they direct their signals appropriately to their caregiver to receive a particular response at around 9 months of age (Bates et al. 1979; Harding 1984; Carpenter et al. 1998). Similarly, through repeated interactions with their mother and other group mates, infant baboons may learn to direct their gestures to an appropriate audience in a goal-directed way. Longitudinal studies looking at the development of gestures and intentionality from birth may help to shed light on how intentional gestural communication develops in non-human primates (e.g. see Liebal et al. 2019 for a review).

Our investigation of the gestural communication system of olive baboons provides some evidence of an evolutionary continuity with some key properties of human language in the catarrhine lineage. Further studies are needed to investigate the gestural repertoire and properties of other catarrhine primates, but also of other clades such as Platyrrhini, to track down the precursors to human language. To conclude, this study offers a comprehensive description of the gestural communicative system of olive baboons with empirical evidence of flexibility, variability, and intentionality. These core properties of human language, which are found in all natural languages, may have been present in the common ancestor of baboons and humans, around 30–40 million years ago.