Introduction

How animals, particularly primates, perceive, integrate and classify information from multiple sensory modalities has been a topic of research for nearly 50 years (Davenport 1977; Friedes 1974; Ward et al. 1970). Primarily, this research has focused on what has been termed ‘inter-modal equivalence’ or ‘cross-modal perception,’ and at least one of the early theories proposed that cross-modal perception was unique to humans and was afforded by the emergence of symbolic thought (Geschwind 1965). Specifically, because language is a symbolic form of communication, symbols provided a means of taking arbitrary sensory information from different modalities and integrating it into a unified percept or mental representation. Working on the assumption that language is unique to humans, it was argued that cross-modal perception was therefore unique to humans. We now know that the evolutionary assumptions of this model are incorrect because cross-modal perception has been shown in a variety of primates. Chimpanzees and other great apes have been reported to succeed in cross-modal tasks including haptic-to-visual and visual-to-haptic matching, auditory-to-visual and olfactory-to-visual matching (Davenport and Rogers 1970, 1971; Davenport et al. 1973; Ettlinger and Jarvis 1976; Savage-Rumbaugh et al. 1988). Moreover, degrading the visual stimuli had only minimal effects on haptic-to-visual matching (Davenport et al. 1973). In monkeys, evidence of cross-modal transfer of learning or cross-modal matching has also been shown between the visual-to-haptic and haptic-to-visual modalities (Cowey and Weiskrantz 1975; Ettlinger 1960). Like great apes, monkeys can also successfully solve haptic-to-visual cross-modal matching tasks when the visual stimuli have been degraded (Tolan et al. 1981). Collectively, these studies suggest that great apes and monkeys can recognize objects in one modality when sensory input is restricted to a different modality.

In contrast to the research on cross-modal perception, much less effort has focused on identifying and characterizing multimodal production of signals used in a communicative context. In primates, clearly, facial expressions, body postures, and vocalizations all have important functional roles in both intraspecific and interspecies communication, and often these signals are tightly integrated in terms of their timing and sequential output. Notwithstanding, different modalities in the production of communicative signals in primates are often examined independently of each other rather than as a unified construct that may or may not have multiple functions (Partan 2002; Forrester 2008). For example, oro-facial movements and differing body postures accompany many vocalizations but it remains relatively unknown what the interrelationship might be between these different modalities of communication and their potential effect on the behavior of the recipient of these communicative signals. The purpose of this paper is to describe how chimpanzees integrate and use different modalities of communication when attempting to engage in interspecies communication.

Recently, a number of studies have examined manual and vocal gestural communication in great apes (mostly chimpanzees) during interspecies communication. In captivity, chimpanzees use manual gestures and other visual communicative signals to gesture for foods or tools that are otherwise unattainable to them. Studies show that manual gesturing and other visual communicative signals are inhibited in the absence of a communicative agent (i.e., human or conspecific), indicating that the apes understand the communicative function of their behavior (Call and Tomasello 1994; Hostetter et al. 2001, 2007; Leavens et al. 1996, 2004a; Russell et al. 2005). Additional studies have shown that chimpanzees will alter the primary modality of their communicative behavior depending on the orientation (Hostetter et al. 2001) or visual regard (Leavens et al. 2004b) of the human experimenter, and in some cases, conspecifics (Tomasello et al. 1994). Thus, chimpanzees attempt to capture or draw the attention of a human (or conspecific) to their communicative behavior and alter their modality and type of signaling depending on the attentional status of the human. These findings have been anticipated or corroborated in chimpanzees or other great ape species by independent researchers (e.g., Call and Tomasello 1994; Cartmill and Byrne 2007; Krause and Fouts 1997).

In this study, we sought to further investigate multimodal communication in chimpanzees, specifically as it relates to persistence and elaboration in intra-modal signaling. Persistence and elaboration in communicative signaling have been described in human children as indicators of intentional communicative behavior in the context of failed communication (Bates et al. 1975; Golinkoff 1986, 1993). Much like chimpanzees, young preverbal children often gesture and vocalize to capture the attention of a human caregiver. When these communicative behaviors are not responded to by the human caregiver (i.e., fail) or are responded to “incorrectly”, children will often attempt to communicate again, sometimes using a different or more elaborate communicative signal. Previous studies in our laboratory have shown that chimpanzees will alter their modality of communicative behavior in relation to the attentional status of a human (Hostetter et al. 2001; Leavens et al. 2004b) and will communicate in relatively more modalities as a function of the quality of food delivered (Leavens et al. 2005a) and the quantity of visible food available (Leavens and Hopkins 2005). However, in these previous studies, only the first or first two responses were recorded, so that although we know chimpanzees choose communicative modalities tactically in accordance with the attentional status of a human observer, we have few data on how these choices perseverate in modality-specific ways. We wanted to know whether chimpanzees would continue to elaborate in a tactically efficient manner throughout a minute-long episode in which their communication was having no apparent effect. In this study, we evaluated whether chimpanzees will elaborate on their communicative signals and whether the elaboration of signaling is modality-specific in accordance with the attentional state of a human. If chimpanzees elaborate in their signaling in logical and modality-specific ways, then the number of different modality-appropriate signals should vary in accordance with the attentional state of the human.

Methods

Subjects

There were 110 chimpanzee subjects in this study, including 48 males and 62 females. All subjects were housed at the Yerkes National Primate Research Center (YNPRC) of Emory University. Subjects ranged in age from 6 to 47 years (mean = 21.08 years, SD = 10.08). With respect to rearing histories of the subjects, there were 41 mother-reared, 59 nursery-reared and 10 wild-caught chimpanzees. Mother-reared chimpanzees were those reared by their biological, conspecific mother for more than 30 days of life. Nursery-reared subjects were those that were brought to the YNPRC nursery before 31 days of life. Wild-caught apes had been captured in Africa and brought to the U.S. before 1973. The YNPRC is fully accredited by the American Association for Laboratory Animal Care and all guidelines for the ethical treatment of animals were strictly adhered to during this study.

Procedure

All subjects were tested in their home cages and were not isolated from their social groups for the purposes of testing. Testing occurred between 10:00 a.m. and 5:00 p.m. A schematic of the basic paradigm is shown in Fig. 1. In this task, a human offered a banana to one of two chimpanzees, facing toward the recipient of the offer, and facing away from the other animal. When facing away, the degree of angulation of the human relative to focal chimpanzees was approximately 45°. During testing, each trial lasted 60 s, divided into two 30-s sampling intervals. At the onset of each trial, an experimenter offered a banana to one of the focal chimpanzee subjects for 30 s. At the end of 30 s, the experimenter turned and offered the banana to the second focal chimpanzee subject for 30 s. At the end of this second 30-s interval, the banana was broken in half, and one half of the banana was delivered to each of the two chimpanzees. It is important to emphasize that in both conditions, the banana was visible to the chimpanzees and what changed was the direct visual contact between the human offering the food and the focal chimpanzee, and the orientation of the human’s body toward or away from the first and second chimpanzees, respectively. During each 30-s interval, the communicative behavior of each chimpanzee was recorded by two independent experimenters. The first experimenter recorded behaviors for the Toward condition while the second experimenter recorded behaviors for the Away condition. The order of testing of each condition was counterbalanced across subjects. The communicative behaviors of interest were calls, manual gestures, lip pouts, presentations of genitalia, attention-getting (including clapping, cage-banging, spitting, or throwing), barter (or offer food), display, or other. These same behaviors have been previously used to describe the communicative repertoire of chimpanzees in other studies (Hostetter et al. 2001, 2007; Leavens et al. 2004a, b; Liebal et al. 2004).

Fig. 1

Schematic of the experimental testing procedure. The first experimenter offered a banana to one chimpanzee for 30 s (Toward condition), then turned to face a second chimpanzee and offered the same banana to this second individual for 30 s (Away condition). One experimenter recorded the behavior during the Toward condition and a second experimenter recorded behavior during the Away condition. The order of testing of each condition was counterbalanced across subjects

With respect to the call category, at least two novel sounds produced by chimpanzees during social interactions with humans in the presence of out of reach food items have been described and include the “raspberry” and “extended food grunt” (Hopkins et al. 2007; Leavens et al. 2004a, b). The “raspberry” sound is a bi-labial fricative in which the chimpanzees purse their lips and expel air out through the closed lips of the mouth. The “extended food grunt” is a low, loud, guttural sound that the chimpanzees make with their mouths open. Although grunts have been described in wild chimpanzees, these sounds are structurally different and used in different social contexts. These two sounds are not frequently reported as part of the species-typical repertoire of calls described for chimpanzees (Goodall 1986) and it has been suggested that they are under volitional control (Hopkins et al. 2007; Leavens et al. 2004b). During testing, bouts of these behaviors were recorded rather than specific elements of the sounds. No specific record of the types of calls made in the different test conditions was made in this study.

Data analysis

Relative frequency of communicative behaviors

For each subject and condition, we calculated the total number of communicative responses. We then summed the total number of calls, gestures, and attention-getting behaviors for each condition. Attention-getting behaviors for the initial analysis included spitting, throwing, clapping, and cage-banging. We were specifically interested in the relative rates of calls, gestures, and attention-getting behaviors, rather than the absolute number of responses; therefore, we divided the number of calls, gestures, and attention-getting behaviors in each condition by the total number of communicative behaviors in each condition. The relative proportions of calls, gestures, and attention-getting behaviors were then compared between the Toward and the Away conditions.
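The proportion calculation described above can be illustrated with a short sketch. This is not the authors' analysis code: the function name, the category labels, and the example records are hypothetical, and the sketch assumes that each recorded response carries a single behavior label.

```python
from collections import Counter

# Hypothetical labels for the three categories compared in this analysis.
CATEGORIES = ("call", "gesture", "attention_getting")


def relative_proportions(responses):
    """Return each category's share of all communicative responses.

    `responses` is a list of behavior labels for one subject in one
    condition. Labels outside CATEGORIES (e.g. "lip_pout") still count
    toward the denominator, which is the total number of responses.
    """
    total = len(responses)
    if total == 0:
        # No responses in this condition: proportions are undefined.
        return None
    counts = Counter(responses)
    return {cat: counts[cat] / total for cat in CATEGORIES}


# Made-up records for a single subject, for illustration only.
toward = ["gesture", "gesture", "call", "lip_pout"]
away = ["call", "attention_getting", "call", "gesture", "attention_getting"]

print(relative_proportions(toward))
print(relative_proportions(away))
```

Dividing by the condition total, rather than comparing raw counts, keeps the comparison between conditions from being driven by differences in overall response rate.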

Elaboration in communicative behaviors

For the analyses pertaining to elaboration in communication, we calculated the total number of different behaviors that were made up of visual signals (including gesture + lip pout + present of genitalia + barter) and non-visual or “attention-getting” signals (call + spitting + throwing + clapping + cage-banging). The total numbers of different visual and attention-getting behaviors were then divided by the total number of different behaviors produced in each condition to produce a relative proportion of responses within each response category. Within each condition, we determined the frequency of a visual signal followed by another, yet different visual signal (visual–visual). Similarly, within each condition, we calculated the frequency of attention-getting signals followed by different attention-getting signals. We note that all of the behaviors categorized here as “attention-getting signals” have visual properties, but they also have additional auditory and/or tactile properties that make them effective when an interactant is facing away from the signaler.
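A minimal sketch of these elaboration measures follows. It is an illustration under stated assumptions, not the authors' procedure: the labels and function name are hypothetical, and “followed by” is assumed to mean immediately consecutive responses in the recorded sequence, which the text does not specify.

```python
# Signal classes as listed in the text; the label spellings are hypothetical.
VISUAL = {"gesture", "lip_pout", "present_genitalia", "barter"}
ATTENTION = {"call", "spitting", "throwing", "clapping", "cage_banging"}


def elaboration_summary(sequence):
    """Summarize within-modality elaboration for one subject in one condition.

    `sequence` is the ordered list of behavior labels that were recorded.
    """
    distinct = set(sequence)
    if not distinct:
        return None

    # Share of the different behaviors that fall in each signal class.
    prop_visual = len(distinct & VISUAL) / len(distinct)
    prop_attention = len(distinct & ATTENTION) / len(distinct)

    # Count transitions to a *different* signal of the same class.
    # Assumption: only immediately consecutive pairs are considered.
    visual_visual = 0
    attention_attention = 0
    for first, second in zip(sequence, sequence[1:]):
        if first == second:
            continue  # a repeated signal is persistence, not elaboration
        if first in VISUAL and second in VISUAL:
            visual_visual += 1
        elif first in ATTENTION and second in ATTENTION:
            attention_attention += 1

    return {
        "prop_visual": prop_visual,
        "prop_attention": prop_attention,
        "visual_visual": visual_visual,
        "attention_attention": attention_attention,
    }


# Made-up sequence for illustration only.
example = ["gesture", "lip_pout", "gesture", "call", "clapping", "clapping"]
print(elaboration_summary(example))
```

In this example the repeated clap is not counted as elaboration, because the second signal does not differ from the first.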

Results

Descriptive statistics

Table 1 shows the mean number of responses for each behavioral category, as well as the mean number of total responses and the mean number of different responses, in the Toward and Away conditions. The number of individuals that made at least one response in each behavioral category is also shown in Table 1. Paired-samples t tests indicated that the chimpanzees made significantly more communicative responses in the Toward than in the Away condition, t(109) = 4.61, P < 0.001. The chimpanzees also produced significantly more different communicative behaviors in the Toward than in the Away condition, t(109) = 3.83, P < 0.001. We compared the overall frequencies of communicative behaviors, as well as the number of different communicative behaviors, as a function of rearing history and sex in two separate analyses of variance. In each analysis, the frequencies in the Away and Toward conditions were the dependent variables while sex and rearing history served as the between-group factors. For total frequencies of behaviors, no significant main effects for sex, F(1, 104) = 2.34, P = 0.12, or rearing history, F(2, 104) = 0.88, P = 0.48, were found, nor was the interaction between the two variables significant, F(2, 104) = 1.28, P = 0.28. Similarly, for the total number of different behaviors, no significant main effects for sex, F(1, 104) = 1.08, P = 0.30, or rearing history, F(2, 104) = 0.63, P = 0.53, were found, nor was the interaction between the two variables significant, F(2, 104) = 1.34, P = 0.24.
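For readers unfamiliar with the paired-samples t test, the sketch below shows how the statistic and its degrees of freedom are obtained from per-subject scores in the two conditions. The function name and the example values are hypothetical and are not the study data. With one pair of scores per subject, the degrees of freedom equal the number of subjects minus one, which for 110 subjects gives the 109 degrees of freedom reported above.

```python
import math
import statistics


def paired_t(toward, away):
    """Return (t statistic, degrees of freedom) for paired scores.

    Each position in `toward` and `away` holds one subject's score in
    the Toward and Away condition, respectively.
    """
    if len(toward) != len(away):
        raise ValueError("each subject needs a score in both conditions")
    diffs = [t - a for t, a in zip(toward, away)]
    n = len(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    if sd_diff == 0:
        raise ValueError("t is undefined when all differences are equal")
    t_stat = statistics.mean(diffs) / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


# Made-up response counts for four subjects, for illustration only.
t_stat, df = paired_t([5, 3, 6, 4], [2, 3, 4, 1])
print(round(t_stat, 2), df)
```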

Table 1 Mean frequency and number of subjects exhibiting each behavior in each condition

Orientation effects on relative frequency of communicative behavior

The functional use of gestures, attention-getting behaviors, and calls in communicating with humans was examined in the initial analysis. Eighty-nine subjects exhibited a gesture, call, or attention-getting behavior in each condition. For this analysis, a mixed-model ANOVA was performed with the condition (Toward, Away) and behavioral category (gesture, call, attention-getting) proportions serving as repeated measures. Sex (male, female) and rearing history (mother-reared, nursery-reared, wild-caught) served as between-group factors. Consistent with the predictions, there was a significant interaction between test condition and behavioral category, F(2, 176) = 32.84, P < 0.001 (Fig. 2a). Post hoc tests indicated that when the human was facing away from the chimpanzees, compared to facing toward them, the chimpanzees produced relatively more calls, t(88) = 3.15, P = 0.03, and non-vocal attention-getting behaviors, t(88) = 4.06, P < 0.001. In contrast, the chimpanzees produced relatively more gestures when the human was facing toward them than when the human was facing away from them, t(88) = 7.10, P < 0.001. No other significant main effects or interactions were found. These results indicate that the chimpanzees were functionally using the calls in the same way as non-vocal attention-getting behaviors such as clapping, cage-banging, and spitting, that is, to capture the attention of an otherwise inattentive human (Hopkins et al. 2007; Leavens et al. 2004b).

Fig. 2

a The mean proportion of gestures, calls and attention-getting behaviors was calculated by dividing the total frequency of occurrence for each behavior by the total number of communicative responses produced in the toward and away conditions. b The mean proportion of elaborative signaling in the visual and attention-getting modalities

Orientation effects on elaboration in communicative behavior

The extent to which the chimpanzees elaborated on their communicative behavior was tested in the next set of analyses. For this analysis, a mixed-model ANOVA was performed with the condition (Toward, Away) and behavioral elaboration (visual–visual, attention–attention) proportions serving as repeated measures. Sex (male, female) and rearing history (mother-reared, nursery-reared, wild-caught) served as between-group factors.

A significant two-way interaction was found between experimental condition and behavioral elaboration, F(1, 99) = 15.84, P < 0.001. Subsequent post hoc tests indicated that the relative proportion of different visual–visual signals was significantly higher when the experimenter was facing toward compared to away from the chimpanzee, t(104) = 8.16, P < 0.001. In contrast, the relative proportion of different attention–attention behaviors was significantly higher when the experimenter was facing away from rather than toward the chimpanzee, t(104) = 4.67, P = 0.02. No other significant main effects or interactions were found.

Discussion

Two results emerged from this study. First, chimpanzees produced more attention-getting behaviors when a human was oriented away from them than when a human was oriented toward them. In contrast, when a human was oriented toward them, they produced more visual gestures than attention-getting behaviors. Second, not only did the chimpanzees modify the modality of their communication, but they also elaborated their communication within a modality in accordance with the orientation of a human observer. This kind of elaboration differs from others recently reported for chimpanzees (Leavens et al. 2005a) and orangutans (Pongo pygmaeus; Cartmill and Byrne 2007). In Leavens et al. (2005a) and Cartmill and Byrne (2007), elaboration was defined as the number of different response classes displayed across modalities, whereas in the present study, elaboration occurred within each of two modalities, auditory and visual.

The chimpanzees used significantly more different visual signals when a human was oriented toward them but used significantly more different attention-getting signals when the human was oriented away. These findings indicate that the chimpanzees’ communicative behaviors were not initially tactical, followed by a series of random behaviors, but followed a logical and efficient pattern of modality-specific permutations. In other words, in the face of two consecutive and relatively protracted 30-s intervals during which their communicative bids were not successful, the chimpanzees varied their signals within a modality that was appropriate to the attentional status of a human experimenter. In contrast to previous studies (Hostetter et al. 2001; Leavens et al. 2004b), the changes across experimental conditions occurred within the same trial, as the experimenter changed orientation from facing towards Focal Subject No. 1 and away from Focal Subject No. 2, to the opposite orientation (away from Focal Subject No. 1 and towards Focal Subject No. 2). Thus, the effects that we report are the result of relatively rapid accommodations to changes in experimenter visual orientation, in comparison to previous studies.

The findings reported here do not directly address the question of whether the chimpanzees were attempting to manipulate the psychological state of the human during their multimodal communication, but we and others have argued repeatedly that such direct evidence is, in principle, unattainable for any non-verbal organism, including young humans (Leavens et al. 2004b, 2005a, 2008; Racine and Carpendale 2007). Hence, these data are certainly no less ambiguous than data derived from the study of human infants with respect to the topically current, but scientifically dubious question of whether these subjects attempted to manipulate the behavior or the mental contents of the human experimenters’ minds. When human babies exhibit sensitivity to what their social partners can see or have seen, they are credited with having rather advanced appreciations of the psychological properties of their social partners, including an appreciation that seeing leads to knowing and that others have visual perspectives that differ from one’s own perspective (Camaioni et al. 2004; Liszkowski et al. 2004; O’Neill 1996; Povinelli and Eddy 1996). In the present study, chimpanzees discriminated the manipulated attentional status of a human experimenter in their spontaneous communicative behavior. While these data are certainly consistent with the idea that the chimpanzees explicitly represented the visual perspectives of their human interactants, this is no more a necessary inference in the present context than it is when the subjects happen to be pre-verbal human children (Brinck 2001; Doherty 2006; Leavens 2006; Leavens et al. 2005a; Moore and Corkum 1994). 
In other words, these chimpanzees exhibited the same kind of sensitivity to variations in the visual orientation of adult humans that very young human children do and this fact is equally incapable of clarifying the psychological processes involved, whether the subjects are chimpanzees or humans; in either case, we do not know whether a discrimination learning account or appeal to an abstract representational capacity better characterizes these processes.

Reports of spontaneous referential gestures in wild apes are rare (Pika and Mitani 2006; Veà and Sabater-Pi 1998) and the differences in gestural signaling between captive and wild populations of apes appear to reflect what has been described as the “Referential Problem Space” challenge facing chimpanzees living in captivity when compared with the wild (Leavens et al. 2005b, 2008, 2009). There are essentially no space constraints on wild chimpanzees, and if they want a desirable food item, they can freely walk to that space and obtain the food. In captivity, the food items are located outside their cages and the chimpanzees are therefore restricted from simply walking over and obtaining it. This creates a formidable cognitive challenge to the captive chimpanzee and the differential use of attention-getting and visual communication behaviors, only in the presence of a human, appears to be a very efficient means of solving this problem (Leavens et al. 1996; Leavens and Hopkins 1998). This clearly demonstrates the sensitivity of captive chimpanzees to human social stimuli in their environment and their ability to instrumentally use these social agents for their own ends. We would emphasize that constraining their ability to use only visual cues is a necessary condition for the differential use of calls when a human is facing toward or away from them. If chimpanzees are allowed to reposition themselves so that they can use a visual signal rather than an auditory signal, then they will reposition, even if it requires that they move away from the immediate goal (i.e., food) (Liebal et al. 2004).

We found no evidence of sex or rearing history differences in the communicative abilities or performance of the chimpanzees. Although some have suggested that certain ape communicative abilities reflect different degrees of human enculturation (Tomasello and Call 1997), our results do not support this interpretation. Certainly, the communicative abilities expressed by captive apes differ from those reported in wild subjects (see above “Discussion”) but the diverse rearing experiences of the Yerkes population do not appear to have any direct influence on the production and elaboration of signaling within the methodological and procedural constraints of this study. However, the subjects in this study were not home-raised or language-trained—there are qualitative differences in communicative signaling between apes that have been cross-fostered and institutionalized apes, like those in the present study (see Call and Tomasello 1994; Kellogg 1968; Leavens 2004; Leavens et al. 2005b, 2008, 2009; Racine et al. 2008; for discussion). We found no substantive differences in signaling behavior between institutionalized chimpanzees raised in peer cohorts, raised by their mothers, or taken from the wild prior to the 1970s and subsequently institutionalized.

In evolutionary terms, clearly, the ability to exercise choice over modality of communication and to tactically vary the display of signals within a context-appropriate modality emerges in captive populations of chimpanzees in the complete absence of any explicit training to do so. Whether captive circumstances foster particular problem-solving capacities not seen in the wild, or whether these chimpanzees are displaying species-typical problem-solving capacities in novel environments is not clear from the present study, but these data do highlight the flexibility in communicative signaling that characterizes chimpanzees (Bard 1998; Goodall 1986; Leavens et al. 2005b). This spontaneous ‘attunement’ of these chimpanzees’ signals to the response characteristics of their human social partners constitutes evidence for continuity between humans and apes in their motivation to engage in inventive communicative exchanges (Bard and Leavens 2009; Hopkins et al. 2007; Leavens et al. 2008). These motivational elements have been termed ‘intersubjectivity’ (Trevarthen and Hubley 1978) and the present data suggest that chimpanzees, like humans, exhibit both a profound desire to manipulate their social partners and a manifest skill in that manipulation (Bard 1998; Bard and Leavens 2009; Gómez 1998; Leavens et al. 1996, 2005b, 2008, 2009). The patterns of intramodal variegation of elements displayed by the chimpanzees in the present study resemble a kind of ‘praxic babbling,’ or a generative capacity which may have affinity with the human specialized babbling in the vocal modality, at least with respect to the motivation for this kind of play in signaling (Hopkins et al. 2007).