Introduction

Historical background

Although researchers interacting with their animal subjects often use communicative signals both in the laboratory and in the field, until recently there was no systematic study investigating to what extent animals comprehend human gestural signals and whether such communication affects their behaviour. There have been, however, many anecdotal accounts, some followed up with experiments, which have suggested that some individuals could indeed react differentially to human-given cues (for some examples see Candland 1993). Animals with a longer domestication history might have some advantage, as might individuals that were raised by humans from a very young age. Perhaps the best-known case is that of a horse (“Clever [Kluger] Hans”), which reportedly was able to “count and read” (Candland 1993; Rosenthal 1965). Oskar Pfungst (1911) was among the first to show that the horse's “extraordinary abilities” were based on its sensitivity to minute changes in the body tension of the human experimenter, which it observed and reacted to.

The case of Clever Hans made researchers more cautious. As a consequence, textbooks in both experimental psychology and ethology have argued that researchers should avoid engaging in communicative interactions with their subjects, because inadvertent cueing during training or testing could influence the behaviour, and therefore the performance, of experimental subjects. This warning has been taken seriously, and since then special precautionary measures have been taken to prevent direct contact with the animals. For example, experimenters separate themselves by a screen while their subjects make a choice in the Wisconsin test apparatus, and computers control operant conditioning apparatuses while researchers observe their subjects through one-way mirrors or by means of closed-circuit video camera systems.

In recent years there has been a renewed interest in investigating the ability of various animal species to understand human gestures. This interest was at least partially initiated by the study of Anderson et al. (1995), which used a very simple experimental method to look systematically for the effect of experimenter-given cues on the behaviour of the subject. Several studies with various animals using the same paradigm have followed up this work, and as of today, more than 20 papers have been published. Unfortunately, these studies suffer from many conceptual and methodological problems. The goal of this review is to facilitate future comparative work by pinpointing some of the main issues.

The ethological approach: What can we learn from studying comprehension of human pointing in animals?

Although it might be of some interest to study how our gestural communicative system is able to influence the behaviour of various species of animals, we would argue that the controlled investigation of the animals' ability to understand human communicative signals offers a possibility to study the cognitive abilities underlying their communicative interactions.

Because most studies preferentially used the pointing gesture as the main communicative signal, this will be the focus of the present review. The advantage of using this gesture, pointing with an extended index finger, is that it represents a species-specific human communicative gesture that is not used on any regular basis by other free-living primates (see Vea and Sabater-Pi 1998; for a recent review of pointing in apes see Leavens and Hopkins 1999). In humans, pointing behaviour emerges toward the end of the first year but continues to develop until the age of two years, when infants use pointing to direct the attention of the mother to objects in space (Bates et al. 1977). Comprehension precedes production, and infants as young as 9 months can follow simple forms of pointing with their gaze. Research reveals a gradual development in pointing comprehension that very likely is the result of an interaction between cognitive development and learning, social experience, social interaction and communication. This development is further facilitated by the infant's own pointing behaviour. We believe that the comparative research of pointing comprehension can in principle extend our knowledge in two directions. First, we can learn about possible evolutionary scenarios that allowed the emergence of pointing behaviour in our species, and thereby understand the ultimate and proximate causes of both its comprehension and production. Second, the species studied can open a window on their communicative system. Learning about how a species responds to a communicative signal of another species can reveal the flexibility of its system as well as the contribution (or impediment) of species-specific features to engaging in inter-specific communication.

Following in the footsteps of Tinbergen (1963), we investigate pointing comprehension from three different perspectives:

  1. The evolutionary perspective is based on the observation that the pointing gesture is a specifically human behaviour. Collecting experimental evidence for such human distinctiveness requires comparative data on humans and related primate species. Differences and similarities could show whether abilities associated with pointing comprehension are restricted to humans or have some evolutionary antecedents (Povinelli and Giambrone 1999). A somewhat different line of argumentation suggests that convergent evolutionary processes could also lead to the emergence of such abilities. More specifically, it has been argued that dogs (and perhaps other domesticated species) are at an advantage in comprehending human communicative signals (including pointing), because the process of domestication might have selected for such abilities (Hare et al. 2002; Kaminski et al. 2005; Miklósi et al. 1998).

  2. The functional perspective of pointing comprehension was emphasized by Hare (2001) when he attempted to account for some differences in pointing comprehension between apes and humans. He argued that pointing is an inherently cooperative type of signal, that is, the signaller directs the attention of the receiver to an object or location. It has been hypothesized that pointing comprehension should be found in species with a stronger tendency to cooperate. Based on this, negative results have been explained by the mainly competitive relationship present in most species of apes and monkeys.

  3. The mechanistic perspective is concerned with explaining the cognitive abilities underlying pointing comprehension. Earlier it had been thought that pointing comprehension could reveal something about the animals' ability to understand mental states (Anderson et al. 1995, 1996). It was suggested that the perceiver of a pointing gesture might attend to the mental state of the pointer by recognizing its attentive state. Later this hypothesis was referred to as the ‘high-level’ model of pointing comprehension (Povinelli et al. 1999), and it was contrasted with the ‘low-level’ interpretation, which assumed that subjects either use some observable visual cues to discriminate between situations (associative discrimination learning), or use some simple behavioural automatisms that enable them to track the goal object of the signal in space. Based on Call's (2001) recent proposition, a third alternative mechanism can be suggested. Animals that are able to recognize a situation as communicative could also be able to deduce, after some experience, a secondary variable (“knowledge” in Call's terms), a “rule” that they apply in a novel situation or context. This explanation assumes a more complex mental representation of the communicative interaction (at least the ability to recognize the communicative nature of the cue emitted by the signaller) than the simple discrimination learning explanation, but does not depend on abilities to represent the mental states of others.

A further interesting question is whether animals are able to comprehend the referential aspect of pointing. In humans the referential aspect of pointing follows directly from its imperative and/or declarative nature (Tomasello and Camaioni 1997); however, this does not mean that animals necessarily decode the referential components of pointing. Although the issue of referentiality remains controversial in animal communication (see e.g. Owren and Rendall 1997), here we refer to it as an ability to comprehend communicative signals that refer to external events or objects and whose effect seems to be relatively independent of the context (Allen and Saidel 1998; Evans 1997).

The structure of this review

In the following we will present a 3-step analysis of the topic. (1) We compare and evaluate current experimental methods used. (2) We compare available experimental results on performance of different species and we investigate the interaction of species differences and other independent variables. In this section we look for answers on how modifications of the procedure affect performance (see Table 1) and whether previous (social) experience influences performance, including the effect of development. (3) We try to evaluate how our present understanding of pointing comprehension answers questions about function, evolution and mechanisms.

Table 1 Summary table for experimental studies of pointing comprehension, grouped on the basis of three aspects of the gesture. The temporal aspect refers to the duration and the dynamics of the gesture used. The spatial aspect refers to the physical proximity of the experimenter and the pointing hand to the target object (at target = the experimenter is standing at the target and is pointing at the target; proximal = the experimenter is standing at equal distance from both targets but the tip of the pointing finger is within 10 cm of the target; distal = the experimenter is standing at equal distance from both targets but the tip of the pointing finger is at least 50 cm from the target; asymmetric = the experimenter is standing at the non-target and pointing to the target). Pointing gestures are often not accompanied by any gazing cues (no gazing = looking at the floor or wearing non-transparent glasses); in other cases experimenters gaze at their subjects during pointing (gazing) or make gaze alternations (gazing at the subject is followed by gazing at the target and vice versa). ± signs indicate significant/non-significant preference for the object indicated by pointing

It should be noted that because of the divergence of theoretical and experimental paradigms, our comparisons will often be only tentative. Nevertheless, we believe that this review will provide a starting point for developing more precise hypotheses and better designed experiments.

Methods

Subjects

Probably because researchers had different theoretical questions in mind, a wide array of animals has been tested for pointing comprehension (monkeys: rhesus monkeys (Macaca mulatta), capuchin monkeys (Cebus apella); apes: chimpanzees (Pan troglodytes), gorillas (Gorilla gorilla), orangutans (Pongo pygmaeus); dogs (Canis familiaris); wolves (Canis lupus); cats (Felis catus); dolphins (Tursiops truncatus); horses (Equus caballus); seals (Arctocephalus pusillus); goats (Capra hircus)). There are, however, at least three potential problems when relying on the comparative argument based on the animals tested up to now with this paradigm. (1) Given the nature of these tasks, the number of subjects is very low for some species, which could threaten the external validity of the results and eventually bias the comparisons. (2) Given the social nature of the situation, previous individual social experience could have a major influence on the outcome. In many cases we know little about the age and individual history of the subjects, and their previous experience with humans remains difficult to assess. For example, dogs living in a family (and sometimes also being trained) are exposed to a great extent to pointing gestures (personal observation). The effect of human social experience is problematic in the case of apes because there is considerable variation, from captive apes with minimal human contact to home-raised individuals (for a categorization of human experience in apes, see Call and Tomasello 1996); in addition, the amount of early socialisation, which can have a strong effect on social skills in general, is often not known. Further, pointing is such a natural behaviour in humans that it is difficult to assess the amount of previous exposure to it in animals like dolphins or seals, even if admittedly this gesture was not used during training (Herman et al. 1999; Tschudin et al. 2001). (3) The experimental procedures used also differ among the tested species. This means that comparisons can be made only with some reservations, and negative or contradictory results can sometimes be attributed to procedural differences.

Procedure(s)

The studies reviewed here are based on the following experimental procedure: Out of the subject's view, food is hidden in one of two (or sometimes three) bowls that are placed either on the floor or at a particular height. The human informant stands (or kneels) between them and points to one of the bowls, and then the subject is allowed to make a choice. Pre-training often involves familiarization with the bowls, since some subjects have to learn that they might contain food. The reward is usually presented in semi-random order (no more than two rewards are hidden on the same side in subsequent trials, to avoid the development of a side bias).
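The semi-random baiting rule described above can be illustrated with a minimal Python sketch. The function names, the "L"/"R" coding of the two bowls and the use of a seeded random generator are illustrative assumptions, not details reported in the studies reviewed; the sketch only enforces the run-length constraint and does not balance the total number of left and right rewards, which some protocols may additionally require.

```python
import random


def baiting_sequence(n_trials, max_run=2, rng=None):
    """Return a list of baited sides ('L' or 'R') for a two-bowl task.

    Hypothetical helper: no side is baited more than `max_run` times
    in a row, mirroring the "no more than two rewards on the same
    side in subsequent trials" rule.
    """
    rng = rng or random.Random()
    sides = []
    for _ in range(n_trials):
        options = ["L", "R"]
        # If the last `max_run` trials all used the same side,
        # that side is excluded for the current trial.
        if len(sides) >= max_run and len(set(sides[-max_run:])) == 1:
            options.remove(sides[-1])
        sides.append(rng.choice(options))
    return sides


def longest_run(sequence):
    """Length of the longest run of identical consecutive items."""
    longest = current = 0
    previous = None
    for item in sequence:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest


if __name__ == "__main__":
    order = baiting_sequence(20, rng=random.Random(1))
    print("".join(order), "longest run:", longest_run(order))
```

Passing a seeded generator makes a session's baiting order reproducible, which can be convenient when the same order has to be reported or reused across subjects.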

In some cases subjects (e.g. chimpanzees) are separated from the human informant by partitions (or “by water” in dolphins, see Herman et al. 1999; Tschudin et al. 2001), and in these cases subjects are usually nearer to the bowls (or goal objects in the case of dolphins) than the pointer, or both participants are at equal distance. In other experiments both the informant and the recipient are in the same physical space and can interact freely (e.g. dogs, goats), and subjects usually stand further from the bowls.

A further difference concerns the degree of familiarity with the testing environment and the experimenter. Animals are sometimes tested in their home environment or in a familiar environment that is used for testing. The experimenter is usually familiar in the case of apes and dolphins, while often unfamiliar persons test dogs.

Two distinct experimental procedures can be distinguished. In the ‘training procedure’, novel gestures are introduced one by one, usually only after the animal has reached significant performance levels with the previous “simpler” gesture. In this arrangement one can test for learning ability by measuring differences in learning rate for different gestures. However, interference from earlier learning makes interpretation difficult (Anderson et al. 1995; Miklósi et al. 1998). In the so-called ‘probe trials procedure’, different and often “unfamiliar” variations of the pointing gesture are intermixed with simple “baseline” pointing (for examples see Povinelli et al. 1997, 1999; Soproni et al. 2001, 2002). This method has the advantage that it usually involves a small number of trials to test the actual comprehension abilities of the subject. On the practical side, the baseline trials also help to maintain the attention of the subjects and are indicative of the general level of responsiveness during the experimental sessions. Additionally, the response to the first presentation of a novel gesture can be used to assess the ability for generalization.
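What counts as a “significant performance level” depends on the statistical criterion each study adopted, and these criteria are not specified here. As a hedged illustration only, the following Python sketch computes a one-sided exact binomial probability of obtaining at least a given number of correct choices under chance responding. The function name is hypothetical, and the chance level of 0.5 assumes a two-bowl task (it would be 1/3 with three bowls).

```python
from math import comb


def binomial_p_above_chance(successes, trials, chance=0.5):
    """One-sided exact binomial p-value: P(X >= successes) when each
    trial is answered correctly with probability `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


if __name__ == "__main__":
    # Illustrative numbers, not data from any study reviewed here.
    for correct in (14, 15):
        p = binomial_p_above_chance(correct, 20)
        print(f"{correct}/20 correct: p = {p:.4f}")
```

With 20 two-choice trials, 15 correct choices give p ≈ 0.021 and 14 correct choices give p ≈ 0.058 under this one-sided test, which shows how strongly conclusions drawn from small numbers of trials depend on one or two choices.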

Looking at different experimental protocols, there is considerable variation in the presentation of the pointing gesture (see Table 1). In humans, pointing is also accompanied by joint/mutual visual attention (Morissette et al. 1995), which has also been described for chimpanzees in semi-natural circumstances (Tomasello et al. 1994). Interestingly, however, in order to decrease the chance of “unconscious” influence on the subjects' behaviour, investigators often avoided establishing the state of joint/mutual attention during pointing. For example, in most experiments by Povinelli and his colleagues (e.g. Povinelli et al. 1997) the experimenter did not establish eye contact with the subjects (experimenters gazed at the floor) before performing the pointing gesture. In contrast, in other studies, the experimenter used other visual gestures such as gazing, that is, looking at the bowl while pointing for the subject (e.g. Itakura et al. 1999). Some experimenters established eye contact before presenting the pointing gesture (e.g. Soproni et al. 2002).

Experimenters have often varied the movement component of the gesture; this can have a strong effect on performance because it can affect salience and can have a differential effect on memory. We suggest that the following categories of arm movements be used for future reference:

Static pointing: The informant is already in the pointing position (indicating one of the objects) before the subject views him/her, and remains in this position until the choice is made.

Dynamic pointing: The pointing gesture is enacted in full view of the subject and the arm remains in the pointing position until choice is made.

Momentary pointing: The subject observes only a brief (1–2 s) extension of the arm toward the object, after which the arm rests at the side of the body before the subject is allowed to make its choice; the subject therefore has to remember at which object the arm was pointing.

Experimenters also varied the distance between the pointing finger and the object. Although these distances are often not reported, we can deduce the values for most experiments reviewed here from other distance measures provided (assuming that the average length of the human arm is about 70 cm). Accordingly, we suggest that the following two categories be considered:

Proximal pointing: The distance between the tip of the finger and the bowl is smaller than 10–40 cm.

Distal pointing: The distance between the tip of the finger and the bowl is greater than 50 cm.
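For coding studies consistently, the two distance categories can be written as a simple rule. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, the upper bound of the reported 10–40 cm range is taken as the proximal cut-off, and distances falling between the two categories (40–50 cm), which the definitions above leave open, are labelled "intermediate".

```python
def classify_pointing_distance(fingertip_to_bowl_cm,
                               proximal_max_cm=40.0,
                               distal_min_cm=50.0):
    """Label a pointing gesture by fingertip-to-bowl distance.

    Assumed cut-offs: below `proximal_max_cm` is 'proximal', above
    `distal_min_cm` is 'distal'; anything in between is returned as
    'intermediate' because the categories do not cover that range.
    """
    if fingertip_to_bowl_cm < 0:
        raise ValueError("distance must be non-negative")
    if fingertip_to_bowl_cm < proximal_max_cm:
        return "proximal"
    if fingertip_to_bowl_cm > distal_min_cm:
        return "distal"
    return "intermediate"


if __name__ == "__main__":
    for distance in (5, 45, 80):
        print(distance, "cm ->", classify_pointing_distance(distance))
```

Keeping the cut-offs as parameters makes it easy to recode the same set of studies under a stricter definition, for example a 10 cm proximal limit.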

Further procedural differences can be found in the spatial relation between the pointer and the objects pointed at. Generally the pointer stands at equal distance from all objects (symmetric pointing), but in some cases the experimenter points to a distant object while standing near another potential object (asymmetric pointing). Finally, pointing can be executed either with the hand ipsilateral to the object or with the contralateral one. The latter is usually referred to as cross-pointing. (For some schematic examples of pointing gestures see Fig. 1.)

Fig. 1. Schematic view of the most frequently used pointing gestures

Results

In this section we compare the comprehension of the human pointing gesture across many species while remaining aware of the constraints imposed by the data available in the literature. Because most studies differ not only in the species involved but also in their procedural details, we cannot and do not aim at direct comparisons of the results. Instead, we compare in a qualitative way the outcome of various experiments done with the same research questions in mind.

Most of the experiments conducted recently on pointing comprehension have been summarized in Table 1 under three different aspects: temporal pattern, spatial relationship to the goal object, and whether pointing was accompanied by other forms of visual gesturing (i.e. head turning, gazing or eye movement).

The human body as a cue for food

In everyday situations the pointer is usually closer to the object being indicated than to other potential objects, and in the case of captive animals humans are often “the source” of food, so it can be expected that for such animals the human body could become a signal for food. In experimental tests, chimpanzees (Itakura et al. 1999), dogs (Hare and Tomasello 1999) and socialized wolves (Miklósi et al. 2003) seem to choose the bowl at which the experimenter is standing, providing a clear indication that they can use body position as a cue for choice.

Static and dynamic pointing to proximal and distal objects

As indicated earlier, two types of pointing gestures can be discriminated, depending on the distance between the tip of the finger and the object. Although distance is not an obvious variable for categorization, in some cases “proximal” and “distal” distances can be distinguished. For example, objects can be said to be in a proximal position if they are within reaching distance, and by the same token distant objects are ‘out of reach’.

The problem of distance between the place of reward and a visual “beacon” becomes obvious if one considers the results of earlier similar experiments executed in a non-social context with chimpanzees (Jenkins 1943). It was found that their choice behaviour (in the Wisconsin apparatus) is poor if the indicating (cuing) object (“beacon”) is further than 20 cm from the goal object. In contrast, the pointing index finger, head/gaze contour, eyes, etc. are usually at a greater distance from the goal (sometimes even at 1 m). Therefore one could investigate whether the social context facilitates the use of distant cues in object choice tasks.

Fig. 2. (a) Comparison of the effect of human ‘proximal’ (index finger is approx. within 10 cm from target) and ‘distal’ (index finger is approx. more than 50 cm from target) pointing on choice behaviour in apes. C1: Itakura and Tanaka 1998 (n=2); C2: Itakura et al. 1999 (n=4); O1: Itakura and Tanaka 1998 (n=1); O2: Call and Tomasello 1994 (n=2); P1: Peignot and Anderson 1998 (n=4). Dotted lines show chance performance, and * indicates significant difference from chance. (b) Comparison of the effect of human ‘distal’ pointing on choice behaviour in different species. D8: Miklósi et al. 2005 (dogs = 14; cats = 14); D7: Miklósi et al. 2003 (wolves = 4); M1: Anderson et al. 1996 (n=3); M2: Anderson et al. 1995 (n=3); Do2: Tschudin et al. 2001 (n=6); S1: Scheumann and Call 2004 (n=4). Dotted lines show chance performance, and * indicates significant difference from chance

The effect of distance on static pointing in apes can be seen in Fig. 2a. If the distance between the tip of the index finger and the object was greater than 50 cm, subjects performed poorly, in contrast to trials where the pointing finger almost touched the baited box. We should also note that in some trials/experiments the pointers also turned their head and looked in the same direction, thereby enhancing the communicative effect of the gesture, but even so the difference did not disappear. Even after training, chimpanzees in Povinelli et al.'s (1997) experiment were just able to master the task. Note that in all of these experiments, the experimenter presented a static pointing gesture (keeping the hand in the extended position until the subject made a choice), which could also increase the chances of success. Two dolphins seem to show a much higher level of performance right from the beginning of such static pointing trials after being tested on various forms of the dynamic pointing gesture (Pack and Herman 2004).

Capuchin monkeys (Cebus apella) performed well with proximal dynamic pointing (Anderson et al. 1995; Vick and Anderson 2000) but three Rhesus macaques were less successful in a similar task (Anderson et al. 1996). Recent comparative experiments have shown that seals (Scheumann and Call 2004), dogs and cats (Miklósi et al. 2005) are very skillful in trials with proximal dynamic pointing gestures (Fig. 2b) in contrast to chimpanzees (Tomasello et al. 1997a, b) and wolves (Hare et al. 2002). Dolphins (Tschudin et al. 2001) have only been tested with the distal dynamic gesture, and their performance was above chance in all experiments carried out (Fig. 2b) similarly to one seal (Shapiro et al. 2003). Cats and dogs (Miklósi et al. 2005) seemed to be relatively skillful, while wolves tested by Hare et al. (2002) performed at chance level (see below). In general, dogs, cats and dolphins performed at a similar (high) level from the beginning of the testing, so there was little indication for learning.

It should be noted that the experimenter avoided eye contact with the dolphins (by wearing sunglasses), but in trials with dogs, cats and wolves (and in some cases with chimpanzees) eye contact was established before the presentation of the pointing gesture. When dogs and apes are compared in experiments where eye contact was established before distal pointing, dogs seem to be more efficient, but eye contact, and especially gaze alternation, increases the effect of the pointing gesture.

More supporting evidence for comprehension of distal pointing comes from studies showing that two dogs (Hare et al. 1998), one dolphin (Herman et al. 1999) and a seal (Shapiro et al. 2003) were quite skillful at choosing between two objects that were placed behind them. Interestingly, at present it seems that despite variations in both the actual pointing gesture and other accompanying signals, apes have problems with comprehending static and dynamic distal pointing gestures in contrast to dogs, cats, dolphins and seals.

Momentary pointing gestures

The outstretched arm during pointing can be used as a “beacon” for guiding choice behaviour. Therefore, subjects can rely on simple spatial orienting mechanisms that are based on the use of physical cues for finding food repeatedly at the same location. In the case of momentary pointing gestures the subject has to remember a short signal for some time before making a choice. This makes the situation more similar to a communicative interaction where the behaviour of the receiver is influenced by a short discrete signal emitted by the sender. Dolphins (Herman et al. 1999; Pack and Herman 2004), dogs, cats (Miklósi et al. 2005), and a seal (Shapiro et al. 2003) seem to show only a minor decrease in performance, if any (Fig. 3). Currently there are data from only one ape for this condition: Chantek (an orangutan) performed well with momentary pointing.

Fig. 3. Comparison of the effect of human dynamic and momentary pointing gestures on choice behaviour. D8: Miklósi et al. 2005 (dogs = 14; cats = 14); Do1 (momentary pointing): Herman et al. 1999 (n=1); Do2 (dynamic pointing): Tschudin et al. 2001 (n=6); S2: Shapiro et al. 2003. * indicates significant difference from chance

The direction of movement

It can be argued that the signal's most salient feature is not the pointing arm, but its direction of movement. This would explain why some subjects (for example apes) faced problems when presented with static pointing gestures (Povinelli et al. 1997). The possibility of such an effect has been tested by using a “reversed pointing gesture”, in which either the experimenter moved away from the goal object during pointing (McKinley and Sambrook 2000), or the pointing arm was lowered to the side of the body after the static gesture had been observed by the dog (Soproni et al. 2002). The latter variation is of particular importance because in this case the subjects witness an arm movement opposite to the directional movement observed in natural circumstances. This manipulation seems to affect the performance of dogs only partially (Fig. 4), suggesting that the movement component of the gesture plays a secondary role when dogs respond to pointing (see also similar results with dolphins in Pack and Herman 2004).

Fig. 4. Comparison of the effect of movement during pointing gestures on choice behaviour in dogs. D1: Soproni et al. 2002 (n=9); D4: Hare et al. 1998 (n=2); D5: McKinley and Sambrook 2000 (n=14). * indicates significant difference from chance

Asymmetric pointing

The use of asymmetric pointing could be very revealing because it presents the subject with two conflicting cues. While animals with more or less human experience show a clear preference for approaching the object in the vicinity of the human (see above), in this case pointing indicates an object that is further away. In this situation dogs seemed to prefer the bowl indicated by pointing over the one at which the human was standing (Soproni et al. 2002). Note that the dogs did not witness the hiding, and the informant stayed in the vicinity of one of the potential places that could have hidden food. Although this situation proved to be somewhat difficult for the dogs, they seemed to be able to overcome their natural tendencies. In a similar manner, asymmetrical pointing seemed to be comprehensible for seals (Scheumann and Call 2004; Shapiro et al. 2003) (Fig. 5).

Fig. 5. Comparison of the effect of body signalling and asymmetrical pointing. C2: Itakura et al. 1999 (n=4); C3: Povinelli et al. 1997 (n=7); D1: Soproni et al. 2002 (n=9); D6: Hare and Tomasello 1999 (n=10); S1: Scheumann and Call 2004 (n=4). * indicates significant difference from chance

Variability in the form of the pointing gesture

In (adult) humans, the pointing gesture can take many forms. Although in most cases we point with the extended arm and index finger ipsilateral to the object, variations in the position of the upper arm and the hand with respect to the body can sometimes be observed. For example, pointing across the body with the contralateral hand represents such a variation. If this is done with the arm extended, the index finger usually protrudes on the contralateral side of the body (cross-pointing). Rarely, people point with a bent arm, so that the elbow protrudes on the ipsilateral side of the body and the index finger is positioned in front of the belly (described as “elbow cross-pointing”: Soproni et al. 2002; or as “belly pointing” in Hare et al. 1998). The use of such (relatively) unfamiliar variations of the pointing gesture offers the possibility to test the animal's plasticity and capacity for generalization in understanding this visual signal. We should also add that during pointing interactions under natural conditions, animals have the opportunity to observe the gesture from different viewing angles, which might also contribute to their flexibility in responding to unfamiliar signals.

The cross-pointing gesture has so far been applied only to dogs (Soproni et al. 2002), dolphins (Pack and Herman 2004) and a seal (Shapiro et al. 2003); all demonstrated relatively good levels of comprehension (Fig. 6). However, since in the cross-pointing gesture the tip of the index finger is even further from the object than in the case of the normal pointing gesture, apes would be expected, on the basis of previous results, to show even lower performance.

Fig. 6. Comparison of the effect of human ‘cross-pointing’ and ‘elbow cross-pointing’ on choice behaviour. D1: Soproni et al. 2002 (n=9); D4: Hare et al. 1998 (n=2); D6: Hare and Tomasello 1999 (n=10); Do1: Herman et al. 1999 (n=1); S2: Shapiro et al. 2003 (n=1); C3: Povinelli et al. 1997 (n=7). * indicates significant difference from chance

For the “elbow cross-pointing” gesture there are data for both apes and dogs (Fig. 6), and members of both groups seem to be unable to comprehend the meaning of this form of pointing, which is less surprising in the case of apes given their low performance with other types of the pointing gesture. In the case of dogs, it seems that the protrusion of the hand/finger from the torso represents the most characteristic feature of the pointing gesture. Interestingly, recent results suggest that seals show decreased performance when only the index finger is used for indicating the correct container (Scheumann and Call 2004), but another seal might be better with this type of gesture (Shapiro et al. 2003). Finally, in a somewhat different design, dolphins seem to be able to follow the direction in which the hand was pointing rather than that in which the arm was extended (see Fig. 4b in Pack and Herman 2004).

The role of additional attention cues

In humans, pointing is usually accompanied by gazing cues that direct the attention of the subject both to the signaller and to the target. In some cases, researchers tried to separate the pointing gesture from these natural, human-specific features by avoiding gazing at the animal. The performance of dolphins and a seal was not affected. Interestingly, however, chimpanzees showed improvement if the human was also gazing at the target, but not if he was gazing at the chimpanzee during pointing (Povinelli et al. 1997). This can be explained on the basis of gaze following in chimpanzees (e.g. Povinelli and Eddy 1996), whereby the human's directional gazing biases the choice behaviour.

In some experiments pointing was accompanied by repeated gaze alternations, in which the human looked back and forth between the subject and the target. This “enhanced” or facilitated version of pointing increased the performance of two enculturated chimpanzees (Itakura and Tanaka 1998) in the case of proximal pointing. Goats with restricted human experience performed just above chance level in this case, in which, in addition to pointing, many other behavioural cues indicated the location of the target (Kaminski et al. 2005). Little variation can be detected in this regard in dogs, but they have not been tested without gazing, and there is some indication that the performance of dogs could even decrease if the gazing cue is omitted (personal observations).

The effect of human social experience and development

The comparison of comprehension in these species is invariably confounded by the effect of experience with humans, and it can be argued that differences in individual experience can have a larger effect on this behaviour than species predisposition. For example, dogs usually and naturally live in a relatively stable social environment in or around human families. Their experience with and exposure to the pointing gesture can be compared to that of children, since dogs often share, to some extent, the same physical and social environment. Although dolphins are kept in a different manner, the living environment of the animals used in these studies is relatively similar, including the intensity of their interaction with humans. The captive apes represent the most diverse group because their contact with humans is often restricted to some extent (or after some age), and their direct contact with humans is difficult to assess. Some individuals were captured in the wild, while others were raised in special nurseries or in human homes, and differences in early social experience might turn out to be important. The notion of enculturation refers to apes that were brought up with intensive human contact (Call and Tomasello 1996), and although this concept might have explanatory value in some cases (e.g. imitation, see Carpenter et al. 1995; Nagell et al. 1993), the lack of quantification of human experience often makes it difficult to apply (see also Bering 2004; Tomasello and Call 2004). Nevertheless, one might argue that more intense human contact leads to better comprehension of the pointing gesture. Unfortunately, the present data do not unequivocally support such a conclusion. Seven nursery-reared chimpanzees in the study by Povinelli et al. (1997) reached about 100% performance with the proximal pointing gesture after some training, which is comparable to the performance of the two “enculturated chimpanzees” in the study by Itakura and Tanaka (1998). Similarly, captive lowland gorillas with little human contact proved to be successful with static proximal pointing (Peignot and Anderson 1999). In contrast, most apes used in the two studies by Tomasello et al. (1997a, b) and in the study by Itakura et al. (1999) did not show signs of comprehension.

Individuals that acquired communicative skills in close contact and in the course of communicative exchanges with humans (i.e. language-trained apes) did not show outstanding performance. Chantek (language trained, see Miles 1990) was reported to choose relatively well in the case of distal pointing gestures (Call and Tomasello 1994), but Savage-Rumbaugh (1986) notes that two language-trained apes showed little comprehension of human pointing (no formal testing has been reported).

The comparison of dogs and wolves has revealed that in some cases even extensive socialization with humans cannot overcome natural species differences but can increase performance after some training. In a recent study, dogs and wolves were raised under identical social conditions in close human contact during their first 3–4 months of life (Miklósi et al. 2003). Testing them under identical circumstances revealed that juvenile dogs performed better than wolves of the same age but that after extensive training, one wolf was able to reach a performance comparable to that of dogs even with the momentary distal gesture.

Earlier there was some indication that social experience improves the performance of dogs with pointing gestures (Hare and Tomasello 1999). However, there are observations that dogs as young as 4 months old are able to choose correctly on the basis of a static, proximal pointing gesture (Hare et al. 2002), even if raised in puppy kennels with limited human contact. More recent observations suggest that at this age, dogs living in human homes perform relatively well even with momentary distal pointing gestures (Gácsi et al. 2005, unpublished data).

Discussion

Functional arguments: Can a single ultimate hypothesis explain species differences in comprehension of pointing?

Recently, it has been argued that the apparent differences between apes and humans arise because the inherently competitive social strategy of primates prevents them from performing well in a cooperative context (Hare 2001). In humans, communication based on pointing is inherently cooperative. The informant directs the attention of the perceiver to himself and/or to the gesture, and this signalling is usually seen as a benefit for the observer.

However, the concept of cooperation can also be evoked on another level. For example, the “social tool use” hypothesis of this gesture is based on the observation that the perceiver is usually willing to act in accordance with the pointer's goal. From the pointer's perspective, expecting this might be either a natural tendency (Gómez 1990) or a result of experience (learning) during social interactions. Many recent studies do not support a high-level interpretation of cooperation among apes (e.g. Boesch and Boesch 1989; Chalmeau and Gallo 1993); nevertheless, this kind of ‘social tool use’ has been observed both in apes (Gómez 1990) and in dogs (Miklósi et al. 2000) while interacting with humans.

In support of the aforementioned hypothesis, chimpanzees have been found to be skilled at judging the visual experience of another individual (a conspecific) in agonistic (e.g. Hare et al. 2000) but not in cooperative situations (Povinelli et al. 1990), suggesting that chimpanzees experience difficulties in understanding tasks based on cooperation. This notion could receive further support from the observation that chimpanzees do not actively share food, although they tolerate passive sharing (de Waal 1989); the lack of this natural habit could prevent them from understanding the “logic” of interactions based on pointing.

However, the picture becomes less clear if the “cooperation-competition hypothesis” is applied to other species. Hare and Tomasello (1999) noted that the hunting behaviour of wolves relies more on cooperative abilities than that of apes. This would suggest that both wolves and dogs should be able to perform well in responding to the pointing gesture, provided they have appropriate social experience and/or training. However, as we have seen, this is not the case, because even after extended human socialization the performance of wolves was inferior to that of dogs (Miklósi et al. 2003). Furthermore, chimpanzees (Itakura and Tanaka 1998) and monkeys (Vick and Anderson 2000) learned to comprehend simpler forms of the pointing gesture, and at the behavioural level (although no such data are available) most chimpanzee subjects seem to be able to cooperate with their human partner during the task. Conversely, one could argue that in the case of proximal pointing the informant is within reach of the potential food source (e.g. when standing at the target); therefore the situation is not cooperative at all, and the gesture becomes a cue (and not a signal, see Hauser 1996) indicating either the place of the food or the goal of the “dominant” human to obtain the food.

Although the “cooperation-competition hypothesis” can explain why dolphins, which show many instances of cooperative behaviour (e.g. Connor and Norris 1982), exhibit superior pointing comprehension (Herman et al. 1999), it falls short in the case of seals, which are not particularly cooperative by nature (Le Boeuf et al. 2000; McConell et al. 1999). This means that although the “cooperation-competition hypothesis” might have some merit in explaining differences within the primate line, such differences in predominant social strategies could be only partially responsible for the observed effect.

Given the high levels of pointing comprehension noted in dogs, another appealing hypothesis aims at explaining dog-ape differences by the effect of domestication (Hare et al. 2002; Miklósi et al. 1998, 2000; Soproni et al. 2001, 2002). It has been suggested that dogs have been selected for enhanced sociocognitive abilities for living in human social settings. However, it should be noted that domestication is not a unified process: the behaviour selected for depends not only on the selection process but also on the species in question, and there is likely a “selection response x species” interaction. For example, dogs and cats are among the animals domesticated very early. Being both predatory species, dogs and cats share many cognitive abilities, but it is likely that their domestication took place in different contexts, and species differences in their communicative behaviours toward humans are also evident (Miklósi et al. 2005). A recent finding that domesticated goats with relatively little human contact also seem to be able to master the comprehension of pointing gestures (Kaminski et al. 2005) provides further support for the domestication hypothesis, but it remains to be seen whether all domesticated species perform better in these tasks than their “wild cousins” if the latter are socialized to humans at comparable levels. The only available case, the dog-wolf comparison, points in this direction.

Despite its appealing nature, the domestication hypothesis still fails to explain pointing comprehension in dolphins and seals. It could be suggested, however, that species with some exapted behavioural traits (see Gould and Vrba 1982) for using directional signals in their species-specific communicative exchanges are able to comprehend the human pointing gesture, given some social experience or formal training (see also Hare and Tomasello 1999; Herman et al. 1999). More specifically, in explaining the comprehension abilities of their dolphin, Herman et al. (1999) suppose that the oriented body position of dolphins during the projection of sonar emissions could provide a behavioural basis for such an ability. A non-echolocating observer dolphin is able to identify a target that another dolphin has targeted with its sonar (Xitco and Roitblat 1996). Translated to the visual domain, this means that a dolphin is able to attend to an object the other is looking at.

Similarly, wolves have the ability to signal the direction where potential prey can be located (Mech 1970). When sensing the smell of the prey, wolves often freeze into a “pointing” position for some time. (This behaviour seems to have a genetic basis, and has been enhanced by selective breeding in some hunting dog breeds, i.e. pointers.) However, the ability to follow another's gaze into space is not restricted to dogs and dolphins, because chimpanzees have been found to use gazing cues in very flexible ways (Hare et al. 2000; Povinelli and Eddy 1996). Therefore it seems unlikely that the ability to follow gaze (which seems to be present in most mammals investigated so far) would provide a primary basis for the comprehension of pointing.

Finally, it should be noted that the evolutionary and functional hypotheses listed above are not mutually exclusive, and depending on the species they might have a synergistic or antagonistic effect. At present it seems unlikely that pointing comprehension (or the lack of it) can be explained by a simple one-factor theory.

Mental representations behind pointing comprehension

Recently, it has been argued that all experimental evidence gathered so far on pointing comprehension in animals can be explained by “simple conditioning processes” (Shapiro et al. 2003). This opinion is in accord with Povinelli's “low level” hypothesis, which explains performance in such communicative situations by assuming that low-level associative learning is at work. Especially in the case of social cognition, it has become fashionable to think about the animal mind dichotomously: assuming either “simple conditioning” or some “high level complex cognitive processes”. However, it is becoming clear that not much is explained by putting actual performance in one or the other category. In fact, such “simple conditioning” can be quite a complex process in itself. In this regard, we sympathize with the approach taken by Call (2001), who places more emphasis on explaining how animals set up general rules after experiencing certain environmental invariances. However, this line of investigation needs carefully planned studies based on viable hypotheses with appropriate control conditions. For example, the comparison of responses to proximal and distal pointing gestures could provide a basis for differentiating between asocial and social aspects of the task. In the case of the former, the pointing finger/hand is at the goal object, and learning about this form of gesture can be restricted to associating the vicinity of the cue with the place of food. Performance with distal pointing gestures cannot easily be explained in this way, because animals do not readily associate physical cues placed at a distance from the reinforced location (see above, Jenkins 1943). In this case, they either rely on the movement of the hand at the beginning of the cueing (see contradictory data for dogs in Soproni et al. 2002), or have other strategies associated only with social interactions and not with asocial situations. Additionally, we find striking differences among species in performance with distal pointing, as socialized wolves and chimpanzees have been found to have problems in understanding the task. Such negative evidence suggests that learning experience in itself is not enough to develop this skill. There is little doubt that associative learning plays a role in comprehending a gesture of a heterospecific, but the question is how flexible such learning mechanisms are and what kind of representations emerge as a consequence. The use of novel communicative gestures can also be informative here. Tomasello et al. (1997a, b) found that, in contrast to children, chimpanzees had problems recognizing the act of putting an object on the baited bowl as signalling the place of food, while dogs seemed to be able to base their choice on this signal (Agnetta et al. 2000). This provides further evidence that the representations acquired by dogs and children in communicative contexts with humans might share some features, in contrast to those emerging in some apes.

Another mechanism that may play a role has been suggested by Miklósi et al. (2003) after observing the behaviour of socialized wolves and dogs in problem situations. It was found that dogs engage more readily than wolves in gazing contact with humans and, along with others (Byrne 2003), we have argued that this propensity might facilitate the development of dog-human communicative interactions.

A further question is whether animals represent and are able to decode the referential information provided by the pointing gesture, which is an important ability in humans (e.g. Tomasello and Camaioni 1997). In the view of Povinelli et al. (1997), the comprehension of the referential nature of the pointing gesture means understanding that it refers to (or is about) a given object in space. Their tests were based on the presumption that the ability to generalize from the basic (proximal) pointing gesture to novel forms of the pointing gesture might reveal representational understanding in chimpanzees. In contrast to the (often) negative findings with chimpanzees, both dogs and dolphins displayed high levels of performance in the case of some novel pointing gestures. The application of Povinelli's argument would suggest that both of the latter species might share with humans some understanding of the referential character of the pointing gesture. Herman et al. (1999) broadened the quest for referentiality, showing that in dolphins the pointing gesture could also be a substitute for other gestural symbols referring to objects. Of course, it is difficult to exclude the possibility that the trained dolphin had prior experience with pointing of this kind during its interaction with humans. Nevertheless, the dolphin was able to substitute pointing, cross-pointing and combinations of points and cross-points for the symbols in an artificial symbolic gestural system. A major thrust of the test with combinations of points was to show that the subject could understand a reference to an object that was to be acted upon indirectly (the destination object), which had to be represented in memory while the transport object was identified and transported. Further evidence for referentiality comes from experiments in which a dolphin shows the ability to choose successfully from three or more objects on a horizontal plane on the basis of pointing. In the case of the dog, we do not have such evidence yet, but dogs responding to other novel types of pointing gestures could give more support to this idea (Miklósi et al., in preparation).

The effect of experience and socialization

Given the peculiarities of the phenomenon, experience with humans could turn out to be crucial. Unfortunately, the picture is not clear because no systematic comparisons have been made. For example, language-trained apes should display superior abilities if daily communicative exchange is an important factor in pointing comprehension. To date, only Chantek has been tested systematically, and evidence from other individuals is contradictory (see above). Similarly, in the case of the wolf-dog comparison, human socialization did not result in comparable performances.

Interestingly, after learning to point to the hiding place, macaques have shown increased understanding of the human pointing gesture (Blaschke and Ettlinger 1987), and conditioned joint attention in the same species seems to facilitate comprehension of pointing (Kumashiro et al. 2003). Both results suggest the functioning of a more complex mental representational system behind such comprehension than one would assume on the basis of simple associative phenomena.

Nevertheless, it is likely that minimal experience with humans is needed to perform successfully, and extensive experience can facilitate performance in some cases, especially when the subject is faced with unfamiliar situations.

Future directions

Perhaps at this point we should admit that this review was elicited by our frustration in trying to make sense of recent experimental results in this field. Moreover, in the field of social cognition, we often note that researchers “pick” their favourite arguments and totally disregard the actual results of the experiments and their external validity. On this basis we agree with one of our reviewers, who said that the present state of this field does not allow for scientific arguments on social cognitive evolution. In this sense, this work should be treated as a provocative script providing a “mirror” to all those involved. We hope, however, that by summarizing most of the available data we will facilitate better-planned research strategies and more systematic experiments. A short list suggesting the most urgently needed studies follows.

Technical considerations:

  1. We have suggested a kind of categorisation of the gestures, but there is room for improvement. Such a categorisation would help to clarify how the gesture was actually executed and what the subject saw. Apparently there are cultural differences among humans, which cause no major problems in our conspecific communication but might make a difference if not controlled for in experimental conditions.

  2. The so-called probe procedure seems to be accepted by most laboratories. Although this makes it easier to account for the effect of generalization, researchers should provide data on first trials with the novel gesture and also show whether learning takes place over the testing procedure.

  3. As many assume that social experience is important, data on experience with humans should be provided. In the case of species comparisons, members of all species should experience a similar level of human contact, taking into account developmental effects. This problem is most evident in the case of ape-human comparisons, in which the performance of apes is compared to that of young children. In principle, differences in the results could always be attributed to differential social experience.
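
As an illustration of the reporting suggested in point (2), the following minimal Python sketch summarizes first-trial performance with a novel gesture and compares early with late probe trials for each subject. It is only a sketch under stated assumptions: the data, the subject labels and the function names are hypothetical and are not taken from any of the studies reviewed here, a two-choice task with a chance level of 0.5 is assumed, and first-trial choices of different subjects are treated as independent. Other statistical approaches may well be more appropriate for a given design.

```python
from math import comb


def p_at_least(successes: int, trials: int, chance: float = 0.5) -> float:
    """Exact one-tailed binomial probability of observing `successes` or more
    correct choices in `trials` independent trials at the given chance level."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def summarize_probe_trials(choices_by_subject, chance=0.5):
    """Summarize probe-trial data.

    `choices_by_subject` maps a subject label to a list of booleans
    (True = correct choice) in the order the probe trials were presented.
    """
    for subject, choices in choices_by_subject.items():
        if len(choices) < 2:
            raise ValueError(f"{subject}: at least two probe trials are needed")

    # First-trial performance: one data point per subject.
    first_trials = [choices[0] for choices in choices_by_subject.values()]
    first_correct = sum(first_trials)

    # Proportion correct in the first versus the second half of the probe
    # trials; a clear increase would hint at learning during testing.
    early_late = {}
    for subject, choices in choices_by_subject.items():
        mid = len(choices) // 2
        early = sum(choices[:mid]) / mid
        late = sum(choices[mid:]) / (len(choices) - mid)
        early_late[subject] = (early, late)

    return {
        "n_subjects": len(first_trials),
        "first_trial_correct": first_correct,
        "first_trial_p": p_at_least(first_correct, len(first_trials), chance),
        "early_late": early_late,
    }


# Hypothetical data for three subjects, six probe trials each.
example_choices = {
    "subject_A": [True, True, False, True, True, True],
    "subject_B": [True, False, False, True, True, True],
    "subject_C": [False, True, True, True, False, True],
}
summary = summarize_probe_trials(example_choices)
print(summary)
```

Reporting both quantities side by side makes it possible to judge whether above-chance performance was already present on first exposure to the novel gesture or emerged only in the course of testing.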

Suggestions for further research:

  1. Using monkey species that are known to differ in their social relationships, one can test the cooperative-competitive hypothesis further. For example, the cooperative behaviour of Tonkean (Macaca tonkeana) and rhesus macaques (M. mulatta) observed in captivity seems to mirror their dominance structure in the wild (Petit et al. 1992). Socialized Tonkean macaques should be superior to rhesus monkeys if social tolerance and a cooperative nature contribute to better performance in pointing tasks.

  2. The domestication hypothesis needs further parallels in which both wild and domesticated species are tested in a comparable manner (e.g. pigs versus wild boars). It should be emphasized that in this case the experience both with humans and with the experimental procedure should be the same for both species. Positive results could encourage direct comparison between domesticated species that originate from species of similar ecological background, as in dogs and cats or goats and horses. Accordingly, “domesticated” foxes (Trut 1999) should also perform better than non-domesticated ones when tested with distal pointing gestures (see also Hare et al. 2002), because this type of gesture proved very difficult for socialized wolves to learn (Miklósi et al. 2003).

  3. Interestingly, there are only a few studies with human children (e.g. Povinelli et al. 1997). Recently, we have started to re-run with children the experiments conducted with dogs, in order to collect data on the development of their pointing comprehension (Soproni 2004).

  4. One should investigate whether other faculties of social cognition can influence the understanding of pointing. For example, does the ability to establish joint attention, the sensitivity to attention cues (Gácsi et al. 2004; Xitco et al. 2004) or the skill of pointing enhance pointing comprehension?