Introduction

Pointing has been widely studied in humans as a ubiquitous referential gesture (Werner and Kaplan 1963; Butterworth 1998). Its development in infants (Bates 1976, 1979; Masur 1983; Desrochers et al. 1995), and its coordination with gaze and vocalization (Pechmann and Deutsch 1982; Dobrich and Scarborough 1984; Schaffer 1984; Zinober and Martlew 1985) have been considered foundational referential acts. Tomasello (1995) emphasized the dependence of referential communicative behavior on joint attention in particular. He argued that joint attention was more than simultaneous attention. As an example of simultaneous attention, he offered occasions in which the attention of two individuals was fortuitously drawn to the same stimulus, such as an unexpected sound. Tomasello suggested that joint attention was distinguishable from simultaneous attention by joint monitoring of each partner by the other.

Current discussions of pointing, gaze following and coordination, and joint attention in nonhuman primates, canines, and marine mammals demonstrate the difficulty in specifying what is understood when such behaviors occur (Kummer et al. 1996; Povinelli and Eddy 1996a, 1996b; Emery et al. 1997; Itakura and Tanaka 1998; Tomasello et al. 1998; Miklosi et al. 2000; Tschudin et al. 2001; Scheumann and Call 2004). However, truly referential pointing should be most directly related to the receiver’s attentional behavior (Bruner 1975, 1977; Pechmann and Deutsch 1982; Mangold and Pobel 1988; Tomasello 1995). Attention itself is a higher-order cognitive mechanism and of course cannot be directly observed. However, its presence is inferred, at least by human signalers, from a variety of behavioral, verbal, and physical cues. In particular, a referential signaler must be sensitive to the orientation of a receiver’s “forward-directed” sensory systems (Moore and Corkum 1994). For example, Call and Tomasello (1994) tested the effect of a receiver’s orientation on the pointing behavior of two orangutans that had been explicitly trained to point. One had participated in a long-term sign language study (Miles 1990), while the other had participated in a variety of studies on learning and problem solving. In this test, an experimenter placed two glasses of juice just beyond the ape’s reach. The experimenter then either faced the orangutan with eyes open, faced the ape but with eyes closed, turned his back to the ape, or left the room. Both orangutans pointed significantly more often in the eyes-open and eyes-closed conditions than in the other two conditions. Although the language-trained orangutan pointed significantly more often in the eyes-open condition than in the eyes-closed condition, the test-wise orangutan produced points equally often in both of these conditions. The test-wise ape also pointed, though at a lower frequency, in both of the other conditions.
In contrast, the language-trained orangutan produced only a single point on one trial when the experimenter turned his back, and on one trial when the experimenter left the room.

Xitco et al. (2001) reported the spontaneous emergence of behaviors that resembled referential pointing and gaze alternation for two adult male Atlantic bottlenose dolphins (Tursiops truncatus) participating in a symbolic communication project at Walt Disney World. In these studies, human trainers, wearing SCUBA gear, interacted with the dolphins underwater, searching for goal objects (i.e., foods, toys, and tools) randomly distributed within a 22-million-l, simulated coral reef environment. Goal objects were usually placed inside transparent containers that prevented the dolphins from gaining access to them without the use of a tool or assistance from their human companions.

The dolphins’ superior swimming speed enabled them to find many of the goal objects before their trainers. The dolphins began to spontaneously “point” at these objects, something that had never been previously observed in any context. While pointing, a dolphin would stop his forward progress, often less than 2 m from an object, and align the anterior-posterior axis of his body with the object for several seconds. The dolphin then alternated the direction of his head between the object and the trainer several times, as if to monitor the trainer, while maintaining the alignment of his body with the object. From their first occurrences, the dolphins’ pointing and monitoring behaviors appeared simultaneously and fully formed, suggesting that, although they had emerged to serve an instrumental function, shaping played little role in the development of the form of these gestures.

Clark (1978) argued that to be considered referential, gestures like pointing must be distinct from the act of attending in and of itself. Although it is possible that the dolphins were merely attending to the object and receiver in alternation, several aspects of their pointing and monitoring behavior suggest that something more was occurring. Previous experimental work on dolphin vision (Nachtigall 1986) and echolocation (Au 1993) suggests that the dolphins could both detect and discriminate the goal objects used in this study from a distance of many meters. Their close approach to indicated objects was not needed for their own perceptual benefit, and in fact may have made the objects more difficult to discern. The dolphin has a visual “blind spot” directly in front of its rostrum, and the mechanisms by which dolphin sonar might function at distances less than 1 m, within an acoustic near-field constrained by the velocity of sound in water, are not understood. Dolphin echolocation performance at ranges <1 m has not been studied, and may not be better than at ranges >1 m. In addition, when dolphins typically attend to objects, with either vision or echolocation, they do so while continuing to swim. In these studies, the dolphins’ stationary posture while pointing, and the effort they exerted to maintain the alignment of their bodies with objects while looking toward receivers were strikingly different from their normally fluid movement while searching for objects, and provided them no perceptual benefit. Lastly, the small, scanning head movements that often characterize the dolphin’s use of echolocation could sometimes be observed when the dolphins pointed to the goal objects or monitored receivers, but they were superimposed on the sweeping, exaggerated movement of the dolphin’s head between the goal object and receiver. Such large-scale head movements were not needed to maintain attention on targets in two directions. 
The dolphin’s laterally placed eyes give it a visual field extending over 270°, which would have allowed these dolphins to direct the central axis of their echolocation beam at one object while monitoring the other visually. The dolphins could have alternated their attention between object and receiver without moving their heads. On the whole, we believe the dolphins’ pointing and monitoring behaviors were more than inspection of the object and receiver in alternation with echolocation; combined with the dolphins’ stationary posture, they constituted a formal, communicative gesture.

In the studies reported by Xitco et al. (2001), the dolphins’ pointing and monitoring behavior was influenced by the presence of a receiver, and by the distance between the dolphin and receiver. When the presence and absence of humans were analyzed, it was found that the dolphins never pointed or performed behaviors resembling monitoring in the absence of a human receiver. When a human was present, the dolphins were sensitive to the proximity of the receiver. While pointing, the dolphins were more likely to include monitoring when apparently attentive receivers were far away, but to omit it when they were nearby.

Several instances were reported in which the link between the dolphins’ spontaneous pointing and monitoring behavior and the orientation of their human companions was clearly demonstrated by the failure of humans to respond to the gesture. On two occasions a dolphin, looking towards his inattentive companion after first pointing, was observed to wait and point again only after the human turned to face the dolphin. On two other occasions the dolphin stopped signaling when it became clear the human was continuing to engage in another activity. However, such clear examples were infrequent because the dolphins’ human companions were specifically charged with focusing their attention on the dolphins and interacting with them. Therefore, the present study was conducted, adapting the technique of Call and Tomasello (1994) used with orangutans, to test the effect of a receiver’s orientation on pointing in these same dolphins.

Methods

Subjects

The subjects were the two male Atlantic bottlenose dolphins (Tursiops truncatus), Bob and Toby, whose spontaneous behavior was described by Xitco et al. (2001). The dolphins were approximately 15 years old, and at the time of testing had been participating in daily research sessions for 8 years. Each consumed approximately 9.5 kg of food per day, composed of a mixture of herring, mackerel, capelin, sardine, night smelt, and silver smelt. The total quantity of food was delivered across four training/research sessions. Any food not received during or immediately following a session was offered at the end of the day, regardless of the dolphins’ performance during sessions.

Apparatus

Training and testing sessions took place in the simulated coral reef environment at the Living Seas, Epcot, Walt Disney World. One portion of the environment was sectioned off with a large, rigid fence that ran from the central underwater viewing area to the perimeter of the aquarium. This fence, the divider, was constructed principally of round fiberglass tubes, 4 cm in diameter, that ran vertically from the bottom of the aquarium to approximately 0.5 m above the water’s surface. The interval between vertical bars was 11.5 cm. As a result, the dolphins could not put their heads through the divider, but could readily inspect objects on the opposite side with either vision or echolocation.

A schematic of the apparatus is shown in Fig. 1. It consisted of three rectangular pieces of 1-cm-thick PVC plastic sheet. The base was 61 cm wide × 1.5 m long. Two circular holes, 12.5 cm in diameter, were cut through the base 17 cm from either end. The base piece was attached to two end pieces as depicted in Fig. 1, such that it rested 2.5 cm above the aquarium floor. The end pieces were 61 cm wide × 38 cm tall. During training and testing trials, the front edge of the apparatus was located 45 cm from the divider. The dolphins’ food was placed inside one of two clear, water-filled, snap-top, polypropylene jars, each of which was 10.5 cm in diameter and 27.5 cm in height. The dolphins could detect the food inside the jars using both vision and echolocation. A 0.91-kg lead weight measuring 6.5 cm wide × 7 cm long × 2 cm thick, was attached to the bottom of each jar.

Fig. 1A, B Schematic of apparatus. A An overhead view of the apparatus, and the positions of the dolphin and trainer. B The apparatus from the trainer’s side of the divider

Two video cameras were used to record test sessions. One experimenter, wearing SCUBA gear and located 3 m above and 4 m behind the apparatus, operated a Sony V801 Hi8 mm video camera in a hand-held Amphibico underwater housing. This camera recorded the behavior of both the dolphin and trainer when they remained near the apparatus. A second camera, a Subsea Video Systems SC 42, was mounted to the center of the apparatus. This camera was connected by an underwater cable to a Sony EV-C100 Hi8 mm video cassette recorder, located above the surface, that remotely recorded the output from the mounted camera. The mounted camera only provided a view of the dolphin’s behavior at the apparatus. The apparatus itself, including the jars, and the trainer were not visible. This was done so that coders could later view the dolphin’s behavior without being biased by the presence of a target, or the trainer’s behavior.

Definition of behaviors

The analysis focused on two behaviors: (1) pointing, and (2) monitoring. Pointing was defined as the alignment of the anterior-posterior axis of the head and body with one of the jars while remaining stationary for approximately 2 s or longer as determined by the coders. Monitoring was defined as the rotation of the head approximately 45° or more, as determined by the coders, towards the trainer while maintaining the alignment of the body with the jar.

Training

During each session, the dolphins were released into the section of the aquarium bounded by the divider fence. A trainer, wearing SCUBA gear and carrying approximately 2.64 kg of the dolphins’ food in a visually opaque bait bucket, then dove to the apparatus, located on the opposite side of the divider. The trainer knelt on the bottom behind the center of the apparatus, and then summoned one dolphin with a gestural/acoustic cue. The other dolphin remained at the surface, and was engaged in other activities by a second trainer. At the beginning of a trial, the experimental trainer placed both clear jars on the center of the apparatus, unsnapped the tops, and then placed approximately 0.22 kg of food in one of the jars. The trainer then snapped the tops, and simultaneously placed one jar at each end of the apparatus. After placing the jars, the trainer looked at the dolphin. If the dolphin pointed to the jar containing food, the trainer retrieved the jar and gave the food to the dolphin. When the dolphin finished consuming the food, the trainer retrieved the other jar, and began the next trial. If the dolphin pointed to the empty jar, the trainer picked up the empty jar and showed it to the dolphin, turning it upside down to demonstrate that it was empty. The trainer then retrieved the other jar, and after a 10-s pause, proceeded to the next trial without opening the jars. The position of the jar containing food on each trial was determined randomly before the start of the session, and written down on a submersible slate attached to the apparatus. During training and testing, the dolphin’s response was judged by the trainer, who recorded the outcome of the trial on the slate. During training, trials were run with one dolphin until a total of six correct responses were made or the trainer’s supply of food for that dolphin was depleted. The dolphins then switched places, and the second dolphin was run through a training session using the same procedure.

Each dolphin was given a training session of up to 12 trials once per day and up to 5 days per week, until they made no errors (i.e., pointing at the empty jar) across three consecutive 6-trial sessions. After satisfying this criterion, performance was maintained, until sufficient personnel were available to conduct testing, by providing a single 6-trial training session once per week for three weeks. In total, the training period lasted approximately 1 month.
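The training criterion described above (no errors across three consecutive sessions) can be written as a simple rule over per-session error counts. The following is a minimal sketch only: the function name `sessions_to_criterion` and the example error counts are hypothetical illustrations and are not data from this study.

```python
def sessions_to_criterion(errors_per_session, run_length=3):
    """Return the 1-based number of the session on which the criterion is met.

    The criterion is `run_length` consecutive sessions with zero errors.
    Returns None if the criterion is never met.
    """
    consecutive = 0
    for number, errors in enumerate(errors_per_session, start=1):
        # An errorless session extends the run; any error resets it.
        consecutive = consecutive + 1 if errors == 0 else 0
        if consecutive == run_length:
            return number
    return None


# Hypothetical error counts (points to the empty jar) for nine sessions.
example_errors = [1, 2, 0, 1, 0, 1, 0, 0, 0]
print(sessions_to_criterion(example_errors))  # prints 9
```

With these made-up counts, the run of errorless sessions is reset by each error and the criterion is first satisfied on the ninth session.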

Testing

Each test session consisted of three training trials, identical to those used previously, and three test trials. Test trials included one trial from each of three test conditions. In the face-forward condition, the dolphin was presented with an apparently attentive receiver, but one whose behavior was inconsistent with the response established on training trials. The trainer placed the jars on the apparatus, and then looked at the dolphin for 30 s without making any response. Another experimenter provided an acoustic cue to signal the trainer that 30 s had elapsed. The trainer then responded to the dolphin’s next point as on training trials. The other two conditions presented the dolphin with a potential receiver, but one whose orientation was not conducive to detecting signals. In the back-turned condition, the trainer turned his back to the dolphin after placing the jars on the apparatus. After 30 s, the trainer turned to face the dolphin, and responded to his next point. In the swim-away condition, the trainer placed the jars, and then turned and swam away from the dolphin and hid behind a nearby, low-lying reef located 4.9 m from the apparatus. After 30 s, the trainer emerged from behind the reef, returned to the apparatus, and responded to the dolphin’s next point. Test trials were presented on trial numbers 2, 4, and 6, and alternated with training trials. The order of test trials was random, with the constraint that each type of test trial occurred equally often on trials 2, 4, and 6 over the course of the test. Trials were given from a predetermined schedule that randomly assigned the position of the jar containing food. Twelve test sessions were conducted with each dolphin, one per day, up to 5 days per week, for approximately 2 weeks.
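One way to build a schedule that satisfies the stated constraint is sketched below. The actual randomization procedure is not described in the text, so this is an assumption: the sketch uses each of the six possible orders of the three conditions twice, which guarantees that every condition falls on trials 2, 4, and 6 exactly four times across 12 sessions. The names `make_schedule`, `CONDITIONS`, and `TEST_POSITIONS`, and the "left"/"right" labels for the food position, are hypothetical.

```python
import random
from itertools import permutations

CONDITIONS = ("face-forward", "back-turned", "swim-away")
TEST_POSITIONS = (2, 4, 6)  # test trials alternate with training trials


def make_schedule(n_sessions=12, seed=None):
    """Return a list of sessions; each session is a list of six trial dicts."""
    rng = random.Random(seed)
    orders = list(permutations(CONDITIONS))  # the 6 possible condition orders
    if n_sessions % len(orders) != 0:
        raise ValueError("n_sessions must be a multiple of 6 for exact balance")
    # Use every order equally often, then shuffle the order of sessions.
    session_orders = orders * (n_sessions // len(orders))
    rng.shuffle(session_orders)

    schedule = []
    for order in session_orders:
        test_by_trial = dict(zip(TEST_POSITIONS, order))
        trials = []
        for trial in range(1, 7):
            trials.append({
                "trial": trial,
                # Trials 1, 3, and 5 are ordinary training trials.
                "type": test_by_trial.get(trial, "training"),
                # Position of the baited jar, assigned at random per trial.
                "food_side": rng.choice(("left", "right")),
            })
        schedule.append(trials)
    return schedule


for trial in make_schedule(seed=1)[0]:
    print(trial)
```

Using all six orders is stricter than the constraint requires; any set of 12 orders in which each condition appears four times at each test position would also qualify.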

Data coding and reliability

Because the dolphins’ behaviors are unfamiliar to most readers, the method used to measure their occurrence is described here in some detail. Three coders familiar with the dolphins’ spontaneous pointing and monitoring behavior were shown video tape depicting one training trial, viewed from both the hand-held and mounted cameras. They were then told that they would be independently recording the behavior of the dolphins, as seen from the mounted camera, for the first 30 s of test trials that were similar to the training trial they had just viewed. The footage for each test trial that followed began with a freeze frame of the trainer’s hand in front of the camera, and went to black 30 s after the trainer’s hand moved. There was a 10-s pause between trials. The coders were instructed to keep a written tally, in real time, of the number of points the dolphin directed at either of the jars, and the number of times the dolphin monitored the trainer while maintaining the alignment of his body with one of the jars. They were told to maintain their gaze on the video image throughout the 30-s interval, and so could not look down at their recording sheets as they were writing during the trial. They were further instructed to infer the position of the jars and the trainer, which were not visible, on the basis of the training trial they had been shown. In addition, they were asked to note at the end of the trial if the dolphin had left the immediate vicinity of the apparatus at any time during the 30-s interval.

It was decided a priori to give the coders practice writing their scores while simultaneously viewing the trials. The coders were not informed of this so that an unbiased measure of task difficulty and intra-coder reliability could be generated. Therefore, after the coders viewed the first ten trials they were told to cross out these “practice” scores. The videotape was rewound, and without additional discussion the coders were instructed to watch the sequence depicting the training trial again, and to score each of the test trials that followed. The coders then scored all test trials, 36 for each dolphin, in the same order that the trials were run.

Intra-coder reliability was assessed by comparing the frequency of points and monitoring reported for each of the first 10 trials during the practice run with those reported for the same trials on the complete run of all 72 trials. The counts on the practice run were significantly related to those on the complete run for both points (r(28)=0.96, P<0.001) and monitoring (r(28)=0.93, P<0.001). Inter-coder reliability was assessed by comparing the frequency of points and monitoring reported by each coder for the same trial during the complete run. The counts reported by each coder were significantly related to those reported by each of the other coders for both points (r(70)=0.92, P<0.001) and monitoring (r(70)=0.83, P<0.001). In all subsequent analyses, the mean number of points and monitoring reported by the three coders was used for each trial. Intra-coder agreement between the practice run and the complete run, and inter-coder agreement during the complete run for the measure of whether or not the dolphin left the test apparatus during the 30-s interval was 100%.
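The reliability measures above are correlations between paired counts. A minimal sketch of the computation follows, assuming a Pearson product-moment correlation (the text reports r values but does not name the coefficient). The function names and the example counts are hypothetical and are not the study's data. The degrees of freedom reported in the text are consistent with n − 2 for 30 paired counts (three coders × 10 practice trials) and for 72 trials, although how the pairs were pooled is not stated.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of counts."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return sxy / math.sqrt(sxx * syy)


def degrees_of_freedom(n_pairs):
    """Degrees of freedom for testing a correlation: number of pairs minus 2."""
    return n_pairs - 2


# Hypothetical point counts from two coders for the same six trials.
coder_a = [3, 5, 1, 0, 4, 2]
coder_b = [3, 4, 1, 1, 4, 2]
print(round(pearson_r(coder_a, coder_b), 2))
print(degrees_of_freedom(30), degrees_of_freedom(72))  # prints 28 70
```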

Results

Training

Since the end pieces of the apparatus obscured the dolphin’s view of the jars from the side, the dolphins adopted a position directly opposite and slightly above the trainer, centered on the apparatus, without explicit training, from the first trial. Both dolphins also spontaneously pointed at food placed in jars on the apparatus from the first trial, using their established gesture in this new context without other prompting or cueing. They rapidly met the training criterion of three consecutive six-trial sessions without error after nine sessions. A total of three additional six-trial sessions were run over a 3-week period to maintain the dolphins’ performance until testing could begin.

Testing

During testing, Toby pointed to the jar that contained food on all 36 training trials and on 35 of 36 test trials. Bob pointed to the jar that contained food on 29 of 36 training trials, and on 34 of 36 test trials. All of Bob’s errors were points to the jar on his left.

The frequency of dolphin points to objects and monitoring of the trainer, as a function of test condition, is presented for both dolphins in Fig. 2. Separate two-way analyses of variance were performed for points and monitoring. For points, there was a significant effect for test condition (F(71)=146.82, P<0.001). Subsequent t-tests indicated that each dolphin pointed more often in the face-forward condition than in the back-turned condition (Bob, t(23)=6.56, P<0.001; Toby, t(23)=3.99, P<0.001), and pointed more often in the back-turned condition than in the swim-away condition (Bob, t(23)=5.01, P<0.001; Toby, t(23)=5.15, P<0.001). The dolphins rarely pointed in the swim-away condition (M=0.92 points, SD=0.68). When the trainer was actually hidden behind the reef, Toby produced a total of only three points. He pointed once on one trial, and twice on another. Bob produced a total of ten points, pointing once on six trials, and twice on two trials.

Fig. 2 Frequency of dolphin points and monitoring. The mean number of behaviors is shown for each dolphin as a function of three test conditions: (1) face-forward (FF), (2) back-turned (BT), and (3) swim-away (SA)

There was a significant two-way interaction for frequency of pointing between dolphin and test condition (F(71)=5.28, P<0.01). Toby pointed significantly more often than Bob in the back-turned condition (t(23)=2.70, P<0.02), whereas Bob pointed more often than Toby in the face-forward and swim-away conditions, although not significantly so.

For monitoring, there was a significant effect for test condition (F(71)=13.45, P<0.001). Toby monitored more often in the face-forward condition than in the back-turned condition (t(23)=2.68, P<0.02), and monitored more often in the back-turned condition than in the swim-away condition (t(23)=2.55, P<0.02). However, none of the pair-wise comparisons between conditions yielded a significant difference in the frequency of monitoring produced by Bob, and the two-way interaction between dolphin and condition was not significant (F(71)=0.965, P=0.39).

The numbers of points and monitoring produced by the dolphins were plotted as a function of trial number to assess whether the dolphins’ behavior changed with repeated exposure to the test conditions. No significant effects of trial were found for Toby for any of the test conditions. For Bob, there were significant changes in the frequency of points (r(11)=0.64, P<0.025) and monitoring (r(11)=−0.62, P<0.025) produced on face-forward trials, raising the possibility that the observed differences between the test conditions for Bob were a result of learning during the test. However, closer examination of the data suggests that this was not the case. The frequency of Bob’s points and monitoring in each test condition as a function of trial is shown in Fig. 3. Bob pointed most often in the face-forward condition at all times during the test. This difference increased as the test progressed. For monitoring, the most dramatic differences between the face-forward condition and the other conditions were found during the early part of the test.

Fig. 3 Bob’s points and monitoring by trial. The number of points and monitoring is shown over 12 trials as a function of three test conditions: (1) face-forward (FF), (2) back-turned (BT), and (3) swim-away (SA)

Both dolphins were significantly more likely to leave the vicinity of the test apparatus during the 30-s interval for some test conditions compared to others (Bob, χ²(2)=8.96, P<0.02; Toby, χ²(2)=11.47, P<0.01). The dolphins were significantly more likely to leave the test apparatus on back-turned trials than on face-forward trials (Bob, χ²(1)=7.36, P<0.01; Toby, χ²(1)=7.00, P<0.01). Bob left on only one face-forward trial, but left on 10 of the 12 back-turned trials. Toby never left on face-forward trials, but left on 7 of the 12 back-turned trials. Both dolphins left the apparatus on all 12 swim-away trials.
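The chi-square values above can be reproduced from the counts given in the same paragraph. The text does not name the specific test, so the sketch below rests on an assumption: a goodness-of-fit chi-square on the number of trials on which each dolphin left, with equal expected counts across the conditions being compared. Under that assumption the computed statistics match the reported ones. The function name `chi_square_gof` is hypothetical.

```python
def chi_square_gof(observed):
    """Goodness-of-fit chi-square against equal expected counts."""
    total = sum(observed)
    if total == 0:
        raise ValueError("at least one observed count must be non-zero")
    expected = total / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


# Number of trials (out of 12 per condition) on which each dolphin left.
left_counts = {
    "Bob": {"FF": 1, "BT": 10, "SA": 12},
    "Toby": {"FF": 0, "BT": 7, "SA": 12},
}

for name, c in left_counts.items():
    overall = chi_square_gof([c["FF"], c["BT"], c["SA"]])  # 2 degrees of freedom
    pairwise = chi_square_gof([c["FF"], c["BT"]])          # 1 degree of freedom
    print(f"{name}: overall={overall:.2f}, FF vs BT={pairwise:.2f}")
# Bob: overall=8.96, FF vs BT=7.36
# Toby: overall=11.47, FF vs BT=7.00
```

The degrees of freedom are the number of conditions compared minus one, which agrees with the values of 2 and 1 reported for the overall and pairwise comparisons.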

Discussion

Manipulation of the receiver’s attentional behavior, orientation and position had a striking effect on the dolphins’ pointing and monitoring behavior. The dolphins pointed more often in the face-forward condition, when the trainer’s orientation was consistent with that of an attentive receiver, than they did in the back-turned condition, when the trainer could not have detected the dolphins’ gestures. In addition, Toby monitored the trainer more often in the face-forward condition than the back-turned condition, suggesting that he spent more time monitoring trainers when their orientation was inconsistent with their response, relative to that established on training trials (i.e., immediately opening the jar). Bob monitored the trainer most often during the earliest face-forward trials, when the condition was most novel. It is unlikely that these dolphins had experienced situations like the face-forward condition before; prior to the test sessions, humans with food had always interacted with them. The back-turned condition was also relatively novel. A trainer interacting with a dolphin searching for goal objects might turn away from the dolphin momentarily, in order to access an object or search a location. But the trainer generally resumed the interaction within a few seconds. In contrast, the dolphins likely had a great deal of experience with situations analogous to the swim-away condition during interactive sessions. The dolphins rarely pointed in the swim-away condition, when trainers clearly were not attending to the dolphins’ gestures.

The dolphins’ performance was similar in many respects to that of the orangutans tested by Call and Tomasello (1994). The face-forward and swim-away conditions used with the dolphins were analogous to the conditions that prompted the highest and lowest frequencies of pointing by the orangutans. Despite its surface resemblance to the away condition used by Call and Tomasello (1994), the back-turned condition used with the dolphins might more appropriately be considered as an intermediate version of two conditions presented to the orangutans. In the orangutans’ away-condition, an experimenter placed the objects on the apparatus, walked several meters from the apparatus, and turned their back to the ape. In their eyes-closed condition, the experimenter placed the objects and then closed their eyes and sat at the apparatus facing the orangutan. The dolphins’ back-turned condition changed the receiver’s orientation, but kept the receiver in close proximity to the objects and the signaler. Regardless of its relevance to the issue at hand, an eyes-closed condition, similar to that used with the orangutans, was not warranted in the present context with the dolphins. Although dolphins might be able to determine whether a trainer’s eyes were open or closed in the air using vision, it seems very unlikely that they would do so spontaneously underwater for a trainer wearing a diving mask. At a distance of more than a few meters it is not an easy task for humans, and human vision is superior to that of dolphins (Dawson et al. 1981). The mask darkens the diver’s face, light reflects off of the face plate, and aquarium water is not as transparent as air. It is likely that dolphins rely on sonar to determine the orientation of a diver’s head while underwater. The air pocket between the face plate and the diver’s head is likely to be an especially salient stimulus to an echolocating dolphin, because of the relatively large impedance mismatch between air and water. 
In a similar and perhaps even more striking manner, the presence of a large, metal, air-cylinder on the diver’s back may be used by dolphins to determine the diver’s position and orientation. Although it has not been directly reported in the literature, based on comparable stimuli (Au 1993) it is likely that a dolphin can detect a diver facing them at distances well over 100 m with echolocation. With such salient and reliable cues available to them, it seems unlikely that Bob and Toby use vision to check the status of their trainers’ eyelids when monitoring the trainers’ attentional behavior underwater.

Both orangutans in the Call and Tomasello study produced more points in the eyes-closed condition than in the away condition, although the language-trained orangutan typically pointed only once in the eyes-closed condition, whereas the test-wise orangutan pointed with the same high frequency that it had in the eyes-open condition. The dolphins pointed at a moderate frequency in the back-turned condition. However, the tendency of the dolphins to leave the test area on back-turned and swim-away trials, but not on face-forward trials, suggests that the dolphins treated the back-turned condition differently from the face-forward condition, just as the language-trained orangutan discriminated between the eyes-closed and eyes-open conditions. Thus, the dolphins’ overall performance on the back-turned condition suggests that they were responding more like the language-trained ape than the test-wise one. It is worth noting that (1) the dolphins’ level of test sophistication was comparable to that described by Call and Tomasello (1994) for the test-wise orangutan, and (2) the dolphins were actively participating in a symbolic communication project, although they had yet to achieve the level of sophistication reported for the language-trained orangutan (Miles 1990). Call and Tomasello (1994) suggested that the difference in performance between the two orangutans might be a result of their different research histories. They characterized the language-trained ape’s pointing as consistent with an understanding of others as attentional agents, whereas the test-wise ape’s use of gestures was referential, but at a more rudimentary level adequately accounted for by conditioning. The dolphins’ performance supports the possibility that exposure to a symbolic communication system may facilitate the appreciation of others’ perspectives (see Kuczaj and Hendry 2003, for a more detailed consideration of the role of language enculturation on animal thinking).
Language enculturation, and human enculturation in general, may have a significant impact on behavior for some species (for example, dogs: Soproni et al. 2001, 2002; Hare et al. 2002), but not others (for example, chimpanzees: see Leavens and Hopkins 1999, for review; fur seals: Scheumann and Call 2004; and wolves: Miklosi et al. 2003).

Corkum and Moore (1995) and Tomasello (1995) have proposed a transition from referential pointing based on conditioning to that which is purposeful and subsequently guided by an understanding of the attentional state of others. The timing and speed with which human infants move through these steps are not yet fully understood. Depending on innate and environmental influences, other species might proceed at a slower pace, or be limited to something less sophisticated than the referential behavior achieved by older infants (e.g., see Reaux et al. 1999, for a discussion of such limitations in chimpanzees). The results of the present study, combined with those reported by Xitco et al. (2001), help to better define the referential nature of the dolphins’ spontaneous pointing, and establish where their behavior lies along the continuum of referential behavior. In itself, even the mature referential gesture of infants does not demonstrate that infants understand mental states in others, but is instead one indicator of such a capacity (Tomasello 1995). Tomasello (1995) proposed that learning behavior through delayed imitation suggested an understanding of the self/other distinction, a precursor to understanding others as mental agents (Wellman 1993; Moore and Corkum 1994). Like apes (Tomasello et al. 1993), dolphins have demonstrated that they can learn actions through imitation (Xitco 1988; Herman 2002), and that they can recognize themselves in mirrors (Reiss and Marino 2001). The presence of such capacities in dolphins suggests that direct tests of knowledge attribution should be considered.