Introduction

It has been almost 100 years since the pioneering work of Teuber (1915), Köhler (1917, 1921, 1922, 1925), Kohts (1924), Tinklepaugh (1928, 1932), and Yerkes (1927, 1928) gave birth to an incredibly generous and fruitful agenda: the characterization of the nonhuman primate mind. This research provided an original picture of the mental capacities of apes, and launched a promising line of evolutionary investigation into the understanding of the human mind. This groundbreaking work focused on the nonhuman primate capacities for intelligent behavior. In most instances, these researchers were interested in the chimpanzee’s and the monkey’s capabilities that related to their understanding of entities and their relations in the world, such as delayed reaction to representative factors, memory for objects, reasoning about causation and means-ends, “insight”, etc. Several years later, in the same vein as Köhler’s insight experiments, Premack and Woodruff’s study of Sarah (Premack and Woodruff 1978) provided the core data for goal attribution in chimpanzees. In these experiments, Sarah was shown videotapes of a person trying to solve a problem, and then she had to select from a set of photographs the “solution” to the problem. Sarah tended to pick the “correct” photograph.

On the basis of this study, several prominent researchers in theory of mind would like to believe that at least chimpanzees attribute goals to others (e.g., Leslie 1994, 1995; Baron-Cohen 1995). However, Premack and Woodruff’s study has been criticized along several dimensions and, since then, efforts in this domain of inquiry have either failed to show evidence of chimpanzee discrimination of intentional and accidental behaviors (Povinelli et al. 1998) or showed questionable evidence for the discrimination between intentionally or accidentally marked objects by chimpanzees and orangutans (Call and Tomasello 1998).

In light of these and other data, it remains unclear whether chimpanzees (and possibly other apes) may have an understanding of the mind in terms of being capable of mental state attribution, or whether they may have a thin concept of “goal-directed behavior” that is learned in their social environments. In addition, it becomes particularly difficult to develop a theoretical framework on theory-of-mind abilities when the objective is to compare those of human children to those of nonhuman primates. For example, nonhuman primate tasks test for theory-of-mind abilities in apes with training techniques that are less than ideal. Animals are generally trained in particular task requirements, and this can obscure issues about underlying competencies. For instance, during training the animal might acquire a set of specific heuristics that enable him to solve the task, in which case success on the task does not really reveal a cognitive ability that may exist spontaneously. Alternatively, the training might lead the animal to produce stereotyped responses in the test phase, masking the spontaneously available capacity. In addition, these tasks make use of apes that have either been extensively trained in other cognitive tasks or have been exposed to a language-like system. Thus, it is often difficult to determine whether the apparent similarities and differences in theory-of-mind capabilities are genuine or the result of methodological differences.

Alternatively, one could adopt methodologies traditionally used in the developmental work with nonlinguistic babies to assess precursory theory-of-mind abilities in nonlinguistic animals. Extensive work by Gergely and colleagues, and by others using the looking time methodology, indicates that young infants have the capacity to attribute goals (Leslie and Keeble 1987; Gergely et al. 1995; Gergely and Csibra 1997; Csibra and Gergely 1998; Woodward 1998; Csibra et al. 1999; Woodward and Sommerville 2000; Sodian and Thoermer 2001). The looking time methodology has been employed in infant cognition for several decades. It exploits the infants’ propensity to stare longer at events that violate their understanding than at events that accord with their understanding. The first looking time experiment on goal attribution in infants was carried out by Gergely and colleagues (Gergely et al. 1995). In this experiment, 12-month-old infants were habituated to computer-generated stimuli in which a small circle (ball) moves in a parabolic trajectory over a rectangle (barrier) and then touches a big circle (big ball). After habituation, babies were shown one of two conditions in which there was no barrier, but the small ball still touched the big ball. In condition 1 the small ball follows the same trajectory as in habituation trials; in condition 2 the small ball moves in a straight line. The researchers found that the babies’ looking times in the first condition were longer than those in the second condition, despite the fact that the motion in the second condition was novel, and that babies generally look longer at novel events. These results have been interpreted as showing that young babies attribute goals (Baron-Cohen 1995; Gergely et al. 1995; Premack and Premack 1997; Csibra et al. 1999).

The looking time methodology has been recently adopted from the child development literature to test nonhuman primates in a series of different cognitive tasks. The advantage of this methodology for comparative studies is twofold. One, it does not require any training, and it allows for the assessment of spontaneously available cognitive abilities, namely, abilities that exist in the absence of formal training. Two, it does not require any linguistic mediation that could otherwise interfere with this assessment. To date, the topics tested with this methodology include numerical understanding (Hauser et al. 1996; Uller 1996; Uller et al. 2001), speech perception (Ramus et al. 2000; Hauser et al. 2001), and object individuation and object relations (Uller et al. 1997; Munakata et al. 2000; Cacchione and Krist 2003).

The study by Cacchione and Krist (2003) is particularly relevant here because it reports experiments with chimpanzees with the looking time method in computer generated tasks. They showed chimps video clips of objects and tested for their understanding of object relations and support. The chimps performed well in the tasks, namely, like human infants, chimps expected solid objects not to go through other solid objects, or float in the air, and when an unexpected event was shown, they looked significantly longer than when an expected event was shown. This study presents both theoretical and methodological relevance to the present study. Methodologically, the use of the looking time measure in a computer-generated task yielded interpretable data. Thus, the method is useful across different populations of chimpanzees. Theoretically, the use of 2D images to represent real 3D objects proved useful. At a very minimum, on the basis of the results of Cacchione and Krist (2003), one can argue that the chimpanzees understood solidity and continuity of objects, determinant for an understanding of the physicality of objects, criteria needed to succeed in the present task.

Recent work in the domain of goal attribution using the looking time method tested chimpanzees (Uller and Nichols 2000; see also the retraction Uller 2001) in a version of the computer-generated task developed for human infants (Gergely et al. 1995). However, there were inadvertent variations in the durations and presentations of the trials in that experiment, and the study was retracted (Uller 2001). In that study, the durations of the trials were not timed in a constrained fashion: some of them were shorter or longer than 10 s, not exactly 10 s; the lighting conditions were not ideal; and the adult chimps were allowed to roam around the testing enclosure, which made some of the trials longer than others.

In the present study, a metronome was added to the experimental setup to constrain the durations of the trials and to standardize the presentation of the trials across subjects. The room lighting was also improved: four sets of overhead lights installed in the ceiling illuminated the chimps’ faces, which improved the quality of the videotape. Instead of adult chimps, we tested infant chimps. The chimps were physically constrained, either sitting in a high chair or in the caretaker’s lap, and therefore could not roam around the room.

Methods

The task used in the retracted paper (Uller and Nichols 2000, see also the retraction Uller 2001) was used here to test infant chimpanzees, employing the same software as before. The task was composed of a familiarization phase and a test phase. The familiarization phase contained four trials instead of two trials as in the original study to provide the infant chimps with more exposure to the stimuli: the adult chimps tested before had experience with TVs and monitors; these infant chimps had none. The test phase contained two trials as in the original study. The familiarization trials showed a block which moves above a “barrier” and makes contact with a ball. Following those, the test trials showed no barrier in between the block and the ball. The trajectory of the block towards the ball is either the same as in familiarization trials (parabolic) or it is a straight line towards the ball. If the chimpanzees interpret the block as trying to achieve a goal (i.e., get to the ball) in the most direct way, then they should look longer at the old trajectory event (parabolic) than the new trajectory event (straight line) because, when the barrier is removed, the old trajectory does not provide the most direct way to get to the ball. If, on the other hand, the chimpanzees simply prefer perceptual novelty, then they should look longer at the novel trajectory event than at the old trajectory event.

Subjects

Four chimpanzees ranging in age from 5 months 15 days to 10 months 25 days at the time of testing (one male, three females) were the subjects in this experiment. The chimps were housed in the new nursery in the Life Sciences Building at the New Iberia Research Center (NIRC), Louisiana, USA. The chimps were nursery reared from birth, and lived in this area since birth. They were fed two milk feedings per day, and they had free choice of monkey chow, various fruits, and ad libitum water. They had positive human interaction with experimenters, behavioral science staff and caretakers. Access to an exercise and enrichment room was provided 8 h per day, and for the rest of the time, they were group housed with same-age peers. The chimps were tested in the nursery room in which they spent most of the day. Table 1 contains information about the infant chimps.

Table 1 Age of chimpanzees tested as a function of their date of birth (DOB). The same four chimps were tested in both conditions. The inter-condition interval (~4–6 weeks) was applied to allow the chimps to be re-tested (cf. Uller et al. 2001 for inter-condition testing of cotton-top tamarins). Half the subjects were tested in the experimental condition first, and half of the subjects were tested in the control condition first.

Materials and procedure

A Compaq DeskPro EN Series Pentium II computer with Java capabilities and a Viewsonic Graphic Series G773 17” color monitor (viewing region: 33.0 cm×24.0 cm) were used to present the stimuli. The stimuli were generated by a Java application using JDK 1.1.7 accelerated by Symantec JIT compiler. The stimuli consisted of two “objects”, a blue block measuring 2.0 cm×0.5 cm and a red ball measuring 1.5 cm in diameter. In all trials, the block moved across the screen and touched the ball. In the familiarization trials, there was a brown “wall” measuring 4.0 cm×1.5 cm. The events happened on top of a 0.2 cm×30.0 cm line that indicated the “floor”. The ball was 0.5 cm from the left end of the line, and at the beginning of the event, the block was 28.0 cm from the left end of the line. For all trials, the trajectory was 26.0 cm long and happened in the same direction, from right to left. The event lasted for approximately 2.5 s, and the inter-stimulus interval lasted approximately 0.1 s.

The setup of the experiment was prepared at the leftmost end of the nursery room mentioned above. The room measured 360 cm×540 cm. The computer monitor was placed on a cart that measured 60 cm×40 cm×80 cm, and the cart was located approximately 150 cm from the wall, facing it. The chimpanzee sat in front of the monitor screen approximately 70 cm away to watch the stimuli. For those chimps that required sitting with the caretaker, a chair with a seat elevated 60 cm from the ground was placed against the wall facing the monitor. For those chimps that did not require sitting with the caretaker, a baby high chair was placed against the wall facing the monitor. The high chair was adjusted to the same height as the chair used by the caretaker, so that all infant chimps had the same eye level when facing the screen. Attached to the wall there was a mirror measuring 60 cm×60 cm, situated approximately 100 cm from the ground. The purpose of the mirror was to provide a view of the monitor screen to the experimenter, situated 120 cm behind the monitor, so that she could control the presentation of the trials without the chimp seeing her. This setup was used because it provided no cues to the chimp being tested.

A metronome set to 66 beats per min timed the trials. The trials were therefore timed for approximately 10 s, counted as “start”, “1, 2, 3”, etc., “stop”: the experimenter said start, counted 1–10, then said stop.
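As a rough check of this timing, the sketch below converts the metronome rate into a trial duration. It assumes that each spoken word ("start", the counts 1 to 10, and "stop") falls on one beat, which gives 12 beats and therefore 11 inter-beat intervals. The function name and this beat-counting scheme are illustrative assumptions, not part of the reported procedure.

```python
def trial_duration_s(beats_per_min: float, n_intervals: int) -> float:
    """Seconds spanned by n_intervals inter-beat intervals of a metronome."""
    return n_intervals * 60.0 / beats_per_min


# Assumption: "start", 1..10 and "stop" each fall on a beat,
# i.e. 12 beats separated by 11 intervals.
print(round(trial_duration_s(66, 11), 2))  # 10.0

# If only 10 intervals separated "start" from "stop", the trial
# would be somewhat shorter, which fits the "approximately 10 s" wording.
print(round(trial_duration_s(66, 10), 2))  # 9.09
```

Under either reading, the trial length is fixed by the metronome rather than by the animal's behavior.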

A VHS video camera was used to record the sessions by capturing the gaze direction of the chimps. The camera lens was approximately 80 cm from the head of the chimp. The camera controller situated herself right behind the monitor. The experimenter presented the stimuli and also counted out loud the number of seconds in each trial, following the beats of the metronome. After 10 beats at 66 beats/min had elapsed, the experimenter ended the trial by saying stop. The camera controller interrupted the videotape soon thereafter. Together with the experimenter and the camera controller, an animal caretaker stayed with the infant chimp during testing, whenever necessary.

At the beginning of each session there was a calibration period. The camera controller shook a set of keys lengthwise along the display to (1) indicate the direction of gaze of the chimp’s eyes for the sake of the videotape coder, and (2) attract the chimp’s attention to the monitor screen. This generally attracted the interest of the chimp at the beginning of each trial. When a trial was initiated at the pace of the metronome but the chimp was not looking at the monitor, or did not sit still in the caretaker’s lap or in the baby high chair and paid no attention to the monitor screen, the session was nevertheless carried out until the end. The videotape was viewed immediately afterwards to evaluate whether the session had yielded codable data. In seven (out of eight) sessions, the chimps were not run again. In one session, the chimp was run again in the same condition approximately 12 days later so that a whole session in which the infant chimp sat through and watched all events could be obtained.

There were two conditions in this experiment: an experimental condition and a control condition. In each condition, the chimpanzees were shown four familiarization trials and two test trials. Familiarization trials were exactly the same within condition. The two test trials were exactly the same across conditions. The chimpanzees were allowed to look at the stimulus for the time between start and stop, after which the trial was terminated. The delay between trials varied according to how long it took for the chimpanzee to attend again. Figure 1 illustrates familiarization and test trials for both conditions.

Fig. 1 Graphic representation of stimuli

Pre-testing phase

The infant chimp sat in the high chair or in the caretaker’s lap to watch screensavers for 3 min a session. This phase was developed to (1) attract the attention of the chimpanzee to the monitor screen and (2) make sure the chimpanzee knew that interesting moving objects appeared on the monitor screen. As these infant chimps had never seen a TV screen before, it was necessary for them to undergo a “screen habituation” phase.

The first pre-testing phase consisted of 3 min/day sessions presented 4 days in a row. The stimulus presented to the chimpanzee in this first phase was the flower-ball Microsoft screensaver. Following this phase, a second pre-testing phase was carried out. This phase consisted of 3 min/day sessions presented 2 days/week for 2 weeks. The stimuli were moving fish and deep-sea life from a Microsoft screensaver. The onset of the presentation of the stimuli was parallel to the start of counting. The stimuli stayed on until 3 min had elapsed.

Testing phase

Familiarization

The block’s trajectory was the same in the familiarization trials for both conditions. It moved on a straight line for 3.0 cm, and then it began a parabolic trajectory, landing 7.0 cm from the left end of the line, and continued in a straight line until it touched the ball. In the experimental condition, the wall was 13.5 cm from the left end of the line (barrier), so the block went over the wall and touched the ball. In the control condition, the wall was situated behind the starting point of the block at 28.5 cm from the left end of the line (back wall), not in between the ball and the block. There were four familiarization trials. The onset of familiarization trials was parallel to the start of counting. As soon as the experimenter stopped the 10-s counting, the screen would turn dark and another trial would start with the onset of the presentation of the stimuli and the start of counting.
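The geometry of the two trajectories can be sketched as a piecewise height function. The horizontal positions below come from the text. The sketch assumes that the 26.0 cm trajectory length refers to horizontal extent and that the arc is a symmetric parabola. The apex height is not reported, so `APEX_HEIGHT` is a purely hypothetical value, and the function names are illustrative.

```python
START_X = 28.0       # block start, cm from the left end of the floor line (from the text)
STRAIGHT_1 = 3.0     # initial straight segment, cm (from the text)
LAND_X = 7.0         # landing point of the arc, cm from the left end (from the text)
TOTAL_TRAVEL = 26.0  # trajectory length, read here as horizontal extent (assumption)
APEX_HEIGHT = 5.0    # HYPOTHETICAL: the height of the arc is not reported

TAKEOFF_X = START_X - STRAIGHT_1  # where the arc begins
END_X = START_X - TOTAL_TRAVEL    # where the block meets the ball


def old_action_height(x: float, apex: float = APEX_HEIGHT) -> float:
    """Height (cm) of the block above the floor at horizontal position x
    for the familiarization / old action trajectory."""
    if LAND_X <= x <= TAKEOFF_X:
        span = TAKEOFF_X - LAND_X
        # Symmetric parabola: zero at take-off and landing, apex at the midpoint.
        return 4.0 * apex * (x - LAND_X) * (TAKEOFF_X - x) / span ** 2
    return 0.0


def new_action_height(x: float) -> float:
    """The new action trajectory stays on the floor line throughout."""
    return 0.0


# Horizontal segments: straight run, arc, final straight run.
segments = (STRAIGHT_1, TAKEOFF_X - LAND_X, LAND_X - END_X)
print(segments, sum(segments))
```

With these values the three horizontal segments (3.0, 18.0 and 5.0 cm) add up to the 26.0 cm reported for every trial, so the old and new action events differ only in the height profile of the path.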

Test

For the test trials, the block and the ball were in the same starting positions as in the familiarization trials, but the wall was absent. There was an “old action” event and a “new action” event. In the old action test trial, the block moved along the same trajectory as in familiarization trials. In the new action test trial, the block moved in a straight line until it touched the ball. There were two test trials, the new action event and the old action event. The onset of test trials was parallel to the start of counting. As soon as the experimenter stopped the 10-s counting, the screen would turn dark and the second trial would start with the onset of the presentation of the stimuli and the start of counting.

Each infant chimp was tested in the experimental condition and in the control condition in a within-subjects design. Inter-condition sessions were scheduled 4–6 weeks apart. For example, after the chimp was run in the control condition, it was run in the experimental condition 4–6 weeks later.

The dependent measure was the total amount of looking time during each of the 10-s trials. The introductory calibration period served to determine the direction of gaze of the chimps so that the coder could code the videotapes. A “good look” was therefore defined as “a gaze directly into the monitor as determined by the calibration”.

Looking time data were analyzed in two ways: frame-by-frame (30 frames/s) and in real time, the latter exactly the same way infant experiments are coded. Two videotape coders, experienced in coding infant experiments, coded the videotapes. The primary observer coded the videotapes frame-by-frame, as generally done in looking time experiments with nonhuman primates, and the secondary observer coded the videotapes in real time, as generally done in looking time experiments with human infants. The purpose of this double coding with different protocols was to evaluate the sensitivity of both coding methods. The coders were blind to the experimental trial types. Agreement between the two coders was assessed by first calculating %disagreement. For each trial, a %disagreement was calculated as the difference in times recorded by the primary and secondary coders divided by the time recorded by the primary coder. This disagreement score was then averaged over the trials, and subtracted from 100% to yield the %agreement. The agreement between the two coders was 82%. A Pearson correlation score was also computed for the two coders, r=0.92. The statistical analyses were performed on the primary observer’s data.
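The agreement measure described above can be sketched as follows. The sketch assumes that the per-trial difference is taken as an absolute value, which the text does not state, and the looking times in the example are hypothetical placeholders rather than data from this study.

```python
from math import sqrt


def percent_agreement(primary, secondary):
    """100% minus the mean per-trial disagreement, where disagreement is
    |primary - secondary| / primary (absolute difference is an assumption).
    Trials with a primary time of zero are undefined under this formula."""
    disagreements = [abs(p - s) / p for p, s in zip(primary, secondary)]
    return 100.0 * (1.0 - sum(disagreements) / len(disagreements))


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# HYPOTHETICAL looking times in seconds, for illustration only.
primary = [4.0, 5.0, 2.0, 8.0]
secondary = [3.0, 5.0, 2.5, 8.0]

print(percent_agreement(primary, secondary))  # 87.5
print(round(pearson_r(primary, secondary), 3))
```

Because each trial's disagreement is scaled by the primary coder's time, short looks weigh more heavily in the agreement score than in the correlation, which may be one reason the two indices differ.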

Results

The analyses of looking times in familiarization and test phases (see Footnote 1) used two within-subject factors, outcome (new action event, old action event), and condition (experimental, control); and one between-subject factor, order (new action first, old action first). A first analysis examined the familiarization data only. A 2×4 within-subjects analysis of variance (ANOVA) examined the effects of condition (experimental, control) and trial number (1, 2, 3, 4) on looking times. As one of the subjects did not have a full set of familiarization trials, but only two trials, the data for this subject were excluded from this analysis. There were no significant main or interaction effects of condition or trial number. The familiarization trials did not differ across condition, and they did not vary across time either, namely, the chimpanzees looked roughly equally long in all four familiarization trials in both conditions (M experimental=4.1 s, SD=1.0; M control=4.6 s, SD=1.1). The chimps remained interested in the display across familiarization trials. This is not surprising. In familiarization trials, human infants do not show habituation either. In order for habituation effects to be observed, babies generally need more trials than the usual 4–6 of studies based on familiarization. The chimps, like human babies, may need more trials to habituate to the stimuli.

Preliminary 2×2 ANOVAs examined the effects of test trial order (new action first, old action first) as a between-subjects factor and outcome (new action, old action) as within-subjects factor on looking times in the test trials in each condition. In the experimental condition, the main effect of outcome, F (1,2)=11.30, P<0.07, was not statistically significant, despite the fact that the chimps looked longer at the old action event (M=6.0 s, SD=1.7) than at the new action event (M=2.4 s, SD=0.4). There were no other significant effects in this condition. In the control condition, both the main effect of outcome, F (1,2)=413.44, P<0.002, and the interaction between outcome and order, F (1,2)=121.00, P<0.008, were significant. The main effect of order was not significant. The outcome main effect indicated that chimps in the control condition looked significantly longer at the new action (M=5.7 s, SD=3.4) than at the old action (M=2.7 s, SD=1.8). This is the opposite pattern from that found in the experimental condition. The pattern of the outcome by order interaction can be interpreted as indicating that when the old action was presented first, the effect of the new action was stronger than when the new action was presented first. This could be because when the old action was presented first, it served essentially as an additional familiarization trial, enhancing the effect of the new action. Thus, although the order factor appears to have played a role in the control condition, its effect did not change the basic pattern of the new versus old action effects, which were exactly the opposite in the control condition from what they were in the experimental condition.

Because order does not affect the direction of the outcome effects in either condition, it makes sense to analyze outcome and condition together in an analysis that ignores order. A 2×2 (outcome by condition) ANOVA was carried out with both factors as within-subjects factors. The only significant effect was the interaction between outcome and condition, F (1,3)=27.81, P<0.01. In the experimental condition, all chimps looked longer at the old action event (M=6.0 s) than the new action event (M=2.4 s). In the control condition, all chimps looked longer at the new action event (M=5.7 s) than the old action event (M=2.7 s). This interaction is due to the fact that the chimpanzees looked longer at the old action event in the experimental condition, but looked longer at the new action event in the control condition.
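In a 2×2 design in which both factors are within-subjects, the interaction F with 1 and n−1 degrees of freedom equals the square of a one-sample t computed on each subject's difference of differences. The sketch below illustrates that equivalence. The function name is illustrative and the looking times are hypothetical placeholders, not the data reported here.

```python
from math import sqrt
from statistics import mean, stdev


def interaction_f(exp_old, exp_new, ctl_old, ctl_new):
    """Outcome-by-condition interaction for a 2x2 fully within-subjects
    design, computed as the squared one-sample t on the per-subject
    contrast (old - new in experimental) - (old - new in control).
    Raises ZeroDivisionError if every subject has the same contrast."""
    contrast = [
        (eo - en) - (co - cn)
        for eo, en, co, cn in zip(exp_old, exp_new, ctl_old, ctl_new)
    ]
    n = len(contrast)
    t = mean(contrast) / (stdev(contrast) / sqrt(n))
    return t ** 2, (1, n - 1)


# HYPOTHETICAL looking times (s) for four subjects, for illustration only.
exp_old = [6.0, 7.0, 5.0, 6.0]
exp_new = [2.0, 3.0, 2.0, 3.0]
ctl_old = [3.0, 2.0, 3.0, 2.0]
ctl_new = [5.0, 6.0, 6.0, 5.0]

f_value, df = interaction_f(exp_old, exp_new, ctl_old, ctl_new)
print(f_value, df)
```

Seen this way, the interaction tests whether the old-minus-new difference changes between conditions within each animal, which is the comparison the design is built around.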

To follow up on this interaction, paired t-tests analyzed the looking times in the experimental and control conditions separately. In the experimental condition, there was a significant difference between the old action and the new action events, t (3)=3.9, P<0.03, two-tailed. The chimps looked significantly longer at the old action event than the new action event. In the control condition, the difference in looking times in the old action and the new action events was not significant, t (3)=−3.2, P<0.06, two-tailed (see Table 2). The chimps looked longer at the new action event than the old action event, but this difference was not significant.
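A paired t-test of this kind can be sketched with the standard library alone. The looking times below are hypothetical placeholders, not the study's data, and the function name is illustrative. The critical value is the standard t-table entry for 3 degrees of freedom at a two-tailed alpha of 0.05.

```python
from math import sqrt
from statistics import mean, stdev

T_CRIT_DF3_TWO_TAILED_05 = 3.182  # standard t-table value for df = 3


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two equal-length
    lists of per-subject scores (a - b differences).
    Raises ZeroDivisionError if all differences are identical."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# HYPOTHETICAL looking times (s) for four subjects, for illustration only.
old_action = [6.0, 7.0, 5.0, 6.0]
new_action = [2.0, 3.0, 2.0, 3.0]

t_value, df = paired_t(old_action, new_action)
print(round(t_value, 3), df, abs(t_value) > T_CRIT_DF3_TWO_TAILED_05)
```

With only four subjects the test has 3 degrees of freedom, so the observed t must exceed roughly 3.18 in absolute value to reach significance at the 0.05 level.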

Table 2 Mean looking times in experimental and control conditions as a function of trial type. Values represent mean looking times in seconds. Values enclosed in parentheses represent standard deviations. Statistical significance: see P values of main effects and interactions in the Results section

The main result of this study is that the infant chimpanzees showed a stronger preference to look at the old action event than the new action event in the experimental condition, and this preference was reversed in the control condition. In addition, the interaction shows that the chimpanzees had opposite preferences in the experimental and in the control conditions. In the experimental condition, the chimpanzees did not respond on the basis of perceptual novelty. These results seem to suggest that the chimpanzees were sensitive to the apparent goal-directedness of the block. When the chimpanzees saw the block go over the barrier towards the ball, then they “understood” that the block had a goal. When the chimpanzees saw the block go over “nothingness” in a parabolic motion towards the ball, all bets were off, and they had a (nonsignificant) preference for the novel trajectory.

Discussion

The present findings with chimpanzees parallel findings with human babies (Gergely et al. 1995; Premack and Premack 1997; Woodward 1998; Csibra et al. 1999). These findings have been interpreted as evidence that babies attribute goals. In the baby study (Gergely et al. 1995), there were two groups of babies. In the experimental group, infants were habituated to a ball going over a barrier towards another ball. In the control group, infants were habituated to a ball going in the same parabolic motion towards a ball, but the barrier was not in between the balls hindering the trajectory, it was placed behind the ball that moves towards the stationary ball. In test trials, when the barrier was removed altogether, babies in the experimental condition looked longer at the parabolic trajectory than at the straight line, whereas babies in the control condition did not. On the basis of this research, Gergely and colleagues have systematically argued for intentionality as the capacity being tapped in their experiments. As pointed out by an anonymous reviewer, “the original paradigm was carefully designed to assess whether human infants recognize (1) agency, (2) equifinality of behavior and (3) rationality of the agent’s equifinal behavior.” Babies would need to recognize all three components in another individual’s behavior for one to demonstrate that they attribute intentions to others.

If one follows the logic of Gergely et al. (1995), then some may be tempted to conclude that these results provide the basis for a prima facie case that chimpanzees have intentionality. I resist this inference. The infant chimps performed comparably to human infants in the present task, namely, they showed a preference to look longer at the old action event in the experimental condition (in which the block goes in a parabolic trajectory towards the ball) and they showed a preference to look longer at the new action event in the control condition (in which the block goes in a straight line towards the ball), although the looking times in this latter case did not differ statistically. Although the chimps seem to present a disposition to detect the goal of an agent under these conditions, the same way as human infants do, for the time being it is premature to form conjectures about issues of intentionality in nonhuman primates.

The baby chimps were sensitive to the trajectories of the block towards the ball and understood the shape movement as an action of an “animate” being. The attribution of “intentionality” to computer-generated moving shapes has been shown in several populations. Original research by Heider and Simmel (1944), for example, shows that human adults interpret moving shapes on a computer screen as being intentional, namely, adults tend to describe the shape movements as actions of agents. More recently, Abell et al. (2000) and Castelli et al. (2000) have explored this understanding in normal and abnormal development, showing that 8-year-old children, like adults, understand computer-generated shape movements as “intentional actions”, whereas children with autism gave inappropriate descriptions of the movements. In the infant literature, as pointed out earlier, Gergely and colleagues (Gergely et al. 1995; Csibra et al. 1999) have obtained consistent results showing that 9- and 12-month-old infants understand the relationships between computerized shapes in goal attribution tasks. For chimps, together with Cacchione and Krist’s (2003) findings, the present results suggest that chimpanzees too seem to be sensitive to these effects. The extent to which they would succeed in a Heider and Simmel (1944) type of task remains to be investigated further.

It is clear that theory-of-mind work with infants has produced a substantial amount of data that the present results cannot compare to. In all of Gergely, Csibra, Woodward, Sodian, and colleagues’ work, sample sizes generally vary from 20 to 60 infants in each experiment. Thus, methodologically, sample size in the chimpanzee case is arguably a source of concern. However, this method yielded interpretable data. Research using this same method in other domains of investigation has also yielded interpretable results. Moreover, it is particularly noteworthy that these baby chimps showed the same pattern of looking time across subjects. These are promising results, and should be further explored.

One possible exploration is the line of inquiry that followed the original Gergely et al. (1995) task in which questions regarding the nature of the events elicited in the task were investigated. For example, the perception of self-initiated movement of agents, the encoding of cues that perceptually determine the equifinality of the goal (Csibra et al. 1999), among others, may be alternative explanations that require further testing. Another line of research should explore the chimpanzee understanding of what constitutes an intentional agent. Work by Meltzoff (1995) suggests that 18-month-old infants can discriminate an intentional agent from “something else” on the basis of physical characteristics of the agent. Babies will re-enact an action when it is performed by a human actor, but will not re-enact the same action when it is performed by a mechanical device. It would be interesting to see whether nonhuman primates make the same kinds of inferences that human infants do in this regard. It has also been argued that 15-month-old infants have a clear concept of “mentalistic agent” that enables them to attribute mental states to a novel nonhuman agent under some mentalizing contexts, but not others (Johnson et al. 2001). This approach could also yield potentially interesting results with nonhuman primates.

Theoretically, this result is interesting because it suggests that chimpanzees may be endowed with at least a mechanism that recognizes goals. Why would social primates need one? As Jolly (1966) pointed out, the social life of primates provided the evolutionary environment for the development of “primate intelligence”: evolution must have endowed apes (monkeys and prosimians as well, though differently) with a mechanism that allows them to survive in their social communities. At the very least, one would expect a mechanism to anticipate and predict behaviors of conspecifics, competitors and prey. But possessing a behavior pattern predictor that enables creatures to anticipate and predict behavior may be all that is needed for a creature to be an evolutionary success, as it may be able to correlate cues of anticipating or predicting behavior without the need to understand anything about the notion of goal (Nichols and Stich 2003). What mechanism (or mechanisms), therefore, may have been necessary for creatures to survive so many thousands of years in social contexts? The literature on nonhuman primate “mind reading” provides equivocal evidence and theory on the abilities that nonhuman primates might need to survive in their social worlds. In addition, this literature is conservative and minimalist. Arguments for a potential mechanism for mind reading fall short of an organized framework because there are hundreds of perceptual explanations to account for the available data. Besides, one need not ascribe a higher cognitive explanation to nonhuman primates as, minimally, they probably (it is argued) do not need much to survive in their social worlds. This is all fortunately an empirical question, and students of theory-of-mind abilities in nonhuman primates should engage in efforts to tell a plausible story.

Recently, Call (2001) has addressed this concern in his characterization of theory-of-mind abilities in apes. He proposed that, although chimps may learn cues in their social environments, they certainly use knowledge-based inferences to solve “social problems”. Call has identified three different “mechanisms” to account for chimpanzee theory of mind: (1) nonrepresentational, in which an association is formed between a self-elicited behavior and a goal, and is consolidated by experience; (2) representational, in which past experiences are recalled and recombined in novel situations for creative problem solving; and (3) meta-representational, in which representations of others’ beliefs, desires, and perceptions provide the means for novel problem solving and knowledge.

Call’s framework is rather elegant and useful. However, it still lacks the structure of a model that might precisely explain the cognitive structures underlying behavior and performance. I will adopt Carey’s (1995) developmental framework to propose that a theory of mind reading in nonhuman primates (one that might have even appealed to Köhler, Teuber, Yerkes, and others) has to include both a descriptive and an explanatory component. In addition to a description of the behavioral and performance data revealed by experimental and observational evidence, a formalization of the operative cognitive components for mind reading is needed.

What mechanisms have been proposed in the theory-of-mind literature in humans? One possibility is that there are several independent mechanisms underlying theory-of-mind capacities (e.g., Leslie 1994; Baron-Cohen 1995; Nichols and Stich 2003). For instance, theory of mind in humans (and potentially, nonhuman primates) might be composed of several separate mechanisms including distinct mechanisms for belief and desire attribution. Nichols and Stich (2003) proposed that the mind of our ancestors could have been composed of a belief box, a desire box, an inference mechanism, a practical reasoning mechanism, a planner and an updater. Early human ancestors could have had a mechanism such as this one. This mechanism allowed creatures to anticipate and predict behavioral patterns based on cue associations and correlations to maximize survival through eating, mating, and avoiding death. At this time, creatures survived by matching stereotypical cues with specific behaviors. Later on, creatures started to realize that they had states that corresponded to particular “goals”—for example, “I am trying to figure out the best way to catch the monkey up above in the tree to eat, but it will flee as soon as I climb up”. In order to maximize success in attaining the goal, a more efficient mechanism came to play a role. The difference between the behavioral pattern predictor in the very beginning and this “goal and strategy” strategy (Nichols and Stich 2003) is the following. In the goal strategy, the creature has a repertoire of different alternative (goal directed) patterns of behavior that it can select from. The trick is to select the best way to achieve the goal in a particular circumstance. According to Nichols and Stich (2003), at this stage, all the creature needs is a strategy/strategies for goal attribution, a capacity to figure out the best way to achieve the goal and a mechanism that will generate the prediction that this is what the target will do. 
So, in the example above, I am a chimp and I want to catch the monkey in order to eat it. The monkey will flee as soon as I climb up. What do I need to succeed? I need to determine that the monkey will try to flee, I need to figure out what he will try to do to escape and the best way for him to do it, and I need to predict that that is what he will do. For Nichols and Stich (2003), goal attribution is a representation that may have been present very early on in the evolution of mind reading, and existed completely independently of other mind-reading abilities that developed later on.

The pattern of results from the present study fits nicely within the framework just described. In order to succeed in the task, the baby chimps needed a mechanism with (1) a set of strategies for goal attribution, (2) the capacity to figure out the best way to achieve the goal, and (3) a predictor of behavior. Minimally, these characteristics of an ancestral system of mind reading, present spontaneously in chimpanzees, may be the precursor system sought by researchers in the field of theory-of-mind abilities in nonhuman primates. Much work still needs to be done. I hope this result will engage students of cognition and behavior in nonhuman species to develop a picture of mind reading in our ancestors.