Introduction

Setting money aside for retirement and saving for our children’s studies are just two examples of human future-oriented planning. Such planning requires giving up immediate, selfish benefits in order to serve future interests. Cognitively speaking, this human behaviour is based on our capacity to mentally envisage the future state of our own needs, which typically differ from our current ones, and to act now to ensure that our needs will be met at a later time (Suddendorf and Corballis 2007; Tulving 1984, 2005).

Although there is considerable evidence indicating that primates trade goods and services amongst themselves in cooperative situations (food: de Waal 1997, 2000; grooming: Barrett et al. 1999; Manson et al. 2004; Schino et al. 2003; Ventura et al. 2006; conflict support: de Waal and Luttrell 1988; Hemelrijk 1994; Schino et al. 2007), some authors have questioned whether animals form expectations about future returns in these settings (Stephens 2000). For instance, it is unclear whether primates base their investment in a cooperative behaviour on the mental scorekeeping of favours they have given and received (de Waal and Luttrell 1988). Nor has it been clearly established whether primates can, in social settings, mentally anticipate future needs that are not connected to current ones, the issue addressed by the Bischof-Köhler hypothesis (Roberts 2002; Suddendorf and Corballis 2007).

Animals often show short-sightedness in delay-of-gratification tasks. Some rodent and bird species, for example, choose small, immediate rewards rather than waiting for larger ones when more than a few seconds’ wait is required (Roberts 2002; Stephens and Anderson 2001). This also holds true for many non-human primates (e.g. long-tailed macaque: Tobin et al. 1996; marmoset and tamarin: Stevens et al. 2005; brown capuchin: Ramseyer et al. 2006). Non-humans have also demonstrated ‘temporal myopia’, as evidenced by their inability to act in the present to alleviate future thirst or hunger (e.g. in the rat: Naqshbandi and Roberts 2006; the long-tailed macaque: Silberberg et al. 1998; the rhesus macaque: Paxton and Hampton 2009). However, increasing numbers of publications have provided evidence of time-related calculation in non-human species (e.g. the chimpanzee: Dufour et al. 2007; Rosati et al. 2007; the long-tailed macaque: Pelé et al. 2010a) and have also evidenced some anticipation skills (e.g. the squirrel monkey: McKenzie et al. 2004; Naqshbandi and Roberts 2006; the chimpanzee: Dufour et al. 2007). It has been argued, however, that planning cannot be demonstrated in animals since it is impossible to assess to what extent an individual mentally envisages its future needs. One tenet of future planning is the ability to envisage oneself both in the past and in the future, that is to say, to elaborate coherent past/future scenarios relying on the episodic cognitive system (Szpunar 2010). Indeed, future planning requires individuals to recall the particularities of personal past events (referred to as episodic memory: Tulving 1983) in order to pre-live future events and hence act upon them in the present (Suddendorf and Corballis 2007; Tulving 2005).
This has led researchers to suggest that the study of future planning in non-verbal beings should rely on observable future-oriented behaviours that are not connected to current needs or prompted by environmental cues, in order to ensure that episodic recollection of personal information occurs (Tulving 2005; Suddendorf and Busby 2005). This does not mean that humans never plan for future needs that they currently desire, but rather suggests that certain experimental criteria set in animals could prove that their future-oriented behaviours are driven by an episodic cognitive system. As Roberts (2002) stressed, we need to decipher whether animals envisage a future state of hunger or whether they are already experiencing it when planning their routes (Noser and Byrne 2007; Valero and Byrne 2007) or selecting tools for future use (Boesch and Boesch 1984; Visalberghi et al. 2009).

Several researchers have investigated the above issue with a variety of paradigms. Scrub jays (Aphelocoma californica) have been reported to anticipate their future food needs by caching pine seeds the night before they would need to eat them (Raby et al. 2007). Controlling the birds’ current satiety for this specific food reinforced the argument that the birds behaved independently of their current motivational state (Correia et al. 2007). Mulcahy and Call (2006) showed that bonobos and orang-utans are able to save tools for future use as long as 14 h in advance, although the study failed to carry out controls for the visibility of food rewards at the time of tool collection. In another study, two chimpanzees and one orang-utan were tested in a similar experiment. They had to selectively collect and save a tool for later use in a food-delivering apparatus, invisible to subjects at the time of collection. Subjects selected and saved the right tool for use 70 min later, even when a competing reward was presented with the tools (Osvath and Osvath 2008). In a comparable study, Dufour and Sterck (2008) tested ten chimpanzees in two different planning tasks, controlling for potential prompting effects (i.e. partner presence in the planning-to-exchange task and sight of the apparatus in the tool-use planning task). In the first task, subjects had to select and save tokens for exchange with an experimenter 1 h later, while in the second task they had to select the correct tool 1 h before they were given access to a possible food reward. It is particularly interesting to note that chimpanzees failed to plan in the exchange task but succeeded in the tool-using procedure (Dufour and Sterck 2008), as observed by Osvath and Osvath (2008). 
While there is evidence that great apes can plan, it appears that animals may be limited by the extent to which they plan for the future (measured by time delay or degree of complexity) and their flexibility in doing so (measured in various contexts) rather than in the possible underlying cognitive mechanisms involved in the generation of predictions or imagined scenarios (Bar 2007; Szpunar 2010). Hence, it is necessary to address future-planning abilities in further species, using a range of experimental situations in which animals derive positive personal benefits (Suddendorf and Corballis 2010; Zentall 2005).

In this study, we investigated future planning in brown capuchin monkeys (Cebus apella) and Tonkean macaques (Macaca tonkeana). These species possess some of the cognitive prerequisites for future prospection such as self-control and anticipatory abilities as shown by their capacity to delay larger gratifications over 10–20 min (Pelé et al. 2010a, 2011). They display complex social interactions in situations where trading of commodities could occur (de Waal 1997; Hemelrijk 1994; Manson et al. 2004) and exchange non-edible items with humans in experimental situations, illustrating a type of cooperative interaction (Brosnan and de Waal 2005; Hyatt and Hopkins 1998; Pelé et al. 2009, 2010a, b, 2011). Our planning test, derived from Tulving’s spoon test (2005), relied on a token-exchange-task procedure closely modelled on that used by Dufour and Sterck (2008). In Tulving’s (2005) spoon test, a girl wanted to eat a piece of cake at a party she attended, but was unable to do so because no more spoons were available. The crucial test for planning abilities is to check whether, facing the possibility of attending the party again, she would bring her own spoon or not. Similarly, the rationale of our planning-to-exchange task required subjects to collect tokens in advance and bring them to a different place in order to exchange them for food.

This task fulfils Tulving’s (2005) experimental criteria of ensuring that animals are displaying future-oriented behaviours before the future problem appears and that these behaviours are not driven by current motivational states but rather rely on personal experience. Hence, the monkeys had to recall their own experience from the previous testing days in order to behave accordingly over the following days, meaning that they had to collect tokens during a short time-window several minutes before being able to exchange them for food with an experimenter. Note that the task could not be solved by merely collecting the tokens, as they also had to transport the tokens to another location (testing compartment) at a later time (40 or 25 min later), therefore requiring an integrated representation of what they had to bring (which token), where they had to bring it (which room) and when (in anticipation of a future need) (Clayton et al. 2003). Crucially, collecting tokens in advance was useless in their current context as they could not readily exchange them or be prompted by the presence of the experimenter, who remained out of sight throughout the collecting and waiting periods.

Method

Subjects and housing conditions

We tested six subjects maintained in three different social groups at the Primatology Centre of the University of Strasbourg, France: two brown capuchin monkeys taken from group 1 (17 individuals), two Tonkean macaques from group 2 (16 individuals) and two Tonkean macaques from group 3 (7 individuals). The age, sex and species of subjects are provided in Table 1, together with their involvement in experiments. Subjects were chosen on the basis of their previous experience with the exchange procedure (Pelé et al. 2010b, 2011; Ramseyer et al. 2006), but had never experienced the planning tasks. Groups 1 and 3 were housed in indoor–outdoor enclosures composed of several 3-m-high compartments, with respective total areas of 78 m² and 42 m². Macaques from group 2 were kept in a 5,000 m² wooded area with an indoor room of 20 m² and a 2-m-high outdoor wire-mesh compartment measuring 40 m² that was used for experiments. Compartments were connected by sliding doors. All groups were fed with industrial monkey pellets and water was available ad libitum in the indoor rooms. Subjects were never deprived of food and were individually separated from the rest of the group for testing in outdoor compartments.

Table 1 Information about subjects

Procedure of the planning task

We used a planning-to-exchange task previously adapted from Tulving’s (2005) spoon test by Dufour and Sterck (2008). The principle was to get subjects used to exchanging a particular token for a food reward at a fixed time of the day (training phase). The regularity of the exchange activity made it predictable for the subjects. The planning test (testing phase) then consisted of providing subjects with a set of tokens that they could collect during a limited time interval, several minutes or hours prior to the usual exchange activity time. These collecting and waiting periods both occurred in a compartment adjacent to the one where the exchange activity took place. Subjects inclined to plan for the exchange activity had to collect the suitable tokens during the collection period, save them during the waiting period and then transport them into the testing compartment. Once in this compartment, they were given the opportunity to exchange the tokens for the usual food reward, at the regular time for exchange activity. If subjects failed to collect and then transport the tokens, no exchange could take place. Subjects had to remember their failure on the first day in order to anticipate their future need of tokens for the following days and therefore solve the task.

For both training and testing phases, subjects initially had to wait for a fixed delay (i.e. the waiting time) in the waiting compartment before entering the adjacent testing compartment where the exchange activity took place (see electronic supplementary material for further details). Once they were inside the testing compartment, the door was closed and subjects were denied any further access to the waiting compartment (except if mentioned otherwise). Three experimenters, previously unknown to the subjects, were consistently involved throughout the study: the human partner (M.B.), who was assigned exclusively to the exchange activity, the camera operator filming token manipulation during testing, and the assistant, who ushered subjects into the right compartments, and deposited and removed tokens.

Learning phase

All subjects were initially involved in a discriminative learning exchange task during which they were trained to return one particular object amongst three possibilities in order to obtain a food reward from the human partner. The rewarded token was a plumbing copper bend (B); the other two objects, a red plastic stick (S) and a blue metallic curtain ring (R), served as distractors throughout the experiments to determine whether subjects truly planned for the exchange task or were merely interested in any object. Objects differed in shape, material and colour and could easily pass through the wire mesh (Fig. 1); all were unknown to the subjects.

Fig. 1 Set of tokens. From left to right: metallic curtain ring (R), plumbing copper bend (B), plastic stick (S). The ruler units are cm

The exchange activity proceeded as follows. The human partner placed 12 exemplars of each token (i.e. 36 tokens) in the compartment and then sat in front of the subject. Holding her open palm out in front of her and holding the food reward in the other, she asked vocally for the return of the token. The human partner only rewarded subjects on presentation of the copper bend (B); all other tokens were accepted and dropped in a bucket. Subjects needed on average 5.33 ± 0.99 sessions (range 4–10 sessions) to reach the criterion of three consecutive sessions where 11 of the 12 correct tokens were the first to be exchanged. The six subjects did not significantly select any one token type above chance during the very first twelve exchanges at the beginning of the learning phase (G test: p > 0.05). This indicates that subjects had no a priori preference for a given type of token.
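
The learning criterion described above can be illustrated with a minimal sketch. It assumes that "11 of the 12 correct tokens" means at least 11 copper bends exchanged before any other token in a session; the function name and the example counts are hypothetical and are not taken from the study's data.

```python
from typing import Optional, Sequence


def sessions_to_criterion(
    correct_first: Sequence[int], needed: int = 11, run: int = 3
) -> Optional[int]:
    """Return the 1-based session at which the criterion is met, else None.

    correct_first[i] is the number of rewarded tokens (B) that were exchanged
    first in session i (hypothetical input format, 0 to 12 per session).
    The criterion is `run` consecutive sessions with at least `needed` each.
    """
    streak = 0
    for session, count in enumerate(correct_first, start=1):
        # Extend the streak on a qualifying session, otherwise reset it.
        streak = streak + 1 if count >= needed else 0
        if streak >= run:
            return session
    return None


# Hypothetical subject: one poor session, then three qualifying sessions.
example = sessions_to_criterion([8, 11, 12, 11])
```

With these illustrative counts the criterion is met at the fourth session, the lower bound of the reported range of 4–10 sessions.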

Training phase

Once all subjects had identified the correct token, they were given the opportunity to exchange tokens for food rewards on a daily basis. In order to make the task a predictable event for each individual, the exchange activity took place every working day in the same testing compartment, at the same time of day, with the same partner and the same availability of tokens.

The training procedure was run as follows: both the waiting and the testing compartments were emptied of any group members, and then one subject was isolated in the waiting compartment. A video camera was systematically placed in front of the wire mesh and stopwatches were started. At the end of the waiting time, the assistant returned and deposited the 36 tokens in the testing compartment before ushering the subject inside. The human partner arrived for the exchange session once the assistant had left. At this stage, the exchange session took place and the human partner begged for tokens until the subject gave her all the correct tokens in his/her possession. This procedure was repeated every day until the task became predictable for all the subjects, that is, when subjects spontaneously entered the waiting compartment for three consecutive sessions with no more than 5-min difference from one day to the next.

Testing phase

Subjects were tested using the general procedure detailed below (for specific details of each experiment, see Testing procedure). The first testing day was always immediately after the last training day. The experimental set-up was the same as the one used for the training, but the procedure differed in the following ways: once the subject had been isolated, the video camera was placed in front of the wire mesh and the camera operator started filming the subject. The assistant then hung a box containing the 36 tokens on the wire mesh of the waiting compartment, and the stopwatches were started. Tokens were available to the subject for a limited time, that is, the collection period, before being removed by the assistant. At the end of the waiting time, the assistant ushered the subject into the testing compartment for her/his daily exchange session. In the testing situation, however, no tokens had been previously deposited in the testing compartment. In order to avoid any prompting effect, the human partner was not visible to the subject prior to the exchange activity and arrived once the assistant had left. She begged for tokens and only rewarded subjects on presentation of copper bends, but accepted any other donation and dropped it in a bucket. Exchange sessions lasted for 2 min 30 s during which the human partner begged for tokens every 30 s. Rewards were similar to those used in the training phase but their size was doubled.

Video footage was used a posteriori to sample behaviours related to token manipulation. All occurrences of the following behaviours were continuously recorded during the waiting time: manipulation by hand, oral manipulation, saving in the hand, saving in the mouth, transport in the hand and transport in the mouth (see supplementary material for definitions). Throughout the waiting time, we also performed one-zero sampling of each subject’s contacts with each type of token per 1-min interval.
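
One-zero sampling scores each interval as 1 if the behaviour occurred at least once and 0 otherwise, whatever its frequency within the interval. The sketch below shows the idea under assumed inputs: contact times in seconds from the start of the waiting time, and a 25-min waiting time as in Experiment 2. The function name and the timestamps are hypothetical.

```python
from typing import List, Sequence


def one_zero_sample(
    contact_times_s: Sequence[float], total_min: int, interval_s: int = 60
) -> List[int]:
    """Score each interval 1 if at least one contact fell inside it, else 0."""
    scores = [0] * total_min
    for t in contact_times_s:
        idx = int(t // interval_s)  # index of the interval containing t
        if 0 <= idx < total_min:    # ignore events outside the waiting time
            scores[idx] = 1
    return scores


# Hypothetical contacts with one token type during a 25-min waiting time.
scores = one_zero_sample([5, 12, 130, 1499], total_min=25)
```

Two contacts in the first minute still yield a single score of 1 for that interval, which is what distinguishes one-zero sampling from a continuous count of occurrences.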

Testing procedure

We designed an initial planning experiment (Experiment 1) and set the delay at 40 min prior to the usual exchange session time, composed of a 10-min collection period followed by a 30-min wait. The total length of this delay was chosen on the basis of the anticipatory skills of both species in delayed gratification tasks (Pelé et al. 2010b). In the light of the results of Experiment 1, the delay in Experiment 2 was reduced to 25 min, that is, 5-min collection and 20-min waiting. Experiment 3 consisted of an introductory open-access condition in which the waiting compartment was freely available at the time of the exchange activity, followed by a 5-trial replicate of Experiment 2. Experiment 4 involved the two successful subjects of the previous experiment in a full 10-trial replicate of Experiment 2. In Experiment 5, free access to the testing compartment made it possible for subjects to store tokens there during the collection period before they were ushered into the waiting compartment again for the rest of the waiting period. Experiment 6 was a control experiment based on the procedure of Experiment 1, but without any delay between token deposit and time of exchange.

Statistics

For each experiment, we counted the number of tokens of each type collected (B, R, S) and the number of manipulating behaviours shown for each type of token. Token selectivity was assessed for each subject by performing G tests, that is, exact calculations of the log-likelihood ratios between observed distributions and discrete uniform distributions (Sokal and Rohlf 1995). We then used pairwise comparison tests with Holm–Bonferroni corrections to test whether (a) the number of tokens collected for each different type of token differed significantly from the chance value and (b) the number of manipulating behaviours for each different type of token differed significantly from the chance value. To obtain a general overview of the factors influencing success, a generalized additive mixed model was fitted with the proportion of success for each individual in each experiment as dependent variable. Fixed effects were (i) the number of B collected, (ii) the number of manipulating behaviours for B and (iii) the task (i.e. experiments 1–5). The positive correlation amongst observations related to the same individual was taken into consideration by adding the individual as a random effect. The chosen family for the dependent variable was Binomial with a Logit link function; model selection was based on the AIC (e.g. Zuur et al. 2009). All tests were performed with R 2.10.1 software (http://cran.r-project.org) with level of significance set at 0.05.
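
As an illustration of the selectivity analysis, the sketch below computes the G statistic against a discrete uniform distribution, G = 2 Σ O ln(O/E), and applies a Holm–Bonferroni step-down adjustment. It is written in Python rather than R and is not the authors' code. The p value uses the chi-square approximation with 2 degrees of freedom (three token types), an assumption that stands in for the exact calculation reported above; all counts are hypothetical.

```python
import math
from typing import List, Sequence


def g_statistic(observed: Sequence[int]) -> float:
    """Log-likelihood ratio statistic against a discrete uniform distribution."""
    expected = sum(observed) / len(observed)
    # Categories with a zero count contribute nothing to the sum.
    return 2.0 * sum(o * math.log(o / expected) for o in observed if o > 0)


def g_test_p_three_types(observed: Sequence[int]) -> float:
    """Approximate p value for three categories (chi-square, df = 2).

    For df = 2 the chi-square survival function is exactly exp(-x / 2).
    """
    if len(observed) != 3:
        raise ValueError("this shortcut only holds for three categories")
    return math.exp(-g_statistic(observed) / 2.0)


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Holm-Bonferroni adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the k-th smallest p by (m - k + 1) and enforce monotonicity.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted


# Hypothetical counts of collected tokens in the order (B, R, S).
p_selective = g_test_p_three_types([12, 0, 0])
p_uniform = g_test_p_three_types([4, 4, 4])
adjusted = holm_adjust([0.01, 0.04, 0.03])
```

A perfectly even collection gives G = 0, whereas collecting only copper bends gives a large G and a correspondingly small p value.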

Results

Experiment 1: 40-min delay planning task

Three subjects were trained over a mean of 17.6 ± 1.6 days and were then tested in the first planning task for 10 trials (one trial per day), each consisting of a 10-min collection period followed by a 30-min wait.

One macaque (Sha) succeeded on his first trial (1 token), but did not renew this success afterwards (Table 2); the success was observed after he had dropped one suitable token in the testing compartment while waiting. The other two subjects did not successfully complete the task. All subjects took part in 10 trials except for the macaque Lad, who stopped participating after 8 trials.

Table 2 Number and order of correct trials for each subject and each experiment

As regards token collection, both the macaques Sha and Lad collected tokens in all trials while the capuchin, Arn, practically stopped collecting tokens from the fifth trial onwards. Tokens were collected selectively by all three subjects (G tests per individual: Arn, p < 0.05; Sha and Lad, p < 0.01) with a preference for the rewarded token (Table 3; post hoc tests (B): p < 0.05 for all). The rewarded token was also manipulated preferentially during waiting time (Table 4; G tests: p < 0.001 for all; post hoc tests (B): p < 0.001 for all). By comparison, the non-rewarded tokens (R and S) were collected and manipulated equally or less often than expected by chance (Tables 3, 4). Subjects mostly failed to transport tokens when entering the testing compartment.

Table 3 Collection of tokens by subjects during each experiment
Table 4 Manipulation of tokens by subjects during each experiment

Experiment 2: 25-min delay planning task

Six subjects were involved in a second experiment composed of 10 trials, each consisting of a 5-min collection period followed by a 20-min wait. Subjects were previously trained over a mean of 23.7 ± 4.8 days.

Only one capuchin (Ros) solved the task twice, by transporting one suitable token into the testing compartment each time (Table 2). None of the other subjects successfully completed the task. All subjects took part in 10 trials except for the macaque Lad, who stopped participating after 4 trials.

With respect to token collection, only the macaque Sha collected tokens in all trials; the capuchin Arn started to collect on the fifth trial; and the other four subjects collected tokens in 84 % of trials. Tokens were collected selectively by five of the six subjects (G tests per individual: Arn, Ros and Sha, p < 0.01; Sim and She, p < 0.020; Lad, p > 0.05) with a preference for the rewarded token displayed by four subjects (Table 3; post hoc tests (B): Arn, p < 0.001, Sha, Sim and Ros, p < 0.05). The rewarded token was, however, manipulated preferentially during the waiting time by the five subjects who had previously shown selectivity (Table 4; G tests: p < 0.001 for all; post hoc tests (B): p < 0.001 for all). By comparison, non-rewarded tokens (R and S) were collected and manipulated equally or less often than expected by chance by most subjects (Tables 3, 4). All subjects except the capuchin Ros failed to transport tokens when entering the testing compartment.

Experiment 3: 25-min delay planning task with open-access condition

Following the results of the previous experiment, we ran Experiment 3 to check whether subjects understood what they needed to solve the task, and where to find it. Experiment 3 began with a training phase (6.5 ± 0.6 days) designed to strengthen the motivation of subjects to participate in the task. The testing procedure then continued in a similar way to Experiment 2, except that from the first trial onwards we provided subjects with the opportunity to return to the waiting compartment during the exchange activity (open-access condition). The human partner first begged for one token, but if subjects did not have any she opened the sliding doors towards the waiting compartment, allowing subjects to recover any tokens they had collected 25 min before. On the first trial, the human partner waited for 30 s in case subjects spontaneously returned to the waiting compartment to collect the required tokens. If subjects did not do so spontaneously after 30 s (which was the case for all tested subjects), the human partner then stood next to the waiting compartment until the subject had collected at least one token. The human partner then resumed the exchange procedure, sitting in front of the testing compartment and begging for tokens again without inciting the subjects to return to the waiting compartment. Subjects needed 2.8 ± 0.5 trials to start spontaneously transporting tokens from one compartment to another during the exchange activity. All subjects transported suitable tokens in open-access conditions. The following days were dedicated to a 5-trial replicate of Experiment 2 (normal condition without open access).

Two subjects never solved the task (She and Arn). The other two subjects transported suitable tokens into the testing compartment twice (Table 2). These two subjects had also shown the strongest selectivity in collecting tokens 25 min before the time of the exchange activity (Table 3; G tests per individual: Ros, p < 0.001; Sha, p < 0.01), with a preference for the rewarded token (post hoc tests (B): Ros, p < 0.001, Sha, p < 0.05). In comparison, unsuccessful subjects did not preferentially select any particular token (Arn: G test, p > 0.05, post hoc tests p > 0.05 for B, R, S; She: G test, p < 0.05, post hoc tests, p < 0.05 for S and p > 0.05 for B and R). The rewarded token was, however, manipulated preferentially during the waiting time by all four subjects (Table 4; G tests: p < 0.001 for all; post hoc tests (B): p < 0.001 for all). By comparison, the non-rewarded tokens (R and S) were manipulated equally or less often than expected by chance by most subjects (Tables 3, 4; post hoc tests (R): p < 0.001).

Every time the macaque Sha and the capuchin Ros solved the task, they consistently used their own previous tactic; Sha collected the tokens he had previously dropped in the testing compartment (Experiments 1 and 3), while Ros either saved a suitable token or turned back to collect one just before entering the testing compartment (Experiments 2 and 3; see electronic supplementary material for video sequences).

Experiment 4: Full replicate of the 25-min delay planning task

Experiment 4 was designed to consolidate results on these two subjects by involving them in a full 10-trial replicate of Experiment 2. They were both trained for 5 days in order to renew their motivation prior to testing.

Both subjects transported suitable tokens into the testing compartment, although at very low rates; the macaque Sha solved the task once, while the capuchin Ros successfully did so three times. The two subjects collected tokens in 100 % (Sha) and 80 % (Ros) of trials, respectively. The capuchin Ros was the only subject to collect tokens selectively (Table 3; G tests per individual: Ros, p < 0.010; Sha, p > 0.05), with a preference for the rewarded token (post hoc test (B): Ros, p < 0.01). However, both subjects preferentially manipulated the rewarded token during the waiting time (Table 4; G tests: p < 0.001 for all; post hoc tests (B): p < 0.001 for all). By comparison, the non-rewarded tokens (R and S) were generally manipulated less often than expected by chance (Tables 3, 4). Both subjects consistently used their own tactic, as previously described, each time they solved the task.

Experiment 5: 25-min delay storing experiment

The tactic employed by the macaque Sha was further investigated in Experiment 5 with the two subjects (She and Arn) who had never succeeded. The rationale was that monkeys could succeed in planning tasks if they had the possibility to ‘store’ tokens in advance at the correct location. Experiment 5 was designed to shorten the time lag within which key behaviours had to be performed; we gave subjects the opportunity to deposit tokens in the testing compartment during the collection period, so as to clump collection and transport of tokens together in a 5-min time period. The procedure was therefore similar to that of Experiment 2, except that once the token box had been hung on the wire mesh, the assistant opened the sliding doors between the waiting and testing compartments and the 5-min collection period began. After this time, the box was removed and the subject was ushered into the waiting compartment for the remaining waiting time. The two subjects were trained over 14 days prior to testing to boost their motivation to participate in the task and were then tested for 10 trials.

One macaque (She) deposited three suitable tokens in the testing compartment in two trials out of ten (Table 2). The capuchin (Arn) failed to solve the task again and only took part in 5 trials due to his lack of motivation to collect tokens. Token collection was therefore only examined for the macaque She, who collected tokens selectively for the first time with a preference for the rewarded one (G test: p < 0.001; post hoc test (B): p < 0.01). By comparison, the non-rewarded tokens (R and S) were collected equally or less often than expected by chance (Table 3, post hoc tests: R, p > 0.05; S, p < 0.01). Manipulating behaviours were not considered here, as they were not relevant to the task in question.

Experiment 6: Control condition

Experiment 6 was a control experiment based on the procedure used in Experiments 1 and 2, but without the delay between token deposit and time of exchange. The three subjects who had never succeeded in any of the previous experiments (Lad, Sim, Arn) took part in three control trials.

Two subjects succeeded in 100 % of trials in the control task; with the exception of the macaque Sim’s first trial, in which he transported only one copper bend, the macaque Sim and the capuchin Arn transported all 12 available copper bends for exchange with their human partner. One macaque (Lad) did not succeed in the control task and refused to go and collect tokens. In a final phase, the partner handed Lad one exemplar of each token and begged again to check whether she would exchange one of them for a food reward. Under these circumstances, the macaque Lad gave the suitable token and obtained the reward.

Individuals’ success rates

Individuals’ success rates were evaluated using a generalized additive mixed model. As the macaque Sim only took part in one experiment, he was not included in the model. The effects tested in this model included manipulating behaviours, the collection of rewarded tokens and the task presented to the subjects as independent variables. Interactions between the task and the other two effects could not be tested due to small sample size. The model (see Table 5 for the detailed results) revealed that almost all the tasks explained part of the variance in the individuals’ success rate, suggesting that each experiment made somewhat different cognitive demands on the subjects. The number of manipulating behaviours was a further predictor of the individuals’ success, with more frequent manipulation leading to more success, whereas the number of tokens collected had no predictive value (Table 5).

Table 5 Results of the generalized additive mixed model fitted with the proportion of success for each individual in each experiment as dependent variable; df = 5 in all tests

This result suggests that successes could have occurred by chance, that is, subjects merely transported the tokens they were currently manipulating into the testing compartment when the door opened. However, three aspects of the behaviour shown by the capuchin (Ros), who scored the highest number of successes (28 % of trials), deserve to be mentioned here. First, in four out of seven successes the capuchin Ros was not holding any token at the time of changing compartment. In the other three cases, she saved one exemplar of the suitable token towards the end of the waiting period for 8 s (Exp 3), 54 s (Exp 4) and 251 s (Exp 4) before any apparent changes occurred around her (i.e. before the sliding door was opened and the partner arrived for exchange). Second, it is worth noting that her patterns of token manipulation evolved as the study progressed. In Experiment 2, her two successes were indeed preceded by high manipulation rates for the whole duration of the waiting period (Fig. 2a). In Experiment 4, instead of being preceded by the continuous manipulation of suitable tokens, her three successes were characterized by systematic contacts with the suitable token at 3 and 23 min only (Fig. 2b), precisely at the times of collection and shortly prior to the opportunity of transport. Finally, if the transport of tokens had simply resulted from chance, we would have expected to see at least one occurrence of transport of each type of token based on the percentages of manipulation for each type of token during the waiting period (especially for Experiment 4, see Table 6), but this did not happen.
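
The 28 % figure for the capuchin Ros can be reproduced from the counts reported in the Results: two successes in 10 trials (Experiment 2), two in the 5-trial replicate (Experiment 3) and three in 10 trials (Experiment 4). The short tally below is only an arithmetic check; the variable names are illustrative.

```python
# (successes, trials) for the capuchin Ros, as reported in the Results.
ros_results = {
    "Experiment 2": (2, 10),
    "Experiment 3": (2, 5),   # 5-trial replicate, open-access trials excluded
    "Experiment 4": (3, 10),
}

total_successes = sum(s for s, _ in ros_results.values())
total_trials = sum(n for _, n in ros_results.values())
success_rate = total_successes / total_trials  # 7 / 25

print(f"{total_successes}/{total_trials} = {success_rate:.0%}")
```

The seven successes also match the split given above: four without a token in hand at the time of changing compartment and three preceded by saving a token.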

Fig. 2 The manipulation rates of copper bends by the capuchin Ros throughout the waiting time. The presented values refer to the percentage of trials where contacts with copper bends had been recorded throughout the waiting time, that is, at least one contact with one copper bend per 1-min interval. Plain line with circles: successful trials; dotted line with triangles: unsuccessful trials. a Experiment 2: N success = 2, N failure = 8; b Experiment 4: N success = 3, N failure = 7

Table 6 Percentages of the different types of token manipulated and transported into the test compartment by the individual Ros

Discussion

Monkeys showed little evidence of being able to plan for a future exchange. More specifically, three subjects never solved the task. Three others solved the task several times, although at very low rates (0.10–0.33), using different tactics, mostly suboptimal. Each of the two most successful subjects (the capuchin Ros and the macaque Sha) consistently used only one exemplar of the suitable token to solve the task. They never transported or dropped more than one copper bend (except the macaque Sha on one occasion). One can extract several implications from these findings.

Whatever the cognitive mechanisms underlying these responses, the transport of a single token rather than several exemplars may indicate some cognitive limitation in these species. Individuals seem to have focused on a strategy to get food, without understanding that they could maximize their gains by using several exemplars of the same token. This result is reminiscent of those obtained during studies of temporal myopia in macaques and chimpanzees by Silberberg and colleagues (1998). In a food choice experiment, subjects of both species consistently preferred peanuts to sweet potato slices, but chose equally between one peanut alone and one peanut plus one slice of sweet potato. This suggests that they assigned no particular value to future food consumption, but rather focused on the quality of a food item. The fact that food quality could be a more relevant cue than quantity for monkeys has also recently been shown in capuchin monkeys (Anderson et al. 2008).

In Experiment 3, we checked whether these low rates of success were due to the difficulty of coping with the temporal component of the task, or whether they could be explained by subjects not understanding the task requirements, that is, the need to hold a suitable token, where to find it and where to transport it. All subjects recovered and transported suitable tokens during the open-access condition, suggesting that they knew all they needed to solve the task. The two subjects out of four who solved the planning task thereafter were also those who had previously done it in Experiments 1 or 2. These results indicate that for the other subjects, simply understanding the task requirements was not sufficient to enable them to plan for a future opportunity to exchange tokens for food. Hence, defining when opportunities for exchanging tokens would occur was certainly the limiting component.

On close examination of the behaviour shown by our subjects in relation to their success rates, it was puzzling that the rates of suitable token transport to the testing compartment did not relate to token collection. This means that subjects performed the key behaviours of collecting and transporting tokens independently, which suggests that they did not have a global understanding of the task. Instead, the model showed that token manipulation was a good predictor of the success rates of individuals. Overall, these results suggest that the most difficult component of the task could be to bridge the temporal gap between collecting and transporting tokens and that the manipulation of tokens for the whole waiting duration could help subjects to do so.

We thought that shortening the time lag between key behaviours was another way of coping with this temporal gap problem. Experiment 5 therefore examined whether subjects would be more likely to deposit tokens in the testing compartment 25 min before the time of the exchange activity, rather than keep track of their collected tokens over a longer delay. One macaque (Sha) had previously solved the planning task several times by dropping some of the correct tokens in the adjacent testing compartment during the waiting time. In this experiment, another macaque (She) solved the task twice, by depositing three exemplars of the suitable token in the testing compartment during the collection period, whereas the capuchin did not. Nevertheless, success rates remained very low. Overall, only one capuchin (Ros) repeatedly solved the task, transporting one correct token in 28 % of all her trials and never transporting any other type of token. This subject’s results may indicate a better understanding of the task than that shown by the other subjects.

It is often considered that future-oriented behaviours, even the most complex ones, can be explained to a significant extent by innate predispositions (Suddendorf and Corballis 2007, 2008; Tulving 2005). With regard to monkeys, we are more inclined to think that the collecting and transporting of tokens in order to exchange them for food is profoundly reliant on learning abilities. Hence, the use of a learned procedure to assess planning skills in this study guarantees that flexible cognitive processes were at work. However, it has also been suggested that learned associations could be responsible for the display of future-oriented behaviours (Suddendorf et al. 2009; Suddendorf and Corballis 2010). In particular, the repeated exposure to the same stimulus-reward relationships and the presence of potential discriminative cues in the environment are considered to facilitate such behaviours. Our planning task is based on repetitive trials as part of the experimental design; we think this approach enables monkeys to recall episodic information from previous testing days, so as to steer their behaviour in the following days. Once the testing phase had started, reinforcement of the token–reward relationship ceased and cuing by the human partner was controlled. While offering new tokens to the monkeys in a single-trial task could be argued to lead to stronger evidence of flexible planning, it would however fail to control for the attractiveness of new objects that the monkeys could keep in their hands for hours and so solve the task accidentally.

Evidence of flexibility comes from testing future-planning abilities in various contexts. The one study that has examined this issue in chimpanzees showed that subjects were unable to solve a planning-to-exchange task that involved a one-hour delay between token collection and exchange (Dufour and Sterck 2008), whereas apes are known to be capable of planning future tool use at comparable delays (e.g. Dufour and Sterck 2008; Mulcahy and Call 2006; Osvath and Osvath 2008). We are currently testing orangutans and bonobos in a similar planning-to-exchange task to assess the extent of their flexibility in this set-up. As regards Dufour and Sterck’s (2008) study, the ‘social’ dimension of the token exchange may have added some complexity in comparison with the tool-use procedure, since the human partner could not be manipulated or controlled like a physical object, nor be understood in terms of a causal relationship. We may even consider that some element of preparedness (Seligman 1971) may be involved in planning for future tool use, whereas it would not be so in planning for token exchange. Furthermore, Dufour and Sterck (2008) suggested that subjects may have remembered the unwilling attitude of the human partner more than their personal actions during testing, hence sharply reducing their motivation to solve a task that involved such an unreliable partner. These considerations hold true for the present study, and comparisons with planning for future tool use remain to be carried out with monkeys, especially capuchin monkeys.

Whether or not animals possess the capacity for episodic memory appears to be a pivotal element in our understanding of future-planning abilities in animals. Thinking back and forth in time has been described as a subjective experience accompanying the conscious recollection of what specific event occurred or will occur, and where and when the occurrence took or will take place (Clayton et al. 2003; Tulving 2005). Such a system would allow individuals to envisage future scenarios from the recollection of personal events, even without environmental discriminative cues reminiscent of the problem in question. Neuroimaging studies have recently demonstrated that the same brain regions are engaged in episodic future and past thinking, supporting the idea of a common system of constructive episodic simulation (see Szpunar 2010 for a review). It is worth noting that non-human primates’ abilities for episodic memory and future planning appear to converge. Indeed, unlike macaques (Hampton et al. 2005), chimpanzees (Martin-Ordas et al. 2010) seem to possess an integrated representation of what event occurred, where and when, although their performances have not yet been shown to be as finely tuned as those of scrub jays (Dekleva et al. 2011). No study has yet investigated episodic memory abilities in capuchin monkeys.

Our results are consistent with the idea that future planning is probably not within the reach of monkeys, at least not in the same way as it is for humans. How monkeys make future-oriented decisions ‘without having the future in mind’ remains a question that deserves further investigation. It should be emphasized, however, that the experimental criteria used to test planning in non-verbal individuals sharply restrict the possibilities of investigation, and lead us to focus on the most cognitively demanding of the skills seen in human planning (Hayes-Roth and Hayes-Roth 1979). These results appear relevant to the debate about reciprocal interactions in animals. Reciprocal altruism (Trivers 1971) can only be demonstrated by proving that a contingency exists between giving and receiving, although the time frame of this contingency has not been clearly stated to date (Schino et al. 2007, 2009). Our results indicate that monkeys may not have any expectation of reciprocation in mind when investing in cooperative interactions involving a delay of half an hour or more. The cognitive demands of a planning task seem to be too high for monkeys to deal with easily. It remains plausible that they expect a return within shorter delays, as bridging a temporal gap of less than half an hour could be tractable for them. In this respect, we advocate taking animals’ understanding of the future into account so as to determine the relevant time frame within which contingent exchanges of goods and services can be measured in animals.