Introduction

Behavioral flexibility is the ability to adjust behavior to better adapt to changing environmental contingencies (Bond et al. 2007). Flexibility of this sort requires an animal to rapidly inhibit responses to stimuli no longer profitable, while sometimes engaging in novel (or previously punished) behavior in order to acquire new associations. Measuring an organism’s behavioral flexibility is one way to indirectly assess its level of cognitive flexibility, which has been defined as the ability to shift attention to different sources of information based on comparisons of adaptive value (Klanker et al. 2013). This ability has been studied extensively in human and non-human animals using a variety of paradigms, such as set shifting, inhibitory learning, and reversal learning, and has been argued to involve a number of higher-order cognitive processes, such as the ability to attend or shift attention to the most relevant information available (Hamilton and Brigman 2015). These paradigms reveal interesting differences in behavior that can be compared across species to investigate differences in both the qualitative and quantitative mechanisms that guide behavioral change.

Recently, researchers have investigated whether various species show similar switching behavior in a reversal task in which a single reversal occurs at the midpoint of each session (Cook and Rosen 2010; Rayburn-Reeves et al. 2011). In this task, called midsession reversal (MSR), one behavioral pattern is reinforced for the first half of a session (S1+, S2−) and the opposite pattern is reinforced for the second half of the session (S1−, S2+). This MSR task measures how readily and under what conditions animals adjust their behavior over time, with the most recent trial’s response-reinforcement contingency arguably the most advantageous cue available to control behavior. That is, efficient control by this cue and a high degree of behavioral flexibility could potentially require only a single experience to adjust behavior on the following trial, a strategy termed win-stay/lose-shift (Levine 1975; Restle 1962). Therefore, we can compare various species’ ability to optimize behavior on the MSR task with their performance on similar cognitive tasks, such as serial reversal learning (SRL), thereby providing more insight into whether there exists a quantitative difference in the rate of error reduction over reversal sessions between species. Additionally, we may also begin to pinpoint the types of cues that come to control behavioral responses over learning, as there may be a shift in attention to various cues over time, as has been documented in previous research (Rayburn-Reeves and Cook 2016).

An advantage of the MSR task is that it provides multiple, relevant and predictable cues, thereby allowing for the assessment of whether and when a particular cue comes to control behavior. This is due to the consistency of the reversal occurring in the middle of the session, allowing for a variety of sources of information, such as recent reinforcement history, the passage of time, the number of trials within the session, or changing satiety levels, to predict the reversal. Therefore, MSR tasks allow us to investigate the same mechanisms of flexibility as do other reversal learning and set-shifting tasks, but with the added benefit of being able to assess qualitative (i.e., cue dominance) as well as quantitative (i.e., rate of error reduction) differences in learning over time. Finally, another advantage of the MSR task is that the utilization of particular cues becomes evident within the first 10–20 sessions of training (Laude et al. 2014; Rayburn-Reeves et al. 2011, 2013a, b). Therefore, this task allows for a more rapid analysis of the cognitive mechanisms controlling behavioral choice than that afforded by other reversal learning tasks.

Rayburn-Reeves et al. (2011) initially studied MSR learning in pigeons using a simultaneous (e.g., S1+, S2−) visual discrimination over 50, 80-trial sessions where the values of these stimuli were consistently reversed (S1−, S2+) on Trial 41 of each session. They found that, once performance stabilized, pigeons displayed systematic errors around the reversal location. Specifically, they began responding to S2 before the reversal occurred within the session and maintained responding to S1 after the reversal occurred. These two types of errors, termed anticipatory and perseverative errors, respectively, were approximately equivalent in frequency and varied as a function of proximity to the reversal (see Fig. 1, reprinted from Rayburn-Reeves et al. 2011). These results suggest that the pigeons’ behavior was primarily controlled by the temporal properties of the session, where pigeons appeared to be timing from the start of the session to the reversal event. This theory has been supported by research demonstrating shifts in these anticipatory and perseverative errors when the intertrial interval (ITI) is manipulated (McMillan and Roberts 2015). From our human perspective, this temporal cue seems more effortful and less efficient than attending to recent response-reinforcement contingencies. Why the pigeons were not controlled (at least primarily) by the information provided by recent reinforcement contingencies is not clear. Had they been able to attend and adjust to this information efficiently, they could have received reinforcement on all trials except for the first reversal trial (Trial 41). In spite of this difference, however, it should be noted that their accuracy on these final sessions averaged approximately 90% (Rayburn-Reeves et al. 2011).
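
The claim that an animal following recent reinforcement feedback would miss only the reversal trial can be illustrated with a minimal sketch of a win-stay/lose-shift agent on an 80-trial session with the reversal on Trial 41. The function names are hypothetical, and the sketch assumes the agent begins each session by choosing S1; an agent that began on S2 would make one additional error on Trial 1.

```python
def correct_stimulus(trial, reversal_trial=41):
    """Return the reinforced stimulus on a 1-indexed trial."""
    return "S1" if trial < reversal_trial else "S2"


def run_wsls(n_trials=80, reversal_trial=41, first_choice="S1"):
    """Simulate win-stay/lose-shift; return (trial, choice, reinforced) tuples."""
    choice = first_choice  # assumption: the session starts with this choice
    history = []
    for trial in range(1, n_trials + 1):
        reinforced = choice == correct_stimulus(trial, reversal_trial)
        history.append((trial, choice, reinforced))
        if not reinforced:
            # Lose-shift: switch to the other stimulus on the next trial
            choice = "S2" if choice == "S1" else "S1"
        # Win-stay: otherwise keep the same choice
    return history


history = run_wsls()
errors = [trial for trial, _, reinforced in history if not reinforced]
print(errors)                          # [41]
print(sum(r for _, _, r in history))   # 79
```

Under these assumptions the only unreinforced trial is the reversal trial itself, so the agent earns 79 of 80 possible rewards.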

Fig. 1 The percentage choice of S1 as a function of trial number averaged across pigeons for the last 10 sessions of training. These data are reprinted from Experiment 1 of Rayburn-Reeves et al. (2011)

In an attempt to discourage the use of time as a cue, Rayburn-Reeves et al. (2011) varied the point at which the reversal occurred across sessions. They trained a novel group of pigeons using five randomized reversal locations (after Trials 10, 25, 40, 55 or 70) for 100 sessions (20 at each reversal location). They found that when the reversal occurred early in the session (after Trial 10), the pigeons showed little to no anticipation but a large amount of perseveration. When the reversal occurred late in the session (after Trial 70), pigeons showed a high degree of anticipation, responding equally to S1 and S2 immediately before the reversal. Additionally, the percentage of errors produced by the different reversal locations was highest when the reversals occurred at the endpoints of the sessions and lowest when it occurred at the midpoint. However, the functions produced by the different reversals did not overlap. It appeared pigeons were averaging the probability that the reversal would occur at a given point over sessions and using that average, as well as the reinforcement feedback, to respond to S1 or S2. Therefore, even when the time within the session was made less reliable as a source of information about the changing environment, it continued to exert a high degree of control over behavior. This finding that the temporal properties of the task dominate behavioral choice has been replicated a number of times with pigeons (Laude et al. 2016, 2014; McMillan and Roberts 2012; McMillan et al. 2016; Rayburn-Reeves et al. 2013a, b), suggesting that it is a highly salient cue that readily comes to dominate behavioral choice in this species.

Rats, on the other hand, have produced varying results, depending on the apparatus with which they are tested. Given their poorer visual acuity as compared with pigeons and the novelty of the paradigm, to date, rats have only been tested on spatial MSR tasks. When tested in an operant chamber with a spatial discrimination, rats have shown highly flexible discriminative behavior, whether this discrimination requires a nose poke (Smith et al. 2016) or a lever press (Rayburn-Reeves et al. 2013a, b; Smith et al. 2016). In contrast, testing rats in an open-field T-maze apparatus produces marked anticipatory and perseverative errors (McMillan et al. 2014), suggesting differences in the particular cues mediating behavioral change across the session. McMillan et al. (2014) suggested that the more accurate performance around the reversal by rats in an operant setup is due to their ability to use position cues during the ITI to signal the correct response on the following trial; however, the evidence supporting this argument rests on a single experiment with rats tested in a T-maze procedure, which is a very different experimental design than that afforded by operant tasks. As seen with SRL tasks, which show how vital the methodology is to demonstrate either successful or unsuccessful reversal performance in various species (Bond et al. 2007; Brown 2015; O’Hara et al. 2015), more research in this area is warranted. A more complete picture of cue use in MSR tasks necessitates the testing of other species whose behavior under other, comparable cognitive tasks is well documented.

The purpose of our study was to examine the relative contribution of the various cues in control of behavior during a visual MSR task in rhesus macaques. Rhesus macaques have been a popular primate species in comparative cognition research, sometimes displaying behavior matching that of human participants in tasks such as serial chaining (D’Amato and Colombo 1988), list learning (Sands and Wright 1980), learning set formation (Harlow 1949) and SRL (Beran et al. 2008). Therefore, testing monkeys on the MSR task will provide evidence as to whether the same type of information that appears to guide human and rat MSR learning (i.e., reinforcement feedback) is employed by monkeys as well to maximize reinforcement. Although it remains unclear whether rats utilize this cue with non-spatial tasks, evidence for the use of this cue in monkeys allows for a better understanding of the qualitative differences in learning and ultimately provides a better picture of the level of cognitive flexibility available in various species.

Across three experimental phases, monkeys were tested with a single reversal occurring at the midpoint of the session, a single reversal occurring at various points across individual sessions and multiple reversals within individual sessions, respectively, using the five reversal locations used in previous research (Rayburn-Reeves et al. 2011). We examined the degree to which the monkeys showed anticipatory or perseverative errors, and how those performance patterns related to other species that have been tested on similar MSR tasks.

Methods

Subjects

Seven adult male rhesus macaque monkeys (Macaca mulatta; age in years—Chewie: 16, Han: 13, Hank: 33, Lou: 23, Luke: 16, Murph: 23 and Obi: 12) were tested using a within-subject design across all three experimental phases. All of the monkeys were housed at the Language Research Center of Georgia State University (GSU). They were tested individually, and usually in the morning, but had constant visual and auditory access to nearby monkeys, as well as a 24-h period with access to a compatible social partner once per week during which time they did not engage in computerized testing. The monkeys had continuous access to their test apparatus while in their home enclosures, allowing them to work as they chose for banana-flavored chow pellets. Although monkeys worked and rested as they chose during each session, we found that they nearly always worked on the task at a continuous and consistent pace (see Results section). Food and water deprivation were not used in this study; all of the animals had continuous access to water and were fed a daily diet of primate chow biscuits and various fruits and vegetables, regardless of their performance on the tasks. All testing protocols complied with US National Institutes of Health guidelines for the Care and Use of Laboratory Animals, as well as the guidelines for working with non-human primates as established by protocols approved by the GSU Institutional Animal Care and Use Committee (IACUC Protocol A15014, GSU). GSU is an Association for Assessment and Accreditation of Laboratory Animal Care International (AAALAC)-accredited institution.

Apparatus and stimuli

The monkeys were tested using the Language Research Center’s Computerized Test System, which comprises a personal computer, digital joystick, color monitor and pellet dispenser (Evans et al. 2008; Richardson et al. 1990). Monkeys used their hands to manipulate the joystick to control a small cursor on the computer screen. While the monkeys were engaged in the task, the viewing distance from the screen averaged 40 to 50 cm, so that the stimuli subtended visual angles of approximately 6.5°–7.5°. Reward consisted of a 94-mg banana-flavored chow pellet (Bio-Serv, Frenchtown, NJ) provided from a pellet dispenser connected to the computer. All of the tasks were written in Visual Basic 6.0. For all phases of the experiment, two different clip art images (6.0 × 6.5 cm) were used on every trial. One image was a five-pointed black and gray star, and the other was a white rightward-pointing arrow on a black background.

Procedure

Phase 1: single midsession reversal task

On Trial 1 of the experimental sessions, macaques were presented with the star and arrow stimuli at the top left and top right corners of the computer screen. Across trials, these two stimuli were randomly assigned to the two spatial locations. Monkeys Lou, Luke and Murph were given the star image as the first correct stimulus (S1), and Chewie, Han, Hank and Obi were given the arrow image as S1. For the first 40 trials of each session, selection of the S1 image resulted in the delivery of the food reward during the 2-s ITI, while selection of the S2 image resulted in no food reward and presentation of the 2-s ITI. From Trials 41 to 80, these contingencies were then reversed (S2+/S1−). Every session had the same experimental setup (Trials 1–40: S1+, S2−; Trials 41–80: S2+, S1−). The monkeys were given a single, 80-trial session per day, and all monkeys completed a total of ten sessions in this phase.

Phase 2: single variable reversal (SVR) task

Immediately following the ten training sessions on the MSR task, the location of the single reversal trial was varied across one of five preselected locations within the session (after Trials 10, 25, 40, 55 or 70). Monkeys were trained for a total of 25 sessions, with five sessions tested at each reversal location. All reversal locations were randomly assigned across experimental sessions with the condition that, across 5-session bins, all five reversal locations were experienced before any was repeated.
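
The constraint on session order described above (every reversal location is experienced once within each 5-session bin) amounts to block randomization. The following is a minimal sketch of one way to generate such a schedule; the function name and the seed argument are hypothetical, and this is not the authors' Visual Basic code.

```python
import random

REVERSAL_LOCATIONS = (10, 25, 40, 55, 70)  # reversal occurs after these trials


def svr_schedule(n_bins=5, locations=REVERSAL_LOCATIONS, seed=None):
    """Return one reversal location per session, shuffled within 5-session bins."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_bins):
        block = list(locations)
        rng.shuffle(block)      # random order within the bin
        schedule.extend(block)  # each location appears once per bin
    return schedule


schedule = svr_schedule(seed=0)
print(len(schedule))  # 25
```

With five bins of five sessions, each reversal location is tested five times, and no location repeats before all five have been experienced within a bin.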

Phase 3: multiple variable reversal task

Immediately following completion of the SVR task, monkeys were given an additional 20 sessions in which two reversals would occur within each session. The reversals occurred at two of the preselected locations used in the SVR task (after Trials 10, 25, 40, 55 or 70) within a single session. Therefore, a total of ten two-reversal combinations were possible (Trials 10 and 25, 10 and 40, 10 and 55, 10 and 70, 25 and 40, 25 and 55, 25 and 70, 40 and 55, 40 and 70 or 55 and 70). As in the first two experimental phases, the S1+ and S2− contingencies that were in effect pre-reversal became S1− and S2+, respectively, after the first reversal; they then reverted to the original S1+ and S2− after the second reversal. All other testing procedures were the same as in the previous phases of the experiment. Each monkey completed two 80-trial sessions for each two-reversal combination, with session type (i.e., where the two reversals occurred) randomized across sessions.
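
The ten two-reversal combinations and the resulting trial-by-trial contingencies can be enumerated with a short sketch. The helper name `correct_stimulus` is hypothetical, and the sketch assumes that a reversal "after Trial N" first takes effect on Trial N + 1.

```python
from itertools import combinations

REVERSAL_LOCATIONS = (10, 25, 40, 55, 70)


def correct_stimulus(trial, reversals_after):
    """S1 is correct until the first reversal, S2 until the second, then S1 again."""
    n_passed = sum(trial > r for r in reversals_after)  # reversals already in effect
    return "S1" if n_passed % 2 == 0 else "S2"


pairs = list(combinations(REVERSAL_LOCATIONS, 2))
print(len(pairs))  # 10

# Example session with reversals after Trials 25 and 55
session = [correct_stimulus(t, (25, 55)) for t in range(1, 81)]
print(session[24], session[25], session[54], session[55])  # S1 S2 S2 S1
```

Two sessions per combination gives the 20 sessions of this phase.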

Results

Time to complete sessions

In Phase 1, the monkeys completed each session in an average of 20.60 min (SD = 26.09). One monkey, Murph, took much longer (M = 56.7, SD = 45.62). If he is removed from the group, the average length and variability of the session on the group level are reduced (M = 14.47, SD = 14.55). Session length was not significantly correlated with session number for any monkey, indicating no tendency for monkeys to perform more quickly or more slowly as the experiment progressed. In Phase 2, the monkeys completed each session in an average of 25.16 min (SD = 31.48). Again, without Murph (M = 47.7, SD = 50.17) this number and the variability are reduced (M = 21.40, SD = 25.53). Correlations were not conducted between session length and session number for Phase 2, because of the additional factor of session type, which was randomized across sessions for each monkey. In Phase 3, the monkeys completed each session in an average of 28.98 min (SD = 42.77). When Murph (M = 89.7, SD = 68.89) is not included, the monkeys completed each session in an average of 18.85 min (SD = 25.61). For the same reason as in Phase 2, correlations were not conducted between session length and session number. Individual session length averages and standard deviations for each experimental phase are included in Table 1, as well as the Pearson product-moment correlations between session length and session number, with their statistical significance, for each monkey in Phase 1.

Table 1 Phases 1–3: average time to complete sessions for each monkey

Phase 1: single, midsession reversal location

The results of the first phase of the experiment with the single, MSR location are shown in Fig. 2, which depicts the percentage choice of S1 as a function of trial number averaged across subjects for all ten sessions of training. The data are plotted in blocks of five trials. Average overall accuracy was 90.04% (SEM = 3.08, 95% CI [83.48, 96.60]). In general, the monkeys chose the S1 stimulus the majority of the time during the first half of the session, with no evidence of anticipation of the reversal. After the reversal, the monkeys transitioned to responding to S2 within 5 to 10 trials and continued to choose S2 almost exclusively for the remainder of the session. Additionally, there was a small unreliable dip in the first block that was due to two monkeys (see Fig. 3 for individual subject data), but no such dip was seen in later phases as the monkeys accrued additional experience in the MSR task. This dip was due in all cases to between-subject variation in choice of S1 on these initial trials (range 54–100%) and has been found in previous research with pigeons (Rayburn-Reeves et al. 2011, 2016).

Fig. 2 Percentage choice of S1 as a function of trial number (in blocks of five trials) averaged across monkeys and the 10 sessions of Phase 1 training. The reversal location is indicated by the dashed line. Error bars depict standard errors of the mean

Fig. 3 Percentage choice of S1 as a function of trial number averaged across the 10 sessions of training in Phase 1 for each monkey. The reversal location is indicated by the dashed line

Figure 4 provides a more detailed picture of S1 responding on the trials immediately preceding and following the reversal trial (Trial 41). In contrast to pigeons but similarly to rats, monkeys showed no significant drop in response to S1 on the trials immediately preceding the reversal. Average accuracy across monkeys on Trials 37–41 was 95.39% (SEM = 1.19, 95% CI [92.47, 98.31]). Additionally, a one-way, repeated-measures ANOVA revealed no significant difference in accuracy across these trials, F(4, 24) = 1.39, P = .27, indicating that the monkeys consistently chose S1 at a high rate and showed no evidence of anticipation of the reversal event.
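
The text does not state how the 95% confidence intervals were computed. As a hedged check, the interval reported for Trials 37–41 appears consistent with a t-based interval, mean ± t × SEM, using n − 1 = 6 degrees of freedom for the seven monkeys. The sketch below works through that arithmetic; the function name is hypothetical, and the critical value 2.447 is the standard two-tailed .05 value of Student's t for 6 degrees of freedom.

```python
def t_confidence_interval(mean, sem, t_crit):
    """Return (lower, upper) bounds of mean +/- t_crit * sem."""
    half_width = t_crit * sem
    return (mean - half_width, mean + half_width)


T_CRIT_DF6 = 2.447  # assumption: t-based interval with 6 degrees of freedom

low, high = t_confidence_interval(95.39, 1.19, T_CRIT_DF6)
print(round(low, 2), round(high, 2))
```

The result agrees with the reported [92.47, 98.31] to within rounding of the published mean and SEM.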

Fig. 4 Percentage choice of S1 as a function of trial number averaged across monkeys and the 10 sessions of training in Phase 1. The reversal location is indicated by the dashed line. Error bars depict standard errors of the mean

On the five trials after the reversal (Trials 42–46), a significant drop in accuracy (M = 63.71, SEM = 8.66, 95% CI [39.64, 87.78]) was indicated by a one-way repeated-measures ANOVA, F(4, 24) = 11.38, P < .01. Tukey’s HSD analyses revealed this difference was due to an increase in accuracy from Trial 42 (M = 32.86, SEM = 8.43) to Trials 44 (M = 71.40, SEM = 10.90), 45 (M = 76.64, SEM = 7.13) and 46 (M = 79.97, SEM = 6.90). There was no significant difference in accuracy between Trials 42 and 43 (M = 56.92, SEM = 9.70), and no other differences in accuracy were significant. Finally, as a measure of sensitivity to the contingency shift, the average drop in choice of S1 from Trial 41 (M = 98.57, SEM = 1.43, 95% CI [95.07, 102.07]) to Trial 42 (M = 67.14, SEM = 8.43, 95% CI [46.49, 87.79]) was significant, t(6) = 4.25, P < .01.

Phase 2: single, variable reversal location

The results of the single, variable reversal location task are shown in Fig. 5, which depicts the percentage choice of S1 as a function of trial number with each of the five reversal locations plotted separately. The reversals are indicated by the dashed, vertical lines. Overall accuracy was high across all session types (M = 88.09, SEM = 1.14, 95% CI [84.92, 91.26]) and comparable to the overall accuracy observed during the MSR phase (M = 90.04%). When averaging across trials, a repeated-measures ANOVA revealed a significant effect of reversal location on accuracy, F(4, 24) = 5.14, P < .01. Pairwise comparisons revealed that this effect arose because accuracy on sessions with reversals occurring after Trial 70 (M = 85.10, SEM = 2.70, 95% CI [78.4, 91.8]) was significantly lower than when reversals occurred after Trial 10 (M = 91.4, SEM = 1.4, 95% CI [87.8, 94.9]), Trial 25 (M = 88.9, SEM = 2.3, 95% CI [83.2, 94.6]) and Trial 40 (M = 89.1, SEM = 2.0, 95% CI [84.2, 94.0]). Accuracy did not differ significantly between sessions with reversals occurring after Trials 70 and 55 (M = 86.0, SEM = 3.5, 95% CI [77.3, 94.6]), and no other significant differences were found. Therefore, although monkeys maintained a high degree of accuracy across reversal locations, they declined in accuracy as the reversal location was shifted later on in the session.

Fig. 5 Percentage choice of S1 as a function of trial number averaged across monkeys for all 25 sessions of training. Each reversal location (10, 25, 40, 55 and 70) is indicated by a dashed line

On the trials leading up to the reversal, including the reversal trial (Relative Trials −4 to 0), choice of S1 was consistently high (M = 91.66, SEM = 1.20). A repeated-measures ANOVA indicated no significant difference across reversal location on these trials, F(4, 24) = 0.97, P = .44. On the five trials after the reversal (Relative Trials +1 to +5), accuracy dropped to a mean of 46.40% (SEM = 2.53, 95%), but there was no significant difference in accuracy across reversal location as indicated by a repeated-measures ANOVA F(4, 24) = 2.25, P = .09. We compared accuracy on the five trials prior to and the five trials after each reversal across sessions for each reversal location using a 2 × 5 repeated-measures ANOVA (Trial Location [Pre-Reversal, Post-Reversal] × Reversal Location [10, 25, 40, 55, 70]). We found a significant main effect of Trial Location, F(1, 6) = 61.72, P < .01, indicating a significant drop in choice of S1 across the reversal, but no significant main effect of Reversal Location, F(4, 24) = 1.48, P = .24, indicating that the monkeys did not differ in their behavior across reversal locations, and no significant interaction, F(4, 24) = 2.05, P = .12. Therefore, regardless of reversal location, monkeys were highly accurate on trials preceding the reversal, with accuracy dropping significantly after the reversal.
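
The relative-trial windows used above can be made concrete with a small sketch that aligns each session to its own reversal. It assumes that Relative Trial 0 is the reversal trial, that is, the first trial after the stated reversal location (Trial 41 when the reversal occurs after Trial 40). The function names and the example session are hypothetical illustrations, not data from the study.

```python
def relative_trial(trial, reversal_after):
    """Relative Trial 0 is the reversal trial, i.e. trial reversal_after + 1."""
    return trial - (reversal_after + 1)


def window_percent_s1(choices_s1, reversal_after, start, stop):
    """Percentage of S1 choices on relative trials start..stop (inclusive).

    choices_s1 is a list of booleans in which index 0 corresponds to Trial 1.
    """
    picked = [chose_s1 for trial, chose_s1 in enumerate(choices_s1, start=1)
              if start <= relative_trial(trial, reversal_after) <= stop]
    return 100.0 * sum(picked) / len(picked)


# Hypothetical session: reversal after Trial 25, S1 chosen through Trial 28
choices = [True] * 28 + [False] * 52
print(window_percent_s1(choices, 25, -4, 0))  # 100.0
print(window_percent_s1(choices, 25, 1, 5))   # 40.0
```

Aligning sessions this way allows pre-reversal and post-reversal choice to be compared across the five reversal locations on a common axis.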

Phase 3: multiple, variable reversals

Overall accuracy for the multiple, variable reversal task (M = 86.78; SEM = 0.83) was similar to that seen in the two previous phases. We averaged across trials for each monkey and compared accuracy across reversal locations. For the first reversal position (after Trials 10, 25, 40 or 55), overall accuracy across reversal locations was almost identical, F(3, 18) = 0.57, P = .64. Similarly, no significant differences in overall accuracy (M = 86.75, SEM = 0.59) for the four reversals in the second reversal position (after Trials 25, 40, 55 or 70) were found, F(3, 18) = 0.76, P = .53.

As was found in Phase 2, the location of the reversal did not impact overall accuracy, nor did it impact anticipatory or perseverative error rates; therefore, we collapsed across reversal location to analyze general levels of anticipation and perseveration. These results are shown in Fig. 6, which depicts the percentage choice of the previously correct stimulus as a function of relative trial number for the first and second reversal locations for Phase 3. In addition to these two lines, we added the averaged function from Phase 2, collapsed across reversal location, to Fig. 6 for comparison purposes. After collapsing across reversal position, given that there were no significant differences, we analyzed differences in accuracy prior to and after the reversal location separately. Accuracy was high on the five trials prior to the relative reversal location (Trials −4 to 0: M = 97.29, SEM = 0.71), fell to chance on the five trials after the reversal (Trials +1 to +5: M = 43.43, SEM = 4.34) and then rose significantly thereafter (Trials +6 to +10: M = 83.86, SEM = 2.61). A repeated-measures ANOVA revealed a significant effect of relative trial block, F(2, 12) = 158.61, P < .01, where all three trial blocks were significantly different from one another. As with the previous phases, these results indicate that monkeys had an easier time staying with the initially correct stimulus than they did transitioning to the newly correct stimulus and inhibiting responses to the initially correct stimulus once it was no longer correct.

Fig. 6 Percentage choice of the previously correct stimulus as a function of relative trial number averaged across monkeys for Phases 2 and 3. The dashed line indicates the reversal location, relative to the trials immediately preceding and following it. Closed and open circles indicate the first and second reversals for Phase 3, respectively, while the closed triangles indicate the single, variable reversal from Phase 2

Between-phase comparisons

When comparing across the three phases of the experiment, Phase 1 accuracy was highest (M = 90.04, SEM = 1.86), followed by Phase 2 (M = 88.09, SEM = 2.31) and then Phase 3 (M = 86.46, SEM = 1.34). A comparison of overall accuracy across the three phases using a one-way repeated-measures ANOVA, however, revealed no significant differences, F(2, 12) = 3.18, P = .07, indicating that there was no reliable evidence that different mechanisms guided behavioral choice in the three tasks.

Discussion

The results across the three phases of the experiment demonstrate that monkeys, as opposed to pigeons, but similarly to rats when tested in operant chambers, show behavior that appears to be mediated by recent reinforcement contingencies. The lack of control by temporal information is evidenced by the absence of anticipatory errors prior to the reversal in all phases; we argue that such errors are a necessary signature of control by temporal information, given the inherent noise in the timing systems of animals (Buhusi and Meck 2005; Church et al. 1994). Although the monkeys were only tested on ten sessions with the MSR task as compared with 50 sessions for pigeons and rats, their accuracy on the trials immediately preceding and following the reversal is comparable to results obtained with rats in an operant setting (Rayburn-Reeves et al. 2013a, b; Smith et al. 2016). In addition, the average session length for the monkeys in Phase 1, excluding Murph, was less than 15 min, suggesting that the monkeys worked through trials consistently once they began their sessions, performing approximately 5–6 trials per minute. Therefore, although it would have been possible for the monkeys to have been controlled by a temporal cue, this did not seem to be the case, given the lack of anticipatory errors across sessions. If anything, monkeys who took longer to complete the task, such as Murph and Hank, achieved a lower overall accuracy than monkeys who completed the task in under 15 min, suggesting that maintaining a consistent pattern of responding over trials aids in performance on this task, regardless of the mediating variable controlling responses.

Phase 2 results with the single, variable reversal location showed a similar effect of control by recent response-reinforcement contingencies. For all five reversal locations, there was little to no anticipation prior to the reversal, followed by a systematic decrease in choice of the S1 stimulus on the trials immediately following the reversal event. This is consistent with previous findings with rats tested in an operant setting (Rayburn-Reeves et al. 2013a, b) and inconsistent with the results obtained with pigeons in an operant setting (Rayburn-Reeves et al. 2011) and rats tested in a T-maze apparatus (McMillan et al. 2014). The results suggest that recent response-reinforcement contingencies mediate switching behavior over the session for monkeys.

Phase 3 results further added to the finding that reinforcement was the cue controlling responses over the session. Regardless of where the first or second reversal location occurred within the session, responses prior to and after reversal points showed equivalent rates of anticipatory and perseverative behavior across reversal locations. Importantly, regardless of whether the reversal occurred after Trials 10, 25, 40, 55 or 70, responses did not increase to the S2 stimulus on trials immediately before the reversal. It appears that monkeys will stay with the option that has the stronger recent reinforcement probability (given that over many sessions, the probability of S1 and S2 being correct is 50%). The number of trials it takes them to switch to the newly correct stimulus is indicative of their level of behavioral flexibility, which can be considered to be a measure of inhibitory strength. It may also be a measure of cognitive flexibility, which may differ from the level of behavioral flexibility exhibited by various species. That is, cognitive flexibility can be defined as the ability to shift attention to various sources of information or different mechanisms of learning to maximize reinforcement. This may not manifest in flexible behavior if the inhibitory processes of behavioral control are strong.

The results of the current experiment reveal that time as a means to signal the availability of reinforcement is not readily controlling the behavior of rhesus macaques in the MSR task. Consistent with this, creating a situation where the reversal varies in location across sessions or where multiple reversals are presented within a single session does not produce shifts in the overall behavior of the monkeys. That is, the behavior exhibited is independent of the location or quantity of the reversal events within a session. This suggests that the behavior is likely controlled by recent history of reinforcement and that a shift in contingencies produces a shift in behavioral choice that occurs within a number of trials immediately following this event. These results mirror those found with a variety of primate species that suggest they may employ a reinforcement-based win-stay/lose-shift response pattern (e.g., Harlow 1949; Schrier 1984).

The same results were found with rats using a left/right spatial midsession reversal task (Rayburn-Reeves et al. 2013a, b), although the rats made fewer perseverative responses than the monkeys did in the current experiment. The reason for this error rate difference is likely the nature of the stimulus dimension used in each study. The spatial task provided an additional, proprioceptive cue that allowed the rats to orient to the correct location during the ITI, thereby bridging the gap between the reinforcement feedback of the previous response–outcome association and the next available choice point. This alternative hypothesis about rat performance on the MSR task is further supported by the finding that, in a T-maze procedure where the spatial orientation cue could not be used, rats appeared to use time as a cue, producing significant errors of anticipation and perseveration, both in an MSR and a variable reversal location setup (McMillan et al. 2014). Therefore, these results with monkeys provide additional information regarding the availability of the reinforcement cue as a basis for behavioral control in different species. We now have evidence that both monkeys and rats (in certain testing conditions), but not pigeons, can solve the MSR task by using the information provided by recent reinforcement. Future research with different species is needed to assess the degree to which control by recent response-reinforcement contingencies and temporal information mediate behavior in MSR tasks. This paradigm, however, provides an important tool for furthering our understanding of the differences in behavioral flexibility and the various sources of control that mediate behavioral choice across species.