Introduction

In the transitive inference task described by Piaget (1928, 1955), a child might be told, for example, that Alice is taller than Betty and Betty is taller than Carol. The child would then be asked, “Who is taller, Alice or Carol?” If the child answered Alice, Piaget believed that it would suggest that the child could use knowledge about Betty as a mediator to infer the relation between Alice and Carol. This problem can be symbolized as A > B > C. The correct response to this question may not require an inference, however, because although both Alice and Betty have been described as taller than someone, Carol never has. Such a solution may not require an understanding of the relations among the propositions.

Bryant and Trabasso (1971) proposed that this non-relational solution to the task could be avoided if the series was expanded by adding two more propositions to the two that were given, symbolized as A > B > C > D > E. In this way, an inference can be judged without testing with a term positioned at either end of the series (i.e., an end point). As B would be taller than C but not taller than A, and D would be taller than E and not taller than C, a relational solution should be needed to arrive at the correct response.

McGonigle and Chalmers (1977) reasoned that one could modify this task to make it appropriate for use with nonverbal animals by converting the training to four premise pairs consisting of simultaneous discriminations, in which choice of one of the two stimuli in each premise pair was reinforced (S+) and choice of the other was not (S −). Thus the original A > B > C > D > E became A + B −, B + C −, C + D −, D + E −. When they trained monkeys on the four premise pairs and tested them on the untrained pair of non-end-point terms, B and D, they found a strong preference for B, consistent with a transitive inference.

The conclusion that monkeys are able to make inferences of the kind that would be required with this procedure may be somewhat surprising. Since then, however, transitive inference effects have also been found in several other nonhuman species: rats (Davis 1992; Roberts and Phelps 1994), chimpanzees (Boysen et al. 1993), pigeons (Fersen et al. 1991) and even fish (Grosenick et al. 2007). The effect has also been demonstrated using a variety of procedures: the mixed-pair procedure (in which all of the pairs are trained at once; Fersen et al. 1991), the successive-pair procedure (Steirn et al. 1995; Weaver et al. 1997), and various other hybrid procedures (e.g., Daniels et al. 2014; Gillan 1981; Higa and Staddon 1993). Before accepting the evidence for an inference-like process, one should be careful to eliminate other non-inferential accounts of the preference typically found in testing.

According to one alternative account, although training on all of the premise pairs is nominally the same, the history of reinforcement associated with each of the test stimuli, B and D, may not have been equal (see Couvillon and Bitterman 1992; Wynne et al. 1992). When the history of reinforcement of the B and D stimuli has been examined, the ratio of reinforced responding failed to predict the choice of B over D (Lazareva and Wasserman 2006; Steirn et al. 1995; Weaver et al. 1997). Furthermore, Lazareva and Wasserman (2012) tested the reinforcement history account more directly, by altering the associative value of D such that it should have been greater than B (see also Gazes et al. 2014). They accomplished this by giving the pigeons massed presentations of D + E − until D had a greater history of reinforcement than B had. Contrary to the predictions of the reinforcement history account, B was still preferred over D. These findings suggest that some other mechanism must be responsible for the choice of B over D with the five-term procedure.

Fersen et al. (1991) proposed an alternative non-inferential account for the finding of a transitive inference effect in nonhumans called value transfer theory. According to value transfer theory, in a simultaneous discrimination, positive value is transferred from the reinforced member of each trained stimulus pair, to the nonreinforced member of the pair. Thus in the A + B − pair, B acquires positive value because it is presented with A. This transferred value from A is added to any value that B acquires directly from reinforced responding in the presence of C (in the B + C − discrimination). Thus, although the direct values of B and D may be comparable, greater value should transfer from A (a stimulus to which responding is always reinforced) to B, than should transfer from C (a stimulus to which responding is not reinforced in the presence of B) to D.

To test the value transfer account of transitive inference, Weaver et al. (1997) used a five-term procedure that equated the value of A, C, and E. This manipulation involved reinforcing a random 50% of the choices of A while responses to B continued to be nonreinforced. In the B + C − discrimination, they reinforced all choices of the B stimulus and 50% of the choices of the C stimulus. In the C + D − discrimination, they reinforced 50% of the choices of the C stimulus but not choices of the D stimulus. In the D + E − discrimination, they reinforced all choices of the D stimulus and 50% of the choices of the E stimulus. Thus, choices of the A, C, and E stimuli were reinforced 50% of the time, whenever they appeared, and there should have been no differential value transfer to B and D from A and C, respectively. Nevertheless, for this condition, Weaver et al. found evidence of transitive inference (i.e., a significant preference for B over D).
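The logic of the value transfer account, and of the Weaver et al. (1997) manipulation, can be illustrated with a minimal numerical sketch. The Python code below is not taken from any of the cited studies. It assumes that the direct value of a stimulus is its mean probability of reinforcement across the pairs in which it appears, and that a fixed fraction of the direct value of each S+ (the hypothetical parameter `transfer`, set arbitrarily to 0.2) is added to the value of its S −. Under those assumptions, standard training yields a greater total value for B than for D, whereas the equated schedule yields equal total values for B and D.

```python
# Illustrative first-order value-transfer sketch. All numbers are hypothetical
# and chosen only to show the direction of the predicted difference.

def total_values(pairs, transfer=0.2):
    """Return direct value plus transferred value for every stimulus.

    pairs: list of (s_plus, p_plus, s_minus, p_minus), where p_* is the
           probability that a choice of that stimulus is reinforced.
    transfer: assumed fraction of the S+ direct value that transfers to the S-.
    """
    probs = {}
    for s_plus, p_plus, s_minus, p_minus in pairs:
        probs.setdefault(s_plus, []).append(p_plus)
        probs.setdefault(s_minus, []).append(p_minus)

    # Direct value: mean reinforcement probability over the pairs a stimulus is in.
    direct = {s: sum(p) / len(p) for s, p in probs.items()}

    # Add the value transferred from the S+ of each pair to its S-.
    total = dict(direct)
    for s_plus, _, s_minus, _ in pairs:
        total[s_minus] += transfer * direct[s_plus]
    return total


# Standard five-term training: S+ always reinforced, S- never reinforced.
standard = [("A", 1.0, "B", 0.0), ("B", 1.0, "C", 0.0),
            ("C", 1.0, "D", 0.0), ("D", 1.0, "E", 0.0)]

# Schedule described for Weaver et al. (1997): choices of A, C, and E
# are reinforced 50% of the time whenever those stimuli appear.
equated = [("A", 0.5, "B", 0.0), ("B", 1.0, "C", 0.5),
           ("C", 0.5, "D", 0.0), ("D", 1.0, "E", 0.5)]

for name, schedule in (("standard", standard), ("equated", equated)):
    values = total_values(schedule)
    print(name, "B =", round(values["B"], 3), "D =", round(values["D"], 3))
```

The sketch considers only value transferred within each trained pair; it makes no claim about the size of the transfer parameter or about higher-order transfer.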

A third alternative account of transitive inference was recently proposed by Galizio et al. (2017) to account for a transitive inference effect with humans using the five-term procedure. According to this hypothesis, when subjects learn the A + B − discrimination, they could learn to select A+, or reject B −, or both. They refer to these alternatives as stimulus-control topographies. As choice of A is always reinforced, subjects could learn to select A but not to reject B. In the case of the D + E − discrimination, as choice of E is never reinforced, subjects may learn to reject E − but not to select D+. In the C + D − discrimination, however, they likely would have learned to reject D − and select C+ because choice of C was not reinforced in the B + C − discrimination. Thus, when presented with B and D in test trials, subjects would have a bias to select B over D. That is, the choice of B over D may result not from a transitive inference (i.e., if B is selected over C, and C is selected over D, then B should be selected over D by inference) but from the different select and reject tendencies acquired to B and D during training.

To encourage their subjects to learn to reject the B − stimulus, Galizio et al. (2017) varied the identity of the A + stimulus over trials using ten different A + stimuli. By varying the A + stimuli from trial to trial, they reasoned that subjects would learn to reject the B − stimulus, rather than learn to select the ten different A + stimuli. If subjects learned to reject the B − stimulus in the A + B − discrimination, and to reject the D − stimulus in the C + D − discrimination (because the C + stimulus was the C − stimulus in the B + C − discrimination), according to this hypothesis, it should reduce or eliminate the transitive inference effect. Whereas subjects in a control condition, in which there was only one A + stimulus, showed a reliable transitive inference effect, the subjects in the 10A + condition did not.

The purpose of the present experiment was to determine, for pigeons, whether reduced rejection of the B − stimulus in the A + B − discrimination might be responsible for the preference for the B stimulus over the D stimulus on B versus D test trials. In the present experiment, we used the successive-pair procedure used by Steirn et al. (1995) and Weaver et al. (1997). For the Experimental Group, over training trials, we used four different A + stimuli for the A + B − discrimination, whereas for the Control Group we used only one A + stimulus. In all other respects, the two groups were treated the same. We also manipulated the nature of the stimuli. For half of the pigeons in each group, the stimuli that represented the A–E stimuli were different colors; for the remaining pigeons, the stimuli were different national flags.

Method

Subjects

The subjects were 28 White Carneau pigeons 8–10 years old. All of the pigeons had previous training on a simultaneous color discrimination task (midsession reversal) in which in each session, the color that was positive on the first 40 trials was negative on the remaining 40 trials and vice versa (see Rayburn-Reeves et al. 2011). The pigeons were maintained at 85% of their free-feeding weight and were individually housed in wire cages with free access to water and grit in a colony room that was maintained on a 12-h/12-h light/dark cycle. The pigeons were cared for in accordance with University of Kentucky animal care guidelines.

Apparatus

The experiment was conducted in two sound-attenuating operant test chambers with inside measurements 35 cm high, 30 cm long, and 35 cm across the response panel. In one chamber, the panel had three response keys, each 2.5 cm in diameter and 6.0 cm apart. Only the left and right side keys were used in this experiment. The bottom edge of the response keys was 24 cm from the floor. A 12-stimulus inline projector behind the left and right response keys projected one of eight stimuli (red, green, yellow, blue, orange, pink, purple, and white hues) onto the response keys. A center-mounted feeder located 10 cm above the floor allowed access to mixed grain when raised (1.5 s), and was illuminated by a 28 V 0.04 A lamp.

The other chamber had an Elo 1537L 15-in. Open-Frame Touchmonitor on the front wall and a center-mounted grain feeder located 10 cm above the floor on the back wall. The flags were 3.3 cm wide by 2.5 cm high, when projected onto a black background on the screen. The bottom edge of the flags was located 12.7 cm above the floor of the test chamber and the flags were presented in pairs horizontally centered on the screen, with 1.5 cm between the flags.

For the standard response–key chamber with in-line projectors, for half of the pigeons in each group the colors were A = yellow, B = orange, C = red, D = pink, E = green. For the remaining pigeons, the colors were A = green, B = pink, C = red, D = orange, E = yellow. For both experimental groups, the additional A colors were purple, white, and blue.

For the touchscreen pigeons, for half of the pigeons in each group the flags were A = Taiwan, B = Korea, C = France, D = Australia, E = Canada. For the remaining pigeons, the flags were A = Canada, B = Australia, C = France, D = Korea, E = Taiwan. For both experimental groups the additional A flags were Mexico, Singapore, and Chile. The flags were chosen to be relatively simple and judged subjectively dissimilar by the experimenters.

Procedure

Pretraining

Pigeons in the standard response–key chamber (n = 16) were trained to peck each of the stimuli when presented singly on the left and right response key (one peck) to receive 1.5 s reinforcement. Each stimulus was presented five times on the left and right response key each session for two sessions.

For pigeons in the touch screen condition (n = 16), over a period of 2 weeks, the pigeons were trained to peck a single flag on the touch screen (the flag that was to serve as one of the experimental A flags and the single A flag for the corresponding Control Group) for 2 s of reinforcement. Reinforcement was extended for pigeons in the touch screen condition to allow for the fact that the grain magazine was located on the back wall of the chamber. At the end of 2 weeks of intensive training, all but three of the pigeons had been trained to peck the touch screen with sufficient force and for a sufficient peck duration (the touch screen required a peck duration of 16 ms for the peck to be recorded). Those three pigeons (two from the Experimental Group, one from the Control Group) were dropped from the study.

Training

Phase 1: Pigeons in both the standard response–key chamber and the touch screen box were divided into two groups. Pigeons in the Control Group of each condition were given A + B − training in which there was a single A stimulus consisting of either a color or a flag. Pigeons in the Experimental group of each condition were given A + B − training in which the A stimulus changed from trial to trial such that there were four A stimuli (A1 + B −, A2 + B −, A3 + B −, A4 + B −) that were presented in random permutations over trials. All pigeons were trained for 96 trials per session. The trials involving different A + stimuli were randomly presented with the constraint that there were no more than three trials in a row with the same A + stimulus. The location of the A + and B − stimuli was also randomly presented with the same constraint. All pigeons were trained to a criterion of two consecutive sessions of 90% correct or better. Following A + B − training, pigeons in the Experimental and Control Groups were treated the same.
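The constrained randomization described above can be sketched as follows. This Python code is a hypothetical illustration, not the software used in the experiment: the function names, the equal number of trials per A + stimulus, and the restart-on-dead-end strategy are all assumptions. It draws stimuli without replacement while refusing to extend any run beyond three trials, and applies the same constraint to the side on which the A + stimulus appears.

```python
import random


def longest_run(seq):
    """Length of the longest run of identical consecutive items."""
    longest = current = 0
    previous = object()  # sentinel that equals nothing in seq
    for item in seq:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest


def constrained_sequence(items, n_trials, max_run=3, rng=None):
    """Balanced random sequence with no more than max_run repeats in a row.

    Assumes each item appears equally often (n_trials must be divisible by
    the number of items). With a single item the constraint cannot apply,
    so the item is simply repeated (the one-A+ Control case).
    """
    rng = rng or random.Random()
    if len(items) == 1:
        return [items[0]] * n_trials
    if n_trials % len(items) != 0:
        raise ValueError("n_trials must be divisible by the number of items")

    while True:  # restart whenever the draw reaches a dead end
        remaining = {item: n_trials // len(items) for item in items}
        seq = []
        while len(seq) < n_trials:
            tail = seq[-max_run:]
            blocked = tail[0] if len(tail) == max_run and len(set(tail)) == 1 else None
            options = [i for i, count in remaining.items() if count > 0 and i != blocked]
            if not options:
                break
            # Weight by remaining count so the draw mimics sampling without replacement.
            choice = rng.choices(options, weights=[remaining[i] for i in options])[0]
            seq.append(choice)
            remaining[choice] -= 1
        if len(seq) == n_trials:
            return seq


# One hypothetical 96-trial session for the Experimental Group.
rng = random.Random(1)
a_stimuli = constrained_sequence(["A1", "A2", "A3", "A4"], 96, rng=rng)
a_side = constrained_sequence(["left", "right"], 96, rng=rng)
```

Whether the original sessions used exactly 24 trials per A + stimulus is not stated in the text; the equal split here is only a convenient assumption.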

Phases 2, 3, and 4: As each pigeon reached criterion, on the next session it was transferred to the B + C − discrimination and trained to a criterion of two consecutive sessions of 90% correct or better, and it was then transferred to the C + D − discrimination, and then to the D + E − discrimination, each trained to the same criterion. The same counterbalancing of the location of the positive and negative stimuli occurred as with the A + B − discrimination. One pigeon in the color discrimination Control Group became ill during the course of the experiment and had to be dropped from the study. None of the data from this pigeon were included in the results.
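The acquisition criterion can be expressed compactly. The sketch below is a hypothetical helper, not the authors' scoring code, and it assumes that sessions to criterion are counted inclusive of the two criterion sessions, which the text does not state explicitly. The accuracy values in the example are invented for illustration only.

```python
def sessions_to_criterion(accuracies, threshold=0.90, consecutive=2):
    """Return the session number on which criterion is met, or None.

    accuracies: per-session proportions correct, in order.
    Criterion: `consecutive` sessions in a row at or above `threshold`.
    The count includes the criterion sessions themselves (an assumption).
    """
    run = 0
    for session, accuracy in enumerate(accuracies, start=1):
        run = run + 1 if accuracy >= threshold else 0
        if run == consecutive:
            return session
    return None


# Invented example: criterion is met on the fifth session.
example = [0.70, 0.92, 0.88, 0.95, 0.97]
print(sessions_to_criterion(example))
```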

Testing phase

As each pigeon reached criterion on the D + E − discrimination, on the next session, it was tested with the B and D stimuli. There were 96 trials in the test session with stimulus location determined randomly with the same constraint as in training. Choice of either stimulus was reinforced nondifferentially 50% of the time. Reinforcement was determined by random selection of one-half of the trials. On those trials choice of either stimulus was reinforced, whereas choice of neither stimulus was reinforced on the remaining trials. All pigeons received one session of testing.
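The nondifferential reinforcement used in testing can be sketched in a few lines. The Python code below is an illustrative assumption about how such a schedule might be generated, not the experimental control program: it selects a random half of the 96 test trials in advance and reinforces whichever stimulus is chosen on those trials, so that reinforcement carries no information about B versus D.

```python
import random


def nondifferential_schedule(n_trials=96, rng=None):
    """Return a list of booleans: True if the choice on that trial is reinforced.

    Exactly half of the trials are reinforced, independent of which
    stimulus (B or D) the subject chooses.
    """
    rng = rng or random.Random()
    reinforced = set(rng.sample(range(n_trials), n_trials // 2))
    return [trial in reinforced for trial in range(n_trials)]


schedule = nondifferential_schedule(rng=random.Random(7))
print(sum(schedule), "of", len(schedule), "test trials reinforced")
```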

Results

Training. The procedure used by Steirn et al. (1995) and Weaver et al. (1997) has been found to result in rapid acquisition of the four discriminations and has typically demonstrated a significant preference for B over D. The number of sessions to the criterion of two successive sessions with accuracy at or above 90%, for each of the conditions (colored response keys and touch screen flags) and each of the four discriminations, appears in Table 1. As can be seen in the table, acquisition of the four discriminations by pigeons in the Experimental and Control Groups, for colors and flags, was rapid and comparable. A two-way mixed-factor analysis of variance performed on the training data with discrimination (A + B −, B + C −, C + D −, D + E −) and Group (Experimental versus Control) as factors indicated that the effect of discrimination was significant, F(3,27) = 6.10, p = .0026, but the effect of Group, F < 1, was not. Most important, the discrimination × group interaction, F(3,27) = 1.47, p = .245, was not significant.

Table 1 Sessions to a criterion of two sessions at or above 90% correct for each of the four training discriminations for the Experimental Group with 4A + stimuli and Control Group with only 1A + stimulus for the color discrimination and flag discrimination conditions

The only difference between the procedures for the Experimental and Control Groups occurred in the A + B − discrimination, in which for the Experimental Group there were four A + stimuli that varied over trials, whereas for the Control Group there was only one A + stimulus. For this reason, a separate analysis was performed on the training data from the A + B − discrimination. The analysis indicated that the difference between the Experimental (M = 3.31, ± 0.70 SEM) and Control (M = 3.46, ± 0.27 SEM) Groups was not statistically significant, t < 1, Cohen’s d = 0.178.

Testing. The results of the testing session are presented in Fig. 1. For both the flag discriminations and the color discriminations, choice of the B stimulus for the Experimental Group (4A+) was less than for the Control Group (1A+). A two-way analysis of variance, with Experimental versus Control Groups and Condition (colors or flags) as factors, performed on the testing data indicated that the effect of Group was statistically significant, F(1,22) = 6.74, p = .016, Cohen’s d = 0.959. Pigeons in the Experimental Group (M = 52.1%, ± 7.30 SEM), chose the B stimulus significantly less than pigeons in the Control Group (M = 72.2%, ± 4.94 SEM). The effect of Condition was also significant, F(1,22) = 5.81, p = .025, Cohen’s d = 0.864. Pigeons in the flag condition chose the B stimulus significantly less than pigeons in the color condition. Although the difference between the Experimental and Control Groups for the flag condition (22.7%) was somewhat greater than the difference between the Experimental and Control Groups for the color condition (17.6%), the group × condition interaction was not statistically significant, F(1,22) = 1.67, p = .21, Cohen’s d = 0.119. Test data for the individual pigeons in each group appear in Table 2.

Fig. 1 Pigeons were trained with either flags or colors on four successive simultaneous discriminations: A + B −, B + C −, C + D −, D + E − and were then tested on the BD pair. For the Experimental Group (4A+) any one of four A + stimuli could appear from trial to trial in the A + B − discrimination. For the Control Group (1A+) there was only one A + stimulus

Table 2 Number of choices of the B stimulus during the B versus D test session for Experimental and Control Groups in each of the color and flag conditions

A single sample t test on the data from the Control Group relative to chance indicated that the preference for the B stimulus was statistically significant, t(12) = 3.77, p = .003, Cohen’s d = 2.176. This shows that we were able to replicate the basic finding of a significant preference for B over D, found in many other experiments.

To determine if the rate of acquisition of the training discriminations was related to the preference for the B stimulus, we conducted a Pearson correlation between sessions to criterion, pooled over the four training discriminations, and choice of the B stimulus on the test session. Although a positive correlation was found, r(24) = 0.33, p = .10, it was not statistically significant. A second Pearson correlation between sessions to criterion on just the A + B − discrimination (the only discrimination for which training for the Experimental and Control Groups differed) and choice of the B stimulus on the test session also yielded a positive correlation but it too was not statistically significant, r(24) = 0.31, p = .12.

Discussion

McGonigle and Chalmers (1977) developed the simple simultaneous discrimination procedure using a five-term transitive inference task for use with animals (A + B −, B + C −, C + D −, D + E −). Using this procedure, when given a choice between the B and D stimuli, animals generally show a transitive inference effect in the form of a preference for B. Such evidence for transitive inference by animals has sometimes been attributed to differential reinforcement effects (Couvillon and Bitterman 1992; Wynne et al. 1992). Attempts to assess the relation between the history of reinforcement and the transitive inference effect, however, have not supported this hypothesis (Steirn et al. 1995; Weaver et al. 1997). Furthermore, explicitly providing a large amount of overtraining on the D + E − discrimination did not eliminate the preference for B over D (Lazareva and Wasserman 2012).

Alternatively, Fersen et al. (1991) suggested that differential value transfer from the stimulus to which responding is reinforced to the stimulus to which responding is not reinforced may account for the transitive inference effect. According to this theory, A + is a stimulus to which responding always provides reinforcement and thus, has more value to give to B −, than C + has to give D −, a stimulus to which responding does not provide reinforcement when presented with B+. But Weaver et al. (1997) found that when the value of A, C, and E were all equated, there was still evidence for a transitive inference effect.

More recently, Galizio et al. (2017) suggested that the transitive inference effect in humans might result from differential experience with the B − and D − stimuli (a hypothesis that they referred to as different stimulus-control topographies). Specifically, because B − is presented with A+, a stimulus that is always correct, there is very little experience with the B − stimulus. The D − stimulus, however, is experienced in the presence of C+, a stimulus that is incorrect in the presence of B+. Thus, there should be more experience with choice of the D − stimulus than with the B − stimulus and consequently B should be preferred over D. To test this hypothesis with humans, Galizio et al. used ten different A stimuli to encourage subjects to learn to reject the B − stimulus, rather than learning to select all ten of the A + stimuli. They found that this manipulation eliminated the transitive inference effect.

The present experiment with pigeons used a design similar to that used by Galizio et al. (2017) with humans but with only four different A stimuli. The results of our experiment supported the hypothesis that a preference for B over D can be explained by the inability of the B − stimulus to control choice as much as the D − stimulus.

Could it be that the A + B − discrimination involving four different A + stimuli was simply more difficult to acquire because the four different A + s required a greater amount of learning and thus, the added training was responsible for the pigeons’ failure to show a transitive inference effect? Examination of Table 1 suggests that differential training on the A + B − discrimination was not responsible for the group differences that were found, as the amount of A + B − training was comparable for the two groups in both the color and flag conditions. Similar rates of learning suggest that the 4A + task was not more difficult, regardless of what the hypothesized difficulty might be.

The hypothesis tested in the present experiment was that in the original procedure involving a single A + stimulus the animals may not develop as much tendency to reject the B − stimulus as they develop to the D − stimulus in the C + D − discrimination. To test this hypothesis, we trained the pigeons in the Experimental Group with four different A + stimuli to encourage the pigeons to learn to reject the B − stimulus. If the pigeons in the Experimental Group learned to reject the B − stimulus, one might have expected it to make the B + C − discrimination harder for those pigeons than for the pigeons in the Control Group. Yet, acquisition of the B + C − discrimination was virtually the same for both groups. The failure to observe a difference in sessions to acquisition of the B + C − discrimination by the pigeons in the two groups may result from the pigeons’ tendency to be neophobic (Greenberg 2003). If so, pigeons may not show a natural preference for a novel stimulus even over a stimulus that they had learned to reject.

The implication of the findings from the present experiment is that the transitive inference effect does not result from the inference that the B stimulus must be better than the D stimulus because the B stimulus is better than the C stimulus and the C stimulus is better than the D stimulus. The present results suggest that rather than making an inference, the pigeons are likely weighing the select and reject response tendencies to the B stimulus relative to the D stimulus, without involving the C stimulus as a mediator. We believe that this non-inferential interpretation can account for the results of most research on transitive inference that has used nonverbal five-term (or more) procedures.