1 Introduction

Cooperation in the finitely repeated prisoner’s dilemma (FRPD) is both widely observed and difficult to rationalize. The model developed by Kreps et al. (1982)—the most prominent theory to justify this behavior—shows that such cooperation can be rational if one player believes her opponent might be a “behavioral type” who plays the tit-for-tat strategy regardless of the history of play. The opponent can then take advantage of these beliefs by imitating the behavioral type early in the game and defecting as the final period approaches, resulting in a pattern of cooperation consistent with aggregate-level data from laboratory experiments (e.g., Andreoni and Miller 1993).

This reputation-building theory requires that players have some uncertainty about their opponents’ types. In our experiment, however, we find that cooperative play persists even when reputation-building is rendered impossible by revealing players’ histories of play. Specifically, subjects first complete a block of five sequential-move FRPD games against varying opponents and then play a second block of five games against new opponents, who can see the subjects’ histories of play from the first block.Footnote 1 Selfish, rational players cannot credibly imitate the behavioral type in the second block because their true colors have been revealed through their history of play in the first block. Despite the elimination of type uncertainty, we find aggregate patterns of cooperation very similar to those in treatments where no history is revealed. On the individual level, first-movers who are relatively distrusting (seldom cooperating in the first block) tend to be more cooperative in the second block, even when the second-mover’s revealed history is relatively uncooperative. Hence, rather than reducing cooperation by eliminating the opportunity for reputation-building, revealing histories of play generally improves cooperation. This finding is clearly inconsistent with standard reputation-building explanations of cooperation in FRPDs.

We organize these results through a model of semi-rational behavior similar to those of Kreps et al. (1982) and Radner (1986). Here, players decide in which round to stop playing tit-for-tat and begin unconditionally defecting by weighing the risk of cooperation against the immediate gain from defecting. Unlike Kreps et al. (1982), players in our model form arbitrary or “naïve” prior beliefs about how many rounds their opponents will continue playing tit-for-tat, which may be inconsistent with the opponent’s actual strategy. Players decide how long to conditionally cooperate in each game based only on these naïve prior beliefs and information about their opponents’ history of play, if revealed. Because this model does not assume any higher-level reflection about the rationality or best response of the opponent, it provides a contrasting benchmark to a model of full, commonly known rationality.

One implication of this model is that cooperation is sustainable for many rounds even when players have relatively pessimistic beliefs about their opponents’ strategies. For example, the model predicts that cooperation will be sustained until the penultimate round if the players have a uniform prior, i.e., if players believe their opponent will defect with equal probability in every round of the FRPD. In stark contrast to the reputation-building theory, the model also predicts that the level of cooperation will be the same or even higher when players learn that their opponent is not a behavioral type. This pattern of behavior is frequently observed in our experiment and sufficient to reject reputation-building as an explanation for cooperation. Hence, we find that this semi-rational model of naïve beliefs rationalizes the observed experimental behavior.

The paper is organized as follows. In Sect. 2, we review the related theoretical and experimental literatures. Section 3 contains a description of the reputation-building theory of cooperation in FRPDs. Section 4 details the experimental design, and the results of the experiment are presented in Sect. 5. In Sect. 6, we propose a model of boundedly rational cooperation that explains the experimental results. Section 7 concludes with a summary of the main results. The “Appendix” contains the proofs and additional summary statistics.

2 Related literature

Many experiments have studied the consistency of players’ behavior in FRPDs with the theory of Kreps et al. (1982). For example, Andreoni and Miller (1993) compare the amount of cooperation in 10-round FRPDs with one-shot prisoners’ dilemma games. They find significantly higher cooperation in the FRPDs compared to the one-shot games, as well as significantly higher cooperation in early rounds compared to later rounds. These patterns are consistent with the reputation-building theory of Kreps et al. (1982) at an aggregate level, though in a similar FRPD experiment, Cooper et al. (1996) observe that, at the individual level, only 25 % of subjects play consistently with reputation-building. Cooper et al. argue that the time path of play exhibits more cooperation than the Kreps et al. model predicts and speculate that their findings could indicate reputation-building if they were to consider alternative types of “irrational” players.Footnote 2

Camerer and Weigelt (1988) study the reputation-building sequential equilibrium in a finitely repeated investment game that is similar in structure to our FRPD game. The authors randomly assign a small fraction of second-movers to have payoffs such that they prefer cooperation over defecting. This exogenously induces the behavioral type. They find evidence consistent with the equilibrium prediction: as time progresses, first-movers are less likely to trust and second-movers are more likely to defect. The observed mixing probabilities differ from those predicted by the induced probability of the behavioral type, but can be explained easily by assuming first-movers believe an additional 17 % of second-movers with the “noncooperative” incentives still prefer cooperation. To test this theory, the authors run additional experiments in which all second-movers are given noncooperative payments. The sequential equilibrium prediction assuming beliefs of 17 % is calculated, and the data from the new experiments conform surprisingly well to that prediction. Thus, Camerer and Weigelt (1988) provide strong evidence in favor of a reputation-building theory in which the first-mover’s beliefs are “homemade” naturally and need not be induced in the laboratory.

Neral and Ochs (1992) extend Camerer and Weigelt (1988) by analyzing the behavioral responses of players to changes in the parameters of the game. Like Camerer and Weigelt, they find that uncertainty induces players to develop a mutually profitable relationship consistent with the predictions of sequential equilibrium. However, Neral and Ochs find problems with the comparative static predictions of the sequential equilibrium model. When the parameters of the game are altered (e.g., by decreasing the payoff to the second-mover), they find that the players respond in the exact opposite direction from what the theory predicts and that these results cannot be explained using the homemade priors specified by Camerer and Weigelt.Footnote 3

More recently, Reuben and Suetens (2012) find that many players cooperate in a manner consistent with Kreps et al. (1982). Reuben and Suetens use the strategy method to disentangle strategically and non-strategically motivated cooperation in a sequential prisoner’s dilemma with an uncertain end point in which cooperation is not an equilibrium strategy for rational players.Footnote 4 Players in their experiment can condition their action on whether they are currently playing the last period of the game or whether the game will continue. Second-movers who cooperate as long as the first-mover cooperates unless it is the final round are classified as reputation builders. Among second-movers who cooperate, the authors find that one-third to two-thirds do so for reputation-building rather than reciprocity. Moreover, Reuben and Suetens find that an increase in the payoff of mutual cooperation increases the ratio of reputation builders to unconditional defectors, consistent with Kreps et al. (1982).

Kagel and McGee (2014) compare individual play and team play in the finitely repeated prisoners’ dilemma. In the team play treatment, each player in the game is a 2-person team who can chat internally (but cannot communicate with opponents) and makes joint decisions. They find that teams are initially less cooperative than individuals, but with experience become more cooperative. Kagel and McGee’s analysis of team chat logs suggests that cooperation is driven by a failure of common knowledge of rationality, as teams attempt to anticipate when their opponents might defect and try to defect one period earlier, without accounting for the possibility of their opponents thinking similarly. Inconsistent with Kreps et al. (1982), they find no evidence that players anticipate opponents’ beliefs and attempt to mimic an irrational player, while they find that increased chat about the round in which the opponent will defect is accompanied by an increase in cooperation over the course of the experiment. This finding is also consistent with Embrey et al. (2014), who find that players learn to play “threshold strategies” in which they conditionally cooperate for a fixed length of time and then defect. Though players converge to these strategies, unraveling of cooperation occurs very slowly across supergames. Subjects clearly are not applying backwards-induction reasoning.

Other experiments have examined how cooperation in one-shot prisoners’ dilemma games is impacted when players see their opponents’ play history. Schwartz et al. (2000), Camera and Casari (2009), and Gong and Yang (2010) all find that observing an opponent’s history of play significantly increases cooperation, though Duffy and Ochs (2009) do not observe this effect in their data. In all of these studies, subjects are aware that their actions will be revealed to future opponents. Thus, players can follow a reputation-building strategy even though their opponents differ in every period. In contrast, because subjects in our experiment are not told in the first part of the experiment that their play histories will be revealed later, they are unlikely to perceive any benefit from cooperating in the final period of each repeated game in the first part.

Alternative models have been proposed that predict rational cooperation in FRPDs. For example, Selten and Stoecker (1986) develop an alternative theory based on a Markov learning model. In their model, players establish a period of first defection and update their period of intended defection based on their experience in the previous supergame. The authors conduct an experiment in which subjects play twenty-five 10-period FRPD supergames and find that cooperation ends in earlier periods as subjects gain experience.Footnote 5 Subjects in their experiment were told that they would play each opponent only once and were given no information about opponents’ histories of play in prior supergames. Hence, their data do not address the validity of reputation-building directly, nor are their results inconsistent with reputation-building.Footnote 6 By revealing second-mover histories in one treatment and not in another, our design allows a more direct examination of how a player’s intended period of first defection and beliefs about her opponent’s intentions matter for cooperation.Footnote 7

Other studies have focused on the role of reputation in inducing cooperation, though not in the context of an FRPD game. Gachter and Thoni (2005) and Ambrus and Pathak (2011) show how cooperation can be sustained in a public goods contribution game when some players are selfish and others are reciprocating, under varying information about other players’ past behavior. As in our design, both Ambrus and Pathak (2011) and Gachter and Thoni (2005) incorporate a restart in their experimental design to see how it affects cooperation. In Gachter and Thoni (2005), the past behavior of subjects is revealed, and the response to these “reputations” in the public goods game is studied. Players in Ambrus and Pathak (2011) know in advance that they will be participating in subsequent games, whereas players in Gachter and Thoni (2005), as in our experiment, do not. Bolton et al. (2005) examine how information about a player’s partner affects cooperation in an image scoring game, while Irlenbusch and Sliwka (2005) study the role of reputation and uncertainty about the partner’s type in inducing cooperation in a gift-exchange setting. The former find that providing players with more information about their partner’s last action, as well as the action of their partner’s previous partner, increases cooperation, while the latter find that direct reciprocal behavior is stronger when efforts are revealed. Healy (2007) considers a reputation-building equilibrium when firms stereotype workers and finds that selfish workers imitate fair-minded types when firms have sufficiently high priors to generate cooperation. Similarly, Roe and Wu (2009) find evidence for the reputation-building equilibrium: employees classified as selfish mimic cooperative employees when individual histories are observable, but not when histories are kept private. Andreoni and Croson (1998) survey repeated public goods game experiments and find little evidence of reputation-building behavior. Contrary to the expectation that reputation-building should lead to higher contributions when players’ opponents are fixed, fixed matching does not always raise contributions (Andreoni 1988), and a restart effect seems more influential than the matching regime in raising contributions.Footnote 8

3 A reputation-based theory of cooperation

In this section, we apply the theory developed by Kreps et al. (1982) to the sequential-move FRPD played by the participants in our experiment and characterize the optimal strategies.Footnote 9 \(^{,}\) Footnote 10 A single stage of the sequential-move FRPD played in our experiment is shown in Fig. 1. As in Kreps et al. (1982), we assume the first-mover believes the second-mover may be a tit-for-tat behavioral type with positive probability and that a rational second-mover is aware of (and can take advantage of) this belief.Footnote 11 The tit-for-tat type always reciprocates the first-mover’s action, regardless of the period.Footnote 12 In the following analysis, we therefore focus only on the rational, payoff-maximizing second-mover’s decisions.Footnote 13

Fig. 1 A single stage of the sequential-move FRPD

Let \(p_t\) be the first-mover’s period-\(t\) belief probability that the second-mover is the tit-for-tat type, with \(p_1\in (0,1)\) representing his prior belief. Updating occurs in equilibrium according to Bayes’ rule. If \(p_t=0\) in any \(t\), then by standard unraveling arguments, both players must play \(D\) in period \(t\) and thereafter. For this game, there exists a unique sequential equilibrium in which there is a belief threshold \(\overline{p}_t\) such that the first-mover is willing to trust the second-mover (by playing C) in period \(t\) if and only if \(p_t\ge \overline{p}_t\). The rational second-mover prefers to maintain a reputation for being the tit-for-tat type by responding to C with C (i.e., conditionally cooperate) until some later round that is dependent on \(p_t\). As the end of the game nears, the expected payoff to the first-mover from continuing to play C declines since there are fewer rounds left and the probability that a rational second-mover plays D increases. In consequence, the belief threshold \(\overline{p}_t\) is strictly increasing in \(t\) up to \(\overline{p}_{10}=4/7\), given the specific stage-game payments shown in Fig. 1.
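The final-period value \(\overline{p}_{10}=4/7\) can be checked with a short worked calculation. This is only a sketch: it assumes the Fig. 1 payoffs are the Andreoni and Miller (1993) values, written here with the hypothetical labels \(R=7\) (first-mover’s payoff under mutual cooperation), \(S=0\) (first-mover’s payoff when trust is not reciprocated), and \(P=4\) (payoff under mutual defection); neither the labels nor the numbers are stated in the text. In period 10 a rational second-mover defects and the tit-for-tat type reciprocates, so playing C yields \(p_{10}R+(1-p_{10})S\) in expectation, while playing D yields \(P\) against either type:

$$\begin{aligned} p_{10}R+(1-p_{10})S\ge P \iff p_{10}\ge \frac{P-S}{R-S}=\frac{4-0}{7-0}=\frac{4}{7}=\overline{p}_{10}. \end{aligned}$$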

When \(p_1<\overline{p}_1\), the first-mover will defect in all periods, but when \(p_1>\overline{p}_1\), the first-mover initially trusts the second-mover, who will perfectly imitate a tit-for-tat type if he is rational. Because the second-mover is either tit-for-tat, or imitating a tit-for-tat type, the first-mover’s beliefs do not change in early periods. At some point in time, however, because the belief threshold increases as \(t\) increases, \(\overline{p}_t\) may rise above \(p_1\), at which point the first-mover would stop trusting the second-mover. Let \(\overline{t}\) be the first period in which \(\overline{p}_t>p_1\). The second-mover benefits from being trusted and would prefer to keep \(p_t\) weakly above \(\overline{p}_t\) in every period \(t\ge \overline{t}\). He does this by playing a mixed strategy, so that the first-mover’s beliefs shift up to exactly \(p_{t}=\overline{p}_{t}\) for all \(t\ge \overline{t}\). That is, if the second-mover is observed to conditionally cooperate at \(t\), then the first-mover’s belief that the second-mover is in fact a tit-for-tat type increases, given that a rational second-mover would have defected with some probability. Formally, the second-mover conditionally cooperates in period \(t\) with probability

$$\begin{aligned} q_t^* = \frac{p_t}{1-p_t}\frac{1-\overline{p}_{t+1}}{\overline{p}_{t+1}}, \end{aligned}$$

and the first-mover’s beliefs update to

$$\begin{aligned} p_{t+1} = \left\{ \begin{array}{l@{\quad }l}\frac{p_t}{p_t + q_t(1-p_t)} &{} \text{if C is realized};\\ 0 &{} \text{otherwise}.\end{array} \right. \end{aligned}$$

We refer to \(q_t^*\) as the post-threshold probability of cooperation. Since \(p_t\) must continue to increase over time, \(q_t^*\) must decrease accordingly to keep \(p_t\) above \(\overline{p}_t\).
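The two formulas above can be checked numerically. The following is a minimal sketch, not the authors’ code: the function names are hypothetical, and the belief \(p_t=1/2\) is an illustrative assumption, while \(\overline{p}_{t+1}=4/7\) is the final-period threshold reported in the text. It confirms that mixing with probability \(q_t^*\) moves the first-mover’s belief exactly onto the next threshold when C is observed.

```python
from fractions import Fraction


def mixing_probability(p_t, p_bar_next):
    """q_t^*: probability that a rational second-mover conditionally cooperates."""
    return (p_t / (1 - p_t)) * ((1 - p_bar_next) / p_bar_next)


def update_belief(p_t, q_t, cooperated):
    """Bayes' rule for the belief that the second-mover is the tit-for-tat type."""
    if not cooperated:
        # Only a rational second-mover answers C with D, so the belief drops to 0.
        return Fraction(0)
    return p_t / (p_t + q_t * (1 - p_t))


# Illustrative (assumed) belief in period 9; the threshold 4/7 is from the text.
p_t = Fraction(1, 2)
p_bar_next = Fraction(4, 7)

q_star = mixing_probability(p_t, p_bar_next)
posterior = update_belief(p_t, q_star, cooperated=True)
print(q_star, posterior)  # prints "3/4 4/7"
```

Exact fractions are used so that the equality between the posterior and the threshold is not blurred by floating-point rounding.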

Fig. 2 Sequential equilibrium of the sequential-move FRPD with a tit-for-tat type. See the “Appendix” for a proof of the sequential equilibrium and the equations used to derive the figure

When \(p_t=\overline{p}_t\), the first-mover is exactly indifferent between \(C\) and \(D\). Since the second-mover is mixing, he must also be exactly indifferent between \(C\) and \(D\). This indifference is achieved by having the indifferent first-mover mix between \(C\) and \(D\) in period \(t+1\) with appropriate probabilities. If the first-mover’s realized action is \(D\), then both types of second-mover respond with \(D\) and beliefs do not update. Thereafter, the first-mover does not trust the second-mover (because \(p_{t+1}=p_t=\overline{p}_t<\overline{p}_{t+1}\)), and the second-mover never has a chance to alter beliefs, so defection occurs in every subsequent period.Footnote 14 The structure of the unique sequential equilibrium for a first-mover having belief \(p_1>\overline{p}_1\) is displayed in Fig. 2.

Observe that the path of equilibrium play depends crucially on \(p_1\). If \(p_1> \overline{p}_{10}=4/7\), then no mixing phase is needed; both players play C with certainty until the rational second-mover defects in the final period. If \(p_1\in [\overline{p}_1,\overline{p}_{10}]\), then mixing will begin in period \(\overline{t}\) (the smallest \(t\) such that \(p_1\le \overline{p}_t\)), after which one defection will lead to defection in all subsequent actions. If \(p_1<\overline{p}_1\), then the first-mover will never trust the second-mover, the second-mover will never have an opportunity to alter beliefs, and defection will occur in every action.

Regardless of \(p_1\), the realized path of play must feature a regime shift from cooperation to defection. This shift can be triggered by either player, can occur in the first action, and may never occur if tit-for-tat players truly exist. In the laboratory, beliefs are not directly observable and second-movers’ types are unknown, so the time at which defection begins cannot be predicted without additional data.Footnote 15

4 Experimental design

Our experiment is designed to test directly the reputation-building aspect of the Kreps et al. (1982) theory in a sequential-move, finitely repeated prisoner’s dilemma. In all treatments, 20 subjects are divided into two equal-sized groups: first-movers and second-movers. The sessions are divided into two blocks. In each block, each subject plays five finitely repeated prisoners’ dilemma supergames, with each supergame played against a different subject from the other group.Footnote 16 Through the course of the experiment, each subject will therefore play a supergame with each subject in the other group exactly once (5 in Block 1 and the other 5 in Block 2).Footnote 17 Each supergame is 10 rounds in length.
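One way to satisfy the matching constraint described above is a simple rotation. The sketch below is an assumption for illustration only: the text does not say how pairs were actually drawn, and the function name and subject indices are hypothetical. With ten subjects per group, shifting the second-mover index by one each supergame pairs every first-mover with every second-mover exactly once across the ten supergames.

```python
def rotation_schedule(group_size=10):
    """Return one list of (first_mover, second_mover) pairs per supergame.

    In supergame s, first-mover i meets second-mover (i + s) mod group_size,
    so no pair is ever repeated.
    """
    return [
        [(i, (i + s) % group_size) for i in range(group_size)]
        for s in range(group_size)
    ]


schedule = rotation_schedule()
block_1, block_2 = schedule[:5], schedule[5:]  # five supergames per block

# Every one of the 10 x 10 possible pairs occurs exactly once.
all_pairs = [pair for supergame in schedule for pair in supergame]
print(len(all_pairs), len(set(all_pairs)))  # prints "100 100"
```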

The sequential-move game allows us to focus on the second-mover and her opportunities for reputation-building. We study this by varying the information structure in two treatments, denoted 2S and 1S. The first block of five supergames is identical across the two treatments. Players see their opponent’s history in the current supergame, but not from any prior supergames. In Block 2 of treatment 2S, subjects in each supergame see their opponents’ entire history of play from their five Block 1 supergames, as well as the play of their opponents’ opponents in these five supergames. Thus, all of the first-mover’s actions, all of the first-mover’s opponents’ actions, all of the second-mover’s actions, and all of the second-mover’s opponents’ actions from Block 1 are revealed to both players in each Block 2 supergame of 2S. It is commonly known that this information is revealed to both players.

In treatment 1S, only the first-mover’s history from Block 1 is revealed to the second-mover (including the first-mover’s actions and the first-mover’s opponents’ actions); the second-mover’s history is not revealed to the first-mover. Again, this revelation structure is commonly known. The purpose of this treatment is to control for changes in behavior between Blocks 1 and 2 that result from revealing the first-mover’s Block 1 history of play to Block 2 second-movers, and to serve as a control for any “restart effect.” Differences between 1S and 2S can then be interpreted as the effect of revealing the second-mover’s Block 1 history of play to Block 2 first-movers (revealing reputations), given that the second-mover is equally informed about the first-mover’s history. The treatment names 2S and 1S are mnemonic for “two-sided” and “one-sided” knowledge of histories, respectively.

At the beginning of Block 1 of both treatments, subjects know they will play five finitely repeated prisoners’ dilemma supergames against five different opponents in Block 1. Subjects are also told at the beginning of Block 1 of both treatments that we will conduct a second experiment immediately following the first, in which they may also participate if they want. They are informed that instructions for the second experiment will be distributed after the first experiment concludes. The subjects are not told that they will play prisoners’ dilemmas in the second experiment or that their histories from Block 1 may be revealed in Block 2. Only at the beginning of Block 2 do they learn that they will play additional repeated prisoners’ dilemma supergames and that their history from Block 1 will be revealed (depending on the treatment). Subjects have the option of taking their earnings after Block 1 and leaving instead of participating in the Block 2 experiment, and if they participate in the second experiment, they are guaranteed a second minimum show-up payment. All of the subjects chose to stay for the second experiment.

As mentioned in the previous section, we use the same stage-game payoffs as Andreoni and Miller (1993) shown in Fig. 1. Subjects are paid for one randomly selected supergame out of the five supergames in each Block, and this is known in advance.

We use a strategy-elicitation method for second-mover choices, which asks subjects to enter their action conditional on the first-mover defecting (choosing right) and their action conditional on the first-mover cooperating (choosing left) before learning the first-mover’s action in each round. The second-mover’s strategy for a particular round is implemented for them after the first-mover chooses an action for that round. In a survey paper comparing the strategy method to direct response, Brandts and Charness (2011) found that while the two methods can induce different behavior in some experiments, there is generally no significant difference in behavior in prisoner’s dilemma experiments between the two. They conclude from their survey that in cases where behavior differs between the two methods, the strategy method provides a lower bound for treatment effects.Footnote 18

In the sequential-move prisoner’s dilemma, the second-mover’s strategic intention may be censored in rounds where the first-mover defects, which leads to defection by the second-mover as well according to a wide range of strategies. We elicited second-mover choices using the strategy method, so that we would be able to identify the second-mover’s intent to cooperate if reciprocated, regardless of the first-mover’s actual choice. This way, if the first-mover defected and the second-mover responded by defecting in the same round, we would know whether the second-mover would have continued cooperating or unilaterally defected if the first-mover had instead cooperated in that round. All of the second-mover results reported in the paper focus on this conditional cooperation (the second-mover’s choice conditional on cooperation by the first-mover) and not the choice actually implemented for the second-mover in response to the first-mover.
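The logic of the strategy method can be sketched in a few lines. This is an illustration under stated assumptions, not the software used in the experiment: the class, field, and function names are hypothetical. The second-mover submits a response to each possible first-mover action, one of the two responses is implemented, and the response to cooperation is recorded whether or not it was implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConditionalPlan:
    """A second-mover's round plan, entered before seeing the first-mover's action."""
    if_cooperate: str  # response if the first-mover plays "C"
    if_defect: str     # response if the first-mover plays "D"


def play_round(first_mover_action, plan):
    """Implement the plan and keep the (possibly off-path) response to cooperation."""
    implemented = plan.if_cooperate if first_mover_action == "C" else plan.if_defect
    return {
        "first_mover": first_mover_action,
        "second_mover": implemented,
        # Observed even when the first-mover defects, so intent is not censored.
        "conditional_cooperation": plan.if_cooperate == "C",
    }


# The first-mover defects, yet the record still shows the intent to reciprocate.
record = play_round("D", ConditionalPlan(if_cooperate="C", if_defect="D"))
print(record["second_mover"], record["conditional_cooperation"])  # prints "D True"
```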

5 Results

We conducted 14 sessions of the experiment at The Ohio State University experimental economics laboratory with a total of 264 subjects: seven sessions of the \(1S\) treatment with 130 subjects and seven of the \(2S\) treatment with 134 subjects. Subjects were chosen randomly from a pool of students at The Ohio State University who had previously signed up to be considered for participation in economic experiments. Subjects could not participate in more than one session of this experiment. Table 1 provides summary information for the experiment. Payoffs in the experiment were denominated in “points” and converted into dollars at the rate of four points per dollar. Average earnings per subject were approximately $27. We proceed by first analyzing the players’ aggregate behavior and then focus on how information impacts behavior at the individual level.

Table 1 Summary information

5.1 Aggregate behavior

We focus on cooperation by the first-mover and conditional cooperation by the second-mover (choosing to cooperate conditional on the first-mover cooperating) as our outcomes of interest. The first hypothesis, based on Kreps et al. (1982), states that cooperation will be eliminated when the second-mover’s history reveals she is rational.

Hypothesis 1

Compared to Block 2 of 1S, the rate of aggregate cooperation should not be higher in Block 2 of 2S.

Figure 3 displays the paths of aggregate cooperation by first-movers over the course of a supergame. The first row represents the data pooled from each treatment for the first block, the second row represents the Block 2 data for 1S, and the third row represents the Block 2 data for the 2S treatment. Each column represents a supergame in the order in which they were played. The paths of play in Block 1 (1S and 2S pooled) and Block 2 of 1S are quite similar for first-movers as we would expect given that there is no change in the information given to first-movers between these treatments. The path of play in Block 2 of 2S, however, exhibits higher levels of cooperation than the other treatments. Moreover, the level of cooperation increases across supergames, and the path of play shows a clear endgame effect in late supergames of Block 2—something that is not clearly apparent in other supergames where aggregate cooperation declines more steadily by round.

Fig. 3 First-mover average cooperation

Figure 4 displays the paths of aggregate second-mover conditional cooperation—cooperate conditional on the first-mover cooperating—over the course of a supergame. The paths of play are again shown by treatment and supergame, as in Fig. 3. The paths of aggregate cooperation over supergames in Block 2 of 1S show more of an endgame effect than those in Block 1 (pooled) since cooperation is maintained in early rounds then drops off more sharply in later rounds. However, round 10 behavior is roughly the same across blocks, suggesting that players view the endgame similarly in both.Footnote 19

The paths of play in Block 2 of both 1S and 2S exhibit more cooperation than occurs in Block 1, with 2S exhibiting even higher levels of cooperation than Block 2 of 1S despite first-movers observing the histories of second-movers. Both the greater overall cooperation and the persistence of cooperation into later rounds of the supergames when second-mover histories are revealed to first-movers are again inconsistent with the predictions of Kreps et al. (1982).

Result 1

We observe greater aggregate cooperation in Block 2 when second-movers’ histories of play are exposed to first-movers (treatment 2S, compared to 1S).

Table 2 reports the aggregate cooperation frequencies of first-movers and the aggregate conditional cooperation frequencies of second-movers by block and treatment. There is almost no difference in Block 1 cooperation between 1S and 2S for either first- or second-movers. However, not only do both first- and second-movers cooperate more in Block 2 of 2S than in Block 1, but their cooperation rates are also slightly higher than those of first- and second-movers in Block 2 of 1S. Wilcoxon–Mann–Whitney tests (bootstrapped to account for clustering by session) confirm that there is no between-treatment difference in Block 1 cooperation rates for first- or second-movers (\(p\) values 0.959 and 0.662, respectively), while Block 2 cooperation rates are slightly higher in 2S than in 1S (\(p\) values are 0.063 and 0.071 for first- and second-movers, respectively).Footnote 20

5.2 Individual-level behavior

As Cooper et al. (1996) point out, individual behavior provides a better test of theoretical predictions in the FRPD than aggregate cooperation. We now analyze individual-level behavior and classify players into types based on that behavior.

Figure 5a, b plots average cooperation for each subject in Blocks 1 and 2. The first figure shows cooperation rates for first-movers while the second shows conditional cooperation rates for second-movers. Each data point represents a single subject. Subjects in the 1S treatment are black circles, while subjects in the 2S treatment are white. Lack of a treatment effect would express itself in data points scattered symmetrically about the 45-degree line. In both figures, the 1S data are distributed in this manner. However, the 2S data are skewed above the 45-degree line, indicating that revealing Block 1 histories of second-movers increases cooperation for both first- and second-movers. Wilcoxon signed-rank tests for data having within-group correlations (Larocque 2005) show that differences in the frequency of cooperation across blocks are significant at the 10 % level or better in 2S (\(p \hbox { value}=0.063\) for first-movers and \(p \hbox { value}=0.015\) for second-movers) and insignificant in 1S (\(p \hbox { value}=0.222\) for first-movers and \(p\hbox { value}=0.210\) for second-movers).Footnote 21

Fig. 5 Average cooperation by mover by subject (all supergames). a First-mover (observed cooperation), b second-mover (conditional cooperation)

We classify players into types within each block to control for the heterogeneity in individual behavior. By focusing on how information differentially impacts players conditional on their type, we can identify whether the behavior of some players is consistent with the reputation-building theory. Importantly, we classify players based on the histories of play only (not on the off-path choices of second-movers), as our main focus is on the information contained in Block 1 histories revealed in Block 2. Furthermore, we classify second-movers based solely on their behavior in rounds in which the first-mover cooperated. Second-movers cannot reveal any information about their types in rounds in which the first-mover defected, as tit-for-tat and reputation-building players alike would defect in such cases. For ease of exposition, rounds in which the first-mover cooperated are referred to as trusting rounds.

The classification procedure works as follows. For each supergame, a second-mover is classified as an Imitator if her behavior is consistent with the reputation-building strategy of Kreps et al. (1982), i.e., she plays cooperate in the first trusting round and continues doing so until some later trusting round (possibly round 10), after which she plays defect in each subsequent trusting round. Otherwise, she is classified as a Cooperator if she cooperates in the first trusting round (but did not play as an Imitator) or as a Defector if she does not cooperate in the first trusting round. Her type classification for the block is the mode of her five supergame classifications within that block. In the case of a tie, the modal type she used most recently is assigned.Footnote 22
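The procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function names and the `'C'`/`'D'` encoding are hypothetical, an Imitator is assumed to need at least one defection after her initial cooperation (so a second-mover who cooperates in every trusting round is labeled a Cooperator), and a supergame with no trusting rounds is left unclassified.

```python
from collections import Counter

def classify_supergame(trusting_responses):
    """Classify one supergame from the second-mover's choices in trusting rounds.

    trusting_responses: list of 'C'/'D' in chronological order, one entry per
    round in which the first-mover cooperated.
    Returns None if there were no trusting rounds (assumption, see lead-in).
    """
    if not trusting_responses:
        return None
    if trusting_responses[0] == 'D':
        return 'Defector'
    if 'D' in trusting_responses:
        first_d = trusting_responses.index('D')
        # Imitator: cooperation first, then defection in every later trusting round.
        if all(r == 'D' for r in trusting_responses[first_d:]):
            return 'Imitator'
    # Cooperated first but did not follow the Imitator pattern.
    return 'Cooperator'

def classify_block(supergames):
    """Block-level type: mode of the supergame labels, ties broken by recency."""
    labels = [classify_supergame(s) for s in supergames]
    labels = [label for label in labels if label is not None]
    if not labels:
        return None
    counts = Counter(labels)
    top = max(counts.values())
    tied = {label for label, n in counts.items() if n == top}
    # Among tied modal types, take the one used in the most recent supergame.
    for label in reversed(labels):
        if label in tied:
            return label

block = [['C', 'C'], ['D'], ['C', 'D'], ['D', 'D'], ['C', 'D', 'D']]
print(classify_block(block))  # Imitator (tied 2-2 with Defector, Imitator is more recent)
```

Note that only responses in trusting rounds enter the classification, in line with the restriction described above.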

By this procedure, we arrive at a classification of second-movers that summarizes quite well their Block 1 history of play as observed by first-movers. The overall percentage of Block 1 supergames having a second-mover with a type classification of Imitator, Cooperator, or Defector is 18.2, 41.7, and 39.4 %, respectively.Footnote 23 Based on these type classifications, we have the following hypothesis regarding how a second-mover’s classification will change in Block 2 when her history of play is revealed.

Hypothesis 2

Second-movers in 2S who play as Imitators or Defectors in Block 1 play as Defectors in Block 2 because they are revealed to first-movers as rational.

Table 3 reports the type classifications for second-movers by treatment and block in the form of a Markov transition matrix. The data reveal that transitions between second-mover types between Blocks 1 and 2 in 2S are inconsistent with reputation-building. Surprisingly, nearly half of the Defectors in Block 1 become Cooperators in Block 2 and only 38 % of the Block 1 Defectors remain Defectors in Block 2. Even more striking, none of the second-movers who are Imitators in Block 1 of 2S become Defectors in Block 2. These findings are summarized in the following result.

Table 3 Second-mover-type transition matrix

Result 2

Only 26.3 % of second-movers who are classified as Defectors or Imitators in Block 1 of 2S are classified as Defectors in Block 2, while 55.2 % of them are classified as Cooperators in Block 2.

The first-mover-type classification is slightly different. Kreps et al. (1982) allow for two possible types of first-movers: those who believe the probability of an irrational second-mover is high enough to justify cooperation in round 1 of a supergame, and those who do not. Therefore, we classify a first-mover as Trusting if her modal behavior is to cooperate in round 1 of the five supergames in a given block. Otherwise, a first-mover is classified as Non-Trusting. Based on these type classifications, we have the following hypothesis regarding how a first-mover’s classification will change in Block 2 when the second-mover’s history of play is revealed.

Hypothesis 3

Compared to Block 2 of 1S, first-movers are less likely to be classified as Trusting in Block 2 of 2S.

Table 4 reports type classifications for first-movers by treatment and block in the form of a Markov transition matrix. In 1S, 47.7 % of first-movers are Non-Trusting in Block 1, and about one-third (32.3 %) of these first-movers become Trusting in Block 2. In 2S, however, more than two-thirds (73.0 %) of the first-movers who are Non-Trusting in Block 1 (38.8 % of all first-movers in this treatment) transition to Trusting in Block 2. Because this result is not conditional on the revealed type of second-mover opponents, it would not contradict the theoretical predictions if no second-movers were exposed as rational. Some were, however, so we should expect an aggregate decrease in the number of Trusting first-movers. Instead, we find the opposite: first-movers become more Trusting when histories are revealed.

Table 4 First-mover-type transition matrix

Result 3

A higher proportion of first-movers is Trusting in Block 2 of 2S compared to Block 2 of 1S (\(p=0.007\)), while there is no difference in the proportion of Trusting first-movers in Block 1 (\(p=0.386\)).Footnote 24 Additionally, there is a larger increase in the proportion of first-movers who are Trusting from Block 1 to Block 2 in 2S: a 25.4 percentage point increase in 2S (\(p=0.032\)) versus a 12.3 percentage point increase in 1S (\(p=0.122\)).Footnote 25

We now study first-mover reactions to the second-mover’s revealed type. Because first-movers should only cooperate if there is some positive probability that the second-mover is irrational, exposing the second-mover as an Imitator or Defector in Block 1 should convince the first-mover that the second-mover is rational and destroy cooperation in Block 2 of 2S. This is summarized in the following hypothesis.

Hypothesis 4

In Block 2 of 2S, a first-mover whose opponent’s Block 1 history reveals an Imitator or Defector type will play as a Non-Trusting type against this opponent.

We test Hypothesis 4 by splitting the Block 2 data by the second-mover’s Block 1 type and examining how first-movers respond to their opponents’ revealed histories. Note that in the following analysis, we study Block 2 data using type classifications based on Block 1 data.

First, we look at the case where the second-mover’s history indicates they are a Cooperator type. Figure 6 shows average first-mover cooperation and second-mover conditional cooperation by round in Block 2 games for only those supergames where the second-mover’s Block 1 histories classify her as a Cooperator. The top two panels show the Block 2 cooperation rates in 1S and 2S, respectively, for supergames in which the first-mover is Non-Trusting. The bottom panels are for Block 2 supergames in which the first-mover is Trusting. Both Non-Trusting and Trusting first-movers cooperate more frequently and into later rounds after the second-mover is revealed to be a Cooperator compared to when no histories are revealed to first-movers.

Fig. 6 Block 2 cooperation rates when the second-mover is revealed to have been a Cooperator type in Block 1

Figure 7 presents a similar view of the data for Block 2 supergames where the second-mover’s Block 1 history classifies her as an Imitator. First-movers with Trusting Block 1 histories have very similar cooperation rates in each corresponding round across treatments. For Non-Trusting first-movers, however, Block 2 cooperation is much higher in 2S than in 1S. Revealing information that indicates that second-movers are willing to cooperate increases first-mover cooperation even if that information reveals that the second-mover is rational.

Fig. 7 Block 2 cooperation rates when the second-mover is revealed to have been an Imitator type in Block 1

Figure 8 is similar to Figs. 6 and 7 except that the data are from Block 2 supergames where the second-mover’s Block 1 history classifies her as a Defector. Initial cooperation by Trusting first-movers is substantially lower in 2S than in 1S when the second-mover is revealed to be a Defector. However, Non-Trusting first-movers cooperate slightly more when the second-mover is revealed to be a Defector. These findings are consistent with the notion that providing the second-mover’s history causes first-movers to update their beliefs regarding the degree of cooperation that they can expect from a second-mover. Gächter and Thöni (2005) find a similar result in their public goods game experiment in which players’ behavior from the first game is revealed to other players in the subsequent series of games (and players were not told this would happen before the first game). They find that when players whose contributions were low in the first game are grouped with other low contributors in subsequent games, they contribute more than when grouped randomly. As in our experiment, this behavior may be due to a failure of backwards induction, combined with updating of pessimistic beliefs that these low contributors hold in the first game, an interpretation consistent with the model of naive beliefs that we develop in Sect. 6.

Fig. 8 Block 2 cooperation rates when the second-mover is revealed to have been a Defector type in Block 1

5.3 Rate of first defection

We use logit regressions to test how first-mover cooperation in Block 2 supergames depends on the type of second-mover. Robust standard errors are corrected to account for within-subject and within-session correlations between observations, i.e., the standard errors are clustered by individual first-mover and session. Specifications (i) and (ii) of Table 5 show the impact of revealed player histories on first-mover initial cooperation. Specification (i) explores the impact of revealing a player’s history of play and indicates that first-movers are more trusting when second-movers are observed to have cooperated in Block 1 supergames. Specification (ii) examines the impact that a second-mover’s type has on the likelihood that a first-mover will be Trusting. In this specification, only second-mover types from 2S are identified by first-movers while second-mover types from 1S are unobserved and represent the omitted second-mover type in the regression.Footnote 26 The estimates show that both Trusting and Non-Trusting first-movers are more likely to trust in Block 2 when the second-mover is revealed to be a Cooperator or an Imitator, compared to when no history is revealed. However, when facing a Defector, Non-Trusting first-movers are no more likely to trust than when no history is revealed. Regardless of the second-mover’s type, Trusting first-movers are more likely than Non-Trusting first-movers to trust in Block 2 when the second-mover’s history of play is revealed. In this case, even though a second-mover’s modal type is Defector, the first-mover may be responding to the fact that the second-mover likely cooperated in some supergames.

Table 5 Effect of player types on first-round cooperation and rounds of cooperation before first defection

Specifications (iii)–(v) of Table 5 explore the impact of revealing player types on the number of periods of joint cooperation before first defection in Block 2 supergames. As the number of rounds of initial cooperation is censored between 0 and 10, the results are estimated using Tobit regressions. Furthermore, in contrast to specifications (i) and (ii), second-movers are classified based on their Block 1 history of play in both 1S and 2S. Although the first-mover cannot classify a second-mover in 1S, the classification is useful as an explanatory variable because a second-mover’s type will impact the overall degree of cooperation and will help us to assess which player characteristics generated the cooperation rates.

In Sect. 5.1, we analyzed aggregate behavior and showed that there is more cooperation in Block 2 than Block 1 with even higher levels of cooperation in 2S despite first-movers observing the histories of second-movers. Though average cooperation is informative, it can be biased by the player types. If, for example, one treatment included more cooperative types, then this could make it appear that average cooperation increased more for that treatment. Specifications (iii)–(v) are designed to explore how first- and second-mover types relate to the duration of cooperation. Specification (iii) includes all of the data while specification (iv) includes only Trusting first-movers and specification (v) includes only Non-Trusting first-movers to help clarify the impact of the first-movers’ types. The estimate for Cooperator in specification (iii) reveals that playing with a more cooperative second-mover in either 1S or 2S causes a first-mover to cooperate for longer within a supergame, especially when the first-mover is Non-Trusting.Footnote 27 Interestingly, Non-Trusting first-movers are also more likely to cooperate longer with Defector and Imitator second-movers. Given the results in (ii), part of this result is likely driven by Non-Trusting first-movers’ trusting second-movers more after observing that these second-movers did not initially defect. Trusting first-movers, however, do not cooperate as long in 2S when facing a Defector and, to a lesser extent, an Imitator-type second-mover. As these first-movers have shown that they are more willing to trust, even when the second-mover’s history of play is not known, this result can likely be explained by first-movers examining the number of rounds that second-movers cooperated and trying to preempt their defection without performing full backward induction.Footnote 28

Specifications (iv) and (v) help to clarify the impact of the first-mover’s type by restricting the sample to either Trusting or Non-Trusting first-movers. Reinforcing the results in (iii), the estimates for specification (iv) indicate that first-movers update the amount that they are willing to cooperate based on the amount second-movers cooperated in Block 1. Recall that Defector is the omitted category. Thus, when Trusting first-movers encounter a Defector or Imitator in Block 2, they will cooperate for fewer rounds than in 1S, but they will also cooperate more with either an Imitator or Cooperator than with a Defector in 2S. The results of specification (v) reveal that Non-Trusting first-movers, who were more pessimistic in their Block 1 beliefs (i.e., have a high prior that their opponent is not a tit-for-tat type), cooperate more in Block 2 even when facing a Defector who has revealed that she is not a tit-for-tat type, suggesting that first-movers update their beliefs about how long these second-movers will cooperate based on the history of play. While the amount of additional cooperation for a Non-Trusting first-mover does not depend on second-mover type, revealing the second-mover’s history of play is sufficient to increase average cooperation with a Non-Trusting first-mover.

In summary, both Trusting and Non-Trusting first-movers are more trusting when they can view the second-mover’s history of play; Non-Trusting first-movers cooperate longer in 2S than in 1S regardless of the second-mover type; and Trusting first-movers cooperate for fewer periods with Defectors and to a lesser extent with Imitators. These results indicate that cooperation in the finitely repeated prisoner’s dilemma continues even after the opportunity for reputation-building by second-movers is destroyed. We therefore reject Hypothesis 4.

Result 4

Compared to a first-mover whose opponent’s history is not revealed, (a) a first-mover whose opponent’s Block 1 history is revealed as Imitator type is more likely to cooperate in round 1 of a Block 2 supergame, and (b) Non-Trusting first-movers cooperate longer when second-mover histories are revealed, while Trusting first-movers do not cooperate for as long when second-movers are revealed to be Defectors or Imitators.

5.4 Beliefs

In some later sessions, we added a belief-elicitation stage at the end of the experiment. Subjects were presented with a sequence of five randomly selected Block 1 histories, consisting of five supergames each for both first- and second-movers from previous sessions of the same type of treatment. That is, a subject who participated in 1S saw only the first-mover’s history of play, while a subject who participated in 2S saw both histories of play, just as was visible in her Block 2 games. Since the elicited beliefs were based on the same information that subjects had in their Block 2 supergames, they should be similar to the beliefs subjects held when playing the game. For each set of histories, subjects were then asked to state how many rounds they believed these past participants would cooperate before the first defection when matched with one another. For one randomly selected belief-elicitation question, each subject was paid $5 if her stated belief was exactly correct.Footnote 29

Regression results are shown in Table 8 in the “Appendix,” summarizing how elicited beliefs responded to observed cooperation rates in displayed histories as well as the role of the player whose beliefs are elicited. While we find some evidence that elicited beliefs respond to observed cooperation in the expected directions, statistical significance is weak due to small sample size. Furthermore, we find that first- and second-movers report systematically different beliefs, suggesting that elicited beliefs may be biased by the subjects’ own experience in the experiment.

6 A model of naïve beliefs

In contrast to the predictions of the reputation-building theory of Kreps et al. (1982), our experiment shows that there will be substantial cooperation in FRPDs even when players’ previous histories are revealed. Moreover, learning that an opponent imitated a tit-for-tat type frequently increases cooperation, rather than destroying it.

The observed increases in cooperation may be partially motivated by fairness or indirect reciprocity. For example, Ho and Su (2009) utilize an ultimatum game in an experiment and find that roughly half of their subjects value how their offer compares to offers other followers have received. A similar notion of fairness could plausibly affect cooperation in other games, such as the FRPDs played here, when subjects can view how their current opponents treated previous opponents; however, fairness would seem to be a less salient concern in a prisoner’s dilemma setting, where payoffs are determined more symmetrically (i.e., both players have a hand in the outcome), than in an ultimatum game setting, where control over payoffs is heavily unbalanced. Moreover, a concern for fairness would not explain why Non-Trusting first-movers become more Trusting when they observe that their opponent is a Defector. In community games with public reputations, indirect reciprocity has been shown to induce cooperation (see Nowak and Sigmund 2005, for a recent survey), but these games typically feature frequent rematching among small groups of subjects, who play simple, one-shot stage games with unilaterally determined payoffs, making reputation in future matches a dominant concern. In contrast, the matching protocol we employ minimizes the opportunity for indirect reciprocity by generally matching each subject no more than once with each other subject. In addition, the payoff gradient of strategic interactions in an FRPD supergame is large enough that the direct payoff consequences of behavior in a given supergame should reduce the extent to which behavior is motivated by potential payoffs in future supergames.

Though we cannot rule out that cooperation in our experiment may be partially motivated by fairness or indirect reciprocity, we have reason to doubt that they are the primary drivers of the cooperation we observe. For example, our results are consistent with the findings of Kagel and McGee’s (2014) experiment on FRPDs played by teams. They observe an increase in cooperation over time because teams anticipate that their opponent will defect in later rounds. Kagel and McGee find no evidence in team chat logs, however, that teams anticipate opponents’ beliefs and attempt to mimic an irrational player as the Kreps et al. (1982) model predicts. Nor do they find evidence of reciprocity or concerns for fairness. Instead, they find evidence that teams anticipate when their opponents will defect (i.e., their opponents’ actions, not their beliefs) and attempt to defect one round earlier, which they interpret as a failure of common knowledge of rationality. These results suggest that boundedly rational behavior is a primary driver of cooperation in FRPDs. We now propose a simple model of boundedly rational behavior, in which we relax the requirement that players’ prior beliefs be consistent with an opponent’s best response.Footnote 30

As in Kreps et al. (1982), players in our model decide in which round to stop playing tit for tat and begin unconditionally defecting. This is decided by weighing the long-term benefit of cooperation against the risk of the other player defecting first.Footnote 31 The difference between this approach and that of Kreps et al. (1982) is that players do not engage in higher-level reflection about the beliefs of their opponent. Instead, players form “naïve” beliefs which may not be consistent with their opponent’s best response. This is consistent with Kagel and McGee (2014), who provide evidence that players simply try to defect one round before their opponent’s anticipated first defection without reflecting on the beliefs of their opponents. Given arbitrary initial beliefs about how many rounds the opponent will continue playing tit for tat, players update beliefs within the game based on their opponents’ choices using Bayes’ rule and choose the optimal round to stop playing tit for tat and begin unconditionally defecting.Footnote 32

This simple model generates predictions that are consistent with our experimental results. First, the common finding that cooperation in the FRPD does not break down until one or two rounds before the last is consistent with the model if players have uniform or even more pessimistic beliefs.Footnote 33 Second, the first-mover who plays cooperatively in Block 1 may defect earlier in Block 2 supergames after the second-mover’s history of play from Block 1 exposes him as generally uncooperative. Third, there will be as much, if not more, cooperation in Block 2 after a second-mover’s history of play from Block 1 has exposed him as rational. The first two predictions are also consistent with the reputation-building theory of Kreps et al. (1982), but the third is not. These results demonstrate that while the behavior we observe is inconsistent with Kreps et al. (1982), it is rational in a reasonable non-equilibrium sense. While this is not the only model that could explain our experimental data, the recent evidence from Kagel and McGee (2014) indicates that this interpretation is worthy of a more careful exploration.

For a formal description of the model, let rounds be counted backwards. Play begins in round 10 and ends after round 1. We assume that all players adopt a strategy from \(S = \{s_{11},s_{10},\ldots ,s_{1}\}\). A player adopting strategy \(s_{k}\) plays tit for tat in rounds 10 through \(k\) and defects in rounds \(k-1\) through 1. Strategy \(s_{11}\) is defined as defecting in every round. In addition to the Kagel and McGee (2014) chat evidence, Embrey et al. (2014) find that subjects learn with experience to play exactly these strategies, so we believe that restricting the strategy space in this way has empirical foundations.

Players have prior beliefs \(\mu \) over their opponent’s strategies in \(S\). Thus, \(\mu (s_{k})\) is the probability of playing against an opponent using strategy \(s_{k}\). Though \(s_1\) is dominated for a payoff-maximizing agent, we find a non-negligible amount of last-round cooperation in our data. Thus, we assume beliefs \(\mu \) are such that \(\mu (s_{k}) \in (0,1)\) for all \(k\), including \(k=1\).Footnote 34

Players’ beliefs are updated within each supergame round by round according to Bayes’ rule based on the prior \(\mu \) and the opponents’ history of actions in that supergame. If her opponent has cooperated up to and including round \(t+1\), a player believes that her opponent will continue playing tit for tat for at least one more round with probability \(p_{t} = \sum _{i=1}^{t}\mu (s_{i})\big /\sum _{i=1}^{t+1}\mu (s_{i})\). In Proposition 1, we characterize the beliefs for which a naïve player chooses strategy \(s_{k}\) in terms of the conditional probability \(p_{t}\) that the opponent will continue to play tit for tat in round \(t\), given that he has played tit for tat for all previous rounds, for all \(t\) up to round \(k\).
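As an illustration, the conditional probabilities \(p_{t}\) can be computed directly from a prior. The sketch below assumes the prior is stored as a dictionary from the strategy index \(k\) to \(\mu (s_{k})\); the function name `continuation_probs` is hypothetical, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

def continuation_probs(mu):
    """Return {t: p_t} for t = 1..10, given mu[k] = prior probability of s_k (k = 1..11)."""
    p = {}
    for t in range(1, 11):
        # Opponent still plays tit for tat in round t: strategies s_1, ..., s_t.
        stays = sum(mu[i] for i in range(1, t + 1))
        # Opponent played tit for tat through round t+1: strategies s_1, ..., s_{t+1}.
        reached = sum(mu[i] for i in range(1, t + 2))
        p[t] = stays / reached
    return p

uniform = {k: Fraction(1, 11) for k in range(1, 12)}
p = continuation_probs(uniform)
print(p[1], p[2], p[10])  # 1/2 2/3 10/11
```

Under a uniform prior this gives \(p_{t} = t/(t+1)\), so the belief that the opponent will keep playing tit for tat is weakest in the final rounds.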

Proposition 1

(a) The first-mover plays \(s_{k}\) if and only if

$$\begin{aligned} p_{l} \ge \frac{4}{\sum _{i=k+1}^{l} \bigl (3 \prod \limits _{j=i}^{l-1} p_{j}\bigr ) + 7 \prod \limits _{i=k}^{l-1} p_{i}} \end{aligned}$$

for every \(l\in \{k,\ldots ,10\}\).

(b) The second-mover plays \(s_{k+1}\) if and only if

$$\begin{aligned} p_{l} \ge \max \left\{ \frac{1}{3},\frac{5}{\sum _{i=k+1}^{l} \left( 3 \prod \limits _{j=i}^{l-1} p_{j}\right) + 8 \prod \limits _{i=k}^{l-1} p_{i}}\right\} \end{aligned}$$

for all \(l\in \{k,\ldots ,10\}\).

Proposition 1 provides lower bounds on the beliefs needed to sustain a particular strategy, \(s_{k}\). For each round \(l \ge k\), the subjective probability that the second-mover will play tit for tat until round \(l\) must be high enough that the expected payoff of cooperating in round \(l\) exceeds the payoff that can be obtained by defecting in round \(l\). To build intuition, consider the condition for round \(l = k+1\) given strategy \(s_{k}\). First, notice that the first-mover can always defect in rounds \(k+1\) and \(k\) and obtain a total payoff of 8 from these two rounds (as a rational second-mover would respond by defecting, earning each player a payoff of 4 in each round). By playing tit for tat through round \(k\) instead, the first-mover faces three possible outcomes assuming that the second-mover has also chosen a strategy in \(S\). She may earn payoffs of 0 in round \(k+1\) and 4 in round \(k\) (if the second-mover defects in rounds \(k+1\) and \(k\)), payoffs of 7 in round \(k+1\) and 0 in round \(k\) (if the second-mover cooperates in round \(k+1\) and defects in round \(k\)), or payoffs of 7 in both rounds \(k+1\) and \(k\) (if the second-mover cooperates in both rounds). Thus, the first-mover is guaranteed a payoff of at least 4 from these two rounds. She can gain an additional 4 with certainty by defecting in rounds \(k+1\) and \(k\), but expects that she can gain either an additional 3 or 10 with some probability by playing tit for tat in rounds \(k+1\) and \(k\). If the subjective probability of these cooperation payoffs is sufficiently high, then it is rational for the first-mover to play tit for tat in round \(k+1\).
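The comparison for round \(l = k+1\) can be written out explicitly. Weighting the three outcomes described above by the continuation probabilities \(p_{k+1}\) and \(p_{k}\), and comparing with the payoff of 8 from defecting in both rounds, gives

$$\begin{aligned} p_{k+1}\left( 7 + 7p_{k}\right) + 4\left( 1-p_{k+1}\right) \ge 8 \quad \Longleftrightarrow \quad p_{k+1} \ge \frac{4}{3 + 7p_{k}}, \end{aligned}$$

which is the bound in Proposition 1(a) evaluated at \(l = k+1\), where the sum has a single term equal to 3 and the product reduces to \(p_{k}\). At \(l = k\) the sum is empty and the bound reduces to \(p_{k} \ge 4/7\).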

The second-mover’s strategy is governed by similar belief conditions to the first-mover’s. Consider the second-mover’s condition for round \(l = k+1\) given strategy \(s_{k}\) (assuming that the first-mover cooperates in round \(k+1\)). The second-mover can always defect in rounds \(k+1\), \(k\), and \(k-1\) to obtain a total payoff of 20 from these three rounds (12 in round \(k+1\) and 4 in each of the following two rounds). By playing tit for tat through round \(k\) instead, the second-mover faces three possible outcomes assuming that the first-mover has also chosen a strategy in \(S\). She may earn payoffs of 7 in round \(k+1\) and 4 in rounds \(k\) and \(k-1\) (if the first-mover defects in rounds \(k\) and \(k-1\)), payoffs of 7 in rounds \(k+1\) and \(k\), and 4 in round \(k-1\) (if the first-mover cooperates in round \(k\) and defects in round \(k-1\)), or payoffs of 7 in rounds \(k+1\) and \(k\), and 12 in round \(k-1\) (if the first-mover cooperates in rounds \(k\) and \(k-1\)). Thus, the second-mover is guaranteed a payoff of at least 15 from these three rounds. She can gain an additional 5 with certainty by defecting in rounds \(k+1\), \(k\), and \(k-1\), but expects that she can gain either an additional 3 or 11 with some probability by playing tit for tat in rounds \(k+1\) and \(k\). If the subjective probability of these cooperation payoffs is sufficiently high, then it is rational for the second-mover to play tit for tat in round \(k+1\). However, if the subjective probability that the first-mover plays tit for tat in any round is less than \(\frac{1}{3}\), it is optimal for the second-mover to defect before that round due to the incentive to defect before the first-mover (and earn a payoff of 12 instead of 4 for one round).
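The same calculation can be sketched for the second-mover. As labels introduced here only for illustration, let \(a\) denote the probability that the first-mover cooperates in the second of the three rounds, and \(b\) the probability that she cooperates in the third given that she cooperated in the second. Comparing the expected payoff from playing tit for tat with the payoff of 20 from defecting immediately gives

$$\begin{aligned} 7 + 8\left( 1-a\right) + a\left( 7 + 4 + 8b\right) \ge 20 \quad \Longleftrightarrow \quad a \ge \frac{5}{3 + 8b}, \end{aligned}$$

which has the same form as the second term inside the maximum in Proposition 1(b) when the sum contains a single term. The guaranteed payoff of 15 and the additional gains of 3 and 11 described above correspond to \(a=0\), to \(a=1\) with \(b=0\), and to \(a=b=1\), respectively.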

The conditions in Proposition 1 are permissive enough that cooperation is sustained into later rounds with a large variety of beliefs. The following examples demonstrate the range of beliefs that can support late-round cooperation.

Example 1

(Uniform prior) Assume that both players have prior beliefs such that \(\mu (s_{k}) = 1/11\) for all \(k\). Then, by Proposition 1, the first-mover’s optimal Block 1 strategy is \(s_{2}\) and the second-mover’s optimal Block 1 strategy is \(s_{3}\).Footnote 35 That is, the first-mover plays tit for tat until the last round, in which she always defects, while the second-mover plays tit for tat up to the next-to-last round and always defects in the last two rounds. Hence, cooperation until the penultimate round is observed.

Example 2

(Pessimistic triangular prior) Assume that both players have prior beliefs \(\mu (s_{k}) = k\big /66\) for all \(k\). Then, by Proposition 1, the first-mover’s optimal Block 1 strategy is \(s_{3}\) and the second-mover’s optimal Block 1 strategy is \(s_{5}\). That is, the first-mover plays tit for tat for eight rounds and then defects, while the second-mover plays tit for tat for six rounds and defects thereafter. These relatively pessimistic beliefs still support cooperation for more than half of the supergame.
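Both examples can be checked numerically against the inequalities in Proposition 1. The following Python sketch is illustrative only. All function names are hypothetical, and it makes two assumptions that are not stated in the text: a player is taken to choose the smallest index \(k\) for which the conditions hold for all \(l \in \{k,\ldots ,10\}\) (the latest stopping round the beliefs support), and a player for whom no index qualifies is assigned \(s_{11}\).

```python
from fractions import Fraction
from math import prod

def continuation_probs(mu):
    """p_t = sum_{i<=t} mu(s_i) / sum_{i<=t+1} mu(s_i), for t = 1..10."""
    return {t: sum(mu[i] for i in range(1, t + 1)) / sum(mu[i] for i in range(1, t + 2))
            for t in range(1, 11)}

def denominator(p, k, l, last_weight):
    """Denominator of the Proposition 1 bound for index k at round l."""
    running = sum(3 * prod(p[j] for j in range(i, l)) for i in range(k + 1, l + 1))
    return running + last_weight * prod(p[i] for i in range(k, l))

def first_mover_ok(p, k):
    # Proposition 1(a): numerator 4, weight 7 on the final product.
    return all(p[l] >= Fraction(4) / denominator(p, k, l, 7) for l in range(k, 11))

def second_mover_ok(p, k):
    # Proposition 1(b): numerator 5, weight 8, and the floor of 1/3.
    return all(p[l] >= max(Fraction(1, 3), Fraction(5) / denominator(p, k, l, 8))
               for l in range(k, 11))

def first_mover_strategy(mu):
    """Index of the first-mover's strategy (assumed: smallest k that qualifies, else 11)."""
    p = continuation_probs(mu)
    return next((k for k in range(1, 11) if first_mover_ok(p, k)), 11)

def second_mover_strategy(mu):
    """Index of the second-mover's strategy s_{k+1} (assumed: smallest k that qualifies, else 11)."""
    p = continuation_probs(mu)
    return next((k + 1 for k in range(1, 10) if second_mover_ok(p, k)), 11)

uniform = {k: Fraction(1, 11) for k in range(1, 12)}
triangular = {k: Fraction(k, 66) for k in range(1, 12)}

print(first_mover_strategy(uniform), second_mover_strategy(uniform))        # 2 3
print(first_mover_strategy(triangular), second_mover_strategy(triangular))  # 3 5
```

Under these assumptions the sketch returns \(s_{2}\) and \(s_{3}\) for the uniform prior and \(s_{3}\) and \(s_{5}\) for the triangular prior, matching Examples 1 and 2.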

Now, consider how players update their beliefs based on revealed Block 1 histories. We assume that players’ own past opponents’ behavior from either Blocks 1 or 2 does not affect beliefs because players know that they will not face previous opponents again. In Block 2, however, players incorporate their opponents’ histories into their beliefs when available. Let \(\tilde{\mu }\) represent the updated beliefs based on the prior \(\mu \) and the observed Block 1 history. Because players are assumed to adopt a pure strategy from \(S\), beliefs are updated such that if a player’s opponent never defected before her opponent in rounds \(10,\ldots ,n\), then \(\sum _{i=n-1}^{11}\tilde{\mu }(s_{i}) = 0\) holds. If the opponent always defected before her opponent by round \(m\), then \(\sum _{i=1}^{m}\tilde{\mu }(s_{i}) = 0\) holds. If the opponent’s opponent always defected first, then \(\tilde{\mu }(s_{i}) = \mu (s_{i})\) holds for all \(i\).
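A minimal sketch of this updating step, for the case in which the opponent always defected first by round \(m\), is given below. The function name `rule_out` is hypothetical, and renormalizing the remaining mass so that \(\tilde{\mu }\) sums to one is an assumption made here for the illustration; the text only states which strategies receive zero probability.

```python
from fractions import Fraction

def rule_out(mu, excluded):
    """Set the prior mass on excluded strategy indices to zero and renormalize the rest."""
    kept = {k: (Fraction(0) if k in excluded else v) for k, v in mu.items()}
    total = sum(kept.values())  # assumed positive: some strategy must remain possible
    return {k: v / total for k, v in kept.items()}

uniform = {k: Fraction(1, 11) for k in range(1, 12)}

# Opponent always defected first by round m = 3, so s_1, s_2, s_3 are ruled out.
posterior = rule_out(uniform, excluded={1, 2, 3})
print(posterior[3], posterior[4])  # 0 1/8
```

With a uniform prior and \(m = 3\), the eight remaining strategies each receive probability \(1/8\), so the player no longer expects tit for tat to continue into the last three rounds.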

If players focus on how long they can expect their opponent to cooperate, then first-movers could have a fairly optimistic prior (i.e., first-movers may be trusting in Block 1) and then, upon learning that their opponent was a Defector in Block 1, become more pessimistic and cooperate less in Block 2. This argument is formalized in the following proposition.

Proposition 2

Suppose that the second-mover always defected before her opponent by round \(n\) of Block 1 supergames. If the naïve first-mover’s prior beliefs satisfy \(\mu (s_{k+1}) \le (3\big /4) \sum _{i=1}^{k} \mu (s_{i})\) for all \(k \in \{m,\ldots ,10\}\), where \(m \le n\), then her Block 1 strategy is \(s_{m}\) and her Block 2 strategy is \(s_{m+t}\) for some \(t\ge 1\).

Proposition 2 applies to the set of first-mover prior beliefs such that, for a certain number of rounds beginning with the first, the probability that the second-mover will begin unconditionally defecting in each of these rounds is not more than three-fourths the probability that the second-mover will play tit for tat in that round. This set includes Examples 1 and 2 above and infinitely many others. Given such beliefs, a first-mover who is classified as Trusting in Block 1, playing strategy \(s_{i}\), will respond to a second-mover’s Defector history (\(s_{j}, j \ge i\)) by choosing a strategy \(s_{i+t}, t \ge 1\) in Block 2. In other words, if the first-mover had been playing tit for tat up to round \(i\) in Block 1, she will defect at least one round earlier in Block 2 games against a second-mover whose game history shows that he always defected in round \(i\) or earlier in Block 1. Again, this prediction is consistent with our data, which show a significant decrease in cooperation by Trusting first-movers whose opponents are revealed as Defectors compared to those whose opponents’ types are not revealed. Proposition 2 also predicts earlier defections in a given supergame by Trusting first-movers when the second-mover is revealed to be a Defector than when no information is revealed, as observed in the data.
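The belief condition in Proposition 2 can be restated in terms of the continuation probabilities \(p_{k}\) defined earlier. Adding \(\sum _{i=1}^{k}\mu (s_{i})\) to both sides and rearranging,

$$\begin{aligned} \mu (s_{k+1}) \le \frac{3}{4}\sum _{i=1}^{k}\mu (s_{i}) \quad \Longleftrightarrow \quad \sum _{i=1}^{k+1}\mu (s_{i}) \le \frac{7}{4}\sum _{i=1}^{k}\mu (s_{i}) \quad \Longleftrightarrow \quad p_{k} \ge \frac{4}{7}, \end{aligned}$$

so the condition requires the round-\(k\) continuation probability to meet the threshold that Proposition 1(a) imposes at \(l = k\). As a check, the uniform prior of Example 1 gives \(p_{k} = k/(k+1) \ge 4/7\) exactly when \(k \ge 2\), and the triangular prior of Example 2 gives \(p_{k} = k/(k+2) \ge 4/7\) exactly when \(k \ge 3\), in line with the first-mover strategies \(s_{2}\) and \(s_{3}\) in those examples.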

In contrast, a first-mover may become more optimistic about how long she can expect to cooperate when her opponent is revealed to be an Imitator through his Block 1 history. In this way, a first-mover will choose to cooperate longer upon seeing that her opponent was cooperative in Block 1. The following proposition formalizes this argument.

Proposition 3

Suppose that the second-mover never defected before her opponent in rounds \(10,\ldots ,m\) of Block 1 supergames.

(a) If the naïve first-mover’s prior beliefs satisfy \(\mu (s_{k+1}) \le (3\big /4) \sum _{i=1}^{k} \mu (s_{i})\) for all \(k \in \{n,\ldots ,10\}\), where \(n > m\), then her Block 1 strategy is \(s_{n}\) and her Block 2 strategy is \(s_{n-t}\) for some \(t\ge 1\).

(b) If the naïve first-mover’s prior beliefs satisfy \(\mu (s_{11}) > 3\big /7\), then her Block 1 strategy is \(s_{11}\) and her Block 2 strategy is \(s_{11-t}\) for some \(t\ge 1\).

Proposition 3 applies to the same set of prior beliefs as Proposition 2 as well as beliefs under which the first-mover defects in every round of Block 1 supergames. Under the latter beliefs, a first-mover plays strategy \(s_{11}\), is classified as Non-Trusting in Block 1, and responds to a second-mover’s Imitator-type history (\(s_{i}, i \le 10\)) by choosing a strategy \(s_{j}, j \le 10\) in Block 2. The simple intuition for this result is that if the first-mover had been playing tit for tat up to round \(i\) in Block 1, she will continue to play tit for tat at least one round later in Block 2 games against a second-mover whose game history shows that he always played tit for tat beyond round \(i\) in Block 1. This prediction is consistent with our data, which shows a significant increase in initial cooperation by Non-Trusting first-movers when their opponents are revealed as Imitators compared to those whose opponents’ types are not revealed, a finding that is clearly inconsistent with the predictions of the Kreps et al. (1982) model. This model may also provide a more plausible explanation than Kreps et al. (1982) for the finding of the public goods experiment of Gächter and Thöni (2005), where low contributors contribute more when grouped with other players revealed to have made low contributions in the past.
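The threshold \(3/7\) in part (b) of Proposition 3 can be related to the inequality used in Proposition 2 and in part (a). As a short worked step, assume that the prior is a probability distribution over the eleven strategies \(s_{1},\ldots ,s_{11}\), so that \(\sum _{i=1}^{10} \mu (s_{i}) = 1 - \mu (s_{11})\). Under that assumption, the inequality evaluated at \(k=10\) becomes

\[
\mu (s_{11}) \le \frac{3}{4}\bigl (1-\mu (s_{11})\bigr )
\;\Longleftrightarrow \;
\frac{7}{4}\,\mu (s_{11}) \le \frac{3}{4}
\;\Longleftrightarrow \;
\mu (s_{11}) \le \frac{3}{7}.
\]

Read this way, the condition \(\mu (s_{11}) > 3/7\) in part (b) describes exactly the priors for which the \(k=10\) inequality fails, which is why part (b) covers beliefs that lie outside the set considered in Proposition 2.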

7 Conclusion

We have shown that cooperation in FRPDs occurs when the reputation-building theory of Kreps et al. (1982) predicts complete unraveling. The results of the experiment indicate that first-movers change their strategy when they observe their opponent’s history of play by either increasing or decreasing their degree of cooperation based on the relative cooperativeness of their opponent. First-movers tend to cooperate at least as often initially and continue cooperating at least as long when second-mover histories are revealed, except in the case of relatively trusting first-movers meeting relatively uncooperative second-movers. Second-movers also tend to behave more cooperatively when their histories are revealed. In particular, we find the surprising result that revealing histories improves cooperation even in the case of a relatively Non-Trusting first-mover meeting a relatively uncooperative second-mover. Thus, cooperation persists and often increases, even when revealed histories are relatively uncooperative. These results are clearly inconsistent with the reputation-building theory.

We show that an alternative behavioral model to Kreps et al. (1982) generates predictions that are consistent with the features of our experimental data. Players in this model form beliefs over the strategies of their opponents, which may not be consistent with the opponent’s best response, and then choose the optimal strategy based on those naïve beliefs. We do not view this as the ultimate model of behavior in FRPDs, but as a simple and reasonable one which generates predictions that fit the observed behavior better than prevailing equilibrium models. By using such a simple model, we avoid ad hoc assumptions about more specific behavioral types which could possibly fit behavior in this game more precisely. One limitation of this analysis is that beliefs are a critical part of our behavioral model, but we are able to observe beliefs only in a very limited way. Because our main hypotheses could be tested without elicited beliefs, and because eliciting beliefs before or during gameplay is complicated and may itself alter beliefs and behavior, we opted not to do so. Examining beliefs in more depth may be an interesting direction for future research.