Abstract
In a sequential-move, finitely repeated prisoners’ dilemma game (FRPD), cooperation can be sustained if the first-mover believes her opponent might be a behavioral type who plays a tit-for-tat strategy in every period. We test this theory by revealing second-mover histories from an earlier FRPD experiment to their current opponent. Despite eliminating the possibility of reputation-building, aggregate cooperation actually increases when histories are revealed. Cooperative histories lead to increased trust, but negative histories do not cause decreased trust. We develop a behavioral model to explain these findings.
1 Introduction
Cooperation in the finitely repeated prisoner’s dilemma (FRPD) is both widely observed and difficult to rationalize. The model developed by Kreps et al. (1982)—the most prominent theory to justify this behavior—shows that such cooperation can be rational if one player believes her opponent might be a “behavioral type” who plays the tit-for-tat strategy regardless of the history of play. The opponent can then take advantage of these beliefs by imitating the behavioral type early in the game and defecting as the final period approaches, resulting in a pattern of cooperation consistent with aggregate-level data from laboratory experiments (e.g., Andreoni and Miller 1993).
This reputation-building theory requires that players have some uncertainty about their opponents’ types. In our experiment, however, we find that cooperative play persists even when reputation-building is rendered impossible by revealing players’ histories of play. Specifically, subjects who have completed a block of five sequential-move FRPD games against varying opponents then play a second block of five games against new opponents, who can see the subjects’ histories of play from the first block.Footnote 1 Selfish, rational players cannot credibly imitate the behavioral type in the second block because their true colors have been revealed through their history of play in the first block. Despite eliminating type uncertainty, we find aggregate patterns of cooperation very similar to treatments where no history is revealed. On the individual level, first-movers who are relatively distrusting (seldom cooperating in the first block) tend to be more cooperative in the second block, even when the second-mover’s revealed history is relatively uncooperative. Hence, rather than reducing cooperation by eliminating the opportunity for reputation-building, revealing histories of play generally improves cooperation. This finding is clearly inconsistent with standard reputation-building explanations of cooperation in FRPDs.
We organize these results through a model of semi-rational behavior similar to those of Kreps et al. (1982) and Radner (1986). Here, players decide in which round to stop playing tit-for-tat and begin unconditionally defecting by weighing the risk of cooperation against the immediate gain from defecting. Unlike Kreps et al. (1982), players in our model form arbitrary or “naïve” prior beliefs about how many rounds their opponents will continue playing tit-for-tat, which may be inconsistent with the opponent’s actual strategy. Players decide how long to conditionally cooperate in each game based only on these naïve prior beliefs and information about their opponents’ history of play, if revealed. Because this model does not assume any higher-level reflection about the rationality or best response of the opponent, it provides a contrasting benchmark to a model of full, commonly known rationality.
One implication of this model is that cooperation is sustainable for many rounds even when players have relatively pessimistic beliefs about their opponents’ strategies. For example, the model predicts that cooperation will be sustained until the penultimate round if the players have a uniform prior, i.e., if players believe their opponent will defect with equal probability in every round of the FRPD. In stark contrast to the reputation-building theory, the model also predicts that the level of cooperation will be the same or even higher when players learn that their opponent is not a behavioral type. This pattern of behavior is frequently observed in our experiment and sufficient to reject reputation-building as an explanation for cooperation. Hence, we find that this semi-rational model of naïve beliefs rationalizes the observed experimental behavior.
The paper is organized as follows. In Sect. 2, we review the related theoretical and experimental literatures. Section 3 contains a description of the reputation-building theory of cooperation in FRPDs. Section 4 details the experimental design, and the results of the experiment are presented in Sect. 5. In Sect. 6, we propose a model of boundedly rational cooperation that explains the experimental results. Section 7 concludes with a summary of the main results. The “Appendix” contains the proofs and additional summary statistics.
2 Related literature
Many experiments have studied the consistency of players’ behavior in FRPDs with the theory of Kreps et al. (1982). For example, Andreoni and Miller (1993) compare the amount of cooperation in 10-round FRPDs with one-shot prisoners’ dilemma games. They find significantly higher cooperation in the FRPDs compared to the one-shot games, as well as significantly higher cooperation in early rounds compared to later rounds. These patterns are consistent with the reputation-building theory of Kreps et al. (1982) at an aggregate level, though in a similar FRPD experiment, Cooper et al. (1996) observe that, at the individual level, only 25 % of subjects play consistently with reputation-building. Cooper et al. argue that the time path of play exhibits more cooperation than the Kreps et al. model predicts and speculate that their findings could indicate reputation-building if they were to consider alternative types of “irrational” players.Footnote 2
Camerer and Weigelt (1988) study the reputation-building sequential equilibrium in a finitely repeated investment game that is similar in structure to our FRPD game. The authors randomly assign a small fraction of second-movers to have payoffs such that they prefer cooperation over defecting. This exogenously induces the behavioral type. They find evidence consistent with the equilibrium prediction: as time progresses, first-movers are less likely to trust and second-movers are more likely to defect. The observed mixing probabilities are different than predicted by the induced probability of the behavioral type, but can be explained easily by assuming first-movers believe an additional 17 % of second-movers with the “noncooperative” incentives still prefer cooperation. To test this theory, the authors run additional experiments in which all second-movers were given noncooperative payments. The sequential equilibrium prediction assuming beliefs of 17 % is calculated, and the data from the new experiments conform surprisingly well to that prediction. Thus, Camerer and Weigelt (1988) provide strong evidence in favor of a reputation-building theory in which the first-mover’s beliefs are “homemade” naturally and need not be induced in the laboratory.
Neral and Ochs (1992) extend Camerer and Weigelt (1988) by analyzing the behavioral responses of players to changes in the parameters of the game. Like Camerer and Weigelt, they find that uncertainty induces players to develop a mutually profitable relationship consistent with the predictions of sequential equilibrium. However, Neral and Ochs find problems with the comparative static predictions of the sequential equilibrium model. When the parameters of the game are altered (e.g., when the payoff to the second-mover is decreased), they find that the players respond in the exact opposite direction from what the theory predicts and that these results cannot be explained using the homemade priors specified by Camerer and Weigelt.Footnote 3
More recently, Reuben and Suetens (2012) find that many players cooperate in a manner consistent with Kreps et al. (1982). Reuben and Suetens use the strategy method to disentangle strategically and non-strategically motivated cooperation in a sequential prisoner’s dilemma with an uncertain end point in which cooperation is not an equilibrium strategy for rational players.Footnote 4 Players in their experiment can condition their action on whether they are currently playing the last period of the game or whether the game will continue. Second-movers who cooperate as long as the first-mover cooperates unless it is the final round are classified as reputation builders. Among second-movers who cooperate, the authors find that one-third to two-thirds do so for reputation-building rather than reciprocity. Moreover, Reuben and Suetens find that an increase in the payoff of mutual cooperation increases the ratio of reputation builders to unconditional defectors, consistent with Kreps et al. (1982).
Kagel and McGee (2014) compare individual play and team play in the finitely repeated prisoners’ dilemma. In the team play treatment, each player in the game is a 2-person team who can chat internally (but cannot communicate with opponents) and makes joint decisions. They find that teams are initially less cooperative than individuals, but with experience become more cooperative. Kagel and McGee’s analysis of team chat logs suggests that cooperation is driven by a failure of common knowledge of rationality, as teams attempt to anticipate when their opponents might defect and try to defect one period earlier, without accounting for the possibility of their opponents thinking similarly. Inconsistent with Kreps et al. (1982), they find no evidence that players anticipate opponents’ beliefs and attempt to mimic an irrational player, while they find that increased chat about the round in which the opponent will defect is accompanied by an increase in cooperation over the course of the experiment. This finding is also consistent with Embrey et al. (2014), who find that players learn to play “threshold strategies” in which they conditionally cooperate for a fixed length of time and then defect. Though players converge to these strategies, unraveling of cooperation occurs very slowly across supergames. Subjects clearly are not applying backwards-induction reasoning.
Other experiments have examined how cooperation in one-shot prisoners’ dilemma games is impacted when players see their opponents’ play history. Schwartz et al. (2000), Camera and Casari (2009), and Gong and Yang (2010) all find that observing an opponent’s history of play significantly increases cooperation, though Duffy and Ochs (2009) do not observe this effect in their data. In all of these studies, subjects are aware that their actions will be revealed to future opponents. Thus, players can follow a reputation-building strategy even though their opponents differ in every period. In contrast, because subjects in our experiment are not told in the first part of the experiment that their play histories will be revealed later, they are unlikely to perceive any benefit from cooperating in the final period of each repeated game in the first part.
Alternative models have been proposed that predict rational cooperation in FRPDs. For example, Selten and Stoecker (1986) develop an alternative theory based on a Markov learning model. In their model, players establish a period of first defection and update their period of intended defection based on their experience in the previous supergame. The authors conduct an experiment in which subjects play twenty-five 10-period FRPD supergames and find that cooperation ends in earlier periods as subjects gain experience.Footnote 5 Subjects in their experiment were told that they would play each opponent only once and were given no information about opponents’ histories of play in prior supergames. Hence, their data do not address the validity of reputation-building directly, nor are their results inconsistent with reputation-building.Footnote 6 By revealing second-mover histories in one treatment and not in another, our design allows a more direct examination of how a player’s intended period of first defection and beliefs about her opponent’s intentions matter for cooperation.Footnote 7
Other studies have focused on the role of reputation in inducing cooperation, though not in the context of an FRPD game. Gachter and Thoni (2005) and Ambrus and Pathak (2011) show how cooperation can be sustained in a public goods contribution game when some players are selfish and others are reciprocating, with varying information on other players’ past behavior. As in our design, Ambrus and Pathak incorporate a restart in their experimental design to see how it impacts cooperation, as do Gachter and Thoni (2005). In Gachter and Thoni (2005), the past behavior of subjects is revealed, and the response to these “reputations” in the public goods game is studied. Players in Ambrus and Pathak’s experiment know in advance that they will be participating in subsequent games, whereas players in Gachter and Thoni (2005), as in our experiment, do not. Bolton et al. (2005) examine how information about a player’s partner in an image scoring game affects cooperation, while Irlenbusch and Sliwka (2005) study the role of reputation and uncertainty about the partner’s type in inducing cooperation in a gift-exchange setting. The former find that providing players with more information about their partner’s last action, as well as the action of their partner’s previous partner, increases cooperation, while the latter find that direct reciprocal behavior is stronger when efforts are revealed. Healy (2007) considers a reputation-building equilibrium when firms stereotype workers and finds that selfish workers imitate fair-minded types when firms have sufficiently high priors to generate cooperation. Similarly, Roe and Wu (2009) find evidence for the reputation-building equilibrium: employees classified as selfish mimic cooperative employees when individual histories are observable, but not when histories are kept private. Andreoni and Croson (1998) survey repeated public goods game experiments and find little evidence of reputation-building behavior.
Although reputation-building would be expected to lead to higher contributions when players’ opponents are fixed, this is not always the case (Andreoni 1988), and a restart effect seems more influential than the matching regime in raising contributions.Footnote 8
3 A reputation-based theory of cooperation
In this section, we apply the theory developed by Kreps et al. (1982) to the sequential-move FRPD played by the participants in our experiment and characterize the optimal strategies.Footnote 9, Footnote 10 A single stage of the sequential-move FRPD played in our experiment is shown in Fig. 1. As with Kreps et al. (1982), we assume the first-mover believes the second-mover may be a tit-for-tat behavioral type with positive probability and that a rational second-mover is aware of (and can take advantage of) this belief.Footnote 11 The tit-for-tat type always reciprocates the first-mover’s action, regardless of the period.Footnote 12 In the following analysis, we therefore focus only on the rational, payoff-maximizing second-mover’s decisions.Footnote 13
Let \(p_t\) be the first-mover’s period-\(t\) belief (probability) that the second-mover is the tit-for-tat type, with \(p_1\in (0,1)\) representing her prior belief. Updating occurs in equilibrium according to Bayes’ rule. If \(p_t=0\) in any \(t\), then by standard unraveling arguments, both players must play \(D\) in period \(t\) and thereafter. For this game, there exists a unique sequential equilibrium in which there is a belief threshold \(\overline{p}_t\) such that the first-mover is willing to trust the second-mover (by playing C) in period \(t\) if and only if \(p_t\ge \overline{p}_t\). The rational second-mover prefers to maintain a reputation for being the tit-for-tat type by responding to C with C (i.e., conditionally cooperate) until some later round that is dependent on \(p_t\). As the end of the game nears, the expected payoff to the first-mover from continuing to play C declines since there are fewer rounds left and the probability that a rational second-mover plays D increases. In consequence, the belief threshold \(\overline{p}_t\) is strictly increasing in \(t\) up to \(\overline{p}_{10}=4/7\), given the specific stage-game payments shown in Fig. 1.
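The final-period value \(\overline{p}_{10}=4/7\) can be checked with a short calculation. The sketch below assumes the first-mover's stage payoffs are those commonly reported for Andreoni and Miller (1993): 7 for mutual cooperation, 0 when her cooperation is met with defection, and 4 for mutual defection. Fig. 1 is not reproduced here, so these numbers and all names in the code are assumptions for illustration only.

```python
from fractions import Fraction

# ASSUMED first-mover payoffs (Andreoni and Miller 1993 values; not read from Fig. 1).
R = Fraction(7)  # first-mover plays C, second-mover plays C
S = Fraction(0)  # first-mover plays C, second-mover plays D
P = Fraction(4)  # first-mover plays D, second-mover plays D


def final_period_threshold(reward, sucker, punishment):
    """Smallest belief p at which trusting in the last period is weakly optimal.

    In the final period only the tit-for-tat type reciprocates, so trusting
    yields p * reward + (1 - p) * sucker, while defecting yields punishment
    for certain. Solving p * reward + (1 - p) * sucker >= punishment for p
    gives the expression returned below.
    """
    return (punishment - sucker) / (reward - sucker)


threshold = final_period_threshold(R, S, P)
print(threshold)  # 4/7
```

Under these assumed payoffs the calculation reproduces the threshold stated in the text. Thresholds for earlier periods depend on continuation values and are not derived here.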
When \(p_1<\overline{p}_1\), the first-mover will defect in all periods, but when \(p_1>\overline{p}_1\), the first-mover initially trusts the second-mover, who will perfectly imitate a tit-for-tat type if he is rational. Because the second-mover is either tit-for-tat or imitating a tit-for-tat type, the first-mover’s beliefs do not change in early periods. At some point in time, however, because the belief threshold increases as \(t\) increases, \(\overline{p}_t\) may rise above \(p_1\), at which point the first-mover would stop trusting the second-mover. Let \(\overline{t}\) be the first period in which \(\overline{p}_t>p_1\). The second-mover benefits from being trusted and would prefer to keep \(p_t\) weakly above \(\overline{p}_t\) in every period \(t\ge \overline{t}\). He does this by playing a mixed strategy, so that the first-mover’s beliefs shift up to exactly \(p_{t}=\overline{p}_{t}\) for all \(t\ge \overline{t}\). That is, if the second-mover is observed to conditionally cooperate at \(t\), then the first-mover’s belief that the second-mover is in fact a tit-for-tat type increases, given that a rational second-mover would have defected with some probability. Formally, the second-mover conditionally cooperates in period \(t\) with probability
and the first-mover’s beliefs update to
We refer to \(q_t^*\) as the post-threshold probability of cooperation. Since \(p_t\) must continue to increase over time, \(q_t^*\) must decrease accordingly to keep \(p_t\) above \(\overline{p}_t\).
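As a sketch of the Bayes' rule step, assume that \(p_t\) is the belief held at the start of period \(t\), that the tit-for-tat type reciprocates C with probability one, and that the rational second-mover conditionally cooperates with probability \(q_t\). These timing conventions are assumptions for illustration and may differ from the authors' own notation. After observing conditional cooperation in period \(t\), the posterior belief is

\[
p_{t+1}=\frac{p_t}{p_t+(1-p_t)\,q_t}.
\]

Requiring the posterior to land exactly on the next threshold, \(p_{t+1}=\overline{p}_{t+1}\), and solving for \(q_t\) gives

\[
q_t^*=\frac{p_t\,(1-\overline{p}_{t+1})}{(1-p_t)\,\overline{p}_{t+1}},
\]

which lies strictly between 0 and 1 whenever \(0<p_t<\overline{p}_{t+1}<1\).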
When \(p_t=\overline{p}_t\), the first-mover is exactly indifferent between \(C\) and \(D\). Since the second-mover is mixing, he must also be exactly indifferent between \(C\) and \(D\). This is done by having the indifferent first-mover mix between \(C\) and \(D\) in period \(t+1\) with appropriate probabilities. If the first-mover’s realized action is \(D\), then both types of second-mover respond with \(D\) and beliefs do not update. Thereafter, the first-mover does not trust the second-mover (because \(p_{t+1}=p_t=\overline{p}_t<\overline{p}_{t+1}\)), and the second-mover never has a chance to alter beliefs, so defection occurs in every subsequent period.Footnote 14 The structure of the unique sequential equilibrium for a first-mover having belief \(p_1>\overline{p}_1\) is displayed in Fig. 2.
Observe that the path of equilibrium play depends crucially on \(p_1\). If \(p_1> \overline{p}_{10}=4/7\), then no mixing phase is needed; both players play C with certainty until the rational second-mover defects in the final period. If \(p_1\in [\overline{p}_1,\overline{p}_{10}]\), then mixing will begin in period \(\overline{t}\) (the smallest \(t\) such that \(p_1\le \overline{p}_t\)), after which one defection will lead to defection in all subsequent actions. If \(p_1<\overline{p}_1\), then the first-mover will never trust the second-mover, the second-mover will never have an opportunity to alter beliefs, and defection will occur in every action.
Regardless of \(p_1\), the realized path of play must feature a regime shift from cooperation to defection. This shift can be triggered by either player, can occur in the first action, and may never occur if tit-for-tat players truly exist. In the laboratory, beliefs are not directly observable and second-movers’ types are unknown, so the time at which defection begins cannot be predicted without additional data.Footnote 15
4 Experimental design
Our experiment is designed to test directly the reputation-building aspect of the Kreps et al. (1982) theory in a sequential-move, finitely repeated prisoner’s dilemma. In all treatments, 20 subjects are divided into two equal-sized groups: first-movers and second-movers. The sessions are divided into two blocks. In each block, each subject plays five finitely repeated prisoners’ dilemma supergames, with each supergame played against a different subject from the other group.Footnote 16 Through the course of the experiment, each subject will therefore play a supergame with each subject in the other group exactly once (5 in Block 1 and the other 5 in Block 2).Footnote 17 Each supergame is 10 rounds in length.
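One way to obtain a matching in which every first-mover meets every second-mover exactly once is a cyclic rotation. The sketch below assumes ten subjects per group and splits the ten resulting supergames into two blocks of five. The text does not describe the matching algorithm actually used, so the rotation rule and the name `rotation_schedule` are hypothetical.

```python
def rotation_schedule(group_size):
    """Return one list of (first_mover, second_mover) pairs per supergame.

    In supergame s, first-mover i is paired with second-mover (i + s) mod
    group_size. This cyclic rule is an ASSUMPTION for illustration; it
    guarantees that each cross-group pair meets exactly once.
    """
    return [
        [(i, (i + s) % group_size) for i in range(group_size)]
        for s in range(group_size)
    ]


schedule = rotation_schedule(10)
block1, block2 = schedule[:5], schedule[5:]  # five supergames in each block

# Every cross-group pair appears exactly once over the ten supergames.
all_pairs = [pair for supergame in schedule for pair in supergame]
print(len(all_pairs), len(set(all_pairs)))  # 100 100
```

Any Latin-square arrangement would satisfy the same property. The rotation is simply the shortest one to write down.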
The sequential-move game allows us to focus on the second-mover and her opportunities for reputation-building. We study this by varying the information structure in two treatments, denoted 2S and 1S. The first block of five supergames is identical across the two treatments. Players see their opponent’s history in the current supergame, but not from any prior supergames. In Block 2 of treatment 2S, subjects in each supergame see their opponents’ entire history of play from their five Block 1 supergames, as well as the play of their opponents’ opponents in these five supergames. Thus, all of the first-mover’s actions, all of the first-mover’s opponents’ actions, all of the second-mover’s actions, and all of the second-mover’s opponents’ actions from Block 1 are revealed to both players in each Block 2 supergame of 2S. It is commonly known that this information is revealed to both players.
In treatment 1S, only the first-mover’s history from Block 1 is revealed to the second-mover (including the first-mover’s actions and the first-mover’s opponents’ actions); the second-mover’s history is not revealed to the first-mover. Again, this revelation structure is commonly known. The purpose of this treatment was to control for changes in behavior between Blocks 1 and 2 that result from revealing the first-mover’s Block 1 history of play to Block 2 second-movers as well as to use as a control for any “restart effect.” Differences between 1S and 2S can then be interpreted as the effect of revealing the second-mover’s Block 1 history of play to Block 2 first-movers (revealing reputations), given that the second-mover is equally informed about the first-mover’s history. The treatment names 2S and 1S are mnemonic for “two-sided” and “one-sided” knowledge of histories, respectively.
At the beginning of Block 1 of both treatments, subjects know they will play five finitely repeated prisoners’ dilemma supergames against five different opponents in Block 1. Subjects are also told at the beginning of Block 1 of both treatments that we will conduct a second experiment immediately following the first in which they may also participate if they want. They are informed that instructions for the second experiment will be distributed after the first experiment concludes. The subjects are not told that they will play prisoners’ dilemmas in the second experiment or that their histories from Block 1 may be revealed in Block 2. Only at the beginning of Block 2 do they learn that they will play additional repeated prisoners’ dilemma supergames and that their history from Block 1 will be revealed (depending on the treatment). Subjects have the option of taking their earnings after Block 1 and leaving instead of participating in the Block 2 experiment, and if they participate in the second experiment, they are guaranteed a second minimum show-up payment. All of the subjects chose to stay for the second experiment.
As mentioned in the previous section, we use the same stage-game payoffs as Andreoni and Miller (1993) shown in Fig. 1. Subjects are paid for one randomly selected supergame out of the five supergames in each Block, and this is known in advance.
We use a strategy-elicitation method for second-mover choices, which asks subjects to enter their action conditional on the first-mover defecting (choosing right) and their action conditional on the first-mover cooperating (choosing left) before learning the first-mover’s action in each round. The second-mover’s strategy for a particular round is implemented for them after the first-mover chooses an action for that round. In a survey paper comparing the strategy method to direct response, Brandts and Charness (2011) found that while the two methods can induce different behavior in some experiments, there is generally no significant difference in behavior in prisoner’s dilemma experiments between the two. They conclude from their survey that in cases where behavior differs between the two methods, the strategy method provides a lower bound for treatment effects.Footnote 18
In the sequential-move prisoner’s dilemma, the second-mover’s strategic intention may be censored in rounds where the first-mover defects, which leads to defection by the second-mover as well according to a wide range of strategies. We elicited second-mover choices using the strategy method, so that we would be able to identify the second-mover’s intent to cooperate if reciprocated, regardless of the first-mover’s actual choice. This way, if the first-mover defected and the second-mover responded by defecting in the same round, we would know whether the second-mover would have continued cooperating or unilaterally defected if the first-mover had instead cooperated in that round. All of the second-mover results reported in the paper focus on this conditional cooperation (the second-mover’s choice conditional on cooperation by the first-mover) and not the choice actually implemented for the second-mover in response to the first-mover.
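The logic of the strategy-elicitation method can be sketched in a few lines. The function name `play_round` and the dictionary keys are hypothetical; the point is only that the second-mover's response to cooperation is recorded even in rounds where the first-mover defects.

```python
def play_round(first_mover_action, response_if_c, response_if_d):
    """Resolve one round under the strategy method.

    The second-mover submits both conditional responses before seeing the
    first-mover's action. Only the relevant response is implemented, but the
    response to cooperation is recorded regardless of what the first-mover
    actually chose.
    """
    implemented = response_if_c if first_mover_action == "C" else response_if_d
    return {
        "implemented": implemented,
        "intends_conditional_cooperation": response_if_c == "C",
    }


# The first-mover defects, so D is implemented for the second-mover, yet the
# record still shows she would have reciprocated cooperation in this round.
print(play_round("D", response_if_c="C", response_if_d="D"))
# {'implemented': 'D', 'intends_conditional_cooperation': True}
```

Under direct response, the same round would reveal only the implemented D, which is the censoring problem described above.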
5 Results
We conducted 14 sessions of the experiment at The Ohio State University experimental economics laboratory with a total of 264 subjects: seven sessions of the \(1S\) treatment with 130 subjects and seven of the \(2S\) treatment with 134 subjects. Subjects were chosen randomly from a pool of students at The Ohio State University who had previously signed up to be considered for participation in economic experiments. Subjects could not participate in more than one session of this experiment. Table 1 provides summary information for the experiment. Payoffs in the experiment were denominated in “points” and converted into dollars at the rate of four points per dollar. Average earnings per subject were approximately $27. We proceed by first analyzing the players’ aggregate behavior and then focus on how information impacts behavior at the individual level.
5.1 Aggregate behavior
We focus on cooperation by the first-mover and conditional cooperation by the second-mover (choosing to cooperate conditional on the first-mover cooperating) as our outcomes of interest. The first hypothesis, based on Kreps et al. (1982), states that cooperation will be eliminated when the second-mover’s history reveals she is rational.
Hypothesis 1
Compared to Block 2 of 1S, the rate of aggregate cooperation should not be higher in Block 2 of 2S.
Figure 3 displays the paths of aggregate cooperation by first-movers over the course of a supergame. The first row represents the data pooled from each treatment for the first block, the second row represents the Block 2 data for 1S, and the third row represents the Block 2 data for the 2S treatment. Each column represents a supergame in the order in which they were played. The paths of play in Block 1 (1S and 2S pooled) and Block 2 of 1S are quite similar for first-movers as we would expect given that there is no change in the information given to first-movers between these treatments. The path of play in Block 2 of 2S, however, exhibits higher levels of cooperation than the other treatments. Moreover, the level of cooperation increases across supergames, and the path of play shows a clear endgame effect in late supergames of Block 2—something that is not clearly apparent in other supergames where aggregate cooperation declines more steadily by round.
Figure 4 displays the paths of aggregate second-mover conditional cooperation—cooperate conditional on the first-mover cooperating—over the course of a supergame. The paths of play are again shown by treatment and supergame, as in Fig. 3. The paths of aggregate cooperation over supergames in Block 2 of 1S show more of an endgame effect than those in Block 1 (pooled) since cooperation is maintained in early rounds then drops off more sharply in later rounds. However, round 10 behavior is roughly the same across blocks, suggesting that players view the endgame similarly in both.Footnote 19
The paths of play in Block 2 of both 1S and 2S exhibit more cooperation than in Block 1, and Block 2 of 2S exhibits even higher levels of cooperation than Block 2 of 1S, even though first-movers in 2S observe the histories of second-movers. Greater overall cooperation, sustained until later rounds of the supergames, when second-mover histories are revealed to first-movers is again inconsistent with the predictions of Kreps et al. (1982).
Result 1
We observe greater aggregate cooperation in Block 2 when second-movers’ histories of play are exposed to first-movers (treatment 2S, compared to 1S).
Table 2 reports the aggregate cooperation frequencies of first-movers and the aggregate conditional cooperation frequencies of second-movers by block and treatment. There is almost no difference in Block 1 cooperation between 1S and 2S for either first- or second-movers. However, not only do both first- and second-movers cooperate more in Block 2 of 2S than in Block 1, but their cooperation rates are also slightly higher than those of first- and second-movers in Block 2 of 1S. Wilcoxon–Mann–Whitney tests (bootstrapped to account for clustering by session) confirm that there is no between-treatment difference in Block 1 cooperation rates for first- or second-movers (\(p\) values 0.959 and 0.662, respectively), while Block 2 cooperation rates are slightly higher in 2S than in 1S (\(p\) values are 0.063 and 0.071 for first- and second-movers, respectively).Footnote 20
5.2 Individual-level behavior
As Cooper et al. (1996) point out, individual behavior provides a better test of theoretical predictions in the FRPD than aggregate cooperation. We now analyze individual-level behavior and classify players into types based on that behavior.
Figure 5a, b plots average cooperation for each subject in Blocks 1 and 2. The first panel shows cooperation rates for first-movers, while the second shows conditional cooperation rates for second-movers. Each data point represents a single subject. Subjects in the 1S treatment are black circles, while subjects in the 2S treatment are white circles. Lack of a treatment effect would express itself in data points scattered symmetrically about the 45-degree line. In both panels, the 1S data are distributed in this manner. However, the 2S data are skewed above the 45-degree line, indicating that revealing Block 1 histories of second-movers increases cooperation for both first- and second-movers. Wilcoxon signed-rank tests for data having within-group correlations (Larocque 2005) show that differences in the frequency of cooperation across blocks are significant at the 10 % level or better in 2S (\(p\) value = 0.063 for first-movers and \(p\) value = 0.015 for second-movers) and insignificant in 1S (\(p\) value = 0.222 for first-movers and \(p\) value = 0.210 for second-movers).Footnote 21
We classify players into types within each block to control for the heterogeneity in individual behavior. By focusing on how information differentially impacts players conditional on their type, we can identify whether the behavior of some players is consistent with the reputation-building theory. Importantly, we classify players based on the histories of play only—not on the off-path choices of second-movers—as our main focus is on the information contained in Block 1 histories revealed in Block 2. Furthermore, we classify second-movers based solely on their behavior in rounds in which the first-mover cooperated. Second-movers cannot reveal any information about their types in rounds in which the first-mover defected, as tit for tat and reputation-building players alike would defect in such cases. For the ease of explaining the type classification, rounds in which the first-mover cooperated are referred to as trusting rounds.
The classification procedure works as follows. For each supergame, a second-mover is classified as an Imitator if her behavior is consistent with the reputation-building strategy of Kreps et al. (1982), i.e., she plays cooperate in the first trusting round and continues doing so until some later trusting round (possibly round 10), after which she plays defect in each subsequent trusting round. Otherwise, she is classified as a Cooperator if she cooperates in the first trusting round (but did not play as an Imitator) or as a Defector if she does not cooperate in the first trusting round. Her type classification for the block is the mode of her five supergame classifications within that block. In the case of a tie, the most recently used modal type is used.Footnote 22
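The procedure above can be sketched in a few lines of Python. This is an illustrative reading of the rules rather than the code used for the paper: the function names and the list-of-booleans input format are hypothetical, a supergame with no trusting rounds is left unclassified, and a second-mover who cooperates in every trusting round is treated as a Cooperator (the text leaves that boundary case open to interpretation).

```python
from collections import Counter

def classify_supergame(trusting_actions):
    """Classify one supergame from the second-mover's actions in trusting rounds.

    trusting_actions: chronological list of booleans (True = cooperate), one per
    round in which the first-mover cooperated. Returns None when there are no
    trusting rounds (an assumption; the text does not cover this case).
    """
    if not trusting_actions:
        return None
    if not trusting_actions[0]:
        return "Defector"
    # Imitator: cooperation from the first trusting round, then only defection.
    # Assumption: at least one defection is required to count as an Imitator.
    if False in trusting_actions:
        first_defect = trusting_actions.index(False)
        if not any(trusting_actions[first_defect:]):
            return "Imitator"
    return "Cooperator"

def classify_block(supergame_types):
    """Modal type over a block; ties go to the tied type used most recently."""
    observed = [t for t in supergame_types if t is not None]
    if not observed:
        return None
    counts = Counter(observed)
    top = max(counts.values())
    tied = {t for t, c in counts.items() if c == top}
    # Walk backwards through the block to find the most recent modal type.
    for t in reversed(observed):
        if t in tied:
            return t
```

For instance, a block with supergame types Defector, Cooperator, Cooperator, Defector, Imitator has a two-way tie, and the rule resolves it in favor of Defector because that type was used more recently than Cooperator.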
By this procedure, we arrive at a classification of second-movers that summarizes quite well their Block 1 history of play as observed by first-movers. The overall percentage of Block 1 supergames having a second-mover with a type classification of Imitator, Cooperator, or Defector is 18.2, 41.7, and 39.4 %, respectively.Footnote 23 Based on these type classifications, we have the following hypothesis regarding how a second-mover’s classification will change in Block 2 when her history of play is revealed.
Hypothesis 2
Second-movers in 2S who play as Imitators or Defectors in Block 1 play as Defectors in Block 2 because they are revealed to first-movers as rational.
Table 3 reports the type classifications for second-movers by treatment and block in the form of a Markov transition matrix. The data reveal that transitions of second-mover types from Block 1 to Block 2 in 2S are inconsistent with reputation-building. Surprisingly, nearly half of the Defectors in Block 1 become Cooperators in Block 2, and only 38 % of the Block 1 Defectors remain Defectors in Block 2. Even more striking, none of the second-movers who are Imitators in Block 1 of 2S become Defectors in Block 2. These findings are summarized in the following result.
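A transition matrix of this kind can be built from pairs of block-level classifications. The sketch below is only illustrative: the function name and the input format are hypothetical, and the example pairs in the usage note are made up rather than taken from Table 3.

```python
from collections import Counter

TYPES = ["Imitator", "Cooperator", "Defector"]

def transition_matrix(pairs, types=TYPES):
    """Row-normalized transition shares from (Block 1 type, Block 2 type) pairs.

    Each row gives, for subjects of a given Block 1 type, the share that ends up
    in each Block 2 type. Rows with no subjects are returned as all zeros.
    """
    counts = Counter(pairs)
    matrix = {}
    for origin in types:
        row_total = sum(counts[(origin, dest)] for dest in types)
        matrix[origin] = {
            dest: (counts[(origin, dest)] / row_total if row_total else 0.0)
            for dest in types
        }
    return matrix
```

Calling `transition_matrix([("Defector", "Cooperator"), ("Defector", "Defector")])` would report that half of the (hypothetical) Block 1 Defectors became Cooperators and half remained Defectors.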
Result 2
Only 26.3 % of second-movers who are classified as Defectors or Imitators in Block 1 of 2S are classified as Defectors in Block 2, while 55.2 % of them are classified as Cooperators in Block 2.
The first-mover-type classification is slightly different. Kreps et al. (1982) allow for two possible types of first-movers: those who believe the probability of an irrational second-mover is high enough to justify cooperation in round 1 of a supergame, and those who do not. Therefore, we classify a first-mover as Trusting if her modal behavior is to cooperate in round 1 of the five supergames in a given block. Otherwise, a first-mover is classified as Non-Trusting. Based on these type classifications, we have the following hypothesis regarding how a first-mover’s classification will change in Block 2 when the second-mover’s history of play is revealed.
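The first-mover rule reduces to a majority vote over round-1 choices. A minimal sketch, with a hypothetical function name and input format:

```python
def classify_first_mover(round1_cooperated):
    """Classify a first-mover from her round-1 choices in one block.

    round1_cooperated: list of booleans, one per supergame (True = cooperated in
    round 1). With five supergames a tie cannot occur; for an even number of
    supergames this sketch assigns ties to Non-Trusting, which is an assumption.
    """
    n_coop = sum(round1_cooperated)
    n_defect = len(round1_cooperated) - n_coop
    return "Trusting" if n_coop > n_defect else "Non-Trusting"
```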
Hypothesis 3
Compared to Block 2 of 1S, first-movers are less likely to be classified as Trusting in Block 2 of 2S.
Table 4 reports type classifications for first-movers by treatment and block in the form of a Markov transition matrix. In 1S, 47.7 % of first-movers are Non-Trusting in Block 1, and about one-third (32.3 %) of these first-movers become Trusting in Block 2. However, in 2S, more than two-thirds (73.0 %) of the first-movers who are Non-Trusting in Block 1 (38.8 % of all first-movers in this treatment) transition to Trusting in Block 2. Because this result is not conditional on the revealed type of second-mover opponents, it would not contradict the theoretical predictions if no second-movers were exposed as rational. Some were, however, so the theory predicts an aggregate decrease in the number of Trusting first-movers. Instead, we find the opposite: first-movers become more Trusting when histories are revealed.
Result 3
A higher proportion of first-movers is Trusting in Block 2 of 2S compared to Block 2 of 1S (\(p=0.007\)), while there is no difference in the proportion of Trusting first-movers in Block 1 (\(p=0.386\)).Footnote 24 Additionally, the increase in the proportion of first-movers who are Trusting from Block 1 to Block 2 is larger in 2S: a 25.4 percentage point increase in 2S (\(p=0.032\)) versus a 12.3 percentage point increase in 1S (\(p=0.122\)).Footnote 25
We now study first-mover reactions to the second-mover’s revealed type. Because first-movers should only cooperate if there is some positive probability that the second-mover is irrational, exposing the second-mover as an Imitator or Defector in Block 1 should convince the first-mover that the second-mover is rational and destroy cooperation in Block 2 of 2S. This is summarized in the following hypothesis.
Hypothesis 4
In Block 2 of 2S, a first-mover whose opponent’s Block 1 history is of the Imitator or Defector type will play as a Non-Trusting type against this opponent.
We test Hypothesis 4 by splitting the Block 2 data by the second-mover’s Block 1 type and examining how first-movers respond to their opponents’ revealed histories. Note that in the following analysis, we study Block 2 data using type classifications based on Block 1 data.
First, we look at the case where the second-mover’s history indicates that she is a Cooperator type. Figure 6 shows average first-mover cooperation and second-mover conditional cooperation by round in Block 2 games for only those supergames where the second-mover’s Block 1 history classifies her as a Cooperator. The top two panels show the Block 2 cooperation rates in 1S and 2S, respectively, for supergames in which the first-mover is Non-Trusting. The bottom panels are for Block 2 supergames in which the first-mover is Trusting. Both Non-Trusting and Trusting first-movers cooperate more frequently and into later rounds after the second-mover is revealed to be a Cooperator compared to when no histories are revealed to first-movers.
Figure 7 presents a similar view of the data for Block 2 supergames where the second-mover’s Block 1 history classifies her as an Imitator. First-movers with Trusting Block 1 histories have very similar cooperation rates in each corresponding round across treatments. For Non-Trusting first-movers, however, Block 2 cooperation is much higher in 2S than in 1S. Revealing information that indicates that second-movers are willing to cooperate increases first-mover cooperation even if that information reveals that the second-mover is rational.
Figure 8 is similar to Figs. 6 and 7, except that the data are from Block 2 supergames where the second-mover’s Block 1 history classifies her as a Defector. Initial cooperation by Trusting first-movers is substantially lower in 2S than in 1S when the second-mover is revealed to be a Defector. However, Non-Trusting first-movers cooperate slightly more when the second-mover is revealed to be a Defector. These findings are consistent with the notion that providing the second-mover’s history causes first-movers to update their beliefs regarding the degree of cooperation that they can expect from a second-mover. Gächter and Thöni (2005) find a similar result in their public goods game experiment in which players’ behavior from the first game is revealed to other players in the subsequent series of games (and players were not told this would happen before the first game). They find that when players whose contributions were low in the first game are grouped with other low contributors in subsequent games, they contribute more than when grouped randomly. As in our experiment, this behavior may be due to a failure of backwards induction, combined with updating of pessimistic beliefs that these low contributors hold in the first game, an interpretation consistent with the model of naive beliefs that we develop in Sect. 6.
5.3 Rate of first defection
We use logit regressions to test how first-mover cooperation in Block 2 supergames depends on the type of second-mover. Robust standard errors are corrected to account for within-subject and within-session correlations between observations, i.e., the standard errors are clustered by individual first-mover and session. Specifications (i) and (ii) of Table 5 show the impact of revealed player histories on first-mover initial cooperation. Specification (i) explores the impact of revealing a player’s history of play and indicates that first-movers are more trusting when second-movers are observed to have cooperated in Block 1 supergames. Specification (ii) examines the impact that a second-mover’s type has on the likelihood that a first-mover will be Trusting. In this specification, only second-mover types from 2S are identified by first-movers while second-mover types from 1S are unobserved and represent the omitted second-mover type in the regression.Footnote 26 The estimates show that both Trusting and Non-Trusting first-movers are more likely to trust in Block 2 when the second-mover is revealed to be a Cooperator or an Imitator, compared to when no history is revealed. However, when facing a Defector, Non-Trusting first-movers are no more likely to trust than when no history is revealed. Regardless of the second-mover’s type, Trusting first-movers are more likely than Non-Trusting first-movers to trust in Block 2 when the second-mover’s history of play is revealed. In this case, even though a second-mover’s modal type is Defector, the first-mover may be responding to the fact that the second-mover likely cooperated in some supergames.
Specifications (iii)–(v) of Table 5 explore the impact of revealing player types on the number of periods of joint cooperation before first defection in Block 2 supergames. As the number of rounds of initial cooperation is censored between 0 and 10, the results are estimated using Tobit regressions. Furthermore, in contrast to specifications (i) and (ii), second-movers are classified based on their Block 1 history of play in both 1S and 2S. Although the first-mover cannot classify a second-mover in 1S, this classification is useful as an explanatory variable because a second-mover’s type will impact the overall degree of cooperation and will help us to assess which player characteristics generated the cooperation rates.
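For readers less familiar with the estimator, the log-likelihood of a two-limit Tobit with censoring points 0 and 10 can be sketched as follows. This is a generic textbook formulation in plain Python, not the estimation code behind Table 5; the function and argument names are hypothetical, and the clustering of standard errors is not shown.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_logpdf(z):
    """Log of the standard normal density."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)

def tobit_loglik(beta, sigma, X, y, lower=0.0, upper=10.0):
    """Log-likelihood of a two-limit Tobit model.

    Latent outcome: y* = x'beta + e with e ~ N(0, sigma^2). The observed outcome
    equals `lower` when y* <= lower, `upper` when y* >= upper, and y* otherwise.
    X is a list of regressor rows and y a list of observed outcomes.
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        mean = sum(b * x for b, x in zip(beta, x_i))
        if y_i <= lower:
            # Censored from below: probability mass at the lower limit.
            total += math.log(norm_cdf((lower - mean) / sigma))
        elif y_i >= upper:
            # Censored from above: probability mass at the upper limit.
            total += math.log(1.0 - norm_cdf((upper - mean) / sigma))
        else:
            # Uncensored observation: ordinary normal density.
            total += norm_logpdf((y_i - mean) / sigma) - math.log(sigma)
    return total
```

Maximizing this function over `beta` and `sigma` with any numerical optimizer yields the Tobit estimates; the two censored branches are what distinguish it from ordinary least squares when many supergames end with zero or ten rounds of cooperation.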
In Sect. 5.1, we analyzed aggregate behavior and showed that there is more cooperation in Block 2 than Block 1 with even higher levels of cooperation in 2S despite first-movers observing the histories of second-movers. Though average cooperation is informative, it can be biased by the player types. If, for example, one treatment included more cooperative types, then this could make it appear that average cooperation increased more for that treatment. Specifications (iii)–(v) are designed to explore how first- and second-mover types relate to the duration of cooperation. Specification (iii) includes all of the data while specification (iv) includes only Trusting first-movers and specification (v) includes only Non-Trusting first-movers to help clarify the impact of the first-movers’ types. The estimate for Cooperator in specification (iii) reveals that playing with a more cooperative second-mover in either 1S or 2S causes a first-mover to cooperate for longer within a supergame, especially when the first-mover is Non-Trusting.Footnote 27 Interestingly, Non-Trusting first-movers are also more likely to cooperate longer with Defector and Imitator second-movers. Given the results in (ii), part of this result is likely driven by Non-Trusting first-movers’ trusting second-movers more after observing that these second-movers did not initially defect. Trusting first-movers, however, do not cooperate as long in 2S when facing a Defector and, to a lesser extent, an Imitator-type second-mover. As these first-movers have shown that they are more willing to trust, even when the second-mover’s history of play is not known, this result can likely be explained by first-movers examining the number of rounds that second-movers cooperated and trying to preempt their defection without performing full backward induction.Footnote 28
Specifications (iv) and (v) help to clarify the impact of the first-mover’s type by restricting the sample to either Trusting or Non-Trusting first-movers. Reinforcing the results in (iii), the estimates for specification (iv) indicate that first-movers update the amount that they are willing to cooperate based on the amount second-movers cooperated in Block 1. Recall that Defector is the omitted category; thus, when Trusting first-movers encounter a Defector or Imitator in Block 2, they cooperate for fewer rounds than in 1S, but they also cooperate more with either an Imitator or a Cooperator than with a Defector in 2S. The results of specification (v) reveal that Non-Trusting first-movers, who were more pessimistic in their Block 1 beliefs (i.e., have a high prior that their opponent is not a tit-for-tat type), cooperate more in Block 2 even when facing a Defector who has revealed that she is not a tit-for-tat type, suggesting that first-movers update their beliefs about how long these second-movers will cooperate based on the history of play. While the amount of additional cooperation for a Non-Trusting first-mover does not depend on second-mover type, revealing the second-mover’s history of play is sufficient to increase average cooperation with a Non-Trusting first-mover.
In summary, both Trusting and Non-Trusting first-movers are more trusting when they can view the second-mover’s history of play; Non-Trusting first-movers cooperate longer in 2S than in 1S regardless of the second-mover type; and Trusting first-movers cooperate for fewer periods with Defectors and to a lesser extent with Imitators. These results indicate that cooperation in the finitely repeated prisoners’ dilemma continues even after the opportunity for reputation-building by second-movers is destroyed. We therefore reject Hypothesis 4.
Result 4
Compared to a first-mover whose opponent’s history is not revealed, (a) a first-mover whose opponent’s Block 1 history is revealed as Imitator type is more likely to cooperate in round 1 of a Block 2 supergame, and (b) Non-Trusting first-movers cooperate longer when second-mover histories are revealed, while Trusting first-movers do not cooperate for as long when second-movers are revealed to be Defectors or Imitators.
5.4 Beliefs
In some later sessions, we added a belief-elicitation stage at the end of the experiment. Subjects were presented with a sequence of five randomly selected Block 1 histories, each consisting of five supergames for both first- and second-movers, drawn from previous sessions of the same treatment type. That is, a subject who participated in 1S saw only the first-mover’s history of play, while a subject who participated in 2S saw both histories of play, just as in her Block 2 games. Since the elicited beliefs were based on the same information that subjects had in their Block 2 supergames, they should be similar to the beliefs subjects held when playing the game. For each set of histories, subjects were then asked to state how many rounds they believed these past participants would cooperate before the first defection when matched with one another. For one randomly selected belief-elicitation question, each subject was paid $5 if her stated belief was exactly correct.Footnote 29
Regression results are shown in Table 8 in the “Appendix,” summarizing how elicited beliefs responded to observed cooperation rates in displayed histories as well as the role of the player whose beliefs are elicited. While we find some evidence that elicited beliefs respond to observed cooperation in the expected directions, statistical significance is weak due to small sample size. Furthermore, we find that first- and second-movers report systematically different beliefs, suggesting that elicited beliefs may be biased by the subjects’ own experience in the experiment.
6 A model of naïve beliefs
In contrast to the predictions of the reputation-building theory of Kreps et al. (1982), our experiment shows that there will be substantial cooperation in FRPDs even when players’ previous histories are revealed. Moreover, learning that an opponent imitated a tit-for-tat type frequently increases cooperation, rather than destroying it.
The observed increases in cooperation may be partially motivated by fairness or indirect reciprocity. For example, Ho and Su (2009) utilize an ultimatum game in an experiment and find that roughly half of their subjects value how their offer compares to offers other followers have received. A similar notion of fairness could plausibly affect cooperation in other games, such as the FRPDs played here, when subjects can view how their current opponents treated previous opponents; however, fairness would seem to be a less salient concern in a prisoner’s dilemma setting, where payoffs are determined more symmetrically (i.e., both players have a hand in the outcome), than in an ultimatum game setting, where control over payoffs is heavily unbalanced. Moreover, a concern for fairness would not explain why Non-Trusting first-movers become more Trusting when they observe that their opponent is a Defector. In community games with public reputations, indirect reciprocity has been shown to induce cooperation (see Nowak and Sigmund 2005, for a recent survey), but these games typically feature frequent rematching among small groups of subjects, who play simple, one-shot stage games with unilaterally determined payoffs, making reputation in future matches a dominant concern. In contrast, the matching protocol we employ minimizes the opportunity for indirect reciprocity by generally matching each subject no more than once with each other subject. In addition, the payoff gradient of strategic interactions in an FRPD supergame is large enough that the direct payoff consequences of behavior in a given supergame should reduce the extent to which behavior is motivated by potential payoffs in future supergames.
Though we cannot rule out that cooperation in our experiment may be partially motivated by fairness or indirect reciprocity, we have reason to doubt that they are the primary drivers of the cooperation we observe. For example, our results are consistent with the findings of Kagel and McGee (2014)’s experiment on FRPDs played by teams. They observe an increase in cooperation over time because teams anticipate that their opponent will defect in later rounds. Kagel and McGee find no evidence in team chat logs, however, that teams anticipate opponents’ beliefs and attempt to mimic an irrational player as the Kreps et al. (1982) model predicts. Nor do they find evidence of reciprocity or concerns for fairness. Instead, they find evidence that teams anticipate when their opponents will defect (i.e., their opponents’ actions, not their beliefs) and attempt to defect one round earlier, which they interpret as a failure of common knowledge of rationality. These results suggest that boundedly rational behavior is a primary driver of cooperation in FRPDs. We now propose a simple model of boundedly rational behavior, in which we relax the requirement that players’ prior beliefs be consistent with an opponent’s best response.Footnote 30
As in Kreps et al. (1982), players in our model decide in which round to stop playing tit for tat and begin unconditionally defecting. This is decided by weighing the long-term benefit of cooperation against the risk of the other player defecting first.Footnote 31 The difference between this approach and that of Kreps et al. (1982) is that players do not engage in higher-level reflection about the beliefs of their opponent. Instead, players form “naïve” beliefs which may not be consistent with their opponent’s best response. This is consistent with Kagel and McGee (2014), who provide evidence that players simply try to defect one round before their opponent’s anticipated first defection without reflecting on the beliefs of their opponents. Given arbitrary initial beliefs about how many rounds the opponent will continue playing tit for tat, players update beliefs within the game based on their opponents’ choices using Bayes’ rule and choose the optimal round to stop playing tit for tat and begin unconditionally defecting.Footnote 32
This simple model generates predictions that are consistent with our experimental results. First, the common finding that cooperation in the FRPD does not break down until one or two rounds before the last is consistent with the model if players have uniform or even more pessimistic beliefs.Footnote 33 Second, the first-mover who plays cooperatively in Block 1 may defect earlier in Block 2 supergames after the second-mover’s history of play from Block 1 exposes him as generally uncooperative. Third, there will be as much, if not more, cooperation in Block 2 after a second-mover’s history of play from Block 1 has exposed him as rational. The first two predictions are also consistent with the reputation-building theory of Kreps et al. (1982), but the third is not. These results demonstrate that while the behavior we observe is inconsistent with Kreps et al. (1982), it is rational in a reasonable non-equilibrium sense. While this is not the only model that could explain our experimental data, the recent evidence from Kagel and McGee (2014) indicates that this interpretation is worthy of a more careful exploration.
For a formal description of the model, let rounds be counted backwards. Play begins in round 10 and ends after round 1. We assume that all players adopt a strategy from \(S = \{s_{11},s_{10},\ldots ,s_{1}\}\). A player adopting strategy \(s_{k}\) plays tit for tat in rounds 10 through \(k\) and defects in rounds \(k-1\) through 1. Strategy \(s_{11}\) is defined as defecting in every round. In addition to the Kagel and McGee (2014) chat evidence, Embrey et al. (2014) find that subjects learn with experience to play exactly these strategies, so we believe that restricting the strategy space in this way has empirical foundations.
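To make the strategy set concrete, the sketch below plays a first-mover strategy \(s_{i}\) against a second-mover strategy \(s_{j}\). It is an illustration under stated assumptions rather than part of the model's formal statement: the stage payoffs (7 each for mutual cooperation, 4 each for mutual defection, 0 and 12 when the first-mover cooperates and the second-mover defects) are the ones that appear in the discussion of Proposition 1, the first-mover's tit for tat is taken to copy the second-mover's previous action, and the second-mover's tit for tat is taken to copy the first-mover's action in the current round. The function name is hypothetical.

```python
# Stage payoffs as (first-mover, second-mover). The profile in which the
# first-mover defects and the second-mover cooperates never arises when both
# players use strategies from S, so it is omitted.
PAYOFFS = {
    ("C", "C"): (7, 7),
    ("C", "D"): (0, 12),
    ("D", "D"): (4, 4),
}

def play_supergame(i, j, rounds=10):
    """Play first-mover strategy s_i against second-mover strategy s_j.

    Rounds are counted backwards (10, 9, ..., 1). Strategy s_k plays tit for tat
    in rounds 10 through k and defects afterwards; s_11 defects in every round.
    Returns the path of play and each player's total payoff.
    """
    path = []
    total_first = total_second = 0
    prev_second = "C"  # tit for tat opens by cooperating
    for r in range(rounds, 0, -1):
        first = prev_second if r >= i else "D"
        second = first if r >= j else "D"
        pay_first, pay_second = PAYOFFS[(first, second)]
        total_first += pay_first
        total_second += pay_second
        path.append((r, first, second))
        prev_second = second
    return path, total_first, total_second
```

For example, `play_supergame(2, 3)` produces eight rounds of mutual cooperation, one round in which the first-mover cooperates and the second-mover defects, and a final round of mutual defection, for totals of 60 and 72.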
Players have prior beliefs \(\mu \) over their opponent’s strategies in \(S\). Thus, \(\mu (s_{k})\) is the probability of playing against an opponent using strategy \(s_{k}\). Though \(s_1\) is dominated for a payoff-maximizing agent, we find a non-negligible amount of last-round cooperation in our data. Thus, we assume beliefs \(\mu \) are such that \(\mu (s_{k}) \in (0,1)\) for all \(k\), including \(k=1\).Footnote 34
Players’ beliefs are updated within each supergame round by round according to Bayes’ rule based on the prior \(\mu \) and the opponents’ history of actions in that supergame. If her opponent has cooperated up to and including round \(t+1\), a player believes that her opponent will continue playing tit for tat for at least one more round with probability \(p_{t} = \sum _{i=1}^{t}\mu (s_{i})\big /\sum _{i=1}^{t+1}\mu (s_{i})\). In Proposition 1, we characterize the beliefs for which a naïve player chooses strategy \(s_{k}\) in terms of the conditional probability \(p_{t}\) that the opponent will continue to play tit for tat in round \(t\), given that he has played tit for tat for all previous rounds, for all \(t\) up to round \(k\).
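The conditional probabilities \(p_{t}\) follow mechanically from the prior. A minimal sketch using exact fractions, with a hypothetical function name and a dictionary representation of \(\mu \) chosen for illustration:

```python
from fractions import Fraction

def continuation_probs(mu):
    """Compute p_t for t = 1..10 from a prior over strategies s_1..s_11.

    mu: dict mapping k (1..11) to the prior probability of s_k.
    p_t is the probability that the opponent keeps playing tit for tat in
    round t, given tit for tat up to and including round t + 1:
        p_t = sum_{i<=t} mu(s_i) / sum_{i<=t+1} mu(s_i).
    """
    probs = {}
    cumulative = Fraction(0)
    for t in range(1, 11):
        cumulative += mu[t]
        probs[t] = cumulative / (cumulative + mu[t + 1])
    return probs

# Two illustrative priors: uniform, and one with weight proportional to k.
uniform = {k: Fraction(1, 11) for k in range(1, 12)}
triangular = {k: Fraction(k, 66) for k in range(1, 12)}
```

Under the uniform prior this gives \(p_{t} = t/(t+1)\), and under the prior with weights \(k/66\) it gives \(p_{t} = t/(t+2)\), so in both cases the opponent is believed more likely to continue in early rounds (high \(t\)) than in late ones.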
Proposition 1
(a) The first-mover plays \(s_{k}\) if and only if
$$\begin{aligned} p_{l} \ge \frac{4}{\sum _{i=k+1}^{l} \bigl (3 \prod \limits _{j=i}^{l-1} p_{j}\bigr ) + 7 \prod \limits _{i=k}^{l-1} p_{i}} \end{aligned}$$
for every \(l\in \{k,\ldots ,10\}\).
(b) The second-mover plays \(s_{k+1}\) if and only if
$$\begin{aligned} p_{l} \ge \max \left\{ \frac{1}{3},\frac{5}{\sum _{i=k+1}^{l} \left( 3 \prod \limits _{j=i}^{l-1} p_{j}\right) + 8 \prod \limits _{i=k}^{l-1} p_{i}}\right\} \end{aligned}$$
for all \(l\in \{k,\ldots ,10\}\).
Proposition 1 provides lower bounds on the beliefs needed to sustain a particular strategy, \(s_{k}\). For each round \(l \ge k\), the subjective probability that the second-mover will play tit for tat until round \(l\) must be high enough that the expected payoff of cooperating in round \(l\) exceeds the payoff that can be obtained by defecting in round \(l\). To build intuition, consider the condition for round \(l = k+1\) given strategy \(s_{k}\). First, notice that the first-mover can always defect in rounds \(k+1\) and \(k\) and obtain a total payoff of 8 from these two rounds (as a rational second-mover would respond by defecting, earning each player a payoff of 4 in each round). By playing tit for tat through round \(k\) instead, the first-mover faces three possible outcomes assuming that the second-mover has also chosen a strategy in \(S\). She may earn payoffs of 0 in round \(k+1\) and 4 in round \(k\) (if the second-mover defects in rounds \(k+1\) and \(k\)), payoffs of 7 in round \(k+1\) and 0 in round \(k\) (if the second-mover cooperates in round \(k+1\) and defects in round \(k\)), or payoffs of 7 in both rounds \(k+1\) and \(k\) (if the second-mover cooperates in both rounds). Thus, the first-mover is guaranteed a payoff of at least 4 from these two rounds. She can gain an additional 4 with certainty by defecting in rounds \(k+1\) and \(k\), but expects that she can gain either an additional 3 or 10 with some probability by playing tit for tat in rounds \(k+1\) and \(k\). If the subjective probability of these cooperation payoffs is sufficiently high, then it is rational for the first-mover to play tit for tat in round \(k+1\).
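The two-round comparison can also be written out explicitly. As a sketch, let \(p_{k+1}\) and \(p_{k}\) be the first-mover's subjective probabilities that the second-mover plays tit for tat in rounds \(k+1\) and \(k\), respectively. This is one reading of the argument above, not a restatement of the formal proof. Playing tit for tat in both rounds is then at least as good as defecting in both when

$$\begin{aligned} (1-p_{k+1})\,(0+4) + p_{k+1}\bigl [(1-p_{k})\,(7+0) + p_{k}\,(7+7)\bigr ]&\ge 8,\\ 4 + p_{k+1}\,(3 + 7p_{k})&\ge 8,\\ p_{k+1}&\ge \frac{4}{3+7p_{k}}, \end{aligned}$$

which is the bound in Proposition 1(a) evaluated at \(l = k+1\). The terms 3 and 7 in the second line correspond to the additional payoffs of 3 and 10 described in the text.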
The second-mover’s strategy is governed by similar belief conditions to the first-mover’s. Consider the second-mover’s condition for round \(l = k+1\) given strategy \(s_{k}\) (assuming that the first-mover cooperates in round \(k+1\)). The second-mover can always defect in rounds \(k+1\), \(k\), and \(k-1\) to obtain a total payoff of 20 from these three rounds (12 in round \(k+1\) and 4 in each of the following two rounds). By playing tit for tat through round \(k\) instead, the second-mover faces three possible outcomes assuming that the first-mover has also chosen a strategy in \(S\). She may earn payoffs of 7 in round \(k+1\) and 4 in rounds \(k\) and \(k-1\) (if the first-mover defects in rounds \(k\) and \(k-1\)), payoffs of 7 in rounds \(k+1\) and \(k\), and 4 in round \(k-1\) (if the first-mover cooperates in round \(k\) and defects in round \(k-1\)), or payoffs of 7 in rounds \(k+1\) and \(k\), and 12 in round \(k-1\) (if the first-mover cooperates in rounds \(k\) and \(k-1\)). Thus, the second-mover is guaranteed a payoff of at least 15 from these three rounds. She can gain an additional 5 with certainty by defecting in rounds \(k+1\), \(k\), and \(k-1\), but expects that she can gain either an additional 3 or 11 with some probability by playing tit for tat in rounds \(k+1\) and \(k\). If the subjective probability of these cooperation payoffs is sufficiently high, then it is rational for the second-mover to play tit for tat in round \(k+1\). However, if the subjective probability that the first-mover plays tit for tat in any round is less than \(\frac{1}{3}\), it is optimal for the second-mover to defect before that round due to the incentive to defect before the first-mover (and earn a payoff of 12 instead of 4 for one round).
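The same accounting can be written out for the second-mover. As a sketch, let \(q\) denote her subjective probability that the first-mover cooperates in round \(k\) after mutual cooperation in round \(k+1\), and let \(q'\) denote the probability that the first-mover cooperates again in round \(k-1\) after mutual cooperation in round \(k\). Both symbols are introduced here for illustration only. Cooperating in rounds \(k+1\) and \(k\) and defecting in round \(k-1\) is then at least as good as defecting immediately when

$$\begin{aligned} 7 + (1-q)\,(4+4) + q\bigl [7 + (1-q')\,4 + q'\,12\bigr ]&\ge 20,\\ 15 + q\,(3 + 8q')&\ge 20,\\ q&\ge \frac{5}{3+8q'}, \end{aligned}$$

which has the same form as the second term inside the maximum in Proposition 1(b). The terms 3 and 8 in the second line correspond to the additional payoffs of 3 and 11 described in the text.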
The conditions in Proposition 1 are permissive enough that cooperation is sustained into later rounds with a large variety of beliefs. The following examples demonstrate the range of beliefs that can support late-round cooperation.
Example 1
(Uniform prior) Assume that both players have prior beliefs such that \(\mu (s_{k}) = 1/11\) for all \(k\). Then, by Proposition 1, the first-mover’s optimal Block 1 strategy is \(s_{2}\) and the second-mover’s optimal Block 1 strategy is \(s_{3}\).Footnote 35 That is, the first-mover plays tit for tat until the last round, in which she always defects, while the second-mover plays tit for tat up to the next-to-last round and always defects in the last two rounds. Hence, cooperation until the penultimate round is observed.
Example 2
(Pessimistic triangular prior) Assume that both players have prior beliefs \(\mu (s_{k}) = k\big /66\) for all \(k\). Then, by Proposition 1, the first-mover’s optimal Block 1 strategy is \(s_{3}\) and the second-mover’s optimal Block 1 strategy is \(s_{5}\), that is, the first-mover plays tit for tat for eight rounds and then defects, while the second-mover plays tit for tat for six rounds and then defects thereafter. These relatively pessimistic beliefs still support cooperation for more than half of the supergame.
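Both examples can be checked numerically against the conditions in Proposition 1. The sketch below is illustrative and rests on one interpretive assumption: among the values of \(k\) that satisfy the conditions for every \(l \in \{k,\ldots ,10\}\), the optimal strategy is taken to be the one with the smallest \(k\) (the latest stopping point), with \(s_{11}\) as the fallback when no \(k\) qualifies. All function names are hypothetical.

```python
from fractions import Fraction
from math import prod

def continuation_probs(mu):
    """p_t = sum_{i<=t} mu(s_i) / sum_{i<=t+1} mu(s_i) for t = 1..10."""
    probs, cumulative = {}, Fraction(0)
    for t in range(1, 11):
        cumulative += mu[t]
        probs[t] = cumulative / (cumulative + mu[t + 1])
    return probs

def condition_holds(p, k, l, numerator, last_coeff, floor):
    """Check the Proposition 1 inequality for a given k and l."""
    denom = sum(3 * prod(p[j] for j in range(i, l)) for i in range(k + 1, l + 1))
    denom += last_coeff * prod(p[i] for i in range(k, l))
    return p[l] >= max(floor, Fraction(numerator) / denom)

def smallest_k(p, numerator, last_coeff, floor):
    """Smallest k whose conditions hold for every l in {k, ..., 10}, else None."""
    for k in range(1, 11):
        if all(condition_holds(p, k, l, numerator, last_coeff, floor)
               for l in range(k, 11)):
            return k
    return None

def first_mover_strategy(mu):
    """Index k of the first-mover's strategy s_k under part (a)."""
    k = smallest_k(continuation_probs(mu), 4, 7, Fraction(0))
    return 11 if k is None else k

def second_mover_strategy(mu):
    """Index of the second-mover's strategy s_{k+1} under part (b)."""
    k = smallest_k(continuation_probs(mu), 5, 8, Fraction(1, 3))
    return 11 if k is None else k + 1

uniform = {k: Fraction(1, 11) for k in range(1, 12)}
triangular = {k: Fraction(k, 66) for k in range(1, 12)}
```

Under this reading, the uniform prior yields \(s_{2}\) for the first-mover and \(s_{3}\) for the second-mover, and the triangular prior yields \(s_{3}\) and \(s_{5}\), matching Examples 1 and 2.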
Now, consider how players update their beliefs based on revealed Block 1 histories. We assume that the behavior of a player’s own past opponents in either Block 1 or Block 2 does not affect her beliefs because players know that they will not face previous opponents again. In Block 2, however, players incorporate their opponents’ histories into their beliefs when available. Let \(\tilde{\mu }\) represent the updated beliefs based on the prior \(\mu \) and the observed Block 1 history. Because players are assumed to adopt a pure strategy from \(S\), beliefs are updated such that if a player’s opponent never defected before her opponent in rounds \(10,\ldots ,n\), then \(\sum _{i=n-1}^{11}\tilde{\mu }(s_{i}) = 0\) holds. If the opponent always defected before her opponent by round \(m\), then \(\sum _{i=1}^{m}\tilde{\mu }(s_{i}) = 0\) holds. If the opponent’s opponent always defected first, then \(\tilde{\mu }(s_{i}) = \mu (s_{i})\) holds for all \(i\).
If players focus on how long they can expect their opponent to cooperate, then first-movers could have a fairly optimistic prior (i.e., first-movers may be Trusting in Block 1) and then, upon learning that their opponent was a Defector in Block 1, become more pessimistic and cooperate less in Block 2. This argument is formalized in the following proposition.
Proposition 2
Suppose that the second-mover always defected before her opponent by round \(n\) of Block 1 supergames. If the naïve first-mover’s prior beliefs satisfy \(\mu (s_{k+1}) \le (3\big /4) \sum _{i=1}^{k} \mu (s_{i})\) for all \(k \in \{m,\ldots ,10\}\), where \(m \le n\), then her Block 1 strategy is \(s_{m}\) and her Block 2 strategy is \(s_{m+t}\) for some \(t\ge 1\).
Proposition 2 applies to the set of first-mover prior beliefs such that, for a certain number of rounds beginning with the first, the probability that the second-mover will begin unconditionally defecting in each of these rounds is not more than three-fourths the probability that the second-mover will play tit for tat in that round. This set includes Examples 1 and 2 above and infinitely many others. Given such beliefs, a first-mover who is classified as Trusting in Block 1, playing strategy \(s_{i}\), will respond to a second-mover’s Defector history (\(s_{j}, j \ge i\)) by choosing a strategy \(s_{i+t}, t \ge 1\) in Block 2. In other words, if the first-mover had been playing tit for tat up to round \(i\) in Block 1, she will defect at least one round earlier in Block 2 games against a second-mover whose game history shows that he always defected in round \(i\) or earlier in Block 1. Again, this prediction is consistent with our data, which show a significant decrease in cooperation by Trusting first-movers whose opponents are revealed as Defectors compared to those whose opponents’ types are not revealed. Proposition 2 also predicts earlier defections in a given supergame by Trusting first-movers when the second-mover is revealed to be a Defector than when no information is revealed, as observed in the data.
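The prior condition in Proposition 2 is easy to check for a given belief. Dividing through shows that \(\mu (s_{k+1}) \le (3/4) \sum _{i=1}^{k} \mu (s_{i})\) is equivalent to \(p_{k} \ge 4/7\), so the smallest \(m\) for which the condition holds on \(\{m,\ldots ,10\}\) should coincide with the first-mover strategies found in Examples 1 and 2. A minimal sketch, with a hypothetical function name:

```python
from fractions import Fraction

def smallest_m(mu):
    """Smallest m with mu[k+1] <= (3/4) * sum_{i<=k} mu[i] for all k in m..10.

    mu: dict mapping k (1..11) to the prior probability of strategy s_k.
    Returns None if the condition fails at k = 10.
    """
    def condition(k):
        return mu[k + 1] <= Fraction(3, 4) * sum(mu[i] for i in range(1, k + 1))

    for m in range(1, 11):
        if all(condition(k) for k in range(m, 11)):
            return m
    return None

uniform = {k: Fraction(1, 11) for k in range(1, 12)}
triangular = {k: Fraction(k, 66) for k in range(1, 12)}
```

Here `smallest_m(uniform)` is 2 and `smallest_m(triangular)` is 3, in line with the Block 1 strategies \(s_{2}\) and \(s_{3}\) reported in the examples.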
In contrast, a first-mover may become more optimistic about how long he can expect to cooperate when his opponent is revealed to be an Imitator through her Block 1 history. In this way, a first-mover will choose to cooperate longer upon seeing that his opponent was cooperative in Block 1. The following proposition formalizes this argument.
Proposition 3
Suppose that the second-mover never defected before her opponent in rounds \(10,\ldots ,m\) of Block 1 supergames.
(a) If the naïve first-mover’s prior beliefs satisfy \(\mu (s_{k+1}) \le (3\big /4) \sum _{i=1}^{k} \mu (s_{i})\) for all \(k \in \{n,\ldots ,10\}\), where \(n > m\), then her Block 1 strategy is \(s_{n}\) and her Block 2 strategy is \(s_{n-t}\) for some \(t\ge 1\).
(b) If the naïve first-mover’s prior beliefs satisfy \(\mu (s_{11}) > 3\big /7\), then her Block 1 strategy is \(s_{11}\) and her Block 2 strategy is \(s_{11-t}\) for some \(t\ge 1\).
Proposition 3 applies to the same set of prior beliefs as Proposition 2 as well as beliefs under which the first-mover defects in every round of Block 1 supergames. Given these beliefs, a first-mover plays strategy \(s_{11}\), is classified as Non-Trusting in Block 1, and responds to a second-mover’s Imitator-type history (\(s_{i}, i \le 10\)) by choosing a strategy \(s_{j}, j \le 10\) in Block 2. The simple intuition for this result is that if the first-mover had been playing tit for tat up to round \(i\) in Block 1, she will continue to play tit for tat at least one round later in Block 2 games against a second-mover whose game history shows that he always played tit for tat beyond round \(i\) in Block 1. This prediction is consistent with our data, which shows a significant increase in initial cooperation by Non-Trusting first-movers when their opponents are revealed as Imitators compared to those whose opponents’ types are not revealed, a finding that is clearly inconsistent with the predictions of the Kreps et al. (1982) model. This model may also provide a more plausible explanation than Kreps et al. (1982) for the finding of Gachter and Thoni (2005)’s public goods experiment, where low contributors contribute more when grouped with other players revealed to have made low contributions in the past.
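The belief conditions in Propositions 2 and 3 are simple enough to check mechanically. The sketch below is a minimal illustration, not part of the original analysis. It assumes a prior stored as a dictionary `mu` that maps the index \(k\) of strategy \(s_{k}\) (1 through 11) to its probability, and all function names are hypothetical. It tests the inequality \(\mu (s_{k+1}) \le (3\big /4) \sum _{i=1}^{k} \mu (s_{i})\) for each \(k\), reports the smallest \(n\) for which it holds on \(\{n,\ldots ,10\}\), and tests the part (b) condition \(\mu (s_{11}) > 3\big /7\).

```python
from fractions import Fraction


def condition_holds(mu, k):
    """True if mu(s_{k+1}) <= (3/4) * sum_{i=1}^{k} mu(s_i)."""
    cumulative = sum(mu[i] for i in range(1, k + 1))
    return mu[k + 1] <= Fraction(3, 4) * cumulative


def smallest_n(mu, last_round=10):
    """Smallest n such that the condition holds for every k in {n, ..., last_round}.

    Returns None if the condition already fails at k = last_round.
    """
    n = None
    for k in range(last_round, 0, -1):
        if not condition_holds(mu, k):
            break
        n = k
    return n


def part_b_applies(mu):
    """True if mu(s_11) > 3/7, the condition in part (b) of Proposition 3."""
    return mu[11] > Fraction(3, 7)


# Illustrative priors (assumptions chosen for this example only).
uniform = {k: Fraction(1, 11) for k in range(1, 12)}
pessimistic = {k: Fraction(1, 20) for k in range(1, 11)}
pessimistic[11] = Fraction(1, 2)

print(smallest_n(uniform), part_b_applies(uniform))
print(smallest_n(pessimistic), part_b_applies(pessimistic))
```

Exact fractions are used so that boundary cases of the inequalities are not blurred by floating-point rounding. The function only evaluates the stated sufficient conditions; it does not by itself derive the optimal strategy.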
7 Conclusion
We have shown that cooperation in FRPDs occurs even when the reputation-building theory of Kreps et al. (1982) predicts complete unraveling. The results of the experiment indicate that first-movers change their strategy when they observe their opponent’s history of play, either increasing or decreasing their degree of cooperation based on the relative cooperativeness of their opponent. First-movers tend to cooperate at least as often initially and continue cooperating at least as long when second-mover histories are revealed, except in the case of relatively trusting first-movers meeting relatively uncooperative second-movers. Second-movers also tend to behave more cooperatively when their histories are revealed. In particular, we find the surprising result that revealing histories improves cooperation even in the case of a relatively Non-Trusting first-mover meeting a relatively uncooperative second-mover. Thus, cooperation persists and often increases, even when revealed histories are relatively uncooperative. These results are clearly inconsistent with the reputation-building theory.
We show that an alternative behavioral model to Kreps et al. (1982) generates predictions that are consistent with the features of our experimental data. Players in this model form beliefs over the strategies of their opponents, which may not be consistent with the opponent’s best response, and then choose the optimal strategy based on those naïve beliefs. We do not view this as the ultimate model of behavior in FRPDs, but as a simple and reasonable one which generates predictions that fit the observed behavior better than prevailing equilibrium models. By using such a simple model, we avoid ad hoc assumptions about more specific behavioral types which could possibly fit behavior in this game more precisely. One limitation of this analysis is that beliefs are a critical part of our behavioral model, but we are able to observe beliefs only in a very limited way. Because our main hypotheses could be tested without elicited beliefs, and because eliciting beliefs before or during gameplay is complicated and may itself alter beliefs and behavior, we opted not to do so. Examining beliefs in more depth may be an interesting direction for future research.
Notes
During the first block, subjects know that there will be an optional second experiment but know nothing about its nature.
In contrast to the finitely repeated case, experimental evidence has shown that cooperation in the infinitely repeated prisoners’ dilemma aligns well with theoretical predictions. For example, Roth and Murnighan (1978) and Murnighan and Roth (1983) study behavior in indefinitely repeated prisoners’ dilemma experiments and find behavioral differences predicted by standard folk theorem equilibria. More recently, Dal Bó (2005) finds experimental evidence that greater cooperation occurs in an indefinitely repeated prisoner’s dilemma with the same expected length as a finitely repeated control and Dal Bó and Frechette (2011) find evidence that subgame perfection is a necessary (but not sufficient) condition in supporting cooperation in an indefinitely repeated prisoners’ dilemma.
Jung et al. (1994) analyze the sequential equilibrium of a chain-store game that shared some features with Camerer and Weigelt’s borrower–lender game and also find discrepancies with the theory that cannot be resolved with an appeal to homemade beliefs. Similarly, both Brandts and Figueras (2003) and Tingley and Walter (2011) find higher rates of cooperation than predicted by reputation-building in shorter finitely repeated games.
Reuben and Suetens ensure that cooperation is not a rational strategy by setting the probability that the game terminates below the threshold required for cooperation to constitute a subgame perfect equilibrium.
Selten and Stoecker use parameter estimates from the first 20 supergames to predict the outcomes of the last five supergames and find strong agreement between the predictions and actual outcomes.
Subjects in Selten and Stoecker’s experiment participated in 6-person matching groups and so could have learned their opponents’ types over 25 repetitions. Since subjects were told that they would play each opponent only once, however, they ruled this possibility out.
In an fMRI study of a 10-period trust game—which is similar to the FRPD game—King-Casas et al. (2005) find that second-movers’ brains eventually signal the intent to cooperate before the first-movers’ actions are revealed. They also become more accurate in predicting first-movers’ actions. This is consistent with the hypothesis that players build a model of their opponent over time, though the data are not informative about the content of that model. A related idea is explored by Kahn and Murnighan (1993), who conduct an experiment on FRPDs in which they explicitly induce uncertainty about opponents’ types by varying their pecuniary payoffs. They find that “weak” players (players for whom defection is not a dominant strategy in the stage game) are more cooperative than “strong” players (with typical prisoners’ dilemma payoffs), and that uncertainty about opponents’ payoffs increases cooperation for “weak” players.
Samuelson (1987) shows that cooperation can be sustained for at least some periods when the assumption that the number of periods is common knowledge is relaxed. Following this approach, Normann and Wallace (2012) experimentally compare repeated prisoners’ dilemma games with known, random, and ambiguous number of periods, finding no significant differences in cooperation. An experiment by Bruttel et al. (2012) studies an FRPD in which the number of periods is uncertain. They find that cooperation breaks down closer to the final round than in a baseline treatment with a commonly known finite horizon. They also find that many players cooperated after they were privately informed about the number of remaining periods. In the current study, the number of periods is publicly announced to all subjects to eliminate such uncertainty.
The repeated sequential-move game also has the advantage of tractability and yields a unique sequential equilibrium.
That is, similar to Kreps et al. (1982), the first-mover’s prior is common knowledge. This assumption could be relaxed. For example, one could model the first-mover’s prior as coming from a distribution of priors, in which case the second-mover’s optimal strategy will be a function of the first-mover’s expected prior. The distribution for this expectation may change as \(t\) increases too, but the intuition for the optimal strategies remains largely the same and revealing oneself as rational removes the uncertainty over the second-mover’s type.
Such beliefs are certainly justified, given that tit-for-tat play is often observed in experimental data; see Andreoni and Miller (1993), for example.
We will more simply refer to a player that has the objective of maximizing his own payoff as a rational player.
The second-mover cannot attempt to restore cooperation by playing C in response to D, for this would reveal his true rationality with certainty and result in defection in all subsequent periods.
An alternative model is one in which second-movers begin the game as tit-for-tat players but randomly “wake up” in some period and become rational as they realize the end of the game is approaching. This would be identical to the current model, except \(p_t\) would be decreasing rather than constant in early periods. Further details are available upon request.
Having subjects play multiple supergames in each block allows them to become familiar with the game and, more importantly, allows second-movers to reveal more information about their types in Block 1. In any single supergame, it is possible that an individual second-mover might face a very uncooperative first-mover, so that no information about the second-mover’s type would be revealed.
In a couple of sessions, fewer than 20 subjects participated. In these sessions, subjects played against a different subject from the other group until they had played against all of them once. In the remaining supergames of Block 2, subjects were matched randomly with one of the subjects they had already faced in Block 1 but not yet faced in Block 2.
Experiments have also been conducted on the (one-shot) sequential prisoners’ dilemma, and they generally show little difference from simultaneous-move setups. Bolle and Ockenfels (1990) found little difference in cooperation levels between simultaneous and sequential one-shot prisoners’ dilemma using the strategy method to elicit second-mover strategies. Brandts and Charness (2000) found no significant difference in cooperation between the sequential one-shot prisoner’s dilemma using the strategy method and direct response. Blanco et al. (2011) used the strategy method with role uncertainty in several information conditions and belief-elicitation treatments to show that correlation between strategies in different roles is driven partially (but not completely) by a consensus effect. Clark and Sefton (2001) examined sequential prisoners’ dilemma games with varying levels of temptation and overall stakes in both the United States and the United Kingdom. They found substantial cooperation levels in early rounds, which diminished by the tenth and final round. They also found that second-movers were much more likely to cooperate if the first-mover cooperated, but this tendency also decreased across rounds and with higher temptation levels. Higher overall stakes led to a slight increase in second-mover reciprocal cooperation in the United Kingdom, but a decrease in reciprocal cooperation in the United States.
For both first and second-movers in both treatments, round 10 cooperation/conditional cooperation is never significantly higher in Block 2 than in Block 1 (Wilcoxon signed-rank tests for data having within-group correlations (Larocque 2005) with the unit of observation being the subject-level average cooperation/conditional cooperation in round 10 across all supergames in a block).
The unit of observation in these tests is the average (conditional) cooperation for an individual first (second) mover across all periods of all Block 2 supergames. As we are testing whether there is higher cooperation in 2S than 1S, the latter \(p\) values are based on a one-sided test in which the null is \(\mu _{2S}=\mu _{1S}\) and the alternative is that \(\mu _{2S}>\mu _{1S}\). The two-sided \(p\) values are 0.126 and 0.141, respectively.
We perform a power calculation for the 1S test, assuming the effect size found in 2S and a 10 % significance level, and find that our power is 85 %. The standard threshold for acceptable power is 80 %.
In the “Appendix,” we present similar results in which players are classified only by their type from the last supergame they play.
The proportion of second-movers who are classified as Cooperators may be inflated relative to Imitators because first-movers defected first in 41.5 % of the Block 1 games with Cooperators. It is possible that these second-movers were using a reputation-building strategy that is not revealed because of the first-movers’ defection; however, classifying them as Cooperators is still useful because second-movers are not revealed to first-movers as rational.
\(P\) values are from a McNemar test for data having within-group correlations (Durkalski et al. 2003). We calculate the power of the 1S test to be 78 %, assuming the effect size observed in 2S.
As the type effects are relative to second-mover types in 1S, the estimates do not simply reflect a restart effect.
For Trusting first-movers, the difference in cooperation between 2S and 1S is \(2.125\), but only with \(p=0.259\).
This explanation is also consistent with the rates of defecting first shown in Table 6: in 2S, Trusting first-movers facing Imitator-type second-movers defect first 68.4 % of the time, compared to 42.0 % in 1S.
We decided not to elicit beliefs before or during gameplay because doing so is difficult and may itself affect beliefs and behavior. We thank an anonymous referee for the suggestion of eliciting beliefs about third parties.
This approach is similar to Radner (1986), in which players have arbitrary beliefs about the opponent’s trigger strategy choice in a simultaneous-move FRPD and choose a best-response trigger strategy given these beliefs.
Selten and Stoecker (1986) propose an alternative non-Bayesian model of learning from histories of play in FRPDs, which predicts a general pattern of behavior that is consistent with our data. In their model, a player defects one period earlier (or later) with some probability if her previous opponent defected earlier (or later) than she did, and she defects in the same period otherwise. This learning model does not include beliefs about other players nor does it assume optimizing behavior, but only an iterative Markov transition learning rule given a starting point and supergame outcome. Unlike our experiment, subjects in Selten and Stoecker (1986) are given no information about opponents’ histories of play in prior supergames, and it is not clear how strategies would be updated in their model when players see the current opponent’s history of play against others. In contrast to Selten and Stoecker, we model players as Bayesian optimizers in a framework that is general enough to accommodate the informational environment of our experiment as well as most other FRPD experiments.
Evidence of non-equilibrium behavior like this is abundant in the experimental literature on strategic sophistication. See Crawford et al. (2013) for a recent survey.
This observation implies that cooperation would be sustained at least as long for more optimistic beliefs.
We restrict \(\mu (s_{k}) \in (0,1)\) so that Bayes’ rule can always be used. Without changing the results of this section, we could instead assume that players update via Bayes’ rule whenever possible, that beliefs are free when a zero-probability event is observed and the opponent’s history does not eliminate any strategies, and that a strategy is assigned a new belief of zero when it can be eliminated based on the opponent’s history.
Calculation of the conditional probabilities \(p_{t}\) and the conditions in Proposition 1 for a uniform prior show that this is the case. Take the first-mover for example. Because \(p_{t}\) decreases in \(t\) for these beliefs, \(s_{k}\) is optimal if and only if \(p_{k} \ge 4\big /7\) holds because \(p_{k} \ge 4\big /7\) implies that \(p_{l} \ge 4\big /\left[ \sum _{i=k+1}^{l} \bigl (3 \prod _{j=i}^{l-1} p_{j}\bigr ) + 7 \prod _{i=k}^{l-1} p_{i}\right] \) holds for all \(l \ge k\). \(s_{2}\) is optimal because \(p_{k} \ge 4\big /7\) holds for all \(k \ge 2\) and \(p_{1} < 4\big /7\). The calculation is similar for the second-mover.
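The uniform-prior claim in the last note can be checked numerically. The sketch below rests on a reading of the model that is an assumption here rather than something stated in this form: rounds are counted backward as in the Appendix, strategy \(s_{k}\) plays tit for tat in rounds \(10,\ldots ,k\) and defects afterward, and \(p_{k}\) is the probability of tit for tat in round \(k\) conditional on tit for tat in all earlier rounds, so that \(p_{k} = \sum _{i \le k}\mu (s_{i}) \big / \sum _{i \le k+1}\mu (s_{i})\). The helper name is hypothetical.

```python
from fractions import Fraction


def conditional_tft_probs(mu, rounds=10):
    """Return {k: p_k} with p_k = S_k / S_{k+1}, where S_k = sum_{i<=k} mu(s_i).

    mu maps strategy indices 1..rounds+1 to strictly positive probabilities.
    """
    probs = {}
    for k in range(1, rounds + 1):
        s_k = sum(mu[i] for i in range(1, k + 1))
        s_k_plus_1 = s_k + mu[k + 1]
        probs[k] = s_k / s_k_plus_1
    return probs


uniform = {k: Fraction(1, 11) for k in range(1, 12)}
p = conditional_tft_probs(uniform)
threshold = Fraction(4, 7)

# Under the uniform prior p_k = k / (k + 1), so only p_1 falls below 4/7.
print(p[1] < threshold)
print(all(p[k] >= threshold for k in range(2, 11)))
```

Under this reading, \(p_{k} \ge 4\big /7\) is algebraically the same as \(\mu (s_{k+1}) \le (3\big /4)\sum _{i=1}^{k}\mu (s_{i})\), which is the form the condition takes in Propositions 2 and 3.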
References
Ambrus, A., Pathak, P.A.: Cooperation over finite horizons: a theory and experiments. J. Public Econ. 95, 500–512 (2011)
Anderlini, L., Lagunoff, R.: Communication in dynastic repeated games: “whitewashes” and “coverups”. Econ. Theory 26, 265–299 (2005)
Andreoni, J.: Why free ride? Strategies and learning in public goods experiments. J. Public Econ. 37, 291–304 (1988)
Andreoni, J., Croson, R.: Partners vs. strangers: random rematching in public goods experiments, Working Paper (1998)
Andreoni, J., Miller, J.H.: Rational cooperation in the finitely repeated prisoners’ dilemma: experimental evidence. Econ. J. 103(418), 570–585 (1993)
Blanco, M., Engelmann, D., Koch, A.K., Normann, H.T.: Preferences and beliefs in a sequential social dilemma: a within-subjects analysis, Working Paper (2011)
Bolle, F., Ockenfels, P.: Prisoners’ dilemma as a game with incomplete information. J. Econ. Psychol. 11, 69–84 (1990)
Bolton, G.E., Brandts, J., Katok, E.: How strategy sensitive are contributions? Econ. Theory 15, 367–387 (2000)
Bolton, G.E., Katok, E., Ockenfels, A.: Cooperation among strangers with limited information about reputation. J. Public Econ. 89, 1457–1468 (2005)
Brandts, J., Charness, G.: Hot vs. cold: sequential responses and preference stability in experimental games. Exp. Econ. 2, 227–238 (2000)
Brandts, J., Charness, G.: The strategy versus the direct-response method: a first survey of experimental comparisons. Exp. Econ. 14, 375–398 (2011)
Brandts, J., Figueras, N.: An exploration of reputation formation in experimental games. J. Econ. Behav. Organ. 50(1), 89–115 (2003)
Bruttel, L.V., Güth, W., Kamecke, U.: Finitely repeated prisoners’ dilemma experiments without a commonly known end. Int. J. Game Theory 41(1), 23–47 (2012)
Camera, G., Casari, M.: Cooperation among strangers under the shadow of the future. Am. Econ. Rev. 99, 979–1005 (2009)
Camerer, C., Weigelt, K.: Experimental tests of a sequential equilibrium reputation model. Econometrica 56(1), 1–36 (1988)
Chakravorti, B., Conley, J., Taub, B.: On uniquely implementing cooperation in the prisoners’ dilemma. Econ. Theory 8, 347–366 (1996)
Clark, K., Sefton, M.: The sequential prisoner’s dilemma: evidence on reciprocation. Econ. J. 111, 51–68 (2001)
Cooper, R., DeJong, D., Forsythe, R., Ross, T.W.: Cooperation without reputation: experimental evidence from prisoner’s dilemma games. Games Econ. Behav. 12, 187–218 (1996)
Crawford, V.P., Costa-Gomes, M.A., Iriberri, N.: Structural models of nonequilibrium strategic thinking: theory, evidence, and applications. J. Econ. Lit. 51(1), 5–62 (2013)
Dal Bó, P.: Cooperation under the shadow of the future: experimental evidence from infinitely repeated games. Am. Econ. Rev. 95, 1591–1604 (2005)
Dal Bó, P., Frechette, G.: The evolution of cooperation in repeated games: experimental evidence. Am. Econ. Rev. 101, 411–429 (2011)
Duffy, J., Ochs, J.: Cooperative behavior and the frequency of social interaction. Games Econ. Behav. 66, 758–812 (2009)
Duffy, J., Xie, H., Lee, Y.J.: Social norms, information, and trust among strangers: theory and evidence. Econ. Theory 52, 669–708 (2013)
Durkalski, V., Palesch, Y., Lipsitz, S., Rust, P.: Analysis of clustered matched-pair data. Stat. Med. 22, 2417–2428 (2003)
Embrey, M., Fréchette, G.R., Yuksel, S.: Cooperation in the finitely repeated prisoner’s dilemma, Working Paper (2014)
Engle-Warnick, J., Slonim, R.L.: Inferring repeated-game strategies from actions: evidence from trust game experiments. Econ. Theory 28, 603–632 (2006)
Gachter, S., Thoni, C.: Social learning and voluntary cooperation among like-minded people. J. Eur. Econ. Assoc. 3(2–3), 303–314 (2005)
Gong, B., Yang, C.L.: Reputation and cooperation: an experiment on prisoner’s dilemma with second-order information, Working Paper (2010)
Healy, P.J.: Group reputations, stereotypes, and cooperation in a repeated labor market. Am. Econ. Rev. 97(5), 1751–1773 (2007)
Ho, T.H., Su, X.: Peer-induced fairness in games. Am. Econ. Rev. 99, 2022–2049 (2009)
Irlenbusch, B., Sliwka, D.: Incentives, decision frames, and motivation crowding out—an experimental investigation, IZA Discussion Paper No. 1758 (2005)
Jung, Y.J., Kagel, J.H., Levin, D.: On the existence of predatory pricing: an experimental study of reputation and entry deterrence in the chain-store game. RAND J. Econ. 25(1), 72–93 (1994)
Kagel, J.H., McGee, P.: Team versus individual play in finitely repeated prisoner dilemma games, Working Paper (2014)
Kahn, L.M., Murnighan, J.K.: Conjecture, uncertainty, and cooperation in prisoner’s dilemma games. J. Econ. Behav. Organ. 22, 91–117 (1993)
King-Casas, B., Tomlin, D., Anen, C., Camerer, C.F., Quartz, S.R., Montague, P.R.: Getting to know you: reputation and trust in a two-person economic exchange. Science 308, 78–83 (2005)
Kreps, D.M., Milgrom, P., Roberts, J., Wilson, R.: Rational cooperation in the finitely repeated prisoners’ dilemma. J. Econ. Theory 27, 245–252 (1982)
Larocque, D.: The Wilcoxon signed-rank test for cluster correlated data. In: Duchesne, P., Rémillard, B. (eds.) Statistical modeling and analysis for complex data problems, pp. 309–323. Springer, US (2005)
Murnighan, J.K., Roth, A.E.: Expecting continued play in prisoner’s dilemma games. J. Confl. Resol. 27, 279–300 (1983)
Neral, J., Ochs, J.: The sequential equilibrium theory of reputation building: a further test. Econometrica 60(5), 1151–1169 (1992)
Nishihara, K.: A resolution of N-person prisoners’ dilemma. Econ. Theory 10, 531–540 (1997)
Normann, H.T., Wallace, B.: The impact of the termination rule on cooperation in a prisoner’s dilemma experiment. Int. J. Game Theory 41(3), 707–718 (2012)
Nowak, M.A., Sigmund, K.: Evolution of indirect reciprocity. Nature 437, 1291–1298 (2005)
Radner, R.: Can bounded rationality resolve the prisoners’ dilemma? In: Mas-Colell, A., Hildenbrand, W. (eds.) Contributions to mathematical economics, pp. 387–399. North-Holland, Amsterdam (1986)
Rao, J.N.K., Scott, A.J.: The analysis of categorical data from complex sample surveys: Chi-squared tests for goodness of fit and independence in two-way tables. J. Am. Stat. Assoc. 76, 221–230 (1981)
Rao, J.N.K., Scott, A.J.: On chi-squared tests for multiway contingency tables with cell proportions estimated from survey data. Ann. Stat. 12, 46–60 (1984)
Reuben, E., Suetens, S.: Revisiting strategic versus non-strategic cooperation. Exp. Econ. 15, 24–43 (2012)
Roe, B.E., Wu, S.Y.: Do the selfish mimic cooperators? Experimental evidence from finitely-repeated labor markets, Working Paper (2009)
Roth, A.E., Murnighan, J.K.: Equilibrium behavior and repeated play of the prisoner’s dilemma. J. Math. Psychol. 11, 189–198 (1978)
Samuelson, L.: A note on uncertainty and cooperation in a finitely repeated prisoners’ dilemma. Int. J. Game Theory 16, 187–195 (1987)
Schwartz, S.T., Young, R.A., Zvinakis, K.: Reputation without repeated interaction: a role for public disclosures. Rev. Account. Stud. 5, 351–375 (2000)
Selten, R., Stoecker, R.: End behavior in sequences of finite repeated prisoner’s dilemma supergames: a learning theory approach. J. Econ. Behav. Organ. 7, 47–70 (1986)
Sribney, W.M.: Two-way contingency tables for survey or clustered data. Stata Tech. Bull. 45, 33–49 (1998)
Tingley, D.H., Walter, B.F.: The effect of repeated play on reputation building: an experimental approach. Int. Organ. 65, 343–365 (2011)
Acknowledgments
The authors thank Yaron Azrieli, Lucas Coffman, Glenn Dutcher, John Kagel, Semin Kim, Peter McGee, Xiangyu Qu, Arno Riedl, Dan Schley, Mike Sinkey, Tom Wilkening, and Chao Yang for their helpful comments and conversations. We are also grateful to two anonymous referees whose comments and insights have helped us to significantly improve the paper. Healy acknowledges financial support from National Science Foundation Grant #SES-0847406. Any opinions, findings, conclusions, or recommendations expressed are those of the authors and do not necessarily reflect the views of the Federal Trade Commission.
Appendix
Proposition 4
Let \(p\in (0,1)\) be the common belief that the other player plays tit for tat and \(\overline{p}_t\) the period \(t\) posterior belief. The following is a sequential equilibrium for a sequential-move FRPD.
(a) The second-mover plays tit for tat in round \(t\) with probability
$$\begin{aligned} q_{t}\left( p\right) =\min \left\{ \frac{p}{1-p}\frac{1-\bar{p}_{t+1}}{\bar{p} _{t+1}},1\right\} . \end{aligned}$$
Otherwise, the second-mover defects in round \(t\).
(b) The first-mover cooperates in round \(t\) if and only if \(t \ge t^{*}(p)\) where
$$\begin{aligned} t^{*}\left( p\right) =\min \left\{ t\in \mathbb {N}:p\ge \bar{p}_{t}\right\} \text { and } \bar{p}_{t} = \left( \frac{4}{7}\right) ^{t} \text { hold for all }t. \end{aligned}$$
Otherwise, the first-mover defects in round \(t\).
Proof
It is easier notationally to derive the equilibrium by counting backwards with \(t=10\) representing the first round of the supergame. In the body of the paper, however, time is indexed forward with \(t=1\) representing the first round of the supergame. Now, in any period \(t\), the first-mover will cooperate if
where \(V_{t-1}\left( p\right) \) is the continuation value of the first-mover entering period \(t-1\) with belief \(p\). Let \(V_{0}\equiv 0\). Let \(\bar{p}_{t}\) be the smallest value of \(p_{t}\) satisfying this inequality. (We will show later that the inequality in fact grows in \(p_{t}\).)
The probability that a selfish second-mover cooperates is the highest \(q\) such that the first-mover is willing to cooperate in period \(t-1\) after observing cooperation in period \(t\). Thus, if \(\bar{p}_{t}\) is the lowest belief at which the first-mover will cooperate in period \(t\), then \(q_{t}\left( p\right) \) solves
and so
For completeness, let \(q_{t}\left( 1\right) =1\) for all \(t\) and \(q_{t}\equiv 1 \) for any \(t\) where \(\bar{p}_{t-1}=0\). Since a selfish second-mover never cooperates in the last period, set \(q_{1}\left( p\right) =0\) for all \(p\). (This is equivalent to setting \(\bar{p}_{0}=1\).)
For any \(t>1\), consider the case where \(p_{t}\ge \bar{p}_{t-1}\). Here, \( q_{t}\left( p_{t}\right) =1\) (the second-mover cooperates with certainty) and
so the above inequality becomes
or
In other words, the first-mover always cooperates if \(p_{t}\ge \bar{p}_{t-1}\) . This proves that \(\bar{p}_{t}\le \bar{p}_{t-1}\).
Now, suppose \(p_{t}<\bar{p}_{t-1}\). Here,
and so
The above inequality becomes
After several steps of algebra, this reduces to
Since \(\bar{p}_{t}\) solves this inequality exactly, it has the property that
In \(t=1\), the first-mover cooperates if
and so
Thus,
or
Note that \(V_{1}\left( p_{1}\right) =4=V_{1}\left( 0\right) \) for any \( p_{1}\le \bar{p}_{1}\).
In \(t=2\), we know that if \(p_{2}\ge \bar{p}_{1}=4/7\), then the first-mover cooperates with certainty.
If \(p_{2}<\bar{p}_{1}\), then he will cooperate only if \(p_{2}\ge \bar{p}_{2}\) , where \(\bar{p}_{2}\) solves
The expression for \(V_{2}\left( p\right) \) is given by
which is equal to
Note that \(V_{2}\left( p_{2}\right) =8=V_{2}\left( 0\right) \) for any \( p_{2}\le \bar{p}_{2}\).
In \(t=3\), we know that if \(p_{3}\ge \bar{p}_{2}\), then the first-mover cooperates with certainty.
If \(p_{3}<\bar{p}_{2}\), then he will cooperate only if \(p_{3}\ge \bar{p}_{3}\), where \(\bar{p}_{3}\) solves
The expression for \(V_{3}\left( p\right) \) is given by
which is equal to
In general, we will have
and
Let
First-movers will cooperate in period \(T\) if \(p_{T}^{*}\ge \bar{p}_{T}\). Thereafter, they will cooperate as long as they have never seen a defection and will never cooperate after seeing a defection. In that case, beliefs will evolve according to the formula
Beliefs change to \(p_{t}^{*}=0\) if a defection is observed in any previous period. If \(p_{T}^{*}<\bar{p}_{T}\), then both players always defect and \(p_{t}^{*}=p_{T}^{*}\) for every period \(t\). The on-path continuation value of the first-mover will equal
where we set \(\bar{p}_{0}=1\).
Proof of Proposition 1
(a) Let the first-mover’s expected payoff in round \(s\) from the remaining rounds \(1,\ldots ,s\) given beliefs \(p_{1},\ldots ,p_{s}\) be denoted by \(V_{s}(p_{1},\ldots ,p_{s})\). The expected payoff for cooperating in round \(t\) is \(p_{t} (7 + V_{t-1}(p_{1},\ldots ,p_{t-1})) + (1 - p_{t}) V_{t-1}(0,\ldots ,0)\). The expected payoff for defecting in round \(t\), given that the second-mover will respond by defecting for at least one round, is at most \(4 + V_{t-1}(p_{1},\ldots ,p_{t} p_{t-1})\). Therefore, the first-mover plays tit for tat in period \(t\) if and only if the following inequality holds
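We need to show that
$$\begin{aligned} V_{t}(p_{1},\ldots ,p_{t}) = 4 (t-1) + \sum _{i=k+1}^{t} \Bigl (3\prod _{j=i}^{t} p_{j}\Bigr ) + 7 \prod _{i=k}^{t} p_{i} \end{aligned}$$
if
$$\begin{aligned} p_{l} \ge \frac{4}{\sum _{i=k+1}^{l} (3 \prod _{j=i}^{l-1} p_{j}) + 7 \prod _{i=k}^{l-1} p_{i}} \quad \text {for all } l \text { such that } t \ge l \ge k, \end{aligned}$$
and
$$\begin{aligned} V_{t}(p_{1},\ldots ,p_{t}) = 4t \end{aligned}$$
otherwise.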
The proof is by induction. First, we know that the first-mover cooperates in round \(1\) if and only if \(p_{1} 7 + (1 - p_{1}) 0 \ge 4\) holds. Therefore, if \(p_{1} \ge \frac{4}{7}\) holds, then we have \(V_{1}(p_{1}) = 7 p_{1}\), and if \(p_{1} < \frac{4}{7}\) holds, then we have \(V_{1}(p_{1}) = 4\), and the formula is true for \(t = 1\).
Now, assume that the formula holds for all rounds up to \(t-1\) and show that it holds for round \(t\). Assume that the following holds
if
and
otherwise.
The first-mover cooperates in round \(t\) if and only if
We have assumed that \(p_{l} \ge \frac{4}{\sum _{i=k+1}^{l} (3 \prod _{j=i}^{l-1} p_{j}) + 7 \prod _{i=k}^{l-1} p_{i}}\) holds for all \(l\) such that \(t-1 \ge l \ge k\). First, suppose also that \(p_{t}p_{t-1} \ge \frac{4}{\sum _{i=k+1}^{t-1} (3 \prod _{j=i}^{t-2} p_{j}) + 7 \prod _{i=k}^{t-2} p_{i}}\) holds. Then, the first-mover cooperates in round \(t\) if and only if
Now, suppose that \(p_{t}p_{t-1} < \frac{4}{\sum _{i=k+1}^{t-1} (3 \prod _{j=i}^{t-2} p_{j}) + 7 \prod _{i=k}^{t-2} p_{i}}\) holds. Then, the first-mover cooperates in round \(t\) if and only if
Hence, the first-mover cooperates in round \(t\) and \(V_{t}(p_{1},p_{2},\ldots ,p_{t}) = 4 (t-1) + \sum _{i=k+1}^{t} (3\prod _{j=i}^{t} p_{j}) + 7 \prod _{i=k}^{t} p_{i}\) if and only if \(p_{l} \ge \frac{4}{\sum _{i=k+1}^{l} (3 \prod _{j=i}^{l-1} p_{j}) + 7 \prod _{i=k}^{l-1} p_{i}}\) holds for all \(l\) such that \(t \ge l \ge k\). Otherwise, the first-mover defects in round \(t\) and \(V_{t}(p_{1},p_{2},\ldots ,p_{t}) = 4t\).
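The closed form for \(V_{t}\) in part (a) can be sanity-checked against a direct expected-payoff calculation. The following is a minimal sketch under stated assumptions, not the authors' derivation: rounds are counted backward, the first-mover plays tit for tat in rounds \(t,\ldots ,k\) and defects from round \(k-1\) on, and the stage payoffs to the first-mover are 7 for mutual cooperation, 0 when her cooperation is met with defection, and 4 for mutual defection, as read off the expressions in the proof. Once the second-mover defects, both players are assumed to defect in every remaining round. The function names are hypothetical.

```python
from fractions import Fraction


def value_direct(p, t, k):
    """Expected first-mover payoff over rounds t, ..., 1 from tit for tat in rounds t..k.

    p[j] is the probability that the second-mover reciprocates in round j,
    conditional on no defection so far.
    """
    total = Fraction(0)
    reach = Fraction(1)  # probability that cooperation has survived to round j
    for j in range(t, k - 1, -1):
        # Second-mover defects now: 0 this round, then 4 in each of the j-1 later rounds.
        total += reach * (1 - p[j]) * 4 * (j - 1)
        # Second-mover reciprocates: 7 this round.
        total += reach * p[j] * 7
        reach *= p[j]
    # Cooperation survived through round k: mutual defection in the last k-1 rounds.
    total += reach * 4 * (k - 1)
    return total


def value_closed_form(p, t, k):
    """4(t-1) + sum_{i=k+1}^{t} 3 prod_{j=i}^{t} p_j + 7 prod_{i=k}^{t} p_i."""
    def prod_from(lo):
        out = Fraction(1)
        for j in range(lo, t + 1):
            out *= p[j]
        return out

    return 4 * (t - 1) + sum(3 * prod_from(i) for i in range(k + 1, t + 1)) + 7 * prod_from(k)


# Illustrative beliefs (an assumption for this example): p_j = j / (j + 1).
beliefs = {j: Fraction(j, j + 1) for j in range(1, 11)}
print(value_direct(beliefs, 10, 2) == value_closed_form(beliefs, 10, 2))
```

Comparing the closed form with \(4t\), the payoff from defecting throughout, and dividing by \(p_{t}\) gives the threshold on \(p_{l}\) that appears in the statement.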
(b) Let the second-mover’s expected payoff in round \(s\) from the remaining rounds \(1,\ldots ,s-1\) given beliefs \(p_{1},\ldots ,p_{s-1}\) be denoted by \(V_{s}(p_{1},\ldots ,p_{s-1})\). The expected payoff for cooperating in round \(t\) is \(7 + p_{t} (7 + V_{t}(p_{1},\ldots ,p_{t-1})) + (1 - p_{t}) (4 + V_{t}(0,\ldots ,0))\). The expected payoff for defecting in round \(t\), given that the first-mover will respond by defecting for at least one round, is at most \(12 + V_{t}(p_{1},\ldots ,p_{t} p_{t-1})\). Therefore, the second-mover plays tit for tat in period \(t+1\) if and only if the following inequality holds
We need to show that
if
and
otherwise.
The proof is by induction. First, we know that defection is the dominant action for the second-mover in round \(1\). The second-mover cooperates in round \(2\) if and only if \(7 + p_{1} 12 + (1 - p_{1}) 4 \ge 12 + 4\) holds. Therefore, if \(p_{1} \ge \frac{5}{8}\) holds, then we have \(V_{2}(p_{1}) = 4 + 8 p_{1}\), and if \(p_{1} < \frac{5}{8}\) holds, then we have \(V_{2}(p_{1}) = 4\), and the formula is true for \(t = 1\).
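The arithmetic behind the \(5\big /8\) threshold in this base case can be reproduced directly. The sketch below simply evaluates the two sides of the inequality \(7 + p_{1} 12 + (1 - p_{1}) 4 \ge 12 + 4\) stated above and the resulting \(V_{2}(p_{1})\). The function names are hypothetical and exact fractions are used to make the indifference point visible.

```python
from fractions import Fraction


def second_mover_cooperates_in_round_2(p1):
    """Compare the two sides of 7 + 12*p1 + 4*(1 - p1) >= 12 + 4."""
    cooperate = 7 + p1 * 12 + (1 - p1) * 4
    defect = 12 + 4
    return cooperate >= defect


def continuation_value_after_round_2(p1):
    """V_2(p1) as stated in the text: 4 + 8*p1 if p1 >= 5/8, and 4 otherwise."""
    if p1 >= Fraction(5, 8):
        return 4 + 8 * p1
    return Fraction(4)


cutoff = Fraction(5, 8)
print(second_mover_cooperates_in_round_2(cutoff))
print(second_mover_cooperates_in_round_2(cutoff - Fraction(1, 1000)))
```

At \(p_{1} = 5\big /8\) the two sides are equal, so the second-mover is exactly indifferent there and cooperates for any higher belief.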
Now, we assume that the formula holds for all rounds up to \(t-1\) and show that it holds for round \(t\). Assume that the following holds
if
and
otherwise.
The second-mover cooperates in round \(t+1\) if and only if
We have assumed that \(p_{l} \ge \frac{5}{\sum _{i=k+1}^{l} (3 \prod _{j=i}^{l-1} p_{j}) + 8 \prod _{i=k}^{l-1} p_{i}}\) holds for all \(l\) such that \(t-1 \ge l \ge k\). First, suppose that \(p_{t}p_{t-1} \ge \frac{5}{\sum _{i=k+1}^{t-1} (3 \prod _{j=i}^{t-2} p_{j}) + 8 \prod _{i=k}^{t-2} p_{i}}\) holds. Then, the second-mover cooperates in round \(t+1\) if and only if
Now, suppose that \(p_{t}p_{t-1} < \frac{5}{\sum _{i=k+1}^{t-1} (3 \prod _{j=i}^{t-2} p_{j}) + 8 \prod _{i=k}^{t-2} p_{i}}\) holds. Then, the second-mover cooperates in round \(t+1\) if and only if
Hence, the second-mover cooperates in round \(t+1\) and \(V_{t+1}(p_{1},p_{2},\ldots ,p_{t}) = 4 (t-1) + \sum _{i=k+1}^{t} (3\prod _{j=i}^{t} p_{j}) + 8 \prod _{i=k}^{t} p_{i}\) if \(p_{l} \ge \max \{\frac{1}{3},\frac{5}{\sum _{i=k+1}^{l} (3 \prod _{j=i}^{l-1} p_{j}) + 8 \prod _{i=k}^{l-1} p_{i}}\}\) holds for all \(l\) such that \(t \ge l \ge k\). Otherwise, the second-mover defects in round \(t+1\) and \(V_{t+1}(p_{1},p_{2},\ldots ,p_{t}) = 4t\).
Proof of Proposition 2
By Proposition 1, the first-mover’s Block 1 strategy is \(s_{m}\) if and only if \(\mu \) is such that \(p_{k} \ge \frac{4}{\sum _{i=m+1}^{k} (3 \prod _{j=i}^{k-1} p_{j}) + 7 \prod _{i=m}^{k-1} p_{i}}\) holds for all \(k \in \{m,\ldots ,10\}\). This condition can be rewritten in terms of the prior beliefs \(\mu \) as
for all \(k \in \{m,\ldots ,10\}\). After several steps of algebra, the denominator of the right-hand side of the above inequality simplifies to \(\frac{1}{\sum _{i=1}^{k}\mu (s_{i})}((7 + 3(k - m))(\sum _{i=1}^{m}\mu (s_{i})) + 3\sum _{i=m+1}^{k}((k+1-i)\mu (s_{i})))\), and the condition can be simplified to
for all \(k \in \{m,\ldots ,10\}\).
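The simplification of the denominator can be checked numerically. The sketch below rests on an assumption that is not stated in this section: that the beliefs are the Bayesian ratios \(p_{j} = M_{j} / M_{j+1}\), where \(M_{j} = \sum _{i=1}^{j} \mu (s_{i})\), so that the products telescope to \(M_{i} / M_{k}\). Under that assumption the product form and the closed form with coefficient \(7 + 3(k - m)\) should agree exactly. All function names are hypothetical.

```python
from fractions import Fraction


def cumulative(mu):
    """Return M with M[j] = mu(s_1) + ... + mu(s_j) and M[0] = 0."""
    M = [Fraction(0)]
    for weight in mu:
        M.append(M[-1] + weight)
    return M


def denominator_from_products(mu, m, k):
    """sum_{i=m+1}^{k} 3*prod_{j=i}^{k-1} p_j + 7*prod_{i=m}^{k-1} p_i.

    ASSUMPTION: p_j = M_j / M_{j+1}; this is inferred, not quoted.
    """
    M = cumulative(mu)
    p = {j: M[j] / M[j + 1] for j in range(1, len(mu))}

    def prod(lo, hi):
        # Empty product (lo > hi) is 1.
        out = Fraction(1)
        for j in range(lo, hi + 1):
            out *= p[j]
        return out

    total = sum(3 * prod(i, k - 1) for i in range(m + 1, k + 1))
    return total + 7 * prod(m, k - 1)


def denominator_closed_form(mu, m, k):
    """Closed form: ((7 + 3(k-m)) M_m + 3 sum_{i=m+1}^{k} (k+1-i) mu(s_i)) / M_k."""
    M = cumulative(mu)
    tail = sum((k + 1 - i) * mu[i - 1] for i in range(m + 1, k + 1))
    return ((7 + 3 * (k - m)) * M[m] + 3 * tail) / M[k]
```

Here `mu` is a list of eleven positive weights with `mu[i - 1]` standing for \(\mu (s_{i})\). The weights need not be normalized, because both expressions are ratios of sums of \(\mu \).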
Now, suppose that the first-mover’s prior beliefs satisfy \(\mu (s_{k+1}) \le (3\big /4) \sum _{i=1}^{k} \mu (s_{i})\) for all \(k \in \{m,\ldots ,10\}\). For \(k=m\), the above condition is satisfied trivially. We now show that the above condition is satisfied for \(k=m+r\) for any \(r \ge 1\). For any \(r \ge 1\), the inequality \(\mu (s_{m+r+1}) \le (3\big /4) \sum _{i=1}^{m+r} \mu (s_{i})\) can be rewritten as
where \(\delta = \frac{1}{4} \sum _{i=m+1}^{m+r} (3(1-m-r+i)+1) \mu (s_{i}) - \frac{3}{4}r \sum _{i=1}^{m} \mu (s_{i})\). If \(r = 1\), then \(\delta = \mu (s_{m+1}) - \frac{3}{4} \sum _{i=1}^{m} \mu (s_{i}) \le 0\) holds and the condition for the first-mover to play strategy \(s_{m}\) in Block 1 is satisfied. Now, suppose that \(r \ge 2\). We can rewrite \(\delta \) as follows:
where \(\gamma = \frac{1}{4} \sum _{i=m+1}^{m+r-2} (3(1-m-r+i)+1) \mu (s_{i})\). Note that if \(r \ge 2\), then \(\gamma < 0\). Therefore, we have the following
\(r \ge 2\) implies that \((\frac{3}{4})^{r-2} - r < 0\), so \(\delta < 0\) holds and the condition for the first-mover to play strategy \(s_{m}\) in Block 1 is satisfied.
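The sufficiency argument can also be spot-checked on randomly drawn priors. The sketch below again assumes \(p_{j} = M_{j} / M_{j+1}\) with \(M_{j} = \sum _{i=1}^{j} \mu (s_{i})\), which is not stated in this section, and uses the telescoped denominator \((7 M_{m} + 3 \sum _{i=m+1}^{k} M_{i}) / M_{k}\). It draws priors that satisfy \(\mu (s_{k+1}) \le \frac{3}{4} \sum _{i=1}^{k} \mu (s_{i})\) for all \(k \in \{m,\ldots ,10\}\) and tests the Block 1 condition for strategy \(s_{m}\). The helper names are hypothetical, and a random check illustrates the claim but does not replace the proof.

```python
from fractions import Fraction
import random


def random_prior(m, n_strategies=11, rng=random):
    """Draw a prior with mu(s_{k+1}) <= (3/4) * sum_{i<=k} mu(s_i) for k >= m."""
    mu = [Fraction(rng.randint(1, 100)) for _ in range(m)]
    for _ in range(m, n_strategies):
        cap = Fraction(3, 4) * sum(mu)
        mu.append(cap * Fraction(rng.randint(0, 100), 100))
    total = sum(mu)
    return [weight / total for weight in mu]


def block1_condition_holds(mu, m):
    """Check p_k >= 4 / D_k for every k in {m, ..., len(mu) - 1}.

    ASSUMPTION: p_k = M_k / M_{k+1}, so D_k telescopes to
    (7*M_m + 3*sum_{i=m+1}^{k} M_i) / M_k.
    """
    M = [Fraction(0)]
    for weight in mu:
        M.append(M[-1] + weight)
    for k in range(m, len(mu)):
        d_k = (7 * M[m] + 3 * sum(M[i] for i in range(m + 1, k + 1))) / M[k]
        p_k = M[k] / M[k + 1]
        if p_k < 4 / d_k:
            return False
    return True
```

A uniform prior over eleven strategies violates the bound at \(k = 1\), since \(\mu (s_{2}) = \mu (s_{1})\), and accordingly fails the condition for \(m = 1\).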
Given that the second-mover always defected before her opponent by round \(n\) of Block 1 supergames, where \(m < n\), we have \(\tilde{\mu }(s_{k}) = 0\) for all \(k \le m\). Therefore, \(\tilde{p}_{l} = 0\) holds for all \(l \le m\). By Proposition 1, it follows that the first-mover’s Block 2 strategy is \(s_{m+t}\) for some \(t \ge 1\).
Proof of Proposition 3
(a) By Proposition 1, the first-mover’s Block 1 strategy is \(s_{n}\) if and only if \(\mu \) is such that \(p_{k} \ge \frac{4}{\sum _{i=n+1}^{k} (3 \prod _{j=i}^{k-1} p_{j}) + 7 \prod _{i=n}^{k-1} p_{i}}\) holds for all \(k \in \{n,\ldots ,10\}\). By similar logic to the proof of Proposition 2, if the first-mover’s prior beliefs satisfy \(\mu (s_{k+1}) \le (3\big /4) \sum _{i=1}^{k} \mu (s_{i})\) for all \(k \in \{n,\ldots ,10\}\), then the condition for the first-mover to play strategy \(s_{n}\) in Block 1 is satisfied. Given that the second-mover never defected before her opponent in rounds \(10,\ldots ,m\) of Block 1 supergames, where \(m < n\), we have \(\tilde{\mu }(s_{k}) = 0\) for all \(k > m\). Therefore, \(\tilde{p}_{l} = 1\) holds for all \(l \ge m\). By Proposition 1, it follows that the first-mover’s Block 2 strategy is \(s_{n-t}\) for some \(t \ge 1\).
(b) \(\mu (s_{11}) > 3\big /7\) implies that \(\mu (s_{11}) > (3\big /4) \sum _{i=1}^{10} \mu (s_{i})\) holds, so the condition for the first-mover to play strategy \(s_{11}\) in Block 1 is satisfied. Given that the second-mover never defected before her opponent in rounds \(10,\ldots ,m\) of Block 1 supergames, where \(m \le 10\), we have \(\tilde{\mu }(s_{k}) = 0\) for all \(k > m\). Therefore, \(\tilde{p}_{l} = 1\) holds for all \(l \ge m\). By Proposition 1, it follows that the first-mover’s Block 2 strategy is \(s_{11-t}\) for some \(t \ge 1\).
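The first implication in part (b) can be spelled out in one line, assuming (as the notation suggests, though it is not restated here) that \(\mu \) is a probability distribution over the eleven strategies \(s_{1},\ldots ,s_{11}\), so that \(\sum _{i=1}^{10} \mu (s_{i}) = 1 - \mu (s_{11})\):

\[
\mu (s_{11}) > \frac{3}{4} \sum _{i=1}^{10} \mu (s_{i}) = \frac{3}{4} \bigl (1 - \mu (s_{11})\bigr ) \iff \frac{7}{4}\, \mu (s_{11}) > \frac{3}{4} \iff \mu (s_{11}) > \frac{3}{7}.
\]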
Cite this article
Cox, C.A., Jones, M.T., Pflum, K.E. et al. Revealed reputations in the finitely repeated prisoners’ dilemma. Econ Theory 58, 441–484 (2015). https://doi.org/10.1007/s00199-015-0863-1