Definition

A repeated game ΓT is one in which the stage game Γ is repeated T times. When T is finite, the repeated game is called a finitely repeated game, and when T is infinite, it is called an infinitely repeated game.

Notations

Let \( \Gamma =\left(\left\{1,2,\dots, N\right\},{\left({A}_n\right)}_{n=1}^N,{\left({\pi}_n\right)}_{n=1}^N\right) \), where {1, 2, …, N} is the set of players; and for each n = 1, 2, …, N, An is the action space of player n and \( {\pi}_n:{\times}_{n=1}^N{A}_n\to \mathbb{R} \) is the payoff function of player n. In other words, a stage game is defined by the set of players, the action space of each player and the payoff function of each player, which gives the real-valued utility that the player receives for every combination of actions of all players.

We usually assume that (a) N is finite; (b) for each n, An is finite; and (c) Γ is a simultaneous move game. We also assume that if actions \( {\left({a}^t\right)}_{t=1}^T \), where \( {a}^t=\left({a}_1^t,{a}_2^t,\dots, {a}_N^t\right) \), are taken by the players n = 1, 2, …, N in the stages t = 1, 2, …, T, then player n’s payoff function for the repeated game will be \( {\sum}_{t=1}^T{\delta}_n^t{\pi}_n\left({a}^t\right) \), where δn ∈ (0, 1) is the discount factor of player n.
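As a concrete illustration, the repeated-game payoff formula above can be computed directly (a minimal sketch; the function name and the list representation of stage payoffs are our own assumptions, not from the source):

```python
# Sketch: player n's repeated-game payoff, sum over t of delta_n^t * pi_n(a^t),
# where stage_payoffs[t-1] holds the stage payoff pi_n(a^t).
def discounted_payoff(stage_payoffs, delta):
    return sum(delta ** t * u for t, u in enumerate(stage_payoffs, start=1))

# A constant stage payoff of 1 for three periods with delta = 0.9 gives
# 0.9 + 0.81 + 0.729 = 2.439.
print(discounted_payoff([1, 1, 1], 0.9))
```

Note that, following the source's convention, the sum starts at t = 1, so even the first-period payoff is discounted once.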

Let δ = (δ1, δ2, …, δN). In order to make the dependence on the discount factors explicit, we will denote a finitely repeated game with T repetitions and discount factors δ by ΓT(δ), and an infinitely repeated game with discount factors δ by Γ(δ).

Histories and Subgames

Suppose that the game begins in period 1, with the null history h1. For period t ≥ 2, let ht = (a1, a2, …, at−1) be the period-t history of choices of actions before period t, and let Ht be the set of all such possible period-t histories. A strategy for player n is a sequence of functions \( {\left({s}_n^t\right)}_{t=1}^T \), with \( {s}_n^t:{H}^t\to {\mathbf{A}}_n \), where \( {\mathbf{A}}_n \) is the set of probability distributions over An. Such a probability distribution is called a mixed strategy, so a pure strategy is the degenerate special case of a mixed strategy in which a particular action is played with probability one.

Note that for a strategy profile to be well-defined, it must specify probabilities over actions (mixed strategies) for all histories, not just the histories that would occur as a result of the strategy profile.

Every history begins a new subgame, and for a strategy profile to be subgame perfect, it must induce a Nash equilibrium in every such subgame, both on and off the equilibrium path.

Discount Factor

A player’s discount factor combines the player’s own cost of capital (i.e., the rate at which the player discounts future utility) with the player’s subjective probability of surviving into the future.

Finitely Repeated Game

All players know the finite time at which the game will terminate.

In such a case, a Nash equilibrium of the stage game must be played in the final period of any subgame perfect equilibrium. One class of subgame perfect equilibria consists of those in which a Nash equilibrium of the stage game – not necessarily the same one in every period – is played in each period; when the stage game has multiple Nash equilibria, however, other outcomes can also be sustained in earlier periods.

Consider the special case where the stage game Γ has a unique Nash equilibrium. Then, by backward induction, the only subgame perfect equilibrium of the finitely repeated game is the one in which that unique Nash equilibrium is played in every stage.

Infinitely Repeated Game

Consider the condition that for each player, at each point in time, there is a positive probability, bounded away from zero, that there will be another round of the game. Given the interpretation of the discount factor mentioned earlier, such a condition is sufficient for a game of indefinite length to be modelled as an infinitely repeated game. In particular, if there is a constant probability that there will be another round of the game, the repeated game can be modelled as an infinitely repeated game.
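Under this interpretation, a constant continuation probability simply rescales the discount factor: if a player's pure time-discount factor is d and the game continues each period with probability p, the effective discount factor is their product (a sketch under that assumed interpretation; the names are ours):

```python
# Sketch: effective discount factor when future payoffs are both time-discounted
# (factor d) and received only if the game survives (probability p per period).
def effective_delta(d, p):
    return d * p

print(effective_delta(0.8, 0.5))  # 0.4: survival risk halves the weight on the future
```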

Let player n’s minmax value be \( \underline {v_n}={\mathrm{min}}_{a_{-n}}\left[{\mathrm{max}}_{a_n}{\pi}_n\left({a}_n,{a}_{-n}\right)\right] \). It is the lowest payoff to which the other players can hold player n; equivalently, it is the payoff that player n can guarantee itself in each stage game.
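For a two-player game given as a payoff matrix, the pure-strategy version of this value can be computed as follows (a sketch restricted to pure actions, whereas the definition also permits mixed action profiles; the payoff numbers anticipate the Prisoner's Dilemma example below):

```python
# Sketch: player 1's pure-strategy minmax value in a two-player game,
# where pi1[i][j] is player 1's payoff when she plays i and the opponent plays j.
def minmax_value(pi1):
    n_rows, n_cols = len(pi1), len(pi1[0])
    # For each opponent action j, player 1 best-responds; the opponent then
    # chooses j to minimise that best-response payoff.
    return min(max(pi1[i][j] for i in range(n_rows)) for j in range(n_cols))

# Row player of the Prisoner's Dilemma below: actions (high, low) vs (high, low).
print(minmax_value([[100, 0], [160, 80]]))  # 80
```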

The following “folk theorem” asserts that every feasible payoff vector that strictly dominates the players’ minmax values can be sustained in Nash equilibrium for sufficiently high discount factors. It is called a folk theorem because it was part of the profession’s folk wisdom long before it was formally proven.

Theorem (folk theorem): For every feasible payoff vector v such that \( {v}_n>\underline {v_n},\forall n,\exists \underline{\delta}\in \left(0,1\right) \) such that \( {\delta}_n\in \left(\underline{\delta},1\right)\forall n\Rightarrow \exists \) Nash equilibrium of Γ(δ) with payoff vector v.

Let V = convex hull of {v : ∃a ∈ A such that π(a) = v}, where \( A={\times}_{n=1}^N{A}_n \) and π(a) = (π1(a), π2(a), …, πN(a)). In other words, V is the set of convex combinations of all payoff vectors that can be achieved by action profiles of the players.

The following classic “Nash-threats folk theorem” shows that any payoff profile that strictly dominates a stage-game Nash equilibrium can be sustained in subgame perfect equilibrium, for sufficiently high discount factors.

Theorem (Friedman 1971): Let a* be a Nash equilibrium of the stage game Γ with payoff vector v*. Then, for any v ∈ V such that \( {v}_n>{v}_n^{\ast}\forall n,\exists \underline{\delta}\in \left(0,1\right) \) such that δn ∈ (\( \underline{\delta} \), 1) ∀n ⇒ ∃ subgame perfect equilibrium of Γ(δ) with payoff vector v.

The following theorem asserts that if the feasible payoff space has the same dimensionality as the number of players, then any payoff profile that strictly dominates the minmax values can be sustained in subgame perfect equilibrium, for sufficiently high discount factors.

Theorem (Fudenberg and Maskin 1986): Let the dimension of V be equal to N, the number of players. Then, for every payoff vector v ∈ V such that \( {v}_n>\underline {v_n},\forall n,\exists \underline{\delta}\in \left(0,1\right) \) such that δn ∈ (\( \underline{\delta} \), 1) ∀n ⇒ ∃ subgame perfect equilibrium of Γ(δ) with payoff vector v.

Repeated Prisoner’s Dilemma

In each stage, two firms in a market, the Row Player (A) and the Column Player (B), are not price-takers. Each firm can either charge a high price (cooperate) or a low price (not cooperate). If they charge equal prices, they take equal shares of the market at that price. If they charge unequal prices, the firm with the lower price takes the entire market at that price. Market revenues are $200 m at the high price and $160 m at the low price. Neither firm can observe the other firm’s decision when making its own, and costs are negligible relative to revenues. In the payoff matrix, it is conventional to put the row player’s payoff first.

Stage Game Prisoner’s Dilemma – Payoff Matrix

                                            Column player (B)
                                 High price (cooperate)   Low price (not cooperate)
Row player (A)
  High price (cooperate)             $100 m, $100 m             $0, $160 m
  Low price (not cooperate)          $160 m, $0                 $80 m, $80 m

  • Both firms are better off when they charge high prices than when they charge low prices (i.e., cooperation by both firms Pareto-dominates non-cooperation by both firms).

  • Pricing low is the dominant strategy for each firm (i.e., no matter what the other firm does, it is in each firm’s interest to not cooperate in the stage game).

  • Both firms charging low is the unique Nash equilibrium (i.e., the only mutually self-enforcing pair of strategies in the stage game is where each firm does not cooperate). This is also the minmax outcome for each player.
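The three claims above can be verified mechanically from the payoff matrix (a sketch; the dictionary encoding of the matrix is our own):

```python
# Payoff matrix: (row action, column action) -> (row payoff, column payoff),
# with 'H' = high price (cooperate) and 'L' = low price (not cooperate).
payoffs = {('H', 'H'): (100, 100), ('H', 'L'): (0, 160),
           ('L', 'H'): (160, 0), ('L', 'L'): (80, 80)}

def is_nash(r, c):
    # Neither player can gain by a unilateral deviation.
    row_ok = all(payoffs[(r, c)][0] >= payoffs[(d, c)][0] for d in 'HL')
    col_ok = all(payoffs[(r, c)][1] >= payoffs[(r, d)][1] for d in 'HL')
    return row_ok and col_ok

# 'L' strictly dominates 'H' for the row player, and (L, L) is the unique
# pure-strategy Nash equilibrium.
assert all(payoffs[('L', c)][0] > payoffs[('H', c)][0] for c in 'HL')
print([(r, c) for r in 'HL' for c in 'HL' if is_nash(r, c)])  # [('L', 'L')]
```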

From the previous discussion, non-cooperation in every stage is the only subgame perfect equilibrium of a finitely repeated Prisoner’s Dilemma. For an oligopoly with a definite termination date, therefore, cooperation is not sustainable in subgame perfect equilibrium. However, in experiments with known large but finite numbers of repetitions, players often cooperate in the initial rounds.

The folk theorems imply that for the infinitely repeated Prisoner’s Dilemma, any payoff vector strictly higher than the non-cooperative payoffs is sustainable in equilibrium for sufficiently high discount factors – in particular, the cooperative outcome is sustainable. For an oligopoly, these theorems mean that for low enough costs of capital and high enough probabilities of survival, tacit cooperation is sustainable in subgame perfect equilibrium. For example, the repeated interactions between Coca-Cola and PepsiCo in the beverage industry can be modelled as a repeated Prisoner’s Dilemma, and we can calculate the conditions under which implicit cooperation (“win-win”) between the firms can be sustained in equilibrium.

Dynamic Strategy

A “good” dynamic strategy needs to score highly along the following dimensions:

  • Clarity: it needs to be simple

  • Niceness: it should not initiate deviating from a cooperative outcome

  • Provocability: it should not let deviation from a cooperative outcome go unpunished

  • Forgiveness: it should not “hold a grudge” for too long.

Examples of Dynamic Strategy

The “grim trigger” or reversion strategy profile is:

  • In every period t, price high (cooperate) if the other firm has charged high prices (cooperated) in each previous period

  • Price low (do not cooperate) otherwise

The grim trigger strategy is, therefore,

  • Clear: absolutely

  • Nice: only in the sense that it does not initiate non-cooperative behaviour

  • Provocable: it punishes every single deviation from cooperative behaviour

  • Completely unforgiving.

The grim trigger strategy profile is conceptually useful in the sense that it provides us with a bound on what is sustainable in subgame perfect equilibrium – in particular, if something is not sustainable through a grim trigger strategy profile, it is not likely to be sustainable in subgame perfect equilibrium through any other dynamic strategy. However, because of its draconian nature, it is unlikely to be used in a real-life situation. Consider the following punishment: if a driver is caught going even one mile per hour above the speed limit, the driver loses her/his licence for life. If this cannot stop speeding, very few other strategies can. However, such a draconian punishment is unlikely to be acceptable to society.
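For the Prisoner's Dilemma above, the grim trigger calculation is simple: cooperating forever yields $100 m per period, while the best one-shot deviation yields $160 m once and $80 m per period thereafter, so cooperation is a best response iff 100/(1 − δ) ≥ 160 + 80δ/(1 − δ), i.e. δ ≥ (160 − 100)/(160 − 80) = 0.75 (a sketch; the function name is ours):

```python
# Sketch: critical discount factor above which grim trigger sustains cooperation.
# coop = per-period cooperative payoff, deviate = one-shot deviation payoff,
# punish = per-period payoff once both firms revert to the stage Nash equilibrium.
def critical_delta(coop, deviate, punish):
    return (deviate - coop) / (deviate - punish)

print(critical_delta(100, 160, 80))  # 0.75
```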

“Tit for tat” is another important dynamic strategy: cooperate in the first period, and thereafter mimic the opponent’s behaviour from the previous period. It scores highly on all four criteria of clarity, niceness, provocability and forgiveness. It is a robust strategy – it has performed well in the competitive tournaments studied by Axelrod (2006). It manages to encourage cooperation whenever possible while avoiding exploitation. It can, however, start an escalation process.
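Tit for tat is easy to state as a function of the opponent's history (a sketch; the 'C'/'D' encoding of cooperate and defect is ours):

```python
# Sketch: tit for tat cooperates in period 1, then repeats the opponent's
# previous move.
def tit_for_tat(opponent_history):
    return 'C' if not opponent_history else opponent_history[-1]

# Against a single defection, tit for tat punishes exactly once, then forgives.
opponent = ['C', 'D', 'C', 'C']
moves = [tit_for_tat(opponent[:t]) for t in range(len(opponent) + 1)]
print(moves)  # ['C', 'C', 'D', 'C', 'C']
```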

A variant of the tit-for-tat strategy is the “What have you done for me lately?” strategy:

  • Begin cooperating

  • Continue cooperating

  • Keep track of the proportion of moves in which your opponent has not cooperated even though you have

  • If that proportion becomes “unacceptable”, revert to tit for tat.

Of course, determining what is “unacceptable” can be critical. Dixit and Nalebuff (1993) suggest the following: start cooperating, and continue to do so until one of the four tests below fails.

  • First impression: non-cooperation on the first move is unacceptable; revert to tit for tat

  • Short term: two non-cooperative moves in any three consecutive turns are unacceptable; revert to tit for tat

  • Medium term: three non-cooperative moves out of the last 20 periods are unacceptable; revert to tit for tat

  • Long term: five non-cooperative moves out of the last 100 periods are unacceptable; revert to tit for tat.
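The four tests can be combined into a single reversion check on the opponent's history (a sketch; the 'C'/'D' encoding and the sliding-window reading of the short-term test are our own assumptions, while the thresholds are those stated above):

```python
# Sketch: decide whether to abandon unconditional cooperation and revert to
# tit for tat, given the opponent's history of 'C' (cooperate) / 'D' (defect).
def should_revert(history):
    defections = lambda window: window.count('D')
    if history and history[0] == 'D':                 # first impression
        return True
    if any(defections(history[t:t + 3]) >= 2          # short term: 2 in any 3
           for t in range(max(0, len(history) - 2))):
        return True
    if defections(history[-20:]) >= 3:                # medium term: 3 in last 20
        return True
    if defections(history[-100:]) >= 5:               # long term: 5 in last 100
        return True
    return False

print(should_revert(['C'] * 10 + ['D']))         # False: one isolated defection is tolerated
print(should_revert(['C', 'D', 'C', 'D', 'C']))  # True: two defections within three turns
```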

This strategy scores higher than tit for tat on the niceness and forgiveness tests, but lower on the provocability and clarity tests.

See Also