
In many game situations, the evolutionary process unfolds in discrete time rather than in continuous time. This chapter extends the analysis to a discrete-time dynamic framework. In particular, it presents an analysis of subgame consistent solutions which entail group optimality and individual rationality for cooperative (deterministic and stochastic) dynamic games. It integrates the works of Yeung and Petrosyan (2010) and Chapters 12 and 13 of Yeung and Petrosyan (2012a). We first present in Sect. 7.1 a general formulation of cooperative dynamic games in discrete time, with the noncooperative outcomes and the notions of group optimality and individual rationality. Subgame consistent cooperative solutions with corresponding payoff distribution procedures are derived in Sect. 7.2. An illustration of cooperative resource extraction in discrete time is given in Sect. 7.3. A general formulation of cooperative stochastic dynamic games in discrete time is given in Sect. 7.4. Subgame consistent cooperative solutions with corresponding payoff distribution procedures are derived in Sect. 7.5. An illustration of cooperative resource extraction under uncertainty in discrete time is given in Sect. 7.6. A heuristic approach to obtaining subgame consistent solutions for cooperative dynamic games is provided in Sect. 7.7. Section 7.8 contains the Appendices of the chapter. Chapter Notes are given in Sect. 7.9 and problems in Sect. 7.10. In addition, to make the discrete-time analysis in this chapter fully in line with the continuous-time analyses presented in earlier chapters, a terminal condition is added to each player's payoff in Yeung and Petrosyan (2010, 2012a).

1 Cooperative Dynamic Games

In this Section we present the basic framework of discrete-time cooperative dynamic games.

1.1 Game Formulation

Consider the general \( T \)-stage \( n \)-person nonzero-sum discrete-time cooperative dynamic game with initial state \( {x}^0 \). The state space of the game is \( X\subset {R}^m \) and the state dynamics of the game are characterized by the difference equation:

$$ {x}_{k+1}={f}_k\left({x}_k,{u}_k^1,{u}_k^2,\cdots, {u}_k^n\right), $$
(1.1)
for \( k\in \left\{1,2,\cdots, T\right\}\equiv \kappa \) and \( {x}_1={x}^0 \),

where \( {u}_k^i\in {U}^i\subset {R}^{m_i} \) is the control vector of player i at stage k, and \( {x}_k\in X\subset {R}^m \) is the state of the game.

The payoff of player i is

$$ {\displaystyle \sum_{\zeta =1}^T}{g}_{\zeta}^i\left[{x}_{\zeta },{u}_{\zeta}^1,{u}_{\zeta}^2,\cdots, {u}_{\zeta}^n\right]\;{\left(\frac{1}{1+r}\right)}^{\zeta -1}+{q}_{T+1}^i\left({x}_{T+1}\right){\left(\frac{1}{1+r}\right)}^T, $$
(1.2)
for \( i\in \left\{1,2,\cdots, n\right\}\equiv N \),

where r is the discount rate, and \( {q}_{T+1}^i\left({x}_{T+1}\right) \) is the terminal benefit that player i received at stage \( T+1 \).

The payoffs of the players are transferable.
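For concreteness, the discounted sum in (1.2) is easy to compute once a state-control path has been played out. The following is a minimal sketch, assuming the stage payoffs and the terminal benefit have already been evaluated along a given path:

```python
def discounted_payoff(stage_payoffs, terminal_benefit, r):
    """Present value (1.2): sum of g_zeta * (1/(1+r))**(zeta - 1) over
    stages zeta = 1..T, plus q_{T+1}(x_{T+1}) * (1/(1+r))**T."""
    T = len(stage_payoffs)
    d = 1.0 / (1.0 + r)
    stream = sum(g * d**zeta for zeta, g in enumerate(stage_payoffs))
    return stream + terminal_benefit * d**T
```

For example, with \( T=2 \), stage payoffs of 1 at each stage, a terminal benefit of 1, and \( r=1 \), the present value is 1 + 0.5 + 0.25 = 1.75.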

1.2 Noncooperative Outcome

In this subsection, we characterize the noncooperative outcome of the discrete-time economic game (1.1 and 1.2). Let \( \left\{{\phi}_k^i(x),\ \text{for}\ k\in \kappa\ \text{and}\ i\in N\right\} \) denote a set of strategies that provides a feedback Nash equilibrium solution to the game (1.1 and 1.2), and

$$ \begin{array}{l}{V}^i\left(k,x\right)={\displaystyle \sum_{\zeta =k}^T}{g}_{\zeta}^i\left[{x}_{\zeta },{\phi}_{\zeta}^1\left({x}_{\zeta}\right),{\phi}_{\zeta}^2\left({x}_{\zeta}\right),\cdots, {\phi}_{\zeta}^n\left({x}_{\zeta}\right)\right]\;{\left(\frac{1}{1+r}\right)}^{\zeta -1}\\ {}+{q}_{T+1}^i\left({x}_{T+1}\right)\kern0.24em {\left(\frac{1}{1+r}\right)}^T,\end{array} $$

where \( {x}_k=x \), for \( k\in \kappa \) and \( i\in N \), denotes the value function indicating the game equilibrium payoff to player i over the stages from k to \( T+1 \). A frequently used way to characterize and derive a feedback Nash equilibrium of the game is provided in the following theorem.

Theorem 1.1

A set of strategies \( \left\{{\phi}_k^i(x),\ \text{for}\ k\in \kappa\ \text{and}\ i\in N\right\} \) provides a feedback Nash equilibrium solution to the game (1.1 and 1.2) if there exist functions \( {V}^i\left(k,x\right) \), for \( k\in \kappa \) and \( i\in N \), such that the following recursive relations are satisfied:

$$ \begin{array}{l}{V}^i\left(k,x\right)=\underset{u_k^i}{ \max}\Big\{{g}_k^i\left[x,{\phi}_k^1(x),{\phi}_k^2(x),\cdots, {\phi}_k^{i-1}(x),{u}_k^i,{\phi}_k^{i+1}(x),\cdots, {\phi}_k^n(x)\right]{\left(\frac{1}{1+r}\right)}^{k-1}\\ {}+{V}^i\left[k+1,{f}_k\left(x,{\phi}_k^1(x),{\phi}_k^2(x),\cdots, {\phi}_k^{i-1}(x),{u}_k^i,{\phi}_k^{i+1}(x),\cdots, {\phi}_k^n(x)\right)\right]\Big\},\end{array} $$
(1.3)
$$ {V}^i\left(T+1,x\right)={q}_{T+1}^i(x)\;{\left(\frac{1}{1+r}\right)}^T; $$
(1.4)

for \( i\in N \) and \( k\in \kappa \).

Proof

Invoking the discrete-time dynamic programming technique in Theorem A.5 of the Technical Appendices, \( {V}^i\left(k,x\right) \) is the maximized payoff of player i given the strategies \( \left\{{\phi}_k^j(x),\ \text{for}\ j\in N\ \text{and}\ j\ne i\right\} \) of the other \( n-1 \) players. Hence a Nash equilibrium results. ■

For the sake of exposition, we sidestep the issue of multiple equilibria and focus on solvable games in which a particular noncooperative Nash equilibrium is chosen by the players in every subgame.

1.3 Dynamic Cooperation

Now consider the case when the players agree to cooperate and distribute the payoff among themselves according to an optimality principle. Two essential properties that a cooperative scheme has to satisfy are group optimality and individual rationality. An agreed-upon optimality principle entails group optimality and an imputation to distribute the total cooperative payoff among the players.

We first examine the group optimal solution and then the condition under which individual rationality will be maintained.

1.3.1 Group Optimality

Maximizing the players’ joint payoff guarantees group optimality in a game where payoffs are transferable. To maximize their joint payoff the players have to solve the discrete-time dynamic programming problem of maximizing

$$ {\displaystyle \sum_{j=1}^n}{\displaystyle \sum_{k=1}^T}{g}_k^j\left[{x}_k,{u}_k^1,{u}_k^2,\cdots, {u}_k^n\right]{\left(\frac{1}{1+r}\right)}^{k-1}+{\displaystyle \sum_{j=1}^n}{q}_{T+1}^j\left({x}_{T+1}\right)\;{\left(\frac{1}{1+r}\right)}^T, $$
(1.5)

subject to (1.1).

Invoking the discrete-time dynamic programming technique, an optimal solution to the control problem (1.1) and (1.5) can be characterized by the theorem below.

Theorem 1.2

A set of strategies \( \left\{{\psi}_k^i(x),\ \text{for}\ k\in \kappa\ \text{and}\ i\in N\right\} \) provides an optimal solution to the problem (1.1) and (1.5) if there exist functions \( W\left(k,x\right) \), for \( k\in \kappa \), such that the following recursive relations are satisfied:

$$ \begin{array}{l}W\left(k,x\right)=\underset{u_k^1,{u}_k^2,\cdots, {u}_k^n}{ \max}\Big\{{\displaystyle \sum_{j=1}^n}{g}_k^j\left[x,{u}_k^1,{u}_k^2,\cdots, {u}_k^n\right]{\left(\frac{1}{1+r}\right)}^{k-1}\\ {}+W\left[k+1,{f}_k\left(x,{u}_k^1,{u}_k^2,\cdots, {u}_k^n\right)\right]\Big\}\\ {}={\displaystyle \sum_{j=1}^n{g}_k^j\left[x,{\psi}_k^1(x),{\psi}_k^2(x),\cdots, {\psi}_k^n(x)\right]}\;{\left(\frac{1}{1+r}\right)}^{k-1}\\ {}+W\left[k+1,{f}_k\left(x,{\psi}_k^1(x),{\psi}_k^2(x),\cdots, {\psi}_k^n(x)\right)\right],\end{array} $$
(1.6)
$$ W\left(T+1,x\right)={\displaystyle \sum_{j=1}^n}{q}_{T+1}^j(x)\kern0.24em {\left(\frac{1}{1+r}\right)}^T. $$
(1.7)

Proof

Follow the proof of discrete-time dynamic programming technique in Theorem A.5 of the Technical Appendices. ■

Substituting the optimal controls \( \left\{{\psi}_k^i(x),\ \text{for}\ k\in \kappa\ \text{and}\ i\in N\right\} \) into the state dynamics (1.1), one obtains the dynamics of the cooperative trajectory:

$$ {x}_{k+1}={f}_k\left({x}_k,{\psi}_k^1\left({x}_k\right),{\psi}_k^2\left({x}_k\right),\cdots, {\psi}_k^n\left({x}_k\right)\right), $$
(1.8)

for \( k\in \kappa \) and \( {x}_1={x}^0 \).
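The backward recursion (1.6) and (1.7) can be implemented directly once the state and controls are discretized. The sketch below assumes illustrative forms for the stage payoffs, the transition and the terminal benefits (none of them taken from the text, with \( n=2 \)) and computes \( W(k,\cdot) \) on a grid by exhaustive search over the joint controls:

```python
import numpy as np

# Backward recursion (1.6)-(1.7) on a discretized state/action grid.
# The stage payoffs, transition and terminal benefits below are
# illustrative assumptions, not forms taken from the text.
T, r = 3, 0.05
d = 1.0 / (1.0 + r)
X = np.linspace(1.0, 10.0, 40)        # state grid
U = np.linspace(0.0, 1.5, 16)         # per-player control grid, n = 2

def g_sum(x, u1, u2):                 # sum over j of g_k^j(x, u1, u2)
    return (u1 - 0.5 * u1**2 / x) + (u2 - 0.5 * u2**2 / x)

def f(x, u1, u2):                     # state transition (1.1)
    return x + 1.0 - 0.1 * x - u1 - u2

W = 1.0 * X * d**T                    # W(T+1, x) = sum_j q_{T+1}^j(x) d^T, cf. (1.7)
policy = []
for k in range(T, 0, -1):             # stages T, T-1, ..., 1
    Wk = np.empty_like(X)
    pk = np.empty((len(X), 2))
    for ix, x in enumerate(X):
        best, arg = -np.inf, (0.0, 0.0)
        for u1 in U:                  # exhaustive search over joint controls
            for u2 in U:
                xn = np.clip(f(x, u1, u2), X[0], X[-1])
                v = g_sum(x, u1, u2) * d**(k - 1) + np.interp(xn, X, W)
                if v > best:
                    best, arg = v, (u1, u2)
        Wk[ix], pk[ix] = best, arg
    W, policy = Wk, [pk] + policy     # W now holds W(k, .); policy[0] is psi_k
```

After the loop, W approximates \( W(1,\cdot) \) on the grid and policy[k-1] approximates the group optimal strategies \( {\psi}_k \) at stage k.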

Let \( {\left\{{x}_k^{*}\;\right\}}_{k=1}^T \) denote the solution to (1.8) and hence the optimal cooperative path. The total cooperative payoff over the stages from k to \( T+1 \) can be expressed as:

$$ \begin{array}{l}W\left(k,{x}_k^{*}\right)={\displaystyle \sum_{\zeta =k}^T{\displaystyle \sum_{j=1}^n}}{g}_{\zeta}^j\left[{x}_{\zeta}^{*},{\psi}_{\zeta}^1\left({x}_{\zeta}^{*}\right),{\psi}_{\zeta}^2\left({x}_{\zeta}^{*}\right),\cdots, {\psi}_{\zeta}^n\left({x}_{\zeta}^{*}\right)\right]\;{\left(\frac{1}{1+r}\right)}^{\zeta -1}\\ {}+{\displaystyle \sum_{j=1}^n}{q}_{T+1}^j\left({x}_{T+1}^{*}\right)\;{\left(\frac{1}{1+r}\right)}^T,\qquad \text{for}\ k\in \kappa .\end{array} $$
(1.9)

We then proceed to consider individual rationality.

1.3.2 Individual Rationality

The players have to agree on an optimality principle in distributing the total cooperative payoff among themselves. For individual rationality to be upheld, the payoff a player receives under cooperation has to be no less than his noncooperative payoff along the cooperative state trajectory. For instance, (i) the players may share the excess of the total cooperative payoff over the sum of individual noncooperative payoffs equally, or (ii) they may share the total cooperative payoff proportionally to their noncooperative payoffs.

Let \( \xi \left(\cdot, \cdot \right) \) denote the imputation vector guiding the distribution of the total cooperative payoff under the agreed-upon optimality principle along the cooperative trajectory \( {\left\{{x}_k^{*}\;\right\}}_{k=1}^T \). At stage k, the imputation vector according to \( \xi \left(\cdot, \cdot \right) \) is \( \xi \left(k,{x}_k^{*}\right)=\left[{\xi}^1\left(k,{x}_k^{*}\right),{\xi}^2\left(k,{x}_k^{*}\right),\dots, {\xi}^n\left(k,{x}_k^{*}\right)\right] \), for \( k\in \kappa \).

If for example, the optimality principle specifies that the players share the excess of the total cooperative payoff over the sum of individual noncooperative payoffs equally, then the imputation to player i becomes:

$$ {\xi}^i\left(k,{x}_k^{*}\right)={V}^i\left(k,{x}_k^{*}\right)+\frac{1}{n}\left[W\left(k,{x}_k^{*}\right)-{\displaystyle \sum_{j=1}^n{V}^j\left(k,{x}_k^{*}\right)}\right], $$
(1.10)

for \( i\in N \) and \( k\in \kappa \).

If the optimality principle specifies that the players share the total cooperative payoff proportionally to their noncooperative payoffs, then the imputation to player i becomes:

$$ {\xi}^i\left(k,{x}_k^{*}\right)=\frac{V^i\left(k,{x}_k^{*}\right)}{{\displaystyle \sum_{j=1}^n{V}^j\left(k,{x}_k^{*}\right)}}W\left(k,{x}_k^{*}\right), $$
(1.11)

for \( i\in N \) and \( k\in \kappa \).

For individual rationality to be maintained throughout all the stages \( k\in \kappa \), it is required that:

$$ {\xi}^i\left(k,{x}_k^{*}\right)\ge {V}^i\left(k,{x}_k^{*}\right),\ \mathrm{f}\mathrm{o}\mathrm{r}\ i\in N\ \mathrm{and}\ k\in \kappa . $$
(1.12)

In particular, the above condition guarantees that the payoff allocated to a player under cooperation will be no less than his noncooperative payoff.

To satisfy group optimality , the imputation vector has to satisfy

$$ \begin{array}{cc}\hfill W\left(k,{x}_k^{*}\right)={\displaystyle \sum_{j=1}^n}{\xi}^j\left(k,{x}_k^{*}\right),\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \kappa .\hfill \end{array} $$
(1.13)

This condition guarantees the highest joint payoffs for the participating players.
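The two sharing rules (1.10) and (1.11) are straightforward to state in code. A minimal sketch, with V the vector of noncooperative payoffs \( {V}^i\left(k,{x}_k^{*}\right) \) and W the cooperative payoff \( W\left(k,{x}_k^{*}\right) \) at a given stage:

```python
def imputation_equal_surplus(V, W):
    """Equal sharing of the cooperative surplus, formula (1.10):
    xi^i = V^i + (W - sum_j V^j) / n."""
    n = len(V)
    surplus = W - sum(V)
    return [v + surplus / n for v in V]

def imputation_proportional(V, W):
    """Sharing proportional to noncooperative payoffs, formula (1.11):
    xi^i = V^i / (sum_j V^j) * W."""
    total = sum(V)
    return [v / total * W for v in V]
```

Both rules satisfy group optimality (1.13) by construction, and individual rationality (1.12) whenever W is at least the sum of the noncooperative payoffs.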

2 Subgame Consistent Solutions and Payment Mechanism

To guarantee dynamical stability in a dynamic cooperation scheme, the solution has to satisfy the property of subgame consistency. In particular, the agreed-upon optimality principle must remain effective at any stage of the game along the optimal state trajectory. Since at any stage the players are guided by the same optimality principle, they have no ground for deviating from the previously adopted optimal behavior throughout the game. Therefore, for subgame consistency to be satisfied, the imputation \( \xi \left(\cdot, \cdot \right) \) according to the original optimality principle has to be maintained at all T stages along the cooperative trajectory \( {\left\{{x}_k^{*}\right\}}_{k=1}^T \). In other words, the imputation

$$ \xi \left(k,{x}_k^{*}\right)=\left[{\xi}^1\left(k,{x}_k^{*}\right),{\xi}^2\left(k,{x}_k^{*}\right),\dots, {\xi}^n\left(k,{x}_k^{*}\right)\right]\ \mathrm{at}\ \mathrm{stage}\ k, $$
(2.1)

for k ∈ κ

has to be upheld.

Crucial to the analysis is the formulation of a payment mechanism so that the imputation in (2.1) can be realized.

2.1 Payoff Distribution Procedure

Similar to the analysis of cooperative differential games, we first formulate a Payoff Distribution Procedure (PDP) so that the agreed-upon imputations (2.1) can be realized. Let \( {B}_k^i\left({x}_k^{*}\right) \) denote the payment that player i will receive at stage k under the cooperative agreement along the cooperative trajectory \( {\left\{{x}_k^{*}\right\}}_{k=1}^T \).

The payment scheme involving \( {B}_k^i\left({x}_k^{*}\right) \) constitutes a PDP in the sense that the imputation to player i over the stages from k to \( T+1 \) can be expressed as:

$$ {\xi}^i\left(k,{x}_k^{*}\right)={B}_k^i\left({x}_k^{*}\right){\left(\frac{1}{1+r}\right)}^{k-1}+{\displaystyle \sum_{\zeta =k+1}^T}{B}_{\zeta}^i\left({x}_{\zeta}^{*}\right){\left(\frac{1}{1+r}\right)}^{\zeta -1}+{q}_{T+1}^i\left({x}_{T+1}^{*}\right){\left(\frac{1}{1+r}\right)}^T, $$
(2.2)

for \( i\in N \) and \( k\in \kappa \).

Using (2.2) one can obtain

$$ {\xi}^i\left(k+1,{x}_{k+1}^{*}\right)={B}_{k+1}^i\left({x}_{k+1}^{*}\right){\left(\frac{1}{1+r}\right)}^k+{\displaystyle \sum_{\zeta =k+2}^T}{B}_{\zeta}^i\left({x}_{\zeta}^{*}\right){\left(\frac{1}{1+r}\right)}^{\zeta -1}+{q}_{T+1}^i\left({x}_{T+1}^{*}\right){\left(\frac{1}{1+r}\right)}^T. $$
(2.3)

Substituting (2.3) into (2.2) yields

$$ {\xi}^i\left(k,{x}_k^{*}\right)={B}_k^i\left({x}_k^{*}\right){\left(\frac{1}{1+r}\right)}^{k-1}+{\xi}^i\left[k+1,\kern0.3em {f}_k\left({x}_k^{*},{\psi}_k\left({x}_k^{*}\right)\right)\right], $$
(2.4)

for \( i\in N \) and \( k\in \kappa \).

A theorem characterizing a formula for \( {B}_k^i\left({x}_k^{*}\right) \), for \( k\in \kappa \) and \( i\in N \), which yields (2.2) is provided below.

Theorem 2.1

A payment equaling

$$ {B}_k^i\left({x}_k^{*}\right)={\left(1+r\right)}^{k-1}\left\{{\xi}^i\left(k,{x}_k^{*}\right)-{\xi}^i\left[k+1,{f}_k\left({x}_k^{*},{\psi}_k\left({x}_k^{*}\right)\right)\right]\right\}, $$
(2.5)

for \( i\in N \),

given to player i at stage \( k\in \left\{1,2,\cdots, T\right\} \) along the cooperative trajectory \( {\left\{{x}_k^{*}\right\}}_{k=1}^T \) would lead to the realization of the imputation \( \left\{\xi \left(k,{x}_k^{*}\right),\ \text{for}\ k\in \kappa \right\} \).

Proof

From (2.4), one can readily obtain (2.5). Theorem 2.1 can also be verified alternatively by showing that from (2.2)

$$ \begin{array}{l}{\xi}^i\left(k,{x}_k^{*}\right)={B}_k^i\left({x}_k^{*}\right){\left(\frac{1}{1+r}\right)}^{k-1}+{\displaystyle \sum_{\zeta =k+1}^T}{B}_{\zeta}^i\left({x}_{\zeta}^{*}\right){\left(\frac{1}{1+r}\right)}^{\zeta -1}+{q}_{T+1}^i\left({x}_{T+1}^{*}\right){\left(\frac{1}{1+r}\right)}^T\\ {}=\left\{{\xi}^i\left(k,{x}_k^{*}\right)-{\xi}^i\left[k+1,{f}_k\left({x}_k^{*},{\psi}_k\left({x}_k^{*}\right)\right)\right]\right\}\\ {}+{\displaystyle \sum_{\zeta =k+1}^T}\left\{{\xi}^i\left(\zeta, {x}_{\zeta}^{*}\right)-{\xi}^i\left[\zeta +1,{f}_{\zeta}\left({x}_{\zeta}^{*},{\psi}_{\zeta}\left({x}_{\zeta}^{*}\right)\right)\right]\right\}+{q}_{T+1}^i\left({x}_{T+1}^{*}\right){\left(\frac{1}{1+r}\right)}^T\\ {}={\xi}^i\left(k,{x}_k^{*}\right);\end{array} $$

and \( {\xi}^i\left(T+1,{x}_{T+1}^{*}\right)={q}_{T+1}^i\left({x}_{T+1}^{*}\right){\left(\frac{1}{1+r}\right)}^T \).

Hence Theorem 2.1 follows. ■

The payment scheme in Theorem 2.1 gives rise to the realization of the imputation guided by the agreed-upon optimality principle and will be used to derive time (optimal-trajectory-subgame) consistent solutions in the next subsection.
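In code, Theorem 2.1 amounts to differencing the imputation stream and undoing the discounting. A minimal sketch, assuming each player's imputation values \( {\xi}^i(1),\dots, {\xi}^i(T+1) \) (in stage-1 present value, with \( {\xi}^i(T+1) \) the discounted terminal benefit) have already been computed along the cooperative trajectory:

```python
def pdp_payments(xi, r):
    """Formula (2.5): B_k^i = (1+r)**(k-1) * (xi^i(k) - xi^i(k+1)).
    xi is the list [xi^i(1), ..., xi^i(T+1)]; returns [B_1^i, ..., B_T^i]."""
    return [(1 + r)**k * (xi[k] - xi[k + 1]) for k in range(len(xi) - 1)]

def realized_imputation(B, terminal, r, k=1):
    """Check (2.2): discounted payments from stage k onward plus the
    (already discounted) terminal term should reproduce xi^i(k)."""
    d = 1.0 / (1.0 + r)
    return sum(b * d**(z + k - 1) for z, b in enumerate(B)) + terminal
```

For instance, with hypothetical values \( {\xi}^i = [10, 7, 5, 2] \) and \( r = 0.1 \), the payments are [3, 2.2, 3.63], and discounting them back telescopes to \( {\xi}^i(1)=10 \).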

2.2 Subgame Consistent Solution

We denote the discrete-time cooperative game with dynamics (1.1) and payoff (1.2) by \( {\Gamma}_c\left(1,{x}^0\right) \). We then denote the game with dynamics (1.1) and payoff (1.2) which starts at stage \( \upsilon \) with initial state \( {x}_{\upsilon}^{*} \) by \( {\Gamma}_c\left(\upsilon, {x}_{\upsilon}^{*}\right) \). Moreover, we let \( P\left(1,{x}^0\right)=\left\{{u}_h^i\ \text{and}\ {B}_h^i\ \text{for}\ h\in \kappa\ \text{and}\ i\in N,\ \xi \left(1,{x}^0\right)\right\} \) denote the agreed-upon optimality principle for the cooperative game \( {\Gamma}_c\left(1,{x}^0\right) \). Let \( P\left(\upsilon, {x}_{\upsilon}^{*}\right)=\left\{{u}_h^i\ \text{and}\ {B}_h^i\ \text{for}\ h\in \left\{\upsilon, \upsilon +1,\dots, T\right\}\ \text{and}\ i\in N,\ \xi \left(\upsilon, {x}_{\upsilon}^{*}\right)\right\} \) denote the optimality principle of the cooperative game \( {\Gamma}_c\left(\upsilon, {x}_{\upsilon}^{*}\right) \) according to the original agreement.

A theorem characterizing a subgame consistent solution for the discrete-time cooperative game \( {\Gamma}_c\left(1,{x}^0\right) \) is presented below.

Theorem 2.2

For the cooperative game \( {\Gamma}_c\left(1,{x}^0\right) \) with optimality principle \( P\left(1,{x}^0\right)=\left\{{u}_h^i\ \text{and}\ {B}_h^i\ \text{for}\ h\in \kappa\ \text{and}\ i\in N,\ \xi \left(1,{x}^0\right)\right\} \) in which

  1. (i)

    \( {u}_h^i={\psi}_h^i\left({x}_h^{*}\right) \), for \( h\in \kappa \) and \( i\in N \), is the set of group optimal strategies for the game \( {\Gamma}_c\left(1,{x}^0\right) \), and

  2. (ii)

    \( {B}_h^i={B}_h^i\left({x}_h^{*}\right) \), for \( h\in \kappa \) and \( i\in N \), where

$$ {B}_h^i\left({x}_h^{*}\right)={\left(1+r\right)}^{h-1}\left\{{\xi}^i\left(h,{x}_h^{*}\right)-{\xi}^i\left[h+1,{f}_h\left({x}_h^{*},{\psi}_h\left({x}_h^{*}\right)\right)\right]\right\}, $$
(2.6)

and \( \left[{\xi}^1\left(h,{x}_h^{*}\right),{\xi}^2\left(h,{x}_h^{*}\right),\cdots, {\xi}^n\left(h,{x}_h^{*}\right)\right] \) is the imputation according to the optimality principle \( P\left(h,{x}_h^{*}\right) \);

is subgame consistent.

Proof

Follow the proof of the continuous-time analog in Theorem 2.2 of Chap. 3. ■

When all players are using the cooperative strategies, the payoff that player i will directly receive at stage k along the cooperative trajectory \( {\left\{{x}_k^{*}\right\}}_{k=1}^T \) is

$$ {g}_k^i\left[{x}_k^{*},{\psi}_k^1\left({x}_k^{*}\right),{\psi}_k^2\left({x}_k^{*}\right),\cdots, {\psi}_k^n\left({x}_k^{*}\right)\right]. $$

However, according to the agreed-upon imputation, player i is to receive \( {B}_k^i\left({x}_k^{*}\right) \) at stage k. Therefore a side-payment

$$ {\varpi}_k^i\left({x}_k^{*}\right)={B}_k^i\left({x}_k^{*}\right)-{g}_k^i\left[{x}_k^{*},{\psi}_k^1\left({x}_k^{*}\right),{\psi}_k^2\left({x}_k^{*}\right),\cdots, {\psi}_k^n\left({x}_k^{*}\right)\right], $$
(2.7)

for \( k\in \kappa \) and \( i\in N \),

will be given to player i to yield the cooperative imputation \( {\xi}^i\left(k,{x}_k^{*}\right) \).

3 An Illustration in Cooperative Resource Extraction

Consider an economy endowed with a renewable resource and with two resource extractors (firms). The lease for resource extraction begins at stage 1 and ends at stage 3 for these two firms. Let \( {u}_k^i \) denote the amount of resource extracted by firm i at stage k, for \( i\in \left\{1,2\right\} \). Let \( {U}^i \) be the set of admissible extraction rates, and \( {x}_k\in X\subset {R}^{+} \) the size of the resource stock at stage k. The extraction cost for firm \( i\in \left\{1,2\right\} \) depends on the quantity of resource extracted \( {u}_k^i \), the resource stock size \( {x}_k \), and cost parameters \( {c}_1 \) and \( {c}_2 \). The extraction cost for firm i at stage k is specified as \( {c}_i{\left({u}_k^i\right)}^2/{x}_k \). The price of the resource is P.

The profits that firm 1 and firm 2 will obtain at stage k are respectively:

$$ \left[P{u}_k^1-\frac{c_1}{x_k}{\left({u}_k^1\right)}^2\right]\kern0.75em \text{and}\kern0.75em \left[P{u}_k^2-\frac{c_2}{x_k}{\left({u}_k^2\right)}^2\right]. $$
(3.1)

At stage 4, each firm will receive a salvage value of \( q{x}_4 \).

The growth dynamics of the resource is governed by the difference equation:

$$ {x}_{k+1}={x}_k+a-b{x}_k-{\displaystyle \sum_{j=1}^2{u}_k^j}, $$
(3.2)

for \( k\in \left\{1,2,3\right\} \) and \( {x}_1={x}^0 \).

There exists an extraction constraint: human harvesting can exploit at most a proportion Y of the existing biomass, hence \( {u}_k^1+{u}_k^2\le Y{x}_k \). Moreover, \( b<1-Y \). Extractor \( i\in \left\{1,2\right\} \) seeks to maximize the present value of its stream of profits:

$$ {\displaystyle \sum_{k=1}^3}\left[P{u}_k^i-\frac{c_i}{x_k}{\left({u}_k^i\right)}^2\right]{\left(\frac{1}{1+r}\right)}^{k-1}+{\left(\frac{1}{1+r}\right)}^3q{x}_4,\ \text{for}\ i\in \left\{1,2\right\}, $$
(3.3)

subject to (3.2).

Invoking Theorem 1.1, one can characterize the feedback Nash equilibrium strategies of the game (3.2 and 3.3). In particular, a set of strategies \( \left\{{\phi}_k^i(x),\ \text{for}\ k\in \left\{1,2,3\right\}\ \text{and}\ i\in \left\{1,2\right\}\right\} \) provides a feedback Nash equilibrium solution to the game (3.2 and 3.3) if there exist functions \( {V}^i\left(k,x\right) \), for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \), such that the following recursive relations are satisfied:

$$ \begin{array}{l}{V}^i\left(k,x\right)=\underset{u_k^i}{ \max}\Big\{\left[P{u}_k^i-\frac{c_i}{x}{\left({u}_k^i\right)}^2\right]{\left(\frac{1}{1+r}\right)}^{k-1}\\ {}+{V}^i\left[k+1,x+a-bx-{u}_k^i-{\phi}_k^j(x)\right]\Big\},\ \text{for}\ k\in \left\{1,2,3\right\}\ \text{and}\ j\ne i;\\ {}{V}^i\left(4,x\right)={\left(\frac{1}{1+r}\right)}^3qx.\end{array} $$
(3.4)

Performing the indicated maximization in (3.4) yields:

$$ \left(P-\frac{2{c}_i{u}_k^i}{x}\right){\left(\frac{1}{1+r}\right)}^{k-1}-{V}_{x_{k+1}}^i\left[k+1,x+a-bx-{u}_k^i-{\phi}_k^j(x)\right]=0, $$
(3.5)
for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \).

From (3.5), the game equilibrium strategies can be expressed as:

$$ {\phi}_k^i(x)=\left(P-{V}_{x_{k+1}}^i\left[k+1,x+a-bx-{\displaystyle \sum_{\ell =1}^2{\phi}_k^{\ell }(x)}\right]{\left(1+r\right)}^{k-1}\right)\frac{x}{2{c}_i}, $$
(3.6)

for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \).

The game equilibrium profits of the firms can be obtained as:

Proposition 3.1

The value function indicating the game equilibrium profit of firm i is:

$$ {V}^i\left(k,x\right)=\left[{A}_k^ix+{C}_k^i\right],\ \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}\ \mathrm{and}\ k\in \left\{1,2,3\right\}, $$
(3.7)

where \( {A}_k^i \) and \( {C}_k^i \), for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \), are constants in terms of the parameters of the game (3.2 and 3.3).

Proof

See Appendix A of this Chapter. ■

Substituting the relevant derivatives of the value functions in Proposition 3.1 into the game equilibrium strategies (3.6) yields a noncooperative feedback equilibrium solution of the game (3.2 and 3.3).

Now consider the case when the extractors agree to maximize their joint profit and share the excess of cooperative gains over their noncooperative payoffs equally. To maximize their joint payoff, they solve the problem of maximizing

$$ {\displaystyle \sum_{j=1}^2}{\displaystyle \sum_{k=1}^3}\left[P{u}_k^j-\frac{c_j}{x_k}{\left({u}_k^j\right)}^2\right]{\left(\frac{1}{1+r}\right)}^{k-1}+2{\left(\frac{1}{1+r}\right)}^3q{x}_4, $$
(3.8)

subject to (3.2).

Invoking Theorem 1.2, one can characterize the optimal controls in the dynamic programming problem (3.2) and (3.8). In particular, a set of control strategies \( \left\{{\psi}_k^i(x),\ \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \left\{1,2,3\right\}\ \mathrm{and}\ i\in \left\{1,2\right\}\right\} \) provides an optimal solution to the problem (3.2) and (3.8) if there exist functions W(k, x): \( R\to R \), for \( k\in \left\{1,2,3\right\} \), such that the following recursive relations are satisfied:

$$ \begin{array}{l}W\left(k,x\right)=\underset{u_k^1,{u}_k^2}{ \max}\Big\{{\displaystyle \sum_{j=1}^2}\left[P{u}_k^j-\frac{c_j}{x}{\left({u}_k^j\right)}^2\right]{\left(\frac{1}{1+r}\right)}^{k-1}\\ {}+W\left[k+1,x+a-bx-{\displaystyle \sum_{j=1}^2{u}_k^j}\right]\Big\},\ \text{for}\ k\in \left\{1,2,3\right\};\\ {}W\left(4,x\right)=2{\left(\frac{1}{1+r}\right)}^3qx.\end{array} $$
(3.9)

Performing the indicated maximization in (3.9) yields:

$$ \left(P-\frac{2{c}_i{u}_k^i}{x}\right){\left(\frac{1}{1+r}\right)}^{k-1}-{W}_{x_{k+1}}\left[k+1,x+a-bx-{\displaystyle \sum_{j=1}^2{u}_k^j}\right]=0, $$
(3.10)

for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \).

In particular, the optimal cooperative strategies can be obtained from (3.10) as:

$$ {u}_k^i=\left(P-{W}_{x_{k+1}}\left[k+1,x+a-bx-{\displaystyle \sum_{j=1}^2{u}_k^j}\right]{\left(1+r\right)}^{k-1}\right)\frac{x}{2{c}_i}, $$
(3.11)

for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \).

The firms’ joint profit under cooperation can be obtained as:

Proposition 3.2

The value function indicating the maximized joint payoff is

$$ W\left(k,x\right)=\left[{A}_kx+{C}_k\right],\ \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \left\{1,2,3\right\}, $$
(3.12)

where \( {A}_k \) and \( {C}_k \), for \( k\in \left\{1,2,3\right\} \), are constants in terms of the parameters of the problem (3.8) and (3.2).

Proof

See Appendix B of this Chapter. ■

Using (3.11) and Proposition 3.2, the optimal cooperative strategies of the players can be expressed as:

$$ \begin{array}{cc}\hfill {\psi}_k^i(x)=\left[P-{A}_{k+1}{\left(1+r\right)}^{k-1}\right]\frac{x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}\ \mathrm{and}\ k\in \left\{1,2,3\right\}.\hfill \end{array} $$
(3.13)

Substituting \( {\psi}_k^i(x) \) from (3.13) into (3.2) yields the optimal cooperative state trajectory:

$$ {x}_{k+1}={x}_k+a-b{x}_k-{\displaystyle \sum_{j=1}^2}\left[P-{A}_{k+1}{\left(1+r\right)}^{k-1}\right]\frac{x_k}{2{c}_j}, $$
(3.14)

for \( k\in \left\{1,2,3\right\} \) and \( {x}_1={x}^0 \).

Dynamics (3.14) is a linear difference equation readily solvable by standard techniques. Let \( \left\{{x}_k^{*},\ \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \left\{1,2,3\right\}\right\} \) denote the solution to (3.14).
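The constants in Proposition 3.2 can be generated by the backward recursion obtained from substituting the strategies (3.13) into the Bellman equation (3.9), after which the trajectory (3.14) follows by forward iteration. A sketch under hypothetical parameter values (the numbers for P, \( {c}_1 \), \( {c}_2 \), a, b, q, r and \( {x}^0 \) below are illustrative choices, not values from the text):

```python
# Hypothetical parameter values; the A_k, C_k recursion follows from
# substituting the strategies (3.13) into the Bellman equation (3.9).
P, c1, c2, a, b, q, r, x0 = 2.0, 2.0, 3.0, 5.0, 0.1, 0.5, 0.05, 20.0
T = 3
d = 1.0 / (1.0 + r)
inv2 = 1.0 / (2 * c1) + 1.0 / (2 * c2)

A = {T + 1: 2 * q * d**T}              # W(4, x) = 2 q x d^3
C = {T + 1: 0.0}
for k in range(T, 0, -1):
    s = A[k + 1] * (1 + r)**(k - 1)    # A_{k+1} (1+r)^{k-1}, cf. (3.13)
    A[k] = d**(k - 1) * (P**2 - s**2) * inv2 / 2 + A[k + 1] * (1 - b - (P - s) * inv2)
    C[k] = a * A[k + 1] + C[k + 1]

# Cooperative strategies (3.13) and optimal trajectory (3.14)
x = {1: x0}
for k in range(1, T + 1):
    u = [(P - A[k + 1] * (1 + r)**(k - 1)) * x[k] / (2 * c) for c in (c1, c2)]
    x[k + 1] = x[k] + a - b * x[k] - sum(u)
```

The stage-payoff coefficient \( \left({P}^2-{s}^2\right)/\left(4{c}_j\right) \) in the recursion comes from evaluating \( P{u}-{c}_j{u}^2/x \) at \( u=\left(P-s\right)x/\left(2{c}_j\right) \); the extraction constraint is assumed non-binding at these parameter values.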

Since the extractors agree to share the excess of cooperative gains over their noncooperative payoffs equally, an imputation

$$ \begin{array}{l}{\xi}^i\left(k,{x}_k^{*}\right)={V}^i\left(k,{x}_k^{*}\right)+\frac{1}{2}\left[W\left(k,{x}_k^{*}\right)-{\displaystyle \sum_{j=1}^2{V}^j\left(k,{x}_k^{*}\right)}\right]\\ {}=\left({A}_k^i{x}_k^{*}+{C}_k^i\right)+\frac{1}{2}\left[\left({A}_k{x}_k^{*}+{C}_k\right)-{\displaystyle \sum_{j=1}^2\left({A}_k^j{x}_k^{*}+{C}_k^j\right)}\right],\end{array} $$
(3.15)

for \( k\in \left\{1,2,3\right\} \) and \( i\in \left\{1,2\right\} \) has to be maintained.

Invoking Theorem 2.1, if \( {x}_k^{*}\in X \) is realized at stage k a payment equaling

$$ \begin{array}{l}{B}_k^i\left({x}_k^{*}\right)={\left(1+r\right)}^{k-1}\left\{{\xi}^i\left(k,{x}_k^{*}\right)-{\xi}^i\left[k+1,{f}_k\left({x}_k^{*},{\psi}_k\left({x}_k^{*}\right)\right)\right]\right\}\\ {}={\left(1+r\right)}^{k-1}\Big\{\left({A}_k^i{x}_k^{*}+{C}_k^i\right)+\frac{1}{2}\left[\left({A}_k{x}_k^{*}+{C}_k\right)-{\displaystyle \sum_{j=1}^2\left({A}_k^j{x}_k^{*}+{C}_k^j\right)}\right]\\ {}-\left({A}_{k+1}^i{x}_{k+1}^{*}+{C}_{k+1}^i\right)-\frac{1}{2}\left[\left({A}_{k+1}{x}_{k+1}^{*}+{C}_{k+1}\right)-{\displaystyle \sum_{j=1}^2\left({A}_{k+1}^j{x}_{k+1}^{*}+{C}_{k+1}^j\right)}\right]\Big\},\end{array} $$
(3.16)

for i ∈ {1,2};

given to player i at stage \( k\in \kappa \) would lead to the realization of the imputation (3.15).

A subgame consistent solution can be readily obtained from (3.13), (3.15) and (3.16).

4 Cooperative Stochastic Dynamic Games

In this Section we present the basic framework of discrete-time cooperative stochastic dynamic games.

4.1 Game Formulation

Consider the general \( T \)-stage \( n \)-person nonzero-sum discrete-time cooperative stochastic dynamic game with initial state \( {x}^0 \). The state space of the game is \( X\subset {R}^m \) and the state dynamics of the game are characterized by the stochastic difference equation:

$$ {x}_{k+1}={f}_k\left({x}_k,{u}_k^1,{u}_k^2,\cdots, {u}_k^n\right)+{G}_k\left({x}_k\right){\theta}_k, $$
(4.1)

for \( k\in \left\{1,2,\cdots, T\right\}\equiv \kappa \) and \( {x}_1={x}^0, \)

where \( {u}_k^i\in {R}^{m_i} \) is the control vector of player i at stage k, \( {x}_k\in X \) is the state, and θ k is a set of statistically independent random variables.

The objective of player i is

$$ \begin{array}{l}{E}_{\theta_1,{\theta}_2,\cdots, {\theta}_T}\left\{{\displaystyle \sum_{\zeta =1}^T}{g}_{\zeta}^i\left[{x}_{\zeta },{u}_{\zeta}^1,{u}_{\zeta}^2,\cdots, {u}_{\zeta}^n\right]{\left(\frac{1}{1+r}\right)}^{\zeta -1}+{q}_{T+1}^i\left({x}_{T+1}\right){\left(\frac{1}{1+r}\right)}^T\right\},\\ {}\text{for}\ i\in \left\{1,2,\cdots, n\right\}\equiv N,\end{array} $$
(4.2)

where r is the discount rate and \( {E}_{\theta_1,{\theta}_2,\cdots, {\theta}_T} \) is the expectation operation with respect to the statistics of θ 1, θ 2, \( \cdots, {\theta}_T \).

The payoffs of the players are transferable.

We then characterize the noncooperative outcome of the discrete-time stochastic dynamic game (4.1 and 4.2). Let \( \left\{{\phi}_k^i(x),\ \mathrm{for}\ k\in \kappa\ \mathrm{and}\ i\in N\right\} \) denote a set of strategies that provides a feedback Nash equilibrium solution (if it exists) to the game (4.1 and 4.2), and

$$ {V}^i\left(k,x\right)={E}_{\theta_k,{\theta}_{k+1},\cdots, {\theta}_T}\Big\{\sum_{\zeta =k}^T g_{\zeta}^i\left[x_{\zeta },\phi_{\zeta}^1\left(x_{\zeta}\right),\phi_{\zeta}^2\left(x_{\zeta}\right),\dots, \phi_{\zeta}^n\left(x_{\zeta}\right)\right]{\left(\frac{1}{1+r}\right)}^{\zeta -1}+q_{T+1}^i\left(x_{T+1}\right){\left(\frac{1}{1+r}\right)}^T\Big\}, $$

where \( {x}_k=x \); for \( k\in \kappa \) and \( i\in N \), \( {V}^i\left(k,x\right) \) denotes the value function indicating the expected game equilibrium payoff to player i over the stages from k to \( T+1 \).

A frequently used way to characterize and derive a feedback Nash equilibrium of the game is provided in the theorem below.

Theorem 4.1

A set of strategies \( \left\{{\phi}_k^i(x),\ \mathrm{for}\ k\in \kappa\ \mathrm{and}\ i\in N\right\} \) provides a feedback Nash equilibrium solution to the game (4.1 and 4.2) if there exist functions V i(k, x), for \( k\in \kappa \) and \( i\in N \), such that the following recursive relations are satisfied:

$$ \begin{aligned} {V}^i\left(k,x\right)&=\underset{u_k^i}{ \max }{E}_{\theta_k}\Big\{g_k^i\left[x,\phi_k^1(x),\phi_k^2(x),\dots, \phi_k^{i-1}(x),u_k^i,\phi_k^{i+1}(x),\dots, \phi_k^n(x)\right]{\left(\frac{1}{1+r}\right)}^{k-1}\\ &\quad+{V}^i\left[k+1,{\tilde{f}}_k^i\left(x,u_k^i\right)+{G}_k(x){\theta}_k\right]\Big\}\\ &={E}_{\theta_k}\Big\{g_k^i\left[x,\phi_k^1(x),\phi_k^2(x),\dots, \phi_k^n(x)\right]{\left(\frac{1}{1+r}\right)}^{k-1}\\ &\quad+{V}^i\left[k+1,f_k\left(x,\phi_k^1(x),\phi_k^2(x),\dots, \phi_k^n(x)\right)+{G}_k(x){\theta}_k\right]\Big\} \end{aligned} $$
(4.3)
$$ {V}^i\left(T+1,x\right)={q}_{T+1}^i(x){\left(\frac{1}{1+r}\right)}^T; $$
(4.4)

for \( i\in N \) and \( k\in \kappa \),

where \( {\tilde{f}}_k^i\left(x,{u}_k^i\right)={f}_k\left[x,{\phi}_k^1(x),{\phi}_k^2(x),\dots, {\phi}_k^{i-1}(x),{u}_k^i,{\phi}_k^{i+1}(x),\dots, {\phi}_k^n(x)\right] \) and \( {E}_{\theta_k} \) is the expectation operation with respect to the statistics of θ k .

Proof

Invoking the discrete-time stochastic dynamic programming technique in Theorem A.6 of the Technical Appendices, V i(k, x) is the maximized expected payoff of player i given the strategies \( \left\{{\phi}_k^j(x),\ \mathrm{for}\ j\in N\ \mathrm{and}\ j\ne i\right\} \) of the other \( n-1 \) players. Hence a Nash equilibrium results. ■

Again, for the sake of exposition, we sidestep the issue of multiple equilibria and focus on solvable games in which a particular noncooperative Nash equilibrium is adopted by the players throughout the game.
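As a concrete illustration of how the recursive relations of Theorem 4.1 can be solved numerically, the sketch below computes a feedback Nash equilibrium by backward induction, with a best-response iteration over discretized state and control grids at each stage. The two-player dynamics, payoffs and every parameter value are hypothetical assumptions chosen only for illustration, and best-response iteration is not guaranteed to converge in general.

```python
import numpy as np

# Assumed illustrative model: dynamics x' = x + a - u1 - u2 + theta,
# stage payoffs g_i = P*u_i - c_i*u_i^2/x, salvage q^i(x) = 0.5*x.
T, r, P, a = 3, 0.05, 2.0, 5.0
c = [1.0, 1.5]                                  # assumed cost parameters
thetas = np.array([-0.5, 0.0, 0.5])             # assumed discrete shocks
probs = np.array([0.25, 0.5, 0.25])
xgrid = np.linspace(1.0, 20.0, 60)              # state grid
ugrid = np.linspace(0.0, 3.0, 31)               # control grid

# Terminal condition in the spirit of (4.4) with assumed salvage value.
V = {T + 1: [0.5 * xgrid * (1 / (1 + r)) ** T for _ in range(2)]}
phi = {}
for k in range(T, 0, -1):
    disc = (1 / (1 + r)) ** (k - 1)
    u = [np.zeros_like(xgrid), np.zeros_like(xgrid)]
    for _ in range(30):                         # best-response iteration
        for i in (0, 1):
            j = 1 - i
            for m, x in enumerate(xgrid):
                # next states for every own control and every shock value
                xn = x + a - ugrid[None, :] - u[j][m] + thetas[:, None]
                cont = probs @ np.interp(xn, xgrid, V[k + 1][i])
                pay = (P * ugrid - c[i] * ugrid ** 2 / x) * disc + cont
                u[i][m] = ugrid[np.argmax(pay)]
    phi[k] = u
    V[k] = []                                   # evaluate the recursion (4.3)
    for i in (0, 1):
        vi = np.zeros_like(xgrid)
        for m, x in enumerate(xgrid):
            xn = x + a - u[0][m] - u[1][m] + thetas
            vi[m] = (P * u[i][m] - c[i] * u[i][m] ** 2 / x) * disc \
                    + probs @ np.interp(xn, xgrid, V[k + 1][i])
        V[k].append(vi)
```

Here `phi[k][i]` holds player i's equilibrium extraction rule on the state grid at stage k, and `V[1][i]` approximates the value function at stage 1.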

4.2 Dynamic Cooperation under Uncertainty

Now consider the case when the players agree to cooperate and distribute the payoff among themselves according to an optimality principle . Once again, the essential properties of group optimality and individual rationality have to be satisfied. An agreed upon optimality principle entails group optimality and an imputation to distribute the total cooperative payoff among the players.

We first examine the group optimal solution and then the condition under which individual rationality will be maintained.

4.2.1 Group Optimality

Since payoffs are transferable, maximizing the players’ expected joint payoff guarantees group optimality. To maximize their expected joint payoff, the players have to solve the discrete-time stochastic dynamic programming problem of maximizing

$$ {E}_{\theta_1,{\theta}_2,\cdots, {\theta}_T}\Big\{\sum_{j=1}^n\sum_{k=1}^T g_k^j\left[x_k,u_k^1,u_k^2,\dots, u_k^n\right]{\left(\frac{1}{1+r}\right)}^{k-1}+\sum_{j=1}^n q_{T+1}^j\left(x_{T+1}\right){\left(\frac{1}{1+r}\right)}^T\Big\} $$
(4.5)

subject to (4.1).

Invoking the discrete-time stochastic dynamic programming technique, an optimal solution to the problem (4.1) and (4.5) can be characterized by the following theorem.

Theorem 4.2

A set of strategies \( \left\{{\psi}_k^i(x),\ \mathrm{for}\ k\in \kappa\ \mathrm{and}\ i\in N\right\} \) provides an optimal solution to the problem (4.1) and (4.5) if there exist functions W(k, x), for \( k\in \kappa \), such that the following recursive relations are satisfied:

$$ \begin{aligned} W\left(k,x\right)&=\underset{u_k^1,{u}_k^2,\dots, {u}_k^n}{ \max }{E}_{\theta_k}\Big\{\sum_{j=1}^n g_k^j\left[x,u_k^1,u_k^2,\dots, u_k^n\right]{\left(\frac{1}{1+r}\right)}^{k-1}\\ &\quad+W\left[k+1,f_k\left(x,u_k^1,u_k^2,\dots, u_k^n\right)+{G}_k(x){\theta}_k\right]\Big\}\\ &={E}_{\theta_k}\Big\{\sum_{j=1}^n g_k^j\left[x,\psi_k^1(x),\psi_k^2(x),\dots, \psi_k^n(x)\right]{\left(\frac{1}{1+r}\right)}^{k-1}\\ &\quad+W\left[k+1,f_k\left(x,\psi_k^1(x),\psi_k^2(x),\dots, \psi_k^n(x)\right)+{G}_k(x){\theta}_k\right]\Big\}, \end{aligned} $$
(4.6)
$$ W\left(T+1,x\right)={\displaystyle \sum_{j=1}^n}{q}_{T+1}^j(x){\left(\frac{1}{1+r}\right)}^T. $$
(4.7)

Proof

Follow the proof of the discrete-time stochastic dynamic programming technique in Theorem A.6 of the Technical Appendices. ■
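The joint-maximization recursion (4.6)-(4.7) admits the same grid-based numerical treatment as the noncooperative problem; since all controls are chosen by a single maximizer, a search over the product control grid replaces any best-response iteration. The two-player dynamics, payoffs and all parameter values below are illustrative assumptions.

```python
import numpy as np
from itertools import product

# Assumed illustrative model: x' = x + a - u1 - u2 + theta,
# stage payoffs P*u_j - c_j*u_j^2/x, salvage q^j(x) = 0.5*x for each player.
T, r, P, a = 3, 0.05, 2.0, 5.0
c = [1.0, 1.5]
thetas = np.array([-0.5, 0.0, 0.5])
probs = np.array([0.25, 0.5, 0.25])
xgrid = np.linspace(1.0, 20.0, 60)
ugrid = np.linspace(0.0, 3.0, 16)

# Terminal condition in the spirit of (4.7): sum of salvage values.
W = {T + 1: 2 * 0.5 * xgrid * (1 / (1 + r)) ** T}
psi = {}
for k in range(T, 0, -1):
    disc = (1 / (1 + r)) ** (k - 1)
    Wk = np.empty_like(xgrid)
    uk = []
    for m, x in enumerate(xgrid):
        best, arg = -np.inf, (0.0, 0.0)
        for u1, u2 in product(ugrid, repeat=2):   # joint control search
            xn = x + a - u1 - u2 + thetas
            val = (P * (u1 + u2) - (c[0] * u1**2 + c[1] * u2**2) / x) * disc \
                  + probs @ np.interp(xn, xgrid, W[k + 1])
            if val > best:
                best, arg = val, (u1, u2)
        Wk[m] = best
        uk.append(arg)
    W[k], psi[k] = Wk, uk
```

`W[k]` approximates the expected total cooperative payoff from stage k on, and `psi[k]` records the jointly optimal control pair at each grid state.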

Substituting the optimal control \( \left\{{\psi}_k^i(x),\ \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \kappa\ \mathrm{and}\ i\in N\right\} \) into the state dynamics (4.1), one can obtain the dynamics of the cooperative trajectory as:

$$ {x}_{k+1}={f}_k\left({x}_k,{\psi}_k^1\left({x}_k\right),{\psi}_k^2\left({x}_k\right),\dots, {\psi}_k^n\left({x}_k\right)\right)+{G}_k\left({x}_k\right){\theta}_k, $$
(4.8)

for \( k\in \kappa \) and \( {x}_1={x}^0 \).

We use \( {X}_k^{*} \) to denote the set of realizable values of \( x_k \) at stage k generated by (4.8). The term \( {x}_k^{*}\in {X}_k^{*} \) is used to denote an element of \( {X}_k^{*} \).

The term \( W\left(k,{x}_k^{*}\right) \) gives the expected total cooperative payoff over the stages from k to \( T+1 \) if \( {x}_k^{*}\in {X}_k^{*} \) is realized at stage \( k\in \kappa \). We then proceed to consider individual rationality .

4.2.2 Individual Rationality

The players have to agree to an optimality principle in distributing the total cooperative payoff among themselves. For individual rationality to be upheld, the expected payoff a player receives under cooperation has to be no less than his expected noncooperative payoff along the cooperative state trajectory. Let \( \xi \left(\cdot, \cdot \right) \) denote the imputation vector guiding the distribution of the total cooperative payoff under the agreed-upon optimality principle along the cooperative trajectory \( {\left\{{x}_k^{*}\;\right\}}_{k=1}^T \). At stage k, the imputation vector according to \( \xi \left(\cdot, \cdot \right) \) is \( \xi \left(k,{x}_k^{*}\right)=\left[{\xi}^1\left(k,{x}_k^{*}\right),{\xi}^2\left(k,{x}_k^{*}\right),\dots, {\xi}^n\left(k,{x}_k^{*}\right)\right] \), for \( k\in \kappa \).

For individual rationality to be maintained throughout all the stages \( k\in \kappa \), it is required that:

$$ {\xi}^i\left(k,{x}_k^{*}\right)\ge {V}^i\left(k,{x}_k^{*}\right),\ \mathrm{f}\mathrm{o}\mathrm{r}\ i\in N\ \mathrm{and}\ k\in \kappa . $$

In particular, the above condition guarantees that the expected payoff allocated to any player under cooperation will be no less than his expected noncooperative payoff.

To satisfy group optimality , the imputation vector has to satisfy

$$ \begin{array}{cc}\hfill W\left(k,{x}_k^{*}\right)={\displaystyle \sum_{j=1}^n}{\xi}^j\left(k,{x}_k^{*}\right),\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \kappa .\hfill \end{array} $$

This condition guarantees the highest expected joint payoffs for the participating players.

If the optimality principle specifies that the players share the excess of the expected total cooperative payoff over the sum of expected individual noncooperative payoffs equally, then the imputation to player i becomes:

$$ {\xi}^i\left(k,{x}_k^{*}\right)={V}^i\left(k,{x}_k^{*}\right)+\frac{1}{n}\left[W\left(k,{x}_k^{*}\right)-{\displaystyle \sum_{j=1}^n{V}^j\left(k,{x}_k^{*}\right)}\right], $$

for \( i\in N \) and \( k\in \kappa \).

If the optimality principle specifies that the players share the expected total cooperative payoff in proportion to their expected noncooperative payoffs, then the imputation to player i becomes:

$$ {\xi}^i\left(k,{x}_k^{*}\right)=\frac{V^i\left(k,{x}_k^{*}\right)}{{\displaystyle \sum_{j=1}^n{V}^j\left(k,{x}_k^{*}\right)}}W\left(k,{x}_k^{*}\right), $$

for \( i\in N \) and \( k\in \kappa \).
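The two sharing rules above are simple to compute once \( W\left(k,{x}_k^{*}\right) \) and the \( {V}^i\left(k,{x}_k^{*}\right) \) are known. A minimal numerical sketch, with assumed hypothetical values for the cooperative and noncooperative payoffs:

```python
# Assumed illustrative values at some stage k along the cooperative trajectory.
n = 3
W = 120.0                     # expected total cooperative payoff W(k, x_k*)
V = [30.0, 20.0, 10.0]        # expected noncooperative payoffs V^i(k, x_k*)

# Equal sharing of the excess of cooperative gains over noncooperative payoffs.
surplus = W - sum(V)
xi_equal = [V[i] + surplus / n for i in range(n)]

# Sharing in proportion to the expected noncooperative payoffs.
xi_prop = [V[i] / sum(V) * W for i in range(n)]

# Both rules satisfy group optimality; the equal-surplus rule always
# satisfies individual rationality, the proportional rule does when W >= sum(V).
assert abs(sum(xi_equal) - W) < 1e-9
assert abs(sum(xi_prop) - W) < 1e-9
assert all(xi_equal[i] >= V[i] for i in range(n))
```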

5 Subgame Consistent Solutions and Payment Mechanism

Now, we proceed to consider dynamically stable solutions in cooperative stochastic dynamic games . To guarantee dynamic stability in a stochastic dynamic cooperation scheme, the solution has to satisfy the property of subgame consistency . A cooperative solution is subgame consistent if an extension of the solution policy to a subgame starting at a later time with any realizable state brought about by prior optimal behavior would remain optimal under the agreed-upon optimality principle . In particular, subgame consistency ensures that as the game proceeds players are guided by the same optimality principle at each stage of the game, and hence do not possess incentives to deviate from the previously adopted optimal behavior. Yeung and Petrosyan (2010) developed conditions leading to subgame consistent solutions in cooperative stochastic dynamic games .

For subgame consistency to be satisfied, the imputation \( \xi \left(\cdot, \cdot \right) \) according to the original optimality principle has to be maintained at all the T stages along the cooperative trajectory \( {\left\{{x}_k^{*}\;\right\}}_{k=1}^T \). In other words, the imputation

$$ \xi \left(k,{x}_k^{*}\right)=\left[{\xi}^1\left(k,{x}_k^{*}\right),{\xi}^2\left(k,{x}_k^{*}\right),\dots, {\xi}^n\left(k,{x}_k^{*}\right)\right]\;\mathrm{at}\ \mathrm{stage}\ k,\mathrm{f}\mathrm{o}\mathrm{r}\ k\in \kappa, $$
(5.1)

has to be upheld.

Crucial to the analysis is the formulation of a payment mechanism so that the imputation in (5.1) can be realized.

5.1 Payoff Distribution Procedure

Following the analysis of Yeung and Petrosyan (2010), we formulate a discrete-time Payoff Distribution Procedure (PDP) so that the agreed imputations (5.1) can be realized. Let \( {B}_k^i\left({x}_k^{*}\right) \) denote the payment that player i will receive at stage k under the cooperative agreement if \( {x}_k^{*}\in {X}_k^{*} \) is realized at stage \( k\in \kappa \).

The payment scheme involving \( {B}_k^i\left({x}_k^{*}\right) \) constitutes a PDP in the sense that, if \( {x}_k^{*}\in {X}_k^{*} \) is realized at stage k, the imputation to player i over the stages from k to \( T+1 \) can be expressed as:

$$ \begin{aligned} {\xi}^i\left(k,{x}_k^{*}\right)&={B}_k^i\left({x}_k^{*}\right){\left(\frac{1}{1+r}\right)}^{k-1}\\ &\quad+{E}_{\theta_k,{\theta}_{k+1},\dots, {\theta}_T}\Big\{\sum_{\zeta =k+1}^T B_{\zeta}^i\left(x_{\zeta}^{*}\right){\left(\frac{1}{1+r}\right)}^{\zeta -1}+q_{T+1}^i\left(x_{T+1}\right){\left(\frac{1}{1+r}\right)}^T\Big\} \end{aligned} $$
(5.2)

for \( i\in N \) and \( k\in \kappa \).

A theorem characterizing a formula for \( {B}_k^i\left({x}_k^{*}\right) \), for \( k\in \kappa \) and \( i\in N \), which yields (5.2) is provided below.

Theorem 5.1

A payment equaling

$$ {B}_k^i\left({x}_k^{*}\right)={\left(1+r\right)}^{k-1}\Big\{{\xi}^i\left(k,{x}_k^{*}\right)-{E}_{\theta_k}\Big({\xi}^i\left[k+1,{f}_k\left({x}_k^{*},{\psi}_k\left({x}_k^{*}\right)\right)+{G}_k\left({x}_k^{*}\right){\theta}_k\right]\Big)\Big\}, $$
(5.3)

for \( i\in N \),

given to player i at stage \( k\in \kappa \), if \( {x}_k^{*}\in {X}_k^{*} \) is realized, would lead to the realization of the imputation \( \left\{\xi \left(k,{x}_k^{*}\right),\ \mathrm{for}\ k\in \kappa \right\} \).

Proof

Using (5.2) one can obtain

$$ \begin{aligned} {\xi}^i\left(k+1,{x}_{k+1}^{*}\right)&={B}_{k+1}^i\left({x}_{k+1}^{*}\right){\left(\frac{1}{1+r}\right)}^k\\ &\quad+{E}_{\theta_{k+1},{\theta}_{k+2},\dots, {\theta}_T}\Big\{\sum_{\zeta =k+2}^T B_{\zeta}^i\left(x_{\zeta}^{*}\right){\left(\frac{1}{1+r}\right)}^{\zeta -1}+q_{T+1}^i\left(x_{T+1}\right){\left(\frac{1}{1+r}\right)}^T\Big\}. \end{aligned} $$
(5.4)

Substituting (5.4) into (5.2) yields

$$ {\xi}^i\left(k,{x}_k^{*}\right)={B}_k^i\left({x}_k^{*}\right){\left(\frac{1}{1+r}\right)}^{k-1}+{E}_{\theta_k}\Big({\xi}^i\left[k+1,{f}_k\left({x}_k^{*},{\psi}_k\left({x}_k^{*}\right)\right)+{G}_k\left({x}_k^{*}\right){\theta}_k\right]\Big), $$
(5.5)

for \( i\in N \) and \( k\in \kappa \).

Hence Theorem 5.1 follows. ■

The payment scheme in Theorem 5.1 gives rise to the realization of the imputation guided by the agreed-upon optimality principle and will be used to derive subgame consistent solutions in the next subsection.
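A minimal numerical sketch of the PDP formula (5.3): given a hypothetical linear imputation function, a hypothetical cooperative transition, and a discrete shock distribution, the stage payment is the current imputation minus the expected next-stage imputation, scaled by \( {\left(1+r\right)}^{k-1} \). All functions and parameter values below are assumptions for illustration only.

```python
r = 0.05
probs = [0.25, 0.5, 0.25]          # assumed shock distribution for theta_k
thetas = [-0.5, 0.0, 0.5]

def xi(i, k, x):
    # Hypothetical linear imputation xi^i(k, x) = A[k][i] * x.
    A = {1: [2.0, 1.5], 2: [1.8, 1.3], 3: [1.6, 1.1], 4: [1.0, 1.0]}
    return A[k][i] * x

def f_next(x, k):
    # Hypothetical cooperative transition f_k(x*, psi_k(x*)).
    return 0.8 * x + 4.0

def B(i, k, x):
    # PDP payment (5.3): assumed noise term G_k(x)*theta_k with G_k(x) = x.
    exp_next = sum(p * xi(i, k + 1, f_next(x, k) + x * th)
                   for p, th in zip(probs, thetas))
    return (1 + r) ** (k - 1) * (xi(i, k, x) - exp_next)
```

Note that the payment can be negative, in which case player i makes, rather than receives, a net transfer at stage k; by construction the discounted payment plus the expected continuation imputation reconstructs the current imputation, as in (5.5).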

5.2 Subgame Consistent Solution

We denote the discrete-time cooperative game with dynamics (4.1) and payoffs (4.2) by \( \Gamma_c\left(1,{x}_0\right) \), and the game with dynamics (4.1) and payoffs (4.2) which starts at stage \( \upsilon \ge 1 \) with initial state \( {x}_{\upsilon}^{*}\in {X}_{\upsilon}^{*} \) by \( \Gamma_c\left(\upsilon, {x}_{\upsilon}^{*}\right) \). Moreover, we let \( P\left(1,{x}_0\right)=\left\{{u}_h^i\ \mathrm{and}\ {B}_h^i\ \mathrm{for}\ h\in \kappa\ \mathrm{and}\ i\in N,\ \xi \left(1,{x}_0\right)\right\} \) denote the agreed-upon optimality principle for the cooperative game \( \Gamma_c\left(1,{x}_0\right) \), and let \( P\left(\upsilon, {x}_{\upsilon}^{*}\right)=\left\{{u}_h^i\ \mathrm{and}\ {B}_h^i\ \mathrm{for}\ h\in \left\{\upsilon, \upsilon +1,\dots, T\right\}\ \mathrm{and}\ i\in N,\ \xi \left(\upsilon, {x}_{\upsilon}^{*}\right)\right\} \) denote the optimality principle of the cooperative game \( \Gamma_c\left(\upsilon, {x}_{\upsilon}^{*}\right) \) according to the original agreement.

A theorem characterizing a subgame consistent solution for the discrete-time cooperative game \( \Gamma_c\left(1,{x}_0\right) \) is presented below.

Theorem 5.2

For the cooperative game \( \Gamma_c\left(1,{x}_0\right) \) with optimality principle \( P\left(1,{x}_0\right)=\left\{{u}_h^i\left({x}_h^{*}\right)\ \mathrm{and}\ {B}_h^i\left({x}_h^{*}\right)\ \mathrm{for}\ h\in \kappa\ \mathrm{and}\ i\in N\ \mathrm{and}\ {x}_h^{*}\in {X}_h^{*},\ \xi \left(1,{x}_0\right)\right\} \) in which

  1. (i)

    \( {u}_h^i\left({x}_h^{*}\right)={\psi}_h^i\left({x}_h^{*}\right) \), for \( h\in \kappa \), \( i\in N \) and \( {x}_h^{*}\in {X}_h^{*} \), is the set of group optimal strategies for the game \( \Gamma_c\left(1,{x}_0\right) \), and

  2. (ii)

    \( {B}_h^i\left({x}_h^{*}\right) \), for \( h\in \kappa \), \( i\in N \) and \( {x}_h^{*}\in {X}_h^{*} \), is the payment given by

$$ {B}_h^i\left({x}_h^{*}\right)={\left(1+r\right)}^{h-1}\Big\{{\xi}^i\left(h,{x}_h^{*}\right)-{E}_{\theta_h}\Big({\xi}^i\left[h+1,{f}_h\left({x}_h^{*},{\psi}_h\left({x}_h^{*}\right)\right)+{G}_h\left({x}_h^{*}\right){\theta}_h\right]\Big)\Big\}, $$
(5.6)

and \( \left[{\xi}^1\left(h,{x}_h^{*}\right),{\xi}^2\left(h,{x}_h^{*}\right),\dots, {\xi}^n\left(h,{x}_h^{*}\right)\right]\in P\left(h,{x}_h^{*}\right) \) is the imputation according to the optimality principle \( P\left(h,{x}_h^{*}\right) \);

is subgame consistent.

Proof

Follow the proof of the continuous-time analog in Theorem 5.2 of Chap. 7. ■

When all players are using the cooperative strategies, the payoff that player i will directly receive at stage k given that \( {x}_k^{*}\in {X}_k^{*} \) is

$$ {g}_k^i\left[{x}_k^{*},{\psi}_k^1\left({x}_k^{*}\right),{\psi}_k^2\left({x}_k^{*}\right),\cdots, {\psi}_k^n\left({x}_k^{*}\right)\right]. $$

However, according to the agreed-upon imputation , player i will receive \( {B}_k^i\left({x}_k^{*}\right) \) at stage k. Therefore a side-payment

$$ {\varpi}_k^i\left({x}_k^{*}\right)={B}_k^i\left({x}_k^{*}\right)-{g}_k^i\left[{x}_k^{*},{\psi}_k^1\left({x}_k^{*}\right),{\psi}_k^2\left({x}_k^{*}\right),\dots, {\psi}_k^n\left({x}_k^{*}\right)\right], $$
(5.7)
for \( k\in \kappa \) and \( i\in N \),

will be given to player i to yield the cooperative imputation \( {\xi}^i\left(k,{x}_k^{*}\right) \).

6 Cooperative Resource Extraction under Uncertainty

Consider an economy endowed with a renewable resource and with two resource extractors (firms). The lease for resource extraction begins at stage 1 and ends at stage 3 for these two firms. Let \( {u}_k^i \) denote the rate of resource extraction of firm i at stage k, for \( i\in \left\{1,2\right\} \). Let \( {U}^i \) be the set of admissible extraction rates, and \( {x}_k\in X\subset {R}^{+} \) the size of the resource stock at stage k. The extraction cost for firm \( i\in \left\{1,2\right\} \) depends on the quantity of resource extracted \( {u}_k^i \), the resource stock size \( {x}_k \), and cost parameters \( {c}_1 \) and \( {c}_2 \). In particular, the extraction cost for firm i at stage k is specified as \( {c}_i{\left({u}_k^i\right)}^2/{x}_k \). The price of the resource is P.

The profits that firm 1 and firm 2 will obtain at stage k are respectively:

$$ \left[P{u}_k^1-\frac{c_1}{x_k}{\left({u}_k^1\right)}^2\right]\kern0.75em \mathrm{and}\ \left[P{u}_k^2-\frac{c_2}{x_k}{\left({u}_k^2\right)}^2\right]. $$
(6.1)

In stage 4, the firms will receive a salvage value equaling \( q{x}_4 \). The growth dynamics of the resource are governed by the stochastic difference equation:

$$ {x}_{k+1}={x}_k+a-{\theta}_k{x}_k-{\displaystyle \sum_{j=1}^2{u}_k^j}, $$
(6.2)
for \( k\in \left\{1,2,3\right\} \) and \( {x}_1={x}^0 \),

where \( {\theta}_k \) is a random variable with non-negative range \( \left\{{\theta}_k^1,{\theta}_k^2,{\theta}_k^3\right\} \) and corresponding probabilities \( \left\{{\lambda}_k^1,{\lambda}_k^2,{\lambda}_k^3\right\} \).

With no human harvesting, the natural growth of the resource stock is \( {x}_{k+1}-{x}_k=a-{\theta}_k{x}_k \): the resource replenishes at the constant rate a, while the death rate \( {\theta}_k \) exhibits stochasticity. There exists an extraction constraint that human harvesting can exploit at most a proportion b of the existing biomass, hence \( {u}_k^1+{u}_k^2\le b{x}_k \). In addition, the highest value of \( {\theta}_k^y \) is less than \( \left(1-b\right) \), for \( k\in \left\{1,2,3\right\} \) and \( y\in \left\{1,2,3\right\} \).

The objective of extractor \( i\in \left\{1,2\right\} \) is to maximize the present value of the expected stream of future profits:

$$ {E}_{\theta_1{\theta}_2{\theta}_3}\Big\{\sum_{k=1}^3\left[P{u}_k^i-\frac{c_i}{x_k}{\left({u}_k^i\right)}^2\right]{\left(\frac{1}{1+r}\right)}^{k-1}+{\left(\frac{1}{1+r}\right)}^3q{x}_4\Big\},\ \mathrm{for}\ i\in \left\{1,2\right\}, $$
(6.3)

subject to (6.2).

Invoking Theorem 4.1, one can characterize the noncooperative equilibrium strategies in a feedback solution for the game (6.2 and 6.3). In particular, a set of strategies \( \left\{{\phi}_k^i(x),\ \mathrm{for}\ k\in \left\{1,2,3\right\}\ \mathrm{and}\ i\in \left\{1,2\right\}\right\} \) provides a feedback Nash equilibrium solution to the game (6.2 and 6.3) if there exist functions \( {V}^i\left(k,x\right) \), for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \), such that the following recursive relations are satisfied:

$$ \begin{aligned} {V}^i\left(k,x\right)&=\underset{u_k^i}{ \max }{E}_{\theta_k}\Big\{\left[P{u}_k^i-\frac{c_i}{x}{\left({u}_k^i\right)}^2\right]{\left(\frac{1}{1+r}\right)}^{k-1}+{V}^i\left[k+1,x+a-{\theta}_kx-{u}_k^i-{\phi}_k^j(x)\right]\Big\}\\ &=\underset{u_k^i}{ \max}\Big\{\left[P{u}_k^i-\frac{c_i}{x}{\left({u}_k^i\right)}^2\right]{\left(\frac{1}{1+r}\right)}^{k-1}+\sum_{y=1}^3{\lambda}_k^y\,{V}^i\left[k+1,x+a-{\theta}_k^yx-{u}_k^i-{\phi}_k^j(x)\right]\Big\},\quad j\ne i;\\ {V}^i\left(T+1,x\right)&={\left(\frac{1}{1+r}\right)}^3qx. \end{aligned} $$
(6.4)

Performing the indicated maximization in (6.4) yields:

$$ \left(P-\frac{2{c}_i{u}_k^i}{x}\right){\left(\frac{1}{1+r}\right)}^{k-1}-{\displaystyle \sum_{y=1}^3{\lambda}_k^y}{V}_{x_{k+1}}^i\left[k+1,x+a-{\theta}_k^yx-{u}_k^i-{\phi}_k^j(x)\right]=0, $$
(6.5)

for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \).

From (6.5), the game equilibrium strategies can be expressed as:

$$ {\phi}_k^i(x)=\Big(P-\sum_{y=1}^3{\lambda}_k^y\,{V}_{x_{k+1}}^i\left[k+1,x+a-{\theta}_k^yx-\sum_{\ell =1}^2{\phi}_k^{\ell }(x)\right]{\left(1+r\right)}^{k-1}\Big)\frac{x}{2{c}_i}, $$
(6.6)

for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \).

The expected game equilibrium profits of the firms can be obtained as:

Proposition 6.1

The value function indicating the expected game equilibrium profit of firm i is

$$ {V}^i\left(k,x\right)=\left[{A}_k^ix+{C}_k^i\right],\ \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}\ \mathrm{and}\ k\in \left\{1,2,3\right\}, $$
(6.7)

where \( {A}_k^i \) and \( {C}_k^i \), for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \), are constants in terms of the parameters of the game (6.2 and 6.3).

Proof

See Appendix C of this Chapter. ■

Substituting the relevant derivatives of the value functions in Proposition 6.1 into the game equilibrium strategies (6.6) yields a noncooperative feedback equilibrium solution of the game (6.2 and 6.3).
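Although the closed forms of \( {A}_k^i \) and \( {C}_k^i \) are relegated to Appendix C, the backward recursion they satisfy can be sketched directly: with \( {V}^i\left(k,x\right)={A}_k^ix+{C}_k^i \), the derivative \( {V}_{x_{k+1}}^i \) in (6.6) is the constant \( {A}_{k+1}^i \), so \( {\phi}_k^i(x)=\left[P-{A}_{k+1}^i{\left(1+r\right)}^{k-1}\right]x/\left(2{c}_i\right) \); substituting this back into (6.4) and matching the coefficients of x gives the recursion below. The parameter values and the stage-invariant shock distribution are illustrative assumptions, and the extraction and shock constraints of the model are not enforced here.

```python
# Assumed illustrative parameter values for the game (6.2 and 6.3).
P, q, a, r = 1.0, 0.5, 10.0, 0.05
c = {1: 2.0, 2: 3.0}
lam = [0.3, 0.4, 0.3]                 # lambda_k^y, taken stage-invariant
theta = [0.05, 0.10, 0.15]            # theta_k^y, non-negative
tbar = sum(l * t for l, t in zip(lam, theta))   # mean death rate

# Terminal condition of (6.4): V^i(4, x) = q*x/(1+r)^3.
A = {4: {1: q / (1 + r) ** 3, 2: q / (1 + r) ** 3}}
C = {4: {1: 0.0, 2: 0.0}}
for k in (3, 2, 1):
    disc = (1 / (1 + r)) ** (k - 1)
    # Equilibrium extraction intensities phi_k^i(x) = m[i]*x from (6.6).
    m = {i: (P - A[k + 1][i] * (1 + r) ** (k - 1)) / (2 * c[i]) for i in (1, 2)}
    # Match the coefficient of x and the constant term in (6.4).
    A[k] = {i: (P * m[i] - c[i] * m[i] ** 2) * disc
               + A[k + 1][i] * (1 - tbar - m[1] - m[2]) for i in (1, 2)}
    C[k] = {i: A[k + 1][i] * a + C[k + 1][i] for i in (1, 2)}
```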

Now consider the case when the extractors agree to maximize their expected joint profit and share the excess of cooperative gains over their expected noncooperative payoffs equally. To maximize their expected joint payoff, they solve the problem of maximizing

$$ {E}_{\theta_1{\theta}_2{\theta}_3}\Big\{\sum_{j=1}^2\sum_{k=1}^3\left[P{u}_k^j-\frac{c_j}{x_k}{\left({u}_k^j\right)}^2\right]{\left(\frac{1}{1+r}\right)}^{k-1}+2{\left(\frac{1}{1+r}\right)}^3q{x}_4\Big\} $$
(6.8)

subject to (6.2).

Invoking Theorem 4.2, one can characterize the optimal controls in the stochastic dynamic programming problem (6.2) and (6.8). In particular, a set of control strategies \( \left\{{\psi}_k^i(x),\ \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \left\{1,2,3\right\}\ \mathrm{and}\ i\in \left\{1,2\right\}\right\} \) provides an optimal solution to the problem (6.2) and (6.8) if there exist functions \( W\left(k,x\right):R\to R \), for \( k\in \left\{1,2,3\right\} \), such that the following recursive relations are satisfied:

$$ \begin{aligned} W\left(k,x\right)&=\underset{u_k^1,{u}_k^2}{ \max }{E}_{\theta_k}\Big\{\sum_{j=1}^2\left[P{u}_k^j-\frac{c_j}{x}{\left({u}_k^j\right)}^2\right]{\left(\frac{1}{1+r}\right)}^{k-1}+W\left[k+1,x+a-{\theta}_kx-\sum_{j=1}^2{u}_k^j\right]\Big\}\\ &=\underset{u_k^1,{u}_k^2}{ \max}\Big\{\sum_{j=1}^2\left[P{u}_k^j-\frac{c_j}{x}{\left({u}_k^j\right)}^2\right]{\left(\frac{1}{1+r}\right)}^{k-1}+\sum_{y=1}^3{\lambda}_k^y\,W\left[k+1,x+a-{\theta}_k^yx-\sum_{j=1}^2{u}_k^j\right]\Big\},\quad \mathrm{for}\ k\in \left\{1,2,3\right\};\\ W\left(T+1,x\right)&=2{\left(\frac{1}{1+r}\right)}^3qx. \end{aligned} $$
(6.9)

Performing the indicated maximization in (6.9) yields:

$$ \left(P-\frac{2{c}_i{u}_k^i}{x}\right){\left(\frac{1}{1+r}\right)}^{k-1}-{\displaystyle \sum_{y=1}^3{\lambda}_k^y}\ {W}_{x_{k+1}}\left[k+1,x+a-{\theta}_k^yx-{\displaystyle \sum_{j=1}^2{u}_k^j}\right]=0, $$
(6.10)

for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \).

In particular, the optimal cooperative strategies can be obtained from (6.10) as:

$$ {u}_k^i=\Big(P-\sum_{y=1}^3{\lambda}_k^y\,{W}_{x_{k+1}}\left[k+1,x+a-{\theta}_k^yx-\sum_{j=1}^2{u}_k^j\right]{\left(1+r\right)}^{k-1}\Big)\frac{x}{2{c}_i}, $$
(6.11)

for \( i\in \left\{1,2\right\} \) and \( k\in \left\{1,2,3\right\} \).

The expected joint profit under cooperation is given below.

Proposition 6.2

The value function indicating the maximized expected joint payoff is

$$ W\left(k,x\right)=\left[{A}_kx+{C}_k\right],\ \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \left\{1,2,3\right\}, $$
(6.12)

where \( {A}_k \) and \( {C}_k \), for \( k\in \left\{1,2,3\right\} \), are constants in terms of the parameters of the problem (6.8) and (6.2).

Proof

See Appendix D of this Chapter. ■

Using (6.11) and Proposition 6.2, the optimal cooperative strategies of the extracting firms can be expressed as:

$$ \begin{array}{cc}\hfill {\psi}_k^i(x)=\left[P-{A}_{k+1}{\left(1+r\right)}^{k-1}\right]\frac{x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}\ \mathrm{and}\ k\in \left\{1,2,3\right\}.\hfill \end{array} $$
(6.13)

Substituting \( {\psi}_k^i(x) \) from (6.13) into (6.2) yields the optimal cooperative state trajectory:

$$ {x}_{k+1}={x}_k+a-{\theta}_k{x}_k-{\displaystyle \sum_{j=1}^2}\left[P-{A}_{k+1}{\left(1+r\right)}^{k-1}\right]\frac{x_k}{2{c}_j}, $$
(6.14)

for \( k\in \left\{1,2,3\right\} \) and \( {x}_1={x}^0 \).

Dynamics (6.14) is a linear stochastic difference equation readily solvable by standard techniques. Let \( \left\{{x}_k^{*},\ \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \left\{1,2,3\right\}\right\} \) denote the solution to (6.14).
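Analogously to Appendix D, substituting (6.13) into (6.9) and matching the coefficients of x gives a backward recursion for the \( {A}_k \) and \( {C}_k \) of Proposition 6.2, after which the expected cooperative trajectory follows from (6.14) with \( {\theta}_k \) replaced by its mean. The sketch below uses assumed parameter values, chosen so that the extraction constraint \( {u}_k^1+{u}_k^2\le b{x}_k \) with b = 0.3 is not binding along the path.

```python
# Assumed illustrative parameter values for the cooperative problem (6.2), (6.8).
P, q, a, r = 1.0, 0.5, 10.0, 0.05
c = {1: 2.0, 2: 3.0}
lam = [0.3, 0.4, 0.3]                 # lambda_k^y, taken stage-invariant
theta = [0.05, 0.10, 0.15]            # theta_k^y, non-negative
tbar = sum(l * t for l, t in zip(lam, theta))   # mean death rate

# Terminal condition of (6.9): W(4, x) = 2*q*x/(1+r)^3.
A = {4: 2 * q / (1 + r) ** 3}
C = {4: 0.0}
for k in (3, 2, 1):
    disc = (1 / (1 + r)) ** (k - 1)
    # Cooperative extraction intensities psi_k^j(x) = m[j]*x from (6.13).
    m = {j: (P - A[k + 1] * (1 + r) ** (k - 1)) / (2 * c[j]) for j in (1, 2)}
    A[k] = sum(P * m[j] - c[j] * m[j] ** 2 for j in (1, 2)) * disc \
           + A[k + 1] * (1 - tbar - m[1] - m[2])
    C[k] = A[k + 1] * a + C[k + 1]

# Expected cooperative trajectory implied by (6.14), from an assumed x0.
x = 25.0
path = [x]
for k in (1, 2, 3):
    m_sum = sum((P - A[k + 1] * (1 + r) ** (k - 1)) / (2 * c[j])
                for j in (1, 2))
    x = x * (1 - tbar - m_sum) + a
    path.append(x)
```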

Since the extractors agree to share the excess of cooperative gains over their expected noncooperative payoffs equally, an imputation

$$ \begin{array}{l}{\xi}^i\left(k,{x}_k^{*}\right)={V}^i\left(k,{x}_k^{*}\right)+\frac{1}{2}\left[W\left(k,{x}_k^{*}\right)-{\displaystyle \sum_{j=1}^2{V}^j\left(k,{x}_k^{*}\right)}\right]\\ {}=\left({A}_k^i{x}_k^{*}+{C}_k^i\right)+\frac{1}{2}\left[\left({A}_k{x}_k^{*}+{C}_k\right)-{\displaystyle \sum_{j=1}^2\left({A}_k^j{x}_k^{*}+{C}_k^j\right)}\right],\end{array} $$
(6.15)

for \( k\in \left\{1,2,3\right\} \) and \( i\in \left\{1,2\right\} \) has to be maintained.

Invoking Theorem 5.1, if \( {x}_k^{*}\in {X}_k^{*} \) is realized at stage k, a payment equaling

$$ \begin{aligned} {B}_k^i\left({x}_k^{*}\right)&={\left(1+r\right)}^{k-1}\Big\{{\xi}^i\left(k,{x}_k^{*}\right)-\sum_{y=1}^3{\lambda}_k^y\,{\xi}^i\left[k+1,{x}_{k+1}^{*\left({\theta}_k^y\right)}\right]\Big\}\\ &={\left(1+r\right)}^{k-1}\Big\{\left({A}_k^i{x}_k^{*}+{C}_k^i\right)+\frac{1}{2}\Big(\left({A}_k{x}_k^{*}+{C}_k\right)-\sum_{j=1}^2\left({A}_k^j{x}_k^{*}+{C}_k^j\right)\Big)\\ &\quad-\sum_{y=1}^3{\lambda}_k^y\Big[\left({A}_{k+1}^i{x}_{k+1}^{*\left({\theta}_k^y\right)}+{C}_{k+1}^i\right)+\frac{1}{2}\Big(\left({A}_{k+1}{x}_{k+1}^{*\left({\theta}_k^y\right)}+{C}_{k+1}\right)-\sum_{j=1}^2\left({A}_{k+1}^j{x}_{k+1}^{*\left({\theta}_k^y\right)}+{C}_{k+1}^j\right)\Big)\Big]\Big\},\\ &\qquad\mathrm{for}\ i\in \left\{1,2\right\}; \end{aligned} $$
(6.16)
where \( {x}_{k+1}^{*\left({\theta}_k^y\right)}={x}_k^{*}+a-{\theta}_k^y{x}_k^{*}-{\sum_{j=1}^2}\left[P-{A}_{k+1}{\left(1+r\right)}^{k-1}\right]\frac{x_k^{*}}{2{c}_j} \), for \( y\in \left\{1,2,3\right\} \),

given to firm i at stage \( k\in \kappa \) would lead to the realization of the imputation (6.15).

A subgame consistent solution can be readily obtained from (6.13), (6.15) and (6.16).

7 A Heuristic Approach

In some game situations it may not be possible or practical to obtain all the information needed in this Chapter, and a heuristic method may have to be considered. A heuristic method employs a practical methodology that is not guaranteed to be optimal or perfect, but is sufficient for the immediate goals. Where finding an optimal solution is impossible or impractical, heuristic methods can often speed up the search for a satisfactory solution. In particular, heuristic methods use strategies and information that are readily accessible (though not 100% exact and accurate) to obtain a solution.

Consider the case of a heuristic approach to obtaining a subgame consistent solution in a situation where the differentiable functions \( {f}_k\left({x}_k,{u}_k^1,{u}_k^2,\cdots, {u}_k^n\right) \), \( {G}_k\left({x}_k\right){\theta}_k \) and \( {g}_k^i\left[{x}_k,{u}_k^1,{u}_k^2,\cdots, {u}_k^n\right] \), for \( i\in \left\{1,2,\cdots, n\right\}\equiv N \) and \( k\in \left\{1,2,\cdots, T\right\}\equiv \kappa \), in (4.1 and 4.2) are not available.

However, the players concur with the adoption of a set of cooperative strategies \( \Big\{{\widehat{\psi}}_k^i\left({x}_k\right) \), for \( k\in \kappa \) and \( i\in N\Big\} \). Though these cooperative strategies may not be the set of theoretically optimal controls, they are perceived to be clearly beneficial to the joint well-being of all players.

In addition, with expert knowledge and statistical techniques the expected value of cooperative payment \( {\displaystyle \sum_{j=1}^n}{\widehat{g}}_{\tau}^j\left[{\widehat{x}}_{\tau },{\widehat{\psi}}_{\tau}^1\left({\widehat{x}}_{\tau}\right),{\widehat{\psi}}_{\tau}^2\left({\widehat{x}}_{\tau}\right),\cdots, {\widehat{\psi}}_{\tau}^n\left({\widehat{x}}_{\tau}\right)\right] \) received in each stage \( \tau \in \left\{k,k+1,k+2,\cdots, T\right\} \) can be estimated with acceptable degrees of accuracy. The value \( \widehat{W}\left(k,{\widehat{x}}_k\right) \) can be obtained by summing the cooperative payments \( {\displaystyle \sum_{j=1}^n}{\widehat{g}}_{\tau}^j\left[{\widehat{x}}_{\tau },{\widehat{\psi}}_{\tau}^1\left({\widehat{x}}_{\tau}\right),{\widehat{\psi}}_{\tau}^2\left({\widehat{x}}_{\tau}\right),\cdots, {\widehat{\psi}}_{\tau}^n\left({\widehat{x}}_{\tau}\right)\right] \) expected to be received in each stage from stage k to stage T for \( k\in \kappa \) along the cooperation path \( {\left\{{\widehat{x}}_{\tau}\right\}}_{\;\tau =k}^{\kern0.5em T} \), that is:

$$ \begin{array}{l}\widehat{W}\left(k,{\widehat{x}}_k\right)={\displaystyle \sum_{\tau =k}^T{\displaystyle \sum_{j=1}^n}}{\widehat{g}}_{\tau}^j\left[{\widehat{x}}_{\tau },{\widehat{\psi}}_{\tau}^1\left({\widehat{x}}_{\tau}\right),{\widehat{\psi}}_{\tau}^2\left({\widehat{x}}_{\tau}\right),\dots, {\widehat{\psi}}_{\tau}^n\left({\widehat{x}}_{\tau}\right)\right]\\ {}\kern4.4em +{\displaystyle \sum_{j=1}^n}{q}_{T+1}^j\left({\widehat{x}}_{T+1}\right){\left(\frac{1}{1+r}\right)}^T,\ \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \kappa .\end{array} $$
(7.1)

Again, with expert knowledge and statistical techniques the expected value of the non-cooperative payment \( {\overline{g}}_{\tau}^i\left[{\overline{x}}_{\tau },{\overline{\phi}}_{\tau}^1\left({\overline{x}}_{\tau}\right),{\overline{\phi}}_{\tau}^2\left({\overline{x}}_{\tau}\right),\cdots, {\overline{\phi}}_{\tau}^n\left({\overline{x}}_{\tau}\right)\right] \) of player \( i\in N \) received in each stage \( \tau \in \left\{k,k+1,k+2,\cdots, T\right\} \), if the players revert to non-cooperation from stage k to stage T for \( k\in \kappa \), can be estimated with acceptable degrees of accuracy. The value \( {\overline{V}}^i\left(k,{\widehat{x}}_k\right) \) can be obtained by summing the expected payments to be received by player i in each stage from stage k to stage T for \( k\in \kappa \) along the non-cooperation path \( {\left\{{\overline{x}}_{\tau}\right\}}_{\;\tau =k}^{\kern0.5em T} \) where \( {\overline{x}}_k={\widehat{x}}_k \), that is

$$ \begin{array}{l}{\overline{V}}^i\left(k,{\widehat{x}}_k\right)={\displaystyle \sum_{\tau =k}^T}{\overline{g}}_{\tau}^i\left[{\overline{x}}_{\tau },{\overline{\phi}}_{\tau}^1\left({\overline{x}}_{\tau}\right),{\overline{\phi}}_{\tau}^2\left({\overline{x}}_{\tau}\right),\cdots, {\overline{\phi}}_{\tau}^n\left({\overline{x}}_{\tau}\right)\right]+{q}_{T+1}^i\left({\overline{x}}_{T+1}\right){\left(\frac{1}{1+r}\right)}^T,\\ {}\mathrm{for}\ i\in N.\end{array} $$
(7.2)
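The summations (7.1) and (7.2) only require per-stage payment estimates, which is what makes the heuristic workable in practice. A minimal sketch, with purely illustrative payment and terminal-value estimates (assumptions) and zero-based player indices:

```python
# Payoffs-to-go (7.1) and (7.2) from estimated per-stage payments.
# All numbers below are illustrative estimates, not model-derived values.
r = 0.05
T = 3
g_coop = {1: 30.0, 2: 28.0, 3: 25.0}            # estimated joint payment per stage
g_nc = {1: [12.0, 10.0], 2: [11.0, 9.0], 3: [10.0, 8.0]}  # per player, per stage
q_terminal = [6.0, 6.0]                          # estimates of q_{T+1}^i(x_{T+1})

def W_hat(k):
    """Expected cooperative payoff-to-go (7.1) from stage k."""
    return (sum(g_coop[t] for t in range(k, T + 1))
            + sum(q_terminal) * (1 / (1 + r)) ** T)

def V_bar(i, k):
    """Expected non-cooperative payoff-to-go (7.2) of player i from stage k."""
    return (sum(g_nc[t][i] for t in range(k, T + 1))
            + q_terminal[i] * (1 / (1 + r)) ** T)
```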

If the agreed-upon optimality principle specifies that the players share the expected total cooperative payoff in proportion to their expected noncooperative payoffs, then the imputation to player i becomes:

$$ {\widehat{\xi}}^i\left(k,{\widehat{x}}_k\right)=\frac{{\overline{V}}^i\left(k,{\widehat{x}}_k\right)}{{\displaystyle \sum_{j=1}^n{\overline{V}}^j\left(k,{\widehat{x}}_k\right)}}\widehat{W}\left(k,{\widehat{x}}_k\right), $$
(7.3)

for \( i\in N \) and \( k\in \kappa \).

Invoking Theorem 5.1, a theoretically subgame consistent payoff distribution procedure (PDP) can be obtained with:

$$ {B}_k^i\left({x}_k^{*}\right)={\left(1+r\right)}^{k-1}\left\{{\xi}^i\left(k,{x}_k^{*}\right)-{E}_{\theta_k}\left({\xi}^i\left[k+1,{f}_k\left({x}_k^{*},{\psi}_k\left({x}_k^{*}\right)\right)+{G}_k\left({x}_k^{*}\right){\theta}_k\right]\right)\right\}, $$
(7.4)

for i ∈ N,

given to player i at stage \( k\in \kappa \), if \( {x}_k^{*}\in {X}_k^{*} \).

Using (7.1, 7.2, 7.3 and 7.4) a subgame consistent PDP under a heuristic scheme can be obtained with:

$$ \begin{array}{l}{B}_k^i\left({\widehat{x}}_k\right)={\left(1+r\right)}^{k-1}\left\{\frac{{\overline{V}}^i\left(k,{\widehat{x}}_k\right)}{\sum_{j=1}^n{\overline{V}}^j\left(k,{\widehat{x}}_k\right)}\widehat{W}\left(k,{\widehat{x}}_k\right)\right.\\ {}\kern3.5em \left.-\frac{{\overline{V}}^i\left(k+1,{\widehat{x}}_{k+1}\right)}{\sum_{j=1}^n{\overline{V}}^j\left(k+1,{\widehat{x}}_{k+1}\right)}\widehat{W}\left(k+1,{\widehat{x}}_{k+1}\right)\right\}\end{array} $$
(7.5)

given to player \( i\in N \) at stage \( k\in \kappa \), along the cooperation path \( {\left\{{\widehat{x}}_k\right\}}_{\;k=1}^{\kern0.5em T} \).
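Putting (7.3) and (7.5) together, the heuristic PDP reduces to a difference of proportionally shared payoffs-to-go. A sketch, assuming `W_hat(k)` and `V_bar(i, k)` are estimators of the quantities in (7.1) and (7.2) evaluated along the cooperation path:

```python
# Heuristic PDP (7.5): the stage payment equals the drop, between stages k and
# k+1, in player i's proportionally shared cooperative payoff (7.3),
# scaled by (1+r)^(k-1). W_hat and V_bar are assumed estimators per (7.1)-(7.2).
def xi_hat(i, k, W_hat, V_bar, n):
    """Proportional imputation (7.3) of player i at stage k."""
    total = sum(V_bar(j, k) for j in range(n))
    return V_bar(i, k) / total * W_hat(k)

def B_heuristic(i, k, r, W_hat, V_bar, n):
    """Stage payment (7.5) to player i at stage k."""
    return (1 + r) ** (k - 1) * (
        xi_hat(i, k, W_hat, V_bar, n) - xi_hat(i, k + 1, W_hat, V_bar, n))
```

With r = 0 the stage-k payments sum across players to \( \widehat{W}\left(k,\cdot \right)-\widehat{W}\left(k+1,\cdot \right) \), so the whole cooperative payoff is distributed over the stages.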

The heuristic approach allows the application of subgame consistent solutions in dynamic game situations whenever estimates of the expected cooperative payoffs and individual non-cooperative payoffs are available with acceptable degrees of accuracy. This approach can help resolve the unstable elements in cooperative schemes for a wide range of real-world game-theoretic problems.

8 Chapter Appendices

Appendix A. Proof of Proposition 3.1

Consider first the last stage, that is stage 3. Invoking that \( {V}^i\left(3,x\right)=\left[{A}_3^ix+{C}_3^i\right] \) from Proposition 3.1 and \( {V}^i\left(4,x\right)={\left(\frac{1}{1+r}\right)}^3qx \), the conditions in Eq. (3.4) become

$$ \begin{array}{l}{V}^i\left(3,x\right)=\left[{A}_3^ix+{C}_3^i\right]=\underset{u_3^i}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\left[P{u}_3^i-\frac{c_i}{x}{\left({u}_3^{\mathrm{i}}\right)}^2\right]\kern0.5em {\left(\frac{1}{1+r}\right)}^2\\ {}\begin{array}{cc}\hfill \kern3.6em +{\left(\frac{1}{1+r}\right)}^3q\left[x+a-bx-{u}_3^i-{\phi}_3^j(x)\right]\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array}\end{array} $$
(8.1)

Performing the indicated maximization in (8.1) yields the game equilibrium strategies in stage 3 as:

$$ \begin{array}{cc}\hfill {\phi}_3^i(x)=\frac{\left[P-{\left(1+r\right)}^{-1}q\right]x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.2)

Substituting (8.2) into (8.1) yields:

$$ \begin{array}{l}{V}^i\left(3,x\right)=\left[{A}_3^ix+{C}_3^i\right]={\left(\frac{1}{1+r}\right)}^2\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\left[P-{\left(1+r\right)}^{-1}q\right]\frac{P}{2{c}_i}x\\ {}\kern3.7em -{\left[P-{\left(1+r\right)}^{-1}q\right]}^2\frac{1}{4{c}_i}x\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right)\\ {}\kern3.7em +q\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.x+a-bx-\left[P-{\left(1+r\right)}^{-1}q\right]\frac{1}{2{c}_i}x\\ {}\kern3.7em -\left[P-{\left(1+r\right)}^{-1}q\right]\frac{1}{2{c}_j}x\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right){\left(\frac{1}{1+r}\right)}^3\ \mathrm{f}\mathrm{o}\mathrm{r}\ i,j\in \left\{1,2\right\}\ \mathrm{and}\ i\ne j.\end{array} $$
(8.3)

Using (8.3), we can obtain \( {A}_3^i \) and \( {C}_3^i \), for \( i\in \left\{1,2\right\} \).

Now we proceed to stage 2; the conditions in Eq. (3.4) become

$$ \begin{array}{l}{V}^i\left(2,x\right)=\left[{A}_2^ix+{C}_2^i\right]=\underset{u_2^i}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\left[P{u}_2^i-\frac{c_i}{x}{\left({u}_2^{\mathrm{i}}\right)}^2\right]\kern0.5em \left(\frac{1}{1+r}\right)\\ {}\kern3.7em +{A}_3^i\left[x+a-bx-{u}_2^i-{\phi}_2^j(x)\right]\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\},\ \mathrm{f}\mathrm{o}\mathrm{r}\ i,j\in \left\{1,2\right\}\ \mathrm{and}\ i\ne j.\end{array} $$
(8.4)

Performing the indicated maximization in (8.4) yields the game equilibrium strategies in stage 2 as:

$$ \begin{array}{cc}\hfill {\phi}_2^i(x)=\left[P-\left(1+r\right){A}_3^i\right]\frac{x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.5)

Substituting (8.5) into (8.4) yields

$$ \begin{array}{l}{V}^i\left(2,x\right)=\left[{A}_2^ix+{C}_2^i\right]=\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\left(\frac{1}{1+r}\right)\left[P-\left(1+r\right){A}_3^i\right]\frac{P+\left(1+r\right){A}_3^i}{4{c}_i}\\ {}+{A}_3^i\left(1-b\right)-\left[P-\left(1+r\right){A}_3^i\right]\frac{A_3^i}{2{c}_i}-\left[P-\left(1+r\right){A}_3^j\right]\frac{A_3^i}{2{c}_j}\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}x+a{A}_3^i,\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ i,j\in \left\{1,2\right\}\ \mathrm{and}\ i\ne j.\end{array} $$
(8.6)

Substituting \( {A}_3^i \) for \( i\in \left\{1,2\right\} \) into (8.6), \( {A}_2^i \) and \( {C}_2^i \) for \( i\in \left\{1,2\right\} \) are obtained in explicit terms.

Finally, we proceed to the first stage; the conditions in Eq. (3.4) become

$$ \begin{array}{l}{V}^i\left(1,x\right)=\left[{A}_1^ix+{C}_1^i\right]=\underset{u_1^i}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\left[P{u}_1^i-\frac{c_i}{x}{\left({u}_1^{\mathrm{i}}\right)}^2\right]\kern0.5em \\ {}+\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{A}_2^i\left[x+a-bx-{u}_1^i-{\phi}_1^j(x)\right]+{C}_2^i\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right)\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\},\ \mathrm{f}\mathrm{o}\mathrm{r}\ i,j\in \left\{1,2\right\}\ \mathrm{and}\ i\ne j.\end{array} $$
(8.7)

Performing the indicated maximization in (8.7) yields the game equilibrium strategies in stage 1 as:

$$ \begin{array}{cc}\hfill {\phi}_1^i(x)=\left[P-{A}_2^i\right]\frac{x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.8)

Substituting (8.8) into (8.7) yields:

$$ \begin{array}{l}{V}^i\left(1,x\right)=\left[{A}_1^ix+{C}_1^i\right]=\\ {}\left[\left(P-{A}_2^i\right)\frac{P+{A}_2^i}{4{c}_i}+{A}_2^i\left(1-b\right)-\left(P-{A}_2^i\right)\frac{A_2^i}{2{c}_i}-\left(P-{A}_2^j\right)\frac{A_2^i}{2{c}_j}\right]x\\ {}+a{A}_2^i+{C}_2^i,\kern1em \mathrm{for}\ i,j\in \left\{1,2\right\}\ \mathrm{and}\ i\ne j.\end{array} $$
(8.9)

Substituting the explicit terms for \( {A}_2^i \), \( {A}_2^j \), \( {C}_2^i \) and \( {C}_2^j \) from (8.6) into (8.9), \( {A}_1^i \) and \( {C}_1^i \) for \( i\in \left\{1,2\right\} \) are obtained in explicit terms.
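The three-stage recursion above can be checked numerically. The sketch below recovers the linear coefficients \( {A}_k^i \), \( {C}_k^i \) by evaluating the Bellman right-hand side at two stock levels (exact, because every term is linear in x). Parameter values are illustrative assumptions, and the continuation constant \( {C}_{k+1}^i \) is carried at every stage:

```python
# Backward induction for the feedback Nash equilibrium of Appendix A.
# Strategies follow (8.2), (8.5), (8.8): u_i = [P - A_{k+1}^i/disc] x / (2 c_i),
# where disc = (1/(1+r))^(k-1). Parameter values below are assumptions.
P, r, q, a, b = 10.0, 0.05, 2.0, 5.0, 0.1
c = [10.0, 15.0]                            # c_1, c_2
T = 3

def value_coeffs():
    A = {T + 1: [q / (1 + r) ** 3] * 2}     # V^i(4, x) = (1/(1+r))^3 q x
    C = {T + 1: [0.0, 0.0]}
    for k in range(T, 0, -1):
        disc = (1.0 / (1 + r)) ** (k - 1)
        coef = [(P - A[k + 1][i] / disc) / (2 * c[i]) for i in range(2)]
        A[k], C[k] = [0.0, 0.0], [0.0, 0.0]
        for i in range(2):
            vals = []
            for x in (1.0, 2.0):            # two points pin down A x + C exactly
                u = [coef[0] * x, coef[1] * x]
                x_next = x + a - b * x - u[0] - u[1]
                stage = (P * u[i] - c[i] * u[i] ** 2 / x) * disc
                vals.append(stage + A[k + 1][i] * x_next + C[k + 1][i])
            A[k][i] = vals[1] - vals[0]     # slope in x
            C[k][i] = vals[0] - A[k][i]     # intercept
    return A, C
```

A quick check: \( {C}_3^i \) must equal \( qa{\left(1+r\right)}^{-3} \), the terminal coefficient times the constant replenishment a.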

Appendix B. Proof of Proposition 3.2

Consider first the last stage, that is stage 3. Invoking that \( W\left(3,x\right)=\left[{A}_3x+{C}_3\right] \) from Proposition 3.2 and \( W\left(4,x\right)=2{\left(\frac{1}{1+r}\right)}^3qx \), the conditions in Eq. (3.9) become

$$ \begin{array}{l}W\left(3,x\right)=\left[{A}_3x+{C}_3\right]=\underset{u_3^1,{u}_3^2}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle \sum_{j=1}^2}\left[P{u}_3^j-\frac{c_j}{x}{\left({u}_3^{\mathrm{j}}\right)}^2\right]\kern0.5em {\left(\frac{1}{1+r}\right)}^2\\ {}+2{\left(\frac{1}{1+r}\right)}^3q\left[x+a-bx-{u}_3^1-{u}_3^2\right]\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}.\end{array} $$
(8.10)

Performing the indicated maximization in (8.10) yields the optimal cooperative strategies in stage 3 as:

$$ \begin{array}{cc}\hfill {\psi}_3^i(x)=\frac{\left[P-{\left(1+r\right)}^{-1}2q\right]x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.11)

Substituting (8.11) into (8.10) yields:

$$ \begin{array}{l}W\left(3,x\right)=\left[{A}_3x+{C}_3\right]={\left(\frac{1}{1+r}\right)}^2\sum_{j=1}^2\left\{\left[P-{\left(1+r\right)}^{-1}2q\right]\frac{P}{2{c}_j}x\right.\\ {}\kern1em \left.-{\left[P-{\left(1+r\right)}^{-1}2q\right]}^2\frac{1}{4{c}_j}x\right\}+2q\left(x+a-bx-\left[P-{\left(1+r\right)}^{-1}2q\right]\frac{1}{2{c}_1}x\right.\\ {}\kern1em \left.-\left[P-{\left(1+r\right)}^{-1}2q\right]\frac{1}{2{c}_2}x\right){\left(\frac{1}{1+r}\right)}^3.\end{array} $$
(8.12)

Using (8.12), we obtain \( {A}_3 \) and \( {C}_3 \).

Now we proceed to stage 2; the conditions in Eq. (3.9) become

$$ \begin{array}{l}W\left(2,x\right)=\left[{A}_2x+{C}_2\right]=\underset{u_2^1,{u}_2^2}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle \sum_{j=1}^2}\left[P{u}_2^j-\frac{c_j}{x}{\left({u}_2^{\mathrm{j}}\right)}^2\right]\kern0.5em \left(\frac{1}{1+r}\right)\\ {}+{A}_3\left[x+a-bx-{\displaystyle \sum_{j=1}^2{u}_2^j}\right]\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}\end{array} $$
(8.13)

Performing the indicated maximization in (8.13) yields the optimal cooperative strategies in stage 2 as:

$$ \begin{array}{cc}\hfill {\psi}_2^i(x)=\left[P-\left(1+r\right){A}_3\right]\frac{x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.14)

Substituting (8.14) into (8.13) yields:

$$ \begin{array}{l}W\left(2,x\right)=\left[{A}_2x+{C}_2\right]=\left[\left(\frac{1}{1+r}\right)\sum_{j=1}^2\left[P-\left(1+r\right){A}_3\right]\frac{P+\left(1+r\right){A}_3}{4{c}_j}\right.\\ {}\kern1em \left.+{A}_3\left(1-b\right)-\left[P-\left(1+r\right){A}_3\right]\frac{A_3}{2{c}_1}-\left[P-\left(1+r\right){A}_3\right]\frac{A_3}{2{c}_2}\right]x+a{A}_3.\end{array} $$
(8.15)

Substituting \( {A}_3 \) into (8.15), \( {A}_2 \) and \( {C}_2 \) are obtained in explicit terms.

Finally, we proceed to the first stage; the conditions in Eq. (3.9) become

$$ \begin{array}{l}W\left(1,x\right)=\left[{A}_1x+{C}_1\right]=\underset{u_1^1,{u}_1^2}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle \sum_{j=1}^2}\left[P{u}_1^j-\frac{c_j}{x}{\left({u}_1^{\mathrm{j}}\right)}^2\right]\kern0.5em \\ {}+\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{A}_2\left[x+a-bx-{\displaystyle \sum_{j=1}^2{u}_1^j}\right]+{C}_2\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right)\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}.\end{array} $$
(8.16)

Performing the indicated maximization in (8.16) yields the optimal cooperative strategies in stage 1 as:

$$ \begin{array}{cc}\hfill {\psi}_1^i(x)=\left(P-{A}_2\right)\frac{x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.17)

Substituting (8.17) into (8.16) yields:

$$ \begin{array}{l}W\left(1,x\right)=\left[{A}_1x+{C}_1\right]=\\ {}\left[\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle \sum_{j=1}^2\left(P-{A}_2\right)}\frac{P+{A}_2}{4{c}_j}+{A}_2\left(1-b\right)-\left(P-{A}_2\right)\frac{A_2}{2{c}_1}-\left(P-{A}_2\right)\frac{A_2}{2{c}_2}\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right]\;x\\ {}+a{A}_2+{C}_2.\end{array} $$
(8.18)

Substituting the explicit terms for \( {A}_2 \) and \( {C}_2 \) from (8.15) into (8.18), \( {A}_1 \) and \( {C}_1 \) are obtained in explicit terms.
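The cooperative recursion admits the same numerical treatment with a single joint value function \( W\left(k,x\right)={A}_kx+{C}_k \), maximized over both controls at once. A sketch under the same illustrative parameter assumptions as before:

```python
# Joint-payoff counterpart of the recursion in Appendix A: one linear value
# function W(k, x) = A_k x + C_k. Parameter values are illustrative assumptions.
P, r, q, a, b = 10.0, 0.05, 2.0, 5.0, 0.1
c = [10.0, 15.0]
T = 3
A, C = {T + 1: 2 * q / (1 + r) ** 3}, {T + 1: 0.0}   # W(4, x) = 2(1/(1+r))^3 q x
for k in range(T, 0, -1):
    disc = (1.0 / (1 + r)) ** (k - 1)
    # Optimal cooperative controls (8.11), (8.14), (8.17):
    # psi_k^i(x) = [P - A_{k+1}/disc] x / (2 c_i)
    coef = [(P - A[k + 1] / disc) / (2 * ci) for ci in c]
    w = []
    for x in (1.0, 2.0):                # two points pin down the linear coefficients
        u = [cf * x for cf in coef]
        stage = sum(P * u[i] - c[i] * u[i] ** 2 / x for i in range(2)) * disc
        x_next = x + a - b * x - sum(u)
        w.append(stage + A[k + 1] * x_next + C[k + 1])
    A[k], C[k] = w[1] - w[0], 2 * w[0] - w[1]
```

Here \( {C}_3 \) must equal \( 2qa{\left(1+r\right)}^{-3} \), mirroring the check in the noncooperative case.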

Appendix C. Proof of Proposition 6.1

Consider first the last operating stage, that is stage 3. Invoking that \( {V}^i\left(3,x\right)=\left[{A}_3^ix+{C}_3^i\right] \) from Proposition 6.1 and \( {V}^i\left(4,x\right)={\left(\frac{1}{1+r}\right)}^3q{x}_4, \) the conditions in Eq. (6.4) become

$$ \begin{array}{l}{V}^i\left(3,x\right)=\left[{A}_3^ix+{C}_3^i\right]=\underset{u_3^i}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\left[P{u}_3^i-\frac{c_i}{x}{\left({u}_3^{\mathrm{i}}\right)}^2\right]\kern0.5em {\left(\frac{1}{1+r}\right)}^2\\ {}\begin{array}{cc}\hfill +{\displaystyle \sum_{y=1}^3{\lambda}_3^y}{\left(\frac{1}{1+r}\right)}^3q\left[x+a-{\theta}_3^yx-{u}_3^i-{\phi}_3^j(x)\right]\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array}\end{array} $$
(8.19)

Performing the indicated maximization in (8.19) yields the game equilibrium strategies in stage 3 as:

$$ \begin{array}{cc}\hfill {\phi}_3^i(x)=\frac{\left[P-{\left(1+r\right)}^{-1}q\right]x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.20)

Substituting (8.20) into (8.19) yields:

$$ \begin{array}{l}{V}^i\left(3,x\right)=\left[{A}_3^ix+{C}_3^i\right]={\left(\frac{1}{1+r}\right)}^2\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\left[P-{\left(1+r\right)}^{-1}q\right]\frac{P}{2{c}_i}x\hfill \\ {}-{\left[P-{\left(1+r\right)}^{-1}q\right]}^2\frac{1}{4{c}_i}x\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right)+q\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.x+a-{\displaystyle \sum_{y=1}^3{\lambda}_3^y{\theta}_3^y}x\hfill \\ {}-\left[P-{\left(1+r\right)}^{-1}q\right]\frac{1}{2{c}_i}x-\left[P-{\left(1+r\right)}^{-1}q\right]\frac{1}{2{c}_j}x\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right){\left(\frac{1}{1+r}\right)}^3\hfill \\ {}\kern0.5em \mathrm{f}\mathrm{o}\mathrm{r}\ i,j\in \left\{1,2\right\}\ \mathrm{and}\ i\ne j.\hfill \end{array} $$
(8.21)

Using (8.21), we can obtain \( {A}_3^i \) and \( {C}_3^i \), for \( i\in \left\{1,2\right\} \).

Now we proceed to stage 2; the conditions in Eq. (6.4) become

$$ \begin{array}{l}{V}^i\left(2,x\right)=\left[{A}_2^ix+{C}_2^i\right]=\underset{u_2^i}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\left[P{u}_2^i-\frac{c_i}{x}{\left({u}_2^{\mathrm{i}}\right)}^2\right]\kern0.5em \left(\frac{1}{1+r}\right)\\ {}+{\displaystyle \sum_{y=1}^3{\lambda}_2^y}{A}_3^i\left[x+a-{\theta}_2^yx-{u}_2^i-{\phi}_2^j(x)\right]\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\},\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ i,j\in \left\{1,2\right\}\ \mathrm{and}\ i\ne j.\end{array} $$
(8.22)

Performing the indicated maximization in (8.22) yields the game equilibrium strategies in stage 2 as:

$$ \begin{array}{cc}\hfill {\phi}_2^i(x)=\left[P-\left(1+r\right){A}_3^i\right]\frac{x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.23)

Substituting (8.23) into (8.22) yields:

$$ \begin{array}{l}{V}^i\left(2,x\right)=\left[{A}_2^ix+{C}_2^i\right]=\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\left(\frac{1}{1+r}\right)\left[P-\left(1+r\right){A}_3^i\right]\frac{P+\left(1+r\right){A}_3^i}{4{c}_i}\\ {}+{A}_3^i\left(1-{\displaystyle \sum_{y=1}^3{\lambda}_2^y}{\theta}_2^y\right)-\left[P-\left(1+r\right){A}_3^i\right]\frac{A_3^i}{2{c}_i}-\left[P-\left(1+r\right){A}_3^j\right]\frac{A_3^i}{2{c}_j}\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}x+a{A}_3^i,\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ i,j\in \left\{1,2\right\}\ \mathrm{and}\ i\ne j.\end{array} $$
(8.24)

Substituting \( {A}_3^i \) for \( i\in \left\{1,2\right\} \) into (8.24), \( {A}_2^i \) and \( {C}_2^i \) for \( i\in \left\{1,2\right\} \) are obtained in explicit terms.

Finally, we proceed to the first stage; the conditions in Eq. (6.4) become

$$ \begin{array}{l}{V}^i\left(1,x\right)=\left[{A}_1^ix+{C}_1^i\right]=\underset{u_1^i}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\left[P{u}_1^i-\frac{c_i}{x}{\left({u}_1^{\mathrm{i}}\right)}^2\right]\kern0.5em \\ {}+{\displaystyle \sum_{y=1}^3{\lambda}_1^y}\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{A}_2^i\left[x+a-{\theta}_1^yx-{u}_1^i-{\phi}_1^j(x)\right]+{C}_2^i\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right)\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\},\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\ i,j\in \left\{1,2\right\}\ \mathrm{and}\ i\ne j.\end{array} $$
(8.25)

Performing the indicated maximization in (8.25) yields the game equilibrium strategies in stage 1 as:

$$ \begin{array}{cc}\hfill {\phi}_1^i(x)=\left[P-{A}_2^i\right]\frac{x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.26)

Substituting (8.26) into (8.25) yields:

$$ \begin{array}{l}{V}^i\left(1,x\right)=\left[{A}_1^ix+{C}_1^i\right]=\\ {}\left[\left(P-{A}_2^i\right)\frac{P+{A}_2^i}{4{c}_i}+{A}_2^i\left(1-\sum_{y=1}^3{\lambda}_1^y{\theta}_1^y\right)-\left(P-{A}_2^i\right)\frac{A_2^i}{2{c}_i}-\left(P-{A}_2^j\right)\frac{A_2^i}{2{c}_j}\right]x\\ {}+a{A}_2^i+{C}_2^i,\kern1em \mathrm{for}\ i,j\in \left\{1,2\right\}\ \mathrm{and}\ i\ne j.\end{array} $$
(8.27)

Substituting the explicit terms for \( {A}_2^i \), \( {A}_2^j \), \( {C}_2^i \) and \( {C}_2^j \) from (8.24) into (8.27), \( {A}_1^i \) and \( {C}_1^i \) for \( i\in \left\{1,2\right\} \) are obtained in explicit terms.

Appendix D. Proof of Proposition 6.2

Consider first the last stage, that is stage 3. Invoking that \( W\left(3,x\right)=\left[{A}_3x+{C}_3\right] \) from Proposition 6.2 and \( W\left(4,x\right)=2{\left(\frac{1}{1+r}\right)}^3q{x}_4 \), the conditions in Eq. (6.9) become

$$ \begin{array}{l}W\left(3,x\right)=\left[{A}_3x+{C}_3\right]=\underset{u_3^1,{u}_3^2}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle \sum_{j=1}^2}\left[P{u}_3^j-\frac{c_j}{x}{\left({u}_3^{\mathrm{j}}\right)}^2\right]\kern0.5em {\left(\frac{1}{1+r}\right)}^2\\ {}+{\displaystyle \sum_{y=1}^3{\lambda}_3^y}\ 2{\left(\frac{1}{1+r}\right)}^3q\left[x+a-{\theta}_3^yx-{\displaystyle \sum_{j=1}^2{u}_3^j}\right]\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}.\end{array} $$
(8.28)

Performing the indicated maximization in (8.28) yields the optimal cooperative strategies in stage 3 as:

$$ \begin{array}{cc}\hfill {\psi}_3^i(x)=\frac{\left[P-{\left(1+r\right)}^{-1}2q\right]x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.29)

Substituting (8.29) into (8.28) yields:

$$ \begin{array}{l}W\left(3,x\right)=\left[{A}_3x+{C}_3\right]={\left(\frac{1}{1+r}\right)}^2\sum_{j=1}^2\left\{\left[P-{\left(1+r\right)}^{-1}2q\right]\frac{P}{2{c}_j}x\right.\\ {}\kern1em \left.-{\left[P-{\left(1+r\right)}^{-1}2q\right]}^2\frac{1}{4{c}_j}x\right\}\\ {}\kern1em +2q\left(x+a-\sum_{y=1}^3{\lambda}_3^y{\theta}_3^yx-\left[P-{\left(1+r\right)}^{-1}2q\right]\frac{1}{2{c}_1}x\right.\\ {}\kern1em \left.-\left[P-{\left(1+r\right)}^{-1}2q\right]\frac{1}{2{c}_2}x\right){\left(\frac{1}{1+r}\right)}^3.\end{array} $$
(8.30)

Using (8.30), we obtain \( {A}_3 \) and \( {C}_3 \).

Now we proceed to stage 2; the conditions in Eq. (6.9) become

$$ \begin{array}{l}W\left(2,x\right)=\left[{A}_2x+{C}_2\right]=\underset{u_2^1,{u}_2^2}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle \sum_{j=1}^2}\left[P{u}_2^j-\frac{c_j}{x}{\left({u}_2^{\mathrm{j}}\right)}^2\right]\kern0.5em \left(\frac{1}{1+r}\right)\\ {}+{\displaystyle \sum_{y=1}^3{\lambda}_2^y}{A}_3\left[x+a-{\theta}_2^yx-{\displaystyle \sum_{j=1}^2{u}_2^j}\right]\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}.\end{array} $$
(8.31)

Performing the indicated maximization in (8.31) yields the optimal cooperative strategies in stage 2 as:

$$ \begin{array}{cc}\hfill {\psi}_2^i(x)=\left[P-\left(1+r\right){A}_3\right]\frac{x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.32)

Substituting (8.32) into (8.31) yields:

$$ \begin{array}{l}W\left(2,x\right)=\left[{A}_2x+{C}_2\right]=\left[\left(\frac{1}{1+r}\right)\sum_{j=1}^2\left[P-\left(1+r\right){A}_3\right]\frac{P+\left(1+r\right){A}_3}{4{c}_j}\right.\\ {}\kern1em \left.+{A}_3\left(1-\sum_{y=1}^3{\lambda}_2^y{\theta}_2^y\right)-\left[P-\left(1+r\right){A}_3\right]\frac{A_3}{2{c}_1}-\left[P-\left(1+r\right){A}_3\right]\frac{A_3}{2{c}_2}\right]x+a{A}_3.\end{array} $$
(8.33)

Substituting \( {A}_3 \) into (8.33), \( {A}_2 \) and \( {C}_2 \) are obtained in explicit terms.

Finally, we proceed to the first stage; the conditions in Eq. (6.9) become

$$ \begin{array}{l}W\left(1,x\right)=\left[{A}_1x+{C}_1\right]=\underset{u_1^1,{u}_1^2}{ \max}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle \sum_{j=1}^2}\left[P{u}_1^j-\frac{c_j}{x}{\left({u}_1^{\mathrm{j}}\right)}^2\right]\kern0.5em \\ {}+{\displaystyle \sum_{y=1}^3{\lambda}_1^y}\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{A}_2\left[x+a-{\theta}_1^yx-{\displaystyle \sum_{j=1}^2{u}_1^j}\right]+{C}_2\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right)\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}.\end{array} $$
(8.34)

Performing the indicated maximization in (8.34) yields the optimal cooperative strategies in stage 1 as:

$$ \begin{array}{cc}\hfill {\psi}_1^i(x)=\left(P-{A}_2\right)\frac{x}{2{c}_i},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ i\in \left\{1,2\right\}.\hfill \end{array} $$
(8.35)

Substituting (8.35) into (8.34) yields:

$$ \begin{array}{l}W\left(1,x\right)=\left[{A}_1x+{C}_1\right]=\left[\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle \sum_{j=1}^2\left(P-{A}_2\right)}\frac{P+{A}_2}{4{c}_j}+{A}_2\left(1-{\displaystyle \sum_{y=1}^3{\lambda}_1^y}{\theta}_1^y\right)\\ {}-\left(P-{A}_2\right)\frac{A_2}{2{c}_1}-\left(P-{A}_2\right)\frac{A_2}{2{c}_2}\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right]\;x+a{A}_2+{C}_2.\end{array} $$
(8.36)

Substituting the explicit terms for \( {A}_2 \) and \( {C}_2 \) from (8.33) into (8.36), \( {A}_1 \) and \( {C}_1 \) are obtained in explicit terms.
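Comparing Appendices C and D with A and B shows the only structural change under uncertainty: the deterministic decay term bx is replaced by its expectation \( \sum_{y=1}^3{\lambda}_k^y{\theta}_k^yx \), while the stage-wise strategy formulas keep their form. A one-line sketch, using the probabilities and shock values of Problem (4) below purely as an illustration:

```python
# Expected decay rate that plays the role of b in the stochastic proofs of
# Appendices C and D; lambda/theta values are borrowed from Problem (4)
# purely for illustration.
lam = [0.3, 0.5, 0.2]     # lambda_k^y: probabilities of the shocks
theta = [0.0, 0.1, 0.2]   # theta_k^y: possible shock values
b_expected = sum(l * t for l, t in zip(lam, theta))
```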

9 Chapter Notes

Discrete-time dynamic games are often more suitable for real-life applications and operations research analyses. Properties of Nash equilibria in dynamic games are examined in Basar (1974, 1976). Solution algorithms for dynamic games can be found in Basar (1977a, b). Petrosyan and Zenkevich (1996) presented an analysis of cooperative dynamic games in a discrete-time framework. The SIAM Classics edition of Dynamic Noncooperative Game Theory by Basar and Olsder (1995) gave a comprehensive treatment of discrete-time noncooperative dynamic games. Bylka et al. (2000) analyzed oligopolistic price competition in a dynamic game model. Wie and Choi (2000) examined discrete-time traffic networks. Beard and McDonald (2007) investigated water sharing agreements, and Amir and Nannerup (2006) considered resource extraction problems in a discrete-time dynamic framework. Krawczyk and Tidball (2006) considered a dynamic game of water allocation. Nie et al. (2006) considered a dynamic programming approach to discrete-time dynamic Stackelberg games. Dockner and Nishimura (1999) and Rubio and Ulph (2007) presented discrete-time dynamic games for pollution management. Dutta and Radner (2006) presented a discrete-time dynamic game to study global warming. Ehtamo and Hamalainen (1993) examined cooperative incentive equilibria for a dynamic resource game. Yeung (2014) examined dynamically consistent collaborative environmental management with technology selection in a discrete-time dynamic game framework. Lehrer and Scarsini (2013) considered the core of dynamic cooperative games.

Discrete-time stochastic dynamic game analyses are less frequent than their continuous-time counterparts. Basar and Ho (1974) examined informational properties of the Nash solutions of stochastic nonzero-sum games. Elimination of informational nonuniqueness in Nash equilibria through a stochastic formulation was first discussed in Basar (1976) and further examined in Basar (1975, 1979, 1989). Basar and Mintz (1972, 1973) and Basar (1978) developed equilibrium solutions of linear-quadratic stochastic dynamic games with noisy observation. Bauso and Timmer (2009) considered robust dynamic cooperative games in which the coalitional values at each point in time are unknown but bounded by a polyhedron. Smith and Zenou (2003) considered a discrete-time stochastic job searching model. Esteban-Bravo and Nogales (2008) analyzed mathematical programming for stochastic discrete-time dynamics arising in economic systems, including examples in a stochastic national growth model and an international growth model with uncertainty. Basar and Olsder (1995) gave a comprehensive treatment of noncooperative stochastic dynamic games. Yeung and Petrosyan (2010) provided the techniques for characterizing subgame consistent solutions of stochastic dynamic games. Finally, the heuristic approach of Sect. 7.7 widens the applicability of subgame consistent solutions to cooperative game problems in which only estimates, with acceptable degrees of accuracy, of the expected cooperative payoffs and individual non-cooperative payoffs are available.

10 Problems

  1.

    Consider an economy endowed with a renewable resource and with two resource extractors (firms). The lease for resource extraction begins at stage 1 and ends at stage 3 for these two firms. Let \( {u}_k^i \) denote the rate of resource extraction of firm i at stage k, for \( i\in \left\{1,2\right\} \). Let \( {U}^i \) be the set of admissible extraction rates, and \( {x}_k\in X\subset {R}^{+} \) the size of the resource stock at stage k. In particular, we have \( {U}^i\subset {R}^{+} \) and \( {u}_k^1+{u}_k^2\le {x}_k \). The extraction costs for firms 1 and 2 are respectively

    \( {\left({u}_k^1\right)}^2/{x}_k \) and \( 1.5{\left({u}_k^2\right)}^2/{x}_k \).

    The profits that firm 1 and firm 2 will obtain at stage k are respectively:

    $$ \left[10{u}_k^1-\frac{4}{x_k}{\left({u}_k^1\right)}^2\right]\kern0.75em \mathrm{and}\ \left[4{u}_k^2-\frac{2}{x_k}{\left({u}_k^2\right)}^2\right]\kern0.5em . $$

    A terminal payment of \( 4{x}_4 \) will be given to each firm after stage 3.

    The growth dynamics of the resource is governed by the difference equation:

    $$ \begin{array}{cc}\hfill {x}_{k+1}={x}_k+20-0.1{x}_k-{\displaystyle \sum_{j=1}^2{u}_k^j},\hfill & \hfill \mathrm{f}\mathrm{o}\mathrm{r}\ k\in \left\{1,2,3\right\}\ \mathrm{and}\ {x}_1=24.\hfill \end{array} $$

    Characterize the feedback Nash equilibrium solution for the above resource economy.

  2.

    If the extractors agree to cooperate and maximize their joint payoff, derive the optimal cooperative strategies and the optimal resource trajectory.

  3.

    Consider the case when the extractors agree to share the excess of cooperative gains over their noncooperative payoffs equally. Derive a subgame consistent solution.

  4.

    Consider an economy endowed with a renewable resource and with two resource extractors (firms). The lease for resource extraction begins at stage 1 and ends at stage 4 for these two firms. Let \( {u}_k^i \) denote the rate of resource extraction of firm i at stage k, for \( i\in \left\{1,2\right\} \). Let \( {U}^i \) be the set of admissible extraction rates, and \( {x}_k\in X\subset {R}^{+} \) the size of the resource stock at stage k. In particular, we have \( {U}^i\subset {R}^{+} \) and \( {u}_k^1+{u}_k^2\le {x}_k \).

    The profits that firm 1 and firm 2 will obtain at stage k are respectively:

    $$ \left[5{u}_k^1-\frac{2}{x_k}{\left({u}_k^1\right)}^2\right]\kern0.75em \mathrm{and}\ \left[3{u}_k^2-\frac{1}{x_k}{\left({u}_k^2\right)}^2\right]\kern0.5em . $$

    A terminal payment of \( 3{x}_5 \) will be given to each firm after stage 4.

    The growth dynamics of the resource is governed by the stochastic difference equation:

    $$ {x}_{k+1}={x}_k+15-0.1{x}_k-{\displaystyle \sum_{j=1}^2{u}_k^j}+{\theta}_k{x}_k, $$

    for \( k\in \left\{1,2,3,4\right\} \) and \( {x}_1=55 \),

    where \( {\theta}_k \) is a random variable with range {0, 0.1, 0.2} and corresponding probabilities {0.3, 0.5, 0.2}.

    Characterize a Nash equilibrium solution for the above discrete-time stochastic market game.

  5.

    If the extractors agree to cooperate and maximize their expected joint payoff, derive the group optimal cooperative strategies.

  6.

    Consider the case when the extractors agree to share the excess of expected cooperative gains in proportion to their expected noncooperative payoffs. Derive a subgame consistent solution.