An essential characteristic of time – and hence of decision making over time – is that although an individual may, through the expenditure of resources, gather past and present information, the future is inherently unknown and therefore (in the mathematical sense) uncertain. There is no escape from this fact, regardless of the resources the individual chooses to devote to obtaining data and information and to forecasting. An empirically meaningful theory must therefore incorporate time-uncertainty in an appropriate manner. Important forms of structure uncertainty arise from uncertain payoffs and from perturbing stochastic state dynamics. Causes of structure uncertainty include (a) imprecise or incomplete knowledge about the game’s payoffs over time – the benefits and costs from playing are generally known only probabilistically – and (b) imperfect knowledge regarding the behavior of the game’s state variables – generally, how the game evolves over time is known only probabilistically. To meet the challenges posed by structure uncertainty, randomly-furcating stochastic differential games allow random shocks in the state dynamics and stochastic changes in payoffs. Since future payoffs are not known with certainty, the term “randomly-furcating” is introduced to highlight the fact that a particularly useful way to analyze the situation is to assume that payoffs change at future instants of time according to (known) probability distributions defined in terms of multiple-branching stochastic processes (see Yeung (2001) and Yeung (2003)).

This Chapter presents an \( n \)-player counterpart of Petrosyan and Yeung’s (2007) 2-player analysis of subgame-consistent cooperative solutions in randomly-furcating stochastic differential games. The organization of the Chapter is as follows. Section 4.1 presents the basic formulation of randomly-furcating cooperative differential games. Section 4.2 presents an analysis of subgame consistent dynamic cooperation in this class of games. Derivation of a subgame consistent payoff distribution procedure is provided in Sect. 4.3. An illustration of the solution mechanism is given in a cooperative fishery game in Sect. 4.4. Subgame consistency in infinite horizon randomly-furcating cooperative differential games is examined in Sect. 4.5. Chapter notes are given in Sect. 4.6 and problems in Sect. 4.7.

1 Game Formulation and Noncooperative Outcomes

Consider a class of randomly furcating stochastic differential games in which there are n players. The game interval is [t 0, T]. When the game commences at t 0, the payoff structures of the players in the interval \( \left[{t}_0,{t}_1\right) \) are known. At future instants of time \( {t}_k\ \left(k=1,\kern0.5em 2,\kern0.5em \cdots, \kern0.5em m\right) \), where \( {t}_0<{t}_1<\cdots <{t}_m<T\equiv {t}_{m+1} \), the payoff structures in the time interval \( \left[{t}_k,{t}_{k+1}\right) \) are affected by a series of random events Θk. In particular, the Θk, for \( k\in \left\{1,\kern0.5em 2,\kern0.5em \cdots, \kern0.5em m\right\} \), are independent and identically distributed random variables with range {θ 1, θ 2, …, θ η } and corresponding probabilities {λ 1, λ 2, …, λ η }. Changes in preferences, technology, legal arrangements and the physical environment are examples of factors which bring about changes in payoff structures. At time T a terminal value q i(x(T)) will be given to player i. Specifically, player i seeks to maximize the expected payoff:

$$ \begin{array}{l}{E}_{t_0}\left\{{\displaystyle {\int}_{t_0}^{t_1}{g}^{\left[i,{\theta}_0^0\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-{t}_0\right)}ds}\right.\hfill \\ {}+{\displaystyle \sum_{h=1}^m{\displaystyle \sum_{a_h=1}^{\eta }{\lambda}_{a_h}}{\displaystyle {\int}_{t_h}^{t_{h+1}}{g}^{\left[i,{\theta}_{a_h}^h\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-{t}_0\right)}ds}}\hfill \\ {}\left.+{e}^{-r\left(T-{t}_0\right)}{q}^i\left(x(T)\right)\right\},\kern1em \mathrm{for}\kern0.3em i\in \left\{1,2,\cdots, n\right\}\equiv N,\hfill \end{array} $$
(1.1)

where \( x(s)\in X\subset {R}^{\kappa } \) is a vector of state variables, \( {\theta}_{a_k}^k\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) for \( k\in \left\{1,2,\cdots, m\right\} \) are the realized payoff-structure events, \( {\theta}_{a_0}^0={\theta}_0^0 \) is known at time t 0, r is the discount rate, \( {u}_i\in {U}^i \) is the control of player i, and \( {E}_{t_0} \) denotes the expectation operator taken at time t 0. The payoffs of the players are transferable.
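To illustrate numerically how the branching probabilities enter the expected payoff (1.1), the following sketch computes the λ-weighted expectation by quadrature for a hypothetical case with m = 2 branching instants and η = 2 regimes; the flow payoff `g` is a stand-in and the state dependence is suppressed, so all names and numbers here are assumptions, not part of the model:

```python
import numpy as np

# Hypothetical setup: m = 2 branching instants, eta = 2 payoff regimes.
r = 0.05                            # discount rate
t = [0.0, 1.0, 2.0, 3.0]            # t_0, t_1, t_2, T = t_3
lam = [0.6, 0.4]                    # lambda_1, lambda_2

def g(theta_a, s):
    # hypothetical flow payoff g^{[i, theta_a]}(s); theta_0^0 is coded as 0
    return (1.0 + 0.5 * theta_a) * np.exp(-0.1 * s)

def interval_payoff(theta_a, lo, hi, n=400):
    # integral over [lo, hi] of g(s) e^{-r (s - t_0)} ds by the trapezoid rule
    s = np.linspace(lo, hi, n)
    y = g(theta_a, s) * np.exp(-r * (s - t[0]))
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(s)))

# Known regime theta_0^0 on [t_0, t_1), then lambda-weighted branches
# on each later interval [t_h, t_{h+1}), mirroring the sums in (1.1).
expected = interval_payoff(0, t[0], t[1])
for h in (1, 2):
    expected += sum(lam[a] * interval_payoff(a + 1, t[h], t[h + 1])
                    for a in range(len(lam)))
print(expected)
```

By construction the result lies between the two pure-regime totals, which is exactly what the λ-weighting in (1.1) expresses.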

The state dynamics of the game is characterized by the vector-valued stochastic differential equations:

$$ dx(s)=f\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]ds+\sigma \left[s,x(s)\right]dz(s),\;x\left({t}_0\right)={x}_0, $$
(1.2)

where σ[s, x(s)] is a \( \kappa \times \upsilon \) matrix, z(s) is a υ-dimensional Wiener process, and the initial state x 0 is given. Let \( \Omega \left[s,x(s)\right]=\sigma \left[s,x(s)\right]\sigma {\left[s,x(s)\right]}^T \) denote the covariance matrix, with the element in its row h and column ζ denoted by \( {\Omega}^{h\zeta}\left[s,x(s)\right] \). The control set \( {U}_i \) of player i is a compact subset of \( {R}^{\ell } \), for \( i\in N \).
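For intuition, a sample path of the state dynamics (1.2) can be generated with the Euler–Maruyama scheme; the drift `f` and diffusion `sigma` below are hypothetical one-dimensional stand-ins (κ = υ = 1), chosen only to make the sketch concrete:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(s, x, u):
    return u - 0.2 * x              # hypothetical controlled linear drift

def sigma(s, x):
    return 0.1                      # hypothetical constant diffusion term

def euler_maruyama(x0, u, t0=0.0, T=3.0, n=3000):
    # discretize dx = f ds + sigma dz with Wiener increments dz ~ N(0, ds)
    ds = (T - t0) / n
    x = np.empty(n + 1)
    x[0] = x0
    s = t0
    for k in range(n):
        dz = rng.normal(0.0, np.sqrt(ds))
        x[k + 1] = x[k] + f(s, x[k], u) * ds + sigma(s, x[k]) * dz
        s += ds
    return x

path = euler_maruyama(x0=1.0, u=0.3)
print(path[0], path[-1])
```

Averaging many such paths approximates the expectations \( {E}_{t_0} \) appearing in the payoffs above.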

To obtain a Nash equilibrium solution for the game (1.1 and 1.2), we first consider the solution for the subgame in the last time interval, that is [t m , T]. For the case where \( {\theta}_{a_m}^m\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) has occurred at time instant t m and \( x\left({t}_m\right)={x}_{t_m}\in X \), player i maximizes the payoff:

$$ \begin{array}{l}{E}_{t_m}\Big\{{\displaystyle {\int}_{t_m}^T{g}^{\left[i,{\theta}_{a_m}^m\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-{t}_0\right)}ds}\\ {}+{q}^i\left(x(T)\right)\,{e}^{-r\left(T-{t}_0\right)}\;\Big|\;x\left({t}_m\right)={x}_{t_m}\Big\},\end{array} $$
(1.3)

subject to

$$ dx(s)=f\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]ds+\sigma \left[s,x(s)\right]dz(s),x\left({t}_m\right)={x}_{t_m}. $$
(1.4)

The conditions characterizing a Nash equilibrium solution of the game (1.3 and 1.4) are provided in the lemma below.

Lemma 1.1

A set of feedback strategies \( \left\{{u}_i^{(m){\theta}_{\alpha_m}^m}(t)={\phi}_i^{(m){\theta}_{\alpha_m}^m}\left(t,x\right),\;\mathrm{for}\;i\in N\;\mathrm{and}\;t\in \left[{t}_m,T\right]\right\} \) constitutes a Nash equilibrium solution for the game (1.3 and 1.4), if there exist continuously differentiable functions \( {V}^{i\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right):\left[{t}_m,T\right]\times {R}^{\kappa}\to R \), for \( i\in N \), which satisfy the following partial differential equations:

$$ \begin{array}{l}-{V}_t^{i\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right)-\frac{1}{2}{\displaystyle \sum_{h,\zeta =1}^{\kappa }{\Omega}^{h\zeta}\left(t,x\right)}\,{V}_{x^h{x}^{\zeta}}^{i\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right)\\ {}=\underset{u_i^{\theta_{\alpha_m}}\in {U}^i}{ \max}\Big\{{g}^{\left[i,{\theta}_{\alpha_m}^m\right]}\left[t,x,{u}_i^{(m){\theta}_{\alpha_m}^m},{\underline{\phi}}_{N\backslash i}^{(m){\theta}_{\alpha_m}^m}\left(t,x\right)\right]{e}^{-r\left(t-{t}_0\right)}\\ {}+{V}_x^{i\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right)f\left[t,x,{u}_i^{(m){\theta}_{\alpha_m}^m},{\underline{\phi}}_{N\backslash i}^{(m){\theta}_{\alpha_m}^m}\left(t,x\right)\right]\Big\},\;\mathrm{and}\\ {}{V}^{i\left[{\theta}_{\alpha_m}^m\right](m)}\left(T,x\right)={e}^{-r\left(T-{t}_0\right)}{q}^i(x),\kern1em \mathrm{for}\;i\in N,\end{array} $$
(1.5)

where

$$ \begin{array}{l}{\underline{\phi}}_{N\backslash i}^{(m){\theta}_{\alpha_m}^m}\left(t,x\right)=\\ {}\left[{\phi}_1^{(m){\theta}_{\alpha_m}^m}\left(t,x\right),{\phi}_2^{(m){\theta}_{\alpha_m}^m}\left(t,x\right),\cdots, {\phi}_{i-1}^{(m){\theta}_{\alpha_m}^m}\left(t,x\right),{\phi}_{i+1}^{(m){\theta}_{\alpha_m}^m}\left(t,x\right),\cdots, {\phi}_n^{(m){\theta}_{\alpha_m}^m}\left(t,x\right)\right].\end{array} $$

Proof

System (1.5) satisfies the optimality conditions of stochastic dynamic programming in Theorem A.3 of the Technical Appendices for each player, together with the Nash (1951) equilibrium condition. Hence Lemma 1.1 follows. ■

For ease of exposition, and to sidestep the issue of multiple equilibria, we assume that a particular noncooperative Nash equilibrium is adopted in the entire subgame. In order to formulate the subgame in the second-to-last time interval \( \left[{t}_{m-1},{t}_m\right) \), it is necessary to identify the expected terminal payoffs at time t m . If \( {\theta}_{a_m}^m \) occurs at time t m , one can invoke Lemma 1.1 and obtain player i’s payoff at time t m as \( {V}^{i\left[{\theta}_{a_m}^m\right](m)}\left({t}_m,{x}_{t_m}\right) \). Note that \( {V}^{i\left[{\theta}_{a_m}^m\right](m)}\left({t}_m,{x}_{t_m}\right) \) gives the expected payoff to player i for playing the subgame in the last interval if \( {\theta}_{a_m}^m \) occurs at time t m . Taking into consideration all the possibilities \( {\theta}_{a_m}^m\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \), the expected payoff to player i for playing the subgame in the last interval can be obtained as:

$$ {\displaystyle \sum_{a=1}^{\eta }{\lambda}_a}{V}^{i\left[{\theta}_a^m\right](m)}\left({t}_m,{x}_{t_m}\right). $$
(1.6)

The expected terminal payoff of player i, for \( i\in N \), in the subgame over the time interval \( \left[{t}_{m-1},{t}_m\right] \) is reflected by (1.6) under the assumption that a particular Nash equilibrium is adopted in each of the possible subgame scenarios in the time interval [t m , T]. If \( {\theta}_{a_{m-1}}^{m-1}\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) occurs at time \( {t}_{m-1} \), the subgame in the time interval \( \left[{t}_{m-1},{t}_m\right] \) can be formally set up as:

$$ \begin{array}{l}\underset{u_i}{ \max }{E}_{t_{m-1}}\Big\{{\displaystyle {\int}_{t_{m-1}}^{t_m}{g}^{\left[i,{\theta}_{a_{m-1}}^{m-1}\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-{t}_0\right)}ds}\\ {}+{\displaystyle \sum_{a=1}^{\eta }{\lambda}_a}{V}^{i\left[{\theta}_a^m\right](m)}\left({t}_m,x\left({t}_m\right)\right)\;\Big|\;x\left({t}_{m-1}\right)={x}_{t_{m-1}}\Big\},\kern1em \mathrm{for}\;i\in N,\end{array} $$
(1.7)

subject to

$$ \begin{array}{l}dx(s)=f\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]ds+\sigma \left[s,x(s)\right]dz(s),\\ {}x\left({t}_{m-1}\right)={x}_{t_{m-1}}\in X\end{array} $$
(1.8)

Similarly, if \( {\theta}_{a_k}^k\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) occurs at time t k , the subgame in the time interval \( \left[{t}_k,{t}_{k+1}\right) \), for \( k\in \left\{0,1,2,\cdots, m-2\right\} \), can be set up as:

$$ \begin{array}{l}\underset{u_i}{ \max }{E}_{t_k}\Big\{{\displaystyle {\int}_{t_k}^{t_{k+1}}{g}^{\left[i,{\theta}_{a_k}^k\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-{t}_0\right)}ds}\\ {}+{\displaystyle \sum_{a=1}^{\eta }{\lambda}_a}{V}^{i\left[{\theta}_a^{k+1}\right]\left(k+1\right)}\left({t}_{k+1},x\left({t}_{k+1}\right)\right)\;\Big|\;x\left({t}_k\right)={x}_{t_k}\Big\},\kern1em \mathrm{for}\;i\in N,\end{array} $$
(1.9)

subject to

$$ \begin{array}{l}dx(s)=f\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]ds+\sigma \left[s,x(s)\right]dz(s),\\ {}x\left({t}_k\right)={x}_{t_k}\in X.\end{array} $$
(1.10)

Following Lemma 1.1, a Nash equilibrium solution of the game (1.1 and 1.2) can be characterized by the following theorem.

Theorem 1.1

A set of feedback strategies \( \Big\{{u}_i^{(m){\theta}_{\alpha_m}^m}(t)={\phi}_i^{(m){\theta}_{\alpha_m}^m}\left(t,x\right), \) for \( t\in \left[{t}_m,T\right] \); \( {u}_i^{(k){\theta}_{\alpha_k}^k}(t)={\phi}_i^{(k){\theta}_{\alpha_k}^k}\left(t,x\right) \), for \( t\in \left[{t}_k,{t}_{k+1}\right) \), \( k\in \left\{0,1,2,\cdots, m-1\right\} \) and \( i\in N\Big\} \), contingent upon the events \( {\theta}_{\alpha_m}^m\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) and \( {\theta}_{\alpha_k}^k\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) for \( k\in \left\{1,2,\cdots, m-1\right\} \) constitutes a Nash equilibrium solution for the game (1.1 and 1.2), if there exist continuously differentiable functions \( {V}^{i\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right):\left[{t}_m,T\right]\times {R}^{\kappa}\to R \) and \( {V}^{i\left[{\theta}_{\alpha_k}^k\right](k)}\left(t,x\right):\left[{t}_k,{t}_{k+1}\right]\times {R}^{\kappa}\to R \), for \( k\in \left\{0,1,2,\cdots, m-1\right\} \) and \( i\in N \), which satisfy the following partial differential equations:

$$ \begin{array}{l}-{V}_t^{i\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right)-\frac{1}{2}{\displaystyle \sum_{h,\zeta =1}^{\kappa }{\Omega}^{h\zeta}\left(t,x\right)}\,{V}_{x^h{x}^{\zeta}}^{i\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right)\\ {}=\underset{u_i^{\theta_{\alpha_m}}\in {U}^i}{ \max}\Big\{{g}^{\left[i,{\theta}_{\alpha_m}^m\right]}\left[t,x,{u}_i^{(m){\theta}_{\alpha_m}^m},{\underline{\phi}}_{N\backslash i}^{(m){\theta}_{\alpha_m}^m}\left(t,x\right)\right]{e}^{-r\left(t-{t}_0\right)}\\ {}+{V}_x^{i\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right)f\left[t,x,{u}_i^{(m){\theta}_{\alpha_m}^m},{\underline{\phi}}_{N\backslash i}^{(m){\theta}_{\alpha_m}^m}\left(t,x\right)\right]\Big\},\;\mathrm{and}\\ {}{V}^{i\left[{\theta}_{\alpha_m}^m\right](m)}\left(T,x\right)={e}^{-r\left(T-{t}_0\right)}{q}^i(x);\\ {}-{V}_t^{i\left[{\theta}_{\alpha_k}^k\right](k)}\left(t,x\right)-\frac{1}{2}{\displaystyle \sum_{h,\zeta =1}^{\kappa }{\Omega}^{h\zeta}\left(t,x\right)}\,{V}_{x^h{x}^{\zeta}}^{i\left[{\theta}_{\alpha_k}^k\right](k)}\left(t,x\right)\\ {}=\underset{u_i^{\theta_{\alpha_k}}\in {U}^i}{ \max}\Big\{{g}^{\left[i,{\theta}_{\alpha_k}^k\right]}\left[t,x,{u}_i^{(k){\theta}_{\alpha_k}^k},{\underline{\phi}}_{N\backslash i}^{(k){\theta}_{\alpha_k}^k}\left(t,x\right)\right]{e}^{-r\left(t-{t}_0\right)}\\ {}+{V}_x^{i\left[{\theta}_{\alpha_k}^k\right](k)}\left(t,x\right)f\left[t,x,{u}_i^{(k){\theta}_{\alpha_k}^k},{\underline{\phi}}_{N\backslash i}^{(k){\theta}_{\alpha_k}^k}\left(t,x\right)\right]\Big\},\;\mathrm{and}\\ {}{V}^{i\left[{\theta}_{\alpha_k}^k\right](k)}\left({t}_{k+1},x\right)={\displaystyle \sum_{a=1}^{\eta }{\lambda}_a}{V}^{i\left[{\theta}_a^{k+1}\right]\left(k+1\right)}\left({t}_{k+1},x\right),\end{array} $$

for \( i\in N \) and \( k\in \left\{0,1,2,\cdots, m-1\right\} \).

Proof

The results in Theorem 1.1 satisfy the optimality conditions of stochastic dynamic programming in Theorem A.3 of the Technical Appendices for each player, together with the Nash (1951) equilibrium condition. Hence Theorem 1.1 follows. ■
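The backward structure of Theorem 1.1 can be made concrete with a toy, state-suppressed analogue: each interval's value is integrated backward from its terminal condition, and at every branching instant the continuation value is the λ-weighted average over the η possible next-interval regimes, exactly as in (1.6) and the stitching condition of the theorem. The flow payoffs and all numbers below are hypothetical:

```python
import numpy as np

r = 0.05
t = [0.0, 1.0, 2.0, 3.0]            # t_0, t_1, t_2, T = t_3
lam = [0.6, 0.4]                    # probabilities of theta_1, theta_2
q = 2.0                             # terminal reward q^i (state suppressed)

def flow(theta_a):
    return 1.0 + 0.5 * theta_a      # hypothetical constant flow payoff

def interval_value(theta_a, lo, hi, terminal, n=400):
    # value at lo (discounted to t_0): flow integral plus continuation value
    s = np.linspace(lo, hi, n)
    y = flow(theta_a) * np.exp(-r * (s - t[0]))
    return float(np.sum((y[1:] + y[:-1]) / 2 * np.diff(s))) + terminal

# Last interval [t_2, T): terminal condition V(T) = e^{-r(T - t_0)} q
V_last = {a: interval_value(a, t[2], t[3], np.exp(-r * (t[3] - t[0])) * q)
          for a in (1, 2)}
cont2 = lam[0] * V_last[1] + lam[1] * V_last[2]     # stitching at t_2
V_mid = {a: interval_value(a, t[1], t[2], cont2) for a in (1, 2)}
cont1 = lam[0] * V_mid[1] + lam[1] * V_mid[2]       # stitching at t_1
V0 = interval_value(0, t[0], t[1], cont1)           # known regime theta_0^0
print(V0)
```

The two `cont` lines are the scalar analogue of the boundary condition \( {V}^{i\left[{\theta}_{\alpha_k}^k\right](k)}\left({t}_{k+1},x\right)={\displaystyle \sum_{a=1}^{\eta }{\lambda}_a}{V}^{i\left[{\theta}_a^{k+1}\right]\left(k+1\right)}\left({t}_{k+1},x\right) \).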

Two remarks given below will be utilized in subsequent analysis.

Remark 1.1

One can readily verify that \( {\overline{V}}^{i\left[{\theta}_{\alpha_k}^k\right](k)}\left({t}_k,{x}_{t_k}\right)={V}^{i\left[{\theta}_{\alpha_k}^k\right](k)}\left({t}_k,{x}_{t_k}\right){e}^{r\left({t}_k-{t}_0\right)} \) is the expected feedback Nash equilibrium payoff of player i, expressed in current value at time t k , in the game

$$ \begin{array}{l}\underset{u_i}{ \max }{E}_{t_k}\Big\{{\displaystyle {\int}_{t_k}^{t_{k+1}}{g}^{\left[i,{\theta}_{a_k}^k\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-{t}_k\right)}ds}\\ {}+{e}^{-r\left({t}_{k+1}-{t}_k\right)}{\displaystyle \sum_{a=1}^{\eta }{\lambda}_a}{\overline{V}}^{i\left[{\theta}_a^{k+1}\right]\left(k+1\right)}\left({t}_{k+1},x\left({t}_{k+1}\right)\right)\;\Big|\;x\left({t}_k\right)={x}_{t_k}\Big\},\kern1em \mathrm{for}\;i\in N,\end{array} $$

subject to

$$ \begin{array}{l}dx(s)=f\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]ds+\sigma \left[s,x(s)\right]dz(s),\\ {}x\left({t}_k\right)={x}_{t_k}\in X.\end{array} $$

Remark 1.2

One can also readily verify that \( {\overline{V}}^{i\left[{\theta}_{\alpha_k}^k\right](k)\tau}\left(\tau, {x}_{\tau}\right)={V}^{i\left[{\theta}_{\alpha_k}^k\right](k)}\left(\tau, {x}_{\tau}\right){e}^{r\left(\tau -{t}_0\right)} \), for \( \tau \in \left[{t}_k,{t}_{k+1}\right) \), is the expected feedback Nash equilibrium payoff of player i, expressed in current value at time τ, in the game

$$ \begin{array}{l}\underset{u_i}{ \max }{E}_{\tau}\Big\{{\displaystyle {\int}_{\tau}^{t_{k+1}}{g}^{\left[i,{\theta}_{a_k}^k\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-\tau \right)}ds}\\ {}+{e}^{-r\left({t}_{k+1}-\tau \right)}{\displaystyle \sum_{a=1}^{\eta }{\lambda}_a}{\overline{V}}^{i\left[{\theta}_a^{k+1}\right]\left(k+1\right)}\left({t}_{k+1},x\left({t}_{k+1}\right)\right)\;\Big|\;x\left(\tau \right)={x}_{\tau}\Big\},\kern1em \mathrm{for}\;i\in N,\end{array} $$

subject to

$$ \begin{array}{l}dx(s)=f\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]ds+\sigma \left[s,x(s)\right]dz(s),\\ {}x\left(\tau \right)={x}_{\tau}\in X.\end{array} $$
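Remarks 1.1 and 1.2 amount to a simple change of valuation date: a payoff discounted to t 0 is converted into its value expressed at a later instant by multiplying by the accumulated discount factor. A one-line numerical check, with purely hypothetical numbers:

```python
import math

# Remark 1.1: Vbar(t_k, x) = V(t_k, x) * e^{r (t_k - t_0)} re-expresses a
# t_0-present-value payoff in current value at t_k.  Hypothetical inputs:
r, t0, tk = 0.05, 0.0, 2.0
V_present = 3.70                    # value discounted to t_0
V_current = V_present * math.exp(r * (tk - t0))   # value expressed at t_k
print(V_current)
```

Discounting `V_current` back by \( e^{-r\left({t}_k-{t}_0\right)} \) recovers `V_present`, confirming the two representations are equivalent.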

2 Dynamic Cooperation

Now consider the case when the players want to cooperate and agree to act, and to allocate the cooperative payoff, according to a set of agreed-upon optimality principles. The agreement on how to act cooperatively and how to allocate the cooperative payoff constitutes the solution optimality principle of a cooperative scheme. In particular, the optimality principle includes:

  (i) an agreement on a set of cooperative strategies/controls, and

  (ii) a mechanism to distribute the total payoff among the players.

Both group rationality and individual rationality are required in a cooperative plan. Group rationality requires the players to seek a set of cooperative strategies/controls that yields a Pareto optimal outcome. The allocation principle has to satisfy individual rationality in the sense that under cooperation no player is worse off than in the noncooperative outcome.

2.1 Group Rationality

Since payoffs are transferable, group rationality requires the players to maximize their expected joint payoff

$$ \begin{array}{l}{E}_{t_0}\left\{{\displaystyle \sum_{j=1}^n{\displaystyle {\int}_{t_0}^{t_1}{g}^{\left[j,{\theta}_0^0\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-{t}_0\right)}ds}}\right.\\ {}+{\displaystyle \sum_{j=1}^n{\displaystyle \sum_{h=1}^m{\displaystyle \sum_{a_h=1}^{\eta }{\lambda}_{a_h}}}{\displaystyle {\int}_{t_h}^{t_{h+1}}{g}^{\left[j,{\theta}_{a_h}^h\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-{t}_0\right)}ds}}\\ {}\left.+{e}^{-r\left(T-{t}_0\right)}{\displaystyle \sum_{j=1}^n{q}^j\left(x(T)\right)}\right\}\end{array} $$
(2.1)

subject to (1.2).

We solve the control problem (2.1) and (1.2) in a manner similar to that used to solve the game (1.1 and 1.2). In particular, an optimal solution of the problem (2.1) and (1.2) is characterized by the theorem below.

Theorem 2.1

A set of controls {\( {u}_i^{(m){\theta}_{\alpha_m}^m}(t)={\psi}_i^{(m){\theta}_{\alpha_m}^m}\left(t,x\right), \) for \( t\in \left[{t}_m,T\right] \); \( {u}_i^{(k){\theta}_{\alpha_k}^k}(t)={\psi}_i^{(k){\theta}_{\alpha_k}^k}\left(t,x\right) \), for \( t\in \left[{t}_k,{t}_{k+1}\right) \), \( k\in \left\{0,1,2,\cdots, m-1\right\} \) and \( i\in N \)}, contingent upon the events \( {\theta}_{\alpha_m}^m \) and \( {\theta}_{\alpha_k}^k \) constitutes an optimal solution for the stochastic control problem (2.1 and 1.2), if there exist continuously differentiable functions \( {W}^{\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right):\left[{t}_m,T\right]\times {R}^{\kappa}\to R \) and \( {W}^{\left[{\theta}_{\alpha_k}^k\right](k)}\left(t,x\right):\left[{t}_k,{t}_{k+1}\right] \) \( \times {R}^{\kappa}\to R \) for \( k\in \left\{0,1,2,\cdots, m-1\right\} \) which satisfy the following partial differential equations:

$$ \begin{array}{l}-{W}_t^{\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right)-\frac{1}{2}{\displaystyle \sum_{h,\zeta =1}^{\kappa }{\Omega}^{h\zeta}\left(t,x\right)}\,{W}_{x^h{x}^{\zeta}}^{\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right)\\ {}=\underset{u_1^{\theta_{\alpha_m}},{u}_2^{\theta_{\alpha_m}},\cdots, {u}_n^{\theta_{\alpha_m}}}{ \max}\Big\{{\displaystyle \sum_{j=1}^n{g}^{\left[j,{\theta}_{\alpha_m}^m\right]}\left[t,x,{u}_1^{(m){\theta}_{\alpha_m}^m},{u}_2^{(m){\theta}_{\alpha_m}^m},\cdots, {u}_n^{(m){\theta}_{\alpha_m}^m}\right]}{e}^{-r\left(t-{t}_0\right)}\\ {}+{W}_x^{\left[{\theta}_{\alpha_m}^m\right](m)}\left(t,x\right)f\left[t,x,{u}_1^{(m){\theta}_{\alpha_m}^m},{u}_2^{(m){\theta}_{\alpha_m}^m},\cdots, {u}_n^{(m){\theta}_{\alpha_m}^m}\right]\Big\},\;\mathrm{and}\\ {}{W}^{\left[{\theta}_{\alpha_m}^m\right](m)}\left(T,x\right)={e}^{-r\left(T-{t}_0\right)}{\displaystyle \sum_{j=1}^n{q}^j(x)};\\ {}-{W}_t^{\left[{\theta}_{\alpha_k}^k\right](k)}\left(t,x\right)-\frac{1}{2}{\displaystyle \sum_{h,\zeta =1}^{\kappa }{\Omega}^{h\zeta}\left(t,x\right)}\,{W}_{x^h{x}^{\zeta}}^{\left[{\theta}_{\alpha_k}^k\right](k)}\left(t,x\right)\\ {}=\underset{u_1^{\theta_{\alpha_k}},{u}_2^{\theta_{\alpha_k}},\cdots, {u}_n^{\theta_{\alpha_k}}}{ \max}\Big\{{\displaystyle \sum_{j=1}^n{g}^{\left[j,{\theta}_{\alpha_k}^k\right]}\left[t,x,{u}_1^{(k){\theta}_{\alpha_k}^k},{u}_2^{(k){\theta}_{\alpha_k}^k},\cdots, {u}_n^{(k){\theta}_{\alpha_k}^k}\right]}{e}^{-r\left(t-{t}_0\right)}\\ {}+{W}_x^{\left[{\theta}_{\alpha_k}^k\right](k)}\left(t,x\right)f\left[t,x,{u}_1^{(k){\theta}_{\alpha_k}^k},{u}_2^{(k){\theta}_{\alpha_k}^k},\cdots, {u}_n^{(k){\theta}_{\alpha_k}^k}\right]\Big\},\;\mathrm{and}\\ {}{W}^{\left[{\theta}_{\alpha_k}^k\right](k)}\left({t}_{k+1},x\right)={\displaystyle \sum_{a=1}^{\eta }{\lambda}_a}{W}^{\left[{\theta}_a^{k+1}\right]\left(k+1\right)}\left({t}_{k+1},x\right),\;\mathrm{for}\;k\in \left\{0,1,2,\cdots, m-1\right\}.\end{array} $$

Proof

Following the argument in the analysis in Sect. 4.1, we obtain \( {\displaystyle \sum_{a=1}^{\eta }{\lambda}_a}{W}^{\left[{\theta}_a^{k+1}\right]\left(k+1\right)}\left({t}_{k+1},{x}_{t_{k+1}}\right) \) as the expected terminal value for the stochastic control problem in the time interval \( \left[{t}_k,{t}_{k+1}\right] \), for \( k\in \left\{0,1,2,\cdots, m-1\right\} \). Then direct application of the stochastic control technique in Theorem A.3 of the Technical Appendices yields Theorem 2.1. ■

Hence under cooperation the players will adopt the cooperative strategies \( \left[{\psi}_1^{(h){\theta}_{a_h}^h}\left(t,x\right),{\psi}_2^{(h){\theta}_{a_h}^h}\left(t,x\right),\cdots, {\psi}_n^{(h){\theta}_{a_h}^h}\left(t,x\right)\right] \) in the time interval \( \left[{t}_h,{t}_{h+1}\right) \) if \( {\theta}_{a_h}^h\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) occurs at time t h , for \( h\in \left\{0,1,2,\cdots, m\right\} \). In a cooperative framework, the issue of non-uniqueness of the optimal controls can be resolved by agreement between the players on a particular set of controls. Substituting this set of cooperative strategies into (1.2) yields the dynamics of the cooperative state trajectory in the time interval \( \left[{t}_k,{t}_{k+1}\right) \), for \( k\in \left\{0,1,2,\cdots, m\right\} \), as

$$ \begin{array}{l}dx(s)=f\left[s,x(s),{\psi}_1^{(k){\theta}_{\alpha_k}^k}\left(s,x(s)\right),{\psi}_2^{(k){\theta}_{\alpha_k}^k}\left(s,x(s)\right),\cdots, {\psi}_n^{(k){\theta}_{\alpha_k}^k}\left(s,x(s)\right)\right]\;ds\\ {}+\sigma \left[s,x(s)\right]\;dz(s),\end{array} $$
(2.2)

\( x\left({t}_k\right)={x}_{t_k} \), for \( s\in \left[{t}_k,{t}_{k+1}\right) \), if \( {\theta}_{a_k}^k\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) occurs at time t k .
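A simulation sketch of the cooperative trajectory (2.2): at each branching instant t k a regime is drawn with probabilities λ, and the cooperative controls switch accordingly. The aggregate control `psi_total` and the dynamics `f`, `sigma` below are hypothetical stand-ins, not the model's actual solution:

```python
import numpy as np

rng = np.random.default_rng(1)
t = [0.0, 1.0, 2.0, 3.0]            # t_0, t_1, t_2, T
lam = [0.6, 0.4]                    # regime probabilities

def psi_total(theta_a, s, x):
    return 0.4 - 0.1 * theta_a      # hypothetical aggregate cooperative control

def f(s, x, u):
    return u - 0.2 * x              # hypothetical drift

def sigma(s, x):
    return 0.05                     # hypothetical diffusion coefficient

ds = 0.001
x, s = 1.0, t[0]
theta = 0                           # the known initial regime theta_0^0
path = [x]
for k in range(len(t) - 1):
    if k > 0:
        theta = 1 + int(rng.choice(len(lam), p=lam))   # draw Theta_k at t_k
    for _ in range(int(round((t[k + 1] - t[k]) / ds))):
        dz = rng.normal(0.0, np.sqrt(ds))
        x += f(s, x, psi_total(theta, s, x)) * ds + sigma(s, x) * dz
        s += ds
        path.append(x)
print(path[-1])
```

Repeating the simulation traces out the set \( {X}_t^{*} \) of states realizable at time t under the cooperative plan.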

For simplicity in exposition, we denote by \( {X}_t^{*} \) the set of states realizable at time t according to (2.2), and use \( {x}_t^{*} \) to denote an element of \( {X}_t^{*} \) that actually occurs.

Finally, similar to Remarks 1.1 and 1.2 we have two results that will be utilized in subsequent analysis:

Remark 2.1

One can readily verify that \( {\overline{W}}^{\left[{\theta}_{\alpha_k}^k\right](k)}\left({t}_k,{x}_{t_k}\right)={W}^{\left[{\theta}_{\alpha_k}^k\right](k)}\left({t}_k,{x}_{t_k}\right){e}^{r\left({t}_k-{t}_0\right)} \) is the maximized value, in current value at time t k , of the stochastic control problem

$$ \begin{array}{l}\underset{u_1,{u}_2,\cdots, {u}_n}{ \max }{E}_{t_k}\left\{{\displaystyle \sum_{j=1}^n{\displaystyle {\int}_{t_k}^{t_{k+1}}{g}^{\left[j,{\theta}_{a_k}^k\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-{t}_k\right)}ds}}\right.\\ {}+{\displaystyle \sum_{j=1}^n{\displaystyle \sum_{h=k+1}^m{\displaystyle \sum_{a_h=1}^{\eta }{\lambda}_{a_h}}{\displaystyle {\int}_{t_h}^{t_{h+1}}{g}^{\left[j,{\theta}_{a_h}^h\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-{t}_k\right)}ds}}}\\ {}\left.+{e}^{-r\left(T-{t}_k\right)}{\displaystyle \sum_{j=1}^n{q}^j\left(x(T)\right)}\right\}\end{array} $$

subject to

$$ \begin{array}{l}dx(s)=f\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]ds+\sigma \left[s,x(s)\right]dz(s),\\ {}x\left({t}_k\right)={x}_{t_k}\in X.\end{array} $$

Remark 2.2

One can readily verify that

$$ {\overline{W}}^{\left[{\theta}_{\alpha_k}^k\right](k)\tau}\left(\tau, {x}_{\tau}\right)={W}^{\left[{\theta}_{\alpha_k}^k\right](k)}\left(\tau, {x}_{\tau}\right){e}^{r\left(\tau -{t}_0\right)},\;\mathrm{f}\mathrm{o}\mathrm{r}\;\tau \in \left[{t}_k,{t}_{k+1}\right), $$

is the maximized value of the stochastic control problem

$$ \begin{array}{l}\underset{u_1,{u}_2,\cdots, {u}_n}{ \max }{E}_{\tau}\left\{{\displaystyle \sum_{j=1}^n{\displaystyle {\int}_{\tau}^{t_{k+1}}{g}^{\left[j,{\theta}_{a_k}^k\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-\tau \right)}ds}}\right.\\ {}+{\displaystyle \sum_{j=1}^n{\displaystyle \sum_{h=k+1}^m{\displaystyle \sum_{a_h=1}^{\eta }{\lambda}_{a_h}}{\displaystyle {\int}_{t_h}^{t_{h+1}}{g}^{\left[j,{\theta}_{a_h}^h\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]\;{e}^{-r\left(s-\tau \right)}ds}}}\\ {}\left.+{e}^{-r\left(T-\tau \right)}{\displaystyle \sum_{j=1}^n{q}^j\left(x(T)\right)}\right\}\end{array} $$

subject to

$$ dx(s)=f\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]ds+\sigma \left[s,x(s)\right]dz(s),x\left(\tau \right)={x}_{\tau}\in X. $$

2.2 Individual Rationality

Assume that at time t 0, when the initial state is x 0, the agreed-upon optimality principle assigns a set of imputation vectors contingent upon the events \( {\theta}_0^0 \) and \( {\theta}_{a_h}^h \), for \( {\theta}_{a_h}^h\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) and \( h\in \left\{1,2,\cdots, m\right\} \). We use

$$ \left[{\xi}^{1\left[{\theta}_0^0\right](0)}\left({t}_0,{x}_0\right),{\xi}^{2\left[{\theta}_0^0\right](0)}\left({t}_0,{x}_0\right),\cdots, {\xi}^{n\left[{\theta}_0^0\right](0)}\left({t}_0,{x}_0\right)\right] $$

to denote an imputation vector of the gains under which the share of the ith player over the time interval [t 0, T] is \( {\xi}^{i\left[{\theta}_0^0\right](0)}\left({t}_0,{x}_0\right) \).

Individual rationality requires that

$$ {\xi}^{i\left[{\theta}_0^0\right](0)}\left({t}_0,{x}_0\right)\ge {V}^{i\left[{\theta}_0^0\right](0)}\left({t}_0,{x}_0\right),\kern1em \mathrm{for}\;i\in N. $$

In a dynamic framework, individual rationality has to be maintained at every instant of time \( t\in \left[{t}_0,T\right] \) along the cooperative trajectory. At time t, for \( t\in \left[{t}_0,{t}_1\right) \), if the players are allowed to reconsider their cooperative plan, they will compare their expected cooperative payoff with their expected noncooperative payoff at that time. Using the same optimality principle, at time t, for \( t\in \left[{t}_0,{t}_1\right) \), an imputation vector will assign the shares of the players over the time interval [t, T] as \( \left[{\xi}^{1\left[{\theta}_0^0\right](0)t}\left(t,{x}_t^{*}\right),{\xi}^{2\left[{\theta}_0^0\right](0)t}\left(t,{x}_t^{*}\right),\cdots, {\xi}^{n\left[{\theta}_0^0\right](0)t}\left(t,{x}_t^{*}\right)\right] \) (in current value at time t). Individual rationality requires that

$$ {\xi}^{i\left[{\theta}_0^0\right](0)t}\left(t,{x}_t^{*}\right)\ge {\overline{V}}^{i\left[{\theta}_0^0\right](0)t}\left(t,{x}_t^{*}\right),\;\mathrm{f}\mathrm{o}\mathrm{r}\;i\in N\;\mathrm{and}\;t\in \left[{t}_0,{t}_1\right). $$

At time t h , for \( h\in \left\{1,2,\cdots, m\right\} \), if \( {\theta}_{a_h}^h\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) has occurred and the state is \( {x}_{t_h}^{*} \), the same optimality principle assigns an imputation vector \( \left[{\xi}^{1\left[{\theta}_{a_h}^h\right](h){t}_h}\left({t}_h,{x}_{t_h}^{*}\right),{\xi}^{2\left[{\theta}_{a_h}^h\right](h){t}_h}\left({t}_h,{x}_{t_h}^{*}\right),\cdots, {\xi}^{n\left[{\theta}_{a_h}^h\right](h){t}_h}\left({t}_h,{x}_{t_h}^{*}\right)\right] \) (in current value at time t h ). Individual rationality is satisfied if:

$$ {\xi}^{i\left[{\theta}_{a_h}^h\right](h){t}_h}\left({t}_h,{x}_{t_h}^{*}\right)\ge {\overline{V}}^{i\left[{\theta}_{a_h}^h\right](h){t}_h}\left({t}_h,{x}_{t_h}^{*}\right),\kern1em \mathrm{for}\;i\in N. $$

Using the same optimality principle, at time t, for \( t\in \left[{t}_h,{t}_{h+1}\right) \), an imputation vector will assign the shares of the players over the time interval [t, T] as \( \left[{\xi}^{1\left[{\theta}_{a_h}^h\right](h)t}\left(t,{x}_t^{*}\right),{\xi}^{2\left[{\theta}_{a_h}^h\right](h)t}\left(t,{x}_t^{*}\right),\cdots, {\xi}^{n\left[{\theta}_{a_h}^h\right](h)t}\left(t,{x}_t^{*}\right)\right] \) (in current value at time t). Individual rationality requires that

\( {\xi}^{i\left[{\theta}_{a_h}^h\right](h)t}\left(t,{x}_t^{*}\right)\ge {\overline{V}}^{i\left[{\theta}_{a_h}^h\right](h)t}\left(t,{x}_t^{*}\right) \), for \( i\in N \), \( t\in \left[{t}_h,{t}_{h+1}\right) \) and \( h\in \left\{1,2,\cdots, m\right\} \).
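The two rationality requirements above reduce to simple numerical checks at any sampled instant. The following sketch uses purely hypothetical numbers for n = 3 players: group rationality means the imputations exhaust the maximized joint payoff W, and individual rationality means every player's imputation weakly dominates his noncooperative payoff:

```python
import numpy as np

xi = np.array([5.2, 3.1, 2.4])      # hypothetical current-value imputations
Vbar = np.array([4.8, 3.0, 2.4])    # hypothetical noncooperative payoffs
W = 10.7                            # hypothetical maximized joint payoff

group_rational = bool(np.isclose(xi.sum(), W))      # sum_i xi^i = W
individually_rational = bool(np.all(xi >= Vbar))    # xi^i >= Vbar^i, all i
print(group_rational, individually_rational)
```

Both conditions must hold for every event history and at every instant along the cooperative trajectory, which is the requirement carried into the next section.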

3 Subgame Consistent Solution and Payoff Distribution

A stringent requirement for solutions of cooperative stochastic differential games to be dynamically stable is the property of subgame consistency. Under subgame consistency, an extension of the solution policy to a situation with a later starting time and any feasible state brought about by prior optimal behavior remains optimal. In particular, as the game proceeds, at each instant of time the players are guided by the same optimality principles, and hence have no grounds for deviating from the previously adopted optimal behavior throughout the game. A dynamically stable solution to the randomly furcating game (1.1 and 1.2) is sought in this section.

3.1 Solution Imputation Vector

According to the solution optimality principle, the players agree to share their cooperative payoff according to the following set of imputation vectors:

$$ \begin{array}{l}\left[{\xi}^{1\left[{\theta}_0^0\right](0)}\left({t}_0,{x}_0\right),{\xi}^{2\left[{\theta}_0^0\right](0)}\left({t}_0,{x}_0\right),\cdots, {\xi}^{n\left[{\theta}_0^0\right](0)}\left({t}_0,{x}_0\right)\right]\;\mathrm{at}\ \mathrm{time}\;{t}_0,\\ {}\left[{\xi}^{1\left[{\theta}_0^0\right](0)t}\left(t,{x}_t^{*}\right),{\xi}^{2\left[{\theta}_0^0\right](0)t}\left(t,{x}_t^{*}\right),\cdots, {\xi}^{n\left[{\theta}_0^0\right](0)t}\left(t,{x}_t^{*}\right)\right]\;\mathrm{f}\mathrm{o}\mathrm{r}\;t\in \left[{t}_0,{t}_1\right),\\ {}\left[{\xi}^{1\left[{\theta}_{a_h}^h\right](h)}\left({t}_h,{x}_{t_h}^{*}\right),{\xi}^{2\left[{\theta}_{a_h}^h\right](h)}\left({t}_h,{x}_{t_h}^{*}\right),\cdots, {\xi}^{n\left[{\theta}_{a_h}^h\right](h)}\left({t}_h,{x}_{t_h}^{*}\right)\right]\;\mathrm{at}\ \mathrm{time}\;{t}_h,\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;{\theta}_{a_h}^h\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\}\;\mathrm{and}\;h\in \left\{1,2,\cdots, m\right\},\\ {}\left[{\xi}^{1\left[{\theta}_{a_h}^h\right](h)t}\left(t,{x}_t^{*}\right),{\xi}^{2\left[{\theta}_{a_h}^h\right](h)t}\left(t,{x}_t^{*}\right),\cdots, {\xi}^{n\left[{\theta}_{a_h}^h\right](h)t}\left(t,{x}_t^{*}\right)\right]\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;t\in \left[{t}_h,{t}_{h+1}\right)\;\mathrm{and}\;{\theta}_{a_h}^h\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\}\;\mathrm{and}\;h\in \left\{1,2,\cdots, m\right\}.\end{array} $$
(3.1)

Since (3.1) is guided by a solution optimality principle, group optimality and individual rationality are satisfied.

The solution imputation \( {\xi}^{i\left[{\theta}_{a_k}^k\right](k)\tau}\left(t,{x}_t^{*}\right) \) may be governed by many specific principles. For instance, the players may agree to maximize the sum of their payoffs and divide the excess of the cooperative payoff over the sum of noncooperative payoffs equally. This imputation scheme has to satisfy:

Scheme 3.1

\( \begin{array}{l}{\xi}^{i\left[{\theta}_{a_k}^k\right](k)}\left({t}_k,{x}_{t_k}^{*}\right)={\overline{V}}^{i\left[{\theta}_{a_k}^k\right](k)}\left({t}_k,{x}_{t_k}^{*}\right)+\frac{1}{n}\left[{\overline{W}}^{\left[{\theta}_{a_k}^k\right](k)}\left({t}_k,{x}_{t_k}^{*}\right)-{\displaystyle \sum_{j=1}^n}{\overline{V}}^{j\left[{\theta}_{a_k}^k\right](k)}\left({t}_k,{x}_{t_k}^{*}\right)\right],\;\mathrm{and}\\ {}{\xi}^{i\left[{\theta}_{a_k}^k\right](k)t}\left(t,{x}_t^{*}\right)={\overline{V}}^{i\left[{\theta}_{a_k}^k\right](k)t}\left(t,{x}_t^{*}\right)+\frac{1}{n}\left[{\overline{W}}^{\left[{\theta}_{a_k}^k\right](k)t}\left(t,{x}_t^{*}\right)-{\displaystyle \sum_{j=1}^n}{\overline{V}}^{j\left[{\theta}_{a_k}^k\right](k)t}\left(t,{x}_t^{*}\right)\right],\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;i\in N\;\mathrm{and}\;t\in \left({t}_k,{t}_{k+1}\right).\end{array} \)
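Scheme 3.1 is an egalitarian surplus-sharing rule: at any instant each player receives her noncooperative value plus an equal share of the cooperative surplus. A small sketch with hypothetical numbers:

```python
# Scheme 3.1 sketch: equal division of the cooperative surplus.
# v_bar: list of noncooperative values V_bar^j; w_joint: joint payoff W_bar.
def scheme_3_1(v_bar, w_joint):
    n = len(v_bar)
    surplus = w_joint - sum(v_bar)          # excess of cooperation over status quo
    return [v + surplus / n for v in v_bar]

shares = scheme_3_1([4.0, 6.0, 5.0], 21.0)  # surplus = 6, each player gains 2
print(shares)       # [6.0, 8.0, 7.0]
print(sum(shares))  # 21.0 (shares exhaust the joint payoff, i.e. group optimality)
```

Each share weakly dominates the player's noncooperative value whenever the surplus is nonnegative, so individual rationality holds automatically.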

As another example, the solution imputation \( {\xi}^{i\left[{\theta}_{a_k}^k\right](k)\tau}\left(t,{x}_t^{*}\right) \) may follow an allocation principle in which the players divide the total joint payoff according to the relative sizes of their noncooperative payoffs. Hence the imputation scheme has to satisfy

Scheme 3.2

$$ \begin{array}{l}{\xi}^{i\left[{\theta}_{a_k}^k\right](k)}\left({t}_k,{x}_{t_k}^{*}\right)=\frac{{\overline{V}}^{i\left[{\theta}_{a_k}^k\right](k)}\left({t}_k,{x}_{t_k}^{*}\right)}{{\displaystyle \sum_{j=1}^n{\overline{V}}^{j\left[{\theta}_{a_k}^k\right](k)}\left({t}_k,{x}_{t_k}^{*}\right)}}{\overline{W}}^{\left[{\theta}_{a_k}^k\right](k)}\left({t}_k,{x}_{t_k}^{*}\right),\;\mathrm{and}\\ {}{\xi}^{i\left[{\theta}_{a_k}^k\right](k)t}\left(t,{x}_t^{*}\right)=\frac{{\overline{V}}^{i\left[{\theta}_{a_k}^k\right](k)t}\left(t,{x}_t^{*}\right)}{{\displaystyle \sum_{j=1}^n{\overline{V}}^{j\left[{\theta}_{a_k}^k\right](k)t}\left(t,{x}_t^{*}\right)}}{\overline{W}}^{\left[{\theta}_{a_k}^k\right](k)t}\left(t,{x}_t^{*}\right),\kern0.35em \mathrm{f}\mathrm{o}\mathrm{r}\kern0.35em i\in N\kern0.35em \mathrm{and}\kern0.35em t\in \left({t}_k,{t}_{k+1}\right).\end{array} $$
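Scheme 3.2 instead splits the joint payoff proportionally to the noncooperative payoffs. A sketch with hypothetical numbers:

```python
# Scheme 3.2 sketch: proportional division of the joint payoff W_bar
# according to the relative sizes of the noncooperative values V_bar^j.
def scheme_3_2(v_bar, w_joint):
    total = sum(v_bar)
    return [w_joint * v / total for v in v_bar]

shares = scheme_3_2([4.0, 6.0], 15.0)
print(shares)  # [6.0, 9.0]
```

Note that this rule satisfies individual rationality only when the joint payoff weakly exceeds the sum of the noncooperative values, which group optimality guarantees.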

Crucial to the analysis is the formulation of a payoff distribution mechanism that would lead to the realization of Condition (3.1). This will be done in the next subsection.

3.2 Subgame-Consistent Payoff Distribution Procedure

First consider the cooperative subgame in the last time interval, that is, [t m , T], in which \( {\theta}_{a_m}^m\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) has occurred at time t m . To maximize the expected joint payoff, the players solve

$$ \begin{array}{l}\underset{u_1,{u}_2,\cdots, {u}_n}{ \max }{E}_{t_m}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle \sum_{j=1}^n{\displaystyle {\int}_{t_m}^T{g}^{\left[j,{\theta}_{a_m}^m\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]}}\;{e}^{-r\left(s-{t}_m\right)}ds\\ {}+{e}^{-r\left(T-{t}_m\right)}{\displaystyle \sum_{j=1}^n}{q}^j\left(x(T)\right)\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}\end{array} $$
(3.2)

subject to

$$ \begin{array}{l}dx(s)=f\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]ds+\sigma \left[s,x(s)\right]dz(s),\\ {}x\left({t}_m\right)={x}_{t_m}^{*}.\end{array} $$
(3.3)

According to (3.1) the players agree to share their cooperative payoff according to the imputation

$$ \left[{\xi}^{1\left[{\theta}_{a_m}^m\right](m){t}_m}\left({t}_m,{x}_{t_m}^{*}\right),{\xi}^{2\left[{\theta}_{a_m}^m\right](m){t}_m}\left({t}_m,{x}_{t_m}^{*}\right),\cdots, {\xi}^{n\left[{\theta}_{a_m}^m\right](m){t}_m}\left({t}_m,{x}_{t_m}^{*}\right)\right]. $$

Following Yeung and Petrosyan (2004), we formulate a payoff distribution over time so that the agreed-upon imputations can be realized. Let the vector

\( \left[{B}_1^{\left({\theta}_{a_m}^m\right)m}(s),{B}_2^{\left({\theta}_{a_m}^m\right)m}(s),\cdots, {B}_n^{\left({\theta}_{a_m}^m\right)m}(s)\right] \) denote the instantaneous payments at time \( s\in \left[{t}_m,T\right] \) for the cooperative subgame (3.2 and 3.3). In other words, player i, for \( i\in N \), obtains an instantaneous payment \( {B}_i^{\left({\theta}_{a_m}^m\right)m}(s) \) at time instant s. A terminal payment of \( {q}^i\left({x}_T^{*}\right) \) is received by player i at time T.

In particular, \( {B}_i^{\left({\theta}_{a_m}^m\right)m}(s) \) and \( {q}^i\left({x}_T^{*}\right) \) constitute a payoff distribution for the subgame in the sense that

$$ \begin{array}{l}{\xi}^{i\left[{\theta}_{a_m}^m\right](m){t}_m}\left({t}_m,{x}_{t_m}^{*}\right)\\ {}={E}_{t_m}\left\{\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\right.{\displaystyle {\int}_{t_m}^{\;T}{B}_i^{\left({\theta}_{a_m}^m\right)m}(s)\;{e}^{-r\left(s-{t}_m\right)}ds}+{e}^{-r\left(T-{t}_m\right)}{q}^i\left({x}_T^{*}\right)\left.\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right)\left|\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}x\left({t}_m\right)={x}_{t_m}^{*}\right.\right\},\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;i\in N.\end{array} $$
(3.4)

As the game proceeds to time t, for \( t\in \left[{t}_m,T\right) \), the same optimality principle assigns the shares of the players over the time interval [t, T] as \( {\xi}^{i\left[{\theta}_{a_m}^m\right](m)t}\left(t,{x}_t^{*}\right) \). For consistency, it is required that

$$ \begin{array}{l}{\xi}^{i\left[{\theta}_{a_m}^m\right](m)t}\left(t,{x}_t^{*}\right)\\ {}={E}_t\left\{\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\right.{\displaystyle {\int}_t^{\;T}{B}_i^{\left({\theta}_{a_m}^m\right)m}(s)\;{e}^{-r\left(s-t\right)}ds}+{e}^{-r\left(T-t\right)}{q}^i\left({x}_T^{*}\right)\left.\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right)\left|\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}x(t)={x}_t^{*}\right.\right\},\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;t\in \left[{t}_m,T\right].\end{array} $$
(3.5)

To fulfill group optimality , it is required that

$$ \begin{array}{l}{\displaystyle \sum_{j=1}^n}{\xi}^{j\left[{\theta}_{a_m}^m\right](m)t}\left(t,{x}_t^{*}\right)={\overline{W}}^{\left[{\theta}_{a_m}^m\right](m)t}\left(t,{x}_t^{*}\right)\;\mathrm{f}\mathrm{o}\mathrm{r}\;t\in \left[{t}_m,T\right],\;\mathrm{and}\\ {}{\displaystyle \sum_{j=1}^n}{B}_j^{\left({\theta}_{a_m}^m\right)m}(t)\\ {}={\displaystyle \sum_{j=1}^n{g}^{\left[j,{\theta}_{a_m}^m\right]}\left[t,{x}_t^{*},{\psi}_1^{(m){\theta}_{a_m}^m}\left(t,{x}_t^{*}\right),{\psi}_2^{(m){\theta}_{a_m}^m}\left(t,{x}_t^{*}\right),\cdots, {\psi}_n^{(m){\theta}_{a_m}^m}\left(t,{x}_t^{*}\right)\right]}.\end{array} $$
(3.6)

If conditions (3.4), (3.5) and (3.6) are satisfied, one can say that the solution imputations are time-consistent in the sense that (3.1) can be realized.

Now we consider

$$ \begin{array}{l}{\xi}^{i\left[{\theta}_{a_m}^m\right](m){t}_m}\left(t,{x}_t^{*}\right)={E}_{t_m}\left\{\left(\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\right.{\displaystyle {\int}_t^{\;T}{B}_i^{\left({\theta}_{a_m}^m\right)m}(s)\;{e}^{-r\left(s-{t}_m\right)}ds}\\ {}+{e}^{-r\left(T-{t}_m\right)}{q}^i\left({x}_T^{*}\right)\left.\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right)\left|\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}x(t)={x}_t^{*}\right.\right\},\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;t\in \left[{t}_m,T\right]\;\mathrm{and}\;i\in N.\end{array} $$
(3.7)

Using (3.4), (3.5) and (3.7), we have

$$ {\xi}^{i\left[{\theta}_{a_m}^m\right](m){t}_m}\left(t,{x}_t^{*}\right)={e}^{-r\left(t-{t}_m\right)}{\xi}^{i\left[{\theta}_{a_m}^m\right](m)t}\left(t,{x}_t^{*}\right),\kern1em \mathrm{f}\mathrm{o}\mathrm{r}\;t\in \left[{t}_m,T\right]. $$
(3.8)

Moreover, we can write

$$ \begin{array}{l}{\xi}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(\tau, {x}_{\tau}^{*}\right)={E}_{\tau}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle {\int}_{\tau}^{\kern0.24em \tau +\Delta t}{B}_i^{\left({\theta}_{a_m}^m\right)m}(s)\;{e}^{-r\left(s-\tau \right)}ds}\\ {}+{e}^{-r\left(\varDelta\;t\right)}{\xi}^{i\left[{\theta}_{a_m}^m\right](m)\tau +\Delta t}\left(\tau +\Delta t,{x}_{\tau}^{*}+\Delta {x}_{\tau}^{*}\right)\left|\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}x\left(\tau \right)={x}_{\tau}^{*}\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}\right.\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;\tau \in \left[{t}_m,T\right]\;\mathrm{and}\;i\in N;\end{array} $$
(3.9)

where

$$ \begin{array}{l}\Delta {x}_{\tau}^{*}=f\left[\tau, {x}_{\tau}^{*},{\psi}_1^{(m){\theta}_{a_m}^m}\left(\tau, {x}_{\tau}^{*}\right),{\psi}_2^{(m){\theta}_{a_m}^m}\left(\tau, {x}_{\tau}^{*}\right),\cdots, {\psi}_n^{(m){\theta}_{a_m}^m}\left(\tau, {x}_{\tau}^{*}\right)\right]\;\Delta t\\ {}+\sigma \left[\tau, {x}_{\tau}^{*}\right]\Delta {z}_{\tau }+o\left(\Delta t\right),\end{array} $$

and

\( \Delta {z}_{\tau }=z\left(\tau +\Delta t\right)-z\left(\tau \right) \), and \( {E}_t\left[o\left(\Delta t\right)\right]/\Delta t\to 0 \) as \( \Delta t\to 0 \).
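The increment \( \Delta {x}_{\tau}^{*} \) above is exactly one Euler–Maruyama step of the cooperative state dynamics. A self-contained simulation sketch, with illustrative drift f and diffusion σ rather than the game's:

```python
import math
import random

# Euler-Maruyama sketch of dx = f(t, x) dt + sigma(t, x) dz:
# each step adds f*dt plus sigma times a Gaussian increment of variance dt.
def simulate_state(x0, f, sigma, t0, t1, steps, seed=0):
    rng = random.Random(seed)
    dt = (t1 - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        dz = rng.gauss(0.0, math.sqrt(dt))  # Delta z ~ N(0, dt)
        x += f(t, x) * dt + sigma(t, x) * dz
        t += dt
    return x

# sanity check: with sigma = 0 and f = -x the scheme tracks x0 * exp(-t)
print(simulate_state(1.0, lambda t, x: -x, lambda t, x: 0.0, 0.0, 1.0, 10000))  # approx exp(-1)
```

The o(Δt) remainder in the text corresponds to the discretization error of this scheme, which vanishes in expectation at rate Δt.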

From (3.9) we obtain

$$ \begin{array}{l}{E}_{\tau}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle {\int}_{\tau}^{\kern0.24em \tau +\Delta t}{B}_i^{\left({\theta}_{a_m}^m\right)m}(s)\;{e}^{-r\left(s-\tau \right)}ds}\left|\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}x\left(\tau \right)={x}_{\tau}^{*}\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}\right.\\ {}={\xi}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(\tau, {x}_{\tau}^{*}\right)-{e}^{-r\left(\Delta\;t\right)}{\xi}^{i\left[{\theta}_{a_m}^m\right](m)\tau +\Delta t}\left(\tau +\Delta t,{x}_{\tau}^{*}+\Delta {x}_{\tau}^{*}\right).\end{array} $$
(3.10)

Invoking (3.8) yields

$$ \begin{array}{l}{E}_{\tau}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle {\int}_{\tau}^{\kern0.24em \tau +\Delta t}{B}_i^{\left({\theta}_{a_m}^m\right)m}(s)\;{e}^{-r\left(s-\tau \right)}ds}\left|\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}x\left(\tau \right)={x}_{\tau}^{*}\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}\right.\\ {}={\xi}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(\tau, {x}_{\tau}^{*}\right)-{\xi}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(\tau +\Delta t,{x}_{\tau}^{*}+\Delta {x}_{\tau}^{*}\right).\end{array} $$
(3.11)

Since the imputations \( {\xi}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(t,{x}_t^{*}\right) \), for \( \tau \in \left[{t}_m,T\right] \) and \( t\in \left[\tau, T\right] \), are functions that are twice continuously differentiable in t and \( {x}_t^{*} \), one can express (3.11), as \( \Delta t\to 0 \), as:

$$ \begin{array}{l}{E}_{\tau}\left\{{B}_i^{\left({\theta}_{a_m}^m\right)m}\left(\tau \right)\Delta t+o\left(\Delta t\right)\right\}={E}_{\tau}\left\{-\left[{\left.{\xi}_t^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(t,{x}_t^{*}\right)\right|}_{t=\tau}\right]\Delta t\right.\\ {}-\left[{\left.{\xi}_{x_t^{*}}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(t,{x}_t^{*}\right)\right|}_{t=\tau}\right]f\left[\tau, {x}_{\tau}^{*},{\psi}_1^{(m){\theta}_{a_m}^m}\left(\tau, {x}_{\tau}^{*}\right),{\psi}_2^{(m){\theta}_{a_m}^m}\left(\tau, {x}_{\tau}^{*}\right),\cdots, {\psi}_n^{(m){\theta}_{a_m}^m}\left(\tau, {x}_{\tau}^{*}\right)\right]\Delta t\\ {}-\frac{1}{2}{\displaystyle \sum_{h,\zeta =1}^n{\Omega}^{h\zeta}\left(\tau, {x}_{\tau}^{*}\right)}\left[{\left.{\xi}_{x_t^h{x}_t^{\zeta}}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(t,{x}_t^{*}\right)\right|}_{t=\tau}\right]\Delta t\\ {}\left.-\left[{\left.{\xi}_{x_t}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(t,{x}_t^{*}\right)\right|}_{t=\tau}\right]\sigma \left[\tau, {x}_{\tau}^{*}\right]\Delta {z}_{\tau }-o\left(\Delta t\right)\right\}.\end{array} $$
(3.12)

Dividing (3.12) throughout by Δt, taking expectations, and letting \( \Delta t\to 0 \) yields

$$ \begin{array}{l}{B}_i^{\left({\theta}_{a_m}^m\right)m}\left(\tau \right)=-\left[{\left.{\xi}_t^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(t,{x}_t^{*}\right)\right|}_{t=\tau}\right]\\ {}-\left[{\left.{\xi}_{x_t^{*}}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(t,{x}_t^{*}\right)\right|}_{t=\tau}\right]f\left[\tau, {x}_{\tau}^{*},{\psi}_1^{(m){\theta}_{a_m}^m}\left(\tau, {x}_{\tau}^{*}\right),{\psi}_2^{(m){\theta}_{a_m}^m}\left(\tau, {x}_{\tau}^{*}\right),\cdots, {\psi}_n^{(m){\theta}_{a_m}^m}\left(\tau, {x}_{\tau}^{*}\right)\right]\\ {}-\frac{1}{2}{\displaystyle \sum_{h,\zeta =1}^n{\Omega}^{h\zeta}\left(\tau, {x}_{\tau}^{*}\right)}\left[{\left.{\xi}_{x_t^h{x}_t^{\zeta}}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(t,{x}_t^{*}\right)\right|}_{t=\tau}\right],\;\mathrm{f}\mathrm{o}\mathrm{r}\;i\in N.\end{array} $$
(3.13)
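Formula (3.13) expresses the instantaneous payment as (minus) the total derivative of the imputation along the cooperative trajectory. A finite-difference sketch for a one-dimensional state, with a hypothetical smooth imputation ξ(τ, t, x) standing in for the chapter's:

```python
# Sketch of (3.13) for a scalar state: B(tau) = -xi_t - xi_x * f
# - 0.5 * Omega * xi_xx, with partials taken by central differences.
# xi, f and omega below are hypothetical stand-ins, not the chapter's objects.
def payment_rate(xi, f, omega, tau, x, h=1e-5):
    xi_t = (xi(tau, tau + h, x) - xi(tau, tau - h, x)) / (2 * h)
    xi_x = (xi(tau, tau, x + h) - xi(tau, tau, x - h)) / (2 * h)
    xi_xx = (xi(tau, tau, x + h) - 2 * xi(tau, tau, x)
             + xi(tau, tau, x - h)) / h ** 2
    return -xi_t - xi_x * f(tau, x) - 0.5 * omega(tau, x) * xi_xx

# worked check: xi = (T - t) * x with T = 1 gives xi_t = -x, xi_x = T - t,
# xi_xx = 0, hence B(tau) = x - (1 - tau) * f(tau, x)
xi = lambda tau, t, x: (1.0 - t) * x
B = payment_rate(xi, f=lambda t, x: 0.1, omega=lambda t, x: 0.2, tau=0.5, x=2.0)
print(round(B, 4))  # 1.95
```

In the chapter the derivatives are available in closed form from the value functions, so no numerical differentiation is needed; the sketch only illustrates what each term of (3.13) contributes.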

One can repeat the analysis from (3.4) to (3.13) for each \( {\xi}^{i\left[{\theta}_{a_m}^m\right](m)\tau}\left(\tau, {x}_{\tau}^{*}\right) \) associated with a \( {\theta}_{a_m}^m\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) and obtain the corresponding \( {B}_i^{\left({\theta}_{a_m}^m\right)m}\left(\tau \right) \) for \( \tau \in \left[{t}_m,T\right] \).

In order to formulate the cooperative subgame in the second-last time interval \( \left[{t}_{m-1},{t}_m\right] \), it is necessary to identify the expected terminal payoff at time t m . Using Theorem 2.1, one can obtain \( {\overline{W}}^{\left[{\theta}_{a_m}^m\right](m)}\left({t}_m,{x}_m^{*}\right) \) if \( {\theta}_{a_m}^m\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) occurs at time t m . The term \( {\displaystyle \sum_{a=1}^{\eta }}{\lambda}_a{\overline{W}}^{\left[{\theta}_a^m\right](m)}\left({t}_m,{x}_m^{*}\right) \) gives the expected joint payoff of the cooperative game over the duration [t m , T] and hence is the expected terminal joint payoff for the cooperative subgame in the time interval \( \left[{t}_{m-1},{t}_m\right] \). In a similar manner, the term \( {\displaystyle \sum_{a=1}^{\eta }}{\lambda}_a{\overline{W}}^{\left[{\theta}_a^{k+1}\right]\left(k+1\right)}\left({t}_{k+1},{x}_{k+1}^{*}\right) \) gives the expected terminal joint payoff for the cooperative subgame in the time interval \( \left[{t}_k,{t}_{k+1}\right] \) for \( k\in \left\{0,1,2,\cdots, m-1\right\} \). In general, the cooperative subgame in the time interval \( \left[{t}_k,{t}_{k+1}\right] \), if \( {\theta}_{a_k}^k\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) occurs at time t k for \( k\in \left\{0,1,2,\cdots, m-1\right\} \), can be expressed as:

$$ \begin{array}{l}\underset{u_1,{u}_2,\cdots, {u}_n}{ \max }{E}_{t_k}\left\{\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\displaystyle \sum_{j=1}^n{\displaystyle {\int}_{t_k}^{t_{k+1}}{g}^{\left[j,{\theta}_{a_k}^k\right]}\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]}}\;{e}^{-r\left(s-{t}_k\right)}ds\\ {}+{e}^{-r\left({t}_{k+1}-{t}_k\right)}{\displaystyle \sum_{a=1}^{\eta }}{\lambda}_a{\overline{W}}^{\left[{\theta}_a^{k+1}\right]\left(k+1\right)}\left({t}_{k+1},x\left({t}_{k+1}\right)\right)\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right\}\end{array} $$
(3.14)

subject to

$$ \begin{array}{l}dx(s)=f\left[s,x(s),{u}_1(s),{u}_2(s),\cdots, {u}_n(s)\right]ds+\sigma \left[s,x(s)\right]dz(s),\\ {}x\left({t}_k\right)={x}_k^{*}.\end{array} $$
(3.15)

One can repeat the analysis from (3.4) to (3.13) for each \( {\xi}^{i\left[{\theta}_{a_k}^k\right](k)\tau}\left(\tau, {x}_{\tau}^{*}\right) \) associated with a \( {\theta}_{a_k}^k\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) for \( k\in \left\{0,1,2,\cdots, m-1\right\} \) and derive the corresponding \( {B}_i^{\left({\theta}_{a_k}^k\right)k}\left(\tau \right) \) for \( \tau \in \left[{t}_k,{t}_{k+1}\right) \).

A theorem characterizing a subgame consistent PDP is provided below.

Theorem 3.1

If the solution imputations \( {\xi}^{i\left[{\theta}_{a_k}^k\right](k)\tau}\left(t,{x}_t^{*}\right) \), for \( i\in N \), \( \tau \in \left[{t}_k,{t}_{k+1}\right] \), \( t\in \left[\tau, {t}_{k+1}\right] \) and \( k\in \left\{0,1,2,\cdots, m\right\} \), satisfy group optimality and individual rationality and are twice continuously differentiable in t and \( {x}_t^{*} \), then a PDP with a terminal payment \( {q}^i\left({x}_T^{*}\right) \) at time T and an instantaneous payment at time \( \tau \in \left[{t}_k,{t}_{k+1}\right] \):

$$ \begin{array}{l}{B}_i^{\left({\theta}_{a_k}^k\right)k}\left(\tau \right)=-\left[{\xi}_t^{i\left[{\theta}_{a_k}^k\right](k)\tau}\left(t,{x}_t^{*}\right)\left|{}_{t=\tau}\right.\right]\\ {}-\left[{\xi}_{x_t^{*}}^{i\left[{\theta}_{a_k}^k\right](k)\tau}\left(t,{x}_t^{*}\right)\left|{}_{t=\tau}\right.\right]\\ {}f\left[\tau, {x}_{\tau}^{*},{\psi}_1^{(k){\theta}_{a_k}^k}\left(\tau, {x}_{\tau}^{*}\right),{\psi}_2^{(k){\theta}_{a_k}^k}\left(\tau, {x}_{\tau}^{*}\right),\cdots, {\psi}_n^{(k){\theta}_{a_k}^k}\left(\tau, {x}_{\tau}^{*}\right)\right]\\ {}-\frac{1}{2}{\displaystyle \sum_{h,\zeta =1}^n{\Omega}^{h\zeta}\left(\tau, {x}_{\tau}^{*}\right)}\left[{\left.{\xi}_{x_t^h{x}_t^{\zeta}}^{i\left[{\theta}_{a_k}^k\right](k)\tau}\left(t,{x}_t^{*}\right)\right|}_{t=\tau}\right],\end{array} $$
(3.16)

for \( i\in N \) and \( k\in \left\{0,1,2,\cdots, m\right\} \),

contingent upon \( {\theta}_{a_k}^k\in \left\{{\theta}_1,{\theta}_2,\kern0.5em \dots, {\theta}_{\eta}\right\} \) having occurred at time t k ,

yields a subgame-consistent cooperative solution to the randomly furcating stochastic differential game (1.1 and 1.2).

Proof

Theorem 3.1 can be proved by following the analysis from (3.4) to (3.15). ■

4 An Illustration in Cooperative Resource Extraction

Consider a resource extraction game, in which two extractors are awarded leases to extract a renewable resource over the time interval [t 0, T]. The resource stock \( x(s)\in X\subset R \) follows the dynamics:

$$ dx(s)=\left[ ax{(s)}^{1/2}-bx(s)-{u}_1(s)-{u}_2(s)\right]ds+\sigma x(s)dz(s),\kern0.3em x\left({t}_0\right)={x}_0\in X, $$
(4.1)

where u 1(s) is the harvest rate of extractor 1 and u 2(s) is the harvest rate of extractor 2. The dynamics are adopted from Jørgensen and Yeung (1996).

The instantaneous payoffs at time \( s\in \left[{t}_0,T\right] \) for player 1 and player 2 are, respectively:

$$ \left[{u}_1{(s)}^{1/2}-\frac{\varepsilon_1^{\left[a\right]}{c}_1}{x{(s)}^{1/2}}{u}_1(s)\right]\;\mathrm{and}\;\left[{u}_2{(s)}^{1/2}-\frac{\varepsilon_2^{\left[a\right]}{c}_2}{x{(s)}^{1/2}}{u}_2(s)\right], $$

if the event θ a happens for \( a\in \left\{1,2,3\right\} \), where ε [a]1 , ε [a]2 , c 1 and c 2 are constants.

At time t 0, it is known that θ 1 has occurred. θ 1 will remain in effect until time \( {t}_1\in \left({t}_0,T\right) \). At time t 1, the corresponding probabilities for the events {θ 1, θ 2, θ 3} to occur are {λ 1, λ 2, λ 3} = {1/4, 1/2, 1/4}. The event realized at time t 1 will remain in effect until the end of the game, that is, time T. At time T, each extractor will receive a termination bonus qx(T)1/2, which depends on the resource remaining at the terminal time. Payoffs are transferable between player 1 and player 2 and over time. There is a constant discount rate r.
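The random furcation at t 1 can be simulated directly; a sketch drawing one of {θ 1, θ 2, θ 3} with probabilities {1/4, 1/2, 1/4}:

```python
import random

# Draw the event index at t1 with probabilities {1/4, 1/2, 1/4};
# the realized event then stays in effect until the terminal time T.
def draw_event(rng):
    u = rng.random()
    if u < 0.25:
        return 1
    if u < 0.75:
        return 2
    return 3

rng = random.Random(42)
draws = [draw_event(rng) for _ in range(100_000)]
print(round(draws.count(2) / len(draws), 2))  # approx 0.5
```

Averaging simulated payoffs over such draws reproduces the expectations taken with respect to {λ 1, λ 2, λ 3} in the value functions below.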

Applying Theorem 1.1, we obtain the following value functions for the associated noncooperative games:

$$ \begin{array}{l}{V}^{i\left[{\theta}_a^1\right](1)}\left(t,x\right)= \exp \left[-r\left(t-{t}_0\right)\right]\;\left[{A}_i^{\theta_a^1(1)}(t){x}^{1/2}+{C}_i^{\theta_a^1(1)}(t)\right],\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;i\in \left\{1,2\right\},a\in \left\{1,2,3\right\}\;\mathrm{and}\;t\in \left[{t}_1,T\right];\end{array} $$
(4.2)
$$ \begin{array}{l}{V}^{i\left[{\theta}_1\right](0)}\left(t,x\right)= \exp \left[-r\left(t-{t}_0\right)\right]\;\left[{A}_i^{\theta_1(0)}(t){x}^{1/2}+{C}_i^{\theta_1(0)}(t)\right],\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;i\in \left\{1,2\right\}\;\mathrm{and}\;t\in \left[{t}_0,{t}_1\right],\end{array} $$
(4.3)

where \( {A}_i^{\theta_a^1(1)}(t) \), \( {C}_i^{\theta_a^1(1)}(t) \), \( {A}_i^{\theta_1(0)}(t) \) and \( {C}_i^{\theta_1(0)}(t) \) satisfy:

$$ \begin{array}{l}{\overset{.}{A}}_i^{\theta_a^1(1)}(t)=\left[r+\frac{\sigma^2}{8}+\frac{b}{2}\right]{A}_i^{\theta_a^1(1)}(t)\\ {}-\frac{1}{2\left[{\varepsilon}_i^{\left[a\right]}{c}_i+{A}_i^{\theta_a^1(1)}(t)/2\right]}+\frac{\varepsilon_i^{\left[a\right]}{c}_i}{4{\left[{\varepsilon}_i^{\left[a\right]}{c}_i+{A}_i^{\theta_a^1(1)}(t)/2\right]}^{\;2}}\\ {}+\frac{A_i^{\theta_a^1(1)}(t)}{8{\left[{\varepsilon}_i^{\left[a\right]}{c}_i+{A}_i^{\theta_a^1(1)}(t)/2\right]}^2}+\frac{A_i^{\theta_a^1(1)}(t)}{8{\left[{\varepsilon}_j^{\left[a\right]}{c}_j+{A}_j^{\theta_a^1(1)}(t)/2\right]}^{\;2}},\\ {}{\overset{.}{C}}_i^{\theta_a^1(1)}(t)=r{C}_i^{\theta_a^1(1)}(t)-\frac{a}{2}{A}_i^{\theta_a^1(1)}(t),\\ {}{A}_i^{\theta_a^1(1)}(T)=q,\;\mathrm{and}\;{C}_i^{\theta_a^1(1)}(T)=0;\kern0.5em \mathrm{f}\mathrm{o}\mathrm{r}\;i\in \left\{1,2\right\}\;\mathrm{and}\;a\in \left\{1,2,3\right\};\end{array} $$
$$ \begin{array}{l}{\overset{.}{A}}_i^{\theta_1(0)}(t)=\left[r+\frac{\sigma^2}{8}+\frac{b}{2}\right]{A}_i^{\theta_1(0)}(t)\\ {}-\frac{1}{2\left[{\varepsilon}_i^{\left[1\right]}{c}_i+{A}_i^{\theta_1(0)}(t)/2\right]}+\frac{\varepsilon_i^{\left[1\right]}{c}_i}{4{\left[{\varepsilon}_i^{\left[1\right]}{c}_i+{A}_i^{\theta_1(0)}(t)/2\right]}^{\;2}}\\ {}+\frac{A_i^{\theta_1(0)}(t)}{8{\left[{\varepsilon}_i^{\left[1\right]}{c}_i+{A}_i^{\theta_1(0)}(t)/2\right]}^2}+\frac{A_i^{\theta_1(0)}(t)}{8{\left[{\varepsilon}_j^{\left[1\right]}{c}_j+{A}_j^{\theta_1(0)}(t)/2\right]}^{\;2}},\\ {}{\overset{.}{C}}_i^{\theta_1(0)}(t)=r{C}_i^{\theta_1(0)}(t)-\frac{a}{2}{A}_i^{\theta_1(0)}(t),\\ {}{A}_i^{\theta_1(0)}\left({t}_1\right)={\displaystyle \sum_{h=1}^3{\lambda}_h}{A}_i^{\theta_h^1(1)}\left({t}_1\right),\;\mathrm{and}\;{C}_i^{\theta_1(0)}\left({t}_1\right)={\displaystyle \sum_{h=1}^3{\lambda}_h}{C}_i^{\theta_h^1(1)}\left({t}_1\right).\end{array} $$

Applying Theorem 2.1, we obtain

$$ \begin{array}{l}{W}^{\left[{\theta}_a^1\right](1)}\left(t,x\right)= \exp \left[-r\left(t-{t}_0\right)\right]\;\left[{\widehat{A}}^{\theta_a^1(1)}(t){x}^{1/2}+{\widehat{B}}^{\theta_a^1(1)}(t)\right],\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;a\in \left\{1,2,3\right\}\;\mathrm{and}\;t\in \left[{t}_1,T\right];\end{array} $$
(4.4)
$$ \begin{array}{l}{W}^{\left[{\theta}_1\right](0)}\left(t,x\right)= \exp \left[-r\left(t-{t}_0\right)\right]\;\left[{\widehat{A}}^{\theta_1(0)}(t){x}^{1/2}+{\widehat{B}}^{\theta_1(0)}(t)\right],\\ {}\mathrm{f}\mathrm{o}\mathrm{r}\;t\in \left[{t}_0,{t}_1\right].\end{array} $$
(4.5)

where \( {\widehat{A}}^{\theta_a^1(1)}(t) \), \( {\widehat{B}}^{\theta_a^1(1)}(t) \), \( {\widehat{A}}^{\theta_1(0)}(t) \) and \( {\widehat{B}}^{\theta_1(0)}(t) \) satisfy:

$$ \begin{array}{l}{\overset{.}{\widehat{A}}}^{\theta_a^1(1)}(t)=\left[r+\frac{\sigma^2}{8}+\frac{b}{2}\right]{\widehat{A}}^{\theta_a^1(1)}(t)-{\displaystyle \sum_{j=1}^2\frac{1}{2\left[{\varepsilon}_j^{\left[a\right]}{c}_j+{\widehat{A}}^{\theta_a^1(1)}(t)/2\right]}}\\ {}+{\displaystyle \sum_{j=1}^2}\frac{\varepsilon_j^{\left[a\right]}{c}_j}{4{\left[{\varepsilon}_j^{\left[a\right]}{c}_j+{\widehat{A}}^{\theta_a^1(1)}(t)/2\right]}^2}+{\displaystyle \sum_{j=1}^2}\frac{{\widehat{A}}^{\theta_a^1(1)}(t)}{8{\left[{\varepsilon}_j^{\left[a\right]}{c}_j+{\widehat{A}}^{\theta_a^1(1)}(t)/2\right]}^2},\\ {}{\overset{.}{\widehat{B}}}^{\theta_a^1(1)}(t)=r{\widehat{B}}^{\theta_a^1(1)}(t)-\frac{a}{2}{\widehat{A}}^{\theta_a^1(1)}(t),\\ {}{\widehat{A}}^{\theta_a^1(1)}(T)=2q,\;\mathrm{and}\;{\widehat{B}}^{\theta_a^1(1)}(T)=0;\\ {}{\overset{.}{\widehat{A}}}^{\theta_1(0)}(t)=\left[r+\frac{\sigma^2}{8}+\frac{b}{2}\right]{\widehat{A}}^{\theta_1(0)}(t)-{\displaystyle \sum_{j=1}^2\frac{1}{2\left[{\varepsilon}_j^{\left[1\right]}{c}_j+{\widehat{A}}^{\theta_1(0)}(t)/2\right]}}\\ {}+{\displaystyle \sum_{j=1}^2}\frac{\varepsilon_j^{\left[1\right]}{c}_j}{4{\left[{\varepsilon}_j^{\left[1\right]}{c}_j+{\widehat{A}}^{\theta_1(0)}(t)/2\right]}^2}+{\displaystyle \sum_{j=1}^2}\frac{{\widehat{A}}^{\theta_1(0)}(t)}{8{\left[{\varepsilon}_j^{\left[1\right]}{c}_j+{\widehat{A}}^{\theta_1(0)}(t)/2\right]}^2},\\ {}{\overset{.}{\widehat{B}}}^{\theta_1(0)}(t)=r{\widehat{B}}^{\theta_1(0)}(t)-\frac{a}{2}{\widehat{A}}^{\theta_1(0)}(t),\\ {}{\widehat{A}}^{\theta_1(0)}\left({t}_1\right)={\displaystyle \sum_{h=1}^3{\lambda}_h}{\widehat{A}}^{\theta_h^1(1)}\left({t}_1\right),\;\mathrm{and}\;{\widehat{B}}^{\theta_1(0)}\left({t}_1\right)={\displaystyle \sum_{h=1}^3{\lambda}_h}{\widehat{B}}^{\theta_h^1(1)}\left({t}_1\right).\end{array} $$

Using (4.4) and (4.5), the optimal cooperative controls can then be obtained as:

$$ {\psi}_i^{(0){\theta}_1}\left(t,x\right)=\frac{x}{4{\left[{\varepsilon}_i^{\left[1\right]}{c}_i+{\widehat{A}}^{\theta_1(0)}(t)/2\right]}^2},\;\mathrm{f}\mathrm{o}\mathrm{r}\;i\in \left\{1,2\right\}\;\mathrm{and}\;t\in \left[{t}_0,{t}_1\right); $$
(4.6)
$$ {\psi}_i^{(1){\theta}_a^1}\left(t,x\right)=\frac{x}{4{\left[{\varepsilon}_i^{\left[a\right]}{c}_i+{\widehat{A}}^{\theta_a^1(1)}(t)/2\right]}^2},\;\mathrm{f}\mathrm{o}\mathrm{r}\;i\in \left\{1,2\right\}\;\mathrm{and}\;t\in \left[{t}_1,T\right], $$
(4.7)

if \( {\theta}_a^1\in \left\{{\theta}_1,{\theta}_2,{\theta}_3\right\} \) occurs at time t 1.
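The cooperative harvest rules (4.6) and (4.7) share one functional form, linear in the resource stock. A direct transcription with illustrative parameter values; in the game itself the coefficient \( {\widehat{A}}(t) \) would come from the ODE systems above:

```python
# Transcription of the cooperative control psi_i(t, x) =
# x / (4 * (eps_i * c_i + A_hat(t) / 2)^2); parameter values are illustrative.
def psi(x, eps_i, c_i, a_hat_t):
    return x / (4.0 * (eps_i * c_i + a_hat_t / 2.0) ** 2)

print(psi(x=9.0, eps_i=1.0, c_i=0.5, a_hat_t=1.0))  # 2.25
```

Note that a larger unit extraction cost \( {\varepsilon}_i^{\left[a\right]}{c}_i \) or a larger shadow value \( {\widehat{A}}(t) \) of the stock lowers the prescribed harvest rate.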

Substituting these control strategies into (2.2) yields the dynamics of the state trajectory under cooperation. The optimal cooperative state trajectory in the time interval \( \left[{t}_0,{t}_1\right) \) can be obtained as:

$$ {x}^{*}(t)=\varpi {\left({t}_0,t,{\theta}_1\right)}^2\left[\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{x}_0^{1/2}+{\displaystyle {\int}_{t_0}^t{\varpi}^{-1}\left({t}_0,s,{\theta}_1\right)}\frac{a}{2}ds{\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right]}^2,\;\mathrm{f}\mathrm{o}\mathrm{r}\;t\in \left[{t}_0,{t}_1\right), $$
(4.8)

where \( \varpi \left({t}_0,t,{\theta}_1\right)= \exp \left[{\displaystyle {\int}_{t_0}^t\left[{H}_0\left({\theta}_1,\upsilon \right)-\frac{\sigma^2}{8}\right]d\upsilon +{\displaystyle {\int}_{t_0}^t\frac{\sigma }{2}dz\left(\upsilon \right)}}\right] \), and

$$ {H}_0\left({\theta}_1,s\right)=-\left[\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\frac{b}{2}+{\displaystyle \sum_{j=1}^2}\frac{1}{8{\left[{\varepsilon}_j^{\left[1\right]}{c}_j+{\widehat{A}}^{\theta_1(0)}(s)/2\right]}^2}+\frac{\sigma^2}{8}\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right]. $$

If \( {\theta}_a^1\in \left\{{\theta}_1,{\theta}_2,{\theta}_3\right\} \) occurs at time t 1, the optimal cooperative state trajectory in the interval [t 1, T] becomes

$$ {x}^{*}(t)=\varpi {\left({t}_1,t,{\theta}_a^1\right)}^2\left[\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.{\left({x}_{t_1}^{*}\right)}^{1/2}+{\displaystyle {\int}_{t_1}^t{\varpi}^{-1}\left({t}_1,s,{\theta}_a^1\right)}\frac{a}{2}ds{\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right]}^2,\;\mathrm{f}\mathrm{o}\mathrm{r}\;t\in \left[{t}_1,T\right], $$
(4.9)

where \( \varpi \left({t}_1,t,{\theta}_a^1\right)= \exp \left[{\displaystyle {\int}_{t_1}^t\left[{H}_1\left({\theta}_a^1,\upsilon \right)-\frac{\sigma^2}{8}\right]d\upsilon +{\displaystyle {\int}_{t_1}^t\frac{\sigma }{2}dz\left(\upsilon \right)}}\right] \), and

\( {H}_1\left({\theta}_a^1,s\right)=-\left[\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right.\frac{b}{2}+{\displaystyle \sum_{j=1}^2}\frac{1}{8{\left[{\varepsilon}_j^{\left[a\right]}{c}_j+{\widehat{A}}^{\theta_a^1(1)}(s)/2\right]}^2}+\frac{\sigma^2}{8}\left.\begin{array}{c}\hfill \hfill \\ {}\hfill \hfill \end{array}\right] \).

Now suppose that the players agree to divide their cooperative gains according to Scheme 3.1 in the time interval \( \left[{t}_0,{t}_1\right) \), according to Scheme 3.1 if θ 1 occurs at time t 1, and according to Scheme 3.2 if θ 2 or θ 3 occurs at time t 1.

Using Schemes 3.1 and 3.2, Theorem 3.1 and the results derived in this section, an instantaneous payment at time \( \tau \in \left[{t}_k,{t}_{k+1}\right] \):

$$ \begin{array}{l}{B}_i^{\left({\theta}_{a_k}^k\right)k}\left(\tau \right)=-\left[{\xi}_t^{i\left[{\theta}_{a_k}^k\right](k)\tau}\left(t,{x}_t^{*}\right)\left|{}_{t=\tau}\right.\right]\\ {}-\left[{\xi}_{x_t^{*}}^{i\left[{\theta}_{a_k}^k\right](k)\tau}\left(t,{x}_t^{*}\right)\left|{}_{t=\tau}\right.\right]f\left[\tau, {x}_{\tau}^{*},{\psi}_1^{(k){\theta}_{a_k}^k}\left(\tau, {x}_{\tau}^{*}\right),{\psi}_2^{(k){\theta}_{a_k}^k}\left(\tau, {x}_{\tau}^{*}\right)\right]\\ {}-\frac{1}{2}{\displaystyle \sum_{h,\zeta =1}^n{\Omega}^{h\zeta}\left(\tau, {x}_{\tau}^{*}\right)}\left[{\left.{\xi}_{x_t^h{x}_t^{\zeta}}^{i\left[{\theta}_{a_k}^k\right](k)\tau}\left(t,{x}_t^{*}\right)\right|}_{t=\tau}\right]\end{array} $$

for \( i\in \left\{1,2\right\} \), \( k\in \left\{0,1\right\} \), \( {\theta}_{a_0}^0={\theta}_1 \) and \( {\theta}_a^1\in \left\{{\theta}_1,{\theta}_2,{\theta}_3\right\} \) can be obtained explicitly using the results derived in (4.2) to (4.8).

5 Chapter Notes

This chapter considers subgame-consistent cooperative solutions in randomly furcating stochastic differential games. This approach widens the application of cooperative stochastic differential game theory to problems where future environments are not known with certainty. If the state dynamics are deterministic, the above analysis yields subgame consistent cooperative solutions for randomly-furcating differential games. Yeung (2008) considered subgame consistent solutions for a pollution management differential game in collaborative abatement under uncertain future payoffs.

Finally, the random events Θk, for \( k\in \left\{1,\kern0.5em 2,\kern0.5em \cdots, \kern0.5em m\right\} \), affecting the payoffs may follow a more complex stochastic process, such as a branching process in which each Θk is a random variable whose range and probabilities depend on the realized history, as described below.

Given that \( {\theta}_{a_1}^1 \) is realized in time interval \( \left[{t}_1,{t}_2\right) \), for \( {a}_1=1,2,\dots, {\eta}_1 \), the process Θ2 in time interval \( \left[{t}_2,{t}_3\right) \) has a range \( {\theta}^2=\left\{{\theta}_1^{2\left[\left(1,{a}_1\right)\right]},{\theta}_2^{2\left[\left(1,{a}_1\right)\right]},\kern0.5em \dots, {\theta}_{\eta_{2\left[\left(1,{a}_1\right)\right]}}^{2\left[\left(1,{a}_1\right)\right]}\right\} \) with the corresponding probabilities \( \left\{{\lambda}_1^{2\left[\left(1,{a}_1\right)\right]},{\lambda}_2^{2\left[\left(1,{a}_1\right)\right]},\kern0.5em \dots, {\lambda}_{\eta_{2\left[\left(1,{a}_1\right)\right]}}^{2\left[\left(1,{a}_1\right)\right]}\right\} \).

Given that \( {\theta}_{a_1}^1 \) is realized in time interval \( \left[{t}_1,{t}_2\right) \) and \( {\theta}_{a_2}^{2\left[\left(1,{a}_1\right)\right]} \) is realized in time interval \( \left[{t}_2,{t}_3\right) \), for \( {a}_1=1,2,\dots, {\eta}_1 \) and \( {a}_2=1,2,\dots, {\eta}_{2\left[\left(1,{a}_1\right)\right]} \), \( {\theta}^3=\left\{{\theta}_1^{3\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\right]},{\theta}_2^{3\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\right]},\kern0.5em \dots, {\theta}_{\eta_{3\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\right]}}^{3\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\right]}\right\} \) would be realized with the corresponding probabilities \( \left\{{\lambda}_1^{3\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\right]},{\lambda}_2^{3\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\right]},\kern0.5em \dots, {\lambda}_{\eta_{3\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\right]}}^{3\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\right]}\right\} \).

In general, given that \( {\theta}_{a_1}^1 \) is realized in time interval \( \left[{t}_1,{t}_2\right) \), \( {\theta}_{a_2}^{2\left[\left(1,{a}_1\right)\right]} \) is realized in time interval \( \left[{t}_2,{t}_3\right) \), …, and \( {\theta}_{a_{k-1}}^{k-1\left[\left(1,{a}_1\right)\left(2,{a}_2\right)\dots \left(k-2,{a}_{k-2}\right)\right]} \) is realized in time interval \( \left[{t}_{k-1},{t}_k\right) \), for \( {a}_1=1,2,\dots, {\eta}_1 \), \( {a}_2=1,2,\dots, {\eta}_{2\left[\left(1,{a}_1\right)\right]} \), …, \( {a}_{k-1}=1,2,\dots, {\eta}_{k-1\left[\left(1,{a}_1\right)\left(2,{a}_2\right)\dots \left(k-2,{a}_{k-2}\right)\right]} \), \( {\theta}^k=\left\{{\theta}_1^{k\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\dots \left(k-1,{a}_{k-1}\right)\right]},{\theta}_2^{k\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\dots \left(k-1,{a}_{k-1}\right)\right]},\dots, {\theta}_{\eta_{k\left[\left(1,{a}_1\right)\left(2,{a}_2\right)\dots \left(k-1,{a}_{k-1}\right)\right]}}^{k\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\dots \left(k-1,{a}_{k-1}\right)\right]}\right\} \)

would be realized with the corresponding probabilities

$$ \left\{{\lambda}_1^{k\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\dots \left(k-1,{a}_{k-1}\right)\right]},{\lambda}_2^{k\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\dots \left(k-1,{a}_{k-1}\right)\right]},\kern0.5em \dots, {\lambda}_{\eta_{k\left[\left(1,{a}_1\right)\left(2,{a}_2\right)\dots \left(k-1,{a}_{k-1}\right)\right]}}^{k\left[\left(1,{a}_1\right)\;\left(2,{a}_2\right)\dots \left(k-1,{a}_{k-1}\right)\right]}\right\} $$

for \( k=1,2,\dots, \tau \).
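The construction above can be pictured as a tree: the range and probabilities of Θk are indexed by the entire realized history \( \left({a}_1,\dots,{a}_{k-1}\right) \). A small Python sketch of sampling one path through such a branching process follows; the concrete `branch` rule is hypothetical (every node simply splits into two equally likely successors), standing in for the history-dependent ranges \( {\theta}^k \) and probabilities \( {\lambda}^k \) of the text.

```python
import random

def branch(history):
    """Return (values, probabilities) of Theta_k given the realized
    branch indices a_1, ..., a_{k-1}. Hypothetical rule: two equally
    likely successors at every node (eta = 2 throughout)."""
    k = len(history) + 1
    values = [f"theta_{a}^{k}" for a in (1, 2)]
    return values, [0.5, 0.5]

def sample_path(m, rng=random):
    """Sample one realized sequence Theta_1, ..., Theta_m along the tree."""
    history, path = [], []
    for _ in range(m):
        values, probs = branch(history)
        a = rng.choices(range(len(values)), weights=probs)[0]
        history.append(a + 1)          # record realized branch index a_k
        path.append(values[a])
    return path

print(sample_path(3))
```

In the general model the number of branches \( {\eta}_{k\left[\cdot\right]} \) and the probabilities would themselves vary with the history argument, which is exactly what the `history` parameter of `branch` is there to carry.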

6 Problems

  1.

    Consider a resource extraction game, in which two extractors are awarded leases to extract a renewable resource over the time interval [0, 4]. The resource stock \( x(s)\in X\subset R \) follows the dynamics:

    $$ dx(s)=\left[10x{(s)}^{1/2}-x(s)-{u}_1(s)-{u}_2(s)\right]ds+0.05x(s)dz(s),x(0)=80, $$

    where u 1(s) is the harvest rate of extractor 1 and u 2(s) is the harvest rate of extractor 2.

    The instantaneous payoffs at time \( s\in \left[0,2\right) \) for player 1 and player 2 are known to be, respectively:

    $$ \left[2{u}_1{(s)}^{1/2}-\frac{2}{x{(s)}^{1/2}}{u}_1(s)\right]\kern0.2em \mathrm{and}\kern0.2em \left[{u}_2{(s)}^{1/2}-\frac{2}{x{(s)}^{1/2}}{u}_2(s)\right]. $$

    The instantaneous payoffs at time \( s\in \left[2,4\right] \) for player 1 and player 2 are known to be, respectively:

    $$ \begin{array}{l}\left[2{u}_1{(s)}^{1/2}-\frac{2}{x{(s)}^{1/2}}{u}_1(s)\right]\kern0.2em \mathrm{and}\kern0.2em \left[3{u}_2{(s)}^{1/2}-\frac{2}{x{(s)}^{1/2}}{u}_2(s)\right]\;\mathrm{with}\kern0.5em \mathrm{probability}\kern0.5em 0.3,\hfill \\ {}\left[2{u}_1{(s)}^{1/2}-\frac{1}{x{(s)}^{1/2}}{u}_1(s)\right]\kern0.2em \mathrm{and}\kern0.2em \left[2{u}_2{(s)}^{1/2}-\frac{1}{x{(s)}^{1/2}}{u}_2(s)\right]\kern0.5em \mathrm{with}\kern0.5em \mathrm{probability}\kern0.5em 0.4,\hfill \\ {}\kern0.2em \mathrm{and}\kern0.5em \left[3{u}_1{(s)}^{1/2}-\frac{0.5}{x{(s)}^{1/2}}{u}_1(s)\right]\kern0.62em \mathrm{and}\;\left[4{u}_2{(s)}^{1/2}-\frac{2}{x{(s)}^{1/2}}{u}_2(s)\right]\kern0.24em \mathrm{with}\kern0.5em \mathrm{probability}\kern0.5em 0.3.\hfill \end{array} $$

    At terminal time 4, extractor 1 will receive a termination bonus \( 2x{(4)}^{1/2} \) and extractor 2 will receive a termination bonus \( x{(4)}^{1/2} \). The discount rate is 0.05.

    Characterize a feedback Nash equilibrium.

  2.

    Obtain a group optimal solution which maximizes the joint expected payoff of the extractors.

  3.

    Derive a subgame consistent solution in which the players share the excess gain from cooperation equally.
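Before attacking Problems 1–3 analytically, it can help to see how the stock in Problem 1 behaves under given strategies. The following is a minimal Euler–Maruyama simulation of the dynamics \( dx=\left[10x{(s)}^{1/2}-x(s)-{u}_1(s)-{u}_2(s)\right]ds+0.05x(s)dz(s) \), \( x(0)=80 \), over \( \left[0,4\right] \); the constant harvest rates used here are placeholders for illustration, not the equilibrium or cooperative strategies the problems ask for.

```python
import math
import random

def simulate_stock(u1=5.0, u2=5.0, T=4.0, n_steps=400, seed=0):
    """Euler-Maruyama path of dx = [10 x^{1/2} - x - u1 - u2] ds
    + 0.05 x dz, x(0) = 80, with constant (placeholder) harvest rates."""
    rng = random.Random(seed)
    dt = T / n_steps
    x = 80.0
    for _ in range(n_steps):
        drift = 10.0 * math.sqrt(max(x, 0.0)) - x - u1 - u2
        dz = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment ~ N(0, dt)
        x = max(x + drift * dt + 0.05 * x * dz, 0.0)  # keep stock nonnegative
    return x

print(simulate_stock())
```

With total harvest \( {u}_1+{u}_2=10 \), the deterministic part of the dynamics has a stable steady state near \( x\approx 78.7 \) (the larger root of \( 10{x}^{1/2}-x-10=0 \)), so simulated terminal stocks stay in that neighborhood for the small diffusion coefficient 0.05.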