Looking Forward Approach in Cooperative Differential Games with Uncertain Stochastic Dynamics

Petrosian, Ovanes; Barabanov, Andrey

doi:10.1007/s10957-016-1009-8

Looking Forward Approach in Cooperative Differential Games with Uncertain Stochastic Dynamics

Published: 15 September 2016

Volume 172, pages 328–347, (2017)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Journal of Optimization Theory and Applications Aims and scope Submit manuscript

Looking Forward Approach in Cooperative Differential Games with Uncertain Stochastic Dynamics

Download PDF

460 Accesses
20 Citations
Explore all metrics

Abstract

In this study, a novel approach for defining and computing a solution for a differential game is presented for a case, wherein players do not have complete information about the game structure for the full time interval. At any instant in time, players have certain information about the motion equations and payoff functions for a current subinterval, and a forecast about the game structure for the rest of the time interval. The forecast is described by stochastic differential equations. The information about the game structure updates at fixed instants of time and is completely unknown in advance. A new solution is defined as a recursive combination of sets of imputations in the combined truncated subgames that are analyzed by the Looking Forward Approach. An example with a resource extraction game is presented to demonstrate a comparison of payoff functions without a forecast and that with stochastic and deterministic forecasts.

About the Looking Forward Approach in Cooperative Differential Games with Transferable Utility

Strong Time-Consistent Subset of the Core in Cooperative Differential Games with Finite Time Horizon

Article 12 October 2018

Nontransferable Utility Cooperative Dynamic Games

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

The theory of cooperative differential games suggests socially optimal and group efficient solutions for various decision problems involving strategic actions. Formulation of the optimal behavior for players or economic agents is a fundamental element in this theory. The main problems of the theory include the design of the cooperative strategy and the corresponding payoff, the way to distribute the payoff between the players, and the time consistency of the corresponding solution. The problem of dynamic instability of the Nash bargaining solutions in differential games was discussed by Haurie [1]. Petrosyan formalized the notion of time consistency of solutions of differential games mathematically [2]. In this study, we focused on a specific case of cooperative differential games with a changing structure. In this case, players do not have complete information about future game structure, but they may use prediction. Obviously, this predicted information is valid only for a certain time period and has to be updated. In order to formulate the optimal behavior for players in such a cooperative differential game, it is necessary to develop a special approach, which we call the Looking Forward Approach. The prediction can be based on a stochastic dynamic model. Optimal trajectories are obtained by a technique from the theory of stochastic differential games.

This approach is similar in spirit to the model predictive control theory developed within the framework of numerical optimal control. We refer to [3–6] for an overview of the recent results in this field. Model predictive control is a form of control in which the current control action is obtained by solving, at each sampling instant, a finite horizon open-loop optimal control problem using the current state of the plant as the initial state. An important advantage of this type of control is its ability to cope with hard constraints on controls and states. It has, therefore, been widely applied in petro-chemical and related industries, where important operating points are located close to a set of admissible states and controls.

The idea of the Looking Forward Approach is new in game theory, particularly in cooperative differential games. The approach raises the following questions: how to define a cooperative trajectory, how to define a cooperative solution and to allocate the cooperative payoff, and what properties the obtained solution will have. This paper is dedicated to answering the aforementioned questions. Further, we propose a method to construct a cooperative trajectory and a cooperative solution for such types of games. It is interesting that the solution is defined using the imputation distribution procedure [7] with respect to the available information about the game structure. It is proved that this newly built solution is not only time-consistent (which is a very rare event in cooperative differential games), but also strongly time-consistent. Several variations of the Looking Forward Approach are presented in [8–10].

An interpretation of the model considered can be presented as follows. Assume that the game is defined on a closed time interval, which is divided into subintervals. At the beginning of each subinterval, the information for players is updated. The truncated information about the structure of the game contains certain information about the structure of the game, including motion equations and payoff functions for the next subinterval and a forecast about the structure of the game for the rest of the time interval for which the game is defined. The Hamilton–Jacobi–Bellman equation [11] defines optimal cooperative strategies for players at the beginning of each subinterval.

A characteristic function of a coalition is a very important notion in the theory of differential games. This function is defined in accordance with [12] as the maximal sum of payoffs of players in the coalition with respect to the Nash equilibrium strategies for the left-out players. This approach requires computation of the Nash equilibrium, which is described in detail in [13]. A set of imputations or a solution of the game is determined by the characteristic function at the beginning of each subinterval. For any set of imputations, the imputation distribution procedure (IDP) first introduced by L. Petrosyan in [7] is studied. See recent publications on this topic in [12, 14, 15]. In order to define a solution for the whole game, it is necessary to combine partial solutions and their IDP on subintervals. The properties of time consistency and strong time consistency introduced by Petrosyan [2, 16] are also studied for the combined solution.

The motion equation with switching points at the beginning of subintervals is also studied in the theory of discrete event dynamic systems [17]. In this theory, the switching instant and parameter jumps are determined by the trajectory as a deterministic or a stochastic function of the current state vector. On the one hand, the switching point is not known in advance. On the other hand, the initial data determine the system trajectory, including all switching events. In the paper, we study the systems without information for a new set of motion parameters.

To illustrate the Looking Forward Approach, an example of a cooperative resource extraction game is presented. The example was first considered by Jorgensen and Yeung [18]; the problem of time consistency in this game was investigated by Yeung [19]. In the paper, both the analytical and the numerical solution for specific parameters were demonstrated. The classical approach without a forecast is compared with the Looking Forward Approach, based on the deterministic forecast and the stochastic forecast; accordingly, a proportional solution was considered for computing the optimal imputation.

The remainder of this paper is organized as follows: The basic game models are described in Sect. 2. The motion equation contains switching parameters. In Sect. 2, the stochastic forecast is defined for the time interval in which the information is updated. A sequence of the auxiliary combined truncated subgames is defined in Sect. 3. Solutions to these subgames are presented in Sect. 4. This includes cooperative behavior of players for the whole game and allocation of the cooperative payoff between the players at each stage of the game. In Sect. 5, a new concept of the game solution is presented for the case related to updating information. The allocation of cooperative payoff between the players is defined for the whole game. The time consistency and strong time consistency properties of the solution are formulated and proved. In Sect. 6, the Looking Forward Approach is applied to the game of Cooperative Resource Extraction.

2 The Basic Game

The n-person differential game ${\varGamma }(x_{0}, 0, T)$ for the time interval [0, T] starting from the initial state $x_{0} \in I\!R^{n}$ is defined using motion equations as follows:

$$\begin{aligned} \dot{x}:=g(t, x, u), \qquad x(0) := x_0, \end{aligned}$$

(1)

where $x \in I\!R^{m}$, $u = (u_{1} \ldots u_{n})$. Denote the set of players by $N=\{1,2,\ldots ,n\}$. A player i chooses a control $u_{i} \in U_{i} \subset CompI\!R^{k}$ for $i=0, \ldots , n$.

The payoff function of the player i is defined as

$$\begin{aligned} K_{i}(x_0, 0, T; u) := \int \limits _{0}^{T}h_{i}(x(\tau ), u(\tau ))\hbox {d}\tau + q_{i}(x(T)), \end{aligned}$$

(2)

where $x(\tau )$ is the trajectory (the solution) of system (1) for the time interval [0, T] with the control input u and $q_i(x(T))$ is the terminal payoff.

Suppose the function g(x, u) has the following form (Fig. 1):

$$\begin{aligned} g(t, x, u):= {\left\{ \begin{array}{ll} g_{0}(t, x, u), \quad t \in [0, \Delta t], \\ \ldots \\ g_{j}(t, x, u), \quad t \in ] j \Delta t, (j+1) \Delta t], \\ \ldots \\ g_{l-1}(t, x, u), \quad t \in ] (l-1) \Delta t, l \Delta t]. \end{array}\right. } \end{aligned}$$

We emphasize the part of game dynamics $g_{j}(t, x, u)$ for the time interval $[ j \Delta t, (j+1) \Delta t]$, $j=0, \ldots , l-1$ in order to formulate the behavior of players for that time interval.

Suppose information for players is updated at fixed time instants $t=j\Delta t$, $j=0, \ldots , l-1$. During the time interval $[ j \Delta t, (j+1) \Delta t]$, players have certain information about the dynamics of the game described by the function $g_{j}(t, x, u)$. However, they do not know certain game dynamics about the time interval $[(j+1) \Delta t, T]$. Any forecast on this interval is based on present information about the process, which is the game dynamics function $g_{j}(t, x, u)$ in the current model. The only option for players is to assume that the structure of the game will not change and to use the function $g_{j}(t, x, u)$ for forecasting the game dynamics for the time interval $[(j+1) \Delta t, T]$.

In order to apply the Looking Forward Approach, players must have an opportunity to use combined truncated information about the structure of the game. The combined truncated information at any time instant $t \in [j \Delta t, (j+1) \Delta t]$ includes the motion equation on this interval

$$\begin{aligned} \dot{x}:=g_{j}(t, x, u), \end{aligned}$$

(3)

and a forecast of the motion for the future time interval $[(j+1) \Delta t, T]$. We define the forecast using a stochastic equation with the same deterministic part:

$$\begin{aligned} \hbox {d}x:=g_{j}(t, x, u)\hbox {d}t + \sigma (t,x)\hbox {d}z(t), \end{aligned}$$

(4)

where $\sigma (t,x)$ is $m \times \theta $ matrix and z(t) is a $\theta $-dimensional Wiener process. When information about the game dynamics updates at the time instants $j\Delta t$ players recalculate their decisions using the new updated information.

Problems of this type often occur in real-life situations since the structure of the conflicting process on large time intervals is not certain and it is necessary to use a forecast.

3 A Combined Truncated Subgame

During the first time interval $[0, \Delta t]$, players have certain information about the structure of the game for $[0, \Delta t]$ and a forecast for the time interval $[\Delta t, T]$. At the instant $\Delta t$, the information about the game is being updated, and for the second time interval $[\Delta t, 2\Delta t]$, players have certain information about the structure of the game during the time interval $[\Delta t, 2 \Delta t]$ and a forecast for the time interval $[2 \Delta t, T]$. To include this fact in the model, the following definition is introduced (Fig. 2):

Definition 3.1

Let $0\le j\le l-1$. A combined truncated subgame $\bar{{\varGamma }}_j(x, j \Delta , T)$ is defined for the time interval $[j \Delta t, T]$ in the following way. The motion equation and the payoff function for the time interval $[j \Delta t, (j+1) \Delta t]$ coincide with that of the game ${\varGamma }(x_{0}, 0, T)$ for the same time interval. However, for the time interval $[(j+1) \Delta t, T]$, the subgame $\bar{{\varGamma }}_j(x, j \Delta , T)$ is a stochastic differential game. The motion equation and the initial condition of the combined truncated subgame $\bar{{\varGamma }}_j(x, j \Delta t, T)$ have the following form:

$$\begin{aligned} \hbox {d}x:= & {} {\left\{ \begin{array}{ll} g_{j}(t, x, u)\hbox {d}t, \phantom {+ \sigma (t,x)\hbox {d}z(t),} \quad t \in [j \Delta t, (j+1) \Delta t], \\ g_{j}(t, x, u)\hbox {d}t+ \sigma (t,x)\hbox {d}z(t), \quad t \in ] (j+1) \Delta t, T], \end{array}\right. } \nonumber \\ x(j\Delta t):= & {} x. \end{aligned}$$

(5)

The payoff function of the player i is equal to

$$\begin{aligned}&K^{j}_{i}(x, j \Delta t, T; u) := \nonumber \\&\quad = \int \limits _{j \Delta t}^{(j+1) \Delta t}h_{i}(x(\tau ), u(\tau ))\hbox {d}\tau + E \left\{ \int \limits _{(j+1) \Delta t}^{T}h_{i}(x(\tau ), u(\tau ))\hbox {d}\tau + q_{i}(x(T)) \right\} .\nonumber \\ \end{aligned}$$

(6)

It is easy to see that a combined truncated subgame $\bar{{\varGamma }}_j(x, j \Delta t, T)$ for $j = 0, \ldots , l-1$ can be defined as a stochastic differential game with following motion equations and initial conditions:

$$\begin{aligned} \hbox {d}x=g_{j}(t, x, u)\hbox {d}t + I(j, t) \cdot \sigma (t,x)\hbox {d}z(t), \qquad x(j\Delta t) = x, \end{aligned}$$

(7)

where

$$\begin{aligned} I(j, t):= {\left\{ \begin{array}{ll} 0, \quad t \in [j \Delta t, (j+1) \Delta t], \\ 1, \quad t \in ] (j+1) \Delta t, T], \end{array}\right. } \end{aligned}$$

(8)

and the payoff function of the player i is equal to

$$\begin{aligned} K^{j}_{i}(x, j \Delta t, T; u) = E \left\{ \int \limits _{j \Delta t}^{T}h_{i}(x(\tau ), u(\tau ))\hbox {d}\tau + q_{i}(x(T)) \right\} . \end{aligned}$$

(9)

4 Solution to a Combined Truncated Cooperative Subgame

Consider a combined truncated cooperative subgame $\bar{{\varGamma }}^{c}_j(x, j \Delta t, T)$ for the time interval $[j \Delta t, T]$ with the initial condition $x(j\Delta t)=x$. In this game, the expected profit of all players to be maximized is

$$\begin{aligned} \sum \limits _{i \in N}^{} K^{j}_{i}(x, j\Delta , T; u^{j}) = E \left\{ \sum \limits _{i \in N}^{} \int \limits _{j \Delta t}^{T}h_{i}(x(\tau ), u(\tau ))\hbox {d}\tau + q_{i}(x(T)) \right\} , \end{aligned}$$

(10)

subject to

$$\begin{aligned} \hbox {d}x= & {} g_{j}(t, x, u)\hbox {d}t + I(j,t) \cdot \sigma (t,x)\hbox {d}z(t), \nonumber \\ x(j\Delta t)= & {} x. \end{aligned}$$

(11)

This is a stochastic control problem. Sufficient conditions for the solution and the optimal feedback are presented based on the following assertion, which is a special case of theorem from [19].

Theorem 4.1

Assume there exists a twice continuously differential function $W^{(j \Delta t)}(t, x): [j \Delta t, T] \times {I\!R}^{m} \Rightarrow {I\!R}$ satisfying the partial differential equation

$$\begin{aligned}&-W_{t}^{(j \Delta t)}(t, x) - \frac{1}{2}\sum \limits _{h,\zeta =1}^{m} I(j, t) \sigma ^{h,\cdot }(t,x(t))\sigma ^{\zeta ,\cdot }(t,x(t))^{T} W_{x^{h},x^{\zeta }}^{(j \Delta t)}(t, x) = \nonumber \\&\quad = \max \limits _{u}^{} \left\{ \sum \limits _{i = 1}^{n} h_{i}(t, x, u) + W_{x}^{(j \Delta t)}(t, x) g_{j}(t, x, u) \right\} \end{aligned}$$

(12)

with the boundary condition

$$\begin{aligned} W^{(j \Delta t)}(T, x)=\sum \limits _{i=1}^{n}q_{i}(x). \end{aligned}$$

Assume that the maximum is achieved at $u=\psi ^{* j}(t,x)$. Then, the feedback $u(t,x)=\psi ^{* j}(t,x)$ is optimal in the game $\bar{{\varGamma }}^{c}_j(x, j \Delta t, T)$ if the closed loop stochastic system has a solution.

In accordance with the Looking Forward Approach, the combined truncated information about the structure of the game is only available to players in the game ${\varGamma }(x, 0, T)$. This information is not sufficient to design an optimal cooperative control. Instead of optimal cooperative trajectory in the game ${\varGamma }(x, 0, T)$, a recursive conditionally optimal cooperative trajectory $\{ \hat{x}^{*}(t) \}^{T}_{t = 0}$ is constructed. For the time interval $[0 ,\Delta t]$, the trajectory $x^{*}_{0}(t)$ is optimal in the combined truncated cooperative subgame $\bar{{\varGamma }}^{c}_0(x_{0}, 0, T)$. At the time instant $\Delta t$, information about the structure of the game updates, and the position of players is $x^{*}_{0}(\Delta t)$. For the time interval $[\Delta t, 2\Delta t]$, the function $x^{*}_{1}(t)$ is a part of the optimal cooperative trajectory in the combined truncated cooperative subgame $\bar{{\varGamma }}^c_1(x^{*}_{0}(\Delta t), \Delta t, T)$, which starts at the instant $\Delta t$, in the position $x^{*}_{0}(\Delta t)$. At the instant $j \Delta t$, information about the structure of the game updates and the system position is $x^{*}_{j-1}(j \Delta t)$. The function $\hat{x}^*(t)$ for the time interval $[j \Delta t, (j+1)\Delta t]$ is defined as a part of the optimal cooperative trajectory $x^{*}_{j}(t)$ in the combined truncated cooperative subgame $\bar{{\varGamma }}^c_j(x^{*}_{j-1}(j \Delta t), j \Delta t, T)$, which starts at the instant $j \Delta t$ in the position $x^{*}_{j-1}(j \Delta t)$. Therefore, the conditionally cooperative trajectory $\{ \hat{x}^{*}(t) \}^{T}_{t = 0}$ is defined as a composition of the cooperative trajectories $x^{*}_{j}(t)$ of the combined truncated cooperative subgames $\bar{{\varGamma }}^c_j(x^{*}_{j-1}(j \Delta t), j \Delta t, T)$ defined for the successive time intervals $[j \Delta t, (j+1) \Delta t]$ (Fig. 3).

$$\begin{aligned} \{ \hat{x}^{*}(t) \}^{T}_{t = 0} := {\left\{ \begin{array}{ll} x^{*}_{0}(t) \quad t \in [0, \Delta t], \\ \cdots \\ x^{*}_{j}(t) \quad t \in ]j\Delta t, (j+1)\Delta t], \\ \cdots \\ x^{*}_{l-1}(t) \quad t \in ](l-1)\Delta t, l\Delta t]. \end{array}\right. } \end{aligned}$$

(13)

We denote the boundary vector by $x^*_{j,0}:=x^*_{j-1}(j\Delta t)=x^*_{j}(j\Delta t)$. Then, the combined truncated cooperative subgame is $\bar{{\varGamma }}^c_j(x^{*}_{j,0}, j \Delta t, T)$.

For each coalition, $S \subset N$ and $j=0,\ldots ,l-1$, define the values of the characteristic function as it was done in [12]:

$$\begin{aligned} V_j(S, x^{*}_{j,0}, j \Delta t, T):= {\left\{ \begin{array}{ll} \sum \limits _{i=1}^{n} K^{j}_{i}(x^{*}_{j,0}, j \Delta t, T; u^{*}_j) \quad S = N \\ \tilde{V}_{j}(S) \quad S \subset N \\ 0 \quad S = \emptyset , \end{array}\right. } \end{aligned}$$

(14)

where $\tilde{V}_{j}(S)$ is a value of the noncooperative combined truncated subgame $\bar{{\varGamma }}_j(x^{*}_{j,0}, j \Delta t, T)$ is defined as the maximized sum of payoffs of players in coalition S with respect to the Nash equilibrium strategies for the left-out players. An imputation $\xi ^{j}(x^*_{j,0}, j \Delta t, T)$ for each combined truncated cooperative subgame $\bar{{\varGamma }}^{c}_j(x^*_{j,0}, j \Delta t, T)$ is defined as an arbitrary vector function that satisfies the conditions

$$\begin{aligned} \xi ^{j}_{i}(x^*_{j,0}, j \Delta t, T)\ge & {} V_j(\{i\}, x^*_{j,0}, j \Delta t, T), \quad i\in N, \nonumber \\ \sum \limits _{i\in N} \xi ^{j}_{i}(x^*_{j,0}, j \Delta t, T)= & {} V_j(N, x^*_{j,0}, j \Delta t, T). \end{aligned}$$

(15)

Denote the set of all possible imputations for the combined truncated subgame by $E_{j}(x^*_{j,0}, j \Delta t, T)$. Suppose that a nonempty and optimal solution $W_{j}(x^{*}_{j,0}, j \Delta t, T) \subset E_{j}(x^{*}_{j,0}, j \Delta t, T)$ is chosen for each combined truncated subgame $\bar{{\varGamma }}^{c}_j(x^{*}_{j,0}, j \Delta t, T)$. It can be Core, NM solution, nucleus, or the Shapley value. It is easy to suggest that the distribution of the total payoff of players in the game ${\varGamma }(x_{0}, 0, T)$ along the conditionally cooperative trajectory $\{ \hat{x}^{*}(t) \}^{T}_{t = 0}$ is organized as a composition of imputations for each time interval $[j\Delta t, (j+1)\Delta t]$, $j=0, \ldots , l-1$, in accordance with the structure of the game ${\varGamma }(x_{0}, 0, T)$. This is formalized in this section as a new solution concept. The family of sets $W_{j}(x^{*}_{j,0}, j \Delta t, T)$ do not compose directly a solution for the game ${\varGamma }(x_{0}, 0, T)$. For any $j=0, \ldots , l-1$, the optimal solution for the truncated subgame $\bar{{\varGamma }}^{c}_j(x^{*}_{j,0}, j \Delta t, T)$ is defined for the time interval $[j\Delta t, T]$. This particular solution makes a sense for the interval $[j\Delta t, (j+1)\Delta t]$ only because the information about the game structure updates after every $\Delta t$ time interval, and it is irrelevant to use a solution that is based upon outdated information. The necessary information can be extracted by using the imputation distribution procedure (IDP) [7] for each truncated subgame. The IDP also provides the time consistency property of a solution in the new concept and an opportunity to define a cooperative solution at any instant of time in $[j\Delta t, T]$.

5 Concept of the Combined Solution

In order to construct a solution concept for the game ${\varGamma }(x_{0}, 0, T)$, an IDP needs to be introduced for all combined truncated cooperative subgames $\bar{{\varGamma }}^{c}_j(x^{*}_{j,0}, j \Delta t, T)$, $j=0, \ldots ,l-1$.

Let $0\le j\le l-1$. Define a family of subgames of the game $\bar{{\varGamma }}^{c}_j(x^{*}_{j,0}, j \Delta t, T)$ along its cooperative trajectory $x^{*}_{j}(t)$ as $\bar{{\varGamma }}^{c}_j(x^{*}_{j}(t), t, T)$, where $t \in ]j \Delta t, T]$ is the starting time of the subgame.

The characteristic function along $x^{*}_{j}(t)$ in the family of subgames of $\bar{{\varGamma }}^{c}_j(x^{*}_{j}(t), t, T)$ for the time interval [t, T] is defined as in (14). Denote by $E_{j}(x^{*}_{j}, t, T)$ the set of imputations in $\bar{{\varGamma }}^{c}_j(x^{*}_{j}(t), t, T)$ along $x^{*}_{j}$.

Suppose that for any subgame of previously defined, combined truncated cooperative subgame $\bar{{\varGamma }}^{c}_j(x^{*}_{j}(t), t, T)$, an optimal solution $W_{j}(x^{*}_{j}, t, T) \ne \emptyset $ along the cooperative trajectory $x^{*}_{j}$ is chosen.

Suppose that for any combined truncated subgame $\bar{{\varGamma }}^{c}_j(x^{*}_{j,0}, j \Delta t, T)$ in the starting positions $x^{*}_{j,0}$ players agree to choose the imputations

$$\begin{aligned} \xi ^{j}(x^{*}_{j,0}, j \Delta t, T) \in W_{j}(x^{*}_{j,0}, j \Delta t, T) \end{aligned}$$

(16)

and the corresponding IDP

$$\begin{aligned} B_{j}(t, x^{*}_{j}):=[B^{j}_{1}(t, x^{*}_{j}) \ldots B^{j}_{n}(t, x^{*}_{j})], \end{aligned}$$

(17)

where $t \in [j \Delta t, T]$, which guarantees the time consistency of this imputation [2]:

$$\begin{aligned} \xi ^{j}(x^{*}_{j,0}, j \Delta t, T) := E \left\{ \int \limits _{j \Delta t}^{T}B_{j}(t, x^{*}_{j})\hbox {d}t + q_{i}(x^{*}(T)) \right\} . \end{aligned}$$

(18)

The IDP $B_{j}(t, x^{*}_{j})$ can be obtained from $\xi ^{j}_{t}(x^{*}_{j}, t, T)$ by direct differentiation [19]:

Theorem 5.1

If the imputations $\xi ^{j}(x^{*}_{j}, t, T)$ are twice continuously differentiable in t and $x^{*}_{j}$, then

$$\begin{aligned} B_{j}(\tau , x^{*}_{j}):= & {} - \left[ \xi ^{j}_{t}(x^{*}_{j}, t, T) |_{t=\tau } \right] - \nonumber \\&- \left[ \xi ^{j}_{x^{*}_{j}}(x^{*}_{j}, t, T) |_{t=\tau }\right] g_{j}\left[ \tau , x^{*}_{j}, \psi _{1}^{* j}(\tau ,x) \ldots \psi _{n}^{* j}(\tau ,x)\right] - \nonumber \\&- \frac{1}{2} \sum \limits _{h,\zeta =1}^{m}{\varOmega }^{h,\zeta }(j, \tau , x^{*}_{j}) \left[ \xi _{x^{h},x^{\zeta }}^{(j \Delta t)}(x^{*}_{j}, t, T) |_{t=\tau }\right] , \end{aligned}$$

(19)

where ${\varOmega }(j, t, x)=I(j, t)\sigma (t,x)\sigma (t,x)^{T}$ is the covariance matrix and I(j, t) is the indicator function.

A new concept of a solution in the combined differential game ${\varGamma }(x_{0}, 0, T)$ involves a recursive merging of a family of time-consistent solutions $W_{j}(x^{*}_{j,0}, j \Delta t, T)$ of the combined truncated subgames $\bar{{\varGamma }}_{c}(x^{*}_{j,0}(j \Delta t), j \Delta t, T)$ for $j=0, \ldots , l-1$, obtained by the Looking Forward Approach. For any imputation $\xi _{j}(x^{*}_{j,0}, j \Delta t, T) \in W_{j}(x^{*}_{j,0}, j \Delta t, T)$, there exists an IDP $B_{j}(t, x^{*}_{j})$. Define the combined truncated distribution

$$\begin{aligned} \hat{B}(t,\hat{x}^*) := B_{j}(t, x^{*}_{j}), \qquad t \in [j\Delta t, (j+1)\Delta t], \quad 0\le j\le l-1. \end{aligned}$$

(20)

The function $\hat{B}(t, \hat{x}^*)$ determines the combined vector

$$\begin{aligned} \hat{\xi }^j(x_{j,0}, j\Delta t, T):= & {} \int \limits _{j\Delta t}^{T} \hat{B}(\tau , \hat{x}^*(\tau )) \hbox {d}\tau + q(\hat{x}^{*}(T)) = \end{aligned}$$

(21)

$$\begin{aligned}= & {} \sum \limits _{m=j}^{l-1} \left[ \int \limits _{m\Delta t}^{(m+1)\Delta t} B_{m}(\tau , x^{*}_{m}(\tau )) \hbox {d}\tau \right] + q(x^{*}_{l}(T)) \end{aligned}$$

(22)

for $j=0, \ldots , l-1$. Denote by $\hat{W}_j(x^*_{j,0}, j\Delta t, T)$ the set of all vectors $\hat{\xi }^j(x_{j,0}, j\Delta t, T)$ composed by (20), (21).

A solution of the game ${\varGamma }(x_{0}, 0, T)$, in accordance with the new concept, is defined by $\hat{W}:=(\hat{W}_j(x^*_{j,0}, j\Delta t, T))_{j=0}^{l-1}$.

The construction of $\hat{B}(t,\hat{x}^*)$ does not require the function $B_{j}(t, x^{*}_{j})$ to be defined for the full time interval where the chosen imputation $\xi ^{j}(x^{*}_{j,0}, j \Delta t, T)$ is defined. We need it for the time interval $[j \Delta t, (j+1) \Delta t]$ only. The last term in (19) equals to zero on this interval:

$$\begin{aligned} - \frac{1}{2} \sum \limits _{h,\zeta =1}^{m}{\varOmega }^{h,\zeta }(j, t, x^{*}_{j}) \left[ \xi _{x^{h},x^{\zeta }}^{(j \Delta t)}(t, x^{*}_{j}) \right] =0, \quad t \in ]j \Delta t, (j+1) \Delta t], \end{aligned}$$

because ${\varOmega }(j, t, x)=I(j, t)\sigma (t,x)\sigma (t,x)^{T}$ and the indicator function $I(j, t)=0$ for $t \in ]j \Delta t, (j+1) \Delta t]$. Therefore, the formula for calculation of $\hat{B}(t, \hat{x}^*)$ is valid if the imputation $\xi ^{j}(x^{*}_{j}, t, T)$ is only one time continuously differentiable in t and $x^{*}_{j}$.

It is easy to see that the solution $\hat{W}$ is time-consistent. However, there is another surprising property of $\hat{W}$ (Fig. 4).

Definition 5.1

A solution $W=:(W_j(x^*_{j,0}, j\Delta t, T))_{j=0}^{l-1}$ is called strong $\Delta t$-time-consistent iff for any $j=0, \ldots , l-1$ and for any $\xi \in W_0(x_0, 0,T)$ the IDP $B(t, x^*)$ generated by $\xi $ satisfies the inclusion

$$\begin{aligned} \int \limits _{0}^{j \Delta t}B(\tau , x^*(\tau ))\,\hbox {d}\tau \oplus W_j(x^{*}_{j,0},j \Delta t,T) \subset W_0(x_0,0,T), \end{aligned}$$

(23)

where $a \oplus A := \{ a + a': a' \in A\}$.

Theorem 5.2

The solution $\hat{W}$ is strong $\Delta t$-time-consistent in the game ${\varGamma }(x_{0}, 0, T)$.

Proof

Let $0\le j\le l-1$. Further, an imputation $\hat{\xi }^0(x_{0}, 0, T)\in \hat{W}_0(x_0,0,T)$ generates an IDP $\hat{B}(t, \hat{x}^*)$. Then, for $0\le k < j$ there exist imputations $\xi ^k(x_{k,0}^*, k\Delta t, T)\in W_k(x_{k,0}^*, k\Delta t, T)$ with IDP’s $B_k(t, x_k^*)$ such that

$$\begin{aligned} \hat{B}(t, \hat{x}^*) = B_k(t, x^*_k), \qquad t\in [k\Delta t, (k+1)\Delta t[, \qquad 0\le k \le j-1. \end{aligned}$$

Hence,

$$\begin{aligned} \int \limits _{0}^{j \Delta t}\hat{B}(\tau , \hat{x}^*(\tau ))\,\hbox {d}\tau = \sum _{k=0}^{j-1} \int _{k\Delta t}^{(k+1)\Delta t} B_k(t, x^*_k(t))\,\hbox {d}t. \end{aligned}$$

Assume $\xi '' \in W_j(x^{*}_{j,0}, j \Delta t, T)$. Then, for $j\le k \le l-1$ there exists $\xi ^k(x_{k,0}^*, k\Delta t, T)\in W_k(x_{k,0}^*, k\Delta t, T)$ with IDP’s $B_k(t, x_k^*)$ such that $\hat{B}(t, \hat{x}^*) = B_k(t, x^*_k)$ for $t\in [k\Delta t, (k+1)\Delta t[$ and

$$\begin{aligned} \xi '' = \sum \limits _{m=j}^{l-1} \left[ \int \limits _{m\Delta t}^{(m+1)\Delta t} B_{m}(\tau , x^{*}_{m}(\tau )) \hbox {d}\tau \right] + q(x^{*}_{l}(T)). \end{aligned}$$

Therefore,

$$\begin{aligned} \int \limits _{0}^{j \Delta t}\hat{B}(\tau , \hat{x}^*(\tau ))\,\hbox {d}\tau + \xi ''= & {} \sum \limits _{m=0}^{l-1} \left[ \int \limits _{m\Delta t}^{(m+1)\Delta t} B_{m}(\tau , x^{*}_{m}(\tau )) \hbox {d}\tau \right] + q(x^{*}_{l}(T)) \in \\\in & {} \hat{W}_0(x_0, 0, T), \end{aligned}$$

this completes the proof. $\square $

6 Looking Forward Approach in Cooperative Extraction

The following example of the resource extraction game was considered by Jorgensen and Yeung [18]. The problem of time consistency in this example was studied by Yeung [19]. We apply the Looking Forward Approach to the example with the new factor in the solution—the indicator function I(j, t). The resulting subgames form the combined solution.

Consider the differential resource extracting game with two asymmetric extractors. The motion equation for the resource stock $x(t) \in X \subset I\!R$ has the following form:

$$\begin{aligned} \dot{x}:= & {} a\sqrt{x(t)}-bx(t)-u_{1}-u_{2}, \nonumber \\ x(0):= & {} x_{0}, \end{aligned}$$

(24)

where $u_{i}$ is the harvest rate of extractor $i=1,2$. Suppose that the dynamics of the game change and the parameters of the motion equations switch in successive time intervals: $a=a_j$ and $b=b_j$ if $t\in [j\Delta t, (j+1) \Delta t[$, for $j=0, \ldots , l-1$, where $T=l\, \Delta t$.

The payoff function of the extractor i is defined as

$$\begin{aligned} K_{i}(x, 0, T; u_{1}, u_{2}) := \int \limits _{0}^{T}h_{i}(x(\tau ), u(\tau ))\hbox {d}\tau +q_i\sqrt{x(T)}, \qquad i=1, 2, \end{aligned}$$

(25)

with

$$\begin{aligned} h_{i}(x(\tau ), u(\tau )) := \sqrt{u_{i}(\tau )}-\frac{c_{i}}{\sqrt{x(\tau )}}u_{i}(\tau ), \qquad i=1, 2, \end{aligned}$$

(26)

where $c_1, c_2$ are constant and $c_{1} \ne c_{2}$.

The basic game ${\varGamma }(x_0,0,T)$ is defined for the time interval [0, T]. Suppose that for any $t \in [j \Delta t, (j+1) \Delta t]$, $j=0 , \ldots , l-1$ players have combined truncated information about the structure of a game. It includes certain information about motion equations for the time interval $[j \Delta t, (j+1) \Delta t]$ and a forecast about the motion for the time interval $[(j+1) \Delta t, T]$. The combined truncated information is formalized in the combined truncated subgame $\bar{{\varGamma }}_j(x, j \Delta t, T)$. The motion equations and the initial conditions for this subgame have the following form:

$$\begin{aligned} \hbox {d}x:= & {} \left[ a_{j}\sqrt{x(t)}-b_{j}x(t)-u_{1}-u_{2}\right] \hbox {d}t + I(j, t) \cdot \sigma x(t) \hbox {d}z(t), \nonumber \\ x(j\Delta t):= & {} x, \end{aligned}$$

(27)

where

$$\begin{aligned} I(j, t):= {\left\{ \begin{array}{ll} 0, \quad t \in [j \Delta t, (j+1) \Delta t], \\ 1, \quad t \in ] (j+1) \Delta t, T]. \end{array}\right. } \end{aligned}$$

(28)

The payoff function of the extractor i in the stochastic game $\bar{{\varGamma }}_j(x, j \Delta t, T)$ is equal to

$$\begin{aligned} K^{j}_{i}(x, j \Delta t, T; u_{1}, u_{2}) := E \left\{ \int \limits _{j \Delta t}^{T}h_{i}(x(\tau ), u(\tau ))\hbox {d}\tau + q_i\sqrt{x(T)} \right\} . \end{aligned}$$

(29)

The resource extraction game was studied in detail in [18, 19]. The combined truncated subgame $\bar{{\varGamma }}_j(x, j \Delta t, T)$ has a Nash equilibrium point defined by the feedback

$$\begin{aligned} u_i^j(t,x) := \frac{x}{4[c_i + A_i^j(t)/2]^2}, \qquad i=1,2, \end{aligned}$$

where the functions $A_i^j(t)$ are defined by the equations

$$\begin{aligned} \dot{A}_i^j(t):= & {} A_i^j(t) \left[ \frac{b_j}{2} + \frac{1}{8} \sigma ^2 + \frac{1}{8(c_j + A_{3-i}^j(t)/2)^2}\right] - \frac{1}{4(c_i + A_i^j(t)/2)}, \\ \dot{C}_i^j(t):= & {} - \frac{a_j}{2} A_i^j(t) \end{aligned}$$

for $i=1,2$ with the boundary conditions $A_i^j(T):=q_i$ and $C_i^j(T):=0$.

The value function of the extractor $i=1,2$ in the Nash equilibrium point is equal [18] to

$$\begin{aligned} V^j_i(t, x) = A^{j}_{i}(t)\sqrt{x}+C^{j}_{i}(t), \qquad i=1, 2. \end{aligned}$$

(30)

Now consider a case wherein resource extractors agree to act cooperatively in the combined truncated subgame $\bar{{\varGamma }}^c_j(x, j \Delta t, T)$. They follow the optimality principle, under which they maximize their joint expected payoffs and share the excess of the total expected cooperative payoff.

The maximized expected joint payoff in the game $\bar{{\varGamma }}^c_j(x, j \Delta t, T)$ is derived in a similar way [18, 19]:

$$\begin{aligned} W^j(t, x) = A^{j}(t)\sqrt{x}+C^{j}(t), \end{aligned}$$

(31)

where the functions $A^{j}(t)$, $C^{j}(t)$ satisfy the equations

$$\begin{aligned} \dot{A}^{j}(t):= & {} \left[ \frac{1}{8} I(j, t) \cdot \sigma ^{2}+\frac{b_{j}}{2}\right] A^{j}(t) - \frac{1}{4\left[ c_{1}+\frac{A^{j}(t)}{2}\right] } - \frac{1}{4\left[ c_{2}+\frac{A^{j}(t)}{2}\right] }, \nonumber \\ \dot{C}^{j}(t):= & {} -\frac{a_{j}}{2}A^{j}(t), \quad A^{j}(T):=\sum \limits _{i=1}^{2} q_{i}, \quad C^{j}(T):=0. \end{aligned}$$

(32)

The optimal cooperative trajectory $x^*_j(t)$ of the stochastic game $\bar{{\varGamma }}^c_j(x, j \Delta t, T)$ can be represented explicitly [19] for the full interval $[j\Delta t, T]$. The formula for the beginning part of $x^*_j(t)$ for the interval $[j\Delta t, (j+1)\Delta t]$ is simpler because the motion equation on this interval is not stochastic. The trajectory of the initial data $x=x^*_{j,0}$ is

$$\begin{aligned} x_{j}^{*}(t)= \varpi _{j}^{2}(j \Delta t, t) \left[ \sqrt{x^{*}_{j,0}} + \frac{1}{2} a_j \int \limits _{j \Delta t}^{t} \varpi _{j}(j \Delta t, \tau )^{-1} \hbox {d}\tau \right] ^{2}, \quad t \in ]j \Delta t, (j+1) \Delta t], \end{aligned}$$

(33)

where

$$\begin{aligned} \varpi _{j}(j \Delta t, t)= \exp \int \limits _{j \Delta t}^{t} -\left[ \frac{1}{2}b_{j} + \frac{1}{8 \left[ c_{1}+\frac{A^{j}(\tau )}{2}\right] ^{2}}+\frac{A^{j}(\tau )}{8 \left[ c_{2}+\frac{A^{j}(\tau )}{2}\right] ^{2}}\right] \hbox {d}\tau . \end{aligned}$$

(34)

The initial data are defined recursively by the optimal trajectory of the previous game: $x^*_{0,0}:=x_0$ and $x^*_{j,0}:=x^*_{j-1}(j\Delta t)$ for $1\le j\le l-1$. The conditionally cooperative trajectory $\hat{x}^*(t)$ is defined, in accordance with the Looking Forward Approach, as follows

$$\begin{aligned} \hat{x}^*_j(t) := x^*_j(t), \qquad t\in [j \Delta t, (j+1) \Delta t], \qquad 0\le j\le l-1. \end{aligned}$$

Consider a numerical example with four time intervals, $T:=4$, $\Delta t:=1$ and the following parameters of the motion equations:

$$\begin{aligned} a_{0}:= & {} 10, \quad a_{1}:=9, \quad a_{2}:=12, \quad a_{3}:=8, \nonumber \\ b_{0}:= & {} 0.5, \quad b_{1}:=0.8, \quad b_{2}:=0.5, \quad b_{3}:=1.6. \end{aligned}$$

(35)

Assume that $c_{1}:=0.05$, $c_{2}:=0.1$, $q_1:=1.5$, $q_2:=1$ in the payoff function and the initial data $x_0:=1$.

The conditionally cooperative trajectory $\hat{x}^*(t)$ is composed from solutions of the combined truncated subgames $\bar{{\varGamma }}^c_j(x^*_{j,0}, j \Delta t, T)$ with the motion equation (27). If $\sigma =0$ in this equation, then the forecast is deterministic and the current information expands for the full remaining interval $[j\Delta t, T]$. If $\sigma \ne 0$, then the forecast is stochastic. Next, the simulation compares the deterministic and stochastic forecasts and shows a solution for both with and without a forecast. The stochastic forecast is calculated for $\sigma :=2$.

A trajectory without forecast is calculated on each interval $[j\Delta t, (j+1)\Delta t]$ as optimal in the game on this interval with the payoff function

$$\begin{aligned} \kappa ^{j}(x,u) := \sum _{i=1}^2 \left[ \int _{j\Delta t}^{(j+1)\Delta t} h_i(x(\tau ), u(\tau ))\,\hbox {d}\tau + q_i \sqrt{x((j+1)\Delta t)}\right] . \end{aligned}$$

The initial data on this interval are the value of the trajectory for the previous interval in the boundary point $j\Delta t$. The conditionally cooperative trajectories $\hat{x}^*(t)$ for the deterministic forecast, with and without the stochastic forecast, are shown in Fig. 5. The resource stock $\hat{x}^*(t)$ grows faster without a forecast. The deterministic forecast slows this growth down, and the stochastic forecast slows it down further. However, this does not imply that the payoff becomes less. The instantaneous payoff functions

$$\begin{aligned} h_{1,2}(\hat{x}^*(t), \hat{u}^*(t)) := \sum _{i=1}^2 h_i(\hat{x}^*(t), \hat{u}^*(t)) \end{aligned}$$

were calculated along conditionally cooperative trajectories $\hat{x}^*(t)$ for systems with deterministic forecast, with stochastic forecast, and without a forecast. They are shown in Fig. 6. At the beginning part of the game time interval, the order of the payoff functions is opposite to the order of the resource stock: The greatest value corresponds to the stochastic forecast, and the lowest value is obtained without a forecast.

The residual payoff functions

$$\begin{aligned} K^c(t,T) := \int \limits _{t}^{T}h_{1,2}(\hat{x}^*(\tau ), \hat{u}^*(\tau ))\hbox {d}\tau +(q_1+q_2)\sqrt{\hat{x}^*(T)}, \qquad t\in [0, T], \end{aligned}$$

(36)

along the trajectories $\hat{x}^*(t)$ are shown in Fig. 7 by solid lines. The total payoff is $K^c(0,T)$. This value is equal to 34.04 for the stochastic forecast, to 34.0 for the deterministic forecast, and to 33.54 without a forecast. The dashed lines in Fig. 7 show the expected values of the residual payoff function in the current truncated subgame $\bar{{\varGamma }}^c_j(x_{j,0}, j \Delta t, T)$. These subgames change at the switching points $t=j=1, 2, 3$, which imply jumps of the expected residual payoff values.

Analysis of Figs. 6, 7 and of similar numerical examples of the resource extraction game has led to the following conclusions. The total cooperative payoff function $K^c(0,T)$ is nearly the same for the three algorithms considered. The main difference consists of a distribution of the payoff in time that is shown in Fig. 6, and in the growth of the resource stock shown in Fig. 5. The deterministic forecast is ”cautious” compared to the algorithm without a forecast because it prefers to increase the payoff and to give up the resource stock value at the beginning of the game. The stochastic forecast is even more ”cautious” for the same reason.

The continuous time imputation function $\xi _i(t, T)$ must satisfy group and individual rationality principle and must be time-consistent [18, 19]. Sufficient conditions are obtained by the Nash equilibrium point, which yields the proportional imputation for the truncated cooperative subgame $\bar{{\varGamma }}^c_j(x_{j,0}, j\Delta t, T)$, $0\le j\le l-1$,

$$\begin{aligned} \xi ^{(j)i}(x, t, T):= & {} \frac{V^j_i (t, x)}{\sum _{k = 1}^{2}V^j_k(t, x)} W^j(t, x) = \\= & {} \frac{\left[ A^{j}_{i}(t)\sqrt{x}+C^{j}_{i}(t)\right] }{\sum \limits _{k = 1}^{2} \left[ A^{j}_{k}(t)\sqrt{x}+C^{j}_{k}(t)\right] }\left[ A^{j}(t)\sqrt{x}+C^{j}(t)\right] . \end{aligned}$$

Denote the corresponding IDP by

$$\begin{aligned} B_{j}(t, x^{*}_{j}):=\left[ B^{j}_{1}(t, x^{*}_{j}), B^{j}_{2}(t, x^{*}_{j})\right] . \end{aligned}$$

(37)

The general formula for IDP $B^{j}_i(t, x^{*}_{j})$ in the stochastic resource extraction game was derived by Yeung and Petrosyan [19]:

$$\begin{aligned}&B_{i}^{j}(\tau , x^{*}_{j})= - \left[ \xi ^{(j)i}_{t}(x^{*}_{j}, t, T) |_{t=\tau }\right] - \nonumber \\&\quad - \left[ \xi ^{(j)i}_{x^{*}_{j}}(x^{*}_{j}, t, T) |_{t=\tau }\right] \left[ a_{j}\sqrt{x^{*}_{j}}-b_{j}x^{*}_{j}-\frac{x^{*}_{j}}{4\left[ c_{1} + A^{j}(t)\right] ^{2}}-\frac{x^{*}_{j}}{4\left[ c_{2} + A^{j}(t)\right] ^{2}}\right] -\nonumber \\&\quad - I(j, t) \cdot \frac{1}{2} \sigma ^{2}(x^{*}_{j})^2 \left[ \xi _{x^{*}_{j},x^{*}_{j}}^{(j)i}(t, x^{*}_{j}) |_{t=\tau }\right] , \quad i=1,2. \end{aligned}$$

(38)

The combined truncated distribution $\hat{B}_i(t, \hat{x}^*)$ is composed from $B_{i}^{j}(\tau , x^{*}_{j})$ by (20) in accordance with the Looking Forward Approach, $i=1, 2$. The last term in (38) is equal to zero for the time interval $[j \Delta t, (j+1) \Delta t]$ because $I(j, t)=0$. The partial derivatives $\xi ^{(j)i}_t$ and $\xi ^{(j)i}_x$ are explicitly represented by the functions $\hat{x}^*$, $A^j$, $C^j$, $A^j_1$, $C^j_1$, $A^j_2$, $C^j_2$, which are obtained by integration of the differential equations. The resulting IDPs are shown in Fig. 8 for the solutions with deterministic and stochastic forecasts. This is a distribution of the instantaneous payoff function $h_{12}$ from Fig. 6 between players. The sum of the solid line and the dashed line in Fig. 8 is equal to the line in Fig. 6 of the same color. Each player may receive greater or less than a half of the full payoff at a time, and this share depends on both the instant of time and the forecast algorithm.

7 Conclusions

A novel approach to defining a solution of a differential game with a switching structure is presented. The game is defined on a time interval divided into subintervals. The players are not provided with complete information about the structure of the game for the full time interval. Instead, they know parameters of the motion equations and the payoff function for the current subinterval and a forecast of the game structure for the rest of the time interval. A cooperative optimal extraction is calculated for any truncated subgame generated by local information. A combined trajectory is composed recursively by the local trajectories. A new solution concept is described. It is proved that the new solution is not only time-consistent, but also strong $\Delta t$-time-consistent, which is a rare property of cooperative differential games.

The approach was illustrated using an example of the resource extraction game. Three competitive strategies were implemented: with stochastic forecast for the rest of the time interval, with deterministic forecast for the same interval, and without a forecast. The total values of the payoff appeared to be nearly the same, but they differ in the rate of payoff. A strategy that recommends taking more payoff at the beginning of the game can be considered as cautious. In this sense, the stochastic forecast strategy is more cautious than the deterministic forecast strategy, and the strategy without a forecast is the least cautious. It is also shown that the instantaneous imputation distribution depends on a forecast.

References

Haurie, A.: A note on nonzero-sum differential games with bargaining solutions. J. Optim. Theory Appl. 18, 31–39 (1976)
Article MathSciNet MATH Google Scholar
Petrosyan, L.A.: Time-consistency of solutions in multi-player differential games. Vestn. Leningr. State Univ. 4, 46–52 (1977)
Google Scholar
Goodwin, G.C., Seron, M.M., Dona, J.A.: Constrained Control and Estimation: An Optimisation Approach. Springer, New York (2005)
Book MATH Google Scholar
Kwon, W.H., Han, S.H.: Receding Horizon Control: Model Predictive Control for State Models. Springer, New York (2005)
Google Scholar
Wang, L.: Model Predictive Control System Design and Implementation Using MATLAB. Springer, New York (2005)
Google Scholar
Rawlings, J.B., Mayne, D.Q.: Model Predictive Control: Theory and Design. Nob Hill Publishing, Madison, WI (2009)
Google Scholar
Petrosyan, L.A., Danilov, N.N.: Stability of solutions in non-zero sum differential games with transferable payoffs. Vestn. Leningr. Univ. 1, 52–59 (1979)
MATH Google Scholar
Petrosian, O.L.: Looking forward approach in cooperative differential games. Int Game Theory Rev (2016). doi:10.1142/S0219198916400077
Article MathSciNet MATH Google Scholar
Petrosian, O.L.: Looking Forward approach in cooperative differential games with infinite-horizon. Vestn. Leningr. Univ. (2016) (to be published)
Gromova, E.V., Petrosian, O.L.: Control of informational horizon for cooperative differential game of pollution control. IEEE (2016). doi:10.1109/STAB.2016.7541187
Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)
MATH Google Scholar
Petrosyan, L.A., Zaccour, G.: Time-consistent Shapley value allocation of pollution cost reduction. J. Econ. Dyn. Control. 3, 381–398 (2003)
Article MathSciNet MATH Google Scholar
Basar, T., Olsder, G.J.: Dynamic Noncooperative Game Theory. Academic Press, London (1995)
MATH Google Scholar
Petrosyan, L.A., Yeung, D.W.K.: Dynamically stable solutions in randomly-furcating differential games. Trans. Steklov Inst. Math. 1, 208–220 (2006)
Article MathSciNet MATH Google Scholar
Jorgensen, S., Martin-Herran, G., Zaccour, G.: Agreeability and time consistency in linear-state differential games. J. Optim. Theory Appl. 1, 49–63 (2003)
Article MathSciNet MATH Google Scholar
Petrosjan, L.A.: Strongly time-consistent differential optimality principles. Vestn. St. Petersb. Univ. Math. 4, 40–46 (1993)
MathSciNet Google Scholar
Cassandras, C.J., Lafortune, S.: Introduction to Discrete Event Systems. Springer, New York (2008)
Book MATH Google Scholar
Jorgensen, S., Yeung, D.W.K.: Inter- and intragenerational renewable resource extraction. Ann. Oper. Res. 88, 275–289 (1999)
Article MathSciNet MATH Google Scholar
Yeung, D.W.K., Petrosyan, L.A.: Subgame-consistent Economic Optimization. Springer, New York (2012)
Book MATH Google Scholar

Download references

Acknowledgments

The first author acknowledges Saint-Petersburg State University for the research Grant No. 9.38.205.2014. The work of the second author was supported by Saint-Petersburg State University, Project 6.37.349.2015 and 6.38.230.2015.

Author information

Authors and Affiliations

Saint-Petersburg State University, Saint-Petersburg, Russia
Ovanes Petrosian & Andrey Barabanov

Authors

Ovanes Petrosian
View author publications
You can also search for this author in PubMed Google Scholar
Andrey Barabanov
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Ovanes Petrosian.

Additional information

Communicated by Dean A. Carlson.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Petrosian, O., Barabanov, A. Looking Forward Approach in Cooperative Differential Games with Uncertain Stochastic Dynamics. J Optim Theory Appl 172, 328–347 (2017). https://doi.org/10.1007/s10957-016-1009-8

Download citation

Received: 24 February 2016
Accepted: 03 September 2016
Published: 15 September 2016
Issue Date: January 2017
DOI: https://doi.org/10.1007/s10957-016-1009-8

Keywords

Mathematics Subject Classification

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Looking Forward Approach in Cooperative Differential Games with Uncertain Stochastic Dynamics

Abstract

Similar content being viewed by others

About the Looking Forward Approach in Cooperative Differential Games with Transferable Utility

Strong Time-Consistent Subset of the Core in Cooperative Differential Games with Finite Time Horizon

Nontransferable Utility Cooperative Dynamic Games

1 Introduction

2 The Basic Game

3 A Combined Truncated Subgame

Definition 3.1

4 Solution to a Combined Truncated Cooperative Subgame

Theorem 4.1

5 Concept of the Combined Solution

Theorem 5.1

Definition 5.1

Theorem 5.2

Proof

6 Looking Forward Approach in Cooperative Extraction

7 Conclusions

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification

Navigation

Looking Forward Approach in Cooperative Differential Games with Uncertain Stochastic Dynamics

Abstract

Similar content being viewed by others

About the Looking Forward Approach in Cooperative Differential Games with Transferable Utility

Strong Time-Consistent Subset of the Core in Cooperative Differential Games with Finite Time Horizon

Nontransferable Utility Cooperative Dynamic Games

1 Introduction

2 The Basic Game

3 A Combined Truncated Subgame

Definition 3.1

4 Solution to a Combined Truncated Cooperative Subgame

Theorem 4.1

5 Concept of the Combined Solution

Theorem 5.1

Definition 5.1

Theorem 5.2

Proof

6 Looking Forward Approach in Cooperative Extraction

7 Conclusions

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation