
1 Motivation

Static team theory is a mathematical formalism for decision problems with multiple decision makers acting on different information but affecting a single payoff [1]. Its generalization to dynamic team theory has far-reaching implications throughout science and engineering. In general, decentralized systems consist of multiple local observation posts and decision or control stations, in which the actions applied at the decision stations are computed using different information, that is, the arguments of the decision strategies are different. As usual, we call “information structures or patterns” the information available at the decision stations for implementing their actions, and we call them “decentralized information structures” if the information available at the various decision stations is not identical across stations. Moreover, we call an information structure “classical” if all decision stations have the same information and this information is nested, also called perfect recall (i.e., a decision station that at some time has certain information available will have the same information available at all subsequent times); otherwise, we call it “nonclassical.” Early work discussing the importance of information structures in decision making and its applications is found in [2, 3], while more recent work is found in [4–6]. The most general examples with nested information structures admitting closed-form solutions appear to be the ones in [7].

Stochastic discrete-time dynamic decision problems with nonclassical information structures are often formulated using team theory; the two methods proposed over the years are based on identifying conditions under which discrete-time stochastic dynamic team problems can be equivalently reduced to static team problems [5], and on dynamic programming [6].

In this chapter, we discuss recent results by the authors in [8–12] on stochastic dynamic decision problems with nonclassical information structures, formulated using dynamic team theory. For an introduction to static team theory and related literature, we suggest the paper by Jan van Schuppen in this book.

Our objectives are the following.

  1. Apply Girsanov’s change of probability measure to transform stochastic dynamic team problems into equivalent problems in which the state and/or the observations and information structures are not affected by any of the team decisions;

  2. Apply the stochastic maximum principle to derive necessary and sufficient team and person-by-person (PbP) optimality conditions.

We illustrate the importance of Girsanov’s change of probability measure [13] in generalizing Witsenhausen’s [5] notion of equivalence between discrete-time stochastic dynamic team problems and static team problems, to continuous-time Itô stochastic nonlinear differential decentralized decision problems and to general classes of discrete-time models. The optimal strategies of Witsenhausen’s counterexample [2] can be derived using this method. We also invoke the stochastic maximum principle to present necessary and sufficient team and PbP optimality conditions, described in terms of conditional variational inequalities with respect to the information structures and backward stochastic differential equations (BSDEs).
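
As a toy orientation for the change-of-measure idea (a minimal sketch, not the authors’ construction), consider a one-dimensional mean shift: an expectation under the shifted measure equals a reweighted expectation under the reference measure, and the reference samples never depend on the shift. Here the drift `theta` and the payoff `g` are hypothetical choices.

```python
import numpy as np

# Minimal sketch of the change-of-measure idea in one dimension: for a mean
# shift theta, E under N(theta, 1) of g(X) equals E under N(0, 1) of
# exp(theta*X - theta**2/2) * g(X).  The drift theta stands in for a fixed
# decision; the reference samples are independent of it.
rng = np.random.default_rng(0)
theta, n = 0.7, 200_000

x = rng.standard_normal(n)                    # samples under the reference measure P
weight = np.exp(theta * x - 0.5 * theta**2)   # Radon-Nikodym derivative dP^u/dP

g = lambda z: z**2                            # any integrable payoff

direct = np.mean(g(rng.standard_normal(n) + theta))  # sample under P^u directly
reweighted = np.mean(weight * g(x))                  # reweight reference samples

print(direct, reweighted)   # both approximate theta**2 + 1 = 1.49
```

The continuous-time exponential (19.6) below plays exactly the role of `weight` here, with the observation drift in place of the mean shift.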

2 Team Theory of Stochastic Dynamic Systems

Given the information structure \(\{I^k(t): t \in [0, T]\}\) available at the \(k\)th decision station and the corresponding admissible regular strategies \({\mathbb U}_{reg}^k[0,T]\) for \(k=1, \ldots , K\), the team decision problem is defined as follows.

$$\begin{aligned}&\inf \{ J(u^1, \ldots , u^K): (u^1,\ldots , u^K) \in \times \,_{k=1}^K {\mathbb U}_{reg}^k[0,T]\} \end{aligned}$$
(19.1)
$$\begin{aligned}&J(u^1, \ldots , u^K) = \mathbf{E} \Big \{ \int _{0}^T \ell (t,x(t),u^1(t,I^1), \ldots , u^K(t,I^K))\mathrm{{d}}t + \varphi (x(T)) \Big \} , \end{aligned}$$
(19.2)

subject to stochastic dynamics with state \(x(\cdot )\) and noisy observations at the observation posts \(\{y^i(\cdot ): i=1, \ldots , M\}\), which are solutions of the Itô differential equations

$$\begin{aligned} dx(t) =&f(t,x(t),u^1(t,I^1), \ldots , u^K(t,I^K))\mathrm{{d}}t \nonumber \\ &+ \sigma (t,x(t), u^1(t,I^1),\ldots , u^K(t,I^K))dW(t), \,\, x(0)=x_0, \end{aligned}$$
(19.3)
$$\begin{aligned} dy^m(t) =&h^m(t,x(t),u^1(t,I^1), \ldots , u^K(t,I^K))\mathrm{{d}}t + D^{m,\frac{1}{2}}(t) dB^{m}(t), \; m=1, \ldots , M. \end{aligned}$$
(19.4)

Here, \(u^k(t, I^k) \in {\mathbb A}^k \subseteq {\mathbb R}^{d_k}\), and \(\{ W(t): t \in [0,T]\}, \{B^{m}(t): t \in [0,T]\}\) are independent Brownian motion (BM) processes. The stochastic system (19.3) can be specialized to various interconnected subsystem architectures. For simplicity, we assume \(K=M \equiv N\), and we consider the following information structures. Denote the observation posts which communicate with the \(i\)th decision station by \({\fancyscript{O}}(i) \subset \{1,2,\ldots ,i-1, i+1,\ldots ,N\} \subset {\mathbb Z}_N \mathop {=}\limits ^{\triangle }\{1, 2, \ldots , N\}\), \(i=1, \ldots , N\).

Nonclassical Information Structures. The decision applied at the \(i\)th decision station, \(u_t^i \equiv u^i(t, I^i)\) at time \(t\in [0,T]\) is a nonanticipative measurable function of \( I^i(s) \mathop {=}\limits ^{\triangle }\{y^i(s), y^j(s-\varepsilon _j): \varepsilon _j >0, j \in {\fancyscript{O}}(i)\}, 0 \le s \le T, i=1, \ldots , N\).

Letting \({\fancyscript{G}}_{0,t}^{I^i} \mathop {=}\limits ^{\triangle }\sigma \big \{I^i(s): 0\le s \le t\big \}\) denote the minimal \(\sigma -\)algebra generated by \(\{ I^i(s): 0\le s \le t\}, t \in [0,T]\), then \(\{{\fancyscript{G}}_{0,t}^{I^i}: i=1, \ldots , N\}\) are nonclassical because they are different.

Restricting \({\fancyscript{G}}_{0,t}^{I^i}\) to \({\fancyscript{G}}^{I^i(t)} \mathop {=}\limits ^{\triangle }\sigma \big \{I^i(t)\}\), then \(\{ {\fancyscript{G}}^{I^i(t)}: i=1, \ldots , N\} \) are nonclassical because they are different, and nonnested, because \({\fancyscript{G}}^{I^i(t)} \nsubseteq {\fancyscript{G}}^{I^i(\tau )}, \forall \tau >t, i=1, \ldots , N \).

A generic information structure is denoted by \({\fancyscript{G}}_T^i\mathop {=}\limits ^{\triangle }\{ {\fancyscript{G}}_t^i: t \in [0,T]\}\), which can be specialized to \(\{{\fancyscript{G}}_{0,t}^{y^i}: t \in [0,T]\}, \{{\fancyscript{G}}_{0,t}^{I^i}: t \in [0,T]\}, \{{\fancyscript{G}}^{I^i(t)}: t \in [0,T]\}\).

Next, we describe the set of admissible relaxed strategies, which are conditional distributions; regular strategies are recovered as the special case of delta measures [9, 11].

Definition 19.1

(The Admissible Relaxed Strategies) The strategy applied at the \(i\)th decision station is a conditional distribution defined by

$$\begin{aligned}u_t^i(\varGamma )= q_t^i(\varGamma | {\fancyscript{G}}_{t}^{i}), \quad \text{ for } \,\,t \in [0, T], \,\, \text{ and } \quad \forall \,\varGamma \; \in {\fancyscript{B}}({\mathbb A}^i), \,\, i=1, \ldots , N. \end{aligned}$$

Each strategy is a member of an appropriate function space [9, 11] denoted by \({\mathbb U}_{rel}^i[0,T], i =1, \ldots , N\). An \(N\)-tuple of relaxed strategies belongs to \({\mathbb U}_{rel}^{(N)}[0,T] \mathop {=}\limits ^{\triangle }\times _{i=1}^N {\mathbb U}_{rel}^i[0,T]\).

The notation for relaxed strategies \(u \in {\mathbb U}_{rel}^{(N)}[0,T]\) is \(f(t,x,u_t) \mathop {=}\limits ^{\triangle }\int ( f(t,x,\xi ^1, \ldots , \xi ^N)) \times _{i=1}^N u_t^i(d\xi ^i) \) and similarly for \(\{\sigma , h, \ell \}\) in (19.1)–(19.4).
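
As a concrete illustration of this notation for a finite (hypothetical) action set, the relaxed drift is simply the average of the regular drift under the decision’s distribution, and a regular strategy is the delta-measure special case:

```python
import numpy as np

# Relaxed drift f(t, x, u_t) = \int f(t, x, xi) u_t(d xi) over a hypothetical
# finite action grid; a regular strategy is the special case of a delta
# measure, i.e., all probability mass on a single action.
actions = np.array([-1.0, 0.0, 2.0])   # hypothetical action set A^i
u_t = np.array([0.25, 0.25, 0.5])      # relaxed decision: a distribution on A^i

def drift(x, xi):
    """Hypothetical drift f(t, x, xi) for a single decision station."""
    return -x + xi

x = 1.0
relaxed = np.sum(drift(x, actions) * u_t)   # -x + E[xi] = -1.0 + 0.75 = -0.25
regular = drift(x, actions[2])              # delta measure on the action 2.0

print(relaxed, regular)   # -0.25  1.0
```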

Problem 19.1

(Team and PbP Optimality) A relaxed strategy \(u^o \in {\mathbb U}_{rel}^{(N)}[0,T]\) for (19.1)–(19.4) is called team optimal if

$$\begin{aligned} J(u^{1,o}, u^{2,o}, \ldots , u^{N,o}) \le J(u^1, u^2, \ldots , u^N), \quad \forall u\mathop {=}\limits ^{\triangle }(u^1, u^2, \ldots , u^N) \in {\mathbb U}_{rel}^{(N)}[0,T], \end{aligned}$$

and it is called PbP optimal if it satisfies \( \tilde{J}(u^{i,o}, u^{-i,o}) \le \tilde{J}(u^{i}, u^{-i,o}), \forall u^i \in {\mathbb U}_{rel}^{i}[0,T]\), \( \forall i \in {\mathbb Z}_N,\) \(\tilde{J}(v,u^{-i})\mathop {=}\limits ^{\triangle }J(u^1,u^2,\ldots , u^{i-1},v,u^{i+1},\ldots ,u^N).\)

Notation. \(C([0,T], {\mathbb R}^n)\): space of continuous \({\mathbb R}^n-\)valued functions defined on \([0,T]\).

\(L_{{\mathbb F}_T}^2([0,T],{\mathbb R}^n)\): space of \( \{ {\mathbb F}_{0,t}: t \in [0, T]\}-\)adapted \({\mathbb R}^n-\)valued random processes \(\{z(t): t \in [0,T]\}\) such that \({\mathbb E}\int _{[0,T]} |z(t)|_{{\mathbb R}^n}^2 \mathrm{{d}}t < \infty \).

\(L_{{\mathbb F}_T}^2([0,T], {\fancyscript{L}}({\mathbb R}^m,{\mathbb R}^n)) \): space of \( \{ {\mathbb F}_{0,t}: t \in [0, T]\}-\)adapted \(n\times m\) matrix valued random processes \(\{ \Sigma (t): t \in [0,T]\}\) such that \( {\mathbb E}\int _{[0,T]} |\Sigma (t)|_{{\fancyscript{L}}({\mathbb R}^m,{\mathbb R}^n)}^2 \mathrm{{d}}t < \infty .\)

\(B_{{\mathbb F}_T}^{\infty }([0,T], L^2(\varOmega ,{\mathbb R}^{n}))\): space of \(\{{\mathbb F}_{0,t}: t\in [0,T]\}\)-adapted \({\mathbb R}^{n}-\) valued second-order random processes endowed with the norm topology \( \parallel x \parallel ^2 \mathop {=}\limits ^{\triangle }\sup _{t \in [0,T]} {\mathbb E}|x(t)|_{{\mathbb R}^{n}}^2\).

3 Equivalent Stochastic Dynamic Team Problems

In this section, we invoke Girsanov’s theorem to transform the original stochastic dynamic decision problem into an equivalent decision problem whose observations and information structures are mutually independent and independent of any of the team decisions. Consider the following elements (WP1)–(WP3).

(WP1):

\(x(0)=x_0\): an \({\mathbb R}^n\)-valued random variable with distribution \(\varPi _0(dx)\);

(WP2):

\(\{ W(t): t \in [0,T]\}\): an \({\mathbb R}^{m}\)-valued standard BM, independent of \(x(0)\);

(WP3):

\(\{ y^i(t) \mathop {=}\limits ^{\triangle }\int _{0}^t D^{i,\frac{1}{2}}(s) dB^i(s): t \in [0,T]\}\): \({\mathbb R}^{k_i}\)-valued, \(i=1, \ldots , N\), mutually independent BMs, independent of \(\{W(t): t \in [0,T]\}\).

  where \(W(\cdot ) \in C([0,T], {\mathbb R}^m)\), with Borel \(\sigma -\)algebra \({\fancyscript{F}}_{0,T}^W \mathop {=}\limits ^{\triangle }{\fancyscript{B}}(C([0,T], {\mathbb R}^m))\) and Wiener measure \({\mathbb P}^W\) on it, and similarly for \(y^i(\cdot )\), with \({\fancyscript{F}}_{0,T}^{y^i} \mathop {=}\limits ^{\triangle }{\fancyscript{B}}(C([0,T], {\mathbb R}^{k_i}))\), \({\mathbb P}^{y^i}\), \(i=1,\ldots , N\), and \({\fancyscript{B}}(C([0,T], {\mathbb R}^{k})) \mathop {=}\limits ^{\triangle }\otimes _{i=1}^N {\fancyscript{B}}(C([0,T], {\mathbb R}^{k_i}))\), \(k=\sum _{i=1}^N k_i\), \({\mathbb P}^{y} \mathop {=}\limits ^{\triangle }\times \, _{i=1}^N {\mathbb P}^{y^i}\). Then, we define the reference probability space \((\varOmega , {\mathbb F}, \{{\mathbb F}_{0,t}: t \in [0,T]\}, {\mathbb P})\) by \(\varOmega \mathop {=}\limits ^{\triangle }{\mathbb R}^n \times C([0,T], {\mathbb R}^m) \times C([0,T], {\mathbb R}^{k})\), \({\mathbb F} \mathop {=}\limits ^{\triangle }{\fancyscript{B}}({\mathbb R}^n) \otimes {\fancyscript{B}}(C([0,T], {\mathbb R}^m)) \otimes {\fancyscript{B}}(C([0,T], {\mathbb R}^{k}))\), \({\mathbb F}_{0,t} \mathop {=}\limits ^{\triangle }{\fancyscript{B}}({\mathbb R}^n) \otimes {\fancyscript{F}}_{0,t}^W \otimes {\fancyscript{G}}_{0,t}^y, t \in [0,T],\) \({\mathbb P} \mathop {=}\limits ^{\triangle }\varPi _0 \times {\mathbb P}^W \times {\mathbb P}^{y}\), with the independent observations of (WP3), which are unaffected by the decisions, and the state process given by

$$\begin{aligned} dx^u(t)=f(t,x^u(t),u_t)\mathrm{{d}}t + \sigma (t,x^u(t),u_t)dW(t), \quad x(0)=x_0, \,\, t \in (0,T]. \end{aligned}$$
(19.5)

Then, we introduce the exponential functions for \(i=1, \ldots , N\):

$$\begin{aligned} \varLambda ^{i,u}(t) \mathop {=}\limits ^{\triangle }&\exp \Big \{ \int _{0}^t h^{i,*}(s,x(s),u_s)D^{i,-1}(s) dy^i(s) \nonumber \\&-\frac{1}{2} \int _{0}^t h^{i,*}(s,x(s),u_s)D^{i,-1}(s) h^{i}(s,x(s),u_s)ds \Big \}, \; \varLambda ^u(t) \mathop {=}\limits ^{\triangle }\prod _{i=1}^N \varLambda ^{i,u}(t), \nonumber \\ d \varLambda ^u(t) =&\varLambda ^u(t) \sum _{i=1}^N h^{i,*}(t,x(t),u_t)D^{i,-1}(t) dy^i(t), \quad \varLambda ^u(0)=1, \,\, t \in [0,T]. \end{aligned}$$
(19.6)

For \(u \in {\mathbb U}_{rel}^{(N)}[0,T]\), the payoff under the reference probability space \((\varOmega , {\mathbb F}, {\mathbb P})\) is

$$\begin{aligned} J(u) \mathop {=}\limits ^{\triangle }{\mathbb E} \Big \{ \int _{0}^{T} \varLambda ^u(t) \ell (t,x(t),u_t) \mathrm{{d}}t + \varLambda ^u(T) \varphi (x(T))\Big \}. \end{aligned}$$
(19.7)

Under the reference probability measure \({\mathbb P}\), the payoff (19.7) with \(\varLambda ^u(\cdot )\) given by (19.6), and the state process satisfying (19.5) is a transformed problem with observations which are not affected by any of the team decisions. If \(\{ \varLambda ^{u}(t): t \in [0, T]\}\) is an \((\{{\mathbb F}_{0,t}: t \in [0,T]\}, {\mathbb P})\)-martingale, then, we define a probability measure by setting

$$\begin{aligned} \frac{ d{\mathbb P}^u}{ d {{\mathbb P}}} \Big |_{ {\mathbb F}_T} = \varLambda ^u(T). \end{aligned}$$
(19.8)
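
A minimal Euler–Maruyama sketch illustrates the martingale property that licenses (19.8), under hypothetical scalar coefficients \(f=0\), \(\sigma =1\), \(D=1\), and bounded observation drift \(h(x)=\sin x\) (so Novikov’s condition holds): under the reference measure, where \(y(\cdot )\) is a Brownian motion, the sample mean of \(\varLambda ^u(T)\) is approximately 1.

```python
import numpy as np

# Sketch of the P-martingale property of the exponential (19.6) for a scalar
# system with hypothetical coefficients f = 0, sigma = 1, D = 1 and bounded
# observation drift h(x) = sin(x); under P, y is a Brownian motion.
rng = np.random.default_rng(1)
n_paths, n_steps, T = 100_000, 50, 1.0
dt = T / n_steps

x = np.zeros(n_paths)
log_lam = np.zeros(n_paths)
for _ in range(n_steps):
    h = np.sin(x)                                    # measurable w.r.t. the past
    dy = rng.standard_normal(n_paths) * np.sqrt(dt)  # dy = dB under P
    dW = rng.standard_normal(n_paths) * np.sqrt(dt)
    log_lam += h * dy - 0.5 * h**2 * dt              # log of (19.6), Euler step
    x += dW                                          # reference state: dx = dW

lam_T = np.exp(log_lam)
print(lam_T.mean())   # ≈ 1, consistent with E[Lambda^u(T)] = 1
```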

Under the probability space \((\varOmega , {\mathbb F}, {\mathbb P}^u)\), the observations, state, and payoff are given by (19.2)–(19.4), with \(B^{i}(\cdot )\) replaced by \(B^{i,u}(\cdot ), i=1,\ldots , N\), and \(\mathbf{E}\) by \({\mathbb E}^u\).

Fact 1. The formulation of the stochastic dynamic team problem under the probability space \((\varOmega ,{\mathbb F},\{ {\mathbb F}_{0,t}: t \in [0, T]\}, {\mathbb P}^u)\), (19.1)–(19.4), is equivalent to that under the reference probability space \((\varOmega ,{\mathbb F},\{ {\mathbb F}_{0,t}: t \in [0, T]\}, {\mathbb P})\), (19.5)–(19.7), in which \(\{y^i(t): t \in [0,T]\}, i=1, \ldots , N\), are Brownian motions, and hence the information structures are independent of the team decisions.

Fact 2. Girsanov’s approach implies that the probability measure \({\mathbb P}^u\) and the Brownian motions \(\{B^{i,u}(t): t \in [0,T]\}\) depend on \(u\), but \(\{y^i(t): t \in [0,T]\}\) and \(\{ {\fancyscript{G}}_{0,t}^{y^i}: t \in [0,T]\}, i=1, \ldots , N\), are fixed a priori and independent of \(u\). When the information structures are functionals of the state process \(\{x(t): t \in [0, T]\}\), then for \(\sigma \) independent of \(u\), we can also make \(\{x(t): t \in [0, T]\}\) independent of \(u\) by replacing (19.5) by \(dx(t)=\sigma (t,x(t))dW(t)\) [12]. We can also derive a BSDE relating the value process and PbP optimality [12].

Fact 3. Girsanov’s measure transformation generalizes and makes precise Witsenhausen’s [5] equivalence of discrete-time stochastic dynamic decision problems to static team problems. The “common denominator condition” and “change of variables” described in [5] are equivalent to the “change of probability measure” via (19.8) and the associated Bayes’ theorem expressing \(J(u)\) via (19.7) [11, 12].

4 Team and PbP Optimality Conditions

In this section, we describe some of the consequences of Fact 1, Fact 2, Fact 3, in deriving necessary and sufficient team and PbP optimality conditions.

4.1 Discrete-Time Equivalence of Static and Dynamic Team Problems

Consider a discrete-time team problem on the probability space \((\varOmega , {\mathbb F}, \{ {\mathbb F}_{0,t}: t \in {\mathbb N}_0^T\}, {{\mathbb P}^u}), {\mathbb N}_0^T\mathop {=}\limits ^{\triangle }\{0,1, \ldots , T\}, {\mathbb N}_1^T\mathop {=}\limits ^{\triangle }\{1, 2, \ldots , T\}\) described by

$$\begin{aligned} x(t+1) =\,&f(t,x(0), \ldots , x(t),u^1(t), \ldots , u^N(t))+G(t) w^u(t+1), \,\, t \in {\mathbb N}_0^{T-1}, \end{aligned}$$
(19.9)
$$\begin{aligned} y^m(t) =\,&h^m(t,x(t),u^1(t), \ldots , u^N(t))+D^{\frac{1}{2},m}(t) b^{m,u}(t), \; t \in {\mathbb N}_0^T, \; \forall m \in {\mathbb Z}_N, \end{aligned}$$
(19.10)
$$\begin{aligned} J(u) =\,&{\mathbb E}^u \{ \sum _{t=0}^{T-1} \ell (t,x(t),u^1(t), \ldots , u^N(t)) + \varphi (x(T))\}, \end{aligned}$$
(19.11)

where \(\{x(0), w^u(\cdot ), b^{m,u}(\cdot )\}\) are independent, \(x(0)\) is distributed according to \(\varPi _0(dx)\), \(\{w^u(t) \sim \zeta _t (\cdot ) \mathop {=}\limits ^{\triangle }\text{ Gaussian }(0, I_{n\times n}): t \in {\mathbb N}_1^T\}\), \(\{b^{m,u}(t) \sim \lambda _t^m(\cdot ) \mathop {=}\limits ^{\triangle }\text{ Gaussian }(0, I_{k_m\times k_m}): t \in {\mathbb N}_0^T\}\), and \(G(\cdot ), D^{\frac{1}{2},m}(\cdot )\) are invertible, for \(m=1, \ldots , N\).

Next, we specify the information structures. For \(t\in \{0,1, \ldots , T\}\), let \({\fancyscript{Y}}_t \mathop {=}\limits ^{\triangle }\{(\tau ,m)\in \{0,1, \ldots , t\} \times \{1, 2, \ldots , N\}\}\). A data basis at time \(t \in \{0, 1, \ldots , T\}\) for the \(k\)th decision station is a subset \({\fancyscript{Y}}_{t,k} \subseteq {\fancyscript{Y}}_t\). The interpretation is that the decision applied by the \(k\)th station at time \(t\) is based on \(\{y^\mu (\tau ): (\tau , \mu ) \in {\fancyscript{Y}}_{t, k}\}\), i.e., \(u^k(t) \equiv \gamma _t^k( \{ y^\mu (\tau ): (\tau , \mu ) \in {\fancyscript{Y}}_{t,k} \})\), \(t \in {\mathbb N}_0^T\), \(k=1, \ldots , N\).
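
For instance, a hypothetical one-step-delayed-sharing pattern for \(N=2\) stations, where each station sees its own observations up to \(t\) and the other station’s up to \(t-1\), corresponds to the following data bases:

```python
# Hypothetical one-step-delayed-sharing data bases Y_{t,k} for N = 2 stations:
# station k has its own observations y^k(0..t) and the other station's
# observations y^m(0..t-1).
N = 2

def data_basis(t, k):
    """Set of pairs (tau, m) indexing the observations available to station k at t."""
    own = {(tau, k) for tau in range(t + 1)}
    shared = {(tau, m) for m in range(1, N + 1) if m != k for tau in range(t)}
    return own | shared

Y = data_basis(2, 1)
print(sorted(Y))   # [(0, 1), (0, 2), (1, 1), (1, 2), (2, 1)] -- (2, 2) not yet shared
```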

We introduce Girsanov’s measure transformation via the following quantity.

$$\begin{aligned} \varTheta _{0,t}^u\mathop {=}\limits ^{\triangle }&\prod _{\tau =1}^{t} \frac{\zeta _{\tau }( G^{-1}(\tau -1)( x(\tau )-f(\tau -1,x(0), \ldots , x(\tau -1),u^1(\tau -1), \ldots , u^N(\tau -1))))}{ |G(\tau -1)| \zeta _{\tau }(x(\tau ))} \nonumber \\&\cdot \frac{\lambda _{\tau }(D^{-\frac{1}{2}}(\tau )(y(\tau )-h(\tau ,x(\tau ),u^1(\tau ), \ldots , u^N(\tau ))))}{|D^{\frac{1}{2}}(\tau )|\lambda _{\tau }(y(\tau ))}, \,\, t \in {\mathbb N}_1^T, \end{aligned}$$
(19.12)

where \(\varTheta _{0,0}^u \mathop {=}\limits ^{\triangle }\frac{\lambda _{0}( D^{-\frac{1}{2}}(0)(y(0)-h(0,x(0),u^1(0), \ldots , u^N(0))))}{|D^{\frac{1}{2}}(0)|\lambda _{0}(y(0))}.\) Under \((\varOmega , {\mathbb F}, \{ {\mathbb F}_{0,t}: t \in {\mathbb N}_0^T\}, {{\mathbb P}})\), \(\{(x(t), y^m(t)): t \in {\mathbb N}_0^T\}\) are independent, with \(x(0)\) having distribution \(\varPi _0(dx)\), \(\{x(t) \sim \zeta _t (\cdot ): t \in {\mathbb N}_1^T\}\), and \(\{y^m(t) \sim \lambda _t^m(\cdot ): t \in {\mathbb N}_0^T\}\), for \(m=1, \ldots , N\), unaffected by the team decisions, and the discrete-time team payoff is given by

$$\begin{aligned} J(u) =&\int \Big \{\varTheta _{0,T}^u(x(0),u^1(0),\ldots , u^N(0), y(0),\ldots , x(T), u^1(T),\ldots , u^N(T), y(T)) \nonumber \\&\cdot \Big (\sum _{t=0}^{T-1} \ell (t,x(t),u^1(t), \ldots , u^N(t)) + \varphi (x(T)) \Big ) \Big \} \nonumber \\&\cdot \varPi _{0}(dx(0)) \, \lambda _{0}(y(0)) \prod _{t=1}^{T} \zeta _{t}(x(t)) \, \lambda _{t}(y(t)) \, dx(t) \, dy(t), \end{aligned}$$
(19.13)
$$\begin{aligned} J(\gamma _{[0,T]}^*) =&\inf \Big \{ J(\gamma _{[0,T]}): \gamma _{[0,T]} \in {\mathbb U}^{(N)}[0,T], \,\, J(\cdot ) \equiv (19.13) \Big \}. \end{aligned}$$
(19.14)

This is the transformed equivalent stochastic team problem.
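
A one-step scalar Monte Carlo sketch (with hypothetical observation map \(h(x)=0.5x\), \(D=1\), \(\varPi _0 = N(0,1)\), and the decision held fixed) illustrates the equivalence: the expectation of \(x(0)y(0)\) under the original measure matches the \(\varTheta \)-weighted expectation under the reference measure, where \(y(0)\) is an independent standard Gaussian.

```python
import numpy as np

# One-step sketch of the discrete-time change of measure: with hypothetical
# h(x) = 0.5 x, D = 1 and Pi_0 = N(0, 1), the expectation E^u[x(0) y(0)]
# under the original measure equals the Theta-weighted expectation under the
# reference measure, where y(0) ~ N(0, 1) is independent of x(0).
rng = np.random.default_rng(2)
n = 400_000

def phi(z):                         # standard normal density lambda_0
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def h(x):                           # hypothetical observation map, decision fixed
    return 0.5 * x

x0 = rng.standard_normal(n)                  # x(0) ~ Pi_0

y_u = h(x0) + rng.standard_normal(n)         # observation under the original measure
direct = np.mean(x0 * y_u)                   # E^u[x(0) y(0)] = 0.5 E[x0^2] = 0.5

y = rng.standard_normal(n)                   # observation under P, independent of x0
theta = phi(y - h(x0)) / phi(y)              # one factor of Theta in (19.12)
reweighted = np.mean(theta * x0 * y)

print(direct, reweighted)   # both ≈ 0.5
```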

Fact 4. The dynamic team problem (19.9)–(19.11) is equivalent to the static team problem (19.13)–(19.14), and hence, optimality conditions in [4] are applicable to (19.14). This is applied in [11] to compute the optimal strategies of Witsenhausen’s counterexample [2].
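
For orientation, a Monte Carlo sketch of Witsenhausen’s counterexample with his classic parameters \(k=0.2\), \(\sigma =5\), evaluating a hypothetical affine strategy pair (not the optimal nonlinear strategies computed in [11]):

```python
import numpy as np

# Monte Carlo evaluation of Witsenhausen's counterexample cost
#   J = E[ k^2 u1^2 + (x1 - u2)^2 ],  x1 = x0 + u1,  y1 = x1 + v,
# for a hypothetical affine pair: u1 = (lam - 1) x0 and u2 the best linear
# estimate of x1 given y1 (these are not the optimal nonlinear strategies).
rng = np.random.default_rng(3)
n = 400_000
k, sigma, lam = 0.2, 5.0, 1.0       # classic parameters; lam = 1 means u1 = 0

x0 = sigma * rng.standard_normal(n)
v = rng.standard_normal(n)

u1 = (lam - 1.0) * x0
x1 = x0 + u1                        # first-stage state, variance (lam*sigma)^2
y1 = x1 + v                         # noisy observation at the second station
gain = (lam * sigma)**2 / ((lam * sigma)**2 + 1.0)
u2 = gain * y1                      # linear least-squares estimate of x1

cost = np.mean(k**2 * u1**2 + (x1 - u2)**2)
print(cost)   # ≈ sigma^2/(sigma^2 + 1) = 25/26 ≈ 0.9615 for lam = 1
```

Witsenhausen showed that nonlinear (signaling) strategies strictly beat every affine pair of this form, which is what makes the counterexample nonclassical.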

4.2 Continuous-Time Stochastic Maximum Principle for Team Optimality

Next, we present the optimality conditions for Problem 19.1, derived in [11]. Consider the equivalent team problem under the reference probability space \((\varOmega ,{\mathbb F},\{ {\mathbb F}_{0,t}: t \in [0, T]\}, {\mathbb P})\), described by \(\{\varLambda , x\}\) satisfying (19.6), (19.5), and reward (19.7). Let \(X \mathop {=}\limits ^{\triangle }Vector\{\varLambda , x\} \in {\mathbb R} \times {\mathbb R}^n, \overline{W}(\cdot )\mathop {=}\limits ^{\triangle }Vector\{\int _{0}^\cdot D^{\frac{1}{2}}(s)dB(s), W(\cdot )\} \in {\mathbb R}^{k+m},\)

\(h(t,x,u) \mathop {=}\limits ^{\triangle }Vector\{h^1(t,x,u),\ldots , h^N(t,x,u)\}, \; L(t,X,u) \mathop {=}\limits ^{\triangle }\varLambda \ell (t,x,u), \, \varPhi (X) \mathop {=}\limits ^{\triangle }\varLambda \varphi (x)\), then

$$\begin{aligned} dX(t) =&F(t,X(t),u_t)dt + G(t,X(t),u_t) d\overline{W}(t), \, X(0)=X_0, \, t \in (0,T]. \end{aligned}$$
(19.15)
$$\begin{aligned} J(u) =&{\mathbb E} \{ \int _{0}^T L(t,X(t),u_t)dt + \varPhi (X(T))\} . \end{aligned}$$
(19.16)

The Hamiltonian \( {\fancyscript{H}}: [0, T] \times {\mathbb R}^{n+1} \times {\mathbb R}^{n+1} \times {\fancyscript{L}}( {\mathbb R}^{m+k}, {\mathbb R}^{n+1})\times {\fancyscript{M}}_1( {\mathbb A}^{(N)}) \rightarrow {\mathbb R}\) is

$$\begin{aligned} {\fancyscript{H}}(t,X,\varPsi ,Q,u) \mathop {=}\limits ^{\triangle }\langle F(t,X,u),\varPsi \rangle + tr (Q^* G(t,X,u)) + L(t,X,u). \end{aligned}$$
(19.17)

For any \(u \in {\mathbb U}_{rel}^{(N)}[0,T]\), the state process satisfies (19.15), and the adjoint process pair \((\varPsi , Q) \in L_{{\mathbb F}_T}^2([0,T], {\mathbb R}^{n+1}) \times L_{{\mathbb F}_T}^2([0,T], {\fancyscript{L}}( {\mathbb R}^{m+k}, {\mathbb R}^{n+1}))\) satisfies the BSDE

$$\begin{aligned} d\varPsi (t) =- {\fancyscript{H}}_X(t,X(t),\varPsi (t),Q(t),u_t) dt + Q(t) d\overline{W}(t), \; \varPsi (T)= \varPhi _X(X(T)). \end{aligned}$$
(19.18)

Theorem 19.1

([11] Team Optimality Conditions. Necessary Conditions) For an element \( u^o \in {\mathbb U}_{rel}^{(N)}[0,T]\) with the corresponding solution \(X^o \in B_{{\mathbb F}_T}^{\infty }([0,T], L^2(\varOmega , {\mathbb R}^{n+1}))\) to be team optimal, it is necessary that

(1) There exists \(({\varPsi }^o,Q^o) \in L_{{\mathbb F}_T}^2([0,T],{\mathbb R}^{n+1})\times L_{{\mathbb F}_T}^2([0,T],{\fancyscript{L}}({\mathbb R}^{m+k},{\mathbb R}^{n+1}))\).

(2) The variational inequality is satisfied:

$$\begin{aligned} \sum _{i=1}^N {\mathbb E} \Big \{ \int _0^T {\fancyscript{H}} (t,X^o(t),\varPsi ^o(t), Q^{o}(t), u_t^{-i,o},u_t^i-u_t^{i,o}) dt \Big \}\ge 0, \,\, \forall u \in {\mathbb U}_{rel}^{(N)}[0,T]. \end{aligned}$$

(3) \(({\varPsi }^o,Q^o)\) is a unique solution of the BSDE (19.18) with \(u^o \in {\mathbb U}_{rel}^{(N)}[0,T]\) satisfying

$$\begin{aligned} {\mathbb E} \Big \{ {\fancyscript{H}}(t,X^o(t),&\varPsi ^o(t),Q^o(t),u_t^{-i, o}, \nu ^i) |{\fancyscript{G}}_{t}^{i} \Big \} \ge {\mathbb E} \Big \{ {\fancyscript{H}}(t,X^o(t), \varPsi ^o(t),Q^o(t),u_t^{o}) |{\fancyscript{G}}_{ t}^{i} \Big \}, \nonumber \\&\forall \nu ^i \in {\fancyscript{M}}_1({\mathbb A}^i), a.e. t \in [0,T], {\mathbb P}|_{{\fancyscript{G}}_{t}^{i}}- a.s., i \in {\mathbb Z}_N \end{aligned}$$
(19.19)

Sufficient Conditions. Let \((X^o(\cdot ), u^o(\cdot ))\) denote an admissible state and decision pair, let \(\varPsi ^o(\cdot )\) denote the corresponding adjoint process, and assume

(C) \({\fancyscript{H}} (t, \cdot ,\varPsi ,Q,\nu )\) is convex in \(X \in {\mathbb R}^{n+1}\); \(\varPhi (\cdot )\) is convex in \(X \in {\mathbb R}^{n+1}\).

Then, \((X^o(\cdot ),u^o(\cdot ))\) is optimal if it satisfies (19.19).

Fact 5. The necessary conditions for team optimality are equivalent to those of PbP optimality, and under (C), PbP optimality implies team optimality. Since regular strategies \( {\mathbb U}_{reg}^{(N)}[0,T]\) are embedded into relaxed strategies \( {\mathbb U}_{rel}^{(N)}[0,T]\), from Theorem 19.1 we obtain the optimality conditions for regular strategies [8, 11]. Applications of (19.19) lead to fixed point-type equations. Examples can be carried out as in [9].
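
A minimal sketch of such a fixed-point computation, on a hypothetical static quadratic team (two stations, \(y_i = x + v_i\), payoff \({\mathbb E}[(u^1+u^2-x)^2 + r((u^1)^2+(u^2)^2)]\), linear strategies \(u^i = g_i y_i\)): iterating the person-by-person best responses converges to the unique fixed point, which by convexity is team optimal.

```python
# Person-by-person best-response iteration on a hypothetical static quadratic
# team: x ~ N(0, 1), y_i = x + v_i with v_i ~ N(0, s2) independent, linear
# strategies u_i = g_i * y_i, payoff E[(u1 + u2 - x)^2 + r*(u1^2 + u2^2)].
# Minimizing over u_i given y_i gives u_i = (E[x|y_i] - E[u_j|y_i]) / (1 + r),
# and E[x|y_i] = c*y_i, E[y_j|y_i] = c*y_i with c = 1/(1 + s2), so the best
# response on the gain is g_i = c*(1 - g_j)/(1 + r), a contraction.
r, s2 = 0.5, 1.0
c = 1.0 / (1.0 + s2)

g1 = g2 = 0.0
for _ in range(100):
    g1 = c * (1.0 - g2) / (1.0 + r)     # station 1 best-responds to g2
    g2 = c * (1.0 - g1) / (1.0 + r)     # station 2 best-responds to g1

print(g1, g2)   # both converge to c/(1 + r + c) = 0.25
```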

5 Additional Results and Open Issues

For noiseless and noisy nonclassical information structures, related results and examples, without invoking Girsanov’s measure transformation, are found in [9, 10].

Extensions to a stochastic differential equation driven by continuous and jump martingales can be derived from those in [8].

Extensions of the stochastic maximum principle to discrete-time dynamic systems can be derived from those in [9–11].

Extensions to minimax or Nash-equilibrium strategies are still open problems.