Abstract
Radner’s solution of the static team decision problem is revisited. A careful and complete statement of the static decentralized optimization problem, also referred to as the team decision problem, is given. Decentralized optimization is considered in the framework of nonzero-sum game theory, and the impact of the partial information pattern on the structure of the optimal strategies is analyzed. The complete solution of the static decentralized multivariate Quadratic Gaussian (QG) optimization problem is obtained.
1 Introduction
A static stochastic decentralized optimization problem is considered in which a team of two decision makers/players is at work. The cost function is
where \(u \in {R}^{m_{u}}\) and \(v \in {R}^{m_{v}}\) are the two players’ respective decision variables/controls and the state of nature, \(\zeta \in {R}^{n}\), n ≥ 2, is a random variable whose p.d.f. f(ζ) is known to both players. This is the players’ prior information—it is public information. The random variable ζ is partitioned
and the information pattern is as follows. At decision time the component ζ 1 is known to the player whose control is u, the u-player, and the component ζ 2 is known to the player whose control is v, the v-player. Thus, both players have imperfect information. The u-player is oblivious of the ζ 2 component of the random variable, which is the v-player’s private information, and consequently the strategy of the u-player is u = u(ζ 1). The v-player is oblivious of the ζ 1 component of the random variable, which is the u-player’s private information, and consequently his or her strategy is v = v(ζ 2). The players have partial, or incomplete, information.
To obtain the optimal solution/strategies of the team/decentralized optimization problem, the following optimization problem in Hilbert space must be solved.
The instance where the u-player is interested in minimizing the cost function (1) whereas the v-player strives to maximize the cost (1) calls for the formulation of a stochastic zero-sum game with incomplete information, where a saddle point in pure strategies, in Hilbert space, is sought: the value of the game, if it exists, is
This static zero-sum game in Hilbert space is in normal form.
In both the decentralized optimization problem posed in (2) and the static zero-sum game formulation (3), the u- and v-players have partial information. In both formulations the players decide on their respective strategies u( ⋅) and v( ⋅) knowing the type of information that will become available to them, but before the information is actually received. In other words, in (2) and (3) the players’ strategies are of prior-commitment type. This is why the game (3) is nevertheless zero-sum, even though the players have partial information, so that their respective costs are conditional on their private information and thus differ. For the same reason, the solution of the decentralized optimization problem (2) entails the minimization of just one cost functional.
The decentralized stochastic static optimization problem in Hilbert space (2), referred to as a team decision problem, was addressed by Radner in his pioneering paper [1]. The present work could aptly be named “variations on a team by Radner.” Since a strong interest in Witsenhausen’s counterexample from 1968 [2] persists to this day, it is important to revisit Radner’s 1962 paper. Indeed, after the appearance of Radner’s paper and until the publication of Witsenhausen’s counterexample, it was widely believed in the controls community that the linear quadratic Gaussian (LQG) paradigm guarantees the applicability of the separation, or certainty equivalence, principle: as in LQG optimal control, the state is Gaussian distributed, so the sufficient statistics are linear in the measurements/information and are provided by linear Kalman filters, and consequently the players’ optimal strategies are linear in the sufficient statistic, in particular the linear state estimate. However, Radner showed in [1] that in the static Quadratic Gaussian (QG) optimization problem with incomplete information, although the players’ optimal strategies are affine in the information, the separation, or certainty equivalence, principle does not apply. And in [2] Witsenhausen showed that in the simplest decentralized dynamic LQG optimal control problem neither does the separation principle apply, nor are the optimal strategies linear in the measurements. The bottom line: Radner’s paper [1] relates to Witsenhausen’s paper [2] as Statics relates to Dynamics in Mechanical Engineering. Thus, with a view to also obtaining a better understanding of Witsenhausen’s counterexample, it is instructive to revisit Radner’s work and closely examine the informational and game-theoretic aspects of the decentralized static QG optimization problem/team decision problem.
The article is organized as follows. In Sect. 2 the decentralized optimization problem is analyzed using the concept of delayed-commitment strategies, and necessary conditions for the existence of a solution are obtained. The necessary conditions derived in Sect. 2 are used in Sect. 3 to directly obtain the solution of the decentralized static multivariate QG optimization problem. The applicability of the separation principle/certainty equivalence is discussed in Sect. 4. The necessary and sufficient conditions for the existence of a solution of the decentralized static multivariate QG optimization problem are discussed in Sect. 5. The solution of the decentralized static multivariate QG optimization problem using the concept of prior-commitment strategies is presented in Sect. 6. It is shown that although in the static case the delayed-commitment and prior-commitment strategies are equivalent, when the concept of prior-commitment strategies is used, the strategies are harder to derive. Finally, in Sect. 7 the decentralized static multivariate QG optimization problem where the players’ information is asymmetric is solved. The structure of the optimal solutions for cases of extreme informational asymmetry yields interesting insights into decentralized optimal control. Conclusions are presented in Sect. 8.
2 Analysis
The solution of the static team/decentralized optimization problem pursued in this paper is based on the following approach. Rather than tackling the Hilbert space optimization problem (2) head on, we instead opt for a game theoretic analysis of the decision problem on hand.
Consider first the decision problem faced by the u-player after he or she has received the information ζ 1, but before anyone has acted. His or her cost is evaluated as follows.
Similar considerations apply to the v-player: having received the ζ 2 information, the cost which the v-player strives to minimize is
Now, the u- and v-players’ strategies are of delayed-commitment type. Consequently, although both players strive to minimize the cost function (1), since they have partial information, their expected costs will be conditional on their private information and will not be the same—each player minimizes his or her own cost functional. The static team problem/decentralized optimal control problem (2) has been reformulated as a stochastic nonzero-sum game with incomplete information. Hence, a Nash equilibrium is sought. Using delayed-commitment strategies has highlighted informational issues which are apparent in extensive-form games but are suppressed in normal-form games.
If a solution to the team/decentralized control problem in the form of a Nash equilibrium exists, it can be obtained as follows.
The u-player’s value function is
and his or her optimal strategy is obtained as follows: the u-player calculates the vector in \({R}^{m_{u}}\)
The v-player’s value function is
and his or her optimal strategy is obtained as follows: the v-player calculates the vector in \({R}^{m_{v}}\)
Hence, in order to determine the players’ optimal strategies, that is, the functions u ∗ ( ⋅) and v ∗ ( ⋅), the equation in \(u \in {R}^{m_{u}}\),
must be solved ∀ζ 1, and in this way the u-player’s strategy u ∗ (ζ 1) is obtained. At the same time the equation in \(v \in {R}^{m_{v}}\)
must be solved ∀ζ 2, and in this way the v-player’s strategy v ∗ (ζ 2) is obtained. In addition, the following second-order conditions/inequalities must hold
and
A set of two coupled functional equations (6) and (7) has been derived whose solution, if it exists, yields the u- and v-players’ respective Nash strategies u ∗ (ζ 1) and v ∗ (ζ 2). Evidently, the solution of static team/decentralized optimization problems and/or nonzero-sum stochastic games calls for the solution of a somewhat nonconventional mathematical problem, (6) and (7). The culprit is the partial information pattern.
At this juncture it is apparent that the solution concept advanced for the original team/decentralized control problem is a Nash equilibrium in the nonzero-sum stochastic game (4) and (5). Using delayed commitment strategies, a Person-By-Person Satisfactory (PBPS) minimization is pursued: the strategy u ∗ ( ⋅) of the u-player is best, given that the v-player uses the strategy v ∗ ( ⋅), and the strategy v ∗ ( ⋅) of the v-player must be best, given that the u-player uses the strategy u ∗ ( ⋅). Thus, the derived strategies \(({u}^{{\ast}}(\cdot ),{v}^{{\ast}}(\cdot ))\) are person-by-person minimal. This is so because the players’ outcomes provided by \(({u}^{{\ast}}(\cdot ),{v}^{{\ast}}(\cdot ))\) cannot be improved by unilaterally changing, say, u ∗ ( ⋅) alone; and, vice versa, the strategy \(({u}^{{\ast}}(\cdot ),{v}^{{\ast}}(\cdot ))\) cannot be improved by changing v ∗ ( ⋅) alone—this being the essence of a Nash equilibrium. Now, in nonzero-sum games, the calculated Nash equilibrium better be unique, for the solution to be applicable. However, in the absence of conflict of interest, as is the case in our original team/decentralized optimization problem (2), uniqueness of the Nash equilibrium solution is not an issue and the players will naturally settle on that particular Nash equilibrium \(({u}^{{\ast}}(\cdot ),{v}^{{\ast}}(\cdot ))\) which yields the minimal expected cost—we here refer to the calculated expected cost (2), namely
Uniqueness of the obtained Nash equilibrium follows if the cost function (1) is convex in u and in v. This is so because the weighted sum of convex functions is convex—see (4) and (5).
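The person-by-person mechanics can be made concrete with a minimal deterministic sketch. The scalar cost J(u, v) = ½R⁽ᵘ⁾u² + ½R⁽ᵛ⁾v² + R⁽ᵘᵛ⁾uv + au + bv and all numeric values below are illustrative assumptions, not taken from the text; for a convex cost, iterating each player's best response converges to the Nash equilibrium, which coincides with the joint minimizer.

```python
import numpy as np

# Assumed scalar cost: J(u, v) = 0.5*Ru*u^2 + 0.5*Rv*v^2 + Ruv*u*v + a*u + b*v
Ru, Rv, Ruv, a, b = 2.0, 3.0, 1.0, 1.0, -2.0

# Person-by-person (best-response) iteration
u, v = 0.0, 0.0
for _ in range(100):
    u = -(Ruv * v + a) / Ru   # u-player's best response to the current v
    v = -(Ruv * u + b) / Rv   # v-player's best response to the current u

# For a convex cost (M > 0) the fixed point is the joint minimizer: M [u; v] = [-a; -b]
M = np.array([[Ru, Ruv], [Ruv, Rv]])
u_star, v_star = np.linalg.solve(M, np.array([-a, -b]))
assert np.isclose(u, u_star) and np.isclose(v, v_star)
```

The iteration contracts at rate (R⁽ᵘᵛ⁾)²/(R⁽ᵘ⁾R⁽ᵛ⁾) < 1 here, which is exactly the convexity condition on the joint cost.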
Clearly, the optimal solution of the original team/decentralized optimization problem (2), if it exists, is PBPS, that is, it is a Nash equilibrium. However, having found a Nash equilibrium of the nonzero-sum stochastic game (4) and (5), even a unique one, does not guarantee optimality in the original team/decentralized control problem, where one is interested in the expected cost (2). To answer the question of the existence of an optimal solution of the original team/decentralized control problem, the optimization problem (2) must be considered in a Hilbert space setting, as in [1], and convexity in (u, v) of the cost function (1) is required.
In summary, if a solution of the team/decentralized optimization problem exists, the above outlined solution of the attendant nonzero-sum stochastic game (4) and (5) will yield its optimal solution. However, should the cost function (1) be convex in u and v, but not in (u, v), then, while a Nash equilibrium in the nonzero-sum game (4) and (5) might exist, a solution of the decentralized optimization problem (2) might not exist.
3 Static Quadratic Gaussian Team
Using the theory developed in Sect. 2, the complete solution of the multivariate QG team decision/decentralized optimization problem is now derived.
The payoff function (1) is quadratic:
and the components of the random variable ζ are \(\zeta _{1} \in {R}^{m}\), \(\zeta _{2} \in {R}^{n-m}\). The u- and v-players’ control variables are u ∈ R m and v ∈ R n − m, and the respective controls’ effort weighting matrices
R (u,v) is an (n − m) × m coupling matrix.
We calculate the v-player’s payoff
Differentiation in v yields the unique optimal control response to the u-player’s strategy u(ζ 1),
The u-player’s payoff is
and differentiation in u yields the unique optimal control response to the v-player’s strategy v(ζ 2),
Furthermore, the positive definiteness of the controls’ effort weighting matrices guarantees that the conditions (8) and (9) hold.
At this point we assume that the p.d.f. f of the random variable ζ is a multivariate normal distribution, that is,
and the covariance matrix P is real, symmetric, and positive definite. In other words, the random variable
In the special case of a bivariate normal distribution with \(\zeta _{1},\zeta _{2} \in {R}^{1}\),
and − 1 < ρ < 1.
The following is well known.
Lemma 1.
Consider the multivariate normal distribution (15). The distribution of ζ 1 conditional on ζ 2 is
and the distribution of ζ 2 conditional on ζ 1 is
The marginal p.d.f.s f 1 (ζ 1 ) and f 2 (ζ 2 ) are also Gaussian, that is,
and
In the special case of a bivariate normal distribution (16), the distribution of ζ 1 conditional on ζ 2 is
and the distribution of ζ 2 conditional on ζ 1 is
The marginal p.d.f.s f 1 (ζ 1 ) and f 2 (ζ 2 ) are
and
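Lemma 1 can be sanity-checked numerically. The sketch below uses arbitrary assumed parameter values and verifies that the general block formulas for the conditional mean and covariance reduce, in the bivariate case, to the familiar scalar expressions involving the correlation ρ.

```python
import numpy as np

# Assumed bivariate normal parameters (illustrative values only)
zbar1, zbar2 = 1.0, -0.5
s1, s2, rho = 2.0, 3.0, 0.4
P = np.array([[s1**2, rho * s1 * s2],
              [rho * s1 * s2, s2**2]])
P11, P12, P22 = P[0, 0], P[0, 1], P[1, 1]
z2 = 0.7   # an observed value of zeta_2

# General block formulas: E[z1 | z2] and Var[z1 | z2]
cond_mean = zbar1 + P12 / P22 * (z2 - zbar2)
cond_var = P11 - P12**2 / P22

# Bivariate special case: mean shifts by rho*s1/s2, variance shrinks by (1 - rho^2)
assert np.isclose(cond_mean, zbar1 + rho * s1 / s2 * (z2 - zbar2))
assert np.isclose(cond_var, s1**2 * (1 - rho**2))
```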
Inserting (18) into (14) yields
where
and inserting (17) into (12) yields
where
Using the convolution notation we obtain
where the function \(G_{P_{2,2}-P_{1,2}^{T}P_{1,1}^{-1}P_{1,2}}\) is the p.d.f. of the Gaussian random variable w 1. Similarly
where the function \(G_{P_{1,1}-P_{1,2}P_{2,2}^{-1}P_{1,2}^{T}}\) is the p.d.f. of the Gaussian random variable w 2. Hence, the optimal strategies satisfy the equations
and
Equations (25) and (26) constitute a linear system of two convolution-type Fredholm integral equations of the second kind with Gaussian kernels, in the unknown functions/optimal strategies u ∗ ( ⋅) and v ∗ ( ⋅). Moreover, the forcing functions are linear in their arguments. In view of these observations, we apply
Ansatz 2.
The u- and v-players’ optimal strategies are affine, that is,
and
□
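The Ansatz is natural: a Gaussian-kernel convolution maps affine functions to affine functions, so the affine family is closed under the operations appearing in (25) and (26). A quick numerical illustration with assumed parameter values:

```python
import numpy as np

# Assumed Gaussian kernel and affine function (illustrative values only)
mu, sig = 0.3, 1.2        # kernel mean and standard deviation
a, b = 2.0, -0.7          # affine function phi(y) = a + b*y
t = np.linspace(-10.0, 10.0, 20001)
dt = t[1] - t[0]
G = np.exp(-(t - mu)**2 / (2 * sig**2)) / (sig * np.sqrt(2 * np.pi))

def conv_affine(x):
    # (phi * G)(x) = integral of phi(x - t) G(t) dt, by a Riemann sum
    return np.sum((a + b * (x - t)) * G) * dt

xs = np.array([-2.0, 0.0, 1.5])
vals = np.array([conv_affine(x) for x in xs])
# Closed form: a + b*(x - mu) -- still affine in x
assert np.allclose(vals, a + b * (xs - mu), atol=1e-5)
```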
Inserting the strategies (27) and (28) into the respective (25) and (26), we calculate
and
We conclude that the following four linear equations in the four unknowns \(K_{m\times m}^{(u)}\), \(K_{(n-m)\times (n-m)}^{(v)}\), \({c}^{(u)} \in {R}^{m}\) and \({c}^{(v)} \in {R}^{n-m}\) hold:
and
Combining (29) and (30) yields the respective linear, Lyapunov type, matrix equations for K (u) and K (v),
and
Solving the linear Lyapunov-type matrix equations (33) and (34) yields the optimal gains K (u) and K (v), whereupon the constant vectors \({c}^{(u)} \in {R}^{m_{u}}\) and \({c}^{(v)} \in {R}^{m_{v}}\) are
Concerning the calculation of the intercepts c (u) and c (v), the following holds.
A necessary condition for the existence of a solution to the multivariate decentralized QG optimization problem is that the Schur complements \({R}^{(u)} - {({R}^{(u,v)})}^{T}{({R}^{(v)})}^{-1}{R}^{(u,v)}\) and \({R}^{(v)} - {R}^{(u,v)}{({R}^{(u)})}^{-1}{({R}^{(u,v)})}^{T}\) are nonsingular.
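The Lyapunov-type equations (33) and (34) are linear in the unknown gains. As a generic sketch (the coefficient matrices A, B, C below are placeholders, not the actual problem data), any matrix equation of the assumed form K = AKB + C can be solved by vectorization, using vec(AKB) = (Bᵀ ⊗ A) vec(K):

```python
import numpy as np

# Placeholder data: A and B scaled down so that I - B^T (x) A is well conditioned
rng = np.random.default_rng(0)
m = 3
A = 0.3 * rng.standard_normal((m, m))
B = 0.3 * rng.standard_normal((m, m))
C = rng.standard_normal((m, m))

# Solve vec(K) = (I - B^T kron A)^{-1} vec(C), column-major vectorization
I = np.eye(m * m)
vecK = np.linalg.solve(I - np.kron(B.T, A), C.flatten(order="F"))
K = vecK.reshape((m, m), order="F")

# Verify that K satisfies the matrix equation
assert np.allclose(K, A @ K @ B + C)
```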
In the special case where the controls are scalars and the p.d.f. of the random variable ζ is the bivariate normal distribution (16), the optimal gains are
and
The intercepts are the solution of the linear system
so that
and
The following holds.
Proposition 3.
The necessary and sufficient conditions for the existence of a solution of the scalar decentralized QG optimization problem using delayed commitment strategies are
and
The u- and v-players’ optimal strategies are specified in (35)–(38) and are determined by the scalar problem parameters R (u) , R (v) , R (u,v) , \(\overline{\zeta }_{1}\) , \(\overline{\zeta }_{2}\) , σ 1 , σ 2 , and ρ. The optimal solution (35)–(38) is symmetric.□
Corollary 4.
In the special scalar case where the random variable’s components ζ 1 and ζ 2 are uncorrelated and ρ = 0, the optimal strategies are
and
Also, in the special case where in the quadratic cost function there is no coupling and R (u,v) = 0, the optimal strategies are linear:
and
□
4 Certainty Equivalence
We briefly digress and first examine the centralized static QG optimization problem.
4.1 Centralized QG Optimization Problem
In the centralized static QG optimization problem where both players have complete knowledge of the state of nature \({(\zeta _{1},\zeta _{2})}^{T}\), a necessary and sufficient condition for the existence of an optimal solution is
and the optimal controls \({({u}^{{\ast}},{v}^{{\ast}})}^{T}\) are
We shall require the following.
Lemma 5.
Consider the blocked symmetric matrix
and let
Assuming the required matrix inverses exist, the inverse matrix
where the blocks
An alternative representation in blocked form of the inverse matrix N is
Proof.
By inspection, and the application of the Matrix Inversion Lemma. □
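Lemma 5 can be verified numerically on assumed sample blocks: the inverse of a symmetric 2 × 2 block matrix is expressible through the Schur complements of its diagonal blocks.

```python
import numpy as np

# Assumed symmetric positive definite blocked matrix M = [[A, B], [B^T, C]]
A = np.array([[4.0, 1.0], [1.0, 3.0]])
C = np.array([[5.0, 0.5], [0.5, 2.0]])
B = np.array([[1.0, 0.2], [0.3, 0.4]])
M = np.block([[A, B], [B.T, C]])

S_A = C - B.T @ np.linalg.inv(A) @ B    # Schur complement of A
S_C = A - B @ np.linalg.inv(C) @ B.T    # Schur complement of C

# Blocked inverse via the Matrix Inversion Lemma
N11 = np.linalg.inv(S_C)
N22 = np.linalg.inv(S_A)
N12 = -np.linalg.inv(A) @ B @ N22
N = np.block([[N11, N12], [N12.T, N22]])

assert np.allclose(N, np.linalg.inv(M))
```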
We shall also require
Lemma 6.
The real symmetric matrix
is positive definite iff the matrices R (v) > 0, R (u) > 0 and their respective Schur complements are positive definite, that is,
□
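Lemma 6 can be illustrated numerically (block values assumed): M is positive definite iff R⁽ᵘ⁾ and R⁽ᵛ⁾ are positive definite together with their Schur complements.

```python
import numpy as np

def is_pd(X):
    # A symmetric matrix is positive definite iff all its eigenvalues are positive
    return bool(np.all(np.linalg.eigvalsh(X) > 0))

# Assumed sample blocks; R^(u,v) is the (n-m) x m coupling block
Ru = np.array([[4.0, 1.0], [1.0, 3.0]])
Rv = np.array([[5.0]])
Ruv = np.array([[1.0, 2.0]])

M = np.block([[Ru, Ruv.T], [Ruv, Rv]])

schur_u = Ru - Ruv.T @ np.linalg.inv(Rv) @ Ruv
schur_v = Rv - Ruv @ np.linalg.inv(Ru) @ Ruv.T

# Lemma 6 equivalence, checked on this example
assert is_pd(M) == (is_pd(Ru) and is_pd(Rv) and is_pd(schur_u) and is_pd(schur_v))
```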
In view of Lemmas 5 and 6, the following holds.
where
or, alternatively,
Hence, in the centralized scenario the explicit formulae for the optimal controls are
and
Corollary 7.
In the special case where the controls are scalars, the necessary and sufficient conditions for the existence of an optimal solution are
and
The optimal controls are linear and the solution is symmetric:
□
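The centralized computation can be sketched as follows. Writing the cost as ½xᵀMx + bᵀx with x = (u, v)ᵀ, where the linear term b, which in the text depends on (ζ 1, ζ 2), is an assumed placeholder here, the optimal controls follow from a single first-order condition Mx* = −b whenever M > 0.

```python
import numpy as np

# Assumed scalar weights; b is a hypothetical linear term standing in for the
# (zeta_1, zeta_2)-dependent part of the quadratic cost
Ru, Rv, Ruv = 2.0, 3.0, 1.0
M = np.array([[Ru, Ruv], [Ruv, Rv]])
assert np.all(np.linalg.eigvalsh(M) > 0)   # existence condition: M > 0

b = np.array([0.7, -1.1])
x_star = -np.linalg.solve(M, b)            # first-order condition M x* = -b

# Verify optimality: the cost at x_star does not exceed nearby perturbations
cost = lambda x: 0.5 * x @ M @ x + b @ x
rng = np.random.default_rng(2)
for _ in range(10):
    assert cost(x_star) <= cost(x_star + 0.1 * rng.standard_normal(2))
```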
4.2 Separation Principle
We now return to the decentralized QG optimization problem and ascertain the applicability of certainty equivalence, a.k.a. the separation principle. We confine our attention to the scalar case and a bivariate Gaussian random variable (16).
When the information available to the u-player is restricted to the ζ 1 component of the state of nature, then, according to Lemma 1, his or her Maximum Likelihood (ML) estimate of the ζ 2 component of the state of nature will be
Similarly, when the information available to the v-player is restricted to the ζ 2 component of the state of nature, then, according to Lemma 1, his or her ML estimate of the ζ 1 component of the state of nature will be
Replacing ζ 2 in the centralized solution given by Corollary 7, (43), by the u-player’s ML estimate \(\hat{\zeta _{2}}\) of ζ 2 yields the u-player’s certainty equivalence-based affine strategy
and replacing ζ 1 in the centralized solution given by Corollary 7, (44), by the v-player’s ML estimate \(\hat{\zeta }_{1}\) of ζ 1 yields the v-player’s affine strategy
In the special case where the random variable’s components ζ 1 and ζ 2 are not correlated, that is, ρ = 0, the players’ certainty equivalence-based affine strategies are
and
In the special case where there is no coupling in the quadratic payoff function and R (u, v) = 0, the players’ certainty equivalence-based strategies are (39) and (40).
5 Discussion
Like the optimal strategies of the decentralized control problem, the certainty equivalence-based strategies (45) and (46) are affine and symmetric. However, comparing the u-player’s optimal strategy, specified in (35) and (37), with his or her certainty equivalence-based strategy (45), and similarly the v-player’s optimal strategy, specified in (36) and (38), with his or her certainty equivalence-based strategy (46), leads one to conclude that certainty equivalence does not hold. This is so even when there is no correlation and the parameter ρ = 0. Certainty equivalence holds only in the special case where there is no coupling in the quadratic payoff function and R (u,v) = 0. This state of affairs is attributable to the partial information pattern.
It is also interesting to contrast the conditions for the existence of a solution of the centralized QG optimization problem and the conditions for the existence of a solution of the decentralized QG optimization problem. We note that the solution (41) and (42) of the centralized optimization problem can be formally derived using the PBPS solution concept. For this we need
and the Schur complements must be nonsingular, that is,
and
At the same time, we know that an optimal solution of the centralized optimization problem exists iff the matrix M is positive definite. Hence, in view of Lemma 6, we conclude that the positive definiteness of the Schur complements of the positive definite matrices R (u) and R (v) is a necessary condition for the existence of an optimal solution of the centralized optimization problem. At the same time, the invertibility of the Schur complements, while not sufficient to guarantee the existence of a solution of the centralized optimal control problem, is sufficient for a solution of the decentralized optimization problem that conforms to the PBPS solution concept—we have obtained a unique Nash solution, and in the scalar case the u- and v-players’ Nash strategies are determined by (35), (37), and (36), (38), respectively.
Now, in view of [1], the positive definiteness of M is sufficient for the existence of an optimal solution of the decentralized optimization problem (2): the necessary and sufficient condition for the existence of a solution of the centralized optimization problem is a sufficient condition for the existence of an optimal solution of the decentralized problem, and moreover, the u- and v-players’ Nash strategies determined by (35), (37), and (36), (38), respectively, are then optimal. However, if the matrix M is not positive definite but the matrices R (u) and R (v) are positive definite and their Schur complements are nonsingular, then while an optimal solution to the centralized optimization problem does not exist, in the decentralized control problem a PBPS solution concept-based unique Nash equilibrium exists.
6 Decentralized Static Quadratic Gaussian Optimization Problem
The original formulation of the decentralized optimization problem with a quadratic payoff functional, as formulated by Radner, (2), is considered in the special context of the multivariate QG optimization problem:
From [1] we know that optimal prior commitment strategies u ∗ ( ⋅) and v ∗ ( ⋅) exist and they are affine, provided the quadratic cost function is convex, that is, the matrix M is positive definite. Thus, the u- and v-players’ strategies are parameterized as follows:
and
The subscript p indicates that now the strategies are of the prior-commitment type.
Inserting the expressions (48) and (49) into (47) yields
The payoff (50) is a function of the parameters \(K_{p}^{(u)}\), \(K_{p}^{(v)}\), \(c_{p}^{(u)}\), and \(c_{p}^{(v)}\).
The payoff function is differentiated in the parameters and the derivatives are set equal to zero; the order of integration and differentiation can be interchanged. We shall use the following notation.
e i is the unit vector in the Euclidean space R m or R n − m, all of whose entries are zero except entry number i.
The following calculations are needed.
Lemma 8.
and consequently, using the properties of the Trace operator and the fact that the marginal p.d.f. of ζ 1 is Gaussian with expectation \(\overline{\zeta }_{1}\) and covariance P 1,1 , we calculate
Similarly,
and consequently
In addition
and consequently
Similarly,
and consequently
Also,
and consequently
Similarly,
and consequently
Furthermore,
and consequently
Similarly,
and consequently
In addition,
and consequently
Similarly,
and consequently
Finally,
and consequently
Similarly,
and consequently
□
The optimality conditions and Lemma 8 yield the system of \(n(n + 1) - 2m(n - m)\) linear equations
where \(e_{i},e_{j} \in {R}^{m}\) and i = 1, …, m, j = 1, …, m,
where \(e_{i},e_{j} \in {R}^{n-m}\) and \(i = 1,\ldots,n - m\), \(j = 1,\ldots,n - m\),
and
The unknowns are \(K_{p}^{(u)}\), an m ×m matrix, \(K_{p}^{(v)}\), an \((n - m) \times (n - m)\) matrix, \(c_{p}^{(u)} \in {R}^{m}\) and \(c_{p}^{(v)} \in {R}^{n-m}\), a total of \(n(n + 1) - 2m(n - m)\) unknowns.
Using (53) and (54) we express the intercepts \(c_{p}^{(u)}\) and \(c_{p}^{(v)}\) as linear functions of \(K_{p}^{(u)}\) and \(K_{p}^{(v)}\):
Hence,
Substituting these expressions into (51) and (52) yields a reduced linear system of \({n}^{2} - 2m(n - m)\) equations in the \({n}^{2} - 2m(n - m)\) unknowns which populate the matrices \(K_{p}^{(u)}\) and \(K_{p}^{(v)}\). Note that if \(\overline{\zeta }_{1} = 0\) and \(\overline{\zeta }_{2} = 0\), \(c_{p}^{(u)} = 0\), \(c_{p}^{(v)} = 0\) and the equations for \(K_{p}^{(u)}\) and \(K_{p}^{(v)}\) are
Example.
In the special case of scalar controls and a bivariate normal distribution we obtain a system of four linear equations for the four scalar unknowns \(K_{p}^{(u)}\), \(K_{p}^{(v)}\), \(c_{p}^{(u)}\), and \(c_{p}^{(v)}\):
Compare the optimal prior commitment strategies specified in (55)–(58) and the delayed commitment strategies explicitly specified in (35)–(38). The optimization problem is static and therefore the prior commitment and delayed commitment strategies are all the same:
So the two sets of formulae (35)–(38) and (55)–(58) give rise to interesting identities. In particular, in the multivariate case new matrix identities will be obtained.
Taking a game theoretic approach naturally leads to the concept of delayed commitment strategies. Although the prior commitment strategies and delayed commitment strategies are equivalent, the above example illustrates that it is much easier to calculate the latter.
7 Asymmetric Players
Scenarios where one team member is strongly informationally disadvantaged relative to the second team member are investigated.
7.1 Asymmetric Players: Case 1
Assume the u-player has perfect information, that is, he or she is privy to the state of nature ζ = (ζ 1, ζ 2), whereas the v-player has access to ζ 2 only. At the same time, the u-player knows that the v-player has the prior information \(\overline{\zeta }_{1}\), \(\overline{\zeta }_{2}\), ρ, σ 1, and σ 2; in fact, in the best tradition of Bayesian games, it is tacitly assumed that both players are simultaneously presented the prior information before the game starts—the prior information is public information.
In this case the u-player’s payoff is
that is, in the case of perfect information the u-player need not calculate an expectation; v(ζ 2) is the unknown input of the v-player.
If the payoff function J is quadratic,
and differentiation in u yields the relationship
The v-player’s payoff function is
and differentiating it in v yields the relationship
Hence,
In the special case of scalar inputs and a bivariate normal distribution (16), the optimal strategy of the v-player is
provided that R (u, v) is not the geometric mean of R (u) and R (v)—which is the case if the quadratic payoff function is convex in the control variable (u, v), whereupon \({R}^{(u)}{R}^{(v)} - {({R}^{(u,v)})}^{2} > 0\). The optimal strategy of the u-player is
Interestingly, although the u-player has complete state-of-nature information, his or her optimal strategy is affine and he or she also uses the public prior information. Concerning the strategy of the informationally disadvantaged v-player: certainty equivalence holds.
7.2 Asymmetric Players: Case 2
As in Sects. 1–6, the private information of the u-player is the ζ 1 component of the state of nature vector ζ. However, we now assume the v-player has no private information and is totally dependent on the public prior information. As in Sect. 7.1, the u-player is aware that the public information is available to the v-player, and he or she also knows that the v-player is “blind.”
The v-player’s payoff is
and differentiation in v yields the unique optimal control response to the u-player’s strategy u(ζ 1),
The expectation E ζ (ζ 2) in (59) is calculated as follows.
where \(f(\zeta _{1},\zeta _{2})\) is the p.d.f. of the state of nature Gaussian random variable ζ and f m(ζ 2) is a marginal Gaussian p.d.f. of f(ζ 1, ζ 2). Recall that to obtain the marginal distribution over a subset of the components of a multivariate normal random variable, one only needs to drop the irrelevant variables (the variables that one wants to marginalize out) from the mean vector and the covariance matrix. For example, in the bivariate normal case the (Gaussian) marginal p.d.f. \(f_{\mathrm{m}}(\zeta _{1})\) is characterized by the parameters \((\overline{\zeta }_{1},\sigma _{1})\) and the marginal p.d.f. f m(ζ 2) is characterized by the parameters \((\overline{\zeta }_{2},\sigma _{2})\). Similarly,
Thus,
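The marginalization rule recalled above, dropping the marginalized variables from the mean vector and the covariance matrix, can be verified by Monte Carlo (parameter values assumed):

```python
import numpy as np

# Assumed three-component Gaussian state of nature
rng = np.random.default_rng(1)
mean = np.array([1.0, -2.0, 0.5])
P = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.5, -0.2],
              [0.1, -0.2, 1.0]])
samples = rng.multivariate_normal(mean, P, size=200_000)

keep = [0, 2]   # marginalize out component 1
marg_mean = mean[keep]
marg_cov = P[np.ix_(keep, keep)]

# Empirical moments of the kept components match the sub-blocks
assert np.allclose(samples[:, keep].mean(axis=0), marg_mean, atol=0.02)
assert np.allclose(np.cov(samples[:, keep].T), marg_cov, atol=0.05)
```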
The u-player’s payoff is
Note that, as far as the u-player is concerned, the v-player does not employ a strategy; therefore the v-player’s input v is no longer a random variable and one need not compute an expectation: the u-player knows that the v-player is “blind.”
Differentiation in u yields the unique optimal control response to the v-player’s input v
Combining (60) and (61) yields the relationship
that is, the v-player’s optimal control is
and the u-player’s optimal strategy is
If the controls are scalars,
and
In conclusion, in the case where the v-player is “blind,” the strategy of the u-player is as if there would be no correlation, that is, the parameter ρ = 0—as in Corollary 4. As far as the v-player is concerned, certainty equivalence holds. A little bit of thought will convince the reader that these results are expected.
8 Conclusion
The static decentralized decision problem has been analyzed. Special attention is given to the multivariate Quadratic Gaussian (QG) payoff function. The optimization problem is static, yet the players have partial information; as such, it is a small step away from the celebrated LQG paradigm. Informational issues, prior-commitment vs. delayed-commitment strategies, and Nash equilibrium solution concepts are discussed. Necessary and sufficient conditions for the existence of a solution are provided and the optimal strategies are calculated. Extreme cases of informational asymmetry are also explored. This work lays the groundwork for a better understanding of optimization problems with partial information where dynamics are also at play.
References
Radner, R.: Team decision problems. Ann. Math. Statist. 33(3), 857–881 (1962)
Witsenhausen, H.S.: A counterexample in stochastic optimal control. SIAM J. Control 6(1), 131–147 (1968)
© 2013 Springer Science+Business Media New York
Pachter, M., Pham, K. (2013). Static Teams and Stochastic Games. In: Sorokin, A., Pardalos, P. (eds) Dynamics of Information Systems: Algorithmic Approaches. Springer Proceedings in Mathematics & Statistics, vol 51. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7582-8_5