Abstract
In this paper a simple static two-player linear-quadratic game in which the players have private information is addressed. Although the information is private, each player is able to formulate an expression for his expected payoff without having to specify, à la Harsanyi, a prior probability distribution for the game's parameter, and without recourse to the player Nature. Hence, the closed-form solution of the game is obtained. It is shown that in this special case of a one-stage linear-quadratic game where the players have private information, the solution is similar in structure to the solution of the game with complete information, namely the deterministic linear-quadratic game, and to the solution of the linear-quadratic game with partial information, where the information about the game's parameter is shared by the players. It is shown that the principle of certainty equivalence holds.
Keywords
- Linear-Quadratic Gaussian Games
- Private information
- Imperfect information
- Perfect information
- Certainty equivalence
- Static games
1 Introduction
This paper is a first step in an attempt to bring closer together the dynamic games paradigm and the theory of games, which historically have developed along separate lines. Dynamic game theorists have traditionally emphasized control-theoretic aspects and the backward induction/dynamic programming solution method, whereas game theorists have focused on information economics, that is, the role of information in games.
Linear-Quadratic Dynamic Games (LQDG) with perfect information have received a great deal of attention (Başar and Bernhard 2008; Başar and Olsder 1995; Engwerda 2005). In these works, the concepts of state, and state feedback, are emphasized and the solution method entails backward induction, a.k.a., dynamic programming. In previous work (Pachter and Pham 2013) a static LQG team problem was addressed. In this paper a static LQDG, where each player has private information, is considered. Specifically, the simplest linear-quadratic game with incomplete/partial information is addressed: a one-stage, two-player, “zero-sum,” Linear-Quadratic Gaussian Game (LQGG) is solved.
In this paper a simple static linear-quadratic game in which the players have private information is analyzed; each player is nevertheless able to formulate an expression for his expected payoff without having to provide a prior probability distribution for the game's parameter and without recourse to the player Nature. Thus, in Sect. 5.2 the static linear-quadratic Gaussian game, where the players have private information, is introduced. The solution of the baseline game with perfect information is given in Sect. 5.3 and the solution of the game with imperfect information is given in Sect. 5.4. The scenario where the players have private information is analyzed in Sect. 5.5, and the complete solution of the game is given in Sect. 5.6. Concluding remarks are made in Sect. 5.7.
2 LQGG Problem Statement
The following linear-quadratic game, a static, two-player, “zero-sum” game, is considered. The players are P and E and their respective control variables are u and v. It is a one-stage game with linear “dynamics”
where the state \(x_{0},x_{1} \in {R}^{n}\). The P and E players’ controls are \(u \in {R}^{m_{u}}\) and \(v \in {R}^{m_{v}}\). The payoff function is quadratic:
where the \(Q_{F}\), \(R_{u}\), and \(R_{v}\) weighting matrices are real, symmetric, and positive definite. Both players are cognizant of the \(A\), \(B\), \(C\), \(Q_{F}\), \(R_{u}\), and \(R_{v}\) data.
Player P strives to minimize the payoff/cost function (5.2) and player E strives to maximize the payoff (5.2).
The initial state information available to player P is
where the vector \(\overline{x}_{0}^{(P)} \in {R}^{n}\) and the \(n \times n\) covariance matrix \(P_{0}^{(P)}\) is real, symmetric, and positive definite. The initial state information available to player E is
where the vector \(\overline{x}_{0}^{(E)} \in {R}^{n}\) and the \(n \times n\) covariance matrix \(P_{0}^{(E)}\) is real, symmetric, and positive definite. The \(P_{0}^{(P)}\) and \(P_{0}^{(E)}\) data is public knowledge—only the \(\overline{x}_{0}^{(P)}\) and \(\overline{x}_{0}^{(E)}\) information is proprietary to the respective P and E players. This is tantamount to saying that players P and E took separate measurements of the initial state \(x_{0}\), yet the accuracy of the instruments they used is known; however, the actual measurements \(\overline{x}_{0}^{(P)}\) and \(\overline{x}_{0}^{(E)}\) are the respective P and E players' private information.
Since the pertinent random variables are Gaussian, we shall refer to the game (5.1)–(5.4) as a Linear-Quadratic Gaussian Game (LQGG).
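The displayed equations (5.1) and (5.2) did not survive the extraction. A standard one-stage structure consistent with the surrounding text, offered here only as an assumed reconstruction, would be:

```latex
% Assumed reconstruction -- the published equations may differ in detail.
% One-stage "dynamics" (5.1):
\[
  x_{1} = A x_{0} + B u + C v .
\]
% Quadratic payoff (5.2), minimized by P and maximized by E; the minus sign
% on the R_v term charges player E for his control effort:
\[
  J(u, v; x_{0}) = x_{1}^{\top} Q_{F}\, x_{1} + u^{\top} R_{u}\, u - v^{\top} R_{v}\, v .
\]
```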
3 Linear-Quadratic Game with Perfect Information
It is instructive to first analyze the perfect information version of the linear-quadratic game (5.1) and (5.2).
If the initial state x 0 is known to both players, we have a game with perfect information.
The closed-form solution of Linear-Quadratic Dynamic Games with perfect information, a.k.a. deterministic Linear-Quadratic Dynamic Games (LQDGs), is derived in Pachter and Pham (2010, Theorem 2.1). The Schur complement concept (Zhang 2005) was used in Pachter and Pham (2010) to invert a blocked \((m_{u} + m_{v}) \times (m_{u} + m_{v})\) matrix and derive explicit formulae for the P and E players' optimal strategies. The said matrix contains four blocks and its diagonal blocks are \(m_{u} \times m_{u}\) and \(m_{v} \times m_{v}\) matrices. One can improve on the results of Pachter and Pham (2010) by noting that a matrix with four blocks has two Schur complements, say \(S_{B}\) and \(S_{C}\).
Concerning the linear-quadratic game (5.1) and (5.2), where the initial state/game parameter x 0 is known to both players and thus the game is a game with perfect information, the following holds.
Theorem 5.1.
A necessary and sufficient condition for the existence of a solution to the zero-sum game (5.1) and (5.2) with perfect information is
A Nash equilibrium/saddle point exists and the players’ optimal strategies are the linear state feedback control laws
An alternative formula for the optimal strategy of player E is
The value of the game
where the matrix
is the first Schur complement of the blocked matrix and
is the second Schur complement of the blocked matrix.
Remark 5.1.
Using both Schur complements of the blocked matrix renders the respective P and E players’ strategies, (5.6) and (5.7), “symmetric.”
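For intuition, a scalar caricature of the game can be solved end to end. The numbers and the payoff form below are illustrative assumptions, not the chapter's data: stationarity in \(u\) and \(v\) yields a coupled two-by-two linear system, and its solution is a saddle point provided the payoff is convex in \(u\) and concave in \(v\) (the role played by condition (5.5)).

```python
import numpy as np

# Illustrative scalar data (not the chapter's example): P minimizes and
# E maximizes J(u, v) = q*(a*x + b*u + c*v)**2 + r_u*u**2 - r_v*v**2.
a, b, c, q, r_u, r_v, x = 1.0, 1.0, 0.5, 1.0, 1.0, 2.0, 3.0

def J(u, v):
    x1 = a * x + b * u + c * v
    return q * x1**2 + r_u * u**2 - r_v * v**2

# Stationarity (dJ/du = 0, dJ/dv = 0) gives a coupled linear system:
#   (q b^2 + r_u) u + q b c v         = -q a b x
#   q b c u       + (q c^2 - r_v) v   = -q a c x
K = np.array([[q * b**2 + r_u, q * b * c],
              [q * b * c,      q * c**2 - r_v]])
rhs = np.array([-q * a * b * x, -q * a * c * x])
u_star, v_star = np.linalg.solve(K, rhs)

# Saddle-point check: J(u*, v) <= J(u*, v*) <= J(u, v*).
for d in (0.1, -0.3, 1.0):
    assert J(u_star, v_star + d) <= J(u_star, v_star) + 1e-9
    assert J(u_star + d, v_star) >= J(u_star, v_star) - 1e-9
```

Here \(q c^2 - r_v < 0\) ensures concavity in \(v\); if \(r_v\) were too small the maximization over \(v\) would be unbounded and no saddle point would exist.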
4 Linear-Quadratic Gaussian Game with Imperfect Information
If in (5.3) and (5.4) \(P_{0}^{(P)} = P_{0}^{(E)} = P_{0}\) and the P and E players' information \(\overline{x}_{0}^{(P)} = \overline{x}_{0}^{(E)} = \overline{x}_{0}\) is public knowledge, we have on hand a linear-quadratic game with imperfect information. This is tantamount to saying that both players, together, took the measurement of the initial state and the outcome was
This is a stochastic game.
The closed-form solution of Linear-Quadratic Dynamic Games with imperfect information proceeds as follows.
Using (5.1) and (5.2), we calculate the payoff function
The random variable at work is the initial state \(x_{0}\). The players calculate the expected payoff function
The expected payoff function \(\overline{J}(u_{0},v_{0};\overline{x}_{0})\) is convex in u 0 and concave in v 0. Differentiation in u 0 and v 0 yields a coupled linear system in the decision variables u 0 and v 0. Its solution is obtained using the Schur complement concept and it yields the optimal P and E strategies. The following holds.
Theorem 5.2.
A necessary and sufficient condition for the existence of a solution to the zero-sum game (5.1) and (5.2) with imperfect information, that is, a game where the initial state information (5.11) is available to both P and E, is that condition (5.5) holds. The respective optimal P and E strategies are given by (5.6) and (5.7) , where x 0 is replaced by \(\overline{x}_{0}\) . The value of the game is
where, as before, the real symmetric matrix \(P_{1}\) is given by (5.9).
Similar to LQG optimal control, in the game with imperfect information the separation principle/certainty equivalence holds.
5 Linear-Quadratic Gaussian Game with Private Information
The initial state \(x_{0}\) features in the payoff function (5.12). The players' information on the initial state \(x_{0}\) is now private information: Player P believes the initial state to be
whereas player E believes the initial state to be
This is tantamount to stipulating that players P and E took separate measurements of the initial state \(x_{0}\). Assuming that the quality of the instruments used to take the measurements is public knowledge—we refer to the measurement error covariances \(P_{0}^{(P)}\) and \(P_{0}^{(E)}\)—the private information of players P and E consists of their respective measurements, \(\overline{x}_{0}^{(P)}\) and \(\overline{x}_{0}^{(E)}\). The measurement recorded by player E, \(\overline{x}_{0}^{(E)}\), is his private information and is not shared with player P. Hence, as far as player P is concerned, an E player with the private information \(\overline{x}_{0}^{(E)} = x\) is an E player of type \(x\). Thus, the P player's information on the game is incomplete. Similarly, the measurement recorded by player P, \(\overline{x}_{0}^{(P)}\), is his private information and is not shared with the E player. Therefore, as far as the E player is concerned, a P player with the private information \(\overline{x}_{0}^{(P)} = y\) is a P player of type \(y\); the E player's information on the game is likewise incomplete.
We are analyzing what appears to be a game with incomplete information. In the process of planning his strategy, a player does not know the type of the opposition he faces. However, although the information is incomplete, a Bayesian player can nevertheless assess, based on the private information available to him, the probability that the opposition he is facing is of a certain type. Consequently, the player can calculate the expectation of the payoff functional, conditioned on his private information.
The strategies available to player P are mappings \(f: {R}^{n} \rightarrow {R}^{m_{u}}\) from his information set into his action set; thus, the action of player P is
Similarly, the strategies available to the E player are mappings \(g: {R}^{n} \rightarrow {R}^{m_{v}}\) from his information set into his action set; thus, the action of player E is
From player P's vantage point, the action \(v_{0}\) of player E is a random variable because, from player P's vantage point, the measurement \(\overline{x}_{0}^{(E)}\) used by player E to form his control \(v_{0}\) is a random variable. Similarly, from player E's vantage point, the action \(u_{0}\) of player P is a random variable.
Consider the decision process of player P whose private information is \(\overline{x}_{0}^{(P)}\).
From player P's perspective, the random variables at work are \(x_{0}\) and \(\overline{x}_{0}^{(E)}\). Player P is confronted with a stochastic optimization problem and he calculates the expectation of the payoff function (5.12), conditioned on his private information \(\overline{x}_{0}^{(P)}\),
It is important to realize that by using in the calculation of his expected cost in (5.19) player E's strategy \(g(\overline{x}_{0}^{(E)})\), rather than player E's control \(v_{0}\), player P has eliminated the possibility of an infinite regress in reciprocal reasoning. Thus, player P calculates
Player P calculates the expectations with respect to the random variable \(\overline{x}_{0}^{(E)}\), which feature in (5.20). To this end, player P models his measurement \(\overline{x}_{0}^{(P)}\) of the initial state \(x_{0}\), and player E's measurement \(\overline{x}_{0}^{(E)}\) of the initial state \(x_{0}\), as follows.
where \(x_{0}\) is the true initial state and \(w_{P}\) is player P's measurement error, whose statistics are
Similarly, player E's measurement
where \(x_{0}\) is the true initial state and \(w_{E}\) is player E's measurement error, whose statistics are
Furthermore, the Gaussian random variables \(w_{P}\) and \(w_{E}\) are independent.
From player P's point of view, \(\overline{x}_{0}^{(E)}\) is a random variable, but \(\overline{x}_{0}^{(P)}\) is not. Subtracting (5.21) from (5.22), player P concludes that, as far as he is concerned, player E's measurement, upon which player E will decide on his control \(v_{0}\), is the random variable
where the random variable
in other words
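Written out explicitly, the measurement model just described takes the following form; this is an assumed reconstruction of (5.21)–(5.25) from the surrounding prose, not the published display:

```latex
% Assumed reconstruction; symbols follow the surrounding text.
\[
  \overline{x}_{0}^{(P)} = x_{0} + w_{P}, \qquad
  w_{P} \sim \mathcal{N}\!\left(0,\, P_{0}^{(P)}\right),
\]
\[
  \overline{x}_{0}^{(E)} = x_{0} + w_{E}, \qquad
  w_{E} \sim \mathcal{N}\!\left(0,\, P_{0}^{(E)}\right).
\]
% Subtracting, and using the independence of w_P and w_E:
\[
  \overline{x}_{0}^{(E)} = \overline{x}_{0}^{(P)} + w, \qquad
  w := w_{E} - w_{P} \sim \mathcal{N}\!\left(0,\, P_{0}^{(P)} + P_{0}^{(E)}\right).
\]
```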
Consider now the calculation of the expectations which feature in (5.20).
where the random variable
Similarly, the expectation
In addition, since
the expectation
Inserting (5.26), (5.28), and (5.30) into (5.20) yields the expression for player P's expected cost in response to player E's strategy \(g(\cdot)\), as a function of his decision variable \(u_{0}\),
Consider now the decision process of player E whose private information is \(\overline{x}_{0}^{(E)}\).
From player E's perspective, the random variables at work are \(x_{0}\) and \(\overline{x}_{0}^{(P)}\). Player E is confronted with a stochastic optimization problem and he calculates the expectation of the payoff function (5.12), conditioned on his private information \(\overline{x}_{0}^{(E)}\),
As before, it is important to realize that by using in the calculation of his expected cost in (5.32) player P's strategy \(f(\overline{x}_{0}^{(P)})\), rather than player P's decision variable \(u_{0}\), player E has eliminated the possibility of an infinite regress in reciprocal reasoning. Thus, player E calculates
Player E calculates the expectations with respect to the random variable \(\overline{x}_{0}^{(P)}\), which feature in (5.33). To this end, player E models his measurement \(\overline{x}_{0}^{(E)}\) of the initial state \(x_{0}\) using (5.22), and he models player P's measurement \(\overline{x}_{0}^{(P)}\) of the initial state \(x_{0}\) using (5.21).
From player E's point of view, \(\overline{x}_{0}^{(P)}\) is a random variable, but \(\overline{x}_{0}^{(E)}\) is not. Subtracting (5.22) from (5.21), player E concludes that, as far as he is concerned, player P's measurement, upon which player P will decide on his control \(u_{0}\), is the random variable
In other words
Consider now the calculation of the expectations which feature in (5.33).
Similarly, the expectation
In addition, since
the expectation
Inserting (5.36), (5.37), and (5.39) into (5.33) yields the expression for player E's expected payoff in response to player P's strategy \(f(\cdot)\), as a function of his decision variable \(v_{0}\),
The cost of player P is now given by (5.31) and the payoff of player E is given by (5.40). Private information thus leads to a nonzero-sum game formulation. Consequently, one is interested in a Nash equilibrium, a.k.a. Person-By-Person Satisfactory (PBPS) strategies.
Next, player P calculates his response to player E's strategy \(g(\overline{x}_{0}^{(E)})\). Thus, given the information \(\overline{x}_{0}^{(P)}\), player P minimizes his expected cost (5.31); the minimization is performed in the decision variable \(u_{0}\). The cost function is quadratic in the decision variable. Thus, the optimal decision variable \(u_{0}^{{\ast}}\) must satisfy the equation
In other words, the optimal response of player P to player E's strategy \(g(\cdot)\) is
Similarly, player E calculates his optimal response to player P's strategy \(f(\overline{x}_{0}^{(P)})\). Thus, given the information \(\overline{x}_{0}^{(E)}\), player E maximizes his expected payoff (5.40); the maximization is performed in the decision variable \(v_{0}\). The payoff function is quadratic in the decision variable. Thus, the optimal decision variable \(v_{0}^{{\ast}}\) must satisfy the equation
In other words, the optimal response of player E to player P's strategy \(f(\cdot)\) is
Hence, the respective optimal strategies \(f^{\ast}(\cdot)\) and \(g^{\ast}(\cdot)\) of players P and E satisfy the set of two coupled equations (5.41) and (5.42),
The expectation
It is convenient to use the following notation for the multivariate Gaussian distribution with covariance \(P\) \((P > 0)\),
whereupon
Similarly, the expectation
Using the convolution notation in (5.41) and (5.42), one obtains
Thus, the functions
and
Inserting (5.44) into (5.43) and suppressing the dependence of the Gaussian p.d.f. on the covariance matrix yields
Similarly, inserting (5.43) into (5.44) yields
The convolution operation is associative. We shall require the following
Lemma 5.1.
The Gaussian kernel is self-similar, namely,
Proof.
We calculate
Hence
□
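Lemma 5.1 can be checked numerically in one dimension: the convolution of two zero-mean Gaussian densities is again Gaussian, with the covariances adding. The grid and the variances below are arbitrary illustrative choices.

```python
import numpy as np

def gauss(x, P):
    """1-D Gaussian density with zero mean and variance P."""
    return np.exp(-x**2 / (2.0 * P)) / np.sqrt(2.0 * np.pi * P)

# Illustrative variances; the lemma asserts N(.;P1) * N(.;P2) = N(.;P1+P2),
# where * denotes convolution.
P1, P2 = 0.5, 1.2
dx = 0.01
x = np.arange(-12.0, 12.0 + dx / 2, dx)  # symmetric grid, odd length

# Discrete approximation of the convolution integral.
conv = np.convolve(gauss(x, P1), gauss(x, P2), mode="same") * dx

# Compare against the Gaussian with summed variance.
err = np.max(np.abs(conv - gauss(x, P1 + P2)))
assert err < 1e-4
```

The grid is wide enough (about 17 standard deviations) that the truncated tails are negligible, so the discrete convolution matches the analytic density to within the discretization error.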
We also calculate
Inserting (5.47) and (5.48) into (5.45) yields a Fredholm integral equation of the second kind for the optimal strategy of player P,
Similarly, inserting (5.47) and (5.48) into (5.46) yields a Fredholm integral equation of the second kind for the optimal strategy of player E,
The Fredholm equations of the second kind (5.49) and (5.50) are of the convolution type and the kernel is a Gaussian function.
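For reference, a Fredholm integral equation of the second kind of the convolution type with a Gaussian kernel has the generic form below; this is schematic notation, and (5.49) and (5.50) are matrix-valued instances of it with forcing terms linear in the players' measurements:

```latex
\[
  f(x) \;=\; h(x) \;+\; \Lambda \int_{\mathbb{R}^{n}}
      \mathcal{N}(x - y;\, P)\, f(y)\, \mathrm{d}y ,
\]
% where h is the forcing term, Lambda a constant matrix, and N(.; P)
% the Gaussian kernel with covariance P.
```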
If the state's measurement error covariances are "small," namely, \(P_{0}^{(P)} \ll 1\) and \(P_{0}^{(E)} \ll 1\), so that the Gaussian distribution approaches a delta function, we conclude from (5.49) and (5.50) that the P and E strategies satisfy the equations
and
From (5.51) and (5.52) we therefore obtain the optimal strategies of players P and E, which are explicitly given by
that is, the optimal strategy of player P is
Similarly,
that is, the optimal strategy of player E is
where the Schur complement
Indeed, having calculated the functions \(f^{\ast}(x)\) and \(g^{\ast}(x)\), we obtained the optimal strategies of players P and E by setting \(x:= \overline{x}_{0}^{(P)}\) in \(f^{\ast}(x)\) and \(x:= \overline{x}_{0}^{(E)}\) in \(g^{\ast}(x)\). In the limiting case of Gaussian distributions with small covariance matrices, the players' optimal strategies (5.53) and (5.54) are linear in the players' respective measurements.
Equations (5.53) and (5.54) are identical to the respective Eqs. (5.6) and (5.7) in Theorem 5.1—we have recovered the perfect information result of Theorem 5.1. This makes sense: the initial state measurements of both players are very accurate and thus the game is almost deterministic. Thus, one could have argued that when the covariances are "small," namely, \(P_{0}^{(P)} \ll 1\) and \(P_{0}^{(E)} \ll 1\), so that \(\overline{x}_{0}^{(P)} \approx \overline{x}_{0}^{(E)} \approx x_{0}\), one can reuse the deterministic state feedback strategies (5.6) and (5.7) of players P and E given by Theorem 5.1—simply set \(x_{0}:= \overline{x}_{0}^{(P)}\) in (5.6) and \(x_{0}:= \overline{x}_{0}^{(E)}\) in (5.7).
6 Linear Strategies
The Fredholm integral equations of the second kind, (5.49) and (5.50), are linear integral equations. Furthermore, the "forcing terms"/inputs on the R.H.S. of (5.49) and (5.50) are linear in \(\overline{x}_{0}^{(P)}\) and \(\overline{x}_{0}^{(E)}\), respectively. Consequently, the solution \(f^{\ast}(\cdot)\) of (5.49) is linear in \(\overline{x}_{0}^{(P)}\) and the solution \(g^{\ast}(\cdot)\) of (5.50) is linear in \(\overline{x}_{0}^{(E)}\)—think of linear integral operators as infinite-dimensional matrices. Hence, we postulate that the players' optimal strategies are linear—in other words,
and
where the yet-to-be-determined constant gains \(F_{u}\) and \(F_{v}\) are \(m_{u} \times n\) and \(m_{v} \times n\) matrices, respectively. Constant-gain strategies (5.56) and (5.57) which satisfy the respective second-kind Fredholm integral equations of the convolution type with a Gaussian kernel, (5.49) and (5.50), can be found. This is due to the fact that, according to (5.48), the convolution of the state vector \(x\) with a Gaussian function returns the state vector \(x\). In the process of deriving the equations which yield the gains \(F_{u}\) and \(F_{v}\), the necessary and sufficient conditions for the existence of a solution are obtained.
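Both facts invoked above can be verified numerically in a one-dimensional caricature (all numbers illustrative): Gaussian smoothing leaves the identity map unchanged, which is just \(E[X] = x\) for \(X \sim \mathcal{N}(x,P)\), and consequently a convolution-type second-kind Fredholm equation with a linear forcing term admits a linear solution.

```python
import numpy as np

def gauss(x, P):
    """1-D Gaussian density with zero mean and variance P."""
    return np.exp(-x**2 / (2.0 * P)) / np.sqrt(2.0 * np.pi * P)

dx = 0.01
x = np.arange(-15.0, 15.0 + dx / 2, dx)
P = 0.8  # illustrative variance of the Gaussian kernel

# Fact 1: convolving the identity map with a Gaussian kernel returns it,
# i.e. the integral of y * N(y - x; P) dy equals x  (E[X] = x).
smoothed = np.convolve(x, gauss(x, P), mode="same") * dx
interior = slice(600, len(x) - 600)  # avoid truncation at the grid edges
assert np.max(np.abs(smoothed[interior] - x[interior])) < 1e-6

# Fact 2: the scalar second-kind Fredholm equation of convolution type
#   f(x) = alpha * x + lam * (N(.; P) * f)(x)
# with linear forcing has the linear solution f(x) = alpha/(1 - lam) * x,
# because the Gaussian convolution leaves linear functions unchanged.
alpha, lam = 2.0, 0.5
f = alpha / (1.0 - lam) * x  # candidate linear solution
residual = f - (alpha * x
                + lam * np.convolve(gauss(x, P), f, mode="same") * dx)
assert np.max(np.abs(residual[interior])) < 1e-6
```

This mirrors the constant-gain argument: the gains pass through the Gaussian convolution unchanged, irrespective of the covariance of the kernel.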
The optimal gains \(F_{u}^{\ast}\) and \(F_{v}^{\ast}\) are obtained as follows. Insert (5.56) into (5.49) and insert (5.57) into (5.50):
Therefore
Similarly
and
We have found constant-gain strategies \(F_{u}^{{\ast}}\) and \(F_{v}^{{\ast}}\) which satisfy the respective second-kind Fredholm integral equations of the convolution type with a Gaussian kernel, (5.49) and (5.50). Furthermore, (5.48) holds irrespective of the covariance \(P\). Hence, the constant gains do not depend on the cumulative covariance \(P\) of the measurement errors and also apply in the limiting case of a deterministic scenario—in other words, the optimal constant gains \(F_{u}\) and \(F_{v}\) are exactly as in (5.6) and (5.7), and certainty equivalence holds. Having obtained the optimal strategies, one can now calculate the respective value functions of players P and E by evaluating the expectations in (5.31) and (5.40):
Consider first (5.31), the expected cost \({\overline{J}}^{(P)}(u_{0},g(\cdot );\overline{x}_{0}^{(P)})\) of player P. The expectations
and
Inserting (5.60)–(5.62) into (5.31) yields, with some abuse of notation, the value function of player P,
Also, in (5.63)
Inserting (5.59) and (5.64) into (5.63) yields the value function of player P. The value function \(V _{0}^{(P)}(\overline{x}_{0}^{(P)})\) of player P is quadratic in \(\overline{x}_{0}^{(P)}\). It is of the form
where \(M\) is an \(n \times n\) real, symmetric matrix and \(c^{(P)}\) is a constant. While the matrix \(M\) is complicated in appearance, note that it does not depend on the covariances \(P_{0}^{(P)}\) and \(P_{0}^{(E)}\) of the players' state measurement errors. Hence, we conclude that the matrix
where the matrix \(P_{1}\) is given by (5.9). The constant
where the gain \(F_{v}^{{\ast}}\) is given by (5.59).
Next, consider the expected payoff \({\overline{J}}^{(E)}(f(\cdot ),v_{0};\overline{x}_{0}^{(E)})\) of player E, (5.40). The expectations
and
Inserting (5.66)–(5.68) into (5.40) yields the value function of player E
Also, in (5.69),
Inserting (5.58) and (5.70) into (5.69) yields the value function of player E. The value function \(V _{0}^{(E)}(\overline{x}_{0}^{(E)})\) of player E is quadratic in \(\overline{x}_{0}^{(E)}\). Similar to the value function of player P, it is of the form
The constant
where the gain \(F_{u}^{{\ast}}\) is given by (5.58).
These results are summarized in
Theorem 5.3.
A necessary and sufficient condition for the existence of a solution to the game (5.1) and (5.2) where the players have private information, that is, a game where the initial state information of player P is specified in (5.15) and the initial state information of player E is specified in (5.16), is that condition (5.5) holds. Certainty equivalence holds: the optimal P strategy is given by (5.6) where \(x_{0}\) is replaced by \(\overline{x}_{0}^{(P)}\), and the optimal E strategy is given by (5.7) where \(x_{0}\) is replaced by \(\overline{x}_{0}^{(E)}\). The value function of player P is
and the value function of player E
The matrix \(P_{1}\) in (5.72) and (5.73) is specified by (5.9), and the constant terms in (5.72) and (5.73), \(c^{(P)}\) and \(c^{(E)}\), are specified in (5.65) and (5.71), respectively.
7 Conclusion
A static two-player linear-quadratic game where the players have private information on the game's parameter is addressed. Although the information is private, each player is able to formulate an expression for his expected payoff without having to specify, à la Harsanyi, a prior probability distribution for the game's parameter, and without recourse to the player Nature. Hence, a closed-form solution of the game is possible. It is shown that in this special case of a one-stage linear-quadratic game with private information, the solution is similar in structure to the solution of the game with complete information, namely the deterministic linear-quadratic game, and to the solution of the linear-quadratic game with partial information, where the information about the game's parameter is shared by the players. The principle of certainty equivalence holds. The analysis in this paper points the way to possible extensions of the theory to multi-stage linear-quadratic dynamic games.
References
Başar T, Bernhard P (2008) H∞-optimal control and related minimax design problems: a dynamic game approach. Birkhäuser, Boston
Başar T, Olsder GJ (1995) Dynamic non-cooperative game theory. Academic Press, San Diego, CA
Engwerda J (2005) LQ Dynamic optimization and differential games. Wiley, Chichester
Pachter M, Pham K (2010) Discrete-time linear-quadratic dynamic games. J Optim Theory Appl 146(1):151–179
Pachter M, Pham K (2013) Static teams and stochastic games. In: Sorokin A, Pardalos PM (eds) Dynamics of information systems: Algorithmic approaches. Springer Proceedings in Mathematics & Statistics, vol 51. Springer, New York
Zhang F (2005) The Schur complement and its applications. Springer, New York
© 2013 Springer International Publishing Switzerland
Pachter, M. (2013). Static Linear-Quadratic Gaussian Games. In: Křivan, V., Zaccour, G. (eds) Advances in Dynamic Games. Annals of the International Society of Dynamic Games, vol 13. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-02690-9_5
Print ISBN: 978-3-319-02689-3
Online ISBN: 978-3-319-02690-9