
1 Introduction

This paper is a first step in an attempt at bringing closer together the dynamic games paradigm and the theory of games, which historically have developed along separate lines. Dynamic game theorists have traditionally emphasized control theoretic aspects and the backward induction/dynamic programming solution method, whereas game theorists have focused on information economics, that is, the role of information in games.

Linear-Quadratic Dynamic Games (LQDG) with perfect information have received a great deal of attention (Başar and Bernhard 2008; Başar and Olsder 1995; Engwerda 2005). In these works, the concepts of state, and state feedback, are emphasized and the solution method entails backward induction, a.k.a., dynamic programming. In previous work (Pachter and Pham 2013) a static LQG team problem was addressed. In this paper a static LQDG, where each player has private information, is considered. Specifically, the simplest linear-quadratic game with incomplete/partial information is addressed: a one-stage, two-player, “zero-sum,” Linear-Quadratic Gaussian Game (LQGG) is solved.

In this paper a simple static linear-quadratic game is analyzed in which the players have private information, yet each player is able to formulate an expression for his expected payoff without the need to provide a prior probability distribution function of the game’s parameter and without recourse to the player Nature. Thus, in Sect. 5.2 the static linear-quadratic Gaussian game, where the players have private information, is introduced. The solution of the baseline game with perfect information is given in Sect. 5.3 and the solution of the game with imperfect information is given in Sect. 5.4. The scenario where the players have private information is analyzed in Sect. 5.5, and the complete solution of the game is given in Sect. 5.6. Concluding remarks are made in Sect. 5.7.

2 LQGG Problem Statement

The following linear-quadratic game, a static, two-player, “zero-sum” game, is considered. The players are P and E and their respective control variables are u and v. It is a one-stage game with linear “dynamics”

$$\displaystyle\begin{array}{rcl} x_{1} = Ax_{0} + Bu_{0} + Cv_{0}\,& &{}\end{array}$$
(5.1)

where the state \(x_{0},x_{1} \in {R}^{n}\). The P and E players’ controls are \(u \in {R}^{m_{u}}\) and \(v \in {R}^{m_{v}}\). The payoff function is quadratic:

$$\displaystyle\begin{array}{rcl} J = x_{1}^{T}Q_{ F}x_{1} + u_{0}^{T}R_{ u}u_{0} - v_{0}^{T}R_{ v}v_{0}& &{}\end{array}$$
(5.2)

where the weighting matrices \(Q_{F}\), \(R_{u}\), and \(R_{v}\) are real, symmetric, and positive definite. Both players are cognizant of the \(A\), \(B\), \(C\), \(Q_{F}\), \(R_{u}\), and \(R_{v}\) data.

Player P strives to minimize the payoff/cost function (5.2) and player E strives to maximize the payoff (5.2).

The initial state information available to player P is

$$\displaystyle\begin{array}{rcl} x_{0} \sim \mathcal{N}(\overline{x}_{0}^{(P)},P_{ 0}^{(P)})\,& &{}\end{array}$$
(5.3)

where the vector \(\overline{x}_{0}^{(P)} \in {R}^{n}\) and the n × n covariance matrix P 0 (P) is real, symmetric, and positive definite. The initial state information available to player E is

$$\displaystyle\begin{array}{rcl} x_{0} \sim \mathcal{N}(\overline{x}_{0}^{(E)},P_{ 0}^{(E)})\,& &{}\end{array}$$
(5.4)

where the vector \(\overline{x}_{0}^{(E)} \in {R}^{n}\) and the n × n covariance matrix \(P_{0}^{(E)}\) is real, symmetric, and positive definite. The \(P_{0}^{(P)}\) and \(P_{0}^{(E)}\) data is public knowledge—only the \(\overline{x}_{0}^{(P)}\) and \(\overline{x}_{0}^{(E)}\) information is proprietary to the respective P and E players. This is tantamount to saying that players P and E took separate measurements of the initial state x 0, yet the accuracy of the instruments they used is known; however, the actual measurements \(\overline{x}_{0}^{(P)}\) and \(\overline{x}_{0}^{(E)}\) are the respective P and E players’ private information.

Since the pertinent random variables are Gaussian, we shall refer to the game (5.1)–(5.4) as a Linear-Quadratic Gaussian Game (LQGG).

3 Linear-Quadratic Game with Perfect Information

It is instructive to first analyze the perfect information version of the linear-quadratic game (5.1) and (5.2).

If the initial state x 0 is known to both players, we have a game with perfect information.

The closed-form solution of Linear-Quadratic Dynamic Games with perfect information, a.k.a. deterministic Linear-Quadratic Dynamic Games (LQDGs), is derived in Pachter and Pham (2010, Theorem 2.1). The Schur complement concept (Zhang 2005) was used in Pachter and Pham (2010) to invert a blocked \((m_{u} + m_{v}) \times (m_{u} + m_{v})\) matrix and derive explicit formulae for the P and E players’ optimal strategies. The said matrix contains four blocks and its diagonal blocks are \(m_{u} \times m_{u}\) and \(m_{v} \times m_{v}\) matrices. One can improve on the results of Pachter and Pham (2010) by noting that a matrix with four blocks has two Schur complements, say \(S_{B}\) and \(S_{C}\).
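
To make the blocked-matrix mechanics concrete, the following numpy sketch (not part of the original derivation) inverts an arbitrary symmetric positive definite matrix with \(m_{u} \times m_{u}\) and \(m_{v} \times m_{v}\) diagonal blocks and confirms that either Schur complement reproduces the corresponding diagonal block of the inverse; the sizes, the random matrix, and the names S1, S2 are illustrative choices, not the specific \(S_{B}(Q_{F})\) and \(S_{C}(Q_{F})\) defined below.

```python
import numpy as np

# Illustrative only: any symmetric positive definite blocked matrix works here.
rng = np.random.default_rng(0)
mu, mv = 3, 2
T = rng.standard_normal((mu + mv, mu + mv))
M = T @ T.T + (mu + mv) * np.eye(mu + mv)    # symmetric positive definite

K = M[:mu, :mu]          # m_u x m_u diagonal block
L = M[:mu, mu:]          # m_u x m_v off-diagonal block
N = M[mu:, mu:]          # m_v x m_v diagonal block

S1 = K - L @ np.linalg.inv(N) @ L.T          # Schur complement of the lower-right block
S2 = N - L.T @ np.linalg.inv(K) @ L          # Schur complement of the upper-left block

Minv = np.linalg.inv(M)
# The (1,1) block of the inverse is S1^{-1}; the (2,2) block is S2^{-1}.
assert np.allclose(Minv[:mu, :mu], np.linalg.inv(S1))
assert np.allclose(Minv[mu:, mu:], np.linalg.inv(S2))
print("both Schur complements reproduce the blocked inverse")
```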

Concerning the linear-quadratic game (5.1) and (5.2), where the initial state/game parameter x 0 is known to both players and thus the game is a game with perfect information, the following holds.

Theorem 5.1.

A necessary and sufficient condition for the existence of a solution to the zero-sum game (5.1) and (5.2) with perfect information is

$$\displaystyle\begin{array}{rcl} R_{v} > {C}^{T}Q_{ F}C& &{}\end{array}$$
(5.5)

A Nash equilibrium/saddle point exists and the players’ optimal strategies are the linear state feedback control laws

$$\displaystyle\begin{array}{rcl} u_{0}^{{\ast}}(x_{ 0})& =& -S_{B}^{-1}(Q_{ F}){B}^{T}[I + Q_{ F}C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}]Q_{ F}A \cdot x_{0}\, \\ v_{0}^{{\ast}}(x_{ 0})& =& {(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}\{I - Q_{ F}BS_{B}^{-1}(Q_{ F}){B}^{T} \\ & & \qquad [I+ Q_{F}C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}]\}Q_{ F}A \cdot x_{0} {}\end{array}$$
(5.6)

An alternative formula for the optimal strategy of player E is

$$\displaystyle\begin{array}{rcl}{ v}^{{\ast}}(x_{ 0}) = -S_{C}^{-1}(Q_{ F}){C}^{T}[I - Q_{ F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}]Q_{ F}A\ x_{0}& &{}\end{array}$$
(5.7)

The value of the game is

$$\displaystyle\begin{array}{rcl} V _{0}(x_{0}) = x_{0}^{T}P_{ 1}x_{0}\,& &{}\end{array}$$
(5.8)

where the matrix

$$\displaystyle\begin{array}{rcl} P_{1}& =& {A}^{T}\{Q_{ F}-Q_{F}[BS_{B}^{-1}(Q_{ F}){B}^{T}+BS_{ B}^{-1}(Q_{ F}){B}^{T}Q_{ F}C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T} \\ & +& C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}BS_{B}^{-1}(Q_{ F}){B}^{T} \\ & +& C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}BS_{B}^{-1}(Q_{ F}){B}^{T}Q_{ F}C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T} \\ & +& C{({C}^{T}Q_{ F}C-R_{v})}^{-1}{C}^{T}]Q_{ F}\}A {}\end{array}$$
(5.9)

In (5.6) and (5.9) ,

$$\displaystyle\begin{array}{rcl} S_{B}(Q_{F})& \equiv & {B}^{T}Q_{ F}B + R_{u} + {B}^{T}Q_{ F}C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B{}\end{array}$$
(5.10)

is the first Schur complement of the blocked matrix and

$$\displaystyle\begin{array}{rcl} S_{C}(Q_{F}) \equiv -[R_{v} - {C}^{T}Q_{ F}C + {C}^{T}Q_{ F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C]& & {}\\ \end{array}$$

is the second Schur complement of the blocked matrix.

Remark 5.1.

Using both Schur complements of the blocked matrix renders the respective P and E players’ strategies, (5.6) and (5.7), “symmetric.”
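
As a numerical sanity check of Theorem 5.1 (a sketch, not a proof), the following code generates arbitrary illustrative problem data satisfying condition (5.5), evaluates the strategies (5.6) and (5.7), and verifies that they solve the coupled stationarity (saddle-point) conditions of the payoff (5.2).

```python
import numpy as np

# Illustrative data only: sizes and matrices are arbitrary choices made to satisfy (5.5).
rng = np.random.default_rng(1)
n, mu, mv = 4, 2, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, mu))
C = rng.standard_normal((n, mv))
QF = np.eye(n)
Ru = np.eye(mu)
Rv = C.T @ QF @ C + 2.0 * np.eye(mv)     # enforces R_v > C^T Q_F C, condition (5.5)
x0 = rng.standard_normal(n)

inv = np.linalg.inv
Dv = Rv - C.T @ QF @ C
SB = B.T @ QF @ B + Ru + B.T @ QF @ C @ inv(Dv) @ C.T @ QF @ B           # (5.10)
SC = -(Dv + C.T @ QF @ B @ inv(Ru + B.T @ QF @ B) @ B.T @ QF @ C)        # second Schur complement

u_star = -inv(SB) @ B.T @ (np.eye(n) + QF @ C @ inv(Dv) @ C.T) @ QF @ A @ x0                 # (5.6)
v_star = -inv(SC) @ C.T @ (np.eye(n) - QF @ B @ inv(Ru + B.T @ QF @ B) @ B.T) @ QF @ A @ x0  # (5.7)

# Independent check: (u*, v*) must solve the coupled first-order conditions of (5.2).
KKT = np.block([[Ru + B.T @ QF @ B, B.T @ QF @ C],
                [C.T @ QF @ B, C.T @ QF @ C - Rv]])
rhs = -np.concatenate([B.T @ QF @ A @ x0, C.T @ QF @ A @ x0])
uv = np.linalg.solve(KKT, rhs)
assert np.allclose(uv[:mu], u_star) and np.allclose(uv[mu:], v_star)
print("strategies (5.6)-(5.7) satisfy the saddle-point conditions")
```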

4 Linear-Quadratic Gaussian Game with Imperfect Information

If in (5.3) and (5.4) \(P_{0}^{(P)} = P_{0}^{(E)} = P_{0}\) and the P and E players’ information \(\overline{x}_{0}^{(P)} = \overline{x}_{0}^{(E)} = \overline{x}_{0}\) is public knowledge, we have on hand a linear-quadratic game with imperfect information; this is tantamount to saying that both players, together, took the measurement of the initial state and the outcome was

$$\displaystyle\begin{array}{rcl} x_{0} \sim \mathcal{N}(\overline{x}_{0},P_{0})& &{}\end{array}$$
(5.11)

This is a stochastic game.

The closed-form solution of Linear-Quadratic Dynamic Games with imperfect information proceeds as follows.

Using (5.1) and (5.2), we calculate the payoff function

$$\displaystyle\begin{array}{rcl} J(u_{0},v_{0};x_{0})& =& x_{0}^{T}{A}^{T}Q_{ F}Ax_{0} + u_{0}^{T}(R_{ u} + {B}^{T}Q_{ F}B)u_{0} - v_{0}^{T}(R_{ v} - {C}^{T}Q_{ F}C)v_{0} \\ & +& 2u_{0}^{T}{B}^{T}Q_{ F}Ax_{0} + 2v_{0}^{T}{C}^{T}Q_{ F}Ax_{0} + 2u_{0}^{T}{B}^{T}Q_{ F}Cv_{0} {}\end{array}$$
(5.12)

The random variable at work is the initial state x 0. The players calculate the expected payoff function

$$\displaystyle\begin{array}{rcl} \overline{J}(u_{0},v_{0};\overline{x}_{0})& \equiv & E_{x_{0}}\ (J(u_{0},v_{0};x_{0})\mid \overline{x}_{0}) \\ & =& \overline{x}_{0}^{T}{A}^{T}Q_{ F}A\overline{x}_{0} + \mathit{Trace}({A}^{T}Q_{ F}AP_{0}) + u_{0}^{T}(R_{ u} + {B}^{T}Q_{ F}B)u_{0} \\ & -& v_{0}^{T}(R_{ v} - {C}^{T}Q_{ F}C)v_{0} + 2u_{0}^{T}{B}^{T}Q_{ F}A\overline{x}_{0} + 2v_{0}^{T}{C}^{T}Q_{ F}A\overline{x}_{0} \\ & +& 2u_{0}^{T}{B}^{T}Q_{ F}Cv_{0} {}\end{array}$$
(5.13)

The expected payoff function \(\overline{J}(u_{0},v_{0};\overline{x}_{0})\) is convex in u 0 and concave in v 0. Differentiation in u 0 and v 0 yields a coupled linear system in the decision variables u 0 and v 0. Its solution is obtained using the Schur complement concept and it yields the optimal P and E strategies. The following holds.

Theorem 5.2.

A necessary and sufficient condition for the existence of a solution to the zero-sum game (5.1) and (5.2) with imperfect information, that is, a game where the initial state information (5.11) is available to both P and E, is that condition (5.5) holds. The respective optimal P and E strategies are given by (5.6) and (5.7) , where x 0 is replaced by \(\overline{x}_{0}\) . The value of the game is

$$\displaystyle\begin{array}{rcl} V _{0}(\overline{x}_{0}) = \overline{x}_{0}^{T}P_{ 1}\overline{x}_{0} + \mathit{Trace}({A}^{T}Q_{ F}AP_{0})\,& &{}\end{array}$$
(5.14)

where, as before, the real symmetric matrix P 1 is given by (5.9).

Similar to LQG optimal control, in the game with imperfect information the separation principle/certainty equivalence holds.
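
The only difference between (5.14) and the perfect-information value (5.8) is the trace term. A small Monte Carlo sketch, with arbitrary illustrative data, confirming the underlying identity \(E(x_{0}^{T}{A}^{T}Q_{F}Ax_{0}\mid \overline{x}_{0}) = \overline{x}_{0}^{T}{A}^{T}Q_{F}A\overline{x}_{0} + \mathit{Trace}({A}^{T}Q_{F}AP_{0})\):

```python
import numpy as np

# Illustrative Monte Carlo check of the trace term in (5.13)/(5.14); all data is made up.
rng = np.random.default_rng(2)
n = 3
A = rng.standard_normal((n, n))
QF = np.eye(n)
M = A.T @ QF @ A

xbar = rng.standard_normal(n)
T = rng.standard_normal((n, n))
P0 = T @ T.T + np.eye(n)              # covariance of the shared initial state estimate (5.11)

samples = rng.multivariate_normal(xbar, P0, size=200_000)
mc = np.mean(np.einsum('ij,jk,ik->i', samples, M, samples))
closed_form = xbar @ M @ xbar + np.trace(M @ P0)
print(mc, closed_form)                # the two numbers should agree to a few digits
```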

5 Linear-Quadratic Gaussian Game with Private Information

The initial state x 0 features in the payoff function (5.12). The players’ information on the initial state x 0 is now private information: Player P believes the initial state to be

$$\displaystyle\begin{array}{rcl} x_{0} \sim \mathcal{N}(\overline{x}_{0}^{(P)},P_{ 0}^{(P)})& &{}\end{array}$$
(5.15)

whereas player E believes the initial state to be

$$\displaystyle\begin{array}{rcl} x_{0} \sim \mathcal{N}(\overline{x}_{0}^{(E)},P_{ 0}^{(E)})& &{}\end{array}$$
(5.16)

This is tantamount to stipulating that players P and E took separate measurements of the initial state x 0. Assuming that the quality of the players’ instruments used to take the measurements is public knowledge—we refer to the measurement error covariances \(P_{0}^{(P)}\) and \(P_{0}^{(E)}\)—the private information of the players P and E are their respective measurements, \(\overline{x}_{0}^{(P)}\) and \(\overline{x}_{0}^{(E)}\). The measurement recorded by player E, \(\overline{x}_{0}^{(E)}\), is his private information and is not shared with player P. Hence, as far as player P is concerned, an E player with the private information \(\overline{x}_{0}^{(E)} = x\) is an E player of type x. Thus, the P player’s information on the game is incomplete. Similarly, the measurement recorded by player P, \(\overline{x}_{0}^{(P)}\), is his private information and is not shared with the E player. Therefore, as far as the E player is concerned, a P player with the private information \(\overline{x}_{0}^{(P)} = y\) is a P player of type y; thus, the E player’s information on the game is also incomplete.

We are analyzing what appears to be a game with incomplete information. When planning his strategy, a player does not know the type of the opposition he faces. However, although the information is incomplete, a Bayesian player can nevertheless assess, based on the private information available to him, the probability that the opposition he is facing is of a certain type. Consequently, the player can calculate the expectation of the payoff functional, conditioned on his private information.

The strategies available to player P are mappings \(f: {R}^{n} \rightarrow {R}^{m_{u}}\) from his information set into his actions set; thus, the action of player P is

$$\displaystyle\begin{array}{rcl} u_{0} = f(\overline{x}_{0}^{(P)})& &{}\end{array}$$
(5.17)

Similarly, the strategies available to the E player are mappings \(g: {R}^{n} \rightarrow {R}^{m_{v}}\) from his information set into his actions set; thus, the action of player E is

$$\displaystyle\begin{array}{rcl} v_{0} = g(\overline{x}_{0}^{(E)})& &{}\end{array}$$
(5.18)

From player P’s vantage point, the action v 0 of player E is a random variable because from player P’s vantage point, the measurement \(\overline{x}_{0}^{(E)}\) used by player E to form his control v 0, is a random variable. Similarly, from player E’s vantage point, the action u 0 of player P is a random variable.

Consider the decision process of player P whose private information is \(\overline{x}_{0}^{(P)}\).

From player P’s perspective, the random variables at work are x 0 and \(\overline{x}_{0}^{(E)}\). Player P is confronted with a stochastic optimization problem and he calculates the expectation of the payoff function (5.12), conditional on his private information \(\overline{x}_{0}^{(P)}\),

$$\displaystyle\begin{array}{rcl}{ \overline{J}}^{(P)}(u_{ 0},g(\cdot );\overline{x}_{0}^{(P)}) \equiv E_{ x_{0},\overline{x}_{0}^{(E)}}\ (J(u_{0},g(\overline{x}_{0}^{(E)});x_{ 0})\mid \overline{x}_{0}^{(P)})& &{}\end{array}$$
(5.19)

It is important to realize that by using in the calculation of his expected cost in (5.19) player E’s strategy \(g(\overline{x}_{0}^{(E)})\), rather than player E’s control \(v_{0}\), player P has eliminated the possibility of an infinite regress in reciprocal reasoning. Thus, player P calculates

$$\displaystyle\begin{array}{rcl}{ \overline{J}}^{(P)}(u_{ 0},g(\cdot );\overline{x}_{0}^{(P)})& =& {(\overline{x}_{ 0}^{(P)})}^{T}{A}^{T}Q_{ F}A\overline{x}_{0}^{(P)} + \mathit{Trace}({A}^{T}Q_{ F}AP_{0}^{(P)}) \\ & +& u_{0}^{T}(R_{ u} + {B}^{T}Q_{ F}B)u_{0} \\ & +& 2u_{0}^{T}{B}^{T}Q_{ F}A\overline{x}_{0}^{(P)} + 2E_{ x_{0},\overline{x}_{0}^{(E)}}\ ({g}^{T}(\overline{x}_{ 0}^{(E)}){C}^{T}Q_{ F}Ax_{0}\mid \overline{x}_{0}^{(P)}) \\ & -& E_{\overline{x}_{0}^{(E)}}\ ({g}^{T}(\overline{x}_{ 0}^{(E)})(R_{ v} - {C}^{T}Q_{ F}C)g(\overline{x}_{0}^{(E)})\mid \overline{x}_{ 0}^{(P)}) \\ & +& 2u_{0}^{T}{B}^{T}Q_{ F}CE_{\overline{x}_{0}^{(E)}}\ (g(\overline{x}_{0}^{(E)})\mid \overline{x}_{ 0}^{(P)}) {}\end{array}$$
(5.20)

Player P calculates the expectations with respect to the random variable \(\overline{x}_{0}^{(E)}\), which feature in (5.20). To this end, player P models his measurement \(\overline{x}_{0}^{(P)}\) of the initial state x 0, and player E’s measurement \(\overline{x}_{0}^{(E)}\) of the initial state x 0, as follows.

$$\displaystyle\begin{array}{rcl} \overline{x}_{0}^{(P)} = x_{ 0} + w_{P}\,& &{}\end{array}$$
(5.21)

where x 0 is the true initial state and w P is player P’s measurement error, whose statistics are

$$\displaystyle\begin{array}{rcl} w_{P} \sim \mathcal{N}(0,P_{0}^{(P)})& & {}\\ \end{array}$$

Similarly, player E’s measurement

$$\displaystyle\begin{array}{rcl} \overline{x}_{0}^{(E)} = x_{ 0} + w_{E}\,& &{}\end{array}$$
(5.22)

where x 0 is the true initial state and w E is player E’s measurement error, whose statistics are

$$\displaystyle\begin{array}{rcl} w_{E} \sim \mathcal{N}(0,P_{0}^{(E)})& & {}\\ \end{array}$$

Furthermore, the Gaussian random variables w P and w E are independent.

From player P’s point of view, \(\overline{x}_{0}^{(E)}\) is a random variable, but \(\overline{x}_{0}^{(P)}\) is not. Subtracting (5.21) from (5.22), player P concludes that as far as he is concerned, player E’s measurement upon which he will decide on his control v 0 is the random variable

$$\displaystyle\begin{array}{rcl} \overline{x}_{0}^{(E)} = \overline{x}_{ 0}^{(P)} +\tilde{ w}\,& &{}\end{array}$$
(5.23)

where the random variable

$$\displaystyle\begin{array}{rcl} \tilde{w} \equiv w_{E} - w_{P}\;& &{}\end{array}$$
(5.24)

in other words

$$\displaystyle\begin{array}{rcl} \overline{x}_{0}^{(E)} \sim \mathcal{N}(\overline{x}_{ 0}^{(P)},P_{ 0}^{(P)} + P_{ 0}^{(E)})& &{}\end{array}$$
(5.25)
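
A short sampling sketch of player P’s model (5.21)–(5.23), with arbitrary illustrative covariances, confirming that, with \(\overline{x}_{0}^{(P)}\) held fixed, the sample mean and covariance of \(\overline{x}_{0}^{(E)}\) match (5.25):

```python
import numpy as np

# Illustrative sampling check of (5.21)-(5.25); the covariances are arbitrary choices.
rng = np.random.default_rng(3)
n, N = 2, 300_000
P0P = np.array([[0.5, 0.1], [0.1, 0.4]])   # player P's measurement error covariance
P0E = np.array([[0.3, 0.0], [0.0, 0.6]])   # player E's measurement error covariance
xbar_P = np.array([1.0, -2.0])             # player P's measurement, held fixed

wP = rng.multivariate_normal(np.zeros(n), P0P, size=N)
wE = rng.multivariate_normal(np.zeros(n), P0E, size=N)
x0 = xbar_P - wP                           # the state consistent with P's measurement, (5.29)
xbar_E = x0 + wE                           # E's measurement as modeled by player P, (5.22)

print(xbar_E.mean(axis=0))                 # ~ xbar_P
print(np.cov(xbar_E, rowvar=False))        # ~ P0P + P0E, as stated in (5.25)
print(P0P + P0E)
```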

Consider now the calculation of the expectations which feature in (5.20).

$$\displaystyle\begin{array}{rcl} E_{\overline{x}_{0}^{(E)}}\ (g(\overline{x}_{0}^{(E)})\mid \overline{x}_{ 0}^{(P)}) = E_{\tilde{ w}}\ (g(\overline{x}_{0}^{(P)} +\tilde{ w}))& &{}\end{array}$$
(5.26)

where the random variable

$$\displaystyle\begin{array}{rcl} \tilde{w} \sim \mathcal{N}(0,P_{0}^{(P)} + P_{ 0}^{(E)})& &{}\end{array}$$
(5.27)

Similarly, the expectation

$$\displaystyle\begin{array}{rcl} E_{\overline{x}_{0}^{(E)}}\ ({g}^{T}(\overline{x}_{ 0}^{(E)})(R_{ v} - {C}^{T}Q_{ F}C)g(\overline{x}_{0}^{(E)})\mid \overline{x}_{ 0}^{(P)})& =& E_{\tilde{ w}}\ ({g}^{T}(\overline{x}_{ 0}^{(P)} +\tilde{ w})(R_{ v} \\ & -& {C}^{T}Q_{ F}C)g(\overline{x}_{0}^{(P)} +\tilde{ w})){}\end{array}$$
(5.28)

In addition, since

$$\displaystyle\begin{array}{rcl} x_{0} = \overline{x}_{0}^{(P)} - w_{ P}\,& &{}\end{array}$$
(5.29)

the expectation

$$\displaystyle\begin{array}{rcl} & E_{x_{0},\overline{x}_{0}^{(E)}}\ ({g}^{T}(\overline{x}_{0}^{(E)}){C}^{T}Q_{F}Ax_{0}\mid \overline{x}_{0}^{(P)}) = E_{w_{E},w_{P}}\ ({g}^{T}(\overline{x}_{0}^{(P)}\! +\! w_{E}\! -\! w_{P}){C}^{T}Q_{F}A(\overline{x}_{0}^{(P)}\!\! -\! w_{P}))& \\ & \qquad \qquad \qquad \qquad = E_{\tilde{w}}\ ({g}^{T}(\overline{x}_{0}^{(P)} +\tilde{ w})){C}^{T}Q_{F}A\overline{x}_{0}^{(P)} & \\ & \qquad \qquad \qquad \qquad \qquad \qquad - E_{w_{E},w_{P}}\ ({g}^{T}(\overline{x}_{0}^{(P)} + w_{E} - w_{P}){C}^{T}Q_{F}Aw_{P}) &{}\end{array}$$
(5.30)

Inserting (5.26), (5.28), and (5.30) into (5.20) yields the expression for player P’s expected cost in response to player E’s strategy g(⋅ ), as a function of his decision variable u 0,

$$\displaystyle\begin{array}{rcl}{ \overline{J}}^{(P)}(u_{ 0},g(\cdot );\overline{x}_{0}^{(P)})& =& {(\overline{x}_{ 0}^{(P)})}^{T}{A}^{T}Q_{ F}A\overline{x}_{0}^{(P)} + \mathit{Trace}({A}^{T}Q_{ F}AP_{0}^{(P)}) \\ & +& u_{0}^{T}(R_{ u} + {B}^{T}Q_{ F}B)u_{0} \\ & +& 2u_{0}^{T}{B}^{T}Q_{ F}A\overline{x}_{0}^{(P)} + 2E_{\tilde{ w}}\ ({g}^{T}(\overline{x}_{ 0}^{(P)} +\tilde{ w})){C}^{T}Q_{ F}A\overline{x}_{0}^{(P)} \\ & -& 2E_{w_{E},w_{P}}\ ({g}^{T}(\overline{x}_{ 0}^{(P)} + w_{ E} - w_{P}){C}^{T}Q_{ F}Aw_{P}) \\ & -& E_{\tilde{w}}\ ({g}^{T}(\overline{x}_{ 0}^{(P)} +\tilde{ w})(R_{ v} - {C}^{T}Q_{ F}C)g(\overline{x}_{0}^{(P)} +\tilde{ w})) \\ & +& 2u_{0}^{T}{B}^{T}Q_{ F}CE_{\tilde{w}}\ (g(\overline{x}_{0}^{(P)} +\tilde{ w})) {}\end{array}$$
(5.31)

Consider now the decision process of player E whose private information is \(\overline{x}_{0}^{(E)}\).

From player E’s perspective, the random variables at work are x 0 and \(\overline{x}_{0}^{(P)}\). Player E is confronted with a stochastic optimization problem and he calculates the expectation of the payoff function (5.12), conditioned on his private information \(\overline{x}_{0}^{(E)}\),

$$\displaystyle\begin{array}{rcl}{ \overline{J}}^{(E)}(f(\cdot ),v_{ 0};\overline{x}_{0}^{(E)}) \equiv E_{ x_{0},\overline{x}_{0}^{(P)}}\ (J(f(\overline{x}_{0}^{(P)}),v_{ 0};x_{0})\mid \overline{x}_{0}^{(E)})& &{}\end{array}$$
(5.32)

As before, it is important to realize that by using in the calculation of his expected cost in (5.32) player P’s strategy \(f(\overline{x}_{0}^{(P)})\), rather than player P’s decision variable u 0, player E has eliminated the possibility of an infinite regress in reciprocal reasoning. Thus, player E calculates

$$\displaystyle\begin{array}{rcl}{ \overline{J}}^{(E)}(f(\cdot ),v_{ 0};\overline{x}_{0}^{(E)})& =& {(\overline{x}_{ 0}^{(E)})}^{T}{A}^{T}Q_{ F}A\overline{x}_{0}^{(E)} + \mathit{Trace}({A}^{T}Q_{ F}AP_{0}^{(E)}) \\ & -& v_{0}^{T}(R_{ v} - {C}^{T}Q_{ F}C)v_{0} \\ & +& 2v_{0}^{T}{C}^{T}Q_{ F}A\overline{x}_{0}^{(E)} + E_{\overline{x}_{ 0}^{(P)}}\ ({f}^{T}(\overline{x}_{ 0}^{(P)})(R_{ u} + {B}^{T}Q_{ F}B)f(\overline{x}_{0}^{(P)})\mid \overline{x}_{ 0}^{(E)}) \\ & +& 2E_{x_{0},\overline{x}_{0}^{(P)}}\ ({f}^{T}(\overline{x}_{ 0}^{(P)}){B}^{T}Q_{ F}Ax_{0}\mid \overline{x}_{0}^{(E)}) \\ & +& 2v_{0}^{T}{C}^{T}Q_{ F}BE_{\overline{x}_{0}^{(P)}}\ (f(\overline{x}_{0}^{(P)})\mid \overline{x}_{ 0}^{(E)}) {}\end{array}$$
(5.33)

Player E calculates the expectations with respect to the random variable \(\overline{x}_{0}^{(P)}\), which feature in (5.33). To this end, player E models his measurement \(\overline{x}_{0}^{(E)}\) of the initial state x 0 using (5.22), and he models player P’s measurement \(\overline{x}_{0}^{(P)}\) of the initial state x 0 using (5.21).

From player E’s point of view, \(\overline{x}_{0}^{(P)}\) is a random variable, but \(\overline{x}_{0}^{(E)}\) is not. Subtracting (5.22) from (5.21), player E concludes that as far as he is concerned, player P’s measurement upon which he will decide on his control u 0 is the random variable

$$\displaystyle\begin{array}{rcl} \overline{x}_{0}^{(P)} = \overline{x}_{ 0}^{(E)} -\tilde{ w}& &{}\end{array}$$
(5.34)

In other words

$$\displaystyle\begin{array}{rcl} \overline{x}_{0}^{(P)} \sim \mathcal{N}(\overline{x}_{ 0}^{(E)},P_{ 0}^{(P)} + P_{ 0}^{(E)})& &{}\end{array}$$
(5.35)

Consider now the calculation of the expectations which feature in (5.33).

$$\displaystyle\begin{array}{rcl} E_{\overline{x}_{0}^{(P)}}\ (f(\overline{x}_{0}^{(P)})\mid \overline{x}_{ 0}^{(E)}) = E_{\tilde{ w}}\ (f(\overline{x}_{0}^{(E)} -\tilde{ w}))& &{}\end{array}$$
(5.36)

Similarly, the expectation

$$\displaystyle\begin{array}{rcl} E_{\overline{x}_{0}^{(P)}}\ ({f}^{T}(\overline{x}_{ 0}^{(P)})(R_{ u}+{B}^{T}Q_{ F}B)f(\overline{x}_{0}^{(P)})\mid \overline{x}_{ 0}^{(E)})& =& E_{\tilde{ w}}\ ({f}^{T}(\overline{x}_{ 0}^{(E)}-\tilde{w})(R_{ u} \\ & +& {B}^{T}Q_{ F}B)f(\overline{x}_{0}^{(E)}-\tilde{w}))\qquad {}\end{array}$$
(5.37)

In addition, since

$$\displaystyle\begin{array}{rcl} x_{0} = \overline{x}_{0}^{(E)} - w_{ E}\,& &{}\end{array}$$
(5.38)

the expectation

$$\displaystyle\begin{array}{rcl} E_{x_{0},\overline{x}_{0}^{(P)}}\ ({f}^{T}(\overline{x}_{ 0}^{(P)}){B}^{T}Q_{ F}Ax_{0}\mid \overline{x}_{0}^{(E)})& =& E_{ w_{E},w_{P}}\ ({f}^{T}(\overline{x}_{ 0}^{(E)}+w_{ P}-w_{E}){B}^{T} \\ & & \qquad \qquad Q_{F}A(\overline{x}_{0}^{(E)}-w_{ E})) \\ & =& E_{\tilde{w}}\ ({f}^{T}(\overline{x}_{ 0}^{(E)}-\tilde{w})){B}^{T}Q_{ F}A\overline{x}_{0}^{(E)} \\ & -& E_{w_{E},w_{P}}\ ({f}^{T}(\overline{x}_{ 0}^{(E)}\!+\!w_{ P} - w_{E}){B}^{T}Q_{ F}Aw_{E}) \\ & & {}\end{array}$$
(5.39)

Inserting (5.36), (5.37), and (5.39) into (5.33) yields the expression for player E’s expected payoff in response to player P’s strategy f(⋅ ), as a function of his decision variable v 0,

$$\displaystyle\begin{array}{rcl}{ \overline{J}}^{(E)}(f(\cdot ),v_{ 0};\overline{x}_{0}^{(E)})& =& {(\overline{x}_{ 0}^{(E)})}^{T}{A}^{T}Q_{ F}A\overline{x}_{0}^{(E)} + \mathit{Trace}({A}^{T}Q_{ F}AP_{0}^{(E)}) \\ & -& v_{0}^{T}(R_{ v} - {C}^{T}Q_{ F}C)v_{0} + 2v_{0}^{T}{C}^{T}Q_{ F}A\overline{x}_{0}^{(E)} \\ & +& E_{\tilde{w}}\ ({f}^{T}(\overline{x}_{ 0}^{(E)} -\tilde{ w})(R_{ u} + {B}^{T}Q_{ F}B)f(\overline{x}_{0}^{(E)} -\tilde{ w})) \\ & +& 2E_{\tilde{w}}\ ({f}^{T}(\overline{x}_{ 0}^{(E)} -\tilde{ w})){B}^{T}Q_{ F}A\overline{x}_{0}^{(E)} \\ & -& 2E_{w_{E},w_{P}}\ ({f}^{T}(\overline{x}_{ 0}^{(E)} + w_{ P} - w_{E}){B}^{T}Q_{ F}Aw_{E}) \\ & +& 2v_{0}^{T}{C}^{T}Q_{ F}BE_{\tilde{w}}\ (f(\overline{x}_{0}^{(E)} -\tilde{ w})) {}\end{array}$$
(5.40)

The cost of player P is now given by (5.31) and the payoff of player E is given by (5.40). The players’ private information leads to a nonzero-sum game formulation. Consequently, one is interested in a Nash equilibrium, a.k.a. Person-By-Person Satisfactory (PBPS) strategies.

Next, player P calculates his response to player E’s strategy \(g(\overline{x}_{0}^{(E)})\). Thus, given the information \(\overline{x}_{0}^{(P)}\), player P minimizes his expected cost (5.31); the minimization is performed in the decision variable \(u_{0}\). The cost function is quadratic in the decision variable. Thus, the optimal decision variable \(u_{0}^{{\ast}}\) must satisfy the equation

$$\displaystyle\begin{array}{rcl} u_{0}^{{\ast}} = -{(R_{ u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}(A\overline{x}_{0}^{(P)} + CE_{\tilde{ w}}\ (g(\overline{x}_{0}^{(P)} +\tilde{ w})\ ))& & {}\\ \end{array}$$

In other words, the optimal response of player P to player E’s strategy g(⋅ ) is

$$\displaystyle\begin{array}{rcl}{ f}^{{\ast}}(\overline{x}_{ 0}^{(P)})=-{(R_{ u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}(A\overline{x}_{0}^{(P)}+CE_{\tilde{ w}}\ (g(\overline{x}_{0}^{(P)}+\tilde{w})\ ))\ \ \forall \ \overline{x}_{ 0}^{(P)} \in {R}^{n}& & {}\\ \end{array}$$

Similarly, player E calculates his optimal response to player P’s strategy \(f(\overline{x}_{0}^{(P)})\). Thus, given the information \(\overline{x}_{0}^{(E)}\), player E maximizes his expected payoff (5.40); the maximization is performed in the decision variable v 0. The cost function is quadratic in the decision variable. Thus, the optimal decision variable \(v_{0}^{{\ast}}\) must satisfy the equation

$$\displaystyle\begin{array}{rcl} v_{0}^{{\ast}} = {(R_{ v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}(A\overline{x}_{0}^{(E)} + BE_{\tilde{ w}}\ (f(\overline{x}_{0}^{(E)} -\tilde{ w})\ ))& & {}\\ \end{array}$$

In other words, the optimal response of player E to player P’s strategy f(⋅ ) is

$$\displaystyle\begin{array}{rcl}{ g}^{{\ast}}(\overline{x}_{ 0}^{(E)})={(R_{ v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}(A\overline{x}_{0}^{(E)}+BE_{\tilde{ w}}\ (f(\overline{x}_{0}^{(E)}-\tilde{w})\ ))\ \ \forall \ \overline{x}_{ 0}^{(E)} \in {R}^{n}& & {}\\ \end{array}$$

Hence, the respective optimal strategies f (⋅ ) and g (⋅ ) of players P and E satisfy the set of two coupled equations (5.41) and (5.42),

$$\displaystyle\begin{array}{rcl} & & {f}^{{\ast}}(\overline{x}_{ 0}^{(P)}) = -{(R_{ u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}(A\overline{x}_{0}^{(P)} + CE_{\tilde{ w}}\ ({g}^{{\ast}}(\overline{x}_{ 0}^{(P)} +\tilde{ w})\ )) \\ & & \qquad \forall \ \overline{x}_{0}^{(P)} \in {R}^{n} {}\end{array}$$
(5.41)
$$\displaystyle\begin{array}{rcl} & & {g}^{{\ast}}(\overline{x}_{ 0}^{(E)}) = {(R_{ v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}(A\overline{x}_{0}^{(E)} + BE_{\tilde{ w}}\ ({f}^{{\ast}}(\overline{x}_{ 0}^{(E)} -\tilde{ w})\ )) \\ & & \qquad \forall \ \overline{x}_{0}^{(E)} \in {R}^{n} {}\end{array}$$
(5.42)

The expectation

$$\displaystyle\begin{array}{rcl} E_{\tilde{w}}\ (f(\overline{x}_{0}^{(E)} -\tilde{ w})\ )& =& \frac{1} {{(2\pi )}^{\frac{n} {2} }\sqrt{\det (P_{0 }^{(P) } + P_{0 }^{(E) })}}\int {}\\ & & \qquad \qquad \cdots \int _{{R}^{n}}f(\overline{x}_{0}^{(E)} -\tilde{ w}){e}^{-\frac{1} {2} \tilde{{w}}^{T}{(P_{ 0}^{(P)}+P_{ 0}^{(E)})}^{-1}\tilde{w} }d\tilde{w} {}\\ \end{array}$$

It is convenient to use the notation for the multivariate Gaussian distribution with covariance P ( > 0),

$$\displaystyle\begin{array}{rcl} G(x;P) \equiv \frac{1} {{(2\pi )}^{\frac{n} {2} }\sqrt{\det (P)}}{e}^{-\frac{1} {2} {x}^{T}{P}^{-1}x }& & {}\\ \end{array}$$

whereupon

$$\displaystyle\begin{array}{rcl} E_{\tilde{w}}\ (f(\overline{x}_{0}^{(E)} -\tilde{ w})\ ) = [f {\ast} G(P_{ 0}^{(P)} + P_{ 0}^{(E)})](\overline{x}_{ 0}^{(E)})& & {}\\ \end{array}$$

Similarly, the expectation

$$\displaystyle\begin{array}{rcl} E_{\tilde{w}}\ (g(\overline{x}_{0}^{(P)} +\tilde{ w})\ ) = [g {\ast} G(P_{ 0}^{(P)} + P_{ 0}^{(E)})](\overline{x}_{ 0}^{(P)})& & {}\\ \end{array}$$

Using the convolution notation in (5.41) and (5.42), one obtains

$$\displaystyle\begin{array}{rcl}{ f}^{{\ast}}(\overline{x}_{ 0}^{(P)})& =& -{(R_{ u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}(A\overline{x}_{0}^{(P)} + C{g}^{{\ast}}{\ast} G(P_{ 0}^{(P)} {}\\ & & \quad + P_{0}^{(E)}))\ \ \forall \ \overline{x}_{ 0}^{(P)} \in {R}^{n} {}\\ {g}^{{\ast}}(\overline{x}_{ 0}^{(E)})& =& {(R_{ v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}(A\overline{x}_{0}^{(E)} + B{f}^{{\ast}}{\ast} G(P_{ 0}^{(P)} + P_{ 0}^{(E)})) {}\\ & & \quad \forall \ \overline{x}_{0}^{(E)} \in {R}^{n} {}\\ \end{array}$$

Thus, the functions

$$\displaystyle\begin{array}{rcl}{ f}^{{\ast}}(x) = -{(R_{ u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}(Ax + C[{g}^{{\ast}}{\ast} G(P_{ 0}^{(P)} + P_{ 0}^{(E)})](x)),\ \ \ \ \ \ \ & &{}\end{array}$$
(5.43)

and

$$\displaystyle\begin{array}{rcl}{ g}^{{\ast}}(x) = {(R_{ v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}(Ax + B[{f}^{{\ast}}{\ast} G(P_{ 0}^{(P)}& +& P_{ 0}^{(E)})](x)) \\ & & \forall \ x \in {R}^{n}{}\end{array}$$
(5.44)

Inserting (5.44) into (5.43) and suppressing the dependence of the Gaussian p.d.f. on the covariance matrix yields

$$\displaystyle\begin{array}{rcl}{ f}^{{\ast}}(x)& =& -{(R_{ u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}A\ x \\ & -& {(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}A\ x {\ast} G \\ & -& {(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B\ {f}^{{\ast}}{\ast} G {\ast} G \\ & & \forall \ x \in {R}^{n} {}\end{array}$$
(5.45)

Similarly, inserting (5.43) into (5.44) yields

$$\displaystyle\begin{array}{rcl}{ g}^{{\ast}}(x)& =& {(R_{ v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}A\ x \\ & -& {(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}A\ x {\ast} G \\ & -& {(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C\ {g}^{{\ast}}{\ast} G {\ast} G \\ & & \forall \ x \in {R}^{n} {}\end{array}$$
(5.46)

The convolution operation is associative. We shall require the following

Lemma 5.1.

The Gaussian kernel is self-similar, namely,

$$\displaystyle\begin{array}{rcl} G(P) {\ast} G(P) = G(2P)& &{}\end{array}$$
(5.47)

Proof.

$$\displaystyle\begin{array}{rcl} G(x;P) {\ast} G(x;P)& =& \int \ldots \int _{{R}^{n}} \frac{1} {{(2\pi )}^{\frac{n} {2} }\sqrt{\det (P)}}{e}^{-\frac{1} {2} {(x-y)}^{T}{P}^{-1}(x-y) } {}\\ & & \frac{1} {{(2\pi )}^{\frac{n} {2} }\sqrt{\det (P)}}{e}^{-\frac{1} {2} {y}^{T}{P}^{-1}y }dy {}\\ & =& \int \ldots \int _{{R}^{n}}{( \frac{1} {{(2\pi )}^{\frac{n} {2} }\sqrt{\det (P)}})}^{2}{e}^{-\frac{1} {2} [{(x-y)}^{T}{P}^{-1}(x-y)+{y}^{T}{P}^{-1}y] }dy {}\\ \end{array}$$

We calculate

$$\displaystyle\begin{array}{rcl}{ (x - y)}^{T}{P}^{-1}(x - y) + {y}^{T}{P}^{-1}y& =& 2{y}^{T}{P}^{-1}y - 2{x}^{T}{P}^{-1}y + {x}^{T}{P}^{-1}x {}\\ & =& {y}^{T}{(\frac{1} {2}P)}^{-1}y - 2{(\frac{1} {2}x)}^{T}{(\frac{1} {2}P)}^{-1}y {}\\ & +& {(\frac{1} {2}x)}^{T}{(\frac{1} {2}P)}^{-1}(\frac{1} {2}x) + \frac{1} {2}{x}^{T}{P}^{-1}x {}\\ & =& {(y -\frac{1} {2}x)}^{T}{(\frac{1} {2}P)}^{-1}(y -\frac{1} {2}x) + {x}^{T}{(2P)}^{-1}x {}\\ \end{array}$$

Hence

$$\displaystyle\begin{array}{rcl} & \ & \int \ldots \int _{{R}^{n}}{( \frac{1} {{(2\pi )}^{\frac{n} {2} }\sqrt{\det (P)}})}^{2}{e}^{-\frac{1} {2} [{(x-y)}^{T}{P}^{-1}(x-y)+{y}^{T}{P}^{-1}y] }dy_{1}\ldots dy_{n} {}\\ & =& \int \ldots \int _{{R}^{n}} \frac{1} {{(2\pi )}^{\frac{n} {2} }\sqrt{\det (\frac{1}{2} P)}}{e}^{-\frac{1} {2} {(y-\frac{1} {2} x)}^{T}{(\frac{1} {2} P)}^{-1}(y-\frac{1} {2} x)} {}\\ & & \qquad \qquad dy_{1}\ldots dy_{n} \cdot {(\frac{1} {2})}^{\frac{n} {2} } \frac{1} {{(2\pi )}^{\frac{n} {2} }\sqrt{\det (P)}}{e}^{-\frac{1} {2} {x}^{T}{(2P)}^{-1}x } {}\\ & =& 1 \cdot {(\frac{1} {2})}^{\frac{n} {2} } \frac{1} {{(2\pi )}^{\frac{n} {2} }\sqrt{\det (P)}}{e}^{-\frac{1} {2} {x}^{T}{(2P)}^{-1}x } {}\\ & =& \frac{1} {{(2\pi )}^{\frac{n} {2} }\sqrt{\det (2P)}}{e}^{-\frac{1} {2} {x}^{T}{(2P)}^{-1}x } {}\\ & =& G(2P) {}\\ \end{array}$$

 □ 

We also calculate

$$\displaystyle\begin{array}{rcl} x {\ast} G(P)& =& \int _{{R}^{n}}(x - y)G(y;P)dy \\ & =& x\qquad \forall P > 0 {}\end{array}$$
(5.48)
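
Both identities can be illustrated by sampling. The sketch below, with an arbitrary covariance P, checks (5.47) through the equivalent statement that the sum of two independent \(\mathcal{N}(0,P)\) draws is \(\mathcal{N}(0,2P)\), and checks (5.48) as the statement that \(E(x - y) = x\) for \(y \sim \mathcal{N}(0,P)\).

```python
import numpy as np

# Monte Carlo illustration of (5.47) and (5.48); the size and P are arbitrary.
rng = np.random.default_rng(4)
n, N = 2, 400_000
P = np.array([[0.7, 0.2], [0.2, 0.5]])

# (5.47): convolving G(P) with itself gives the density of the sum of two independent N(0, P) draws.
w1 = rng.multivariate_normal(np.zeros(n), P, size=N)
w2 = rng.multivariate_normal(np.zeros(n), P, size=N)
print(np.cov(w1 + w2, rowvar=False))       # ~ 2P, i.e. G(P) * G(P) = G(2P)

# (5.48): E[x - y] with y ~ N(0, P) equals x, so the identity map convolved with G(P) is unchanged.
x = np.array([1.5, -0.5])
print((x - w1).mean(axis=0))               # ~ x, for any positive definite P
```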

Inserting (5.47) and (5.48) into (5.45) yields a Fredholm integral equation of the second kind for the optimal strategy of player P,

$$\displaystyle\begin{array}{rcl}{ f}^{{\ast}}(\overline{x}_{ 0}^{(P)})& =& -{(R_{ u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}[I+C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}]A\ \overline{x}_{0}^{(P)} \\ & -& {(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B\ G(2P) {\ast} {f}^{{\ast}} \\ & &\forall \ \overline{x}_{0}^{(P)} \in {R}^{n} {}\end{array}$$
(5.49)

Similarly, inserting (5.47) and (5.48) into (5.46) yields a Fredholm integral equation of the second kind for the optimal strategy of player E,

$$\displaystyle\begin{array}{rcl}{ g}^{{\ast}}(\overline{x}_{ 0}^{(E)})& =& {(R_{ v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}[I - B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}]A\ \overline{x}_{0}^{(E)} \\ & -& {(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C\ G(2P) {\ast} {g}^{{\ast}} \\ & &\forall \ x_{0}^{(E)} \in {R}^{n} {}\end{array}$$
(5.50)

The Fredholm equations of the second kind (5.49) and (5.50) are of the convolution type and the kernel is a Gaussian function.

If the state’s measurement error covariances are “small,” namely, \(P_{0}^{(P)} \ll 1\) and \(P_{0}^{(E)} \ll 1\), and therefore the Gaussian kernel approaches a delta function, then from (5.49) and (5.50) we conclude that the P and E strategies satisfy the equations

$$\displaystyle\begin{array}{rcl}{ f}^{{\ast}}(\overline{x}_{ 0}^{(P)})& =& -{(R_{ u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}[I+C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}]A\overline{x}_{0}^{(P)} \\ & -& {(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B{f}^{{\ast}}(\overline{x}_{ 0}^{(P)}) \\ & & \forall \ \overline{x}_{0}^{(P)} \in {R}^{n} {}\end{array}$$
(5.51)

and

$$\displaystyle\begin{array}{rcl}{ g}^{{\ast}}(\overline{x}_{ 0}^{(E)})& =& {(R_{ v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}[I-B{(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}]A\ \overline{x}_{0}^{(E)} \\ & -& {(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B{(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C\ {g}^{{\ast}}(\overline{x}_{ 0}^{(E)}) \\ & & \forall \ \overline{x}_{0}^{(E)} \in {R}^{n} {}\end{array}$$
(5.52)

From (5.51) and (5.52) we therefore obtain the P and E players’ optimal strategies, which are explicitly given by

$$\displaystyle\begin{array}{rcl}{ f}^{{\ast}}(\overline{x}_{ 0}^{(P)})& =& -{[I + {(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B]}^{-1} {}\\ & & \qquad {(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}[I + C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}]A\ \overline{x}_{0}^{(P)} {}\\ & =& -{[R_{u} + {B}^{T}Q_{ F}B + {B}^{T}Q_{ F}C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B]}^{-1} {}\\ & & \qquad {B}^{T}[I + Q_{F}C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}]Q_{ F}A\ \overline{x}_{0}^{(P)}\qquad \forall \ \overline{x}_{ 0}^{(P)} \in {R}^{n} {}\\ \end{array}$$

that is, the optimal strategy of player P is

$$\displaystyle\begin{array}{rcl}{ f}^{{\ast}}(\overline{x}_{ 0}^{(P)})& =& -S_{ B}^{-1}(Q_{ F}){B}^{T}[I + Q_{ F}C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}]Q_{ F}A\ \overline{x}_{0}^{(P)} \\ & & \forall \ \overline{x}_{0}^{(P)} \in {R}^{n} {}\end{array}$$
(5.53)

Similarly,

$$\displaystyle\begin{array}{rcl}{ g}^{{\ast}}(\overline{x}_{ 0}^{(E)})& =& {[I + {(R_{ v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C]}^{-1}(R_{ v} {}\\ & -& {C}^{T}Q_{ F}C{)}^{-1}{C}^{T}Q_{ F}[I - B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}]A\ \overline{x}_{0}^{(E)} {}\\ & =& {[R_{v} - {C}^{T}Q_{ F}C + {C}^{T}Q_{ F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C]}^{-1}{C}^{T}[I {}\\ & -& Q_{F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}]Q_{ F}A\ \overline{x}_{0}^{(E)}\ \ \forall \ \overline{x}_{ 0}^{(E)} \in {R}^{n} {}\\ \end{array}$$

that is, the optimal strategy of player E is

$$\displaystyle\begin{array}{rcl}{ g}^{{\ast}}(\overline{x}_{ 0}^{(E)})& =& -S_{ C}^{-1}(Q_{ F}){C}^{T}[I - Q_{ F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}]Q_{ F}A\ \overline{x}_{0}^{(E)} \\ & & \forall \ \overline{x}_{0}^{(E)} \in {R}^{n} {}\end{array}$$
(5.54)

where the Schur complement

$$\displaystyle\begin{array}{rcl} S_{C}(Q_{F}) \equiv -[R_{v} - {C}^{T}Q_{ F}C + {C}^{T}Q_{ F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C]& &{}\end{array}$$
(5.55)

Indeed, having calculated the functions f (x) and g (x), we obtained the optimal strategies of players P and E by setting \(x:= \overline{x}_{0}^{(P)}\) in f (x) and \(x:= \overline{x}_{0}^{(E)}\) in g (x). In the limiting case of Gaussian distributions with small covariance matrices, the players’ optimal strategies (5.53) and (5.54) are linear in the players’ respective measurements.

Equations (5.53) and (5.54) are identical to the respective (5.6) and (5.7) in Theorem 5.1—we have recovered the perfect information result of Theorem 5.1. This makes sense—both players’ measurements of the initial state are very accurate and thus the game is almost deterministic. Thus, one could have argued that when the covariances are “small,” namely, \(P_{0}^{(P)} \ll 1\) and \(P_{0}^{(E)} \ll 1\), that is, \(\overline{x}_{0}^{(P)} \approx \overline{x}_{0}^{(E)} \approx x_{0}\), one can re-use the deterministic state feedback strategies (5.6) and (5.7) of players P and E given by Theorem 5.1—simply set \(x_{0}:= \overline{x}_{0}^{(P)}\) in (5.6) and \(x_{0}:= \overline{x}_{0}^{(E)}\) in (5.7).

6 Linear Strategies

The Fredholm integral equations of the second kind, (5.49) and (5.50), are linear integral equations. Furthermore, the “forcing terms”/inputs on the R.H.S. of (5.49) and (5.50) are linear in \(\overline{x}_{0}^{(P)}\) and \(\overline{x}_{0}^{(E)}\), respectively. Consequently, the solution f (⋅ ) of (5.49) is linear in \(\overline{x}_{0}^{(P)}\) and the solution g (⋅ ) of (5.50) is linear in \(\overline{x}_{0}^{(E)}\)—think of linear integral operators as infinite dimensional matrices. Hence, postulate that the players’ optimal strategies are linear—in other words,

$$\displaystyle\begin{array}{rcl} f(\overline{x}_{0}^{P}) = F_{ u}\overline{x}_{0}^{P}& &{}\end{array}$$
(5.56)

and

$$\displaystyle\begin{array}{rcl} g(\overline{x}_{0}^{E}) = F_{ v}\overline{x}_{0}^{E}\,& &{}\end{array}$$
(5.57)

where the yet to be determined constant gains F u and F v are m u × n and m v × n matrices, respectively. Constant gain strategies (5.56) and (5.57) which satisfy the respective second kind Fredholm integral equations of the convolution type with a Gaussian kernel, (5.49) and (5.50), can be found. This is due to the fact that, according to (5.48), the convolution of the state vector x with a Gaussian function returns the state vector x. In the process of deriving the equations which yield the gains F u and F v , the necessary and sufficient conditions for the existence of a solution are obtained.

The optimal gains F u and F v are obtained as follows. Insert (5.56) into (5.49) and insert (5.57) into (5.50):

$$\displaystyle\begin{array}{rcl} F_{u}^{{\ast}}\ \overline{x}_{ 0}^{(P)}& =& -{(R_{ u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}[I+C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}]A\ \overline{x}_{0}^{(P)} {}\\ & -& {(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B\ F_{u}^{{\ast}}x {\ast} G(2P) {}\\ & =& -{(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}[I+C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}]A\ \overline{x}_{0}^{(P)} {}\\ & -& {(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C{(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B\ F_{u}^{{\ast}}\ \overline{x}_{ 0}^{(P)} {}\\ & & \forall \ \overline{x}_{0}^{(P)} \in {R}^{n} {}\\ \end{array}$$

Therefore

$$\displaystyle\begin{array}{rcl} F_{u}^{{\ast}}& =& -{[I + {(R_{ u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B]}^{-1}(R_{ u} \\ & +& {B}^{T}Q_{ F}B{)}^{-1}{B}^{T}Q_{ F}[I + C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}]A \\ & =& -S_{B}^{-1}(Q_{ F}){B}^{T}Q_{ F}[I + C{(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}]A {}\end{array}$$
(5.58)

Similarly

$$\displaystyle\begin{array}{rcl} F_{v}^{{\ast}}\ \overline{x}_{ 0}^{(E)}& =& {(R_{ v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}[I - B{(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}]A\ \overline{x}_{0}^{(E)} {}\\ & -& {(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B{(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C\ F_{v}^{{\ast}}x {\ast} G(2P) {}\\ & =& {(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}[I-B{(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}]A\ \overline{x}_{0}^{(E)} {}\\ & -& {(R_{v}-{C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B{(R_{u}+{B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C\ F_{v}^{{\ast}}\overline{x}_{ 0}^{(E)} {}\\ & & \forall \ x_{0}^{(E)} \in {R}^{n} {}\\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} F_{v}^{{\ast}}& =& {[I + {(R_{ v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}C]}^{-1}(R_{ v} \\ & -& {C}^{T}Q_{ F}C{)}^{-1}{C}^{T}Q_{ F}[I - B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}]A \\ & =& -S_{C}^{-1}(Q_{ F}){C}^{T}Q_{ F}[I - B{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}]A {}\end{array}$$
(5.59)
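
A numerical sketch, with arbitrary illustrative data satisfying (5.5), that evaluates the gains (5.58) and (5.59) and confirms they satisfy the coupled linear-gain relations obtained by inserting (5.56) and (5.57) into (5.41) and (5.42); by (5.48) the Gaussian convolution drops out of those relations.

```python
import numpy as np

# Illustrative data satisfying (5.5); the sizes and matrices are arbitrary choices.
rng = np.random.default_rng(5)
n, mu, mv = 4, 2, 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, mu))
C = rng.standard_normal((n, mv))
QF = np.eye(n)
Ru = np.eye(mu)
Rv = C.T @ QF @ C + 1.5 * np.eye(mv)       # condition (5.5)

inv = np.linalg.inv
Dv = Rv - C.T @ QF @ C
Du = Ru + B.T @ QF @ B
SB = Du + B.T @ QF @ C @ inv(Dv) @ C.T @ QF @ B
SC = -(Dv + C.T @ QF @ B @ inv(Du) @ B.T @ QF @ C)

Fu = -inv(SB) @ B.T @ QF @ (np.eye(n) + C @ inv(Dv) @ C.T @ QF) @ A     # (5.58)
Fv = -inv(SC) @ C.T @ QF @ (np.eye(n) - B @ inv(Du) @ B.T @ QF) @ A     # (5.59)

# Coupled linear-gain relations from (5.41)-(5.42); the convolution drops out by (5.48).
assert np.allclose(Fu, -inv(Du) @ B.T @ QF @ (A + C @ Fv))
assert np.allclose(Fv, inv(Dv) @ C.T @ QF @ (A + B @ Fu))
print("the constant gains solve (5.41)-(5.42) and do not depend on the covariances")
```

Note that no covariance matrix enters the computation of the gains, which is the numerical counterpart of the certainty equivalence statement made below.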

We have found constant gain strategies \(F_{u}^{{\ast}}\) and \(F_{v}^{{\ast}}\) which satisfy the respective second kind Fredholm integral equations of the convolution type with a Gaussian kernel, (5.49) and (5.50). This is due to the fact that, according to (5.48), the convolution of the state vector x with a Gaussian function returns the state vector x. Furthermore, (5.48) holds, irrespective of the covariance P. Hence, the constant gains are not dependent on the cumulative covariance P of the measurement errors and also apply in the limiting case of a deterministic scenario—in other words, the optimal constant gains F u and F v are exactly as in (5.6) and (5.7), and certainty equivalence holds. Having obtained the optimal strategies, one can now calculate the respective value functions of players P and E by evaluating the expectations in (5.31) and (5.40):

Consider (5.31), the expected cost \({\overline{J}}^{(P)}(u_{0},g(\cdot );\overline{x}_{0}^{(P)})\) of player P first. The expectations

$$\displaystyle\begin{array}{rcl} E_{\tilde{w}}\ (g(\overline{x}_{0}^{(P)} +\tilde{ w})) = F_{ v}^{{\ast}}\overline{x}_{ 0}^{(P)}\,& &{}\end{array}$$
(5.60)
$$\displaystyle\begin{array}{rcl} E_{w_{E},w_{P}}\ ({g}^{T}(\overline{x}_{ 0}^{(P)} + w_{ E} - w_{P}){C}^{T}Q_{ F}Aw_{P}) = -\mathit{Trace}(P_{0}^{(P)}{A}^{T}Q_{ F}CF_{v}^{{\ast}})\,& &{}\end{array}$$
(5.61)

and

$$\displaystyle\begin{array}{rcl} E_{\tilde{w}}\ & & ({g}^{T}(\overline{x}_{ 0}^{(P)} +\tilde{ w})(R_{ v} - {C}^{T}Q_{ F}C)g(\overline{x}_{0}^{(P)} +\tilde{ w})) \\ & =& {(\overline{x}_{0}^{(P)})}^{T}{(F_{ v}^{{\ast}})}^{T}(R_{ v} - {C}^{T}Q_{ F}C)F_{v}^{{\ast}}\overline{x}_{ 0}^{(P)} \\ & & +\mathit{Trace}({(F_{v}^{{\ast}})}^{T}(R_{ v} - {C}^{T}Q_{ F}C)F_{v}^{{\ast}}(P_{ 0}^{(P)} + P_{ 0}^{(E)})){}\end{array}$$
(5.62)
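
The closed-form expectations (5.60)–(5.62) can be checked by simulation. In the sketch below the gain Fv, the covariances, and the matrices CQA and W standing in for \({C}^{T}Q_{F}A\) and \(R_{v} - {C}^{T}Q_{F}C\) are arbitrary illustrative values.

```python
import numpy as np

# Monte Carlo check of (5.60)-(5.62) for the linear strategy g(z) = Fv z.
rng = np.random.default_rng(6)
n, mv, N = 2, 2, 400_000
Fv = rng.standard_normal((mv, n))
CQA = rng.standard_normal((mv, n))                       # stands in for C^T Q_F A
W = rng.standard_normal((mv, mv)); W = W @ W.T + np.eye(mv)   # stands in for R_v - C^T Q_F C

P0P = 0.4 * np.eye(n)
P0E = 0.3 * np.eye(n)
xbar_P = np.array([1.0, 2.0])

wP = rng.multivariate_normal(np.zeros(n), P0P, size=N)
wE = rng.multivariate_normal(np.zeros(n), P0E, size=N)
g = (xbar_P + wE - wP) @ Fv.T                            # g evaluated at xbar_P + w~, row-wise

print(g.mean(axis=0), Fv @ xbar_P)                       # (5.60)
lhs_61 = np.mean(np.einsum('ij,jk,ik->i', g, CQA, wP))
print(lhs_61, -np.trace(P0P @ CQA.T @ Fv))               # (5.61)
lhs_62 = np.mean(np.einsum('ij,jk,ik->i', g, W, g))
print(lhs_62, xbar_P @ Fv.T @ W @ Fv @ xbar_P
      + np.trace(Fv.T @ W @ Fv @ (P0P + P0E)))           # (5.62)
```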

Inserting (5.60)–(5.62) into (5.31) yields, with some abuse of notation, the value function of player P,

$$\displaystyle\begin{array}{rcl} V _{0}^{(P)}& =& {(\overline{x}_{ 0}^{(P)})}^{T}{A}^{T}Q_{ F}A\overline{x}_{0}^{(P)} + \mathit{Trace}({A}^{T}Q_{ F}AP_{0}^{(P)}) + u_{ 0}^{T}(R_{ u} + {B}^{T}Q_{ F}B)u_{0} \\ & & +2u_{0}^{T}{B}^{T}Q_{ F}A\overline{x}_{0}^{(P)} + 2{(\overline{x}_{ 0}^{(P)})}^{T}{(F_{ v}^{{\ast}})}^{T}{C}^{T}Q_{ F}A\overline{x}_{0}^{(P)} \\ & & +2\mathit{Trace}(P_{0}^{(P)}{A}^{T}Q_{ F}CF_{v}^{{\ast}}) \\ & & -{(\overline{x}_{0}^{(P)})}^{T}{(F_{ v}^{{\ast}})}^{T}(R_{ v} - {C}^{T}Q_{ F}C)F_{v}^{{\ast}}\overline{x}_{ 0}^{(P)} \\ & & -\mathit{Trace}({(F_{v}^{{\ast}})}^{T}(R_{ v} - {C}^{T}Q_{ F}C)F_{v}^{{\ast}}(P_{ 0}^{(P)} + P_{ 0}^{(E)})) \\ & & +2u_{0}^{T}{B}^{T}Q_{ F}CF_{v}^{{\ast}}\ \overline{x}_{ 0}^{(P)} {}\end{array}$$
(5.63)

Also, in (5.63)

$$\displaystyle\begin{array}{rcl} u_{0}^{{\ast}}& =& -{(R_{ u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}(A\overline{x}_{0}^{(P)} + CE_{\tilde{ w}}\ (g(\overline{x}_{0}^{(P)} +\tilde{ w})\ )) \\ & =& -{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}(A\overline{x}_{0}^{(P)} + C\ F_{ v}^{{\ast}}\overline{x}_{ 0}^{(P)}) \\ & =& -{(R_{u} + {B}^{T}Q_{ F}B)}^{-1}{B}^{T}Q_{ F}(A + C\ F_{v}^{{\ast}})\overline{x}_{ 0}^{(P)} {}\end{array}$$
(5.64)

Inserting (5.59) and (5.64) into (5.63) yields the value function of player P. The value function \(V _{0}^{(P)}(\overline{x}_{0}^{(P)})\) of player P is quadratic in \(\overline{x}_{0}^{(P)}\). It is of the form

$$\displaystyle\begin{array}{rcl} V _{0}^{(P)}(\overline{x}_{ 0}^{(P)}) = {(\overline{x}_{ 0}^{(P)})}^{T}M\overline{x}_{ 0}^{(P)} + {c}^{(P)}& & {}\\ \end{array}$$

where M is an n × n real, symmetric matrix and c (P) is a constant. While the matrix M is complex in appearance, note that it is not dependent on the covariances \(P_{0}^{(P)}\) and \(P_{0}^{(E)}\) of the players’ state measurement errors. Hence, we conclude that the matrix

$$\displaystyle\begin{array}{rcl} M = P_{1}\,& & {}\\ \end{array}$$

where the matrix P 1 is given by (5.9). The constant

$$\displaystyle\begin{array}{rcl}{ c}^{(P)}& =& \mathit{Trace}(\ {A}^{T}Q_{ F}AP_{0}^{(P)} + 2P_{ 0}^{(P)}{A}^{T}Q_{ F}CF_{v}^{{\ast}} \\ & &\qquad \qquad \qquad \qquad - {(F_{v}^{{\ast}})}^{T}(R_{ v} - {C}^{T}Q_{ F}C)F_{v}^{{\ast}}(P_{ 0}^{(P)} + P_{ 0}^{(E)})\ )\,{}\end{array}$$
(5.65)

where the gain \(F_{v}^{{\ast}}\) is given by (5.59).

Next, consider the expected cost \({\overline{J}}^{(E)}(f(\cdot ),v_{0};\overline{x}_{0}^{(E)})\) of player E, (5.40). The expectations

$$\displaystyle\begin{array}{rcl} & & E_{\tilde{w}}\ ({f}^{T}(\overline{x}_{ 0}^{(E)} -\tilde{ w})(R_{ u} + {B}^{T}Q_{ F}B)f(\overline{x}_{0}^{(E)} -\tilde{ w})) \\ & & \qquad = {(\overline{x}_{0}^{(E)})}^{T}{(F_{ u}^{{\ast}})}^{T}(R_{ u} + {B}^{T}Q_{ F}B)F_{u}^{{\ast}}\overline{x}_{ 0}^{(E)} \\ & & \qquad + \mathit{Trace}({(F_{u}^{{\ast}})}^{T}(R_{ u} + {B}^{T}Q_{ F}B)F_{u}^{{\ast}}(P_{ 0}^{(P)} \\ & & \qquad + P_{0}^{(E)})) {}\end{array}$$
(5.66)
$$\displaystyle\begin{array}{rcl} & & E_{\tilde{w}}\ (f(\overline{x}_{0}^{(E)} -\tilde{ w})) = F_{ u}^{{\ast}}\overline{x}_{ 0}^{(E)}{}\end{array}$$
(5.67)

and

$$\displaystyle\begin{array}{rcl} E_{w_{E},w_{P}}\ ({f}^{T}(\overline{x}_{ 0}^{(E)} + w_{ P} - w_{E}){B}^{T}Q_{ F}Aw_{E}) = -\mathit{Trace}({(F_{u}^{{\ast}})}^{T}{B}^{T}Q_{ F}AP_{0}^{(E)})\ \ \ \ \ \ \ \ & &{}\end{array}$$
(5.68)

Inserting (5.66)–(5.68) into (5.40) yields the value function of player E

$$\displaystyle\begin{array}{rcl} V _{0}^{(E)}& =& {(\overline{x}_{ 0}^{(E)})}^{T}{A}^{T}Q_{ F}A\overline{x}_{0}^{(E)} + \mathit{Trace}({A}^{T}Q_{ F}AP_{0}^{(E)}) - v_{ 0}^{T}(R_{ v} - {C}^{T}Q_{ F}C)v_{0} \\ & +& 2v_{0}^{T}{C}^{T}Q_{ F}A\overline{x}_{0}^{(E)} + {(\overline{x}_{ 0}^{(E)})}^{T}{(F_{ u}^{{\ast}})}^{T}(R_{ u} + {B}^{T}Q_{ F}B)F_{u}^{{\ast}}\overline{x}_{ 0}^{(E)} \\ & +& \mathit{Trace}({(F_{u}^{{\ast}})}^{T}(R_{ u} + {B}^{T}Q_{ F}B)F_{u}^{{\ast}}(P_{ 0}^{(P)} + P_{ 0}^{(E)})) \\ & +& 2{(\overline{x}_{0}^{(E)})}^{T}{(F_{ u}^{{\ast}})}^{T}{B}^{T}Q_{ F}A\overline{x}_{0}^{(E)} \\ & +& 2\mathit{Trace}({(F_{u}^{{\ast}})}^{T}{B}^{T}Q_{ F}AP_{0}^{(E)}) \\ & +& 2v_{0}^{T}{C}^{T}Q_{ F}BF_{u}^{{\ast}}\overline{x}_{ 0}^{(E)} {}\end{array}$$
(5.69)

Also, in (5.69),

$$\displaystyle\begin{array}{rcl} v_{0}^{{\ast}}& =& {(R_{ v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}(A\overline{x}_{0}^{(E)} + BE_{\tilde{ w}}\ (f(\overline{x}_{0}^{(E)} -\tilde{ w})\ )) \\ & =& {(R_{v} - {C}^{T}Q_{ F}C)}^{-1}{C}^{T}Q_{ F}(A + BF_{u}^{{\ast}})\overline{x}_{ 0}^{(E)} {}\end{array}$$
(5.70)

Inserting (5.58) and (5.70) into (5.69) yields the value function of player E. The value function \(V _{0}^{(E)}(\overline{x}_{0}^{(E)})\) of player E is quadratic in \(\overline{x}_{0}^{(E)}\). Similar to the value function of player P, it is of the form

$$\displaystyle\begin{array}{rcl} V _{0}^{(E)}(\overline{x}_{ 0}^{(E)}) = {(\overline{x}_{ 0}^{(E)})}^{T}P_{ 1}\overline{x}_{0}^{(E)} + {c}^{(E)}& & {}\\ \end{array}$$

The constant

$$\displaystyle\begin{array}{rcl}{ c}^{(E)} = \mathit{Trace}(\ {A}^{T}Q_{ F}AP_{0}^{(E)} + {(F_{ u}^{{\ast}})}^{T}(R_{ u}& +& {B}^{T}Q_{ F}B)F_{u}^{{\ast}}(P_{ 0}^{(P)} + P_{ 0}^{(E)}) \\ & & \qquad \qquad + 2{(F_{u}^{{\ast}})}^{T}{B}^{T}Q_{ F}AP_{0}^{(E)}\ ) {}\end{array}$$
(5.71)

where the gain F u is given by (5.58).

These results are summarized in

Theorem 5.3.

A necessary and sufficient condition for the existence of a solution to the game (5.1) and (5.2) where the players have private information, that is, a game where the initial state information of player P is specified in (5.15) and the initial state information of player E is specified in (5.16) is that condition (5.5) holds. Certainty equivalence holds and the optimal P strategy is given by (5.6) where x 0 is replaced by \(\overline{x}_{0}^{(P)}\) and the optimal E strategy is given by (5.7) where x 0 is replaced by \(\overline{x}_{0}^{(E)}\) . The value function of player P is

$$\displaystyle\begin{array}{rcl} V _{0}^{(P)}(\overline{x}_{ 0}^{(P)}) = {(\overline{x}_{ 0}^{(P)})}^{T}P_{ 1}\overline{x}_{0}^{(P)} + {c}^{(P)}& &{}\end{array}$$
(5.72)

and the value function of player E

$$\displaystyle\begin{array}{rcl} V _{0}^{(E)}(\overline{x}_{ 0}^{(E)}) = {(\overline{x}_{ 0}^{(E)})}^{T}P_{ 1}\overline{x}_{0}^{(E)} + {c}^{(E)}& &{}\end{array}$$
(5.73)

The matrix P 1 in (5.72) and (5.73) is specified by (5.9) and the constant terms in (5.72) and (5.73) ,c (P) and c (E) , are specified in (5.65) and (5.71) , respectively.
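
An end-to-end Monte Carlo illustration of Theorem 5.3 for player P: with both players using the certainty-equivalent gains, player P’s conditional expected cost should match \({(\overline{x}_{0}^{(P)})}^{T}P_{1}\overline{x}_{0}^{(P)} + {c}^{(P)}\). All problem data below are arbitrary illustrative choices; for the check, \(P_{1}\) is evaluated through the closed-loop quadratic form of the payoff rather than through the longer expression (5.9).

```python
import numpy as np

# All problem data are arbitrary illustrative choices satisfying (5.5).
rng = np.random.default_rng(7)
n, mu, mv, N = 3, 2, 2, 400_000
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, mu))
C = rng.standard_normal((n, mv))
QF = np.eye(n); Ru = np.eye(mu)
Rv = C.T @ QF @ C + 2.0 * np.eye(mv)
P0P, P0E = 0.2 * np.eye(n), 0.3 * np.eye(n)
xbar_P = np.array([1.0, -1.0, 0.5])

inv = np.linalg.inv
Dv, Du = Rv - C.T @ QF @ C, Ru + B.T @ QF @ B
SB = Du + B.T @ QF @ C @ inv(Dv) @ C.T @ QF @ B
SC = -(Dv + C.T @ QF @ B @ inv(Du) @ B.T @ QF @ C)
Fu = -inv(SB) @ B.T @ QF @ (np.eye(n) + C @ inv(Dv) @ C.T @ QF) @ A     # (5.58)
Fv = -inv(SC) @ C.T @ QF @ (np.eye(n) - B @ inv(Du) @ B.T @ QF) @ A     # (5.59)

Acl = A + B @ Fu + C @ Fv
P1 = Acl.T @ QF @ Acl + Fu.T @ Ru @ Fu - Fv.T @ Rv @ Fv                 # value matrix, cf. (5.9)
cP = np.trace(A.T @ QF @ A @ P0P + 2 * P0P @ A.T @ QF @ C @ Fv
              - Fv.T @ Dv @ Fv @ (P0P + P0E))                           # (5.65)

wP = rng.multivariate_normal(np.zeros(n), P0P, size=N)
wE = rng.multivariate_normal(np.zeros(n), P0E, size=N)
x0 = xbar_P - wP                       # player P's model of the true state, (5.29)
xbarE = x0 + wE                        # player E's measurement, (5.22)
u0 = Fu @ xbar_P                       # P's action, fixed given his measurement
v0 = xbarE @ Fv.T                      # E's action, one per sampled measurement
x1 = x0 @ A.T + u0 @ B.T + v0 @ C.T    # dynamics (5.1), row-wise
J = (np.einsum('ij,jk,ik->i', x1, QF, x1) + u0 @ Ru @ u0
     - np.einsum('ij,jk,ik->i', v0, Rv, v0))                            # payoff (5.2)
print("Monte Carlo:", J.mean(), " closed form:", xbar_P @ P1 @ xbar_P + cP)
```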

7 Conclusion

A static two-player linear-quadratic game where the players have private information on the game’s parameter is addressed. Although the players have private information, each player is able to formulate an expression for his expected payoff without the need, à la Harsanyi, to provide a prior probability distribution function of the game’s parameter, and without recourse to the player Nature. Hence, a closed-form solution of the game is possible. It is shown that in this special case of a one-stage linear-quadratic game where the players have private information, the solution is similar in structure to the solution of the game with complete information, namely, the deterministic linear-quadratic game, and to the solution of the linear-quadratic game with partial information, where the information about the game’s parameter is shared by the players. The principle of certainty equivalence holds. The analysis in this paper points the way to possible extensions of the theory to multi-stage linear-quadratic dynamic games.