
Introduction

In several areas of systems and control theory, such as linear quadratic optimal control, the bounded real lemma, H-infinity control, and stochastic realization theory, quadratic matrix equations play a role. Such equations are of the form

$$\displaystyle{ XDX - BX + XA - C = 0, }$$
(19.1)

where A, B, C, and D are given matrices and X is the solution. The problem of finding X can often be solved in the following way: Introduce

$$\displaystyle{ \mathcal{H} = \left [\begin{array}{*{10}c} A&D\\ C & B \end{array} \right ] }$$
(19.2)

and consider the subspace \(\mathcal{M} =\mathrm{ Im\,}\left [\begin{array}{*{10}c} I\\ X \end{array} \right ]\). Then X is a solution of (19.1) if and only if \(\mathcal{M}\) is \(\mathcal{H}\)-invariant and in addition

$$\displaystyle{ \mathcal{M}\cap \mathrm{Im\,}\left [\begin{array}{*{10}c} 0\\ I \end{array} \right ] =\{ 0\}. }$$
(19.3)

Furthermore, if X is a solution to (19.1), then

$$\displaystyle{ \sigma (A + DX) =\sigma (\mathcal{H}\vert _{\mathcal{M}}). }$$
(19.4)

Thus solutions of the algebraic Riccati equation are in one-to-one correspondence with \(\mathcal{H}\)-invariant subspaces for which the extra condition (19.3) holds, and moreover, the spectrum of the so-called closed loop feedback matrix A + DX is given by (19.4).
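The correspondence above can be checked numerically. The following sketch uses random illustrative data: X is fixed first and C is chosen so that X solves (19.1) exactly; the script then verifies that \(\mathrm{Im\,}[I;X]\) is \(\mathcal{H}\)-invariant and that (19.4) holds.

```python
import numpy as np

# Numerical check (random illustrative data) of the correspondence:
# if X solves XDX - BX + XA - C = 0, then Im [I; X] is H-invariant
# and sigma(A + DX) is part of the spectrum of H.
rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
D = rng.standard_normal((n, n))
X = rng.standard_normal((n, n))
C = X @ D @ X - B @ X + X @ A            # choose C so that X solves (19.1) exactly

H = np.block([[A, D], [C, B]])           # the matrix of (19.2)
M = np.vstack([np.eye(n), X])            # basis of the graph subspace Im [I; X]

residual = np.linalg.norm(X @ D @ X - B @ X + X @ A - C)
assert residual < 1e-12

# Invariance: H M = M (A + DX), so the columns of H M stay in the subspace.
assert np.allclose(H @ M, M @ (A + D @ X))

# (19.4): every eigenvalue of A + DX is an eigenvalue of H.
eigs_H = np.linalg.eigvals(H)
for lam in np.linalg.eigvals(A + D @ X):
    assert np.min(np.abs(eigs_H - lam)) < 1e-6
```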

In the control problems mentioned above, the equation usually has some symmetry. In fact, mostly D and C are Hermitian matrices, and \(B = -A^{{\ast}}\). In most cases one is looking for the unique solution X for which A + DX is stable in the sense that all its eigenvalues are in the open left half plane. It is easy to see that such a solution has to be Hermitian as well. Such a solution is called the stabilizing solution of the algebraic Riccati equation.

Observe that in case \(D = D^{{\ast}}\), \(C = C^{{\ast}}\), and \(B = -A^{{\ast}}\), the matrix \(\mathcal{H}\) is J-Hamiltonian, that is, with

$$\displaystyle{J = \left [\begin{array}{*{10}c} 0 &I\\ -I & 0 \end{array} \right ]}$$

we have

$$\displaystyle{J\mathcal{H} = -\mathcal{H}^{{\ast}}J.}$$

In other words, the matrix \(i\mathcal{H}\) is selfadjoint in the indefinite inner product induced by iJ. Moreover, for Hermitian solutions X of the algebraic Riccati equation, the subspace \(\mathcal{M}\) satisfies \(J\mathcal{M} = \mathcal{M}^{\perp }\). A subspace with this property will be called J-Lagrangian.

Thus, when considering Hermitian solutions of symmetric algebraic Riccati equations, one is interested in \(\mathcal{H}\)-invariant J-Lagrangian subspaces with the extra condition (19.3). This connection between solutions of the Riccati equation and invariant Lagrangian subspaces goes back to [4, 20, 21].

The condition that the spectrum of A + DX lies in the open left half plane then implies (using a dimension argument) that \(\mathcal{H}\) does not have any spectrum on the imaginary axis, and [using (19.4)] that \(\mathcal{M}\) is the spectral subspace of \(\mathcal{H}\) corresponding to the open left half plane. For some applications, notably in H-infinity control, it is of interest to study solutions for which the spectral condition is weakened to σ(A + DX) lying in the closed left half plane. This motivates the study of \(\mathcal{H}\)-invariant J-Lagrangian subspaces for matrices that are J-Hamiltonian. In effect, since the results on canonical forms for selfadjoint matrices in indefinite inner products are readily available, it is easier to consider \(i\mathcal{H}\)-invariant subspaces which are iJ-Lagrangian.

An excellent discussion of the algebraic Riccati equation, based on an approach using indefinite inner product spaces, is given in the book [13]. Applications to problems in factorization of rational matrix functions, and connections to engineering problems like the theory of linear quadratic optimal control, H-infinity control, the bounded real lemma, and the positive real lemma may be found also in [1]. Most of these connections will be discussed briefly in the next section.

Algebraic Riccati equations may be solved in several ways. Classically, solution methods were iterative; from that point of view, the problem of finding invariant Lagrangian subspaces may be tackled by solving a corresponding algebraic Riccati equation. However, the current way of solving algebraic Riccati equations works the other way around: the existence of invariant Lagrangian subspaces is used, and computer programs like Matlab exploit this to find the desired solution of the algebraic Riccati equation.
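This invariant-subspace approach (the classical ordered-Schur method) can be sketched with SciPy; the system data below are illustrative, and the Hamiltonian sign convention is the one for which \(\mathrm{Im\,}[I;X]\) is invariant exactly when X solves \(XBR^{-1}B^{{\ast}}X - XA - A^{{\ast}}X - Q = 0\).

```python
import numpy as np
from scipy.linalg import schur

# Schur (invariant-subspace) method for the CARE, with made-up data:
# order a real Schur form of the Hamiltonian so the stable eigenvalues
# come first, then read the stabilizing solution off the subspace.
A = np.array([[0.0, 1.0], [0.0, 0.0]])    # double integrator
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

H = np.block([[A, -B @ np.linalg.inv(R) @ B.T],
              [-Q, -A.T]])                # Hamiltonian matrix
T, U, _ = schur(H, sort='lhp')            # stable eigenvalues ordered first
n = A.shape[0]
U11, U21 = U[:n, :n], U[n:, :n]
X = U21 @ np.linalg.inv(U11)              # X = U21 U11^{-1}
X = (X + X.T) / 2                         # symmetrize against round-off
```

For this example the stabilizing solution is known in closed form, X = [[√3, 1], [1, √3]], which the computed X reproduces to machine precision.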

There is a rich literature concerning the infinite dimensional case. The reader is referred to [5] for a good starting point. Further developments can be found in, e.g., [19, 22, 23]. The viewpoint of using existence of invariant Lagrangian subspaces, and the theory of operators in spaces with an indefinite inner product, to study particular solutions of the algebraic Riccati equation can be found in [15], as well as in [3]. The focus in this essay will be on the finite dimensional case.

Motivation

In this section several ways in which the algebraic Riccati equation appears in problems in systems and control theory will be discussed.

Linear Quadratic Optimal Control

Consider a controllable linear system in continuous time, given by

$$\displaystyle\begin{array}{rcl} \dot{x}(t)& =& Ax(t) + Bu(t),\qquad t \geq 0, {}\\ x(0)& =& x_{0}. {}\\ \end{array}$$

Together with the system a cost function is given by

$$\displaystyle{J(u,x_{0}) =\int _{ 0}^{\infty }x(t)^{{\ast}}Qx(t) + u(t)^{{\ast}}Ru(t)\,dt.}$$

The goal is to minimize \(J(u,x_{0})\) over all stabilizing input trajectories u(t), where x(t) is the corresponding trajectory of the system. The matrices Q and R satisfy the following conditions: Q ≥ 0, R > 0.

This minimization problem can be solved in the following way: find the stabilizing solution X of the algebraic Riccati equation

$$\displaystyle{XBR^{-1}B^{{\ast}}X - XA - A^{{\ast}}X - Q = 0}$$

and then set \(u(t) = -R^{-1}B^{{\ast}}Xx(t)\).
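The recipe can be illustrated with SciPy's CARE solver; the matrices below are made up for the example, with an unstable open loop.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# LQ sketch with illustrative data: compute the stabilizing solution X of
# XBR^{-1}B*X - XA - A*X - Q = 0, then the optimal feedback u = -R^{-1}B*X x.
A = np.array([[1.0, 1.0], [0.0, -2.0]])   # open loop unstable (eigenvalue +1)
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

X = solve_continuous_are(A, B, Q, R)      # stabilizing Hermitian solution
K = np.linalg.solve(R, B.T @ X)           # gain of the feedback u(t) = -K x(t)

# The closed-loop matrix A - B R^{-1} B* X is stable.
closed_loop = A - B @ K
assert np.max(np.linalg.eigvals(closed_loop).real) < 0
```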

The discrete-time analogue can be considered as well. The system is then given by

$$\displaystyle\begin{array}{rcl} x(t + 1)& =& Ax(t) + Bu(t),\qquad t = 0,1,2,\cdots \,, {}\\ x(0)& =& x_{0}, {}\\ \end{array}$$

and the cost function is given by

$$\displaystyle{J(u,x_{0}) =\sum _{ t=0}^{\infty }x(t)^{{\ast}}Qx(t) + u(t)^{{\ast}}Ru(t).}$$

Again the goal is to minimize the cost function over all stabilizing input sequences u(t). Under the same conditions on the system and the cost function the solution is now as follows: find the stabilizing solution of the so-called discrete algebraic Riccati equation

$$\displaystyle{X = Q + A^{{\ast}}XA - A^{{\ast}}XB(R + B^{{\ast}}XB)^{-1}B^{{\ast}}XA,}$$

then the minimizing input sequence is given by \(u(t) = -(R + B^{{\ast}}XB)^{-1}B^{{\ast}}XAx(t)\).
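The discrete-time recipe can be sketched the same way with SciPy's DARE solver; again the data are made up.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Discrete-time LQ sketch with illustrative data: solve the DARE
# X = Q + A*XA - A*XB(R + B*XB)^{-1}B*XA and form the optimal feedback.
A = np.array([[1.1, 0.3], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

X = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ X @ B, B.T @ X @ A)   # u(t) = -K x(t)

# Closed loop A - BK has all its eigenvalues inside the open unit disk.
assert np.max(np.abs(np.linalg.eigvals(A - B @ K))) < 1
```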

How the discrete algebraic Riccati equation relates to an invariant subspace problem for a structured matrix in an indefinite inner product space will be discussed in the last section.

Dropping the condition that the input functions (for the continuous time case) or the input sequences (for the discrete time case) over which one minimizes the cost function are stabilizing, and just assuming that R is invertible, one arrives at the so-called linear quadratic problems with indefinite cost. It turns out that once again, certain solutions of the same algebraic Riccati equations play a role, but obviously, not the stabilizing ones. For details on this, see [25, 27].

Bounded Real Lemma

The bounded real lemma provides a characterization of contractiveness of a rational matrix valued function. As a first result, let \(W(\lambda ) = D + C(\lambda I_{n} - A)^{-1}B\) be a minimal realization of a rational p × m matrix function, and assume that D is a strict contraction. Then the following three statements are equivalent:

  1.

    W(λ) has contractive values for λ on the imaginary axis,

  2.

    there exists a Hermitian solution P of the algebraic Riccati equation

    $$\displaystyle{AP + PA^{{\ast}} + BB^{{\ast}} + (PC^{{\ast}} + BD^{{\ast}})(I - DD^{{\ast}})^{-1}(CP + DB^{{\ast}}) = 0,}$$
  3.

    there exists a Hermitian solution Q of the algebraic Riccati equation

    $$\displaystyle{A^{{\ast}}Q + QA - C^{{\ast}}C - (QB - C^{{\ast}}D)(I - D^{{\ast}}D)^{-1}(B^{{\ast}}Q - D^{{\ast}}C) = 0.}$$

The bounded real lemma characterizes when a rational matrix valued function has contractive values in the closed right half plane. To be precise, with W(λ) as in the previous paragraph, assume that W(λ) is contractive for λ on the imaginary axis. Then W(λ) has contractive values for all λ in the closed right half plane if and only if A has all its eigenvalues in the open left half plane, which in turn is equivalent to the existence of a positive definite solution of

$$\displaystyle{AP + PA^{{\ast}} + BB^{{\ast}} + (PC^{{\ast}} + BD^{{\ast}})(I - DD^{{\ast}})^{-1}(CP + DB^{{\ast}}) = 0.}$$
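The first half of the statement, contractivity on the imaginary axis extending to the closed right half plane when A is stable, can be illustrated numerically; the realization below is made up with small gains so that W is contractive.

```python
import numpy as np

# Toy illustration of the bounded real lemma: for a stable A, contractivity
# of W(s) = D + C (sI - A)^{-1} B on the imaginary axis extends to the
# closed right half plane. All matrices are illustrative.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])  # stable
B = np.array([[0.5], [0.5]])
C = np.array([[0.3, 0.1]])
D = np.array([[0.2]])

def W(s):
    return D + C @ np.linalg.solve(s * np.eye(2) - A, B)

# contractive on the imaginary axis ...
axis_norms = [np.linalg.norm(W(1j * w), 2) for w in np.linspace(-50, 50, 2001)]
assert max(axis_norms) < 1

# ... and also at sample points of the open right half plane
rhp_norms = [np.linalg.norm(W(x + 1j * y), 2)
             for x in (0.1, 1.0, 10.0) for y in (-5.0, 0.0, 5.0)]
assert max(rhp_norms) < 1
```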

H-Infinity Control

Consider the following problem: given is a system with two inputs (w and u) and two outputs (y and z):

$$\displaystyle\begin{array}{rcl} \dot{x}(t)& =& Ax(t) + B_{1}w(t) + B_{2}u(t), {}\\ z(t)& =& Cx(t) + Du(t), {}\\ y(t)& =& x(t). {}\\ \end{array}$$

The input u, as usual, is the one that can be controlled; w is interpreted as a disturbance. Also, y is the measured output and z is the output to be controlled. This is a special case of an H-infinity control problem, the so-called full information case. The objective is to make the influence of the disturbance w on the output z small in an appropriate sense, to be made precise below.

Consider the state feedback u(t) = Kx(t), where K is a fixed matrix. Then the closed loop system becomes

$$\displaystyle\begin{array}{rcl} \dot{x}(t)& =& (A + B_{2}K)x(t) + B_{1}w(t), {}\\ z(t)& =& (C + DK)x(t). {}\\ \end{array}$$

Denote by \(G_{K}(s)\) the transfer function from w to z, that is, \(G_{K}(s) = (C + DK)(sI - (A + B_{2}K))^{-1}B_{1}\). Then the objective is to find K such that the following two conditions hold:

  1.

    for some pre-specified tolerance level γ

    $$\displaystyle{ \|G_{K}\|_{\infty }:=\max _{s\in i\mathbb{R}}\|G_{K}(s)\| <\gamma }$$
  2.

    K is a stabilizing feedback, that is, \(A + B_{2}K\) has all its eigenvalues in the open left half plane.

Under the assumptions that the pair (C, A) is observable, the pairs \((A,B_{1})\) and \((A,B_{2})\) are stabilizable, \(D^{T}C = 0\), and \(D^{T}D = I\), there exists a matrix K such that \(A + B_{2}K\) is stable and \(\|G_{K}\|_{\infty } <\gamma\) if and only if there exists a positive definite matrix X for which the following two conditions are met: X satisfies the algebraic Riccati equation

$$\displaystyle{ X\left (\frac{1} {\gamma ^{2}} B_{1}B_{1}^{T} - B_{ 2}B_{2}^{T}\right )X + XA + A^{T}X + C^{T}C = 0, }$$

and \(A +\big (\frac{1} {\gamma ^{2}} B_{1}B_{1}^{T} - B_{2}B_{2}^{T}\big)X\) is stable. In that case one such state feedback is given by \(K = -B_{2}^{T}X\).
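The two conditions can be verified via the stable invariant subspace of the associated Hamiltonian matrix, here taken as \(\big[\begin{smallmatrix} A & S \\ -C^{T}C & -A^{T}\end{smallmatrix}\big]\) with \(S = \gamma ^{-2}B_{1}B_{1}^{T} - B_{2}B_{2}^{T}\) (a standard construction, sketched below with toy scalar data).

```python
import numpy as np
from scipy.linalg import schur

# Sketch (toy scalar data) of the full-information H-infinity test: extract the
# stable invariant subspace of the Hamiltonian of the Riccati equation, then
# check that X is positive definite and yields the required stability.
gamma = 2.0
A  = np.array([[-1.0]])
B1 = np.array([[1.0]])
B2 = np.array([[1.0]])
C  = np.array([[1.0]])

S = (1 / gamma**2) * B1 @ B1.T - B2 @ B2.T
H = np.block([[A, S], [-C.T @ C, -A.T]])
T, U, _ = schur(H, sort='lhp')
n = A.shape[0]
X = U[n:, :n] @ np.linalg.inv(U[:n, :n])

residual = X @ S @ X + X @ A + A.T @ X + C.T @ C     # the Riccati equation
assert np.linalg.norm(residual) < 1e-10
assert np.min(np.linalg.eigvalsh((X + X.T) / 2)) > 0  # X positive definite
assert np.max(np.linalg.eigvals(A + S @ X).real) < 0  # A + ( . )X stable
K = -B2.T @ X                                         # one suitable feedback
```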

It may be observed that as \(\gamma \rightarrow \infty \), the solution X, considered as a function of γ, converges to the stabilizing solution of the Riccati equation of the LQ-optimal control problem.

Stochastic Realization

Consider a vector valued zero-mean stationary stochastic process y(t), \(t \in \mathbb{Z}\). Recall that this means that \(\mathbb{E}(y(t)y(t - k)^{T})\) only depends on k. The vectors y(t) are in \(\mathbb{R}^{p}\). The p × p matrices \(R(k) = \mathbb{E}(y(t)y(t - k)^{T})\) are called the autocovariances of the process.

A state space representation for the process is a representation given by

$$\displaystyle\begin{array}{rcl} x(t + 1)& =& Ax(t) +\varepsilon _{1}(t),\qquad t \in \mathbb{Z} {}\\ y(t)& =& Cx(t) +\varepsilon _{2}(t), {}\\ \end{array}$$

where A is a stable matrix and \(\left (\begin{array}{*{10}c} \varepsilon _{1}(t) \\ \varepsilon _{2}(t) \end{array} \right )\) is a joint white noise process with covariance matrix

$$\displaystyle{\Sigma = \left (\begin{array}{*{10}c} \Sigma _{11} & \Sigma _{12} \\ \Sigma _{12}^{T}&\Sigma _{22} \end{array} \right ).}$$

The standing assumption is that \(\Sigma _{22}\) is invertible.

The (weak) stochastic realization problem is to construct the matrices A, C, and \(\Sigma \) from the autocovariances R(k) of the process. Obviously, it is of interest to have a minimal state space representation, which means that the number of state variables x(t) and the number of noise variables \(\left (\begin{array}{*{10}c} \varepsilon _{1}(t) \\ \varepsilon _{2}(t) \end{array} \right )\) are as small as possible.

A first step in the minimal realization is to construct matrices (A, C, M) such that \(R(k) = CA^{k-1}M\), with the state space dimension as small as possible. This can be done by a routine realization procedure. The second step is to produce \(\Sigma \) from these matrices and the state covariance matrix \(\Pi = \mathbb{E}(x(t)x(t)^{T})\), as follows:

$$\displaystyle{\Sigma = \left (\begin{array}{*{10}c} \Sigma _{11} & \Sigma _{12} \\ \Sigma _{12}^{T}&\Sigma _{22} \end{array} \right ) = \left (\begin{array}{*{10}c} \Pi - A\Pi A^{T} & M - A\Pi C^{T} \\ M^{T} - C\Pi A^{T}&R(0) - C\Pi C^{T} \end{array} \right ).}$$

The number of noise terms is minimized by making the rank of \(\Sigma \) as small as possible. Since \(\Sigma _{22}\) needs to be invertible, this is achieved by taking \(\Pi \) such that \(\mathrm{rank\,}\Sigma = p\), which in turn means making the Schur complement \(Z = \Sigma _{11} - \Sigma _{12}\Sigma _{22}^{-1}\Sigma _{12}^{T}\) equal to zero. That is, \(\Pi \) is chosen so that it is positive definite (after all, \(\Pi \) is the state covariance matrix), \(R(0) - C\Pi C^{T}\) is invertible, and \(\Pi \) satisfies the algebraic Riccati equation

$$\displaystyle{\Pi = A\Pi A^{T} + (M - A\Pi C^{T})(R(0) - C\Pi C^{T})^{-1}(M^{T} - C\Pi A^{T}).}$$

For more details, see, e.g., [11, Chapter 6].
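The construction can be illustrated numerically. In the sketch below (a made-up scalar process), the data (A, C, M, R(0)) are generated from a known model, the Riccati equation is solved by fixed-point iteration (one possible method among several), and the Schur complement Z of the resulting \(\Sigma \) is checked to vanish.

```python
import numpy as np

# Made-up scalar process: generate autocovariance data from a known model,
# iterate the Riccati equation for Pi, and check that rank Sigma becomes
# minimal, i.e. Z = Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_12^T = 0.
A  = np.array([[0.5]])
C  = np.array([[1.0]])
S11, S12, S22 = np.array([[1.0]]), np.array([[0.3]]), np.array([[1.0]])

Pi_true = S11 / (1 - 0.25)            # solves Pi = A Pi A^T + S11 (scalar case)
M  = A @ Pi_true @ C.T + S12
R0 = C @ Pi_true @ C.T + S22          # R(0)

# fixed-point iteration of the Riccati equation for Pi
Pi = np.zeros((1, 1))
for _ in range(200):
    G  = M - A @ Pi @ C.T
    Pi = A @ Pi @ A.T + G @ np.linalg.solve(R0 - C @ Pi @ C.T, G.T)

G = M - A @ Pi @ C.T
Z = (Pi - A @ Pi @ A.T) - G @ np.linalg.solve(R0 - C @ Pi @ C.T, G.T)
assert np.linalg.norm(Z) < 1e-8       # Sigma has rank p = 1
assert Pi[0, 0] > 0                   # Pi is a valid state covariance
```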

Kalman Filter

Given a zero-mean stationary stochastic process y(t), one is interested in the one-step ahead prediction, given by the conditional expectation of y(t) based on all earlier values of the process. To be precise

$$\displaystyle{\hat{y}(t) = \mathbb{E}(y(t)\mid y(s),s \leq t - 1).}$$

The Kalman filter solves this problem, starting from a realization of the process y:

$$\displaystyle\begin{array}{rcl} x(t + 1)& =& Ax(t) + F\varepsilon (t),\qquad t = 0,1,2\cdots {}\\ y(t)& =& Cx(t) + G\varepsilon (t), {}\\ \end{array}$$

where A is a stable matrix, G has full row rank, and the process \(\varepsilon\) is white noise with zero mean and unit covariance matrix. It is assumed that x(0) ∼ N(0, P(0)) and that it is independent of \(\varepsilon (t)\) for all t.

Defining \(\hat{x}(t) = \mathbb{E}(x(t)\mid y(s),s \leq t - 1)\), it is seen that

$$\displaystyle{\hat{y}(t) = C\hat{x}(t).}$$

Introduce the observation error \(\omega (t) = y(t) -\hat{ y}(t)\), and denote by P(t) the state error covariance matrix, \(P(t) = \mathbb{E}(x(t) -\hat{ x}(t))(x(t) -\hat{ x}(t))^{T}\).

Then the Kalman filter is given as follows:

$$\displaystyle\begin{array}{rcl} \hat{x}(t + 1)& =& A\hat{x}(t) + K(t)\omega (t), {}\\ \hat{y}(t)& =& C\hat{x}(t), {}\\ \hat{x}(0)& =& 0, {}\\ \end{array}$$

where

$$\displaystyle{K(t) = (FG^{T} + AP(t)C^{T})(GG^{T} + CP(t)C^{T})^{-1},}$$

and P(t) is given by the recursion

$$\displaystyle\begin{array}{rcl} P(t + 1)& =& AP(t)A^{T} + FF^{T} - (FG^{T} + AP(t)C^{T})(GG^{T} + CP(t)C^{T})^{-1} {}\\ & & \qquad (GF^{T} + CP(t)A^{T}) {}\\ \end{array}$$

started with the covariance matrix P(0) of x(0).

The recursion for P(t) can be shown to converge to a steady state under suitable conditions on the coefficients; the limit of P(t) is then the largest solution of the algebraic Riccati equation

$$\displaystyle{P = APA^{T} + FF^{T} - (FG^{T} + APC^{T})(GG^{T} + CPC^{T})^{-1}(GF^{T} + CPA^{T}).}$$

Replacing P(t) by this solution P in the formula for K(t) leads to the so-called steady-state Kalman filter. For more details, see, e.g., [10, 11, 13].
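The convergence of the recursion can be seen on a toy scalar model (all numbers below are made up; F and G are chosen so that the state and output noises are independent).

```python
import numpy as np

# Toy scalar model illustrating the Riccati recursion for the error
# covariance P(t) and its convergence to the steady-state solution.
A = np.array([[0.5]]); C = np.array([[1.0]])
F = np.array([[1.0, 0.0]]); G = np.array([[0.0, 1.0]])   # independent noises

P = np.array([[1.0]])                                    # P(0)
for _ in range(200):
    S = G @ G.T + C @ P @ C.T
    K = (F @ G.T + A @ P @ C.T) @ np.linalg.inv(S)       # Kalman gain K(t)
    P = A @ P @ A.T + F @ F.T - K @ (G @ F.T + C @ P @ A.T)

# P now satisfies the algebraic Riccati equation up to round-off, and the
# last K is the gain of the steady-state Kalman filter.
residual = A @ P @ A.T + F @ F.T \
    - (F @ G.T + A @ P @ C.T) @ np.linalg.inv(G @ G.T + C @ P @ C.T) \
    @ (G @ F.T + C @ P @ A.T) - P
assert np.linalg.norm(residual) < 1e-10
```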

Spectral Factorization

Let W(λ) be a rational m × m matrix function which has selfadjoint values on the imaginary axis, with the exception of possible poles. If \(W(\lambda ) = D + C(\lambda I_{n} - A)^{-1}B\) is a minimal realization, then there exists a unique invertible skew-Hermitian matrix H (i.e., \(H = -H^{{\ast}}\)) such that \(HA = -A^{{\ast}}H\), and \(HB = C^{{\ast}}\). Note that iH is Hermitian, and that iA is iH-selfadjoint. Assuming that D is invertible, also \(A^{\times } = A - BD^{-1}C\) satisfies \(HA^{\times } = -(A^{\times })^{{\ast}}H\). The matrix \(A^{\times }\) is of importance because of the fact that

$$\displaystyle{W(\lambda )^{-1} = D^{-1} - D^{-1}C(\lambda I_{ n} - A^{\times })^{-1}BD^{-1}.}$$

Consider a special case, where W(λ) is positive definite for λ on the imaginary axis, again with the exception of possible poles. In that case it is of interest to construct the so-called spectral factors, that is, one is interested in finding a rational m × m matrix function L(λ) such that L has all its poles and zeros in the open left half plane and

$$\displaystyle{W(\lambda ) = L(-\bar{\lambda })^{{\ast}}L(\lambda ).}$$

An obvious necessary condition is that W itself does not have poles and zeros on the imaginary line. It turns out that this necessary condition is also sufficient. Usually, it is assumed that W is given in a different form. As a sample of the results available, consider the case where

$$\displaystyle{W(\lambda ) = D + C(\lambda I_{n} - A)^{-1}B - B^{{\ast}}(\lambda I_{ n} + A^{{\ast}})^{-1}C^{{\ast}},}$$

with a stable matrix A, and a positive definite D. The corresponding matrix H is then given by

$$\displaystyle{H = \left (\begin{array}{*{10}c} 0 &I_{n} \\ -I_{n}& 0 \end{array} \right ).}$$

Assume in addition that W does not have zeros on the imaginary axis. Put \(A^{\times } = A - BD^{-1}C\). Then the Riccati equation

$$\displaystyle{PBD^{-1}B^{{\ast}}P - PA^{\times }- (A^{\times })^{{\ast}}P + C^{{\ast}}D^{-1}C = 0}$$

has a unique solution P for which \(\sigma (A^{\times }- BD^{-1}B^{{\ast}}P)\) is contained in the open left half plane, and a spectral factorization is given by \(W(\lambda ) = L(-\bar{\lambda })^{{\ast}}L(\lambda )\), where

$$\displaystyle{L(\lambda ) = D^{1/2} + D^{-1/2}(C + B^{{\ast}}P)(\lambda I_{ n} - A)^{-1}B.}$$

(See Theorem 13.2 in [1].)
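The factorization can be verified in a scalar example. Take A = −1, B = C = D = 1 (illustrative values; then \(A^{\times } = -2\)), so the Riccati equation reads \(P^{2} + 4P + 1 = 0\), with stabilizing root \(P = -2 + \sqrt{3}\).

```python
import numpy as np

# Scalar illustration of the spectral factorization result above,
# with A = -1, B = 1, C = 1, D = 1, hence A_x = A - B D^{-1} C = -2.
P = -2 + np.sqrt(3)
assert abs(P**2 + 4 * P + 1) < 1e-12   # P solves the Riccati equation
assert -2 - P < 0                      # A_x - B D^{-1} B* P is stable

def Wf(lam):   # W(lam) = D + C (lam - A)^{-1} B - B* (lam + A*)^{-1} C*
    return 1 + 1 / (lam + 1) - 1 / (lam - 1)

def Lf(lam):   # L(lam) = D^{1/2} + D^{-1/2} (C + B* P)(lam - A)^{-1} B
    return 1 + (1 + P) / (lam + 1)

# Check W(lam) = L(-conj(lam))* L(lam) on imaginary-axis sample points.
for w in np.linspace(-10, 10, 101):
    lam = 1j * w
    assert abs(Wf(lam) - np.conj(Lf(-np.conj(lam))) * Lf(lam)) < 1e-10
```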

Many related factorization problems also involve algebraic Riccati equations. To mention just a few: J-spectral factorization (Chapter 14 in [1]), inner–outer factorizations (see, e.g., Theorem 17.26 in [1]), and unitary completions of strictly contractive matrix functions (see Theorem 17.29 in [1]).

Bezout Equation

A classical problem in systems theory is the following: given is an m × p rational matrix function G(λ) which is analytic in the open right half plane, and for which the value \(G(\infty ) = D\) exists. It is assumed that p > m, so that G has more columns than rows. The goal is to find a p × m rational matrix function X(λ), also analytic in the open right half plane, such that

$$\displaystyle{G(\lambda )X(\lambda ) = I_{m},\qquad \mathrm{Re\,}\lambda \geq 0.}$$

There is an extensive literature on this so-called Bezout equation and the related corona equation, see, e.g., [28] and the literature mentioned in [6]. Here, the solution obtained in [6] will be presented.

Assume that \(G(\lambda ) = D + C(\lambda I_{n} - A)^{-1}B\) where A has all its eigenvalues in the open left half plane. Clearly, a necessary condition for the existence of a solution X(λ) is that D has a right inverse. In particular, it is necessary that \(DD^{{\ast}}\) is invertible. Let P be the unique solution of the Lyapunov equation

$$\displaystyle{AP + PA^{{\ast}} = -BB^{{\ast}}.}$$

Put \(\Gamma = BD^{{\ast}} + PC^{{\ast}}\), and consider the algebraic Riccati equation

$$\displaystyle{A^{{\ast}}Q + QA + (C - \Gamma ^{{\ast}}Q)^{{\ast}}(DD^{{\ast}})^{-1}(C - \Gamma ^{{\ast}}Q) = 0.}$$

A solution Q of this equation is called the stabilizing solution if

$$\displaystyle{A_{0} = A - \Gamma (DD^{{\ast}})^{-1}(C - \Gamma ^{{\ast}}Q)}$$

has all its eigenvalues in the open left half plane.

In [6] the following result is proved: there is a rational p × m matrix function X which is analytic in the open right half plane and which satisfies the Bezout equation \(G(\lambda )X(\lambda ) = I_{m}\) if and only if there exists a stabilizing solution Q of the algebraic Riccati equation and, in addition, \(I_{n} - PQ\) is invertible. In that case, one solution is given by

$$\displaystyle{X(\lambda ) = \left (I_{p} - C_{1}(\lambda I_{n} - A_{0})^{-1}(I_{ n} - PQ)^{-1}B\right )D^{{\ast}}(DD^{{\ast}})^{-1},}$$

where \(C_{1} = D^{{\ast}}(DD^{{\ast}})^{-1}(C - \Gamma ^{{\ast}}Q) + B^{{\ast}}Q\).

Moreover, a complete description of all solutions is provided as well in [6]. The discrete time analogues were discussed in [7, 8].

Invariant Lagrangian Subspaces

Existence

For the reader’s convenience the canonical form for pairs of matrices (A, H), where \(H = H^{{\ast}}\) is invertible and \(HA = A^{{\ast}}H\), is recalled here. As a starting point, consider the following two examples.

Example 1.

A = J n (λ) is the n × n Jordan block with real eigenvalue λ, and \(H =\varepsilon \Sigma _{n}\), where \(\varepsilon = \pm 1\) and \(\Sigma _{n}\) is the n × n sip matrix (i.e., the matrix with ones on the second main diagonal and zeros elsewhere).

Example 2.

\(A = J_{n}(\lambda ) \oplus J_{n}(\overline{\lambda })\) and \(H = \Sigma _{2n}\).

The result on the canonical form states that if A is H-selfadjoint, then there is an invertible matrix S such that the pair \((S^{-1}AS,S^{{\ast}}HS)\) is a block diagonal sum of blocks of the types described in the two examples above (see [9] for this result and a description of its history).

The signs in the canonical form attached to Jordan blocks of A with real eigenvalues constitute the sign characteristic of the pair (A, H). Using this notion, the following theorem describes the existence of A-invariant H-Lagrangian subspaces [24, Theorem 5.1].

Theorem 1.

Let A be H-selfadjoint. Then there exists an A-invariant H-Lagrangian subspace if and only if for each real eigenvalue λ of A the number of Jordan blocks of odd size with eigenvalue λ is even, and exactly half of those have a sign + 1 attached to them in the sign characteristic of the pair (A,H).

Stability

It is also of interest to study stability of A-invariant H-Lagrangian subspaces under small perturbations of the matrices A and H. To discuss this, a metric on the space of subspaces is needed. The gap between two subspaces \(\mathcal{M}\) and \(\mathcal{M}^{{\prime}}\) is defined by

$$\displaystyle{\mathrm{gap\,}(\mathcal{M},\mathcal{M}^{{\prime}}) =\| P_{ \mathcal{M}}- P_{\mathcal{M}^{{\prime}}}\|,}$$

where \(P_{\mathcal{M}}\) is the orthogonal projection on \(\mathcal{M}\), and likewise for \(P_{\mathcal{M}^{{\prime}}}\).
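The gap can be computed directly from orthogonal projections; the subspaces in the sketch below are arbitrary illustrative spans in \(\mathbb{R}^{3}\).

```python
import numpy as np

# The gap metric on subspaces, computed as the spectral norm of the
# difference of the orthogonal projections onto them.
def gap(M1, M2):
    """gap(M1, M2) = || P_M1 - P_M2 || for subspaces given by basis matrices."""
    def proj(M):
        Qm, _ = np.linalg.qr(M)          # orthonormal basis of the column span
        return Qm @ Qm.T
    return np.linalg.norm(proj(M1) - proj(M2), 2)

e1 = np.array([[1.0], [0.0], [0.0]])
e2 = np.array([[0.0], [1.0], [0.0]])
assert abs(gap(e1, e1)) < 1e-12          # identical subspaces: gap 0
assert abs(gap(e1, e2) - 1.0) < 1e-12    # orthogonal lines: gap 1
```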

An A-invariant H-Lagrangian subspace \(\mathcal{M}\) is called stable if for every \(\varepsilon > 0\) there is a δ > 0 such that for every pair \((A^{{\prime}},H^{{\prime}})\) with \(A^{{\prime}}\) being \(H^{{\prime}}\)-selfadjoint, and with

$$\displaystyle{\|A - A^{{\prime}}\| +\| H - H^{{\prime}}\| <\delta }$$

there is an \(A^{{\prime}}\)-invariant \(H^{{\prime}}\)-Lagrangian subspace \(\mathcal{M}^{{\prime}}\) such that

$$\displaystyle{\mathrm{gap\,}(\mathcal{M},\mathcal{M}^{{\prime}}) <\varepsilon.}$$

A slightly different concept, with the a priori additional condition on the pair \((A^{{\prime}},H^{{\prime}})\) that there exists an \(A^{{\prime}}\)-invariant \(H^{{\prime}}\)-Lagrangian subspace, is called conditional stability. The following theorem can be found in, e.g., [24]. The notation \(\mathcal{R}(A,\lambda )\) denotes the spectral subspace of A corresponding to the eigenvalue λ.

Theorem 2.

  (i)

    Let A be H-selfadjoint. There exists a stable A-invariant H-Lagrangian subspace if and only if A has no real eigenvalues. In that case, an A-invariant H-Lagrangian subspace \(\mathcal{M}\) is stable if and only if for every eigenvalue λ of A with algebraic multiplicity greater than one, either \(\mathcal{R}(A,\lambda ) \subset \mathcal{M}\) or \(\mathcal{R}(A,\lambda ) \cap \mathcal{M} =\{ 0\}\).

  (ii)

    There exists a conditionally stable A-invariant H-Lagrangian subspace if and only if for every real eigenvalue λ 0 of A the partial multiplicities of A corresponding to λ 0 are all even and the signs in the sign characteristic of the pair (A,H) corresponding to these partial multiplicities are the same (but may differ from eigenvalue to eigenvalue). In that case, an A-invariant H-Lagrangian subspace \(\mathcal{M}\) is conditionally stable if and only if for every eigenvalue λ of A with algebraic multiplicity greater than one, either \(\mathcal{R}(A,\lambda ) \subset \mathcal{M}\) or \(\mathcal{R}(A,\lambda ) \cap \mathcal{M} =\{ 0\}\).

Invariant Maximal Semidefinite Subspaces

For a pair (A, H), where \(H = H^{{\ast}}\) is invertible and A is H-selfadjoint, there always exist an A-invariant maximal H-nonnegative subspace \(\mathcal{M}_{+}\) and an A-invariant maximal H-nonpositive subspace \(\mathcal{M}_{-}\). Typically, there are many such subspaces. An invariant maximal nonnegative, respectively nonpositive, subspace \(\mathcal{M}\) is called stable if for every \(\varepsilon > 0\) there is a δ > 0 such that for every pair \((A^{{\prime}},H^{{\prime}})\) with \(A^{{\prime}}\) being \(H^{{\prime}}\)-selfadjoint and with

$$\displaystyle{\|A - A^{{\prime}}\| +\| H - H^{{\prime}}\| <\delta,}$$

there is an \(A^{{\prime}}\)-invariant maximal \(H^{{\prime}}\)-nonnegative, respectively nonpositive, subspace \(\mathcal{M}^{{\prime}}\) such that

$$\displaystyle{\mathrm{gap\,}(\mathcal{M},\mathcal{M}^{{\prime}}) <\varepsilon.}$$

To state the result on stability of invariant maximal semidefinite subspaces, first the sign condition is introduced. The pair (A, H) is said to satisfy the sign condition if for every real eigenvalue λ of A the signs in the sign characteristic of (A, H) corresponding to Jordan blocks of odd size with eigenvalue λ are all the same, and likewise, the signs corresponding to Jordan blocks of even size with eigenvalue λ are all the same.

In [24] the following theorem is proved.

Theorem 3.

Let A be H-selfadjoint. Then the following are equivalent:

  1.

    there exists a unique A-invariant maximal H-nonnegative (resp. nonpositive) subspace \(\mathcal{M}\) such that \(\sigma (A\vert _{\mathcal{M}})\) is contained in the closed upper half plane,

  2.

    there exists a unique A-invariant maximal H-nonnegative (resp. nonpositive) subspace \(\mathcal{M}\) such that \(\sigma (A\vert _{\mathcal{M}})\) is contained in the closed lower half plane,

  3.

    there exists a stable A-invariant maximal H-nonnegative subspace,

  4.

    the pair (A,H) satisfies the sign condition.

In that case, the unique A-invariant maximal H-nonnegative (resp. nonpositive) subspaces for which \(\sigma (A\vert _{\mathcal{M}})\) is contained in the closed upper half plane, are stable, and likewise the unique A-invariant maximal H-nonnegative (resp. nonpositive) subspaces for which \(\sigma (A\vert _{\mathcal{M}})\) is contained in the closed lower half plane are stable.

In addition, there is a complete description of all stable invariant maximal semidefinite subspaces.

The Algebraic Riccati Equation: A Special Case

A special case is the algebraic Riccati equation with a positive semidefinite coefficient in the quadratic term, that is:

$$\displaystyle{ XBR^{-1}B^{{\ast}}X - XA - A^{{\ast}}X - Q = 0, }$$
(19.5)

where R is positive definite. This special case plays a role in particular in linear quadratic optimal control; in that case Q is positive semidefinite as well. Under certain additional conditions more can be said about the matrix

$$\displaystyle{\mathcal{H} = \left [\begin{array}{*{10}c} A&-BR^{-1}B^{{\ast}} \\ -Q& -A^{{\ast}} \end{array} \right ]}$$

(note that this is in fact the negative of the matrix \(\mathcal{H}\) in the introduction; it is chosen here to use the notation and conventions of the literature in control theory). Observe that \(i\mathcal{H}\) is H-selfadjoint, where H = iJ. To state the results some notions have to be introduced.

The pair of matrices (A, B), where A is an n × n matrix and B is an n × m matrix, is said to be controllable if

$$\displaystyle{\mathrm{rank\,}\left [\begin{array}{*{10}c} B &AB &\cdots &A^{n-1}B \end{array} \right ] = n.}$$

The pair of matrices is said to be stabilizable if there exists an m × n matrix F such that A + BF has all its eigenvalues in the open left half plane. Every controllable pair is stabilizable; this follows from the pole placement theorem of control theory.

The pair of matrices (C, A), where C is p × n and A is n × n, is called observable if \(\cap _{j=0}^{n-1}\mathrm{Ker\,}CA^{j} =\{ 0\}\). The pair of matrices is called detectable if there exists an n × p matrix R such that \(A - RC\) has all its eigenvalues in the open left half plane.
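The controllability and observability conditions are finite rank tests, sketched here on illustrative matrices (the observability test uses the equivalent full-column-rank form of the kernel condition).

```python
import numpy as np

# Rank tests for controllability and observability of the pairs just defined.
def controllable(A, B):
    n = A.shape[0]
    ctrb = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
    return np.linalg.matrix_rank(ctrb) == n

def observable(C, A):
    # (C, A) observable iff the intersection of Ker C A^j, j = 0..n-1, is {0},
    # i.e. the stacked observability matrix has full column rank.
    n = A.shape[0]
    obsv = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])
    return np.linalg.matrix_rank(obsv) == n

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
assert controllable(A, B)
assert observable(C, A)
assert not controllable(A, np.array([[1.0], [0.0]]))   # cannot influence x2
```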

The following result is classical in linear quadratic optimal control.

Proposition 1.

Assume that R is positive definite, Q is positive semidefinite, (A,B) is stabilizable and (Q,A) is detectable. Then the matrix \(\mathcal{H}\) has no pure imaginary eigenvalues, and the Lagrangian invariant subspace corresponding to the eigenvalues in the open left half plane is a graph subspace in the sense that (19.3) is satisfied. Consequently, (19.5) has a stabilizing Hermitian solution.

Combining the above proposition with Theorem 2 it is seen that the stabilizing Hermitian solution is stable under small perturbations of A, B, R, and Q.

Dropping the condition that Q is positive semidefinite, but strengthening the condition on the pair (A, B), still allows one to deduce a very interesting result, due to [14].

Theorem 4.

Assume that R is positive definite, and that (A,B) is controllable. Then any invariant Lagrangian subspace \(\mathcal{M}\) is a graph subspace of the form \(\mathcal{M} =\mathrm{ Im\,}\left [\begin{array}{*{10}c} I\\ X \end{array} \right ]\) for some Hermitian matrix X which is a solution of (19.5). In addition, the matrix \(i\mathcal{H}\) has only even partial multiplicities corresponding to its real eigenvalues, and the signs in the sign characteristic are all one.

Thus, under these conditions, there is a one-to-one relation between invariant Lagrangian subspaces and Hermitian solutions of (19.5). In addition, there is a one-to-one relation between invariant Lagrangian subspaces and \(\mathcal{H}\)-invariant subspaces \(\mathcal{N}\) such that \(\sigma (\mathcal{H}\vert \mathcal{N}) \subset \mathbb{C}_{r}\), where \(\mathbb{C}_{r}\) denotes the open right half plane. Indeed, it can be shown that any \(\mathcal{H}\)-invariant iJ-Lagrangian subspace \(\mathcal{M}\) is of the form

$$\displaystyle{\mathcal{M} = \mathcal{N}\dot{ +} \mathcal{N}_{0}\dot{ +} ((J\mathcal{N})^{\perp }\cap \mathcal{R}(\mathcal{H}, \mathbb{C}_{ l})).}$$

Here \(\mathbb{C}_{l}\) denotes the open left half plane, and \(\mathcal{R}(\mathcal{H}, \mathbb{C}_{l})\) the spectral subspace of \(\mathcal{H}\) corresponding to the open left half plane. Further, \(\mathcal{N}_{0}\) is the (unique) \(\mathcal{H}\)-invariant subspace spanned by the first halves of Jordan chains corresponding to the pure imaginary eigenvalues of \(\mathcal{H}\). See [13, 14, 26]. The description given here of the set of invariant Lagrangian subspaces is reminiscent of the description of Hermitian solutions to the algebraic Riccati equation given in [29].

Combining Theorems 4 and 2 we see that in case R is positive definite and (A, B) is controllable the solutions X for which \(A - BR^{-1}B^{{\ast}}X\) has all its eigenvalues in the closed left half plane are conditionally stable.
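A scalar toy computation illustrates the one-to-one correspondence; it uses the sign convention \(\mathcal{H} = \big[\begin{smallmatrix} A & -BR^{-1}B^{{\ast}} \\ -Q & -A^{{\ast}}\end{smallmatrix}\big]\), for which \(\mathrm{Im\,}[I;X]\) is \(\mathcal{H}\)-invariant exactly when X solves (19.5). With A = 0, B = R = 1, Q = 4 (a controllable pair, Q indefiniteness not needed here), the equation reduces to \(X^{2} = 4\), and each eigenvector of \(\mathcal{H}\) yields one Hermitian solution.

```python
import numpy as np

# Each invariant Lagrangian graph subspace of the Hamiltonian gives a
# Hermitian solution of the Riccati equation; scalar case X^2 = 4.
A, Bc, Q = 0.0, 1.0, 4.0
H = np.array([[A, -Bc * Bc], [-Q, -A]])        # [[A, -BR^{-1}B*], [-Q, -A*]]
eigvals, eigvecs = np.linalg.eig(H)

solutions = sorted((v[1] / v[0]).real for v in eigvecs.T)
assert np.allclose(solutions, [-2.0, 2.0])     # the two Hermitian solutions

# The stabilizing solution (A - BR^{-1}B*X stable) is X = 2; it comes from
# the eigenvector whose eigenvalue lies in the open left half plane.
```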

Inertia of Solutions

Returning to the case where Q is positive semidefinite, write Q = C C, and assume (without loss of generality) that R = I. Thus, consider the Riccati equation:

$$\displaystyle{ XBB^{{\ast}}X - XA - A^{{\ast}}X - C^{{\ast}}C = 0. }$$
(19.6)

Consider also a second indefinite inner product, namely the one given by

$$\displaystyle{J_{1} = \left [\begin{array}{*{10}c} 0&I\\ I & 0 \end{array} \right ].}$$

Note that \(i\mathcal{H}\) is not only iJ-selfadjoint, but also has the property that it is \(-J_{1}\)-dissipative. Indeed,

$$\displaystyle{\frac{1} {2}(J_{1}\mathcal{H}+\mathcal{H}^{{\ast}}J_{ 1}) = \left [\begin{array}{*{10}c} -C^{{\ast}}C & 0 \\ 0 &-BB^{{\ast}} \end{array} \right ].}$$

If X is an Hermitian solution of (19.6), then the subspace \(\mathcal{M} =\mathrm{ Im\,}\left [\begin{array}{*{10}c} I\\ X \end{array} \right ]\) has the following property:

$$\displaystyle{\left \langle J_{1}\left [\begin{array}{*{10}c} I\\ X \end{array} \right ]x,\left [\begin{array}{*{10}c} I\\ X \end{array} \right ]x\right \rangle = 2\langle Xx,x\rangle.}$$

Thus, for example, the Hermitian solution X is nonnegative definite if and only if the subspace \(\mathcal{M}\) is \(J_{1}\)-nonnegative.
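This equivalence can be checked directly: compressing the \(J_{1}\) form to the graph subspace yields a matrix congruent to X. A minimal numerical sketch (assuming NumPy; the matrix X below is an arbitrary Hermitian matrix, not a Riccati solution):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
M = rng.standard_normal((n, n))
X = (M + M.T) / 2                 # an arbitrary real symmetric (Hermitian) X

I = np.eye(n)
J1 = np.block([[np.zeros((n, n)), I], [I, np.zeros((n, n))]])
G = np.vstack([I, X])             # columns span the graph subspace of X

form = G.T @ J1 @ G               # the J1 form compressed to the graph subspace
# The compressed form equals 2X, a positive multiple of X, so the subspace is
# J1-nonnegative exactly when X is nonnegative definite.
print(np.allclose(form, 2 * X))
```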

To describe how the inertia of the solution X is related to the geometry of the subspace \(\mathcal{M}\), the following notations and definitions are needed. First, let \(\mathcal{V}\) be the maximal A-invariant subspace in Ker C. (Note, if the pair (C, A) is observable, then \(\mathcal{V} =\{ 0\}\).) Let \(\mathcal{R}_{l} = \mathcal{R}(A, \mathbb{C}_{l})\) be the spectral subspace of A corresponding to the open left half plane, and likewise \(\mathcal{R}_{r}\) be the spectral subspace of A corresponding to the open right half plane. Denote by \(\mathcal{V}_{r}\), respectively, \(\mathcal{V}_{l}\), the intersections \(\mathcal{V}\cap \mathcal{R}_{r}\) and \(\mathcal{V}\cap \mathcal{R}_{l}\). Introduce also the projection \(P: \mathbb{C}^{2n} \rightarrow \mathbb{C}^{n}\), given by \(P = \left [\begin{array}{*{10}c} I &0 \end{array} \right ]\).

As usual, denote by π(X), respectively ν(X), the number of positive, respectively, negative, eigenvalues of the Hermitian matrix X, and by δ(X) the dimension of Ker X.

With these notations the following result holds (see [16], compare also [30]).

Theorem 5.

Assume that (A,B) is controllable. Let X be a solution of (19.6), and let \(\mathcal{M} =\mathrm{ Im\,}\left [\begin{array}{*{10}c} I\\ X \end{array} \right ]\). Then

$$\displaystyle\begin{array}{rcl} \pi (X)& =& \mathrm{dim\,}(\mathcal{M}\cap \mathcal{R}(\mathcal{H}, \mathbb{C}_{l})) -\mathrm{ dim\,}(\mathcal{M}\cap P^{{\ast}}\mathcal{V}_{ l}), {}\\ \nu (X)& =& \mathrm{dim\,}(\mathcal{M}\cap \mathcal{R}(\mathcal{H}, \mathbb{C}_{r})) -\mathrm{ dim\,}(\mathcal{M}\cap P^{{\ast}}\mathcal{V}_{ r}), {}\\ \delta (X)& =& \mathrm{dim\,}(\mathcal{M}\cap P^{{\ast}}\mathcal{V}). {}\\ \end{array}$$

The Discrete Algebraic Riccati Equation

In optimal control theory for discrete time systems the following quadratic matrix equation plays a role:

$$\displaystyle{ X = A^{{\ast}}XA + Q - A^{{\ast}}XB(R + B^{{\ast}}XB)^{-1}B^{{\ast}}XA, }$$
(19.7)

where one is looking for a Hermitian solution X for which \(A - B(R + B^{{\ast}}XB)^{-1}B^{{\ast}}XA\) has all its eigenvalues in the open unit disc.

Under additional conditions there is a connection between invariant Lagrangian subspaces of a J-unitary matrix and solutions of the discrete algebraic Riccati equation. One of these conditions is the invertibility of the matrix A. To describe the results, introduce the matrix T by

$$\displaystyle{T = \left [\begin{array}{*{10}c} A + BR^{-1}B^{{\ast}}(A^{{\ast}})^{-1}Q&-BR^{-1}B^{{\ast}}(A^{{\ast}})^{-1} \\ -(A^{{\ast}})^{-1}Q & (A^{{\ast}})^{-1} \end{array} \right ],}$$

and the matrix-valued function

$$\displaystyle{\Psi (z) = R+\left [\begin{array}{*{10}c} -B^{{\ast}}(A^{{\ast}})^{-1}Q&B^{{\ast}}(A^{{\ast}})^{-1} \end{array} \right ]\left (zI -\left [\begin{array}{*{10}c} A & 0\\ -(A^{{\ast} } )^{-1 } Q &(A^{{\ast} } )^{-1} \end{array} \right ]\right )^{-1}\left [\begin{array}{*{10}c} B \\ 0 \end{array} \right ].}$$

A direct computation shows that T is iJ-unitary, that is, \(T^{{\ast}}JT = J\). Also, \(\Psi (z)\) has Hermitian values for z on the unit circle.
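Both claims are easy to test numerically. The sketch below (assuming NumPy) uses illustrative matrices and takes J to be the standard skew form \(\left [\begin{array}{*{10}c} 0&I\\ -I&0 \end{array} \right ]\); this choice of J is an assumption here, since J is defined earlier in the chapter, but the symplectic condition is insensitive to its sign.

```python
import numpy as np

# Illustrative data (not from the text); A must be invertible.
A = np.array([[1.2, 0.5], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
Ait = np.linalg.inv(A.T)                       # (A*)^{-1} for real A

T = np.block([
    [A + B @ B.T @ Ait @ Q, -B @ B.T @ Ait],   # R = I, so R^{-1} drops out
    [-Ait @ Q,              Ait],
])
J = np.block([[np.zeros((2, 2)), np.eye(2)],
              [-np.eye(2),       np.zeros((2, 2))]])
print(np.allclose(T.T @ J @ T, J))             # T is iJ-unitary (symplectic)

def Psi(z):
    out = np.hstack([-B.T @ Ait @ Q, B.T @ Ait])
    mid = np.block([[A, np.zeros((2, 2))], [-Ait @ Q, Ait]])
    return R + out @ np.linalg.inv(z * np.eye(4) - mid) @ np.vstack([B, np.zeros((2, 1))])

val = Psi(np.exp(0.7j))                        # a point on the unit circle
print(np.allclose(val, val.conj().T))          # Psi takes Hermitian values there
```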

It can be shown that if X is a Hermitian solution of (19.7), then the graph subspace \(\mathrm{Im\,}\left [\begin{array}{*{10}c} I\\ X \end{array} \right ]\) of X is T-invariant and iJ-Lagrangian. Conversely, if the graph subspace of a matrix X is T-invariant and iJ-Lagrangian, then X is a Hermitian solution of (19.7).
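This correspondence can be illustrated numerically. The sketch below (assuming NumPy and SciPy; A, B, Q, R are illustrative choices, not from the text) computes the stabilizing Hermitian solution of (19.7) with `scipy.linalg.solve_discrete_are` and checks that its graph subspace is mapped into itself by T.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Illustrative data (not from the text): A invertible, (A, B) controllable.
A = np.array([[1.2, 0.5], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

# Stabilizing Hermitian solution of (19.7).
X = solve_discrete_are(A, B, Q, R)

Ait = np.linalg.inv(A.T)                       # (A*)^{-1} for real A
T = np.block([
    [A + B @ B.T @ Ait @ Q, -B @ B.T @ Ait],   # R = I, so R^{-1} drops out
    [-Ait @ Q,              Ait],
])

G = np.vstack([np.eye(2), X])                  # columns span the graph subspace
TG = T @ G
# Invariance: T G must stay inside the column span of G.
print(np.allclose(TG, G @ (np.linalg.pinv(G) @ TG)))
# The closed-loop matrix has all its eigenvalues in the open unit disc.
F = A - B @ np.linalg.inv(R + B.T @ X @ B) @ B.T @ X @ A
print(np.all(np.abs(np.linalg.eigvals(F)) < 1))
```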

Theorem 6.

Assume that A is invertible, (A,B) is controllable, and that there exists a number η on the unit circle such that \(\Psi (\eta )\) is positive definite. Then there exists a Hermitian solution of (19.7) if and only if there exists a T-invariant iJ-Lagrangian subspace. In turn, this is equivalent to the partial multiplicities of T corresponding to eigenvalues on the unit circle being even.

In that case, every T-invariant iJ-Lagrangian subspace is automatically a graph subspace of a Hermitian solution of (19.7), and conversely.

This theorem is one of the motivations of the study of Lagrangian invariant subspaces of matrices that are unitary in an indefinite inner product (see, e.g., [13]), or, for the case where all matrices are real, of matrices that are symplectic in a space with a skew-symmetric inner product. See, e.g., [17].

Key Literature

The following books give a far more exhaustive account of the theory: the book [13] is a good starting point; connections with robust control and \(H_{\infty }\) control may be found in [12, 18] and [10]; the book [2] is a collection of valuable review papers; connections with factorization of rational matrix functions can be found in [1].

Cross-References