
Introduction

Indefinite Linear Algebra is the beginning of the title of the book by Gohberg et al. [14], which is probably the primary source for the theory of finite-dimensional indefinite inner product spaces and is an adaptation and new edition of their earlier monograph [13]. The title concisely describes the two main features that come together in this topic: the theory of indefinite inner products and Linear Algebra in the sense of matrix theory with canonical forms as its powerful tool. Indeed, the additional restriction of a Kreĭn space to be finite-dimensional sometimes allows stronger statements, because in many situations it is sufficient to investigate special representatives in canonical form from a given equivalence class. Therefore, many results in Indefinite Linear Algebra make use of the choice of a particular basis of the original vector space, which is typically identified with \(\mathbb{F}^{n}\). Here and in the following, \(\mathbb{F}\) stands either for the field \(\mathbb{C}\) of complex numbers or for the field \(\mathbb{R}\) of real numbers. Clearly, any real matrix can be interpreted as a complex matrix, and in many circumstances it is advantageous to focus on the complex case only. However, there are several applications in which the matrices under consideration are real.

The aim of this chapter is to summarize research topics in the theory of finite-dimensional indefinite inner product spaces from recent years and, in particular, to establish connections to a completely different area of mathematics: Numerical Analysis or, to be more precise, Numerical Linear Algebra.

After a brief review of some fundamental concepts from Indefinite Linear Algebra in section “Matrices with Symmetries with Respect to an Indefinite Inner Product”, the theory of H-polar decompositions is presented in section “Polar Decompositions and Normal Matrices” as an example of a concept investigated in the theory of finite-dimensional indefinite inner product spaces where the knowledge of Numerical Analysis can be used to construct efficient algorithms for the actual computation of the desired decompositions. On the other hand, the Hamiltonian eigenvalue problem is investigated in section “Hamiltonian Matrices”, and it is highlighted that only a deeper understanding of the sign characteristic, an important invariant of matrices that have symmetry structures with respect to an indefinite inner product, can explain the effects and problems that occur when structure-preserving algorithms are considered in Numerical Analysis. Clearly, these two examples cover only a small part of the currently ongoing research that successfully combines the two areas Indefinite Linear Algebra and Numerical Linear Algebra.

Matrices with Symmetries with Respect to an Indefinite Inner Product

In the following, let \(H \in \mathbb{F}^{n\times n}\) be an invertible matrix satisfying \(H^{{\ast}} = H\) or \(H^{{\ast}} = -H\). Then H defines an indefinite inner product on \(\mathbb{F}^{n}\) via

$$\displaystyle{ [x,y]:= [x,y]_{H}:= (Hx,y) }$$
(18.1)

for all \(x,y \in \mathbb{F}^{n}\), where (⋅, ⋅) denotes the standard Euclidean inner product in \(\mathbb{F}^{n}\). Clearly, if \(\mathbb{F} = \mathbb{C}\) and \(H^{{\ast}} = H\) is Hermitian, then the pair \((\mathbb{C}^{n},[\cdot,\cdot ]_{H})\) is simply a finite-dimensional Kreĭn space. If \(H^{{\ast}} = -H\) is skew-Hermitian, then iH is Hermitian and, therefore, it is actually sufficient to consider the case \(H^{{\ast}} = H\) only, when \(\mathbb{F} = \mathbb{C}\). In the case \(\mathbb{F} = \mathbb{R}\), however, this “trick” is not possible and one has to treat the cases \(H^{{\ast}} = H\) and \(H^{{\ast}} = -H\) separately.

In all cases, the H-adjoint of a matrix \(A \in \mathbb{F}^{n\times n}\) is defined as the unique matrix, denoted by \(A^{[{\ast}]}\), satisfying the identity

$$\displaystyle{[Ax,y] = [x,A^{[{\ast}]}y]}$$

for all \(x,y \in \mathbb{F}^{n}\). It is straightforward to check that \(A^{[{\ast}]} = H^{-1}A^{{\ast}}H\), i.e., \(A^{[{\ast}]}\) is similar to the adjoint \(A^{{\ast}}\) with respect to the standard Euclidean inner product. A matrix \(A \in \mathbb{F}^{n\times n}\) is called H-self-adjoint if \(A^{[{\ast}]} = A\) or, equivalently, if \(A^{{\ast}}H = HA\). Other important matrices with symmetry structures include H-skew-adjoint and H-unitary matrices, which together with their synonyms for the case \(\mathbb{F} = \mathbb{R}\) are compiled in Table 18.1.
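These definitions are easy to experiment with numerically. The following is a minimal sketch (in Python/NumPy; the helper names are ad hoc, not from the literature) that computes the H-adjoint via \(A^{[{\ast}]} = H^{-1}A^{{\ast}}H\) and tests H-self-adjointness:

```python
import numpy as np

def h_adjoint(A, H):
    # A^[*] = H^{-1} A^* H, the adjoint with respect to [x, y] = (Hx, y)
    return np.linalg.solve(H, A.conj().T @ H)

def is_h_selfadjoint(A, H, tol=1e-12):
    # A is H-self-adjoint iff A^[*] = A, i.e., A^* H = H A
    return np.allclose(h_adjoint(A, H), A, atol=tol)

H = np.diag([1.0, -1.0])               # indefinite: signature (1, 1)
A = np.array([[2.0, 1.0],
              [-1.0, 3.0]])            # satisfies A^* H = H A
B = np.array([[0.0, 1.0],
              [0.0, 0.0]])             # B^[*] = [[0, 0], [-1, 0]] != B
print(is_h_selfadjoint(A, H))          # True
print(is_h_selfadjoint(B, H))          # False
```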

Canonical Forms

A change of basis in the space \(\mathbb{F}^{n}\) can be interpreted as a linear transformation \(x\mapsto P^{-1}x\), where \(P \in \mathbb{F}^{n\times n}\) is invertible. If \(A \in \mathbb{F}^{n\times n}\) is a matrix representing a linear transformation in a space equipped with an indefinite inner product induced by the invertible Hermitian matrix \(H \in \mathbb{F}^{n\times n}\), then \(P^{-1}AP\) is the matrix representing the linear transformation with respect to the new basis, and similarly \(P^{{\ast}}HP\) represents the inner product with respect to the new basis. This simple observation motivates the following definition.

Definition 1.

Let \(H_{1},H_{2} \in \mathbb{F}^{n\times n}\) satisfy \(H_{1}^{{\ast}} =\sigma H_{1}\) and \(H_{2}^{{\ast}} =\sigma H_{2}\), where \(\sigma \in \{ +1,-1\}\), and let \(A_{1},A_{2} \in \mathbb{F}^{n\times n}\). Then the pairs \((A_{1},H_{1})\) and \((A_{2},H_{2})\) are called unitarily similar if there exists an invertible matrix \(P \in \mathbb{F}^{n\times n}\) such that

$$\displaystyle{ A_{2} = P^{-1}A_{ 1}P\quad \mbox{ and}\quad H_{2} = P^{{\ast}}H_{ 1}P. }$$
(18.2)

The term unitary similarity was chosen in [14], because the transformation matrix P in (18.2) can be considered as an \((H_{2},H_{1})\)-unitary matrix, i.e., as a matrix satisfying

$$\displaystyle{[Px,Py]_{H_{1}} = [x,y]_{H_{2}}}$$

for all \(x,y \in \mathbb{F}^{n}\). It is straightforward to check that if A has one of the symmetry structures listed in Table 18.1 with respect to \([\cdot,\cdot ]_{H}\), then \(P^{-1}AP\) has the same symmetry structure with respect to \([\cdot,\cdot ]_{P^{{\ast}}HP}\). For all of those matrix classes (or, more precisely, for pairs \((A,H)\)) canonical forms under unitary similarity are available. As an example, the canonical form for the case of H-self-adjoint matrices is presented here; see [14] and also [25], where a connection to the canonical form of Hermitian pencils is made.

Table 18.1 Matrices with symmetry structures with respect to \([\cdot,\cdot ]_{H}\)

\(\mathbb{F} = \mathbb{C}\), \(H^{{\ast}} = H\) | \(\mathbb{F} = \mathbb{R}\), \(H^{T} = H\) | \(\mathbb{F} = \mathbb{R}\), \(H^{T} = -H\)
H-self-adjoint (\(A^{{\ast}}H = HA\)) | H-symmetric | H-skew-Hamiltonian
H-skew-adjoint (\(A^{{\ast}}H = -HA\)) | H-skew-symmetric | H-Hamiltonian
H-unitary (\(A^{{\ast}}HA = H\)) | H-orthogonal | H-symplectic

Let \(\mathcal{J}_{m}(\lambda )\) and \(S_{m}\) denote the m × m upper triangular Jordan block associated with the eigenvalue \(\lambda \in \mathbb{C}\) and the m × m standard involutory permutation matrix (in short called SIP matrix), which has the entry 1 in the \((i,m + 1 - i)\)-positions and zeros elsewhere, respectively.

Theorem 1.

Let \(A,H \in \mathbb{C}^{n\times n}\), where H is Hermitian and invertible and A is H-self-adjoint. Then there exists an invertible matrix \(P \in \mathbb{C}^{n\times n}\) such that

$$\displaystyle\begin{array}{rcl} P^{-1}AP& =& \left (\bigoplus _{ i=1}^{k}\mathcal{J}_{ n_{i}}(\lambda _{i})\right ) \oplus \left (\bigoplus _{j=1}^{\ell}\left [\begin{array}{cc} \mathcal{J}_{n_{k+j}}(\lambda _{k+j})& 0 \\ 0 &\mathcal{J}_{n_{k+j}}(\overline{\lambda }_{k+j}) \end{array} \right ]\right ) {}\\ P^{{\ast}}HP& =& \left (\bigoplus _{ i=1}^{k}\varepsilon _{ i}S_{n_{i}}\right ) \oplus \left (\bigoplus _{j=1}^{\ell}S_{ 2n_{k+j}}\right ), {}\\ \end{array}$$

where \(\lambda _{1},\ldots,\lambda _{k}\) are the real eigenvalues of A, and \(\lambda _{k+1},\ldots,\lambda _{k+\ell}\) are the nonreal eigenvalues of A with positive imaginary part. Moreover, the list \(\varepsilon = (\varepsilon _{1},\ldots,\varepsilon _{k})\) is an ordered set of signs ± 1. The list \(\varepsilon\) is uniquely determined by (A,H) up to permutations of signs corresponding to equal Jordan blocks.

The list \(\varepsilon\) in Theorem 1 is called the sign characteristic of the pair (A, H). Another way of interpreting the sign characteristic is the following: if a fixed eigenvalue λ occurs more than once among the values \(\lambda _{1},\ldots,\lambda _{k}\) of Theorem 1, then the numbers \(n_{i}\) corresponding to the indices i for which \(\lambda _{i} =\lambda\) are called the partial multiplicities of λ. Thus, each partial multiplicity \(n_{i}\) of a real eigenvalue λ can be thought of as coming along with an attached sign + 1 or − 1. In this way, it makes sense to speak of the sign characteristic of a real eigenvalue λ by extracting from the sign characteristic \(\varepsilon\) only the signs attached to partial multiplicities associated with λ.
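For a concrete minimal illustration, consider the pair

$$\displaystyle{A = \left [\begin{array}{cc} 0&0\\ 0&0 \end{array} \right ],\qquad H = \left [\begin{array}{cc} 1& 0\\ 0& - 1 \end{array} \right ].}$$

Here A is H-self-adjoint, and the pair is already in the canonical form of Theorem 1 with k = 2, \(n_{1} = n_{2} = 1\), \(\lambda _{1} =\lambda _{2} = 0\), and \(\varepsilon = (+1,-1)\); that is, the real eigenvalue 0 has the partial multiplicities 1, 1 with attached signs + 1 and − 1.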

Of particular interest in Numerical Linear Algebra is the development of structure-preserving algorithms, i.e., algorithms using unitary similarity transformations that leave the given indefinite inner product (or, more precisely, the corresponding Hermitian matrix H) invariant. Thus, these transformations have to satisfy \(P^{{\ast}}HP = H\), which corresponds exactly to the definition of H-unitary matrices. Therefore, H-unitary transformations are an important special case of unitary similarity transformations in indefinite inner product spaces.

Normal Matrices

In the case of the Euclidean inner product, the class of normal matrices has been intensively studied, because it is a class of matrices that generalizes self-adjoint, skew-adjoint, and unitary matrices, but still shares many important properties with them, like, for example, unitary diagonalizability. Therefore, Gohberg et al. [13] posed the problem of classifying normal matrices in finite-dimensional indefinite inner product spaces. If \(H \in \mathbb{C}^{n\times n}\) is Hermitian and invertible, then a matrix \(N \in \mathbb{C}^{n\times n}\) is called H-normal if N commutes with its H-adjoint, i.e., if \(N^{[{\ast}]}N = NN^{[{\ast}]}\). In contrast to the cases of H-self-adjoint, H-skew-adjoint, and H-unitary matrices, where a complete classification is available, it turned out that the problem of classifying H-normal matrices is wild, i.e., it contains the problem of classifying a commuting pair of matrices under simultaneous similarity [10]. So far, the problem has only been solved for some special cases, namely for inner products with one negative square in [10] (a result later generalized to Pontryagin spaces with one negative square in [26]) and for the case of two negative squares in [18].

Although some successful attempts were made in [11, 12, 29] to restrict the class of H-normal matrices to smaller classes that allow a complete classification, the interest in H-normal matrices decreased for quite some time due to the lack of applications and probably also due to the following fact, established in Proposition 8.1.2 of [14]:

Theorem 2.

Let \(X \in \mathbb{C}^{n\times n}\) be an arbitrary matrix. Then there exists an invertible Hermitian matrix \(H \in \mathbb{C}^{n\times n}\) such that X is H-normal.

From this point of view, H-normal matrices seem to be fairly general and not very special. Nevertheless, it was discovered later that H-normal matrices do play an important role in another topic from the theory of finite-dimensional indefinite inner products: polar decompositions.

H-Polar Decompositions

Polar decompositions in indefinite inner product spaces have gained a lot of attention in recent years. Recall that if \(X \in \mathbb{C}^{n\times n}\) is a matrix, then a factorization X = UA into a unitary matrix \(U \in \mathbb{C}^{n\times n}\) and a positive semidefinite Hermitian matrix \(A \in \mathbb{C}^{n\times n}\) is called a polar decomposition of X; this decomposition is unique if and only if X is nonsingular [19]. If the space \(\mathbb{C}^{n}\) is equipped with an indefinite inner product induced by the invertible Hermitian matrix \(H \in \mathbb{C}^{n\times n}\), then H-polar decompositions can be defined analogously.
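For the Euclidean case, polar decompositions are available in standard software. A small sketch using SciPy's polar routine (which computes the Euclidean polar decomposition, not the H-polar decomposition defined below):

```python
import numpy as np
from scipy.linalg import polar

X = np.array([[2.0, 1.0],
              [0.0, 1.0]])
U, A = polar(X)                                # X = U A, U unitary, A Hermitian psd
print(np.allclose(X, U @ A))                   # True
print(np.allclose(U.conj().T @ U, np.eye(2)))  # True: U is unitary
print(np.linalg.eigvalsh(A) >= 0)              # [ True  True ]: A is psd
```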

Definition 2 (H-polar Decomposition).

Let \(H \in \mathbb{C}^{n\times n}\) be invertible and Hermitian and let \(X \in \mathbb{C}^{n\times n}\). Then a factorization X = UA is called an H-polar decomposition if \(U \in \mathbb{C}^{n\times n}\) is \(H\)-unitary and \(A \in \mathbb{C}^{n\times n}\) is H-self-adjoint.

Following [7], this definition does not impose additional conditions on the H-self-adjoint factor, in contrast to the case of the Euclidean inner product, where semi-definiteness is required. One way to generalize semi-definiteness to indefinite inner product spaces is to require that the H-self-adjoint factor has its spectrum in the open right half-plane; this requirement has been included in the definition of H-polar decompositions in [28], where the factorization was called a generalized polar decomposition. Bolshakov et al. [7], however, suggested other possible generalizations (like, for example, semi-definiteness of HA) and kept the original definition of H-polar decompositions more general.

Applications of H-polar decompositions include linear optics, where an H-polar decomposition in the four-dimensional Minkowski space is computed to check whether a given matrix satisfies the Stokes criterion [6]. They also include H-Procrustes problems, which occur in a branch of mathematics known in psychology as factor analysis or multidimensional scaling: there, an H-polar decomposition in an n-dimensional space with non-Euclidean geometry has to be computed to compare mathematical objects that represent a test person’s opinion on the similarities and dissimilarities of a finite number of given objects [21].

A simple calculation reveals that the problem of finding H-polar decompositions is closely related to the problem of finding H-self-adjoint square roots of certain H-self-adjoint matrices. Indeed, if X = UA is an H-polar decomposition of the matrix \(X \in \mathbb{C}^{n\times n}\), then

$$\displaystyle{X^{[{\ast}]}X = A^{[{\ast}]}U^{[{\ast}]}UA = AU^{-1}UA = A^{2},}$$

i.e., the square of the H-self-adjoint factor equals \(X^{[{\ast}]}X\). Clearly, X and A must also have identical kernels (note that U is invertible), and it turns out that these two conditions are also sufficient; see Theorem 4.1 in [7] and also Lemma 4.1 in [5].

Theorem 3.

Let \(H,X \in \mathbb{C}^{n\times n}\), where H is Hermitian and invertible. Then X admits an H-polar decomposition if and only if there exists an H-self-adjoint matrix \(A \in \mathbb{C}^{n\times n}\) such that \(X^{[{\ast}]}X = A^{2}\) and \(\mathop{\mathrm{ker}}\nolimits X =\mathop{ \mathrm{ker}}\nolimits A\).

In contrast to the Euclidean inner product, H-polar decompositions need not always exist as the following example shows.

Example 1.

Consider the matrices

$$\displaystyle{X = \left [\begin{array}{cc} 0 &1\\ - 1 &1 \end{array} \right ]\quad \mbox{ and}\quad H = \left [\begin{array}{cc} 0&1\\ 1 &0 \end{array} \right ].}$$

Then

$$\displaystyle{X^{[{\ast}]} = \left [\begin{array}{cc} 1 &1 \\ - 1&0 \end{array} \right ]\quad \mbox{ and}\quad X^{[{\ast}]}X = \left [\begin{array}{cc} - 1& 2 \\ 0 & - 1 \end{array} \right ].}$$

If A were an H-self-adjoint square root of \(X^{[{\ast}]}X\), then necessarily \(\sigma (A) \subseteq \{ -i,i\}\). Since the spectrum of an H-self-adjoint matrix is symmetric with respect to the real line, it follows that \(\sigma (A) =\{ -i,i\}\). But then A, being a 2 × 2 matrix with two distinct eigenvalues, would be diagonalizable, and hence so would \(X^{[{\ast}]}X = A^{2}\), which is not the case. Thus, X does not admit an H-polar decomposition.
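The computations in Example 1 are easily verified numerically. In the following sketch (NumPy; names ad hoc), \(X^{[{\ast}]}X\) indeed comes out as a single non-diagonalizable Jordan block with eigenvalue − 1:

```python
import numpy as np

X = np.array([[0.0, 1.0],
              [-1.0, 1.0]])
H = np.array([[0.0, 1.0],
              [1.0, 0.0]])

Xadj = np.linalg.solve(H, X.T @ H)  # X^[*] = H^{-1} X^* H (X is real)
M = Xadj @ X                        # X^[*] X
print(M)                            # [[-1.  2.] [ 0. -1.]]
print(np.linalg.eigvals(M))         # [-1. -1.]: non-diagonalizable Jordan block
```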

In Theorem 4.4 of [7], necessary and sufficient conditions for the existence of an H-polar decomposition of the matrix X were given in terms of the canonical form of the pair \((X^{[{\ast}]}X,H)\). These conditions only refer to the nonpositive real eigenvalues of \(X^{[{\ast}]}X\), so that the following result is obtained as an immediate consequence using the uniqueness of the principal square root of a matrix having no nonpositive real eigenvalues, i.e., the square root whose eigenvalues lie in the open right half-plane ([16], Section 1.7):

Theorem 4.

Let \(X \in \mathbb{C}^{n\times n}\) be such that \(X^{[{\ast}]}X\) does not have nonpositive real eigenvalues. Then X admits a unique generalized polar decomposition, i.e., an H-polar decomposition X = UA where the spectrum of A is contained in the open right half-plane.

However, although Theorem 4.4 of [7] also completely characterizes the existence of H-polar decompositions in the case when \(X^{[{\ast}]}X\) does have nonpositive real eigenvalues, the conditions are rather difficult to check and, therefore, there was a need for other criteria for the existence of H-polar decompositions.

Polar Decompositions and Normal Matrices

Since H-self-adjoint and H-unitary matrices trivially admit H-polar decompositions, it is only natural to ask if matrices from the more general class of H-normal matrices introduced in section “Normal Matrices” do so as well. First steps in this direction were taken in [7], where it was shown in Theorems 5.1 and 5.2 that every nonsingular H-normal matrix admits an H-polar decomposition, and that the same is true for every H-normal matrix if the inner product induced by H has at most one negative square. The complete answer to this problem was given in Corollary 5 of [30].

Theorem 5.

Let \(N \in \mathbb{C}^{n\times n}\) be an H-normal matrix. Then N admits an H-polar decomposition.

As a consequence of this result, an alternative criterion for the existence of H-polar decompositions can be obtained. It is straightforward to check that if the matrix \(X \in \mathbb{C}^{n\times n}\) has an H-polar decomposition X = UA, then \(XX^{[{\ast}]} = UAA^{[{\ast}]}U^{[{\ast}]} = UA^{2}U^{-1}\). Together with the relation \(X^{[{\ast}]}X = A^{2}\) from Theorem 3, this implies that the matrices \(XX^{[{\ast}]}\) and \(X^{[{\ast}]}X\) have the same canonical forms as H-self-adjoint matrices. Kintzel [22] conjectured that this condition is also sufficient, which is indeed the case: if \(\tilde{U}XX^{[{\ast}]}\tilde{U}^{-1} = X^{[{\ast}]}X\) for some H-unitary matrix \(\tilde{U} \in \mathbb{C}^{n\times n}\), then \(\tilde{U}X\) is H-normal and therefore admits an H-polar decomposition \(\tilde{U}X = UA\). But then \(X =\tilde{ U}^{-1}UA\) is an H-polar decomposition of X. These results were summarized in Corollary 6 of [30]:

Theorem 6.

Let \(X \in \mathbb{C}^{n\times n}\). Then X admits an H-polar decomposition if and only if the two pairs \((X^{[{\ast}]}X,H)\) and \((XX^{[{\ast}]},H)\) have the same canonical form.

From this point of view, the class of H-normal matrices has turned out to be useful in the end: it served as an important step in the development of necessary and sufficient conditions for the existence of H-polar decompositions.

Numerical Computation of H-Polar Decompositions

So far, only the theoretical aspects of the theory of H-polar decompositions have been summarized, and the question arises what can be said from a computational point of view, in particular since there is a need for the numerical computation of H-polar decompositions in applications [21]. An important step in this direction was taken in [17], where a connection between H-polar decompositions and the matrix sign function was discovered.

Recall that the sign function for a complex number z lying off the imaginary axis is defined by

$$\displaystyle{\mathop{\mathrm{sign}}\nolimits (z) = \left \{\begin{array}{rl} 1,&\quad \mbox{ if Re}(z) > 0,\\ - 1, &\quad \mbox{ if Re} (z) < 0. \end{array} \right.}$$

The matrix sign function extends this definition to square matrices with no eigenvalues on the imaginary axis; see [16]. If \(X \in \mathbb{C}^{n\times n}\) is a matrix with Jordan canonical form \(X = PJP^{-1}\), where \(J = J_{1} \oplus J_{2}\), the spectrum of \(J_{1} \in \mathbb{C}^{p\times p}\) is contained in the open right half-plane, and the spectrum of \(J_{2} \in \mathbb{C}^{(n-p)\times (n-p)}\) is contained in the open left half-plane, then

$$\displaystyle{\mathop{\mathrm{sign}}\nolimits (X):= P\left [\begin{array}{cc} I_{p}& 0 \\ 0 & - I_{n-p} \end{array} \right ]P^{-1}.}$$

Equivalently, the formula \(\mathop{\mathrm{sign}}\nolimits (X) = X(X^{2})^{-1/2}\) can be used as a definition, generalizing the corresponding formula \(\mathop{\mathrm{sign}}\nolimits (z) = z/(z^{2})^{1/2}\) for complex numbers. The matrix sign function is an important tool in model reduction and in the solution of Lyapunov equations and algebraic Riccati equations; see [20]. Therefore, this matrix function has been studied intensively in the literature and many algorithms for its numerical computation have been suggested; see the survey in [16]. A connection to generalized polar decompositions was established in Corollary 4.4 of [17]:

Theorem 7.

Let \(X \in \mathbb{C}^{n\times n}\) have a generalized polar decomposition X = UA, i.e., the spectrum of A is contained in the open right half-plane. Then

$$\displaystyle{\mathop{\mathrm{sign}}\nolimits \left (\left [\begin{array}{cc} 0 &X\\ X^{[{\ast}] } & 0 \end{array} \right ]\right ) = \left [\begin{array}{cc} 0 &U\\ U^{-1 } & 0 \end{array} \right ].}$$
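To give an impression of how such sign-function methods look in practice, here is a sketch of the classical Newton iteration \(X_{k+1} = \frac{1}{2}(X_{k} + X_{k}^{-1})\) for \(\mathop{\mathrm{sign}}\nolimits (X)\), one of the standard iterations discussed in [16]; it converges quadratically provided \(X_{0}\) has no eigenvalues on the imaginary axis:

```python
import numpy as np

def matrix_sign(X, tol=1e-14, maxit=50):
    # Newton iteration X <- (X + X^{-1}) / 2 for sign(X);
    # requires X to have no eigenvalues on the imaginary axis.
    for _ in range(maxit):
        X_new = 0.5 * (X + np.linalg.inv(X))
        if np.linalg.norm(X_new - X, 1) <= tol * np.linalg.norm(X_new, 1):
            return X_new
        X = X_new
    return X

S = matrix_sign(np.array([[4.0, 1.0],
                          [2.0, -3.0]]))   # one eigenvalue in each half-plane
print(np.allclose(S @ S, np.eye(2)))       # True: sign(X) is an involution
```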

The key impact of Theorem 7 is that it can be used to translate results and iterations for the matrix sign function into corresponding results and iterations for H-polar decompositions, as shown in Theorem 4.6 of [17]:

Theorem 8.

Let \(Z \in \mathbb{C}^{n\times n}\) have the H-polar decomposition Z = UA, where σ(A) is contained in the open right half-plane. Let g be any matrix function of the form \(g(X) = Xh(X^{2})\) for some matrix function h such that \(g(M^{[{\ast}]}) = g(M)^{[{\ast}]}\) for all \(M \in \mathbb{C}^{n\times n}\) and such that the iteration \(X_{k+1} = g(X_{k})\) converges to \(\mathop{\mathrm{sign}}\nolimits (X_{0})\) with order of convergence m whenever \(\mathop{\mathrm{sign}}\nolimits (X_{0})\) is defined. Then the iteration

$$\displaystyle{Y _{k+1} = Y _{k}h(Y _{k}^{[{\ast}]}Y _{ k}),\quad Y _{0} = Z}$$

converges to U with order of convergence m.

The required form of the iteration function g is not restrictive. In fact, all iteration functions in the Padé family have the required form; see [17], Section 5.4. A particular example is the [0∕1] Padé iteration of the form

$$\displaystyle{X_{k+1} = 2X_{k}(I + X_{k}^{2})^{-1},}$$

which is known to converge quadratically to \(\mathop{\mathrm{sign}}\nolimits (X_{0})\) if the start matrix \(X_{0}\) has no eigenvalues on the imaginary axis. Consequently, the iteration

$$\displaystyle{ Y _{k+1} = 2Y _{k}(I + Y _{k}^{[{\ast}]}Y _{ k})^{-1},\quad Y _{ 0} = Z }$$
(18.3)

converges quadratically to the H-unitary polar factor U of Z if Z satisfies the hypothesis of Theorem 8.

Example 2.

Consider the matrices

$$\displaystyle{Z = \left [\begin{array}{cccc} 0& 0 & 0 & 1\\ 0 & - 3 & - 1 & 0 \\ 0& 0 & - 3& - 2\\ 1 & 2 & 0 & 1 \end{array} \right ]\quad \mbox{ and}\quad H = \left [\begin{array}{cccc} 0&0&0&1\\ 0 &0 &1 &0 \\ 0&1&0&0\\ 1 &0 &0 &0 \end{array} \right ].}$$

Then Z admits an H-polar decomposition Z = UA, where

$$\displaystyle{U = \left [\begin{array}{cccc} 0& 0 & 0 &1\\ 0 & - 1 & 0 &0 \\ 0& 0 & - 1&0\\ 1 & 0 & 0 &0 \end{array} \right ]\quad \mbox{ and}\quad A = \left [\begin{array}{cccc} 1&2&0&1\\ 0 &3 &1 &0 \\ 0&0&3&2\\ 0 &0 &0 &1 \end{array} \right ].}$$

Thus, the spectrum of A is contained in the open right half-plane, so that Z satisfies the hypothesis of Theorem 8. Starting the iteration (18.3) with the matrix \(Y_{0} = Z\) results in iterates \(Y_{k}\) with the following absolute errors \(e_{k}:=\| Y_{k} - U\|_{2}\):

k   | 1      | 2      | 3      | 4         | 5         | 6
e_k | 0.6413 | 0.2358 | 0.0281 | 2.2301e-4 | 7.0541e-9 | 2.1687e-16

This illustrates the quadratic convergence to the H-unitary polar factor U, as predicted by Theorem 8. Clearly, once U has been computed, the H-self-adjoint polar factor can be obtained via \(A = U^{-1}Z\).
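The experiment of Example 2 can be reproduced with a few lines of code. The following sketch (Python/NumPy rather than any particular implementation from the literature; the helper h_adjoint is ad hoc) runs iteration (18.3) and prints the errors \(e_{k}\):

```python
import numpy as np

Z = np.array([[0., 0., 0., 1.],
              [0., -3., -1., 0.],
              [0., 0., -3., -2.],
              [1., 2., 0., 1.]])
H = np.fliplr(np.eye(4))               # the SIP matrix S_4 from Example 2
U_exact = np.array([[0., 0., 0., 1.],
                    [0., -1., 0., 0.],
                    [0., 0., -1., 0.],
                    [1., 0., 0., 0.]])

def h_adjoint(Y, H):
    return np.linalg.solve(H, Y.conj().T @ H)

Y, I = Z.copy(), np.eye(4)
for k in range(1, 7):
    Y = 2.0 * Y @ np.linalg.inv(I + h_adjoint(Y, H) @ Y)   # iteration (18.3)
    print(k, np.linalg.norm(Y - U_exact, 2))               # errors e_k

A = np.linalg.inv(U_exact) @ Z         # H-self-adjoint factor A = U^{-1} Z
```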

Although the numerical computation of the generalized polar decomposition is easily achieved by the use of appropriate matrix functions, it remains an open problem to construct algorithms that numerically compute H-polar decompositions when the spectrum of the H-self-adjoint factor is not contained in the open right half-plane. In particular, this includes H-polar decompositions of matrices X for which \(X^{[{\ast}]}X\) has nonpositive real eigenvalues.

Hamiltonian Matrices

A case frequently appearing in applications is that the matrix defining the inner product has the special form

$$\displaystyle{H = J:= \left [\begin{array}{cc} 0 &I_{n} \\ - I_{n}& 0 \end{array} \right ].}$$

In this case, the structured matrices from the last column of Table 18.1 are simply called skew-Hamiltonian, Hamiltonian, and symplectic matrices, respectively. In Numerical Linear Algebra, these terms are also commonly used in the complex case, and this convention is followed in this survey. Consequently, a matrix \(\mathcal{H}\in \mathbb{C}^{2n\times 2n}\) is called Hamiltonian if \(\mathcal{H}^{{\ast}}J + J\mathcal{H} = 0\), i.e., if it is skew-adjoint with respect to the indefinite inner product induced by J. As a direct consequence of Theorem 1, a canonical form for Hamiltonian matrices can be obtained by computing the canonical form of the (iJ)-self-adjoint matrix \(i\mathcal{H}\). This shows that the purely imaginary eigenvalues of Hamiltonian matrices are equipped with a sign characteristic as an additional invariant under unitary similarity.

The Hamiltonian eigenvalue problem, i.e., the problem of finding eigenvalues, eigenvectors, and invariant subspaces for a given Hamiltonian matrix, has been intensively studied in the literature due to a large number of applications. Two of them, namely the solution of algebraic Riccati equations and the stabilization of gyroscopic systems, will be presented in the next two subsections. Further applications include stability radius computation for control systems, \(H_{\infty }\)-norm computation, and passivity-preserving model reduction; see the survey papers [4, 9].

Algebraic Riccati Equations and the Hamiltonian Schur Form

If \(\mathcal{H}\in \mathbb{R}^{2n\times 2n}\) is a Hamiltonian matrix, then it has the block form

$$\displaystyle{ \mathcal{H} = \left [\begin{array}{cc} A& G\\ Q & - A^{T} \end{array} \right ], }$$
(18.4)

where \(A,G,Q \in \mathbb{R}^{n\times n}\) and where G and Q are symmetric. The corresponding algebraic Riccati equation (ARE) for the unknown matrix \(X \in \mathbb{R}^{n\times n}\) takes the form

$$\displaystyle{ Q + XA + A^{T}X - XGX = 0. }$$
(18.5)

The following theorem establishes a close connection between solutions of the ARE and invariant subspaces of the corresponding Hamiltonian matrix; see Theorems 13.1 and 13.2 in [36].

Theorem 9.

Let \(\mathcal{V}\subset \mathbb{C}^{2n}\) be an n-dimensional invariant subspace of the Hamiltonian matrix \(\mathcal{H}\) in (18.4), and let \(X_{1},X_{2} \in \mathbb{C}^{n\times n}\) such that

$$\displaystyle{\mathcal{V} = \mathrm{Im}\left [\begin{array}{c} X_{1} \\ X_{2} \end{array} \right ].}$$

If \(X_{1}\) is invertible, then \(X:= X_{2}X_{1}^{-1}\) is a solution of the corresponding algebraic Riccati equation (18.5), and the eigenvalues of A + GX are exactly the eigenvalues of \(\mathcal{H}\) associated with \(\mathcal{V}\).

Conversely, if \(X \in \mathbb{C}^{n\times n}\) is a solution of the algebraic Riccati equation (18.5), then there exist matrices \(X_{1},X_{2} \in \mathbb{C}^{n\times n}\) with \(X_{1}\) being invertible such that \(X = X_{2}X_{1}^{-1}\) and such that the columns of

$$\displaystyle{\left [\begin{array}{c} X_{1} \\ X_{2} \end{array} \right ]}$$

form a basis of an n-dimensional invariant subspace of the corresponding Hamiltonian matrix (18.4).
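Theorem 9 suggests a simple numerical procedure: compute an ordered Schur form of \(\mathcal{H}\) and read off a basis of the stable invariant subspace. A minimal sketch (assuming exactly n eigenvalues lie in the open left half-plane and that the block \(X_{1}\) is invertible; this is a textbook-style sketch, not a production ARE solver):

```python
import numpy as np
from scipy.linalg import schur

def are_solution(A, G, Q):
    n = A.shape[0]
    Ham = np.block([[A, G],
                    [Q, -A.T]])        # Hamiltonian matrix (18.4)
    # Real Schur form with the eigenvalues in the open left half-plane first.
    T, Zs, sdim = schur(Ham, output='real', sort='lhp')
    assert sdim == n, "expected exactly n stable eigenvalues"
    X1, X2 = Zs[:n, :n], Zs[n:, :n]    # columns of [X1; X2] span the stable subspace
    return X2 @ np.linalg.inv(X1)      # X = X2 X1^{-1} solves (18.5)
```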

The solution of the ARE is related to the construction of optimal feedback controllers for linear time-invariant control systems. However, it was pointed out in [3, 31] that for the construction of optimal feedback controllers the approach via solutions of the ARE can be avoided and the consideration of invariant subspaces of Hamiltonian matrices is already sufficient. Of particular interest are n-dimensional invariant subspaces with eigenvalues in the open left half-plane \(\mathbb{C}^{-}\), because they lead to stabilizing feedback solutions of linear time-invariant control systems; see [24] and [36].

For the solution of the Hamiltonian eigenvalue problem, the preservation of the Hamiltonian structure is an important factor in the development of efficient and accurate algorithms, for two main reasons that basically concern all problems dealing with matrices carrying an additional structure. First, exploiting the structure may result in higher efficiency of the algorithm. For example, computing all eigenvalues of a symmetric matrix using the symmetric QR algorithm requires approximately 10% of the floating point operations needed for computing all eigenvalues of a general matrix using the unsymmetric QR algorithm; see [9, 15]. Second, matrices that are structured with respect to an indefinite inner product typically show a symmetry in the spectrum. For example, the spectrum of a real Hamiltonian matrix \(\mathcal{H}\) is symmetric with respect to both the real and the imaginary axes: if \(\lambda _{0} \in \mathbb{C}\) is an eigenvalue, then so are \(-\lambda _{0},\overline{\lambda }_{0},-\overline{\lambda }_{0}\). (This follows easily from Theorem 1 applied to \((i\mathcal{H},iJ)\) and the fact that the spectrum of a real matrix is symmetric with respect to the real line.) If general similarity transformations are applied to \(\mathcal{H}\), then this eigenvalue symmetry will typically be lost in finite precision arithmetic due to roundoff errors.

Therefore, it was suggested in [33] to use symplectic unitary similarity transformations for Hamiltonian matrices. The property of being symplectic ensures that the similarity transformation preserves the Hamiltonian structure, while the property of being unitary is important for the numerical stability of the resulting algorithms. Now a matrix \(Q \in \mathbb{C}^{2n\times 2n}\) is both symplectic and unitary if and only if it satisfies \(Q^{{\ast}}Q = I\) and \(JQ = (QQ^{{\ast}})JQ = Q(Q^{{\ast}}JQ) = QJ\), which leads to the block form

$$\displaystyle{Q = \left [\begin{array}{cc} Q_{1} & - Q_{2} \\ Q_{2} & Q_{1} \end{array} \right ],}$$

where \(Q_{1}^{{\ast}}Q_{1} + Q_{2}^{{\ast}}Q_{2} = I\) and \(Q_{1}^{{\ast}}Q_{2} = Q_{2}^{{\ast}}Q_{1}\). Paige and Van Loan [33] suggested using symplectic unitary similarity transformations to compute the following variant of the Schur form for a Hamiltonian matrix:

Definition 3.

A Hamiltonian matrix \(\mathcal{H}\in \mathbb{C}^{2n\times 2n}\) is said to be in Hamiltonian Schur form if

$$\displaystyle{ \mathcal{H} = \left [\begin{array}{cc} T & R\\ 0 & - T^{{\ast}} \end{array} \right ], }$$
(18.6)

where \(T \in \mathbb{C}^{n\times n}\) is upper triangular.

A sufficient condition on the eigenvalues of \(\mathcal{H}\) for this form to exist is given in Theorem 3.1 in [33]:

Theorem 10.

Let \(\mathcal{H}\in \mathbb{C}^{2n\times 2n}\) be Hamiltonian. If \(\mathcal{H}\) does not have eigenvalues on the imaginary axis, then there exists a unitary symplectic matrix \(Q \in \mathbb{C}^{2n\times 2n}\) such that

$$\displaystyle{Q^{{\ast}}\mathcal{H}Q = \left [\begin{array}{cc} T & R \\ 0 & - T^{{\ast}} \end{array} \right ]}$$

is in Hamiltonian Schur form. In particular, Q can be chosen so that the eigenvalues of T are in the open left half-plane.

It was observed, however, that the Hamiltonian Schur form does not always exist if the Hamiltonian matrix does have eigenvalues on the imaginary axis. It follows immediately from the block form (18.6) that the algebraic multiplicity of each purely imaginary eigenvalue of \(\mathcal{H}\) must be even, because each purely imaginary diagonal entry λ of T reappears as \(-\overline{\lambda }=\lambda\) on the diagonal of \(-T^{{\ast}}\). This condition is thus necessary but not sufficient. At this point, it is the understanding of the sign characteristic that is needed for a complete answer to the problem of the existence of the Hamiltonian Schur form. The following result links this problem to the existence of a particular J-neutral invariant subspace. (Recall that a subspace \(\mathcal{V}\subseteq \mathbb{C}^{2n}\) is called J-neutral if \(x^{{\ast}}Jy = 0\) for all \(x,y \in \mathcal{V}\).)

Theorem 11.

Let \(\mathcal{H}\in \mathbb{C}^{2n\times 2n}\) be a Hamiltonian matrix. Then the following statements are equivalent.

(1)

    There exists a symplectic matrix \(S \in \mathbb{C}^{2n\times 2n}\) such that \(S^{-1}\mathcal{H}S\) is in Hamiltonian Schur form.

(2)

    There exists a unitary symplectic matrix \(Q \in \mathbb{C}^{2n\times 2n}\) such that \(Q^{{\ast}}\mathcal{H}Q\) is in Hamiltonian Schur form.

(3)

    There exists an n-dimensional subspace of \(\mathbb{C}^{2n}\) that is J-neutral and \(\mathcal{H}\)-invariant.

(4)

    For any purely imaginary eigenvalue λ of \(\mathcal{H}\), the number of odd partial multiplicities corresponding to λ with sign + 1 is equal to the number of odd partial multiplicities corresponding to λ with sign − 1.

The implication (1) ⇒ (2) follows immediately from a QR-like decomposition of symplectic matrices proved in [8]; see also Lemma 3 in [27]. (2) ⇒ (3) is trivial, as the first n columns of Q span an \(\mathcal{H}\)-invariant subspace which is also J-neutral because of \(Q^{{\ast}}JQ = J\). The equivalence (3) ⇔ (4) was proved in Theorem 5.1 in [34] in terms of H-self-adjoint matrices, while (4) ⇒ (1) was proved in Theorem 23 in [27].

Example 3.

Consider the matrices

$$\displaystyle{\mathcal{H} = \left [\begin{array}{cccc} i &1& 1 &0\\ 0 & i & 0 &0 \\ 0&0& i &0\\ 0 &0 & - 1 & i \end{array} \right ]\quad \mbox{ and}\quad P = \frac{1} {\sqrt{2}}\left [\begin{array}{cccc} 2i& 0 & i & 2i\\ 0 & 1 & - i & -i \\ 0 & 0 & 1 & 0\\ 0 & - i & 1 & 1 \end{array} \right ].}$$

Then \(\mathcal{H}\) is a Hamiltonian matrix in Hamiltonian Schur form, and P is the transformation matrix that brings the pair \((i\mathcal{H},iJ)\) into the canonical form of Theorem 1:

$$\displaystyle{P^{-1}(i\mathcal{H})P = \left [\begin{array}{cccc} - 1& 1 & 0 & 0\\ 0 & - 1 & 1 & 0 \\ 0 & 0 & - 1& 0\\ 0 & 0 & 0 & -1 \end{array} \right ],\quad P^{{\ast}}(iJ)P = \left [\begin{array}{cccc} 0&0&1& 0\\ 0 &1 &0 & 0 \\ 1&0&0& 0\\ 0 &0 &0 & -1 \end{array} \right ].}$$

Thus, \(\mathcal{H}\) has the eigenvalue i with partial multiplicities 3 and 1. The partial multiplicity 3 has the sign + 1 and the partial multiplicity 1 has the sign − 1, and thus condition (4) of Theorem 11 is satisfied. This example shows in particular that condition (4) only refers to the number of odd partial multiplicities with a particular sign, but not to their actual sizes.

Although the problem of the existence of the Hamiltonian Schur form is now completely settled, it remains a challenge to design satisfactory numerical methods for Hamiltonian matrices having eigenvalues on the imaginary axis. So far, most structure-preserving algorithms for Hamiltonian matrices are designed for real Hamiltonian matrices without eigenvalues on the imaginary axis; see the survey [4].

Stability of Gyroscopic Systems

A gyroscopic system is a second-order differential equation of the form

$$\displaystyle{ M\ddot{x}(t) + G\dot{x}(t) + Kx(t) = 0, }$$
(18.7)

where \(M,G,K \in \mathbb{R}^{n\times n}\), \(M^{{\ast}} = M\), \(G^{{\ast}} = -G\), and \(K^{{\ast}} = K\); see [23, 35]. Typically, M is positive definite, and by passing to the equivalent system

$$\displaystyle{\ddot{y} + L^{-1}GL^{-{\ast}}\dot{y} + L^{-1}KL^{-{\ast}}y = 0,}$$

where L is the Cholesky factor of M, i.e., \(M = LL^{{\ast}}\), and \(y = L^{{\ast}}x\), one can assume without loss of generality that M = I. In that case, stability of the system can be investigated by computing the eigenvalues of the quadratic matrix polynomial

$$\displaystyle{L(\lambda ) =\lambda ^{2}I +\lambda G + K}$$

or, equivalently, by computing the eigenvalues of the Hamiltonian matrix

$$\displaystyle{ \mathcal{H} = \left [\begin{array}{cc} -\frac{1} {2}G&\frac{1} {4}G^{2} - K \\ I & -\frac{1} {2}G \end{array} \right ], }$$
(18.8)

see [32]. The gyroscopic system is said to be stable if all solutions of (18.7) are bounded for all nonnegative t. Since the eigenvalues of \(\mathcal{H}\) are symmetric with respect to the imaginary axis, it follows that a necessary condition for (18.7) to be stable is that all eigenvalues of L or \(\mathcal{H}\), respectively, lie exactly on the imaginary axis. If in addition all eigenvalues are semisimple (i.e., the algebraic multiplicity is equal to the geometric multiplicity), then this condition is also sufficient; see [35].
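A small sketch of this setup (assuming M = I; the function names are ad hoc, and only the necessary spectral condition is checked, while semisimplicity would require a separate test):

```python
import numpy as np

def gyro_hamiltonian(G, K):
    # Hamiltonian matrix (18.8) linearizing L(lambda) = lambda^2 I + lambda G + K
    n = G.shape[0]
    return np.block([[-0.5 * G, 0.25 * (G @ G) - K],
                     [np.eye(n), -0.5 * G]])

def eigs_on_imaginary_axis(G, K, tol=1e-10):
    # Necessary condition for stability: all eigenvalues purely imaginary.
    ev = np.linalg.eigvals(gyro_hamiltonian(G, K))
    return bool(np.all(np.abs(ev.real) < tol))

G = np.array([[0.0, 2.0],
              [-2.0, 0.0]])   # skew-symmetric gyroscopic term
K = np.eye(2)                 # symmetric positive definite stiffness
print(eigs_on_imaginary_axis(G, K))   # True
```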

A stronger concept is the notion of strong stability; see [23]. The gyroscopic system (18.7) is called strongly stable if it is stable and, in addition, all neighboring systems are stable, i.e., all gyroscopic systems \(\tilde{M}\ddot{x}(t) +\tilde{ G}\dot{x}(t) +\tilde{ K}x(t) = 0\) whose coefficient matrices \(\tilde{M},\tilde{G},\tilde{K}\) are sufficiently close to the coefficient matrices M, G, K of the original system are stable as well. Again, in the case M = I one can assume without loss of generality that also \(\tilde{M} = I\), and hence it is sufficient to consider Hamiltonian matrices that are sufficiently close to the one in (18.8). To conveniently state the following result, which is a special case of Theorem 3.2 in [32], the following terminology is needed.

Definition 4.

Let \(\mathcal{H}\in \mathbb{C}^{2n\times 2n}\) be a Hamiltonian matrix and let λ be a purely imaginary eigenvalue of \(\mathcal{H}\).

(1)

    λ is called an eigenvalue of definite type if all partial multiplicities corresponding to λ have size 1 (i.e., λ is semisimple) and if they all have the same sign.

(2)

    λ is called an eigenvalue of mixed type if it is not an eigenvalue of definite type, i.e., either λ has at least one partial multiplicity exceeding one or else there exist two partial multiplicities corresponding to λ such that one has positive sign and the other has negative sign.

Theorem 12.

Let \(\mathcal{H}\in \mathbb{C}^{2n\times 2n}\) be a Hamiltonian matrix and let λ be a purely imaginary eigenvalue of \(\mathcal{H}\) with algebraic multiplicity p.

(1)

    If λ is of definite type, then there exists an \(\varepsilon > 0\) such that for all Hamiltonian matrices \(\mathcal{E}\in \mathbb{C}^{2n\times 2n}\) with \(\|\mathcal{E}\| <\varepsilon\), the matrix \(\mathcal{H} + \mathcal{E}\) has exactly p eigenvalues \(\lambda _{1},\ldots,\lambda _{p}\) in a small neighborhood of λ, all of which are semisimple and lie on the imaginary axis.

(2)

    If λ is of mixed type, then for any \(\varepsilon > 0\) there exists a Hamiltonian matrix \(\mathcal{E}\in \mathbb{C}^{2n\times 2n}\) with \(\|\mathcal{E}\| =\varepsilon\) such that \(\mathcal{H} + \mathcal{E}\) has eigenvalues with nonzero real part.

A direct consequence of this theorem is the following characterization of strong stability (compare also Theorem 3.2 in [2]).

Corollary 1.

The system (18.7) with M = I is strongly stable if and only if all eigenvalues of the corresponding Hamiltonian matrix \(\mathcal{H}\) of (18.8) are purely imaginary and of definite type.

As an illustration of this characterization, consider the following example; see Example 3.5 in [32].

Example 4.

Consider the Hamiltonian matrices

$$\displaystyle{\mathcal{H}_{1} = \left [\begin{array}{cccc} 0 &1& 0 &0\\ - 1 &0 & 0 &0 \\ 0 &0& 0 &1\\ 0 &0 & - 1 &0 \end{array} \right ],\quad \mathcal{H}_{2} = \left [\begin{array}{cccc} 0 & 0 &1&0\\ 0 & 0 &0 &1 \\ - 1& 0 &0&0\\ 0 & - 1 &0 &0 \end{array} \right ]}$$

which both have two semisimple purely imaginary eigenvalues ± i with algebraic multiplicity 2. One can easily check that the eigenvalues of \(\mathcal{H}_{1}\) are of mixed type while the eigenvalues of \(\mathcal{H}_{2}\) are of definite type.

Figure 18.1 displays the effect of random Hamiltonian perturbations. In a numerical experiment, 1,000 random Hamiltonian matrices \(\mathcal{E}\) (with entries normally distributed with mean 0 and standard deviation 1) were computed in Matlab (Matlab® is a registered trademark of The MathWorks, Inc.) and then normalized to spectral norm 1∕4. Then the eigenvalues of \(\mathcal{H}_{1} + \mathcal{E}\) and \(\mathcal{H}_{2} + \mathcal{E}\) were computed and plotted into the left and right subplot, respectively. The left picture shows that the eigenvalues of the perturbed Hamiltonian matrices form two clouds around the eigenvalues ± i, due to the fact that these eigenvalues are of mixed type. The right picture, however, shows that the eigenvalues of all Hamiltonian perturbations of \(\mathcal{H}_{2}\) stay on the imaginary axis.

Fig. 18.1 Random Hamiltonian perturbations of \(\mathcal{H}_{1}\) and \(\mathcal{H}_{2}\)
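The experiment can be repeated along the following lines (a sketch in Python/NumPy instead of the Matlab code used for the figures; the parameters follow the description above):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_hamiltonian(n):
    # [[A, G], [Q, -A^T]] with G, Q symmetric is a real Hamiltonian matrix
    A = rng.standard_normal((n, n))
    G = rng.standard_normal((n, n)); G = 0.5 * (G + G.T)
    Q = rng.standard_normal((n, n)); Q = 0.5 * (Q + Q.T)
    return np.block([[A, G], [Q, -A.T]])

H2 = np.array([[0., 0., 1., 0.],
               [0., 0., 0., 1.],
               [-1., 0., 0., 0.],
               [0., -1., 0., 0.]])

max_real = 0.0
for _ in range(1000):
    E = random_hamiltonian(2)
    E *= 0.25 / np.linalg.norm(E, 2)   # normalize to spectral norm 1/4
    ev = np.linalg.eigvals(H2 + E)
    max_real = max(max_real, np.abs(ev.real).max())
print(max_real)  # stays at roundoff level: eigenvalues remain purely imaginary
```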

Figure 18.2 displays the same situation when general random perturbations of spectral norm 1∕4 are considered. In both cases the eigenvalues of the perturbed matrices appear in two clouds centered around the original eigenvalues ± i.

Fig. 18.2 Random general (non-Hamiltonian) perturbations of \(\mathcal{H}_{1}\) and \(\mathcal{H}_{2}\)

This example highlights the importance of the theory of indefinite inner products. If numerical algorithms do not exploit the special structure of Hamiltonian matrices, then the strong stability of the gyroscopic system described by the Hamiltonian matrix \(\mathcal{H}_{2}\) cannot be detected, because the sign characteristic of the purely imaginary eigenvalues is ignored. Only structure-preserving algorithms are able to detect such further important properties of structured matrices and of the underlying systems, like strong stability.

Besides the application to strongly stable gyroscopic systems, the effect of the sign characteristic of purely imaginary eigenvalues of Hamiltonian matrices has important applications in the theory of passivation of control systems; see [1].