
1 Introduction

Nonlinear eigenvalue problems of the form

$$\displaystyle{ P(\lambda )x = 0\,,\quad x \in \mathbb{C}^{n},\quad x\neq 0\,, }$$

where P(λ) is an m × n matrix-valued function of a scalar variable λ, are playing an increasingly important role in classical and contemporary applications. The simplest, but still most important among these problems are the polynomial eigenproblems, where P(λ) is an m × n matrix polynomial

$$\displaystyle{ P(\lambda ) =\sum _{ i=0}^{k}\lambda ^{i}A_{ i}\,,\quad A_{i} \in \mathbb{C}^{m\times n}\,. }$$
(12.1)

Such problems arise directly from applications, from finite element discretizations of continuous models, or as approximations to more general nonlinear eigenproblems, as detailed in the survey articles [69, 79]. The trend towards extreme designs, such as high speed trains, optoelectronic devices, micro-electromechanical systems, and “superjumbo” jets such as the Airbus 380, presents a challenge for the computation of the resonant frequencies of these structures as these extreme designs often lead to eigenproblems with poor conditioning.

However, the physics that underlies problems arising from applications can lead to algebraic structure in their mathematical formulation. Numerical methods that preserve this structure keep key qualitative features such as eigenvalue symmetries from being obscured by finite precision error.

A recurring theme throughout much of the work of Volker Mehrmann has been the preservation of structure – in the pursuit of condensed forms, and in the development of numerical algorithms. To quote from the 2004 paper titled Nonlinear Eigenvalue Problems: A Challenge for Modern Eigenvalue Methods by Mehrmann and Voss [69]:

The task of numerical linear algebra then is to design numerical methods that are accurate and efficient for the given problem. The methods should exploit to a maximal extent the sparsity and structure of the coefficient matrices. Furthermore, they should be as accurate as the approximation of the underlying operator problem permits, and they should include error and condition estimates.

One of the most common strategies for solving a polynomial eigenproblem is via a linearization, which replaces the given matrix polynomial P(λ) by a matrix pencil \(L(\lambda ) =\lambda X + Y\) with the same eigenvalues as P. The eigenproblem for L(λ) is then solved with general pencil algorithms like the QZ algorithm, or with methods designed to work effectively on the specific types of pencils produced by the linearization process. If the matrix polynomial has some structure, then the linearization should also have that structure, and the algorithm employed on the linearization should preserve that structure.
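To make the strategy concrete, here is a minimal numerical sketch of the linearize-then-solve approach for a quadratic eigenproblem, using the first Frobenius companion pencil (described in Sect. 12.3). The random matrices, the NumPy/SciPy routines, and the sanity check are our illustrative assumptions, not taken from any cited reference.

```python
import numpy as np
from scipy.linalg import eig

# Toy quadratic eigenproblem P(lam)x = (lam^2 A2 + lam A1 + A0)x = 0.
rng = np.random.default_rng(0)
n = 4
A2, A1, A0 = (rng.standard_normal((n, n)) for _ in range(3))

# First Frobenius companion linearization L(lam) = lam*X + Y, with
# X = diag(A2, I) and Y = [[A1, A0], [-I, 0]].
I, Z = np.eye(n), np.zeros((n, n))
X = np.block([[A2, Z], [Z, I]])
Y = np.block([[A1, A0], [-I, Z]])

# L(lam)z = 0  <=>  -Y z = lam X z: a generalized eigenvalue problem,
# solved by the QZ algorithm underlying scipy.linalg.eig.
eigvals = eig(-Y, X, right=False)

# Sanity check: each finite eigenvalue should make P(lam) nearly singular.
for lam in eigvals[:3]:
    P = lam**2 * A2 + lam * A1 + A0
    print(lam, np.linalg.svd(P, compute_uv=False)[-1])  # smallest singular value ~ 0
```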

The most commonly used linearizations in numerical practice have been the Frobenius companion forms. Although these pencils have many desirable properties, including the extreme ease with which they can be constructed, they have one significant drawback. They do not preserve any of the most important and commonly occurring matrix polynomial structures – Hermitian, alternating, or palindromic. Thus in order to implement the structure preservation principle on the linearization strategy, it is necessary to have more linearizations available, in particular ones that preserve the structure of the original polynomial. It is also useful to have a large palette of easily constructible linearizations, even in the absence of any structure to be preserved. For example, it may be possible to improve numerical accuracy by selecting an “optimal” linearization, but only if there are many linearizations available to choose from.

This essay will illustrate the influence that the structure preservation principle has had on the development of linearizations of matrix polynomials, the impact that our improved understanding of linearizations in general has had on numerical practice, and Mehrmann’s key contributions to these efforts.

2 Basic Concepts

We use \(\mathbb{N}\) to denote the set of nonnegative integers, \(\mathbb{F}\) for an arbitrary field, \(\mathbb{F}[\lambda ]\) for the ring of polynomials in one variable with coefficients from the field \(\mathbb{F}\), and \(\mathbb{F}(\lambda )\) for the field of rational functions over \(\mathbb{F}\).

A matrix polynomial of grade k has the form

$$\displaystyle{ P(\lambda ) =\sum _{ i=0}^{k}\lambda ^{i}A_{ i},\quad \mbox{ with}\quad A_{i} \in \mathbb{F}^{m\times n}. }$$
(12.2)

Here we allow any of the coefficient matrices, including \(A_k\), to be the zero matrix. The degree of a nonzero matrix polynomial retains its usual meaning as the largest integer j such that the coefficient of \(\lambda^j\) in P(λ) is nonzero. The grade of a nonzero matrix polynomial is a choice of integer k at least as large as its degree [22, 59, 61]. It signals that the polynomial is to be viewed as an element of a particular vector space – the \(\mathbb{F}\)-vector space of all matrix polynomials of degree less than or equal to k. Choosing a grade, in effect, specifies the finite-dimensional vector space of discourse.

If m = n and det P(λ) is not the identically zero polynomial, then P(λ) is said to be regular; equivalently, P(λ) is regular if it is invertible when viewed as a matrix with entries in the field of rational functions \(\mathbb{F}(\lambda )\). Otherwise, P(λ) is said to be singular (note that this includes all rectangular matrix polynomials with \(m\neq n\)). The rank of P(λ) is its rank when viewed as a matrix with entries in the field \(\mathbb{F}(\lambda )\), or equivalently, the size of the largest nonzero minor of P(λ). For simplicity, in many cases we may suppress the dependence on λ when referring to a matrix polynomial.

An m × m polynomial E(λ) is said to be unimodular if detE(λ) is a nonzero constant, i.e., E(λ) has an inverse that is also a matrix polynomial [32]. The canonical form of a matrix polynomial P(λ) under a transformation E(λ)P(λ)F(λ) by unimodular matrix polynomials E(λ) and F(λ) is referred to as the Smith form of P(λ). This form was first developed for integer matrices by H.J.S. Smith [76] in the context of solving linear systems of Diophantine equations [51]. It was then extended by Frobenius in [30] to matrix polynomials.

Theorem 1 (Smith form (Frobenius, 1878)[30])

Let P(λ) be an m × n matrix polynomial over an arbitrary field \(\,\mathbb{F}\) . Then there exists \(r \in \mathbb{N}\) , and unimodular matrix polynomials E(λ) and F(λ) over \(\,\mathbb{F}\) of size m × m and n × n, respectively, such that

$$\displaystyle{ E(\lambda )P(\lambda )F(\lambda ) =\mathrm{ diag}(d_{1}(\lambda ),\ldots,d_{\min \{m,n\}}(\lambda )) =: D(\lambda ), }$$
(12.3)

where each \(d_i(\lambda)\) is in \(\mathbb{F}[\lambda ]\), \(d_1(\lambda),\ldots,d_r(\lambda)\) are monic, \(d_{r+1}(\lambda ),\ldots,d_{\min \{m,n\}}(\lambda )\) are identically zero, and \(d_1(\lambda),\ldots,d_r(\lambda)\) form a divisibility chain, that is, \(d_j(\lambda)\) is a divisor of \(d_{j+1}(\lambda)\) for \(j = 1,\ldots,r - 1\). Moreover, the m × n diagonal matrix polynomial D(λ) is unique, and the number r is equal to the rank of P.

The nonzero diagonal elements \(d_j(\lambda)\), \(j = 1,\ldots,r\), in the Smith form D(λ) are called the invariant factors or invariant polynomials of P(λ).
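As a small worked illustration (ours, not taken from [30]), consider \(P(\lambda ) =\mathrm{ diag}(\lambda,\,\lambda -1)\) over \(\mathbb{F} = \mathbb{Q}\). Although P is diagonal, it is not in Smith form, since λ does not divide λ − 1; a short sequence of unimodular row and column operations gives

$$\displaystyle{ E(\lambda )P(\lambda )F(\lambda ) = \left [\begin{array}{cc} - 1 & - 1\\ \lambda -1 & \lambda \end{array} \right ]\left [\begin{array}{cc} \lambda & 0\\ 0 & \lambda -1 \end{array} \right ]\left [\begin{array}{cc} - 1 & 1-\lambda \\ 1 & \lambda \end{array} \right ] = \left [\begin{array}{cc} 1 & 0\\ 0 & \lambda (\lambda -1) \end{array} \right ] = D(\lambda )\,, }$$

with \(\det E(\lambda ) \equiv \det F(\lambda ) \equiv -1\). The invariant polynomials are \(d_1(\lambda ) = 1\) and \(d_2(\lambda ) =\lambda (\lambda -1)\); in the terminology introduced below, the eigenvalues 0 and 1 each have partial multiplicity sequence (0, 1), hence algebraic and geometric multiplicity one.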

The uniqueness of D(λ) in Theorem 1 implies that the Smith form is insensitive to field extensions. In other words, the Smith forms of P(λ) over \(\mathbb{F}\) and over any extension field \(\tilde{\mathbb{F}} \supseteq \mathbb{F}\) are identical. Consequently, the following notions of the partial multiplicity sequences, eigenvalues, and elementary divisors of P(λ) are well-defined.

Definition 1 (Partial Multiplicity Sequences and Jordan Characteristic)

Let P(λ) be an m × n matrix polynomial of rank r over a field \(\mathbb{F}\). For any \(\lambda_0\) in the algebraic closure \(\overline{\mathbb{F}}\), the invariant polynomials \(d_i(\lambda)\) of P, for 1 ≤ i ≤ r, can each be uniquely factored as

$$\displaystyle{ d_{i}(\lambda ) = (\lambda -\lambda _{0})^{\alpha _{i}}\,p_{i}(\lambda )\quad \mbox{ with}\quad \alpha _{i} \geq 0\,,\;p_{i}(\lambda _{0})\neq 0\,. }$$
(12.4)

The sequence of exponents \((\alpha _{1},\alpha _{2},\mathop{\ldots },\alpha _{r})\) for any given \(\lambda _{0} \in \overline{\mathbb{F}}\) satisfies the condition \(0 \leq \alpha _{1} \leq \alpha _{2} \leq \cdots \leq \alpha _{r}\) by the divisibility chain property of the Smith form, and is called the partial multiplicity sequence of P at \(\lambda _{0} \in \overline{\mathbb{F}}\), denoted \(\mathcal{J} (P,\lambda _{0})\). The collection of all the partial multiplicity sequences of P is called the Jordan characteristic of P.

Note that we allow any, even all, of the exponents \(\alpha_i\) in a partial multiplicity sequence \(\mathcal{J} (P,\lambda _{0})\) to be zero. Indeed, this occurs for all but a finite number of \(\lambda _{0} \in \overline{\mathbb{F}}\). The exceptional \(\lambda_0\), those with at least one nonzero entry in \(\mathcal{J} (P,\lambda _{0})\), are of course just the eigenvalues of P(λ).

Definition 2 (Eigenvalues and Elementary Divisors)

A scalar \(\lambda _{0} \in \overline{\mathbb{F}}\) is a (finite) eigenvalue of a matrix polynomial P whenever its partial multiplicity sequence \((\alpha _{1},\alpha _{2},\mathop{\ldots },\alpha _{r})\) is not the zero sequence. The elementary divisors for an eigenvalue \(\lambda_0\) of P are the collection of factors \((\lambda -\lambda _{0})^{\alpha _{i}}\) with \(\alpha_i\neq 0\), including repetitions. The algebraic multiplicity of an eigenvalue \(\lambda_0\) is the sum \(\alpha _{1} +\alpha _{2} + \cdots +\alpha _{r}\) of the terms in its partial multiplicity sequence, while the geometric multiplicity is the number of nonzero terms in this sequence. An eigenvalue \(\lambda_0\) is said to be simple if its algebraic multiplicity is one; \(\lambda_0\) is semisimple if its algebraic and geometric multiplicities are equal, equivalently, if all of the nonzero terms in its partial multiplicity sequence are equal to one.

It is worth noting that defining the eigenvalues of a matrix polynomial via the Smith form subsumes the more restrictive notion of the eigenvalues as the roots of det P(λ), which is completely inadequate for singular matrix polynomials. We also stress the importance of viewing the partial multiplicities of a fixed \(\lambda_0\) as a sequence. In a number of situations, especially for matrix polynomials with structure [58–60], it is essential to consider certain subsequences of partial multiplicities, which can be subtly constrained by the matrix polynomial structure. Indeed, even the zeroes in the partial multiplicity sequences of structured matrix polynomials can sometimes have nontrivial significance [58–60].

Matrix polynomials may also have infinite eigenvalues, with a corresponding notion of elementary divisors at ∞. In order to define the elementary divisors at ∞ we need one more preliminary concept, that of the reversal of a matrix polynomial.

Definition 3 (j-reversal)

Let P(λ) be a nonzero matrix polynomial of degree d ≥ 0. For j ≥ d, the j-reversal of P is the matrix polynomial rev j P given by

$$\displaystyle{ (\mathrm{rev}_{j}P)(\lambda )\,:=\,\lambda ^{j}P(1/\lambda ). }$$
(12.5)

In the special case when j = d, the j-reversal of P is called the reversal of P and is sometimes denoted by just revP.

Definition 4 (Elementary divisors at ∞)

Let P(λ) be a nonzero matrix polynomial of grade k and rank r. We say that \(\lambda_0 = \infty\) is an eigenvalue of P if and only if 0 is an eigenvalue of \(\mathrm{rev}_{k}P\), and the partial multiplicity sequence of P at \(\lambda_0 = \infty\) is defined to be the same as that of the eigenvalue 0 for \(\mathrm{rev}_{k}P\), that is \(\mathcal{J} (P,\infty ):= \mathcal{J} (\mathrm{rev}_{k}P,0)\). If this partial multiplicity sequence is \((\alpha _{1},\alpha _{2},\mathop{\ldots },\alpha _{r})\), then for each \(\alpha_i\neq 0\) we say there is an elementary divisor of degree \(\alpha_i\) for the eigenvalue \(\lambda_0 = \infty\) of P.

If \(P(\lambda ) =\sum _{ i=0}^{k}\lambda ^{i}A_{i}\) has grade k and rank r, then P has an eigenvalue at ∞ if and only if the rank of the leading coefficient matrix \(A_k\) is strictly less than r. For a regular polynomial P this just means that \(A_k\) is singular. Observe that if k > deg P, then \(A_k = 0\) and P necessarily has r elementary divisors at ∞.
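This rank criterion is easy to test numerically. The sketch below is our illustration (NumPy assumed); it uses the fact that the rank of P over \(\mathbb{F}(\lambda )\) equals the rank of P(μ) for all but finitely many points μ, so sampling a few random points recovers the rank with probability one.

```python
import numpy as np

def rev(coeffs):
    """k-reversal of a grade-k matrix polynomial given as [A_0, ..., A_k]."""
    return coeffs[::-1]

def has_eigenvalue_at_infinity(coeffs, n_samples=5):
    """P has an eigenvalue at infinity iff rank(A_k) < rank(P); rank(P) is
    estimated by evaluating P at a few random (generic) points."""
    rank_P = max(
        np.linalg.matrix_rank(sum(mu**i * A for i, A in enumerate(coeffs)))
        for mu in np.random.default_rng(1).standard_normal(n_samples)
    )
    lead = rev(coeffs)[0]          # A_k = constant coefficient of rev_k P
    return np.linalg.matrix_rank(lead) < rank_P

# Grade-2 example: P(lam) = lam^2*diag(1,0) + lam*I + diag(0,1) is regular
# (rank 2), but its leading coefficient has rank 1.
A0, A1, A2 = np.diag([0.0, 1.0]), np.eye(2), np.diag([1.0, 0.0])
print(has_eigenvalue_at_infinity([A0, A1, A2]))   # True
```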

Definition 5 (Spectral Structure of a Matrix Polynomial)

The collection of all the eigenvalues of a matrix polynomial P(λ), both finite and infinite, is the spectrum of P. The collection of all the elementary divisors of P, both finite and infinite, including repetitions, constitutes the spectral structure of P.

The two most frequently used equivalence relations that preserve spectral structure between matrix polynomials are unimodular equivalence and strict equivalence. They can be used only between matrix polynomials of the same size.

Definition 6

A pair of m × n matrix polynomials P and Q over a fixed but arbitrary field \(\mathbb{F}\) are said to be

  (a) Unimodularly equivalent, denoted P ∼ Q, if there exist unimodular matrix polynomials E(λ) and F(λ) over \(\mathbb{F}\) such that E(λ)P(λ)F(λ) = Q(λ);

  (b) Strictly equivalent, denoted \(P \approx Q\), if there exist invertible (constant) matrices E and F over \(\mathbb{F}\) such that E ⋅ P(λ) ⋅ F = Q(λ).

Of these two relations, unimodular equivalence is the more flexible, as it allows the degrees of the two matrix polynomials to differ, while keeping the list of finite elementary divisors invariant. On the other hand, strict equivalence preserves both finite and infinite elementary divisors, but because the degrees of strictly equivalent matrix polynomials have to be identical, this relation can be a bit restrictive.

Recently the relations of extended unimodular equivalence and spectral equivalence have been introduced [22] to facilitate the comparison of matrix polynomials that are of different sizes, including rectangular, and of different grades. The underlying goal is to investigate the extent to which it is possible for such diverse matrix polynomials to share the same spectral structure and the same singular structure. These extended equivalences now open up the possibility of choosing linearizations that can take on any size that “works.” This is in accord with the notion of “trimmed linearizations” studied by Byers, Mehrmann and Xu in [16]. Another important consequence is that one can now easily generalize the notion of (strong) linearization to (strong) quadratification, and indeed to (strong) ℓ-ification [22]!

3 Linearizations

For square matrix polynomials, the notion of linearization plays a central role for both theory and computation.

Definition 7 (Linearization)

An nk ×nk pencil \(L(\lambda ) =\lambda X + Y\) is said to be a linearization for an n × n matrix polynomial P(λ) of grade k if there exist unimodular nk ×nk matrix polynomials E(λ), F(λ) such that

$$\displaystyle{E(\lambda )L(\lambda )F(\lambda ) = \left [\begin{array}{c|c} P(\lambda ) & 0 \\ \hline 0&I_{(k-1)n} \end{array} \right ]_{\mathit{nk}\times \mathit{nk}}\!.}$$

If in addition, \(\mathrm{rev}_{1}L(\lambda ) =\lambda Y + X\) is a linearization of \(\mathrm{rev}_{k}P(\lambda )\), then L is said to be a strong linearization of P.

The key property of any linearization L of P is that L has the same finite elementary divisors as P, while a strong linearization has the same finite and infinite elementary divisors as P. Since there are well-known algorithms for solving the linear eigenproblem, this immediately suggests working with a matrix pencil L that is a strong linearization for P.

The linearizations most used in practice are the first and second Frobenius companion forms \(C_{1}(\lambda ) =\lambda X_{1} + Y _{1}\), and \(C_{2}(\lambda ) =\lambda X_{2} + Y _{2}\), where

$$\displaystyle{ X_{1} = X_{2} =\mathrm{ diag}(A_{k},I_{(k-1)n}), }$$
(12.6a)
$$\displaystyle\begin{array}{rcl} \qquad Y _{1} = \left [\begin{array}{cccc} A_{k-1} & A_{k-2} & \cdots &A_{0} \\ - I_{n}& 0 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & - I_{n}& 0 \end{array} \right ]\!,\quad \mbox{ and}\quad Y _{2} = \left [\begin{array}{cccc} A_{k-1} & - I_{n}&\cdots & 0 \\ A_{k-2} & 0 & \ddots & \vdots \\ \vdots & \vdots & \ddots & - I_{n} \\ A_{0} & 0 &\cdots & 0 \end{array} \right ]\!.& &{}\end{array}$$
(12.6b)

They have several attractive properties:

  • there is a uniform template for constructing them directly from the data in P, using no matrix operations on the coefficients of P,

  • eigenvectors of P are easily recoverable from eigenvectors of the companion forms,

  • they are always strong linearizations for P, no matter whether P is regular or singular.

However, they have one significant drawback – they usually do not reflect the structure that may be present in the original polynomial P.

3.1 Ansatz Spaces

During an extended visit by the first two authors to Berlin in 2003, Mehrmann proposed searching for alternatives to the companion linearizations C 1(λ) and C 2(λ) – alternatives that would share the structure of the parent polynomial P(λ).

In joint work with Mehrmann and Mehl, two large vector spaces of pencils that generalize the first and second Frobenius companion forms were introduced in [55]. Christened \(\mathbb{L}_{1}(P)\) and \(\mathbb{L}_{2}(P)\), where P is a regular matrix polynomial, these spaces were conceived as the collection of all pencils satisfying a certain ansatz, which we now briefly describe. With \(\varLambda:= \left [\begin{array}{ccccc} \lambda ^{k-1},&\lambda ^{k-2},&\mathop{\ldots },&\lambda,&1 \end{array} \right ]^{T}\), where k is the grade of P, define \(\mathbb{L}_{1}(P)\) as the set of all kn ×kn pencils L(λ) satisfying the right ansatz

$$\displaystyle{ L(\lambda ) \cdot (\varLambda \otimes I_{n}) = v \otimes P(\lambda ),\quad \mbox{ for some}\:v \in \mathbb{F}^{k}, }$$
(12.7)

and \(\mathbb{L}_{2}(P)\) as the set of all kn ×kn pencils L(λ) satisfying the left ansatz

$$\displaystyle{ (\varLambda ^{T} \otimes I_{ n}) \cdot L(\lambda ) = w^{T} \otimes P(\lambda ),\quad \mbox{ for some}\:w \in \mathbb{F}^{k}. }$$
(12.8)

A direct calculation shows that \(C_{1}(\lambda ) \in \mathbb{L}_{1}(P)\) with right ansatz vector \(v = e_{1}\), and \(C_{2}(\lambda ) \in \mathbb{L}_{2}(P)\) with left ansatz vector \(w = e_{1}\); a numerical version of this calculation is sketched after the list below. The pencils in these ansatz spaces were shown to have a number of nice properties:

  • like C 1(λ) and C 2(λ), they are all easily constructible from the coefficients of P,

  • eigenvectors of P are easily recoverable; pencils in \(\mathbb{L}_{1}(P)\) reveal right eigenvectors of P, while those in \(\mathbb{L}_{2}(P)\) reveal left eigenvectors,

  • for regular P, almost all pencils in these spaces are strong linearizations for P.
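Here is the promised numerical check of the right ansatz (12.7) for \(C_{1}(\lambda )\) at a few random sample points (our sketch; NumPy assumed, grade k = 4 chosen arbitrarily).

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 3, 4
A0, A1, A2, A3, A4 = (rng.standard_normal((n, n)) for _ in range(5))
I, Z = np.eye(n), np.zeros((n, n))

# C1(lam) = lam*X1 + Y1 as in (12.6):
X1 = np.block([[A4, Z, Z, Z], [Z, I, Z, Z], [Z, Z, I, Z], [Z, Z, Z, I]])
Y1 = np.block([[A3, A2, A1, A0], [-I, Z, Z, Z], [Z, -I, Z, Z], [Z, Z, -I, Z]])

e1 = np.zeros((k, 1)); e1[0] = 1.0
for lam in rng.standard_normal(6):                       # random sample points
    Lam = np.array([[lam**3], [lam**2], [lam], [1.0]])   # Lambda
    P = lam**4 * A4 + lam**3 * A3 + lam**2 * A2 + lam * A1 + A0
    lhs = (lam * X1 + Y1) @ np.kron(Lam, I)   # C1(lam) * (Lambda kron I_n)
    rhs = np.kron(e1, P)                      # e_1 kron P(lam)
    print(np.linalg.norm(lhs - rhs))          # ~ 0 each time
```

Since both sides of (12.7) have entries that are polynomials of degree at most k, agreement at more than k sample points in effect confirms the identity.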

Furthermore, each of these spaces has dimension \(k(k - 1)n^{2} + k\). Thus each represents a relatively large subspace of the full pencil space (which has dimension \(2k^{2}n^{2}\)), and hence is a large source of potential linearizations for P. In fact, these spaces are so large that for any choice of ansatz vector there are many degrees of freedom available for choosing a potential linearization in \(\mathbb{L}_{1}(P)\) or \(\mathbb{L}_{2}(P)\).

The aim of identifying smaller, but interesting subspaces of these ansatz spaces brings the double ansatz subspace \(\mathbb{D}\mathbb{L}(P):= \mathbb{L}_{1}(P) \cap \mathbb{L}_{2}(P)\) into focus. One sees right away that linearizations in \(\mathbb{D}\mathbb{L}(P)\) enjoy a two-sided eigenvector recovery property. But a \(\mathbb{D}\mathbb{L}(P)\)-pencil also has an unexpected feature: its right and left ansatz vectors are identical, with this common vector uniquely determining the pencil. An isomorphism between \(\mathbb{D}\mathbb{L}(P)\) and \(\mathbb{F}^{k}\) now follows, which in turn induces a natural basis for \(\mathbb{D}\mathbb{L}(P)\). As described in [38], a pencil \(\lambda X_i + Y_i\) in this basis has special structure: every \(X_i\) and \(Y_i\) is block diagonal, with the diagonal blocks being block-Hankel. In a surprising twist, a completely different construction of Lancaster [48] dating back to the 1960s is proved to also generate this natural basis for \(\mathbb{D}\mathbb{L}(P)\).

The unique vector \(v \in \mathbb{F}^{k}\) associated with \(L(\lambda ) \in \mathbb{D}\mathbb{L}(P)\) gives us a way to test when L(λ) is a linearization for P, and to show that almost all pencils in \(\mathbb{D}\mathbb{L}(P)\) are linearizations for P.

Theorem 2 (Eigenvalue Exclusion Theorem [55])

Let P(λ) be a regular matrix polynomial of grade k and let \(L(\lambda ) \in \mathbb{D}\mathbb{L}(P)\) with ansatz vector \(v = [v_{1},v_{2},\ldots,v_{k}]^{T} \in \mathbb{F}^{k}\) . Then L(λ) is a linearization for P(λ) if and only if no root of the grade k − 1 scalar polynomial

$$\displaystyle{ q(\lambda ) = v_{1}\lambda ^{k-1} + v_{ 2}\lambda ^{k-2} + \cdots + v_{ k-1}\lambda + v_{k} }$$
(12.9)

is an eigenvalue of P(λ). We include ∞ as one of the possible roots of q(λ), or as one of the possible eigenvalues of P(λ).
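The role of q(λ) is easy to observe numerically. The sketch below (ours) uses the explicit \(\mathbb{D}\mathbb{L}(P)\) template for the quadratic case k = 2 from [55]; when the root of q is an eigenvalue of P, the pencil fails to be a linearization, and numerically one then observes that det(μX + Y) collapses to ≈ 0 at every sample point μ, i.e., the pencil has degenerated into a singular pencil. The random matrices and sample point are placeholders.

```python
import numpy as np
from scipy.linalg import eig

def dl_pencil(A, B, C, v1, v2):
    """DL(P) pencil for P(lam) = lam^2 A + lam B + C with ansatz v = (v1, v2)
    (explicit k = 2 template from [55])."""
    X = np.block([[v1 * A, v2 * A], [v2 * A, v2 * B - v1 * C]])
    Y = np.block([[v1 * B - v2 * A, v1 * C], [v1 * C, v2 * C]])
    return X, Y

rng = np.random.default_rng(3)
n = 3
A, B, C = (rng.standard_normal((n, n)) for _ in range(3))

# One eigenvalue of P, computed from its first companion pencil:
I, Z = np.eye(n), np.zeros((n, n))
lam1 = eig(-np.block([[B, C], [-I, Z]]),
           np.block([[A, Z], [Z, I]]), right=False)[0]

# Here q(lam) = v1*lam + v2.  For v = (1, 1) the root of q is -1, which is
# almost surely not an eigenvalue of a random P, so L is a linearization.
# For v = (1, -lam1) the root of q IS the eigenvalue lam1 of P.
for v1, v2 in [(1.0, 1.0), (1.0, -lam1)]:
    X, Y = dl_pencil(A, B, C, v1, v2)
    mu = 0.12345                               # arbitrary sample point
    print(abs(np.linalg.det(mu * X + Y)))      # ~0 only in the second case
```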

This systematic approach to the construction of linearizations [55] has spurred exciting new research in the area. The ansatz spaces \(\mathbb{L}_{1}(P)\) and \(\mathbb{L}_{2}(P)\) were recently revisited from a new vantage point [80]. By regarding block matrices as a device to record the matrix coefficients of a bivariate matrix polynomial, and by using the concepts of the Bézoutian function and associated Bézout matrix, shorter proofs of the key results in [55] were obtained, while simultaneously generalizing them from regular matrix polynomials expressed in the standard monomial basis to regular polynomials expressed in any degree-graded basis.

What can be said about the pencils in \(\mathbb{L}_{1}(P)\) and \(\mathbb{L}_{2}(P)\) when the n × n matrix polynomial P is singular? As was shown recently in [18], almost all of them are still linearizations for P that now allow easy recovery of the left and right minimal indices and minimal bases of P.

Are there linearizations for P that are not in \(\mathbb{L}_{1}(P)\) or \(\mathbb{L}_{2}(P)\)? Yes! Consider the cubic matrix polynomial \(P(\lambda ) =\lambda ^{3}A_{3} +\lambda ^{2}A_{2} +\lambda A_{1} + A_{0}\). In [6], the pencil

$$\displaystyle{L(\lambda ) =\lambda \left [\begin{array}{ccc} 0&A_{3} & 0 \\ I &A_{2} & 0 \\ 0& 0 &I \end{array} \right ]+\left [\begin{array}{ccc} - I & 0 & 0 \\ 0 & A_{1} & A_{0} \\ 0 & - I & 0 \end{array} \right ]}$$

was shown to be a linearization for P; but L(λ) is neither in \(\mathbb{L}_{1}(P)\) nor \(\mathbb{L}_{2}(P)\), as observed in [55]. We turn next to the discussion of these pencils.

3.2 Fiedler Pencils

Another source of linearizations for matrix polynomials was inspired by a 2003 paper of Fiedler [29], in which he showed that the usual companion matrix C of a monic scalar polynomial \(p(\lambda ) =\lambda ^{k} +\sum _{ i=0}^{k-1}a_{i}\lambda ^{i}\) of degree k can be factored into a product of k sparse matrices \(M_i\), each of which differs only slightly from the k × k identity matrix:

$$\displaystyle{ C = \left [\begin{array}{ccccc} - a_{k-1} & - a_{k-2} & \ldots & - a_{1} & - a_{0} \\ 1 & 0 & \ldots & 0 & 0 \\ 0 & 1 & \ddots & & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \ldots &0 & 1 & 0 \end{array} \right ]\, = M_{k-1}M_{k-2}\cdots M_{0}\,, }$$

where

$$\displaystyle{ M_{j}:= \left [\begin{array}{cccc} I_{k-j-1} & & & \\ & - a_{j}&1& \\ & 1 &0& \\ & & &\;I_{ j-1} \end{array} \right ]\quad \mbox{ for }\;j = 1,\mathop{\ldots },k-1\,,\quad \mbox{ and }\;M_{0}:= \left [\begin{array}{ccccc} I_{k-1} & \\ & - a_{0} \end{array} \right ]. }$$

Fiedler observed that any permutation of the factors M i produces a matrix that is similar to C, and hence also a companion matrix for p(λ). Furthermore, certain permutations produce companion matrices that are of low bandwidth, i.e., pentadiagonal.
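These factorizations are easy to verify computationally. The following self-contained sketch (ours; NumPy assumed, coefficients arbitrary) builds the factors \(M_j\) for a monic scalar polynomial, reproduces the companion matrix C, and exhibits a pentadiagonal reordering.

```python
import numpy as np

def fiedler_factor(j, a):
    """M_j for the monic scalar polynomial lam^k + a[k-1]*lam^(k-1) + ... + a[0]."""
    k = len(a)
    M = np.eye(k)
    if j == 0:
        M[k - 1, k - 1] = -a[0]
    else:
        r = k - j - 1                          # corner of the 2x2 block
        M[r:r + 2, r:r + 2] = [[-a[j], 1], [1, 0]]
    return M

a = np.array([3.0, -1.0, 2.0, 0.5, -2.0])      # a_0, ..., a_4; k = 5
k = len(a)
Ms = [fiedler_factor(j, a) for j in range(k)]

# Descending order reproduces the usual companion matrix:
C = np.linalg.multi_dot([Ms[j] for j in range(k - 1, -1, -1)])
companion = np.zeros((k, k))
companion[0, :] = -a[::-1]                     # -a_{k-1}, ..., -a_0
companion[1:, :-1] = np.eye(k - 1)
print(np.allclose(C, companion))               # True

# An alternate-index ordering gives a pentadiagonal matrix similar to C:
F = np.linalg.multi_dot([Ms[j] for j in range(k - 1, -1, -2)] +
                        [Ms[j] for j in range(k - 2, -1, -2)])
band = max(abs(i - j) for i in range(k) for j in range(k) if abs(F[i, j]) > 0)
print(band)                                    # 2, i.e. pentadiagonal
```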

The first step in extending Fiedler’s results to matrix polynomials was taken by Antoniou and Vologiannidis in [6]. The Fiedler factors are now block matrices:

$$\displaystyle{M_{j}:= \left [\begin{array}{cccc} I_{n(k-j-1)} & & & \\ & - A_{j}&I_{n}& \\ & I_{n} & 0 & \\ & & &\;I_{n(j-1)} \end{array} \right ]\quad \mbox{ for }\;j = 1,\mathop{\ldots },k-1,\quad M_{0}:= \left [\begin{array}{cc} I_{n(k-1)} & \\ & - A_{0} \end{array} \right ],}$$

and one extra block matrix, \(M_{k}:=\mathrm{ diag}[A_{k},I_{n(k-1)}]\), which is needed because matrix polynomials cannot, without loss of generality, be assumed to be monic. For any permutation \(\sigma = (j_{0},j_{1},\mathop{\ldots },j_{k-1})\) of the indices \((0,1,2,\mathop{\ldots },k - 1)\), one can now define the associated Fiedler pencil

$$\displaystyle{ F_{\sigma }(\lambda )\,:=\;\lambda M_{k}\, -\, M_{j_{0}}M_{j_{1}}\cdots M_{j_{k-1}}\,. }$$
(12.10)

Each member of this family of Fiedler pencils was shown in [6] to be a strong linearization when P is a regular matrix polynomial over \(\mathbb{C}\), by demonstrating strict equivalence to the Frobenius companion pencil. The regularity assumption is essential for this proof strategy to work, so to prove that the Fiedler pencils remain strong linearizations when P is singular requires different techniques. This was done in [19], with the restriction on the field lifted. It was also shown that the left and right minimal indices of a singular P are recoverable from any of its Fiedler pencils. Additionally, eigenvectors can be recovered without added computational cost.

Antoniou and Vologiannidis also introduced in [6] a kind of “generalized” Fiedler pencil; exploiting the fact that every M j for \(j = 1,\mathop{\ldots },k - 1\) is invertible, we can “shift” some of the M j factors to the λ-term. For example, \(F_{\sigma }(\lambda )\,:=\;\lambda M_{k}\, -\, M_{j_{0}}M_{j_{1}}\cdots M_{j_{k-1}}\) is strictly equivalent to

$$\displaystyle{\tilde{F}_{\sigma }(\lambda ) =\lambda M_{j_{1}}^{-1}M_{ j_{0}}^{-1}M_{ k}^{}M_{j_{k-1}}^{-1}\, -\, M_{ j_{2}}\cdots M_{j_{k-2}}\,,}$$

so \(\tilde{F}_{\sigma }(\lambda )\) is also a strong linearization. These generalized Fiedler pencils can have additional nice properties, as illustrated by the following example for a general square polynomial P(λ) of degree k = 5.

$$\displaystyle\begin{array}{rcl} S(\lambda )& =& \lambda M_{5}^{}M_{3}^{-1}M_{ 1}^{-1} - M_{ 4}^{}M_{2}^{}M_{0}^{} {}\\ & =& \left [\begin{array}{ccccc} \lambda A_{5} + A_{4} & - I_{n}& & & \\ - I_{n} & 0 & \lambda I_{n} & & \\ & \lambda I_{n} &\lambda A_{3} + A_{2} & - I_{n}& \\ & & - I_{n} & 0 & \lambda I_{n} \\ & & & \lambda I_{n} &\lambda A_{1} + A_{0} \end{array} \right ].{}\\ \end{array}$$

This pencil S(λ) is not only a strong linearization for P(λ), it is also block-tridiagonal. The low bandwidth property of certain Fiedler (and generalized Fiedler) pencils thus opens up the possibility of developing fast algorithms to compute the eigenstructure of high degree matrix polynomials. The eigenvector and minimal basis recovery properties of these generalized Fiedler pencils have been studied in [13].

In more recent work [83], Vologiannidis and Antoniou have extended Fiedler pencils even further, showing that repetitions of the Fiedler factors M i can sometimes be allowed in the construction of F σ (λ) in (12.10), and template-like strong linearizations for P will still be produced. These pencils are sometimes referred to as Fiedler pencils with repetition, and have been shown to be yet another source of structured linearizations [14, 83].

Fiedler pencils have also been shown [21] to be adaptable to rectangular matrix polynomials P. In this case, however, the product representation in (12.10) is no longer tractable, and other techniques for constructing these pencils are required. Each Fiedler pencil now has its own characteristic size as well as block pattern, but each rectangular Fiedler pencil is still a strong linearization for P. This concretely illustrates a distinctive feature of rectangular matrix polynomials, as contrasted with regular (square) matrix polynomials; a rectangular m × n matrix polynomial with \(m\neq n\) always has strong linearizations of many different sizes, while a regular matrix polynomial has strong linearizations of only one possible size. This phenomenon is explored in more detail in [22]. For more on the impact of Fiedler’s work on our understanding of linearizations, see [53].

4 Matrix Polynomial Structures

There are several kinds of algebraic structure commonly encountered in matrix polynomials arising in the analysis and numerical solution of systems of ordinary, partial, and delay differential equations. To define these structures concisely, we introduce the ⋆-adjoint of a matrix polynomial, where the symbol ⋆ denotes transpose T in the real case \(\mathbb{F} = \mathbb{R}\), and either transpose T or conjugate transpose ∗ in the complex case \(\mathbb{F} = \mathbb{C}\). Note that the structures under consideration apply only to square matrix polynomials.

Definition 8 (Adjoint of Matrix Polynomials)

Let \(P(\lambda ) =\sum _{ i=0}^{k}\lambda ^{i}A_{i}\), where \(A_{i} \in \mathbb{F}^{n\times n}\) with \(\mathbb{F} = \mathbb{R}\) or \(\mathbb{C}\), be a matrix polynomial of grade k. Then

$$\displaystyle{ P^{\star }(\lambda ):=\sum _{ i=0}^{k}\lambda ^{i}A_{ i}^{\star } }$$
(12.11)

defines the ⋆-adjoint \(P^{\star}(\lambda)\) of P.

The three most important matrix polynomial structures in applications are

$$\displaystyle\begin{array}{rcl} \mbox{ Hermitian/symmetric:}P^{\star }(\lambda ) = P(\lambda )\,,& &{}\end{array}$$
(12.12)
$$\displaystyle\begin{array}{rcl} \mbox{ $\star $-alternating:}P^{\star }(-\lambda ) = \pm P(\lambda )\,,& &{}\end{array}$$
(12.13)
$$\displaystyle\begin{array}{rcl} \mbox{ and $\star $-palindromic:}\mathrm{rev}P^{\star }(\lambda ) = \pm P(\lambda )\,.& &{}\end{array}$$
(12.14)

Also of interest are skew-symmetric matrix polynomials, defined by \(P^{T}(\lambda ) = -P(\lambda )\), and the following alternative types of alternating and palindromic structure. Letting \(\mathcal{R}\in \mathbb{R}^{n\times n}\) denote an arbitrary involution (i.e., \(\mathcal{R}^{2} = I\)), P(λ) is said to be \(\mathcal{R}C\mathcal{R}\)-palindromic if \(\mathcal{R}\mathop{\mathrm{rev}}\overline{P}(\lambda )\,\mathcal{R} = \pm P(\lambda )\), and \(\mathcal{R}C\mathcal{R}\)-alternating if \(\mathcal{R}\overline{P}(-\lambda )\mathcal{R} = \pm P(\lambda )\); here the C in \(\mathcal{R}C\mathcal{R}\) refers to the conjugation operation in these definitions. The name ⋆-alternating was suggested by Mehrmann and Watkins in [71], because the matrix coefficients of such polynomials strictly alternate between symmetric and skew-symmetric (or Hermitian and skew-Hermitian) matrices.
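In terms of the coefficients in (12.2), these definitions reduce to simple symmetry relations: for example, \(\mathrm{rev}P^{T}(\lambda ) = \pm P(\lambda )\) holds if and only if \(A_{i}^{T} = \pm A_{k-i}\) for all i, and \(P^{T}(-\lambda ) = \pm P(\lambda )\) holds if and only if \((-1)^{i}A_{i}^{T} = \pm A_{i}\) for all i. The following predicates (our sketch; NumPy assumed, tolerance ad hoc) check the two T-structures on a coefficient list.

```python
import numpy as np

def is_t_palindromic(coeffs, sign=+1, tol=1e-12):
    """rev P^T(lam) = sign*P(lam)  <=>  A_i^T = sign*A_{k-i} for all i."""
    k = len(coeffs) - 1
    return all(np.allclose(coeffs[i].T, sign * coeffs[k - i], atol=tol)
               for i in range(k + 1))

def is_t_alternating(coeffs, sign=+1, tol=1e-12):
    """P^T(-lam) = sign*P(lam)  <=>  (-1)^i A_i^T = sign*A_i for all i."""
    return all(np.allclose((-1) ** i * A.T, sign * A, atol=tol)
               for i, A in enumerate(coeffs))

# Example: lam^2*A + lam*B + A^T with B = B^T is T-palindromic.
rng = np.random.default_rng(6)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
print(is_t_palindromic([A.T, B + B.T, A]))     # True
```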

Matrix polynomials (especially quadratic polynomials) with Hermitian structure are well known from the classical problem of vibration analysis, and have been extensively studied for many years [31, 32, 48, 79]. The analysis of rail noise caused by high speed trains also leads to a quadratic eigenproblem (QEP), but one with a complex T-palindromic matrix polynomial. Real and complex T-palindromic QEPs also arise in the numerical simulation of the behavior of periodic surface acoustic wave (SAW) filters [43, 85]. Quadratic eigenproblems with T-alternating polynomials arise in the study of corner singularities in anisotropic elastic materials [7, 8, 70]. Gyroscopic systems [25, 48, 49] also lead to quadratic T-alternating matrix polynomials. Higher degree ∗-alternating and ∗-palindromic polynomial eigenproblems arise in the linear-quadratic optimal control problem; the continuous-time case leads to ∗-alternating polynomials, while the discrete-time problem produces ∗-palindromic ones [15]. The stability analysis of delay-differential equations leads to an \(\mathcal{R}C\mathcal{R}\)-palindromic QEP [28], while a variant of \(\mathcal{R}C\mathcal{R}\)-alternating structure (without conjugation) arises in linear response theory from quantum chemistry [66]. Further details on these and other applications can be found in [52, 69, 79], Chaps. 2 and 3 of this Festschrift, and the references therein.

An important feature of the structured matrix polynomials described above is the special symmetry in their spectra, some instances of which are described in the following result. The proof of this composite theorem may be found in [52] or [56], together with [28].

Theorem 3 (Eigenvalue Pairings of Structured Matrix Polynomials)

Let \(P(\lambda ) =\sum _{ i=0}^{k}\lambda ^{i}A_{i}\) , \(A_k\neq 0\), be a regular matrix polynomial that has one of the palindromic or alternating structures described above. Then the spectrum of P(λ) has the pairing depicted in Table 12.1. Moreover, the algebraic, geometric, and partial multiplicities of the two eigenvalues in each such pair are equal. Note that λ = 0 is included here as a possible eigenvalue, with the reciprocal partner 1∕λ or \(1/\overline{\lambda }\) to be interpreted as the eigenvalue ∞.

Table 12.1 Spectral symmetries

The eigenvalue pairings seen in this theorem are sometimes referred to as symplectic spectrum and Hamiltonian spectrum, because they parallel the eigenvalue structure of symplectic and Hamiltonian matrices. Indeed, this is one of several ways in which palindromic and alternating matrix polynomials may be viewed as generalizations of symplectic and Hamiltonian matrices, respectively. For more on this connection see [52].

Although Theorem 3 says quite a lot about the spectral structure of palindromic and alternating matrix polynomials, there are several issues that are not addressed by this result. For example, do these spectral symmetries still hold in the singular case? And what happens when the spectral pairings degenerate, e.g., at \(\lambda_0 = \pm 1\) for T-palindromic polynomials, and at \(\lambda_0 = 0\) or ∞ for T-alternating polynomials? Are there any additional constraints on the spectra at these degenerate points?

In joint work with Mehrmann [58, 59], these questions were resolved by characterizing the Smith forms for these structure classes using a novel technique based on the properties of compound matrices. This work showed that the eigenvalue pairings found in Theorem 3 do indeed extend to singular polynomials in these classes. Degenerate eigenvalues, however, have some nontrivial fine structure in their admissible Jordan characteristics. The details are somewhat technical, but the main message can be simply stated. For each of these structure classes, the constraints on the admissible spectral structures of odd grade polynomials in a class differ from the constraints on the even grade polynomials in that class. It is interesting to note, though, that this dichotomy between odd and even grade appears only in the fine structure of the partial multiplicities at the degenerate eigenvalues.

Next, the same compound matrix techniques were brought to bear on skew-symmetric matrix polynomials [60]. A characterization of their Smith forms revealed even multiplicity for all elementary divisors, with no odd/even grade dichotomy in the admissible spectral structures.

4.1 Möbius Transformations

A useful investigative tool developed by Mehrmann and his co-authors in the last few years is the extension of linear fractional rational transformations (i.e., Möbius transformations) to transformations that act on matrix polynomials [61]. One of the main motivations for this work is understanding the relationships between different classes of structured matrix polynomials. Clearly such a study can be greatly aided by fashioning transformations that allow results about one structured class to be translated into results about another structured class. This inquiry has its origin in particular examples such as the classical Cayley transformation for converting one matrix structure (e.g., skew-Hermitian or Hamiltonian) into another (unitary or symplectic, respectively). This Cayley transformation was extended from matrices to matrix pencils in [50], and in a 1996 paper by Mehrmann [67]. It was then generalized to matrix polynomials in 2006 by Mehrmann and co-authors [56], where it was shown how palindromic and alternating structures are related via a Cayley transformation of matrix polynomials. The definition of general Möbius transformations in [61] completes this development, providing an important and flexible tool for working with matrix polynomials.

Definition 9 (Möbius Transformation)

Let V be the vector space of all m × n matrix polynomials of grade k over the field \(\,\mathbb{F}\), and let \(A \in GL(2, \mathbb{F})\). Then the Möbius transformation on V induced by A is the map M A : V → V defined by

$$\displaystyle{\mathbf{M}_{A}\left (\sum _{i=0}^{k}B_{ i}\lambda ^{i}\right )(\mu ) =\sum _{ i=0}^{k}B_{ i}\,(a\mu +b)^{i}(c\mu +d)^{k-i},\quad \mbox{ where}\quad A = \left [\begin{array}{cc} \,a&b\, \\ \,c &d\, \end{array} \right ].}$$

It is worth emphasizing that a Möbius transformation acts on graded polynomials, returning polynomials of the same grade (although the degree may increase, decrease, or stay the same, depending on the polynomial). In fact, M A is a linear operator on V. Observe that the Möbius transformations induced by the matrices

$$\displaystyle{ A_{+1}\; =\; \left [\begin{array}{rc} 1&1\\ - 1 &1 \end{array} \right ]\quad \mbox{ and}\quad A_{-1}\; =\; \left [\begin{array}{cr} 1& - 1\\ 1 & 1 \end{array} \right ] }$$

are exactly the Cayley transformations \(\mathcal{C}_{+1}(P)\) and \(\mathcal{C}_{-1}(P)\), respectively, introduced in [56]. Also note that the reversal operation described in Definition 3 is the Möbius transformation M R corresponding to the matrix

$$\displaystyle{R = \left [\begin{array}{cc} 0&1\\ 1 &0 \end{array} \right ].}$$
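Definition 9 is straightforward to implement on a coefficient list, and the defining formula can be sanity-checked against the pointwise identity \(\mathbf{M}_{A}(P)(\mu ) = (c\mu +d)^{k}\,P{\bigl (\mathsf{m}_{A}(\mu )\bigr )}\). The sketch below is ours (NumPy assumed).

```python
import numpy as np

def mobius(coeffs, A):
    """M_A applied to a grade-k matrix polynomial, coeffs = [B_0, ..., B_k];
    returns the coefficients of sum_i B_i (a*mu+b)^i (c*mu+d)^(k-i)."""
    a, b, c, d = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    k = len(coeffs) - 1
    out = [np.zeros_like(coeffs[0], dtype=complex) for _ in range(k + 1)]
    for i, B in enumerate(coeffs):
        p = np.array([1.0 + 0j])          # coeffs of (a*mu+b)^i (c*mu+d)^(k-i),
        for _ in range(i):                # lowest degree first
            p = np.convolve(p, [b, a])
        for _ in range(k - i):
            p = np.convolve(p, [d, c])
        for j, pj in enumerate(p):
            out[j] = out[j] + pj * B
    return out

rng = np.random.default_rng(4)
coeffs = [rng.standard_normal((2, 2)) for _ in range(4)]   # grade k = 3

# The reversal is the Moebius transformation induced by R:
R = np.array([[0.0, 1.0], [1.0, 0.0]])
print(all(np.allclose(Bj, coeffs[3 - j])
          for j, Bj in enumerate(mobius(coeffs, R))))      # True

# Pointwise identity for the Cayley matrix A_{+1}, at one sample point:
A = np.array([[1.0, 1.0], [-1.0, 1.0]])
mu = 0.7
lhs = sum(mu**j * Bj for j, Bj in enumerate(mobius(coeffs, A)))
lam = (A[0, 0] * mu + A[0, 1]) / (A[1, 0] * mu + A[1, 1])
rhs = (A[1, 0] * mu + A[1, 1])**3 * sum(lam**i * Bi
                                        for i, Bi in enumerate(coeffs))
print(np.allclose(lhs, rhs))                                # True
```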

Some of the significant properties of general Möbius transformations proved in [61] include the following:

  1.

    Möbius transformations affect the eigenvalues of P and their partial multiplicity sequences in a simple and uniform way. In particular, if \(\mathsf{m}_{A}^{}(\lambda ) = \frac{\,\,a\lambda +b\,} {\,\,c\lambda +d\,}\) denotes the scalar Möbius function on \(\mathbb{F} \cup \{\infty \}\) corresponding to the matrix \(A ={\bigl [ \begin{matrix}\scriptstyle a&\scriptstyle b \\ \scriptstyle c&\scriptstyle d\end{matrix}\bigr ]} \in \mathit{GL}(2, \mathbb{F})\), then we have that

    $$\displaystyle{ \mathcal{J}{\bigl (\mathbf{M}_{A}^{}(P),\mu _{0}^{}\bigr )} \equiv \mathcal{J}{\bigl (P,\mathsf{m}_{A}^{}(\mu _{0}^{})\bigr )} }$$
    (12.15)

    for any \(\mu _{0} \in \mathbb{F} \cup \{\infty \}\).

  2.

    Eigenvectors are preserved by Möbius transformations, but Jordan chains are not. By (12.15), though, the lengths of Jordan chains are preserved.

  3.

    Möbius transformations preserve minimal indices, and transform minimal bases in a simple and uniform way.

  4.

    Möbius transformations preserve the property of being a strong linearization; that is, if L(λ) is a strong linearization for P(λ), then M A (L) is a strong linearization for M A (P). More generally, Möbius transformations preserve the spectral equivalence relation.

  5.

    Möbius transformations preserve sparsity patterns; for example, if P is upper triangular, then M A (P) is also upper triangular.

For the study of structured matrix polynomials, perhaps the most significant property of all is that Möbius transformations provide a rich source of bijections between classes of structured polynomials that allow us to conveniently transfer intuition and results about one class to another. Important examples include correspondences between

T-palindromic and T-alternating polynomials,

as well as between the three classes of

Hermitian, ∗-palindromic, and ∗-alternating matrix polynomials.

These last correspondences provide an opportunity to transfer over to ∗-palindromic and ∗-alternating polynomials much of the existing wealth of knowledge about Hermitian matrix polynomials, including results about such special subclasses as hyperbolic polynomials, definite polynomials [40], and other types of Hermitian matrix polynomials with all-real spectrum [4].

Finally, it is worth noting that the idea of linear fractional transformations acting on matrix polynomials has been extended even further to more general rational transformations in [73].

5 Structured Linearizations

I’m pickin’ up good vibrations – The Beach Boys

When a matrix polynomial P(λ) has structure, the linearization strategy for solving the associated polynomial eigenproblem has two parts: first find a suitable structured linearization L(λ) for P(λ), and then compute the eigenvalues of L(λ) using a structure-preserving algorithm. Although our focus is on the first part of this strategy, it is important to note that there has also been much work on the development of structure-preserving algorithms for matrix pencils in the last decade. Examples of some of this work can be found in the papers [28, 45, 47, 57, 65, 68, 70, 71, 75], as well as in Chaps. 2 and 3 of this Festschrift.

We now turn to developments of the last decade concerning structure-preserving linearizations, focusing mainly on the “big three” types of structure – Hermitian, palindromic and alternating.

5.1 In Ansatz Spaces

The pencil spaces \(\mathbb{L}_{1}(P)\) and \(\mathbb{L}_{2}(P)\) introduced by Mehrmann and co-authors [55] were shown in a follow-up paper [56] to provide a rich arena in which to look for linearizations with additional properties like structure preservation or improved numerics, thus realizing the original purpose for their development. Subspaces of pencils that inherit the ⋆-palindromic or ⋆-alternating structure of P were identified, a constructive method to generate these structured pencils described, and necessary and sufficient conditions for them to be strong linearizations established.

There is a close connection between the structure of a pencil in \(\mathbb{L}_{1}(P)\) and the structure of its ansatz vectors. Loosely put, if P is palindromic, then a palindromic pencil in \(\mathbb{L}_{1}(P)\) will have a palindromic ansatz vector, while if P is alternating, then an alternating pencil in \(\mathbb{L}_{1}(P)\) will have an alternating ansatz vector. When P is structured, there is also a very close connection between the double ansatz space \(\mathbb{D}\mathbb{L}(P)\) and pencils in \(\mathbb{L}_{1}(P)\) that reflect the structure of P. More precisely, let R be the reverse identity matrix, and Σ a diagonal matrix of alternating signs,

$$\displaystyle{ R:= R_{k} = \left [\begin{array}{ccc} & &1\\ &\iddots & \\ 1& & \end{array} \right ]_{k\times k}\,,\qquad \varSigma:=\varSigma _{k} =\mathrm{ diag}\big((-1)^{k-1},\mathop{\ldots },(-1)^{1},(-1)^{0}\big)\,, }$$
(12.16)

and let \(L(\lambda ) \in \mathbb{L}_{1}(P)\) with ansatz vector v. If P is a palindromic matrix polynomial, e.g., if \(\mathrm{rev}P^{T}(\lambda ) = P(\lambda )\), then

$$\displaystyle{ \mathrm{rev}L^{T}(\lambda )=L(\lambda )\,\Longleftrightarrow\,{\Bigl (Rv=v,\;\:\mbox{ and}\;\:(R \otimes I)L(\lambda ) \in \mathbb{D}\mathbb{L}(P)\;\mbox{ with ansatz vector}\;v\Bigr )}. }$$

So to find a palindromic pencil in \(\mathbb{L}_{1}(P)\), begin with a palindromic ansatz vector. There is a unique pencil in \(\mathbb{D}\mathbb{L}(P)\) corresponding to that vector, which can be explicitly constructed using the natural basis for \(\mathbb{D}\mathbb{L}(P)\) mentioned in Sect. 12.3.1 and described in detail in [56]. Reversing the order of the block rows of that \(\mathbb{D}\mathbb{L}(P)\)-pencil then turns it into a palindromic pencil in \(\mathbb{L}_{1}(P)\). Will this pencil be a linearization for P? The Eigenvalue Exclusion Theorem (Theorem 2), stated in Sect. 12.3.1 and proved in [55], determines whether the answer is yea or nay. If the answer is yea, and P is regular, then this linearization is automatically also a strong linearization [55].
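As a concrete illustration of this recipe, consider the quadratic case k = 2 (a worked sketch of ours, assembled from the \(\mathbb{D}\mathbb{L}(P)\) template of [55], rather than an example taken verbatim from [56]). Let \(P(\lambda ) =\lambda ^{2}A_{2} +\lambda A_{1} + A_{0}\) be T-palindromic, so that \(A_{0} = A_{2}^{T}\) and \(A_{1} = A_{1}^{T}\). The palindromic ansatz vector \(v = (1,1)^{T}\) determines a unique pencil \(L(\lambda ) \in \mathbb{D}\mathbb{L}(P)\), and reversing its block rows gives

$$\displaystyle{ (R \otimes I_{n})\,L(\lambda )\; =\;\lambda \left [\begin{array}{cc} A_{2} & A_{1} - A_{0} \\ A_{2} & A_{2} \end{array} \right ] + \left [\begin{array}{cc} A_{0} & A_{0} \\ A_{1} - A_{2} & A_{0} \end{array} \right ]\!, }$$

which is readily checked to be T-palindromic using \(A_{0}^{T} = A_{2}\) and \(A_{1}^{T} = A_{1}\). Here \(q(\lambda ) =\lambda +1\), so by Theorem 2 this pencil is a (strong) linearization for P exactly when − 1 is not an eigenvalue of P – a first glimpse of the “problematic eigenvalue” phenomenon discussed in Sect. 12.5.1.1.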

On the other hand, if P is alternating, say \(P^{T}(-\lambda ) = P(\lambda )\), then for \(L(\lambda ) \in \mathbb{L}_{1}(P)\) we have

$$\displaystyle{ L^{T}(-\lambda )=L(\lambda )\,\Longleftrightarrow\,{\Bigl (\varSigma v=v,\;\:\mbox{ and}\;\:(\varSigma \otimes I)L(\lambda ) \in \mathbb{D}\mathbb{L}(P)\;\mbox{ with ansatz vector}\;v\Bigr )}, }$$

which, as in the palindromic case detailed before, can be used mutatis mutandis to construct an alternating linearization for P. Similar results were proved for the other flavors of palindromicity and alternation, and concrete examples given in [56].

An unexpected property of \(\mathbb{D}\mathbb{L}(P)\) itself was proved in [38]. Consider the block transpose of a block matrix, defined as follows.

Definition 10

The block transpose of a block k × ℓ matrix A with m × n blocks \(A_{ij}\) is the block ℓ × k matrix \(A^{\mathcal{B}}\) with m × n blocks \((A^{\mathcal{B}})_{\mathit{ij}} = A_{\mathit{ji}}\).

Now consider the subspace \(\mathbb{B}(P)\) of all block symmetric (with respect to n × n blocks) pencils in \(\mathbb{L}_{1}(P)\), that is,

$$\displaystyle{\mathbb{B}(P):=\{\lambda X + Y \in \mathbb{L}_{1}(P)\::\: X^{\mathcal{B}} = X,\:Y ^{\mathcal{B}} = Y \}.}$$

Then for any P, the subspaces \(\mathbb{B}(P)\) and \(\mathbb{D}\mathbb{L}(P)\) are identical! Thus pencils in \(\mathbb{D}\mathbb{L}(P)\) always have block symmetric coefficients, even when there is no structure in the matrix coefficients of P. What happens when P is structured? As shown in [38], when P is symmetric, the collection of all symmetric pencils in \(\mathbb{L}_{1}(P)\) is exactly \(\mathbb{D}\mathbb{L}(P)\), while for Hermitian P the Hermitian pencils in \(\mathbb{L}_{1}(P)\) form a proper (but nontrivial) subspace \(\mathbb{H}(P) \subset \mathbb{D}\mathbb{L}(P)\).

Among Hermitian matrix polynomials, perhaps the most important are those with all-real spectrum [4]. This includes the definite polynomials, a class of Hermitian polynomials introduced in [40] as a common generalization for hyperbolic polynomials and definite pencils. In this setting, the natural structured linearization question is whether every definite Hermitian polynomial has a linearization that is a definite pencil. This is answered affirmatively in [40]; indeed, it is shown that a Hermitian matrix polynomial P is definite if and only if it has a definite linearization in \(\mathbb{H}(P)\), the set of Hermitian pencils in \(\mathbb{L}_{1}(P)\). Thus we see that \(\mathbb{L}_{1}(P)\) is rich enough to provide a structured-preserving (strong) linearization for any definite Hermitian polynomial. It is also worth noting that the results in [40] had a significant impact on the later characterization results of [4].

The double ansatz space has also appeared as the star player in other structured settings. The stability analysis of time-delay systems leads to a palindromic polynomial eigenproblem [28] with an involutory twist – the n × n complex matrix polynomial P(λ) in this problem satisfies

$$\displaystyle{ \mathcal{R}\cdot \mathrm{ rev}\overline{P}(\lambda ) \cdot \mathcal{R} = P(\lambda )\,, }$$
(12.17)

where \(\mathcal{R}\) is a real involution (i.e., \(\mathcal{R}^{2} = I_{n}\)), thus making P an \(\mathcal{R}C\mathcal{R}\)-palindromic matrix polynomial in the sense described in Sect. 12.4. In order to find structured linearizations in this context, the first issue is to specify an appropriate class of structured pencils to search in; in other words, a suitable involution on the space of nk ×nk pencils must be chosen. In [28] it is shown that the block anti-diagonal matrix \(\widehat{\mathcal{R}}:= R_{k} \otimes \mathcal{R}\), where R k is the k × k backwards identity matrix as in (12.16), gives a compatible choice of involution. With this choice of involution, it now follows that the right ansatz vector \(v \in \mathbb{C}^{k}\) of any \(\widehat{\mathcal{R}}C\widehat{\mathcal{R}}\)-palindromic pencil in \(\mathbb{L}_{1}(P)\) must satisfy \(Rv = \overline{v}\). For any such vector v, there are many \(\widehat{\mathcal{R}}C\widehat{\mathcal{R}}\)-palindromic pencils in \(\mathbb{L}_{1}(P)\) with this right ansatz vector, exactly one of which will also be in \(\mathbb{D}\mathbb{L}(P)\). These results, along with a constructive procedure to build these structured \(\mathbb{D}\mathbb{L}(P)\)-pencils, were presented in [28], where they were also extended to the other variants of \(\mathcal{R}C\mathcal{R}\)-structure mentioned in Sect. 12.4 by using the linearization theory and techniques developed in [55, 56].

The techniques developed in [56] had an impact on eigenvalue computations occurring in the vibration analysis of rail tracks under excitation from high speed trains [42, 46]; see also Chap. 3 of this Festschrift. This eigenvalue problem has the form

$$\displaystyle{ \left (\lambda A(\omega ) + B(\omega ) + \frac{1} {\lambda } A(\omega )^{T}\right )x = 0, }$$
(12.18)

where A, B are large, sparse, parameter-dependent, complex square matrices, with B complex symmetric, and A highly singular. Clearly, for any fixed value of ω, multiplying (12.18) by λ leads to a T-palindromic eigenvalue problem. Solving this problem directly with the QZ-algorithm without respecting its structure resulted in erroneous eigenvalues. However, the use of a T-palindromic linearization from [56] allowed structured deflation of the zero and infinite eigenvalues. The computed frequencies were now accurate to within the range of the discretization error. Thus we see that the computation of “good vibrations” is aided by the use of “good linearizations.”

5.1.1 Problematic Eigenvalues

For regular matrix polynomials P, the pencils in \(\mathbb{D}\mathbb{L}(P)\) have repeatedly shown themselves to be prolific sources of structured linearizations. However, pencils in \(\mathbb{D}\mathbb{L}(P)\) have one significant drawback. Because of the eigenvalue exclusion property described in Theorem 2, for any \(L(\lambda ) \in \mathbb{D}\mathbb{L}(P)\) there is always at least one “problematic eigenvalue” that may prevent L from being a linearization for P; these problematic eigenvalues are just the roots of the scalar polynomial q(λ) in (12.9), associated with the ansatz vector v for L.

In many situations, this obstruction to \(L(\lambda ) \in \mathbb{D}\mathbb{L}(P)\) being a linearization can be easily side-stepped simply by shifting consideration to a different pencil in \(\mathbb{D}\mathbb{L}(P)\), since almost every pencil in \(\mathbb{D}\mathbb{L}(P)\) is a linearization. However, in a structured setting, where the goal is to find a structured linearization, this problematic eigenvalue obstruction sometimes cannot be avoided, no matter what pencil in \(\mathbb{D}\mathbb{L}(P)\) is used.

Consider, for example, the case of a T-palindromic matrix polynomial P of any even grade k ≥ 2. As described in Sect. 12.5.1, any T-palindromic pencil in \(\mathbb{L}_{1}(P)\) is strictly equivalent to a pencil in \(\mathbb{D}\mathbb{L}(P)\) possessing a palindromic ansatz vector v, i.e., a \(v \in \mathbb{F}^{k}\) such that Rv = v. But the scalar polynomial q(λ) in (12.9) corresponding to any such v is necessarily palindromic of odd grade k − 1, and thus must always have − 1 as a root, since \(q(-1) = (\mathrm{rev}\,q)(-1) = (-1)^{k-1}q(-1) = -q(-1)\). Consequently, any T-palindromic polynomial P of even grade that has the eigenvalue \(\lambda _{0} = -1\) will never have any structured linearization in \(\mathbb{L}_{1}(P)\)!

This phenomenon of having an unavoidable problematic eigenvalue obstruction to the existence of any structured linearizations in \(\mathbb{L}_{1}(P)\) occurs for other structures in addition to T-palindromic structure (see [56]). However, it is significant to note that this is only known to occur for structured polynomials of even grade.

5.2 Among Fiedler Pencils

Modified versions of the generalized Fiedler pencils and Fiedler pencils with repetition described in Sect. 12.3.2 have shown themselves to be particularly valuable sources not just of structure-preserving linearizations, but of structured companion forms. Here by the term “companion form” we mean a template for producing a pencil associated to each matrix polynomial P of some fixed size and grade that

  • is constructed directly from the matrix coefficients of P, without any matrix operations on these coefficients, and

  • produces a strong linearization for every polynomial P of the given size and grade (both regular and singular if the polynomials are square).

Every Fiedler and generalized Fiedler pencil is a companion form in this sense; by contrast, none of the pencils in \(\mathbb{D}\mathbb{L}(P)\) is ever a companion form because of Theorem 2.

A companion form is said to be structured with respect to a class \(\mathcal{C}\) of matrix polynomials if for every \(P \in \mathcal{C}\), the associated companion pencil is also in \(\mathcal{C}\). Thus we might have Hermitian companion forms, palindromic companion forms, and so on. Structured companion forms derived from generalized Fiedler pencils have appeared in a number of papers [6, 14, 20, 58–60, 83], for a variety of structure classes, including Hermitian, T-palindromic, and T-alternating matrix polynomials.

Here are some simple examples from those papers. Suppose

$$\displaystyle{P(\lambda )\, =\,\lambda ^{5}A_{ 5} +\lambda ^{4}A_{ 4} + \cdots +\lambda A_{1} + A_{0}}$$

is a general n × n polynomial of grade 5. Then in [58] it is shown that the block-tridiagonal pencil template

$$\displaystyle{ \mathcal{S}_{P}(\lambda )\; =\; \left [\begin{array}{ccccc} \lambda A_{1} + A_{0} & \lambda I & & & \\ \lambda I &0& I & & \\ &I &\lambda A_{ 3} + A_{2} & \lambda I & \\ & & \lambda I & 0 & I \\ & & &I &\lambda A_{5} + A_{4} \end{array} \right ]_{5n\times 5n} }$$
(12.19)

is a companion form for the set of all n × n matrix polynomials of grade 5. Note that \(\mathcal{S}_{P}(\lambda )\) in (12.19) is a simplified version of an example that first appeared in [6]. It is clear how \(\mathcal{S}_{P}(\lambda )\) can be extended to a companion form for any other odd grade. Also noteworthy is that \(\mathcal{S}_{P}(\lambda )\) is not just a companion form; it is both a symmetric and a Hermitian companion form, i.e., if P is symmetric (Hermitian), then \(\mathcal{S}_{P}\) will also be symmetric (Hermitian). Many more symmetric and Hermitian companion forms can be constructed by the methods developed in [83].
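The template (12.19) is easy to assemble programmatically for any odd grade. The sketch below (ours; NumPy assumed) builds \(\mathcal{S}_{P}(\lambda ) =\lambda X + Y\) and confirms the inherited symmetry.

```python
import numpy as np

def sp_pencil(coeffs):
    """Block tridiagonal companion form (12.19), odd grade k;
    coeffs = [A_0, ..., A_k].  Returns (X, Y) with S_P(lam) = lam*X + Y."""
    k = len(coeffs) - 1
    assert k % 2 == 1
    n = coeffs[0].shape[0]
    I = np.eye(n)
    X = np.zeros((k * n, k * n))
    Y = np.zeros((k * n, k * n))
    for b in range(0, k, 2):        # block (b, b) is lam*A_{b+1} + A_b
        X[b*n:(b+1)*n, b*n:(b+1)*n] = coeffs[b + 1]
        Y[b*n:(b+1)*n, b*n:(b+1)*n] = coeffs[b]
    for b in range(0, k - 1, 2):    # lam*I and I off-diagonal blocks
        X[b*n:(b+1)*n, (b+1)*n:(b+2)*n] = I       # lam*I, right of A-block
        X[(b+1)*n:(b+2)*n, b*n:(b+1)*n] = I       # lam*I, below A-block
        Y[(b+1)*n:(b+2)*n, (b+2)*n:(b+3)*n] = I   # I, right of 0 block
        Y[(b+2)*n:(b+3)*n, (b+1)*n:(b+2)*n] = I   # I, below 0 block
    return X, Y

# If all A_i are symmetric, S_P inherits the symmetry:
rng = np.random.default_rng(5)
coeffs = [(M + M.T) / 2 for M in (rng.standard_normal((2, 2)) for _ in range(6))]
X, Y = sp_pencil(coeffs)
print(np.allclose(X, X.T), np.allclose(Y, Y.T))   # True True
```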

Pre-multiplying \(\mathcal{S}_{P}\) by a certain diagonal ± 1 matrix (a strict equivalence) now immediately produces a T-even companion form

$$\displaystyle{\mathcal{E}_{P}(\lambda )\; =\; \left [\begin{array}{ccccc} \lambda A_{1} + A_{0} & \lambda I & & & \\ -\lambda I & 0 & - I & & \\ & - I & -\lambda A_{3} - A_{2} & -\lambda I & \\ & & \lambda I & 0 & I\\ & & & I &\lambda A_{ 5} + A_{4} \end{array} \right ]_{5n\times 5n},}$$

as shown in [58]. Pre-multiplication of \(\mathcal{S}_{P}\) by \(R_{k} \otimes I_{n}\) (another strict equivalence) reverses the order of the block rows, giving

$$\displaystyle{\mathcal{P}_{P}(\lambda )\; =\; \left [\begin{array}{ccccc} & & &I &\lambda A_{5} + A_{4} \\ & & \lambda I &0& I\\ &I &\lambda A_{ 3} + A_{2} & \lambda I & \\ \lambda I & 0 & I & & \\ \lambda A_{1} + A_{0} & \lambda I & & & \end{array} \right ]_{5n\times 5n},}$$

which is a T-palindromic companion form [59]. Many more palindromic companion forms are constructed in [14] and [20], all for odd grade polynomials. Indeed, all the known structured companion forms arising from Fiedler pencils are for odd grade matrix polynomials.

The lack of any Fiedler-based structured companion forms for even grade polynomials is curious; is this just an oddity of Fiedler pencils, or is it a sign of some intrinsic limitation on all pencils?

5.3 Existence: Leave It to Smith

“The first thing to do,” said Psmith, “is to ascertain that such a place as Clapham Common really exists. One has heard of it, of course, but has its existence ever been proved? I think not.”

P.G. Wodehouse, Psmith in the City [84]

Several phenomena now contribute to the suspicion that structured even grade polynomials may be intrinsically “harder” to linearize (at least by a structured companion form) than structured matrix polynomials of odd grade. Among these are the plenitude of Fiedler-based structured companion forms for odd grade as contrasted with the absence of any known for even grade; another is the presence of “problematic eigenvalues” that block the existence of any structured linearization in the ansatz spaces for certain even grade structured matrix polynomials. The resolution of this issue was finally achieved by the detailed investigation of the Smith forms of various types of structured matrix polynomials in the Smith form trilogy [58–60], described at the end of Sect. 12.4.

A structured companion form for even grade would be able to simultaneously provide a structured linearization for every structured polynomial of that even grade. But the Smith form results of [58] and [59] show that the admissible Jordan characteristics of even and odd grade polynomials in the palindromic (or the alternating) structure class are not the same. Consequently, for each structure class there are always structured polynomials of each even grade whose elementary divisor structure is incompatible with that of every pencil in that structure class. This elementary divisor incompatibility thus precludes the existence of any structured companion form for any even grade, for either palindromic or alternating matrix polynomials.

The existence or non-existence of Hermitian or symmetric companion forms for even grades cannot be settled by a similar argument; for these structures there are no comparable elementary divisor incompatibilities between even and odd grade. Nonetheless, the impossibility of such structured companion forms for even grades has recently been shown in [22]; the argument given there is based on minimal index incompatibilities between even and odd grade structured polynomials that are singular.

The impossibility of any even grade structured companion form, for any of these three most important structure classes, suggests that a reduction to a spectrally equivalent quadratic matrix polynomial might be a more natural alternative to linearization for even grade structured polynomials. This is one motivation to investigate the possibility of structure-preserving quadratifications, as part of a wider investigation of the properties of quadratic matrix polynomials [23, 54], of quadratifications more generally, and of algorithms that work directly on a quadratic polynomial, without any intervening linearization. Some initial work in this direction can be found in [44] for palindromic structure. From a characterization of the possible elementary divisor and singular structures of quadratic palindromic polynomials [24], it has recently been shown that every even grade palindromic polynomial has a palindromic (strong) quadratification. Similar results are now known to hold for even grade alternating polynomials [24], and for even grade Hermitian matrix polynomials [63].

6 Impact on Numerical Practice

In order to analyze the numerical properties of algorithms for the polynomial eigenproblem, both left and right eigenvectors of a matrix polynomial P must be considered. In this context, then, the polynomial eigenproblem is more properly formulated as

$$\displaystyle{ P(\lambda )x\, =\, 0,\quad y^{{\ast}}P(\lambda )\, =\, 0\,, }$$
(12.20)

where x ≠ 0 is a right eigenvector, and y ≠ 0 is a left eigenvector for P(λ). For this analysis, it is also usually assumed that P is regular, which we do throughout this section. The associated generalized eigenvalue problem

$$\displaystyle{ L(\lambda )z\, =\, 0,\quad w^{{\ast}}L(\lambda )\, =\, 0 }$$
(12.21)

for a linearization L of P can now be solved using standard techniques and readily available software. In particular, if the size of L is not very large, dense transformation-based methods can be used to solve (12.21), such as the QZ algorithm [72], or a structure-preserving algorithm when L is a structured linearization [28, 47, 57, 65, 75, 79]. Krylov methods can be used for large sparse problems [9, 64, 68, 70, 79]. Among the infinitely many linearizations L of P, we are interested in those which preserve the structure, if any, and whose right and left eigenvectors permit easy recovery of the corresponding eigenvectors of P. So all the linearizations described in Sects. 12.3 and 12.5 are obvious candidates.

The introduction of these new structured and unstructured linearizations in the last decade has led not only to the development of structure-preserving algorithms, but also to techniques for analyzing the influence of the linearization process on the accuracy and stability of the computed solution, which in turn guide the choice of linearization. To indicate the key idea, we assume that P is expressed in the monomial basis as in (12.1). Let x and y denote right and left eigenvectors of P, and let z and w denote right and left eigenvectors of L, all corresponding to a simple, nonzero, finite eigenvalue λ. Eigenvalue condition numbers are given, in the 2-norm, by the following expressions [78, Thm. 5]:

$$\displaystyle{\kappa _{P}^{}(\lambda ) = \frac{{\bigl (\sum _{i=0}^{k}\vert \lambda \vert ^{i}\|A_{i}\|_{2}\bigr )}\|y\|_{2}\|x\|_{2}} {\vert \lambda \vert \vert y^{{\ast}}P'(\lambda )x\vert },\quad \kappa _{L}^{}(\lambda ) = \frac{{\bigl (\vert \lambda \vert \|X\|_{2} + \|Y \|_{2}\bigr )}\|w\|_{2}\|z\|_{2}} {\vert \lambda \vert \vert w^{{\ast}}L'(\lambda )z\vert }.}$$

These condition numbers measure the sensitivity of the eigenvalue λ of P and L, respectively, to small perturbations of P and L measured in a normwise relative fashion. Different linearizations of the same matrix polynomial can have widely varying eigenvalue condition numbers. Unless the block structure of the linearization is respected (and it is not by standard algorithms), the conditioning of the larger linear problem can be worse than that of the original matrix polynomial, since the class of admissible perturbations is larger. For example, eigenvalues that are well-conditioned for P(λ) may be ill-conditioned for L(λ) [39, 41, 78]. Ideally, when solving (12.20) via (12.21) we would like to have \(\kappa _{P}^{}(\lambda ) \approx \kappa _{L}^{}(\lambda )\). Most linearizations in Sects. 12.3 and 12.5 satisfy one-sided factorizations of the form

$$\displaystyle{ L(\lambda )F(\lambda ) = G(\lambda )P(\lambda ),\quad E(\lambda )L(\lambda ) = P(\lambda )H(\lambda ), }$$
(12.22)

where \(G(\lambda )\), \(H^{T}(\lambda )\), \(F(\lambda )\), and \(E^{T}(\lambda )\) are kn × n matrix functions. Assume that F(λ) has full rank in a neighborhood of a finite eigenvalue λ of P and L, and that \(y:= G(\lambda )^{{\ast}}w\neq 0\). Then it follows from (12.22) that z = F(λ)x is a right eigenvector of L, y is a left eigenvector of P, and \(w^{{\ast}}L'(\lambda )z = y^{{\ast}}P'(\lambda )x\) (see [34, Lemma 3.2]), so that

$$\displaystyle{ \frac{\kappa _{L}^{}(\lambda )} {\kappa _{P}^{}(\lambda )} = \frac{\vert \lambda \vert \|X\|_{2} + \|Y \|_{2}} {\sum _{j=0}^{k}\vert \lambda \vert ^{j}\|A_{j}\|_{2}} \cdot \frac{\|w\|_{2}\|z\|_{2}} {\|y\|_{2}\|x\|_{2}}. }$$
(12.23)

This expression can now be used to investigate the size of the ratio \(\kappa _{L}^{}(\lambda )/\kappa _{P}^{}(\lambda )\) as L varies, for fixed P, where the L-dependent terms are X, Y, w, and z. This is done for example in [39] for pencils \(L \in \mathbb{D}\mathbb{L}(P)\), where minimization of the ratio over L is considered.
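As a concrete illustration of these formulas (our own sketch, not taken from [39] or [78]), the following computes the ratio \(\kappa _{L}(\lambda )/\kappa _{P}(\lambda )\) for a random quadratic and its first Frobenius companion form. Here x is recovered from the right eigenvector \(z = (\lambda x^{T},x^{T})^{T}\) of L, and y is the first block of the left eigenvector w, as follows from \(w^{{\ast}}L(\lambda ) = 0\).

```python
import numpy as np
from scipy.linalg import eig, norm

n = 5
rng = np.random.default_rng(1)
A0, A1, A2 = (rng.standard_normal((n, n)) for _ in range(3))

I, Z = np.eye(n), np.zeros((n, n))
# First Frobenius companion form L(lam) = lam*X + Y of the quadratic.
X = np.block([[A2, Z], [Z, I]])
Y = np.block([[A1, A0], [-I, Z]])

lam, W, V = eig(-Y, X, left=True, right=True)
dP = lambda l: 2 * l * A2 + A1                      # P'(lambda)

ratios = []
for l, w, z in zip(lam, W.T, V.T):
    x, y = z[n:], w[:n]          # eigenvectors of P recovered from those of L
    pol = sum(abs(l)**i * norm(Ai, 2) for i, Ai in enumerate((A0, A1, A2)))
    kP = pol * norm(y) * norm(x) / (abs(l) * abs(y.conj() @ dP(l) @ x))
    kL = ((abs(l) * norm(X, 2) + norm(Y, 2)) * norm(w) * norm(z)
          / (abs(l) * abs(w.conj() @ X @ z)))       # L'(lambda) = X
    ratios.append(kL / kP)
print(f"kappa_L / kappa_P ranges over [{min(ratios):.2g}, {max(ratios):.2g}]")
```

Running such an experiment for several linearizations of the same P makes the variation of the ratio across the spectrum, and across linearizations, directly visible.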

Backward errors characterize the stability of a numerical method for solving a problem by measuring how far the problem has to be perturbed for an approximate solution to be an exact solution of the perturbed problem. Let (x, λ) be an approximate right eigenpair for P(λ) obtained from an approximate right eigenpair (z, λ) for \(L(\lambda ) =\lambda X + Y\). The relative backward errors for (x, λ) and (z, λ) are given in the 2-norm by

$$\displaystyle\begin{array}{rcl} \eta _{P}^{}(x,\lambda ) = \frac{\|P(\lambda )x\|_{2}} {\left (\sum _{i=0}^{k}\vert \lambda ^{i}\vert \|A_{i}\|_{2}\right )\|x\|_{2}},\quad \eta _{L}^{}(z,\lambda ) = \frac{\|L(\lambda )z\|_{2}} {\left (\vert \lambda \vert \|X\|_{2} + \|Y \|_{2}\right )\|z\|_{2}}.& &{}\end{array}$$
(12.24)

There are analogous formulae for approximate left eigenpairs.

We would like the linearization L that we use to lead, after recovering an approximate eigenpair of P from one of L, to a backward error for P of the same order of magnitude as that for L. To relate backward errors for L and P, we assume that the pencil L satisfies a left-sided factorization as in the right-hand side of (12.22), with E(λ) of full rank, and that x is recovered from z via x = H(λ)z. Then E(λ)L(λ)z = P(λ)x, so that

$$\displaystyle{ \frac{\eta _{P}^{}(x,\lambda )} {\eta _{L}^{}(z,\lambda )} \leq \frac{\vert \lambda \vert \|X\|_{2} + \|Y \|_{2}} {\sum _{i=0}^{k}\vert \lambda ^{i}\vert \|A_{i}\|_{2}} \cdot \frac{\|E(\lambda )\|_{2}\|z\|_{2}} {\|x\|_{2}}. }$$
(12.25)

This bound, which largely separates the dependence on L, P, and λ (in the first term) from the dependence on E and z (in the second term), can then be analyzed for a given linearization. This was done for Frobenius companion linearizations and \(\mathbb{D}\mathbb{L}(P)\) linearizations in [37].

For Frobenius companion linearizations, a straightforward analysis of the ratio (12.23) and the upper bound (12.25) shows that if \(\|A_{i}\|_{2} \approx 1\) for \(i = 0,\ldots,k\), then \(\kappa _{L}(\lambda ) \approx \kappa _{P}(\lambda )\) and the upper bound in (12.25) is of order 1. This suggests that scaling the polynomial eigenproblem to bring it close to this condition, before computing the eigenpairs via a Frobenius companion linearization, can be numerically advantageous. Fan, Lin, and Van Dooren [27] considered the following scaling strategy for quadratics, which converts \(P(\lambda ) =\lambda ^{2}A_{2} +\lambda A_{1} + A_{0}\) to \(\widetilde{P}(\mu ) =\mu ^{2}\widetilde{A}_{2} +\mu \widetilde{ A}_{1} +\widetilde{ A}_{0}\), where

$$\displaystyle{\lambda =\gamma \mu,\quad P(\lambda )\delta =\mu ^{2}(\gamma ^{2}\delta A_{ 2}) +\mu (\gamma \delta A_{1}) +\delta A_{0} \equiv \widetilde{ P}(\mu ),}$$

and depends on two nonzero scalar parameters γ and δ. They showed that when \(A_{0}\) and \(A_{2}\) are nonzero, taking \(\gamma = \sqrt{\|A_{0 } \|_{2 } /\|A_{2 } \|_{2}}\) and \(\delta = 2/(\|A_{0}\|_{2} + \|A_{1}\|_{2}\gamma )\) solves the problem of minimizing the maximum distance of the coefficient matrix norms from 1:

$$\displaystyle{\min _{\gamma,\delta }\max \{\|\widetilde{A}_{0}\|_{2} - 1,\|\widetilde{A}_{1}\|_{2} - 1,\|\widetilde{A}_{2}\|_{2} - 1\}.}$$

It is shown in [37] that with this choice of parameters, and for quadratics that are not too heavily damped, that is, \(\|A_{1}\|_{2}^{2}\stackrel{<}{\sim }\|A_{0}\|_{2}\|A_{2}\|_{2}\), we have \(\kappa _{P} \approx \kappa _{L}\) for all eigenvalues and \(\eta _{P} \approx \eta _{L}\) for both left and right eigenpairs. Hence, with this scaling the linearization process does not affect the eigenvalue condition numbers, and if the generalized eigenvalue problem (12.21) is solved by a backward stable algorithm such as the QZ algorithm, then the computed eigenpairs for P have small backward errors. These ideas have been implemented in an algorithm for the complete solution of quadratic eigenvalue problems by Hammarling, Munro, and Tisseur [36]. The case of heavily damped quadratics has been addressed by Zeng and Su [86].
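The scaling itself is a two-line computation. The sketch below (our own illustration of the ideas in [27, 37], not code from those papers) solves a deliberately badly scaled random quadratic via the first companion form and QZ, with and without the Fan–Lin–Van Dooren scaling, and compares the resulting backward errors (12.24) for P; the scaled solve typically brings the maximal \(\eta _{P}\) down to the order of the unit roundoff, while the unscaled one does not.

```python
import numpy as np
from scipy.linalg import eig, norm

def quadeig(A0, A1, A2):
    """Eigenpairs of lam^2*A2 + lam*A1 + A0 via first companion form and QZ."""
    n = A0.shape[0]
    I, Z = np.eye(n), np.zeros((n, n))
    X = np.block([[A2, Z], [Z, I]])
    Y = np.block([[A1, A0], [-I, Z]])
    lam, V = eig(-Y, X)
    return lam, V[n:, :]          # x recovered from z = [lam*x; x]

def eta_P(A0, A1, A2, lam, x):    # backward error (12.24)
    pol = sum(abs(lam)**i * norm(Ai, 2) for i, Ai in enumerate((A0, A1, A2)))
    return norm((lam**2 * A2 + lam * A1 + A0) @ x) / (pol * norm(x))

n = 20
rng = np.random.default_rng(2)
# Badly scaled quadratic: coefficient norms spread over many orders of magnitude.
A0 = 1e8 * rng.standard_normal((n, n))
A1 = rng.standard_normal((n, n))
A2 = 1e-6 * rng.standard_normal((n, n))

# Fan-Lin-Van Dooren scaling parameters.
n0, n1, n2 = (norm(M, 2) for M in (A0, A1, A2))
gamma = np.sqrt(n0 / n2)
delta = 2 / (n0 + n1 * gamma)

for (B0, B1, B2, g, tag) in (
        (A0, A1, A2, 1.0, "unscaled"),
        (delta * A0, gamma * delta * A1, gamma**2 * delta * A2, gamma, "scaled")):
    mu, Xv = quadeig(B0, B1, B2)                  # eigenvalues mu of scaled P
    errs = [eta_P(A0, A1, A2, g * m, x)           # lambda = gamma * mu
            for m, x in zip(mu, Xv.T)]
    print(f"{tag}: max backward error {max(errs):.1e}")
```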

It is now well established that for structured polynomial eigenproblems, it is important to use algorithms that preserve the structure of the problem when computing its eigenvalues, so that the eigenvalue pairings are preserved. This has led to the development of a number of structure-preserving algorithms for structured linearizations of structured eigenproblems [45, 47, 65, 68, 70, 75, 79], as well as the derivation of structured backward errors and structured condition numbers corresponding to structured perturbations [11, 12, 13].

7 Related Recent Developments

The linearization strategy for the polynomial eigenproblem continues to be actively developed for more types of matrix polynomials; this strategy is even beginning to be extended to other types of nonlinear eigenproblems.

In recent research on matrix polynomials, for example, a new theme has started to attract increasing attention – finding simple, template-like ways to construct linearizations when the polynomial

$$\displaystyle{ P(\lambda ) =\sum _{ i=0}^{k}A_{ i}\phi _{i}(\lambda ) }$$
(12.26)

is expressed in some non-standard basis \(\{\phi _{i}(\lambda )\}\). Particularly important for numerical computation are the classical examples of such bases, e.g., those associated with the names Chebyshev, Newton, Hermite, Lagrange, and Bernstein. It is tempting to simply convert P(λ) in (12.26) to the standard basis, and then leverage the existing body of knowledge about linearizations. However, it is important to avoid reformulating P into the standard basis, since a change of basis has the potential to introduce numerical errors not present in the original problem. Instead we should look for templates that construct linearizations for P(λ) directly from the coefficients \(A_{i}\) in (12.26), without any matrix additions, multiplications, or inverses. This can be viewed as another kind of structure preservation, i.e., preservation of the polynomial basis.

Although there are precedents for doing this for scalar polynomials in [10], and even earlier in [33], the first serious efforts in this direction for matrix polynomials were [17] and, later, [5], where concrete templates for producing strong linearizations were provided, one for each of several classical polynomial bases, including the Chebyshev, Newton, Lagrange, and Bernstein bases. This work has been used in [26] as part of a Chebyshev interpolation method for solving non-polynomial nonlinear eigenproblems. Additional examples for the Hermite and Lagrange bases have been developed and used in [81, 82]. More systematic methods for constructing large families of template-like linearizations for matrix polynomials expressed in non-standard bases can be found in the very recent papers [62, 74, 80].
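To give the flavor of such a template, here is a sketch in the spirit of the Chebyshev templates of [5, 26]; the indexing conventions and the numerical verification are ours. It assembles a colleague-type pencil for \(P(\lambda ) =\sum _{j=0}^{k}A_{j}T_{j}(\lambda )\) directly from the coefficients \(A_{j}\), using only the three-term recurrence \(\lambda T_{0} = T_{1}\), \(2\lambda T_{j} = T_{j+1} + T_{j-1}\): no change of basis, no matrix additions, multiplications, or inverses.

```python
import numpy as np
from scipy.linalg import eig, svdvals

n, k = 4, 3                      # n x n coefficients, degree k >= 2
rng = np.random.default_rng(3)
A = [rng.standard_normal((n, n)) for _ in range(k + 1)]   # P = sum_j A_j T_j

I = np.eye(n)
X = np.zeros((k * n, k * n))
Y = np.zeros((k * n, k * n))

# Block row 0 encodes lam*T_0 = T_1; rows 1..k-2 encode 2*lam*T_j = T_{j+1} + T_{j-1};
# the last block row carries the A_j, with T_k eliminated via the recurrence, so that
# L(lam) * [T_0(lam)x; ...; T_{k-1}(lam)x] = [0; ...; 0; P(lam)x].
X[:n, :n] = I
Y[:n, n:2*n] = -I
for i in range(1, k - 1):
    X[i*n:(i+1)*n, i*n:(i+1)*n] = 2 * I
    Y[i*n:(i+1)*n, (i-1)*n:i*n] = -I
    Y[i*n:(i+1)*n, (i+1)*n:(i+2)*n] = -I
X[-n:, -n:] = 2 * A[k]
for j in range(k - 1):
    Y[-n:, j*n:(j+1)*n] = A[j]
Y[-n:, (k-2)*n:(k-1)*n] -= A[k]          # block A_{k-2} becomes A_{k-2} - A_k
Y[-n:, (k-1)*n:] = A[k-1]

def P(mu):                                # evaluate P(mu) by the same recurrence
    T_prev, T_cur = 1.0, mu               # T_0, T_1
    val = A[0] + mu * A[1]
    for j in range(2, k + 1):
        T_prev, T_cur = T_cur, 2 * mu * T_cur - T_prev
        val = val + T_cur * A[j]
    return val

mus = eig(-Y, X, right=False)
print(max(svdvals(P(mu))[-1] for mu in mus[np.isfinite(mus)]))   # tiny
```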

The linearization strategy has been so effective for polynomial eigenproblems that researchers have started to consider ways to extend it to other nonlinear eigenproblems, especially to rational eigenproblems P(λ)x = 0, where the scalar functions \(\phi _{i}(\lambda )\) in (12.26) are rational functions of λ rather than polynomials. Significant advances in this direction have been made in [77], and more recently in [35].

8 Concluding Remarks

Wer wirklich Neues erdenken will, muss hin und wieder ein wenig spinnen. (Anyone who really wants to think up something new has to be a little crazy now and then.)

Quote on Room MA 466, TU Berlin.

We hope this review has shown how the discovery of new families of linearizations in the last decade has propelled research on polynomial eigenproblems forward, with significant advances made in the development of theory and algorithms for structured problems. Volker has contributed much to this effort, as a researcher and, equally importantly, as a stimulating mentor. There is still more waiting to be discovered, and more fun to be had in uncovering it. As Volker taught us to say to one another, “Es gibt viel zu tun, fangt schon mal an!” (“There is plenty to do, so let's get started!”)