1 Introduction

Consider the autonomous dynamical system

$$ x^{\prime} = Ax, $$
(1.1)

where \(x^{\prime }\) denotes the time derivative of x, \(A\in \mathbb {F}^{n\times n}\) and \(\mathbb {F} = \mathbb {R}\) or \(\mathbb {F} = \mathbb {C}\). One important property of this linear dynamical system is the notion of stability. Formally, the system (1.1) is asymptotically stable if and only if all of the eigenvalues of A lie in the open left half of the complex plane, i.e., \({\Lambda }(A)\subset \mathbb {C}^{-} \), where Λ(A) denotes the spectrum of A, \(\mathbb {C}^{-} := \left \{z \in \mathbb {C} : \text{Re} (z) < 0\right \}\) and Re(⋅) stands for the real part of its argument [8]. In this case, the matrix A is said to be Hurwitz stable or, briefly, a stable matrix. In practice, certain perturbations may cause the stable system to become unstable; that is, some or all of the eigenvalues of A may be moved into the right half-plane by perturbations. In such cases, the robust stability of the system becomes much more important. The system (1.1) is said to be robustly stable if the system, as well as all of its perturbations from a certain class of perturbations, are stable. To detect such cases and to measure the robustness of the system subject to perturbations, a quantity of interest is the stability radius, also known as the distance to instability. The unstructured and structured stability radii have been introduced in [13,14,15, 26]. The structured stability radius of a matrix triple \((A,B,C)\in \mathbb {F}^{n\times n} \times \mathbb {F}^{n\times m} \times \mathbb {F}^{p\times n} \) is defined by

$$ r_{\mathbb{F}}(A;B,C) := \inf\left\{ \Vert {\varDelta} \Vert_{2} : {\varDelta}\in\mathbb{F}^{m\times p} \ \text{and} \ A + B{\varDelta} C \ \text{is unstable} \right\}, $$
(1.2)

where Δ is a perturbation (disturbance) matrix, and \(B\in \mathbb {F}^{n\times m} \) and \(C\in \mathbb {F}^{p\times n}\) are restriction matrices that determine the structure of the perturbation. For example, the matrices B and C may reflect the possibility that only certain elements of A are subject to perturbations [13]. When B = I and C = I, we abbreviate \(r_{\mathbb {F}}(A;I,I)\) by \(r_{\mathbb {F}}(A)\) and call it the unstructured stability radius of A.

In many applications (see, e.g., [25]), it is required to consider only a certain class of perturbations. For example, when A is a real matrix, allowing only real perturbations is more plausible. In this paper, we restrict our attention to real structured perturbations. For real (A,B,C), \(r_{\mathbb {R}}(A;B,C)\) and \(r_{\mathbb {R}}(A)\) are called the real structured and unstructured stability radii, respectively.

Let us denote the singular values of \(M\in \mathbb {F}^{p\times m},\) ordered non-increasingly, by σi(M), \(i = 1,2, \ldots , \min \limits \left \{p,m \right \},\) and denote the real and imaginary parts of M by Re(M) and Im(M), respectively.

A formula for \(r_{\mathbb {R}}(A; B,C)\) given by Qiu et al. [22] in the form of an optimization problem in two variables is as follows:

$$ r_{\mathbb{R}}(A;B,C)^{-1} = \sup\limits_{\omega\in\mathbb{R}} \inf\limits_{\gamma\in (0,1]}\sigma_{2}\left( \begin{bmatrix} \text{Re}(C(\mathrm{i}\omega I-A)^{-1}B ) & -\gamma \text{Im}(C(\mathrm{i}\omega I-A)^{-1}B) \\ \frac{1}{\gamma} \text{Im}(C(\mathrm{i}\omega I-A)^{-1}B) & \text{Re}(C(\mathrm{i}\omega I-A)^{-1}B) \end{bmatrix} \right). $$
(1.3)

Setting

$$ H(s):= C(sI-A)^{-1}B $$
(1.4)

and

$$ \mu(M):= \inf\limits_{\gamma\in (0,1]}\sigma_{2}\left( \begin{bmatrix} \text{Re}(M) & -\gamma \text{Im}(M) \\ \frac{1}{\gamma} \text{Im}(M) & \text{Re}(M) \end{bmatrix}\right) , $$
(1.5)

the problem in (1.3) takes the form

$$ r_{\mathbb{R}}(A;B,C) = \left\{ \sup\limits_{\omega\in\mathbb{R}}\mu \left( H(\mathrm{i}\omega)\right) \right\}^{-1}. $$
(1.6)

A computationally more tractable characterization of the unstructured real stability radius \(r_{\mathbb {R}}(A)\), also due to Qiu et al. [22], is as follows:

$$ r_{\mathbb{R}}(A) := \min\limits_{\omega\in\mathbb{R}} \max\limits_{\gamma\in (0,1]}\sigma_{2n-1}\left( \begin{bmatrix} A & -\omega \gamma I \\ \frac{\omega}{\gamma} I & A \end{bmatrix}\right). $$
(1.7)

It was proved in [22] that the function to be minimized in (1.5) is unimodal on (0,1]. Therefore, any local minimizer is actually a global minimizer, and the computation of μ(M) for a given M is the easier part of the problem. In fact, any search algorithm can be used to solve the inner minimization problem (1.5) with the guarantee of global convergence. But the outer maximization problem in (1.6) is still challenging, especially when n is large.
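
To make the inner problem concrete, the following is a minimal MATLAB sketch of evaluating μ(M) in (1.5) by a bounded scalar search over γ. The function names and the lower bound \(10^{-8}\) (used in place of the open endpoint γ → 0) are choices of this sketch, not of any implementation described in this paper.

```matlab
% A minimal sketch of evaluating mu(M) in (1.5) for a given complex p x m matrix M.
% The lower bound 1e-8 stands in for the open endpoint gamma -> 0.
function mu = mu_of(M)
    % sigma_2 of the block matrix in (1.5) is unimodal in gamma on (0,1] [22],
    % so a bounded scalar search finds its global minimum.
    [~, mu] = fminbnd(@(g) sigma2_block(M, g), 1e-8, 1);
end

function s = sigma2_block(M, g)
    T  = [real(M), -g*imag(M); imag(M)/g, real(M)];
    sv = svd(T);          % singular values in non-increasing order
    s  = sv(2);
end
```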

1.1 Literature review

Several algorithms have been proposed so far to estimate the structured and unstructured real stability radii given by the characterizations above. Sreedhar et al. proposed a globally convergent method in [24]. Their method is based on the well-known correspondence between the singular values of the transfer function and the imaginary eigenvalues of certain Hamiltonian matrices introduced in [4, 5, 7]. Although the numerical experiments in their work suggest that the rate of convergence of their algorithm is quadratic, a rigorous proof does not appear in the paper. The repeated solution of Hamiltonian eigenvalue problems of size four times the order of the system makes this method inconvenient for large-scale problems. Freitag and Spence [9] focus on the unstructured case and propose an implicit determinant method based on Newton's method. Their method also benefits from the link between the singular values of the transfer function and the imaginary eigenvalues of the Hamiltonian matrices. To find the critical point corresponding to the desired singular value, they employ the implicit determinant method, which is devised for finding zeros of the determinant of a parameter-dependent matrix. They show that finding these zeros is equivalent to solving a linear system. Therefore, instead of solving singular value problems and Hamiltonian eigenvalue problems, they solve linear systems at each Newton step. This makes their method more efficient and convenient for large-scale systems. Another method to handle large-scale problems is proposed by Guglielmi and Manetta [12].

Their method consists of inner and outer iterations and does not rely on the characterization (1.3). The inner iteration approximates the 𝜖-pseudospectral abscissa from above for a fixed 𝜖, whereas the outer iteration varies 𝜖 by means of Newton iterations. Their method converges locally and provides only upper bounds for the stability radius. More recent works for the structured real stability radius of sparse systems are [10, 18], but the distance measure involved in these works is the Frobenius norm. Katewa and Pasqualetti [18] formulate the stability radius problem as an equality-constrained optimization problem and, using the Lagrangian method, characterize the optimality conditions by revealing the relation between an optimal perturbation and the eigenvectors of an optimally perturbed system. Consequently, they develop a gradient descent algorithm that converges locally. The method in [10] is based on the relationship between the spectral value set abscissa and the real stability radius. Other related works are [11, 20, 23]. Lu and Vandereycken introduce a criss-cross type algorithm to compute the real pseudospectral abscissa in [20]. Their method is also based on the well-known relation between the imaginary eigenvalues of a Hamiltonian matrix and the singular values of the transfer function. They also propose a subspace framework to solve large-scale problems; however, their subspace approach is different from the subspace approach of the current work. In [11], the authors consider differential equations of rank-1 and rank-2 matrices. Discretization of these differential equations by means of the explicit Euler method yields fast and efficient algorithms to compute the real pseudospectral abscissa and the real stability radius. The method introduced by Rostami [23] has a structure similar to that of [12]; the main difference is that an eigenvalue problem is solved to estimate \(r_{\mathbb {R}}(A)\).

In this work, we are concerned with the computation of the structured real stability radius for large systems. We assume that the number of columns of the restriction matrix B and the number of rows of the restriction matrix C are relatively small, i.e., \(n \gg m, p\), which is usually the case in practice. We propose a subspace framework to estimate \(r_{\mathbb {R}}(A;B,C).\) For this, we adapt the techniques of our recent works [1, 2], which are devised to compute and to minimize the \({{\mathscr{H}}}_{\infty }\) norm of large-scale systems, respectively. The proposed method converges fast with respect to the subspace dimension. This fast convergence is verified by rigorous analysis and observed by means of several numerical examples.

The rest of this work is organized as follows. The next section describes a subspace framework for the structured real stability radius. The method is based on an interpolatory model order reduction technique. In Section 3 we give a rigorous rate-of-convergence analysis of this subspace framework. Section 4 is devoted to numerical experiments that illustrate the fast convergence of the algorithm presented in this work.

2 Subspace framework for \(r_{\mathbb {R}}(A; B, C)\)

In this section we present a subspace framework for the estimation of the structured real stability radius \(r_{\mathbb {R}}(A; B, C)\) for a large-scale stable matrix \(A\in \mathbb {R}^{n \times n}\) and given restriction matrices \(B \in \mathbb {R}^{n \times m}, C\in \mathbb {R}^{p\times n}.\) The matrix-valued function \(H(s) = C(sI-A)^{-1}B\) in (1.4) corresponds to the transfer function of the standard linear time-invariant (LTI) system

$$ \begin{array}{@{}rcl@{}} x^{\prime} &=& Ax + Bu, \\ y &=& Cx, \end{array} $$
(2.1)

with \(A\in \mathbb {R}^{n\times n}, B\in \mathbb {R}^{n\times m}\) and \(C\in \mathbb {R}^{p\times n},\) whose \({\mathscr{H}}_{\infty }\)-norm is defined by

$$ \left\| H \right\|_{\mathcal{H}_{\infty}} := \sup\limits_{s \in \mathbb{C}^{+}} \left\| H(s) \right\|_{2} = \sup\limits_{s \in \partial\mathbb{C}^{+}} \left\| H(s) \right\|_{2} = \sup\limits_{\omega \in \mathbb{R}} \sigma_{1}(H(\mathrm{i} \omega)) $$

when A is asymptotically stable.

Inspired by [16], in our recent works [1, 2], we proposed subspace frameworks by employing two-sided projections which aim at approximating and minimizing the \({\mathscr{H}}_{\infty }\) norm of large-scale control systems. In the context of LTI systems, the idea of [1, 2] is described as follows. We consider two subspaces \({\mathcal {V}}\) and \({\mathcal {W}},\) called the left and right subspaces, of equal dimension, as well as (biorthogonal) matrices V and W, whose columns form orthonormal bases for \({\mathcal {V}}\) and \({\mathcal {W}}.\) The state x of the system (2.1) is restricted to \({\mathcal {V}}\), i.e., the state x is approximated by \(V \tilde x,\) where \(\tilde x\) is the reduced state. Then, by means of the Petrov-Galerkin condition with respect to \(\mathcal W,\) it is imposed that the residual of the differential part of the restricted system is orthogonal to \({\mathcal {W}}\). This amounts to obtaining the reduced order system

$$ \tilde x^{\prime} = \widetilde{A} \tilde x + \widetilde{B} u, \quad \widetilde y = \widetilde{C} \tilde x, $$

with

$$ \widetilde{A} := W^{*} {A} V, \quad \widetilde{B} := W^{*} {B}, \quad \widetilde{C} := {C} V. $$

Then a reduced problem is solved, i.e., the \({\mathcal H}_{\infty }\) norm of the reduced transfer function

$$ H^{\mathcal W, \mathcal V}(s) := \widetilde{C}(s I - \widetilde{A})^{-1} \widetilde{B} $$
(2.2)

is computed (or minimized) by employing globally convergent methods such as those in [4, 5], and the subspaces \({\mathcal {V}}\) and \({\mathcal {W}}\) are expanded into larger subspaces in a way that certain Hermite interpolation properties are satisfied between the reduced and full transfer functions. This in turn gives rise to superlinear convergence. For more details, we refer to [1, 2].

The main idea of this section is inspired by the works spelled out above. One significant difference is that here the second largest singular value function is involved in a maximin problem, whereas in [1] we consider the maximization of the largest singular value function, and in [2] the largest singular value function is involved in a minimax problem. Another and more challenging difference is that here the real and imaginary parts of the transfer function H appear as block components of the block matrix

$$ \begin{array}{@{}rcl@{}} T(\omega, \gamma): = \begin{bmatrix} \text{Re}(H(\mathrm{i}\omega)) & -\gamma \text{Im}(H(\mathrm{i}\omega))\\ \frac{1}{\gamma} \text{Im}(H(\mathrm{i}\omega)) & \text{Re}(H(\mathrm{i}\omega)) \end{bmatrix}. \end{array} $$

Here we resort to a one-sided projection in the sense that the left and the right subspaces are the same. Furthermore, we form the subspace in such a way that the algorithm we discuss here converges quadratically. Since the large-scale nature of the problem in the characterization (1.3) is hidden in the middle factor of \(C(sI-A)^{-1}B\) when m,p are relatively small, we focus on the reduction of the middle factor \(D(s) := (sI-A)\) of H(s) to a much smaller dimension using a one-sided projection. To this end, we determine a subspace \(\mathcal {V}\subset \mathbb {C}^{n}\) of small dimension and a matrix V whose columns form an orthonormal basis for \(\mathcal {V}.\) Then the projected reduced problem is defined in terms of the matrices

$$ A^{V} := V^{*}AV, \quad B^{V} := V^{*}B, \quad C^{V}: = CV. $$

More precisely, the reduced problem is defined by

$$ r^{\mathcal{V}}(A;B,C) := \left\{ \sup\limits_{\omega\in\mathbb{R}}\mu\left( H^{\mathcal{V}}(\mathrm{i}\omega)\right) \right\}^{-1}, $$
(2.3)

where

$$ H^{\mathcal{V}}(s): =C^{V}(sI-A^{V})^{-1}B^{V}. $$
(2.4)

In our subspace framework, the main objective is to obtain Hermite interpolation properties between the original problem defined in terms of H(s) and the reduced problem defined in terms of \(H^{\mathcal {V}}(s)\), which in turn give rise to quadratic convergence. Formally, for a given \(\omega \in \mathbb {R},\) we aim to form the subspace \(\mathcal {V}\) in such a way that

$$ \begin{array}{@{}rcl@{}} \mu(H(\mathrm{i}\omega)) &=& \mu\left( H^{\mathcal{V}}(\mathrm{i}\omega) \right), \end{array} $$
(2.5)
$$ \begin{array}{@{}rcl@{}} \mu^{\prime}\left( H(\mathrm{i}\omega)\right) &=& \mu^{\prime}\left( H^{\mathcal{V}}(\mathrm{i}\omega) \right) \quad \text{and} \end{array} $$
(2.6)
$$ \begin{array}{@{}rcl@{}} \mu^{\prime\prime}\left( H(\mathrm{i}\omega)\right) &=& \mu^{\prime\prime}\left( H^{\mathcal{V}}(\mathrm{i}\omega) \right). \end{array} $$
(2.7)

The following theorem, a particular instance of [3, Theorem 1], is helpful for this purpose.

Theorem 2.1

Let H(s) and \(H^{\mathcal {V}}(s) \) be defined as in (1.4) and (2.4), respectively.

For given \(\widehat {s} \in {\mathbb C}\), \( \widehat {b} \in {\mathbb C}^{m}\) and a positive integer N, if

$$ \left[ (\widehat{s}I - A)^{-1} \right]^{\ell} B \widehat{b} \in {\mathcal{V}} \quad \quad \text{for } \ell = 1, \dots, N, $$
(2.8)

and V has orthonormal columns, that is, \(V^{*}V = I\), then, denoting the \(\ell\)-th derivatives of H(s) and \(H^{\mathcal {V}}(s)\) at the point \(\widehat {s}\) by \(H^{(\ell )}(\widehat {s})\) and \(\left [H^{\mathcal {V}}\right ]^{(\ell )}(\widehat {s})\), we have

$$ H^{(\ell)} (\widehat{s}) \widehat{b} = \big[ H^{\mathcal{V} } \big]^{(\ell)} (\widehat{s}) \widehat{b} \quad\quad \text{for } \ell = 0, \dots, N-1, $$
(2.9)

provided that both \(\widehat {s} I - A\) and \(\widehat {s} I - A^{V}\) are invertible.
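
As a quick numerical illustration of Theorem 2.1, the zeroth-order interpolation in (2.9) can be checked on a small ad hoc random example (all quantities below are illustrative and not taken from this paper); by the theorem, the first and second derivatives along \(\widehat{b}\) match as well.

```matlab
% Illustrative check of Theorem 2.1 on a small random example.
rng(1);
n = 60; m = 2; p = 3;
A = randn(n) - 10*eye(n);          % shifted so that shat*I - A is safely invertible
B = randn(n, m);  C = randn(p, n);
shat = 1i*0.7;  bhat = randn(m, 1);
K  = shat*eye(n) - A;
v1 = K\(B*bhat);  v2 = K\v1;  v3 = K\v2;   % the vectors in (2.8) with N = 3
V  = orth([v1 v2 v3]);                     % orthonormal basis, V'*V = I
AV = V'*A*V;  BV = V'*B;  CV = C*V;
H  = @(s) C*((s*eye(n) - A)\B);
HV = @(s) CV*((s*eye(size(AV)) - AV)\BV);
norm(H(shat)*bhat - HV(shat)*bhat)         % (2.9) with ell = 0: agreement to machine precision
```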

Theorem 2.1 indicates how to construct the subspace \(\mathcal {V}\) so that the Hermite interpolation properties (2.5)–(2.7) are satisfied between the original and the reduced problems.

If \(m \leq p\), for a given \(\widetilde {\omega }\in \mathbb {R},\) setting

$$ \mathcal{V}:= \text{Col} \left( \begin{bmatrix} (\mathrm{i} \widetilde{\omega} I -A)^{-1}B & (\mathrm{i}\widetilde{\omega} I - A)^{-2}B & (\mathrm{i}\widetilde{\omega} I - A)^{-3}B \end{bmatrix} \right), $$

where Col(M) denotes the column space of the matrix M, we conclude from Theorem 2.1 that

$$ H(\mathrm{i}\widetilde{\omega}) = H^{\mathcal{V}}(\mathrm{i}\widetilde{\omega}), \quad H^{\prime}(\mathrm{i}{\widetilde{\omega}}) =\left[H^{\mathcal{V}}\right]^{\prime}(\mathrm{i}{\widetilde{\omega}}) \quad \text{and} \quad H^{\prime\prime}(\mathrm{i}{\widetilde{\omega}}) =\left[H^{\mathcal{V}}\right]^{\prime\prime}(\mathrm{i}{\widetilde{\omega}}), $$
(2.10)

which in turn imply

$$ \begin{array}{@{}rcl@{}} T(\widetilde{\omega}, \gamma) &=& T^{\mathcal{V}}(\widetilde{\omega},\gamma), \end{array} $$
(2.11)
$$ \begin{array}{@{}rcl@{}} \frac {\partial T}{\partial \omega} \left( \widetilde{\omega}, \gamma \right) &=& \frac {\partial T^{\mathcal{V}}}{\partial \omega} \left( \widetilde{\omega}, \gamma \right), \end{array} $$
(2.12)
$$ \begin{array}{@{}rcl@{}} \frac {\partial^{2} T}{\partial \omega^{2}} \left( \widetilde{\omega}, \gamma \right) &=& \frac{ \partial^{2} T^{\mathcal{V}} }{\partial \omega^{2}} \left( \widetilde{\omega}, \gamma \right), \end{array} $$
(2.13)

for all γ ∈ (0,1], where

$$ T^{\mathcal{V}}(\omega, \gamma) := \begin{bmatrix} \text{Re}(H^{\mathcal{V}}(\mathrm{i}\omega)) & -\gamma \text{Im}(H^{\mathcal{V}}(\mathrm{i}\omega)) \\ \frac{1}{\gamma} \text{Im}(H^{\mathcal{V}}(\mathrm{i}\omega)) & \text{Re}(H^{\mathcal{V}}(\mathrm{i}\omega)) \end{bmatrix}.$$

If m > p, to keep the dimension of the subspace smaller, one can form the subspace as

$$ \mathcal{V}:= \text{Col} \left( \begin{bmatrix} \left( C(\mathrm{i} \widetilde{\omega} I -A)^{-1} \right)^{*} & \left( C(\mathrm{i} \widetilde{\omega} I -A)^{-2} \right)^{*} & \left( C(\mathrm{i} \widetilde{\omega} I -A)^{-3} \right)^{*} \end{bmatrix} \right). $$

Then, substituting H with its conjugate transpose \(H^{*}\), the interpolations in (2.10) will be satisfied.
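
For concreteness, a minimal MATLAB sketch of the expansion directions and the projected matrices in (2.4) for the case \(m \leq p\) and a sparse A might look as follows; the helper names are illustrative, and a single sparse LU factorization is reused for all three solves (cf. the cost discussion in Remark 2.3 below).

```matlab
% Sketch (m <= p case): expansion directions at a frequency wtilde and the
% projected matrices of (2.4). A is assumed sparse; helper names are illustrative.
% For m > p one would instead solve with the conjugate transpose of K below,
% using the columns of C' as right-hand sides.
function [AV, BV, CV, V] = reduced_matrices(A, B, C, wtilde)
    n = size(A, 1);
    [L, U, P, Q] = lu(1i*wtilde*speye(n) - A);    % one sparse LU, reused for all solves
    solve = @(X) Q*(U\(L\(P*X)));                 % X -> (i*wtilde*I - A)\X
    S1 = solve(B);  S2 = solve(S1);  S3 = solve(S2);
    V  = orth([S1 S2 S3]);                        % orthonormal basis of the subspace V
    AV = V'*A*V;  BV = V'*B;  CV = C*V;           % A^V, B^V, C^V in (2.4)
end
```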

The equalities (2.11)–(2.13) give rise to the Hermite interpolation properties between the reduced and original problems. We will discuss this in detail in Lemma 2.4.

Remark 2.2

It is shown in [22] that, for a given ω, the infimum of σ2(ω,⋅) is not attained in (0,1], that is,

$$ \mu \left( H(\mathrm{i}\omega) \right) = \inf\limits_{\gamma \in (0,1]} \sigma_{2}(\omega,\gamma) = \lim\limits_{\gamma\rightarrow 0} \sigma_{2}(\omega,\gamma),$$

if and only if rank(Im(H(iω))) = 1. This holds, for example, when m = 1 or p = 1. In [22], it is also shown that in this case,

$$ \mu \left( H(\mathrm{i}\omega) \right) = \max\left\{\sigma_{1}\left( {U_{2}^{T}} \text{Re}(H (\mathrm{i}\omega) ) \right), \sigma_{1} \left( \text{Re} (H(\mathrm{i}\omega) ) V_{2} \right) \right\}, $$

where

$$ \begin{bmatrix} U_{1} & U_{2} \end{bmatrix} \begin{bmatrix} \sigma_{1}(\text{Im}(H(\mathrm{i}\omega)) ) & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_{1} & V_{2} \end{bmatrix}^{T} $$

is a real singular value decomposition of Im(H(iω)).
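
A direct MATLAB transcription of this explicit formula is sketched below, assuming rank(Im(M)) = 1 and \(m, p \geq 2\) so that the blocks U2 and V2 are nonempty; the function name is illustrative.

```matlab
% Sketch of the explicit formula above for the case rank(Im(H(i*omega))) = 1;
% assumes m, p >= 2 so that U2 and V2 below are nonempty.
function mu = mu_rank_one(M)         % M = H(i*omega) with rank(imag(M)) = 1
    [U, ~, V] = svd(imag(M));        % real SVD of Im(M); only the first triplet is nonzero
    U2 = U(:, 2:end);  V2 = V(:, 2:end);
    mu = max(norm(U2'*real(M)), norm(real(M)*V2));   % sigma_1 = spectral (2-)norm
end
```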

The subsequent arguments of this work, in particular the Lipschitz continuity and boundedness of the derivatives of the singular value functions, do not apply when γ → 0. To cope with this difficulty, throughout the rest of this work we replace the domain (0,1] of the inner minimization problem with Ω := [Γ,1] for some small number Γ (in practice we set \(\Gamma = 10^{-8}\)).

The subspace framework for the computation of \(r_{\mathbb {R}}(A;B,C)\) is summarized in Algorithm 1, where and throughout the rest of this work, we use the short-hand notations

$$ \sigma_{i}(\omega,\gamma) := \sigma_{i}(T(\omega,\gamma)) \quad \text{and} \quad \sigma_{i}^{{\mathcal{V}}}(\omega,\gamma) := \sigma_{i}\left( T^{{\mathcal{V}}}(\omega,\gamma) \right) \quad \quad $$

for \(i = 1,\ldots , \min \limits \left \{m,p \right \}.\) We also use the notations

$$\mu(\omega) := \mu(H(\mathrm{i} \omega)) \quad \text{and} \quad \mu^{{\mathcal{V}}}(\omega) := \mu\left( H^{{\mathcal{V}}}(\mathrm{i}\omega) \right).$$

Moreover, orth(M) stands for a matrix whose columns form an orthonormal basis for the column space of the matrix M. Finally, for a given ω, we define \(\gamma (\omega ),\gamma ^{\mathcal {V}}(\omega )\in (0,1],\) (or more precisely in [Γ,1]) such that

$$ \mu(\omega) = \sigma_{2}(\omega,\gamma(\omega)) \quad \quad \text{and} \quad \quad \mu^{\mathcal{V}}(\omega) = \sigma_{2}^{\mathcal{V}}(\omega,\gamma^{\mathcal{V}}(\omega)). $$
(2.14)
[Algorithm 1: Subspace framework for the computation of \(r_{\mathbb{R}}(A;B,C)\); listing not reproduced here]

At each iteration of Algorithm 1, a reduced problem is solved in line 4 and a global maximizer \(\widehat {\omega }\) is found. Then the subspace is expanded so that certain Hermite interpolation properties are satisfied at optimal \(\widehat {\omega }.\)
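
Since the listing of Algorithm 1 is not reproduced here, the following self-contained MATLAB sketch captures the loop just described. The crude sample-and-refine solver for the small reduced problem merely stands in for the globally convergent solver employed in the actual implementation (see Section 4.1), and the helper names, the sampling grid on \([0,\omega_{\max}]\), and the lower bound \(10^{-8}\) for γ are assumptions of this sketch.

```matlab
% Minimal sketch of the subspace iteration summarized above; not the paper's code.
function r = subspace_sketch(A, B, C, wmax, tol, kmax)
    w0 = linspace(0, wmax, 10);                          % initial interpolation points
    V  = [];
    for w = w0
        V = [V, shifted_solves(A, B, w)];                %#ok<AGROW>
    end
    V = orth(V);
    r_old = Inf;
    for k = 1:kmax
        [muV, w_hat] = solve_reduced(V'*A*V, V'*B, C*V, wmax);
        r = 1/muV;                                       % cf. (2.3)
        if abs(r - r_old) < tol*0.5*abs(r + r_old), break; end  % termination test of Sec. 4.1
        V = orth([V, shifted_solves(A, B, w_hat)]);      % expand at the reduced maximizer
        r_old = r;
    end
end

function S = shifted_solves(A, B, w)
    % directions (i*w*I - A)^{-1}B, (i*w*I - A)^{-2}B, (i*w*I - A)^{-3}B
    K  = 1i*w*speye(size(A, 1)) - A;
    S1 = K\B;  S2 = K\S1;  S3 = K\S2;
    S  = [S1 S2 S3];
end

function [muV, w_hat] = solve_reduced(AV, BV, CV, wmax)
    % Crude global stand-in: sample mu^V(w) on a grid, then refine locally.
    muV_of = @(w) mu_of(CV*((1i*w*eye(size(AV)) - AV)\BV));
    ws   = linspace(0, wmax, 200);
    vals = arrayfun(muV_of, ws);
    [~, idx] = max(vals);
    lo = ws(max(idx-1, 1));  hi = ws(min(idx+1, numel(ws)));
    [w_hat, negmu] = fminbnd(@(w) -muV_of(w), lo, hi);
    muV = -negmu;
end

function mu = mu_of(M)
    % mu(M) of (1.5); cf. the sketch following (1.7)
    [~, mu] = fminbnd(@(g) sv2([real(M), -g*imag(M); imag(M)/g, real(M)]), 1e-8, 1);
end

function s = sv2(T)
    sv = svd(T);  s = sv(2);
end
```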

Remark 2.3

It is shown in [1] that for a given \(\widetilde {\omega } \in \mathbb {R},\) the inclusions

$$ \text{Col}\left( (\mathrm{i} \widetilde{\omega} I - A)^{-1}B \right) \subseteq \mathcal V, \text{Col}\left( (C (\mathrm{i} \widetilde{\omega} I - A)^{-1})^{*} \right) \subseteq \mathcal W$$

yield the Hermite interpolation properties

$$\sigma_{1}(H(\mathrm{i}\widetilde{\omega})) = \sigma_{1}(H^{\mathcal W, \mathcal V}(\mathrm{i}\widetilde{\omega})) \quad \text{and} \quad \sigma_{1}^{\prime}(H(\mathrm{i}\widetilde{\omega})) = \sigma_{1}^{\prime}(H^{\mathcal W, \mathcal V}(\mathrm{i}\widetilde{\omega})), $$

which give rise to superlinear convergence. For the current problem, one could use a similar two-sided projection framework to get superlinear convergence. However, to get quadratic convergence in a two-sided projection framework, one should form the subspaces in a slightly different way. Indeed, one should interpolate the second derivatives of the objective function as well, and this requires adding more vectors to the subspaces. More precisely, in the two-sided projection framework, assuming that m = p (for the general case, we refer to [1]), if one chose the subspaces as

$$ \begin{array}{@{}rcl@{}} \mathcal{V} &=& \text{Col} \left( \left[ (\mathrm{i} \widetilde{\omega} I - A)^{-1}B \quad (\mathrm{i} \widetilde{\omega} I - A)^{-2}B\right] \right),\\ \mathcal W &=& \text{Col}\left( \left[ (C (\mathrm{i} \widetilde{\omega} I - A)^{-1})^{*} \quad (C (\mathrm{i} \widetilde{\omega} I - A)^{-2})^{*}\right] \right) \end{array} $$
(2.15)

at a given \(\widetilde \omega ,\) and formed the reduced transfer function appropriately (as in (2.2)), the conditions (2.10) and hence (2.11)–(2.13) would be satisfied at \(\widetilde \omega \). As a result, one would interpolate the second derivatives of μ(⋅) and get a quadratic convergence.

The main drawback of the two-sided projection framework is that it is more expensive. Looking at (2.15), we can see that at each iteration of the algorithm based on the two-sided projections, one needs to solve four large-scale linear systems rather than three. In fact, to solve the large-scale linear systems, both frameworks require one LU decomposition, but the two-sided framework requires one additional forward and back substitution. Moreover, at every subspace iteration, the two-sided framework needs to orthogonalize two projection matrices, while for the one-sided variant there is only one projection matrix to be orthogonalized. Finally, the one-sided framework seems to be more reliable and numerically more stable in practice. On the other hand, in the one-sided framework the subspace dimension grows more rapidly, as extra vectors are added at every iteration. In particular, when \(\min \limits \left \{m,p\right \}\) is relatively large, the subspace dimension grows faster, so one needs to solve reduced problems of larger dimension. Hence, when \(\min \limits \left \{m,p\right \}\) is relatively large, the two-sided variant might be more efficient. This observation is verified by numerical experiments in Section 4.

Assuming that \(\{\omega_{k}\}\) converges to a maximizer \(\omega_{*}\) of μ(ω), in the next section we show that under some mild assumptions the rate of convergence of Algorithm 1 is quadratic. The next result concerns the Hermite interpolation properties between the original and the reduced problems.

Lemma 2.4

The following assertions hold regarding Algorithm 1 for each j = 1,…,k:

(i):

For all \(\gamma \in \left (0,1\right ],\)

$$\sigma_{i}(\omega_{j},\gamma ) = \sigma_{i}^{{\mathcal{V}}_{k}}(\omega_{j},\gamma), \quad \quad \text{for} \quad \quad i = 1, \ldots, \min\left\{m,p\right\}.$$
(ii):

It holds that

$$\mu(\omega_{j}) = \mu^{\mathcal{V}_{k}}(\omega_{j}).$$

Furthermore, we have \(\gamma (\omega _{j}) = \gamma ^{{\mathcal {V}}_{k}} (\omega _{j}).\)

(iii):

If σ2(ωj,γ(ωj)) is simple, then

$$\mu^{\prime}(\omega_{j}) =\left[\mu^{\mathcal{V}_{k}}\right]^{\prime}(\omega_{j}).$$
(iv):

If σ2(ωj,γ(ωj)) is simple, then

$$\mu^{\prime\prime}(\omega_{j}) =\left[\mu^{\mathcal{V}_{k}}\right]^{\prime\prime}(\omega_{j}).$$

Proof

  1. (i)

    From the lines 2, 5, 6 and Theorem 2.1 we have \(H(\omega _{j}) = H^{\mathcal {V}_{k}} (\omega _{j})\) and hence

    $$ T({\omega_{j}}, \gamma) = T^{\mathcal{V}_{k}} ({\omega_{j}},\gamma), $$
    (2.16)

    for all γ ∈ (0,1], from which the result follows immediately.

  2. (ii)

    Observe that, for all γ ∈ (0,1],

$$\sigma_{2}(\omega_{j}, \gamma(\omega_{j}) )\leq \sigma_{2}(\omega_{j},\gamma) \quad \quad \text{and} \quad \quad \sigma_{2}^{{\mathcal{V}}_{k}}(\omega_{j},\gamma^{\mathcal{V}_{k}} (\omega_{j}) )\leq \sigma_{2}^{{\mathcal{V}}_{k}}(\omega_{j},\gamma).$$

    Combining these with part (i) we get

    $$ \begin{array}{@{}rcl@{}} \mu(\omega_{j}) &=& \sigma_{2}(\omega_{j},\gamma (\omega_{j}) ) = \sigma_{2}^{{\mathcal{V}}_{k}}(\omega_{j},\gamma(\omega_{j})) \\ &\geq& \mu^{{\mathcal{V}}_{k}}(\omega_{j}) = \sigma_{2}^{{\mathcal{V}}_{k}}(\omega_{j},\gamma^{{\mathcal{V}}_{k}} (\omega_{j}) ) \\ &=& \sigma_{2}(\omega_{j},\gamma^{{\mathcal{V}}_{k}} (\omega_{j}) ) \geq \sigma_{2}(\omega_{j},\gamma (\omega_{j}) ) \\ &=& \mu(\omega_{j}) \end{array} $$

    Hence, the inequalities above can be replaced by equalities. So, we have \(\mu (\omega _{j}) = \mu ^{{\mathcal V }_{k}}(\omega _{j}).\) Furthermore, since the function to be minimized in (1.5) is unimodal, we have \(\gamma (\omega _{j}) = \gamma ^{{\mathcal {V}}_{k}} (\omega _{j})\) for each j ∈{1,⋯ ,k}.

  3. (iii)

    Suppose that σ2(ωj,γ(ωj)) is simple, for a fixed j ∈{1,⋯ ,k}. Then, μ(ω) and \(\mu ^{{\mathcal {V}}_{k} }(\omega )\) are differentiable at ω = ωj. Since \(T(\omega _{j},\gamma (\omega _{j})) = T^{\mathcal {V}_{k}} (\omega _{j},\gamma (\omega _{j}) ) \) due to (2.16), the left and the right singular vectors corresponding to \(\sigma _{2}(\omega _{j},\gamma (\omega _{j})) = \sigma _{2}^{{\mathcal {V}}_{k}}(\omega _{j},\gamma (\omega _{j})) \) are the same. Let us denote them by u2 and v2, respectively, and assume, without loss of generality, that they are unit vectors. From the lines 2, 5, 6 together with Theorem 2.1 we have

$$ \frac{\partial T}{\partial \omega} \left( {\omega_{j}}, \gamma \right) = \frac{\partial T^{\mathcal{V}_{k}}}{\partial \omega} \left( {\omega_{j}},\gamma \right). $$
    (2.17)

Now, exploiting the analytical formulas for the derivatives of singular value functions [6, 19] and employing (2.17) we obtain

    $$ \begin{array}{@{}rcl@{}} \mu^{\prime} (\omega_{j}) &=& \frac{ \partial \sigma_{2}}{\partial \omega} \left( \omega_{j}, \gamma(\omega_{j}) \right) = \text{Re} \left( {u_{2}^{T}} \frac{\partial T}{\partial \omega} \left( \omega_{j}, \gamma(\omega_{j}) \right) v_{2} \right) \\ &=& \text{Re} \left( {u_{2}^{T}} \frac{\partial T^{\mathcal{V}_{k}} }{\partial \omega} \left( \omega_{j}, \gamma(\omega_{j}) \right) v_{2} \right) = \frac{\partial \sigma_{2}^{ {\mathcal{V}}_{k}} }{\partial \omega} \left( \omega_{j}, \gamma(\omega_{j}) \right) \\ &=& \frac{\partial \sigma_{2}^{{\mathcal{V}}_{k}} }{\partial \omega} \left( \omega_{j}, \gamma^{\mathcal{V}_{k}}(\omega_{j}) \right) = \left[\mu^{{\mathcal{V}}_{k}}\right]^{\prime} (\omega_{j}). \end{array} $$
  4. (iv)

    This follows from arguments similar to those in part (iii); here we use the analytical formulas for the second derivatives of singular value functions and exploit

    $$\frac{\partial^{2} T}{\partial \omega^{2}} \left( {\omega_{j}}, \gamma \right) = \frac{\partial^{2} T^{\mathcal{V}_{k}}}{\partial \omega^{2}} \left( {\omega_{j}},\gamma \right), \quad \forall \gamma \in(0,1]. $$

Remark 2.5

Although (2.3) is a maximin problem, when we expand the subspace we only utilize the optimizer of the outer maximization, namely the optimal ω, since H depends only on ω. Therefore, the problem turns into the maximization of μ(ω), and the Hermite interpolation properties between the original problem and the reduced one at only the optimal ω of the reduced problem suffice to obtain quadratic convergence, as we shall see in the next section.

3 Rate of convergence analysis

This section is devoted to the rate of convergence analysis of the subspace framework discussed in the previous section. The main theorem states that the rate of convergence of Algorithm 1 is quadratic, assuming that it converges at least locally. In particular, we consider three consecutive iterates \(\omega_{k-1}, \omega_{k}, \omega_{k+1}\) of Algorithm 1 that are sufficiently close to a local or global maximizer \(\omega_{*}\) of μ(⋅).

Before proceeding, we discuss some particular cases regarding the location of optimizers of the inner and outer optimization problems which were analyzed in detail in [22].

Case 1:

\(\text{rank}(\text{Im}(H(\mathrm{i}\omega_{*}))) \leq 1\) and \(\omega_{*} = 0\). It turns out that \(\omega_{*} = 0\) if and only if \(\text{Im}(H(\mathrm{i}\omega_{*})) = 0\). Therefore, in this case the inner minimization over γ disappears. More generally, if rank(Im(M)) ≤ 1, then the minimization over γ can be eliminated and μ(⋅) can be calculated by means of an explicit formula [22]. In particular, when \(\omega_{*} = 0\), we have

$$ \begin{array}{@{}rcl@{}} r_{\mathbb{R}}(A;B,C)^{-1} &=& \sigma_{2} \left( \begin{bmatrix} \text{Re}(H(\mathrm{i}\omega_{*}) ) & 0 \\ 0 & \text{Re}(H(\mathrm{i}\omega_{*})) \end{bmatrix}\right) \\ &=& \sigma_{1} \left( \begin{bmatrix} \text{Re}(H(\mathrm{i}\omega_{*})) & 0 \\ 0 & \text{Re}(H(\mathrm{i}\omega_{*})) \end{bmatrix}\right) \\ &=& \sigma_{1}(CA^{-1}B). \end{array} $$
Case 2:

\(\gamma_{*} = 1\) and \(\omega_{*}\neq 0\), where and throughout \(\gamma_{*}\) denotes the global minimizer of \(\sigma_{2}(\omega_{*},\cdot)\) over (0,1], that is, \(\mu(\omega_{*}) = \sigma_{2}(\omega_{*},\gamma_{*})\). In this case,

$$\mu \left( H(\mathrm{i}\omega_{*}) \right) = \sigma_{2}(T(\omega_{*},1)) = \sigma_{1}(T(\omega_{*},1)) = \sigma_{1}(H(\mathrm{i}\omega_{*})).$$

Hence, the computation of \(r_{\mathbb {R}}(A;B,C)\) becomes equivalent to the computation of \({\mathscr{H}}_{\infty }\) norm, which is already addressed in [1].

We remark that, in practice, our method works efficiently for the non-smooth cases above (see Section 4). However, the rate of convergence analysis discussed in this section does not cover these cases. Accordingly, throughout this section we assume that Cases 1 and 2 mentioned above are excluded, that is, \(\omega_{*}\neq 0\) and \(\gamma_{*}\neq 1\). More generally, in the rate of convergence analysis we address the smooth setting only. In other words, we assume in the rest of this section that the following smoothness assumption holds.

Assumption 3.1 (Smoothness)

The singular value \(\sigma_{2}(\omega_{*},\gamma_{*})\) of \(T(\omega_{*},\gamma_{*})\) is simple.

The rate of convergence analysis for the non-smooth optimization of the largest eigenvalue function of a Hermitian matrix that depends on one parameter is addressed in [17]. As a future work, the idea of [17] can be adapted to the non-smooth cases of the problem of this work.

Results of this section are uniform over all subspaces \(\mathcal {V}_{k}\) and orthonormal bases Vk as long as they satisfy the following non-degeneracy assumption.

Assumption 3.2 (Non-degeneracy)

For a given β > 0, we have

$$ \sigma_{\min} \left( D^{\mathcal{V}_{k}}(\mathrm{i}\omega_{*}) \right)\geq\beta $$
(3.1)

where \(D^{\mathcal {V}_{k}}(s) := (sI-A^{V_{k}})\) and \(\sigma _{\min \limits }(\cdot )\) denotes the smallest singular value of its matrix argument.

Assumption 3.2 combined with the Lipschitz continuity of \(\sigma _{\min \limits } (\cdot )\) implies that the smallest singular value in (3.1) is bounded away from zero in a neighborhood of \(\omega_{*}\). Formally, there exists a neighborhood \(\mathcal N (\omega _{*})\) of \(\omega_{*}\) such that

$$ \sigma_{\min} \left( D^{\mathcal{V}_{k}}(\mathrm{i}\omega) \right) \geq \beta/2,\quad \forall \omega \in \mathcal N (\omega_{*}), $$
(3.2)

see the beginning of Lemma A.1 in [2].

The next result concerns the uniform Lipschitz continuity of the singular value functions \(\omega \mapsto \sigma_{i}^{\mathcal {V}_{k}} (\omega ,\cdot ).\) For the proofs of the assertions (i) and (ii) we refer to [2, Lemma A.1]. The proof of assertion (iii) follows from part (ii); see [21, Lemma 8, (ii)]. Here we make use of the following notations

$$\mathcal{I}(\omega_{*}, \delta) := (\omega_{*} - \delta, \omega_{*} + \delta),\quad \mathcal{I}(\gamma_{*}, \delta) := (\gamma_{*} - \delta, \gamma_{*} + \delta)$$

and

$$\overline{\mathcal{I}} (\omega_{*}, \delta):= [\omega_{*} - \delta, \omega_{*} + \delta] \quad \text{and} \quad \overline{\mathcal{I}}(\gamma_{*}, \delta):= [\gamma_{*} - \delta, \gamma_{*} + \delta]$$

for given \(\omega \in \mathbb {R}, \gamma \in {\Omega }\) and δ > 0.

By a constant, here and in the rest of this section, we mean a scalar that may depend only on the parameters of the original problem which are independent of the subspace \(\mathcal {V}_{k}\) and orthonormal basis Vk for the subspace.

Lemma 3.3 (Uniform Lipschitz Continuity)

Suppose that Assumption 3.2 holds.

There exist constants \(\delta_{\omega}, \zeta > 0\) that satisfy the following:

  • For all γΩ,

    $$ \left\Vert{T^{\mathcal{V}_{k}}(\widetilde{\omega}, \gamma ) - T^{\mathcal{V}_{k}}(\widehat{\omega}, \gamma )}\right\Vert_{2} \leq \zeta |\widetilde{\omega} - \widehat{\omega}| \quad \forall \widetilde{\omega}, \widehat{\omega} \in\overline{\mathcal{I}} (\omega_{*}, \delta_{\omega}). $$
  • For all γΩ, and i = 1,2,3,

    $$ \left| \sigma_{i}^{\mathcal{V}_{k}}(\widetilde{\omega}, \gamma ) - \sigma_{i}^{\mathcal{V}_{k}}(\widehat{\omega}, \gamma ) \right| \leq \zeta |\widetilde{\omega} - \widehat{\omega}| \quad \forall\widetilde{\omega}, \widehat{\omega} \in \overline{\mathcal{I}} (\omega_{*}, \delta_{\omega}). $$
  • $$ \left |\mu^{\mathcal{V}_{k}}(\widetilde{\omega}) - \mu^{\mathcal{V}_{k}}(\widehat{\omega}) \right | \leq {\zeta} |\widetilde{\omega} - \widehat{\omega}| \quad \forall \widetilde{\omega}, \widehat{\omega} \in \overline{\mathcal{I}} (\omega_{*}, \delta_{\omega}). $$

Since our main result relies on the smooth setting, in the next result, we state and prove that there exists an interval in which μ(⋅) and \(\mu ^{\mathcal {V}_{k}}(\cdot )\) are both smooth under some mild assumptions.

Lemma 3.4

Suppose that Assumptions 3.1 and 3.2 hold.

(i):

There exist constants \(\widetilde {\delta }_{\omega }>0, \widetilde {\delta }_{\gamma }>0\) such that both σ2(ω,γ) and \(\sigma _{2}^{{\mathcal V_{k}}}(\omega ,\gamma )\) are simple, hence real analytic, for all \(\omega \in \mathcal I (\omega _{*}, \widetilde {\delta }_{\omega })\) and \(\gamma \in \mathcal I (\gamma _{*}, \widetilde {\delta }_{\gamma }).\)

(ii):

There exists a constant δω > 0 such that both μ(ω) and \(\mu ^{\mathcal {V}_{k}} (\omega )\) are real analytic for all \(\omega \in \mathcal I (\omega _{*}, {\delta }_{\omega }).\)

Proof

(i):

Since A is asymptotically stable, H is analytic on the imaginary axis and so (ω,γ)↦T(ω,γ) is continuous on \(\mathbb {R}\times {\varOmega }.\) Consequently, σ1(⋅,⋅),σ2(⋅,⋅),σ3(⋅,⋅) are continuous functions. The continuity of σ1(⋅,⋅),σ2(⋅,⋅),σ3(⋅,⋅) and the assumption that \(\sigma_{2}(\omega_{*},\gamma_{*})\) is simple imply that σ2(ω,γ) remains simple in a neighborhood \(\mathcal N(\omega _{*}, \gamma _{*})\) of \((\omega_{*},\gamma_{*})\). More precisely,

$$ \begin{array}{@{}rcl@{}} \sigma_{1}(\omega, \gamma) - \sigma_{2}(\omega, \gamma) \geq \widetilde{\epsilon} \quad \text{and} \quad \sigma_{2}(\omega, \gamma) - \sigma_{3}(\omega, \gamma) \geq \widetilde{\epsilon} \end{array} $$
(3.3)

for all \((\omega ,\gamma ) \in \mathcal N(\omega _{*},\gamma _{*}),\) for some \(\widetilde {\epsilon } >0.\) This shows that σ2(ω,γ) is simple in \(\mathcal N (\omega _{*}, \gamma _{*})\).

Now, by exploiting the interpolation properties

$$\sigma_{i} (\omega_{k},\gamma) = \sigma_{i}^{\mathcal{V}_{k}} (\omega_{k},\gamma), \quad i=1,2,3, \quad \gamma\in {\varOmega},$$

as well as the uniform Lipschitz continuity of \(\sigma _{i}^{\mathcal {V}_{k}} (\cdot ,\cdot ),\) for i = 1,2,3, and assuming without loss of generality that \(\omega_{k}\) is close enough to \(\omega_{*}\), we deduce that there exists a region \(\mathcal I(\omega _{*}, \widetilde {\delta }_{\omega } ) \times \mathcal I(\gamma _{*}, \widetilde {\delta }_{\gamma } ) \subseteq \mathcal N (\omega _{*}, \gamma _{*})\) in which \(\sigma _{2}^{\mathcal {V}_{k}} (\cdot ,\cdot )\) is simple. That is,

$$ \sigma_{1}^{\mathcal{V}_{k}}(\omega, \gamma) - \sigma_{2}^{\mathcal{V}_{k}}(\omega, \gamma) \geq {\epsilon} \quad \text{and} \quad \sigma_{2}^{\mathcal{V}_{k}}(\omega, \gamma) - \sigma_{3}^{\mathcal{V}_{k}}(\omega, \gamma) \geq {\epsilon} $$
(3.4)

for all \((\omega ,\gamma ) \in \mathcal I(\omega _{*}, \widetilde {\delta }_{\omega } ) \times \mathcal I(\gamma _{*}, \widetilde {\delta }_{\gamma } ) \) and for some constants \({\epsilon } \in (0,\widetilde {\epsilon }), \widetilde {\delta }_{\omega }>0, \widetilde {\delta }_{\gamma }>0.\) Therefore, σ2(⋅,⋅) and \(\sigma _{2}^{\mathcal {V}_{k}} (\cdot , \cdot )\) are simple singular value functions of T(ω,γ) and \(T^{\mathcal {V}_{k}}(\omega ,\gamma )\) in \(\mathcal I(\omega _{*}, \widetilde {\delta }_{\omega } ) \times \mathcal I(\gamma _{*}, \widetilde {\delta }_{\gamma } ).\) It follows that both σ2(⋅,⋅) and \(\sigma _{2}^{\mathcal {V}_{k}} (\cdot , \cdot )\) are real analytic in \(\mathcal I(\omega _{*}, \widetilde {\delta }_{\omega } ) \times \mathcal I(\gamma _{*}, \widetilde {\delta }_{\gamma } ). \)

(ii):

From the implicit function theorem, there exist constants \(\delta_{\omega} > 0, \delta_{\gamma} > 0\) such that for all \(\omega \in \mathcal I(\omega _{*}, {\delta }_{\omega })\) we have \(\gamma (\omega ) \in \mathcal I(\gamma _{*}, {\delta }_{\gamma } ).\) Now, assume without loss of generality that \(\delta _{\omega }\in (0, \widetilde {\delta }_{\omega })\) and \(\delta _{\gamma } \in (0, \widetilde {\delta }_{\gamma }),\) where \(\widetilde {\delta }_{\omega }, \widetilde {\delta }_{\gamma } \) are as in (3.4). Then for \( \omega \in \mathcal I(\omega _{*}, {\delta }_{\omega } ) \subset \mathcal I(\omega _{*}, \widetilde {\delta }_{\omega } ) \) we have \(\gamma (\omega ) \in \mathcal I(\gamma _{*}, {\delta }_{\gamma } ) \subset \mathcal I(\gamma _{*}, \widetilde {\delta }_{\gamma } ).\) Hence, (3.4) implies

$$ \sigma_{1}\left( \omega, \gamma(\omega) \right) - \sigma_{2} \left( \omega, \gamma(\omega) \right) \geq {\epsilon}\quad \text{and} \quad \sigma_{2} \left( \omega, \gamma(\omega) \right) - \sigma_{3} \left( \omega, \gamma(\omega) \right) \geq {\epsilon} $$
(3.5)

for all \(\omega \in \mathcal I \left (\omega _{*}, {\delta }_{\omega } \right ).\) Thus, \(\sigma _{2} \left (\omega , \gamma (\omega ) \right )\) is simple and so μ(ω) is real analytic in \( \mathcal I \left (\omega _{*}, {\delta }_{\omega } \right ).\) Considering \(\omega_{k}\) close enough to \(\omega_{*}\) and following similar arguments as above, we conclude that \(\mu ^{\mathcal {V}_{k}}(\omega )\) is also real analytic in \( \mathcal I \left (\omega _{*}, {\delta }_{\omega } \right ).\)

The next result concerns the uniform boundedness of the third derivatives of μ(⋅) and \(\mu ^{\mathcal {V}_{k}}(\cdot )\) in a neighborhood of \(\omega_{*}\). This uniform boundedness of the third derivatives gives rise to the uniform Lipschitz continuity of the second derivatives of these functions, which becomes important in our main result. For a proof, we refer to Lemma A.2 (ii) and Proposition 3.5 in [2].

Lemma 3.5

Suppose that Assumptions 3.1 and 3.2 hold.

(i):

Both μ(⋅) and \(\mu ^{\mathcal {V}_{k}}(\cdot )\) are at least three times continuously differentiable in \(\mathcal I (\omega _{*},\delta _{\omega }),\) where δω is as in Lemma 3.4.

(ii):

For each \(\hat {\delta }_{\omega }\in (0, \delta _{\omega })\) there exists a constant ξ > 0 such that for all \(\omega \in \mathcal I (\omega _{*}, \hat {\delta }_{\omega })\) we have

$$\left|\mu^{\prime\prime\prime}(\omega)\right|\leq \xi \quad \text{and} \quad \left|\left[\mu^{\mathcal{V}_{k}}\right]^{\prime\prime\prime}(\omega)\right|\leq \xi.$$

Now, we are ready to present our main result. Here, we recall that \(\omega_{k-1}, \omega_{k}, \omega_{k+1}\) are assumed to be sufficiently close to a local maximizer \(\omega_{*}\) of μ(⋅).

Theorem 3.6 (Local Quadratic Convergence)

Suppose that Assumptions 3.1 and 3.2 hold and \(\mu ^{\prime \prime }(\omega _{*}) \neq 0.\) Then, for Algorithm 1, there exists a constant c > 0 such that

$$\frac{ | \omega_{k+1} - \omega_{\ast} |}{| \omega_{k} - \omega_{\ast} |^{2}} \leq c.$$

Proof

Assume without loss of generality that \(\omega_{k}, \omega_{k-1}\) lie in \(\mathcal I (\omega _{*}, \delta _{\omega } )\) and are close enough to \(\omega_{*}\) so that \(\mathcal I (\omega _{k}, h_{k} )\subset \mathcal I (\omega _{*}, \delta _{\omega } ),\) where \(h_{k} := |\omega_{k} - \omega_{k-1}|\) and \(\delta_{\omega}\) is as in Lemma 3.4 and Lemma 3.5.

Assume further that \(\mu ^{\prime \prime }(\omega _{k}) \neq 0,\)

that is,

$$ | \mu^{\prime\prime}(\omega_{k}) | \geq l $$
(3.6)

for some constant l > 0. Existence of such l > 0 follows from the assumption \(\mu ^{\prime \prime }(\omega _{*}) \neq 0 \) and the continuity of \(\mu ^{\prime \prime }(\cdot )\) in \(\mathcal I (\omega _{*}, \delta _{\omega } ).\) Due to analyticity of μ(⋅) in \(\mathcal I (\omega _{*}, \delta _{\omega } )\) (Lemma 3.4, part (ii))

we have

$$ 0 = \mu^{\prime}(\omega_{*}) = \mu^{\prime}(\omega_{k}) + {{\int}_{0}^{1}} \mu^{\prime\prime} \left( \omega_{k} + t (\omega_{*} - \omega_{k}) \right) (\omega_{*} - \omega_{k})\mathrm{d}t. $$
(3.7)

Dividing both sides of (3.7) by \(\mu ^{\prime \prime }(\omega _{k})\) we obtain

$$ 0 = \frac{\mu^{\prime}(\omega_{k})}{\mu^{\prime\prime}(\omega_{k})} + \left( \omega_{\ast} - \omega_{k} \right) + \frac{1}{\mu^{\prime\prime}(\omega_{k})} {{\int}_{0}^{1}} \left( \mu^{\prime\prime} \left( \omega_{k} + t (\omega_{\ast} - \omega_{k} )\right) - \mu^{\prime\prime}(\omega_{k}) \right) \left( \omega_{\ast} - \omega_{k} \right) \mathrm{d}t. $$
(3.8)

Application of Taylor’s theorem with integral remainder to \([\mu ^{\mathcal {V}_{k}}]^{\prime } (\cdot )\) at \(\omega_{k}\), and optimality of \(\omega_{k+1}\) with respect to \(\mu ^{\mathcal {V}_{k}}(\cdot )\), give rise to

$$ 0 = [\mu^{\mathcal{V}_{k}}]^{\prime}(\omega_{k+1} ) = [\mu^{\mathcal{V}_{k}}]^{\prime}(\omega_{k}) + {{\int}_{0}^{1}} [\mu^{\mathcal{V}_{k}}]^{\prime\prime} \left( \omega_{k} + t (\omega_{k+1} - \omega_{k}) \right) (\omega_{k+1} - \omega_{k})\mathrm{d}t, $$
(3.9)

which implies

$$ \begin{array}{@{}rcl@{}} 0 &=& \frac{ [\mu^{\mathcal{V}_{k}}]^{\prime}(\omega_{k})}{[\mu^{\mathcal{V}_{k}}]^{\prime\prime}(\omega_{k})} + \left( \omega_{k+1} - \omega_{k} \right)\\ &+& \frac{1}{[\mu^{\mathcal{V}_{k}}]^{\prime\prime}(\omega_{k})} {{\int}_{0}^{1}} \left( [\mu^{\mathcal{V}_{k}}]^{\prime\prime} \left( \omega_{k} + t (\omega_{k+1} - \omega_{k} )\right) - [\mu^{\mathcal{V}_{k}}]^{\prime\prime}(\omega_{k}) \right) \left( \omega_{k+1} - \omega_{k} \right) \mathrm{d}t. \end{array} $$
(3.10)

Combining (3.8) and (3.10) and exploiting the Hermite interpolation properties

$$\mu^{\prime}(\omega_{k}) = [\mu^{\mathcal{V}_{k}}]^{\prime}(\omega_{k}) \quad \text{and} \quad \mu^{\prime\prime}(\omega_{k}) = [\mu^{\mathcal{V}_{k}}]^{\prime\prime}(\omega_{k})$$

we deduce

$$ \begin{array}{@{}rcl@{}} &&0 = \omega_{*} - \omega_{k+1} + \frac{1}{\mu^{\prime\prime}(\omega_{k})} {{\int}_{0}^{1}} \left( \mu^{\prime\prime} \left( \omega_{k} + t (\omega_{\ast} - \omega_{k} )\right) - \mu^{\prime\prime}(\omega_{k}) \right) \left( \omega_{\ast} - \omega_{k} \right) \mathrm{d}t \\ &&\quad\quad\quad\quad\quad\quad - \frac{1}{\mu^{\prime\prime}(\omega_{k})} {{\int}_{0}^{1}} \left( [\mu^{\mathcal{V}_{k}}]^{\prime\prime} \left( \omega_{k} + t (\omega_{k+1} - \omega_{k} )\right) - [\mu^{\mathcal{V}_{k}}]^{\prime\prime}(\omega_{k}) \right) \left( \omega_{k+1} - \omega_{k} \right) \mathrm{d}t \end{array} $$
(3.11)

implying

$$ \begin{array}{@{}rcl@{}} &&\left | \omega_{*} - \omega_{k+1} \right | \leq \left | \frac{1}{\mu^{\prime\prime}(\omega_{k})} \right | \left | {{\int}_{0}^{1}} \left( \mu^{\prime\prime} \left( \omega_{k} + t (\omega_{\ast} - \omega_{k} )\right) - \mu^{\prime\prime}(\omega_{k}) \right)\left( \omega_{\ast} - \omega_{k} \right) \mathrm{d}t \right | \\ &&\quad\quad\quad\quad\quad\quad+ \left | \frac{1}{\mu^{\prime\prime}(\omega_{k})} \right | \left | {{\int}_{0}^{1}} \left( [\mu^{\mathcal{V}_{k}}]^{\prime\prime} \left( \omega_{k} + t (\omega_{k+1} - \omega_{k} )\right) - [\mu^{\mathcal{V}_{k}}]^{\prime\prime}(\omega_{k}) \right)\left( \omega_{k+1} -\omega_{k} \right) \mathrm{d}t \right | \end{array} $$

In the last inequality, we exploit the Lipschitz continuity of both \(\mu ^{\prime \prime }(\cdot )\) and \(\left [ \mu ^{\mathcal {V}_{k}} \right ]^{\prime \prime }(\cdot )\) in \(\mathcal I (\omega _{*}, \delta _{\omega })\) (which follows from Lemma 3.5) and employ (3.6) to obtain

$$ \left | \omega_{*} - \omega_{k+1} \right | \leq \xi/l \left( |\omega_{*} - \omega_{k} |^{2} + |\omega_{k+1} - \omega_{k} |^{2} \right) $$
(3.12)

where ξ is the Lipschitz constant for \(\mu ^{\prime \prime }\) and \([\mu ^{\mathcal {V}_{k}}]^{\prime \prime }.\)

The result follows by substituting the inequality

$$|\omega_{k+1} - \omega_{k} |^{2} \leq 2 |\omega_{k+1} - \omega_{*} |^{2} + 2|\omega_{k} - \omega_{*} |^{2}$$

in (3.12). □
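
Spelling out the final substitution: combining the last inequality with (3.12) gives

$$ \left| \omega_{*} - \omega_{k+1} \right| \leq \frac{\xi}{l} \left( 3 \left|\omega_{*} - \omega_{k}\right|^{2} + 2 \left|\omega_{*} - \omega_{k+1}\right|^{2} \right), $$

and, since \(\omega_{k+1}\) is assumed sufficiently close to \(\omega_{*}\), we may additionally assume \(2 (\xi/l) \left|\omega_{*} - \omega_{k+1}\right| \leq 1/2\), so that

$$ \left| \omega_{*} - \omega_{k+1} \right| \leq 6 \frac{\xi}{l} \left|\omega_{*} - \omega_{k}\right|^{2}, $$

that is, the assertion of Theorem 3.6 holds with \(c = 6\xi/l\).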

4 Numerical experiments

In this section, we present numerical results obtained by our MATLAB implementation of Algorithm 1. First, we introduce some important implementation details and the test setup. Then, we report the numerical results on several examples. We exhibit the results in detail in Tables 1 and 3. All examples are taken from the Model Order Reduction Wiki (MOR Wiki) website, EigTool and Compleib. In all examples the system matrices (A,B,C) are real and A is asymptotically stable. Our numerical experiments have been performed using MATLAB (R2018b) on an iMac with an Intel Core i5 1.4 GHz processor and 4 GB of memory.

Table 1 Numerical results of the experiment on several test examples; the results of Algorithm 1 are compared with the results of the method in [24]

4.1 Implementation details and test setup

At each iteration of Algorithm 1, a reduced problem is solved, namely, \(r^{\mathcal {V}}(A;B,C)\) is calculated. Therefore, at each iteration, we solve a small-scale maximin problem. Since the cost function of the inner minimization problem is unimodal [22], we utilize the golden section search algorithm to solve the inner minimization problem; in fact, we use the MATLAB function fminbnd. The outer maximization problem is solved by means of a MATLAB implementation of the algorithm in [24], which is an extension of the Boyd-Balakrishnan algorithm [4]. The approach of [24] is a level-set method based on the relation between the imaginary eigenvalues of a Hamiltonian matrix and the singular values of the transfer function. Therefore, at each iteration, the method requires the solution of a Hamiltonian eigenvalue problem of size four times the order of the given system. But since the reduced problems are of small order, the method works efficiently.

Algorithm 1 terminates in practice when the relative distance between \(r^{\mathcal {V}_{k}}(A;B,C)\) and \(r^{\mathcal {V}_{k-1}}(A;B,C)\) is less than a prescribed tolerance for some k > 1, or when the number of iterations exceeds a specified integer. Formally, the algorithm terminates if

$$ k > k_{\max} \quad \text{or} \quad \left| r^{\mathcal{V}_{k}}(A;B,C) - r^{\mathcal{V}_{k-1}}(A;B,C) \right| < \varepsilon \cdot \frac{1}{2} \left| r^{\mathcal{V}_{k}}(A;B,C) + r^{\mathcal{V}_{k-1}}(A;B,C) \right|. $$

In our numerical experiments, we set \(\varepsilon = 10^{-5}\) and \(k_{\max \limits } = 25\).

Algorithm 1 converges locally. In our previous works [1, 2], we initialize the algorithm with more than one interpolation point to reduce the possibility of stagnating at a local optimizer. Here, we apply the same strategy.

We choose \(k_{0} \geq 2\) initial interpolation points that are equidistantly distributed in the interval \([0,\omega _{\max \limits }],\) where \(\omega _{\max \limits }\) is a problem-dependent parameter. In our experiments, \(k_{0} = 10\) is usually sufficient to capture the global maximum. However, there are more complicated examples which need a larger number of initial interpolation points. For instance, for the Demmel examples, the algorithm should be started with at least 40 initial interpolation points in the interval [0,1000] to find the global maximum in a few iterations. The algorithm may converge to the correct global maximizer with fewer initial points, but then more iterations may be needed, since less global information is available.

4.2 Test results

In Tables 1 and 3 we report the outcome of our numerical experiments. The number of additional iterations after the construction of the initial reduced function needed to return \(r_{\mathbb {R}}(A;B,C)\) is denoted by niter in both tables. Furthermore, the dimension of A, the number of columns of B and the number of rows of C are denoted by n, m, p, respectively.

In Table 1, we exhibit the results of the experiments on several test examples. We compare the results of our subspace method (based on the one-sided framework) with those generated by [24]. For all examples, with or without subspace acceleration, we retrieve the same \(r_{\mathbb {R}}(A;B,C)\) values up to the prescribed tolerance. The proposed subspace framework outperforms the approach in [24] for all large-scale examples listed in Table 1. The ratios between the runtime required by the method in [24] and that required by Algorithm 1 are listed in the last column of the table. We observe quadratic convergence in all examples. More precisely, for all examples listed in Table 1, at most four additional iterations after the initial interpolation are required to estimate \(r_{\mathbb {R}}(A;B,C)\) up to the given tolerance. The errors in the iterates as a function of the number of iterations for the RCL circuit example are reported in Table 2.

Table 2 The errors of the iterates of Algorithm 1; the errors \(|\mu_{k}-\mu_{*}|\) and \(|r_{k}-r_{*}|\) are listed for the RCL circuit example

Recall that the rate of convergence analysis discussed in Section 3 relies on the smooth setting, and when \(\omega_{*} = 0\), the singular value function σ2(⋅,⋅) is not simple and hence not smooth. However, from the middle part of Table 1 we see that quadratic convergence is still observed for this particular non-smooth case.

The last four examples in Table 1 are of small scale. The ratios of the runtimes required by the method in [24] over the runtimes required by Algorithm 1 for those small-scale problems suggest that our subspace method is not efficient on small-scale examples. This is because, for small-scale examples, Algorithm 1 carries out several calls (at least two) to the method of [24], and therefore needs more time to solve the original small-scale problem than a single run of [24].

In Table 3, we compare the results of the subspace method based on the one-sided framework with those of the two-sided variant. Both frameworks return the same correct value of \(r_{\mathbb{R}}(A;B,C)\). The ratios of the runtimes required by the one-sided framework over the runtimes required by the two-sided framework are listed in the last column of the table. The results suggest that when \(\min \limits \left \{m,p\right \}\) is relatively large (say, larger than 10), the two-sided framework is more efficient; otherwise, the one-sided framework performs better.

Table 3 Comparison of the numerical results obtained by the one-sided framework (OSF) and the two-sided framework (TSF)

4.3 Downside of the method

The main limitation of the algorithm discussed in this work is that it converges only locally. If the algorithm starts far away from the global maximizer of μ(⋅), or if it starts with an insufficient number of initial interpolation points, then it may miss a global maximizer of μ(⋅). This fact is illustrated in Fig. 1, where Algorithm 1 is run on the Demmel1K example with only two random initial points. As a result, the algorithm converges to a local maximum which is attained at zero. To get a better illustration, we plot the figure on a symmetric interval; the global maximum is not depicted in the figure, as it is much larger than the local maximum attained at zero. However, it is still easily seen that zero is not a global maximizer, as the function attains higher values at other points. One possible remedy is to start the algorithm with more initial interpolation points. For this particular example, if the algorithm starts with 40 initial points, then it retrieves the global maximum.

Fig. 1

Intermediate reduced functions obtained by Algorithm 1 for the Demmel_1k example. The blue line indicates the original function, while the red dashed lines represent the reduced functions. The locations of the maximizers and maximum of the reduced functions are indicated by the red crosses and circles, respectively

5 Conclusion

We have proposed a subspace framework to approximate the structured real stability radius for large-scale systems. The method is based on an interpolatory model order reduction technique. In particular, we reduce the large-scale problem to a sequence of small ones, and by repeatedly solving small-scale problems we obtain an estimate of the large-scale stability radius. Our subspace method yields Hermite interpolation properties between the full and reduced problems, which ensure quadratic convergence. A rigorous analysis of this fast convergence was established, and the efficiency of the proposed method was demonstrated on several numerical experiments. The subspace idea addressed in this work can be applied to the computation of the structured real stability radius of a large-scale time-delay system. Moreover, a different variant of the subspace method (see, for example, [16, 21]) can be applied to approximate the unstructured real stability radius for large-scale systems. We aim to focus on these problems in future works.