1 Introduction

Consider the linear time-invariant (LTI) multi-input, multi-output (MIMO) system described by the state-space equations

$$\begin{aligned} \varSigma := \left\{ \begin{array}{l} \dot{x}(t) = Ax(t) + Bu(t)\\ y(t) = Cx(t),\\ \end{array}\right. \end{aligned}$$
(1)

where \(x(t) \in \mathbb {R}^n\) denotes the state vector, and \(u(t) \in \mathbb {R}^s\) and \(y(t)\in \mathbb {R}^s\) are the input and output signals, respectively. The matrix \(A \in \mathbb {R}^{n\times n}\) is assumed to be large, sparse and stable (all its eigenvalues lie in the open left half-plane); \(B\in \mathbb {R}^{n\times s}\) and \(C^{T} \in \mathbb {R}^{n\times s}\). The transfer function of the system (1) is obtained by applying the Laplace transform, which yields

$$\begin{aligned} \left\{ \begin{array}{l} \omega X(\omega ) = AX(\omega ) + BU(\omega )\\ Y(\omega ) = CX(\omega ),\\ \end{array}\right. \end{aligned}$$
(2)

where \(X(\omega )\), \(Y(\omega )\) and \(U(\omega )\) are the Laplace transforms of x(t), y(t) and u(t), respectively (assuming a zero initial state). Eliminating \(X(\omega )\) from the two equations gives \(Y (\omega ) = H(\omega )U(\omega )\), where \(H(\omega )\), called the transfer function of the system (1), is defined as

$$\begin{aligned} H(\omega ) := C(\omega I_n - A)^{-1}B. \end{aligned}$$
(3)
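As an illustration (ours, not part of the original presentation), \(H(\omega )\) can be evaluated at a given frequency point without ever forming the inverse, by solving a sparse shifted linear system with s right-hand sides. The following Python sketch assumes SciPy sparse matrices; the function name is hypothetical.

```python
# A minimal sketch (ours, not the paper's code): H(w) = C (w I - A)^{-1} B,
# computed via one sparse LU factorization and s right-hand sides,
# rather than via an explicit inverse.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def transfer_function(A, B, C, w):
    """Evaluate H(w) for sparse A (n x n), dense B (n x s), C (s x n)."""
    n = A.shape[0]
    # Shifted matrix (w I - A); complex dtype so that w may be j*omega.
    M = (w * sp.identity(n, format="csc", dtype=complex) - A).tocsc()
    X = spla.splu(M).solve(np.asarray(B, dtype=complex))  # (w I - A) X = B
    return C @ X                                          # H(w) is s x s
```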

The goal of model reduction is to produce a much lower-order system \(\varSigma _m\) with the state-space form

$$\begin{aligned} \varSigma _m: \left\{ \begin{array}{l} \dot{x}_m(t) = A_mx_m(t)+ B_mu(t) \\ y_m(t) = C_mx_m(t), \\ \end{array}\right. \end{aligned}$$
(4)

and its transfer function

$$\begin{aligned} H_m(\omega ) := C_m(\omega I_m - A_m)^{-1}B_m \in \mathbb {R}^{s\times s}, \end{aligned}$$
(5)

where \(A_m \in \mathbb {R}^{m\times m}\), \(B_m\in \mathbb {R}^{m\times s}\) and \(C_m^{T} \in \mathbb {R}^{m\times s}\) (with \(m \ll n\)), such that, for any given input u(t), the output \(y_m(t)\) of the reduced system \(\varSigma _m\) is as close as possible to the output of the original system; that is, \(\Vert y-y_m \Vert \) is small for some chosen norm.

Various model reduction methods for MIMO systems have been explored in recent years, such as balanced truncation [17, 19, 25], moment matching and interpolation methods [2, 10] and optimal Hankel norm approximation [13]. Popular model reduction techniques for large-scale linear dynamical systems are based on interpolation methods [10, 12]. These methods use Krylov, block Krylov or rational Krylov subspaces to produce reduced models whose transfer function interpolates the original transfer function at selected interpolation points. A more powerful method, based on tangential interpolation, was first introduced in [20]: the transfer function is interpolated at points \(s_i\), \(i=1,\ldots ,m\), along directions \(d_i\), \(i=1,\ldots ,m\), by using the rational subspaces generated by \((s_1I-A)^{-1}Bd_1,\ldots ,(s_m I-A)^{-1}B d_m\). In [26], the authors proposed the iterative tangential interpolation algorithm (ITIA), in which the interpolation points and directions are selected in a specific way. In [9], the authors proposed an adaptive way of choosing the parameters \((s_i, d_i)\), \(i = 1,\ldots ,m\), by optimizing the error.

In the present paper, we use an adaptive strategy, based on the global Arnoldi method, to obtain these parameters, and then we derive new tangential Krylov-based methods.

The paper is organized as follows. In Sect. 2, we introduce the tangential interpolation method. In Sect. 3, we recall some definitions and properties that will be used later. Section 4 presents the adaptive global tangential Arnoldi method, which is obtained by using an adaptive strategy for selecting the shifts and the tangential directions used in the construction of rational tangential Krylov subspaces. The last section is devoted to numerical tests and comparisons with some well-known model order reduction methods.

2 Tangential Interpolation

2.1 Moments and Interpolation

We first give the following definition.

Definition 1

Consider the LTI dynamical system \(\varSigma \) defined by (1). Its associated transfer function \(H(\omega ) =C(\omega I-A)^{-1}B \) can be expanded in a series around a given shift point \(\sigma \in \mathbb {R}\) as follows

$$\begin{aligned} H(\omega )= \sum _{i=0}^\infty (-1)^i\, \eta _i^{(\sigma )} (\omega -\sigma )^i, \end{aligned}$$
(6)

where \(\eta _i^{(\sigma )} \in \mathbb {R}^{s \times s}\) is called the i-th moment of the system at \(\sigma \) and is given by

$$\begin{aligned} \eta _i^{(\sigma )} = C(\sigma I_n-A)^{-(i+1)}B =\frac{(-1)^i}{i!}\frac{d^i}{d\omega ^i}H(\omega )\Big |_{\omega =\sigma }. \end{aligned}$$
(7)

In the case where \( \sigma = \infty \), the \(\eta _i^{(\sigma )}\)’s are called Markov parameters and are given by

$$\begin{aligned} \eta _i = CA^iB. \end{aligned}$$

Given \(H(\omega )\) as in (3), the interpolation problem consists in finding a reduced model \(\varSigma _m\) such that \(H_m(\omega )\) interpolates \(H(\omega )\), as well as a certain number of its derivatives, at points \(\sigma \) in the complex plane.
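For a finite shift \(\sigma \), the moments can be computed from a single factorization of \(\sigma I_n - A\) followed by repeated solves. The following sketch (ours, with a hypothetical function name and dense matrices for simplicity) illustrates this:

```python
# A minimal sketch (ours): the moments eta_i = C (sigma I - A)^{-(i+1)} B
# of (7), computed with one LU factorization and k repeated solves.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def moments(A, B, C, sigma, k):
    n = A.shape[0]
    lu, piv = lu_factor(sigma * np.eye(n) - A)   # factor once
    W = lu_solve((lu, piv), B)                   # W = (sigma I - A)^{-1} B
    etas = [C @ W]                               # eta_0
    for _ in range(1, k):
        W = lu_solve((lu, piv), W)               # one more application of the inverse
        etas.append(C @ W)                       # eta_i, an s x s matrix
    return etas
```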

2.2 The Tangential-Interpolation Projection Problem

The aim of this paper is to produce a low-order transfer function, \(H_m\), that approximates the large-order transfer function H; we want \(H_m(\sigma ) \approx H(\sigma )\) for a set of values of \(\sigma \), with \(m \ll n\), in a sense that will be made precise later. Various model reduction methods for MIMO dynamical systems have been explored in recent years. Some of them are based on Krylov subspace interpolation and proceed as follows: select a set of points \(\{\sigma _i\}^m_{i=1} \subset \mathbb {C}\) and then seek a reduced-order transfer function, \(H_m(\omega )\), such that \(H_m(\sigma _i) = H(\sigma _i)\) for \(i = 1,\ldots ,m\); see [3, 4, 7, 8] for more details.

Tangential interpolation is a more powerful method in which the interpolation conditions above act along specified directions. The problem can be stated as follows.

Assume that the following parameters are given:

  • Left interpolation points \(\{\mu _i\}^m_{i=1} \subset \mathbb {C}\) and left tangent directions \(\{l_i\}^m_{i=1} \subset \mathbb {C}^{s}\).

  • Right interpolation points \(\{\sigma _i\}^m_{i=1} \subset \mathbb {C}\) and right tangent directions \(\{r_i\}^m_{i=1} \subset \mathbb {C}^{s}\).

The aim of tangential interpolation is to produce a low-dimensional LTI dynamical system (4) whose transfer function \(H_m\) in (5) is a tangential interpolant to H, i.e.,

$$\begin{aligned} l_i^TH_m(\mu _i) = l_i^TH(\mu _i) \quad \text {and} \quad H_m(\sigma _i)r_i = H(\sigma _i)r_i, \quad i=1,\ldots ,m. \end{aligned}$$
(8)

The interpolation points and tangent directions are selected to realize the model reduction goals described later. We want to enforce the conditions (8) without ever computing the quantities in (8) explicitly, since such computations are numerically ill-conditioned, as shown in [10] for single-input/single-output dynamical systems. This can be achieved by using Petrov-Galerkin projections with carefully chosen projection subspaces.

The model reduction interpolation projectors were first introduced by Skelton et al. in [6, 16]. Later, Grimme [14] recast this approach in a numerically effective framework by using the rational Krylov subspace method of Ruhe [27]. For MIMO dynamical systems, a rational tangential interpolation method has recently been developed by Gallivan et al. [11]; see also [1].

3 The Global Method

3.1 Definitions

We begin by recalling some notation that will be used later. For \(Y,Z \in \mathbb {R}^{n\times p}\), we define the inner product \(\left<Y,Z\right>_F = tr(Y^TZ)\), where \(tr(\cdot )\) denotes the trace. The associated norm is the Frobenius norm, denoted by \(\Vert \cdot \Vert _F\).

The matrix product \(A \otimes B = [a_{i,j}B]\) denotes the well-known Kronecker product of the matrices A and B, which satisfies the following properties (checked numerically in the sketch after the list):

  1. \((A \otimes B)(C \otimes D) = (AC \otimes BD)\).

  2. \((A\otimes B)^T = A^T \otimes B^T \).

  3. \((A\otimes B)^{-1} = A^{-1} \otimes B^{-1} \), if A and B are invertible.
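These identities are easy to verify numerically; the following sketch (ours) checks them on random dense matrices with np.kron:

```python
# A quick numerical check (ours) of the three Kronecker-product
# properties listed above, using random dense matrices.
import numpy as np

rng = np.random.default_rng(0)
A, C = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
B, D = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))

assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))
assert np.allclose(np.linalg.inv(np.kron(A, B)),
                   np.kron(np.linalg.inv(A), np.linalg.inv(B)))
```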

We also use the matrix product \(\diamond \) defined in [5] as follows.

Definition 2

Let \(A = [A_1,\ldots ,A_m]\) and \(B = [B_1,\ldots , B_l]\) be matrices of dimension \(n \times mp\) and \(n\times lp\), respectively, where \(A_i\) and \(B_j\) \((i = 1,\ldots ,m\), \(j = 1,\ldots ,l)\) are in \(\mathbb {R}^{n\times p}\). Then the \(m\times l\) matrix \(A^T \diamond B\) is defined by:

$$\begin{aligned} A^T \diamond B= \left( \begin{array}{cccc} \left<A_1,B_1\right>_F &\quad \left<A_1,B_2\right>_F &\quad \dots &\quad \left<A_1,B_l\right>_F \\ \left<A_2,B_1\right>_F &\quad \left<A_2,B_2\right>_F &\quad \dots &\quad \left<A_2,B_l\right>_F \\ \vdots &\quad \vdots &\quad \ddots &\quad \vdots \\ \left<A_m,B_1\right>_F &\quad \left<A_m,B_2\right>_F &\quad \dots &\quad \left<A_m,B_l\right>_F \end{array}\right) \end{aligned}$$
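A direct implementation of this product is straightforward; the following sketch (ours, with a hypothetical function name) computes \(A^T \diamond B\) from the block partitioning:

```python
# A minimal sketch (ours) of the diamond product of Definition 2:
# A is n x mp and B is n x lp, partitioned into blocks of p columns,
# and (A^T diamond B)_{ij} = <A_i, B_j>_F = tr(A_i^T B_j).
import numpy as np

def diamond(A, B, p):
    m, l = A.shape[1] // p, B.shape[1] // p
    T = np.empty((m, l))
    for i in range(m):
        Ai = A[:, i * p:(i + 1) * p]
        for j in range(l):
            Bj = B[:, j * p:(j + 1) * p]
            T[i, j] = np.sum(Ai * Bj)   # tr(Ai^T Bj) as an entrywise sum
    return T
```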

Remark

  1. If \(p = 1\), then \(A^T \diamond B = A^T B\).

  2. The matrices \(A = [A_1,\ldots ,A_m]\) and \(B = [B_1,\ldots ,B_m]\) are F-biorthonormal if and only if \(A^T \diamond B = I_m\), i.e.,

    $$\begin{aligned} tr(A_i^TB_j) = \delta _{i,j} = \left\{ \begin{array}{ll} 0 &\quad \mathrm{if}\ i\ne j \\ 1 &\quad \mathrm{if}\ i= j, \end{array}\right. \quad i,j=1,\ldots ,m. \end{aligned}$$
    (9)

  3. If \(X \in \mathbb {R}^{n\times p}\), then \(X^T \diamond X =\Vert X \Vert _F^2\).

The following proposition gives some properties satisfied by the above product.

Proposition 1

[5] Let \(A,\ B,\ C \in \mathbb {R}^{n\times ps}\), \(D \in \mathbb {R}^{n\times n}\), \(L \in \mathbb {R}^{p\times p}\) and \(\alpha \in \mathbb {R}\). Then we have (a numerical check of property 6 is given after the list):

  1. \((A + B)^T \diamond C = A^T \diamond C + B^T \diamond C\).

  2. \(A^T \diamond (B + C) = A^T \diamond B + A^T \diamond C\).

  3. \((\alpha A)^T \diamond C = \alpha (A^T \diamond C)\).

  4. \((A^T \diamond B)^T = B^T \diamond A\).

  5. \((DA)^T \diamond B = A^T \diamond (D^T B)\).

  6. \(A^T \diamond (B(L \otimes I_s)) = (A^T \diamond B)L.\)

  7. \(\Vert A^T \diamond B \Vert _F \le \Vert A \Vert _F \Vert B \Vert _F\).
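As with the Kronecker identities, these properties are easy to check numerically. The sketch below (ours) verifies property 6, which is used repeatedly in the proofs of Sect. 4; diamond() is the sketch given after Definition 2, inlined here for self-containment.

```python
# A numerical check (ours) of property 6:
#   A^T diamond (B (L kron I_s)) = (A^T diamond B) L.
import numpy as np

def diamond(A, B, p):                       # as sketched after Definition 2
    m, l = A.shape[1] // p, B.shape[1] // p
    return np.array([[np.sum(A[:, i*p:(i+1)*p] * B[:, j*p:(j+1)*p])
                      for j in range(l)] for i in range(m)])

rng = np.random.default_rng(1)
n, p, s = 20, 5, 3                          # p blocks, each of width s
A = rng.standard_normal((n, p * s))
B = rng.standard_normal((n, p * s))
L = rng.standard_normal((p, p))

lhs = diamond(A, B @ np.kron(L, np.eye(s)), s)
rhs = diamond(A, B, s) @ L
assert np.allclose(lhs, rhs)
```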

4 The Adaptive Global Tangential Arnoldi Algorithm

In this section, we introduce the global tangential Arnoldi algorithm, which computes an F-orthonormal basis of a specific matrix subspace, and we derive some algebraic relations associated with this algorithm.

4.1 The Global Tangential Arnoldi Method

The global tangential Arnoldi procedure allows us to construct an F-orthonormal basis of the matrix subspace

$$\begin{aligned} \mathbb {V}_m =Span\{(\sigma _1I_n-A)^{-1}BR_1,\ldots ,(\sigma _mI_n-A)^{-1}BR_m\}, \end{aligned}$$
(10)

where \(\sigma = \{\sigma _i\}_{i=1}^m\) is a set of interpolation points and \(R=\{R_i\}_{i=1}^m\) is a set of tangential matrix directions, where \(R_i\in \mathbb {R}^{s \times s}\), \(A\in \mathbb {R}^{n\times n}\) and \(B\in \mathbb {R}^{n\times s}\).

The global tangential Arnoldi algorithm is summarized as follows:

[Algorithm 1: The global tangential Arnoldi algorithm]
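Since the algorithm itself is displayed as a figure, here is a minimal sketch (ours, with dense solves for readability and hypothetical names) of the process it describes: each new block \((\sigma _{j+1}I_n-A)^{-1}BR_{j+1}\) is F-orthogonalized against the previous blocks by a global Gram-Schmidt sweep.

```python
# A minimal sketch (ours) of the global tangential Arnoldi process:
# builds F-orthonormal blocks V_1,...,V_{m+1} (each n x s) and the
# (m+1) x m upper Hessenberg matrix of F-inner products.
import numpy as np

def global_tangential_arnoldi(A, B, sigmas, Rs):
    """sigmas and Rs hold the m+1 shifts and s x s direction blocks."""
    n = A.shape[0]
    m = len(sigmas) - 1
    frob = lambda X, Y: np.sum(X * Y)         # <X, Y>_F = tr(X^T Y)
    W = np.linalg.solve(sigmas[0] * np.eye(n) - A, B @ Rs[0])
    Vs = [W / np.sqrt(frob(W, W))]            # V_1, F-normalized
    H = np.zeros((m + 1, m))
    for j in range(m):
        W = np.linalg.solve(sigmas[j + 1] * np.eye(n) - A, B @ Rs[j + 1])
        for i, Vi in enumerate(Vs):           # global (F-) Gram-Schmidt
            H[i, j] = frob(Vi, W)
            W = W - H[i, j] * Vi
        H[j + 1, j] = np.sqrt(frob(W, W))
        if H[j + 1, j] < 1e-12:               # breakdown: the text drops
            break                             # (sigma_{j+1}, R_{j+1}) here
        Vs.append(W / H[j + 1, j])
    return np.hstack(Vs), H
```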

In Algorithm 1, we assume that the interpolation points \(\sigma = \{\sigma _i\}_{i=1}^{m+1}\) and tangential directions \(\{R_i\}_{i=1}^{m+1}\) are given. At each iteration \(j=1,\ldots ,m\), we use a new interpolation point \(\sigma _{j+1}\) and a new tangential direction \(R_{j+1}\), and we initialize the next tangential block by setting \(\widetilde{V}_{j+1}= (\sigma _{j+1}I_n -A)^{-1}BR_{j+1}\) if \(\sigma _{j+1}\) is finite, and \(\widetilde{V}_{j+1} = ABR_{j+1}\) if \(\sigma _{j+1} = \infty \).

We notice that if at some step j we have \(h_{j+1,j} =0\), then \((\sigma _{j+1}I_n -A)^{-1}BR_{j+1}\) can be written as a linear combination of the computed blocks \(V_i\), \(i=1,\ldots ,j\),

$$\begin{aligned} (\sigma _{j+1}I_n -A)^{-1}BR_{j+1}=\sum _{i=1}^jh_{i,j}V_i, \end{aligned}$$

and in this case, we drop the interpolation point \(\sigma _{j+1}\) and the tangent direction \(R_{j+1}\), replace them by the new ones \(\sigma _{j+2}\) and \(R_{j+2}\), respectively, and proceed to the next step.

Algorithm 1 also constructs an upper Hessenberg matrix \(\widetilde{H}_m\) whose entries are the \(h_{i,j}\)'s. The upper Hessenberg matrix \(H_m\) is the \(m\times m\) matrix obtained from \({\widetilde{H}}_m\) by deleting its last row:

$$\begin{aligned} \widetilde{H}_m = \left( \begin{array}{c} H_m \\ h_{m+1,m}e_m^T \end{array}\right) . \end{aligned}$$

Let \(\widetilde{K}_m\) be the matrix defined as follows

$$\begin{aligned} \widetilde{K}_m = \left( \begin{array}{c} H_mD_m \\ \sigma _{m+1}h_{m+1,m}e_m^T \end{array}\right) , \end{aligned}$$

where \(D_m=diag(\sigma _1,\ldots ,\sigma _{m})\) is the diagonal matrix containing the interpolation points. The following theorem gives some useful relations that will be used later.

Theorem 1

Let \(\mathscr {V}_{m+1}\) be the F-orthonormal matrix of \(\mathbb {R}^{n\times (m+1)s}\) constructed by Algorithm 1. Then, for \(A\in \mathbb {R}^{n\times n}\) and \(B\in \mathbb {R}^{n\times s}\), we have the following relations

$$\begin{aligned} A\mathscr {V}_{m+1}(\widetilde{H}_m \otimes I_s) = \mathscr {V}_{m+1}(\widetilde{K}_m \otimes I_s)-B\widetilde{\mathscr {R}}_{m+1}, \end{aligned}$$
(11)

and

$$\begin{aligned} T_m= \mathscr {V}_m^T\diamond A\mathscr {V}_m=\left[ H_mD_m-\mathscr {V}_m^T\diamond B\widetilde{\mathscr {R}}_{m+1}-h_{m+1,m}(\mathscr {V}_m^T\diamond AV_{m+1})e^T_m\right] H^{-1}_m, \end{aligned}$$
(12)

where

$$\begin{aligned} \widetilde{\mathscr {R}}_{m+1}=\mathscr {R}_{m+1}\left( \left( \begin{array}{c} I_m \\ e_m^T \end{array}\right) \otimes I_s\right) , \end{aligned}$$
(13)

and \(\mathscr {R}_{m+1}=[R_1,\ldots ,R_{m+1}]\).

Proof

From Algorithm 1, we have

$$\begin{aligned} h_{j+1,j}V_{j+1}=(\sigma _{j+1}I_n-A)^{-1}BR_{j+1}-\displaystyle {\sum _{i=1}^j h_{i,j}V_{i}} \qquad j=1,\ldots ,m. \end{aligned}$$
(14)

Multiplying (14) on the left by \((\sigma _{j+1}I_n-A)\) and re-arranging terms, we get

$$\begin{aligned} A\, \displaystyle {\sum _{i=1}^{j+1} h_{i,j}V_{i}}=\sigma _{j+1}\displaystyle {\sum _{i=1}^{j+1} h_{i,j}V_{i}}-BR_{j+1} \qquad j=1,\ldots ,m, \end{aligned}$$
(15)

which gives the following relation

$$\begin{aligned} A\mathscr {V}_{m+1}(\widetilde{H}_m\otimes I_s)=\mathscr {V}_m(H_mD_m \otimes I_s)+\sigma _{m+1}h_{m+1,m}V_{m+1}(e_m^T\otimes I_s)-B\widetilde{\mathscr {R}}_{m+1}. \end{aligned}$$

Hence

$$\begin{aligned} A\mathscr {V}_{m+1}(\widetilde{H}_m\otimes I_s)= \mathscr {V}_{m+1}(\widetilde{K}_m \otimes I_s)-B\widetilde{\mathscr {R}}_{m+1}. \end{aligned}$$

For the proof of the expression (12), we have

$$\begin{aligned} A\mathscr {V}_{m+1}(\widetilde{H}_m\otimes I_s)=\mathscr {V}_m(H_mD_m \otimes I_s)+\sigma _{m+1}h_{m+1,m}V_{m+1}(e_m^T\otimes I_s) - BR_{m+1}(e_m^T\otimes I_s)-B\mathscr {R}_{m}, \end{aligned}$$

which gives

$$\begin{aligned} A\mathscr {V}_{m}(H_m\otimes I_s)&= \mathscr {V}_m(H_mD_m \otimes I_s)+\sigma _{m+1}h_{m+1,m}V_{m+1}(e_m^T\otimes I_s)-B\mathscr {R}_{m}\\&\quad - BR_{m+1}(e_m^T\otimes I_s)-h_{m+1,m}AV_{m+1}(e_m^T\otimes I_s). \end{aligned}$$

This implies that

$$\begin{aligned} A\mathscr {V}_{m}&= \left[ \mathscr {V}_m(H_mD_m \otimes I_s)+\sigma _{m+1}h_{m+1,m}V_{m+1}(e_m^T\otimes I_s)-B\widetilde{\mathscr {R}}_{m+1}\right. \\&\quad \left. -\,h_{m+1,m}AV_{m+1}(e_m^T\otimes I_s)\right] (H_m\otimes I_s)^{-1}. \end{aligned}$$

Multiplying on the left by \(\mathscr {V}_{m}^T\) with the diamond product, we get

$$\begin{aligned} \mathscr {V}_{m}^T\diamond A\mathscr {V}_{m}&= \left[ (\mathscr {V}_{m}^T\diamond \mathscr {V}_m)H_mD_m +\sigma _{m+1}h_{m+1,m}(\mathscr {V}_{m}^T\diamond V_{m+1})e_m^T\right. \\&\quad \left. -\, \mathscr {V}_{m}^T\diamond B\widetilde{\mathscr {R}}_{m+1}-h_{m+1,m}(\mathscr {V}_{m}^T\diamond AV_{m+1})e_m^T\right] H_m^{-1}. \end{aligned}$$

Finally, since \(\mathscr {V}_{m}^T\diamond \mathscr {V}_m=I_m\) and \(\mathscr {V}_{m}^T\diamond V_{m+1}=0\), we obtain

$$\begin{aligned} \mathscr {V}_{m}^T\diamond A\mathscr {V}_{m}= \left[ H_mD_m- \mathscr {V}_{m}^T\diamond B\widetilde{\mathscr {R}}_{m+1}-h_{m+1,m}(\mathscr {V}_{m}^T\diamond AV_{m+1})e_m^T \right] H_m^{-1}. \end{aligned}$$

\(\square \)

The following result gives a recursion for computing \(T_{m+1}=\mathscr {V}_{m+1}^T\diamond A\mathscr {V}_{m+1}\) without requiring additional matrix-matrix products with A.

Proposition 2

Let \(T_{m+1}=\mathscr {V}_{m+1}^T\diamond A\mathscr {V}_{m+1}=\left[ t_{:,1},\ldots ,t_{:,m+1} \right] \) and \(\widetilde{H}_{m}=\left[ h_{:,1},\ldots ,h_{:,m} \right] \), where \(t_{:,i}\) and \(h_{:,i}\in \mathbb {R}^{m+1}\) denote the i-th columns of the \((m+1)\times (m+1)\) matrix \(T_{m+1}\) and of the \((m+1)\times m\) upper Hessenberg matrix \(\widetilde{H}_{m}\) obtained from Algorithm 1, respectively. Then, for \(j=1,\ldots ,m\),

$$\begin{aligned} t_{:,j+1}= \displaystyle {\frac{1}{h_{j+1,j}}}\left[ \sigma _{j+1}\left( e_1^{(m+1)},\ldots ,e_{j+1}^{(m+1)} \right) h_{1:j+1,j}-t_{:,1:j}h_{1:j,j}-\mathscr {V}_{m+1}^T\diamond (BR_{j+1}) \right] . \end{aligned}$$

Proof

Using (15) and re-arranging terms, we get

$$\begin{aligned} h_{j+1,j}AV_{j+1}=\sigma _{j+1}\displaystyle {\sum _{i=1}^{j+1} h_{i,j}V_{i}}-A\, \displaystyle {\sum _{i=1}^{j} h_{i,j}V_{i}}-BR_{j+1} \qquad j=1,\ldots ,m, \end{aligned}$$

which gives the following relation,

$$\begin{aligned} AV_{j+1} = \displaystyle {\frac{1}{h_{j+1,j}}}\left[ \sigma _{j+1}[V_1,\ldots ,V_{j+1}](h_{1:j+1,j}\otimes I_s) - A[V_1,\ldots ,V_{j}](h_{1:j,j}\otimes I_s)-BR_{j+1}\right] . \end{aligned}$$

Multiplying on the left by \(\mathscr {V}_{m+1}^T\) and using the properties of the \(\diamond \)-product, we obtain

$$\begin{aligned} \mathscr {V}_{m+1}^T\diamond AV_{j+1}= \displaystyle {\frac{1}{h_{j+1,j}}}\left[ \sigma _{j+1}\left( \mathscr {V}_{m+1}^T\diamond \mathscr {V}_{j+1} \right) h_{1:j+1,j}-t_{:,1:j}h_{1:j,j}-\mathscr {V}_{m+1}^T\diamond (BR_{j+1}) \right] , \end{aligned}$$

hence,

$$\begin{aligned} t_{:,j+1}= \displaystyle {\frac{1}{h_{j+1,j}}}\left[ \sigma _{j+1}\left( e_1^{(m+1)},\ldots ,e_{j+1}^{(m+1)} \right) h_{1:j+1,j}-t_{:,1:j}h_{1:j,j}-\mathscr {V}_{m+1}^T\diamond (BR_{j+1}) \right] . \end{aligned}$$

\(\square \)

4.2 The Adaptive Global Arnoldi Method

The performance of the tangential approach depends on the choice of the poles and of the tangent directions. In [26], the iterative tangential interpolation algorithm (ITIA) was proposed, in the context of \(\mathscr {H}_2\)-optimal model order reduction, where a specific strategy is used to choose the interpolation points and the tangential directions. Starting from an initial set of interpolation points and directions, a reduced-order system is determined, and a new set of interpolation points is chosen. This set contains the mirror images of the eigenvalues of \(A_m\), i.e., \(\sigma _i=-\lambda _i(A_m)\), \(i = 1,\ldots ,m\). The tangential directions are defined as

$$\begin{aligned} l_i=C_md_i, \quad r_i=B_m^Tg_i, \end{aligned}$$
(16)

where \(d_i\) and \(g_i\), \(i = 1,\ldots ,m\), are the right and left eigenvectors, respectively, of the reduced model \(\varSigma _m\), i.e.,

$$\begin{aligned} A_md_i=\lambda _id_i, \quad g_i^TA_m=\lambda _ig^T_i. \end{aligned}$$
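A sketch of this selection step (ours; eigenvector scaling and conjugation conventions left aside) using SciPy's two-sided eigensolver:

```python
# A minimal sketch (ours) of the ITIA parameter selection: mirrored
# eigenvalues of A_m as the new interpolation points, and the
# directions (16) from the right/left eigenvectors of A_m.
import numpy as np
import scipy.linalg as sla

def itia_parameters(Am, Bm, Cm):
    lam, G, D = sla.eig(Am, left=True, right=True)
    sigmas = -lam              # mirror images of the eigenvalues of A_m
    ls = Cm @ D                # columns l_i = C_m d_i
    rs = Bm.T @ G              # columns r_i = B_m^T g_i (up to conjugation)
    return sigmas, ls, rs
```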

The main disadvantage of this method is that it requires the construction of many Krylov subspaces that are not used in the final model (only the last subspace is used). In contrast to the ITIA method, the authors of [9] proposed an adaptive method for choosing the interpolation points and the tangential directions: at each iteration, new interpolation points and tangent directions are computed by minimizing the norm of a certain approximation of the error. Here, we will use a similar approach for our proposed global tangential-based method.

Next, let us see how to approximate the transfer function \(H(\omega )=C(\omega I_n-A)^{-1}B\) when using global-type methods.

We write \(H(\omega ) = CX \), where \(X \in \mathbb {R}^{n\times s}\) is the solution of the linear system

$$\begin{aligned} (\omega I_n-A)X=B. \end{aligned}$$
(17)

Assuming that \(\omega I_n-A\) is nonsingular, an approximate solution \(X_m \in Span\{V_1,\ldots ,V_m\}\) of (17) can be determined by imposing the Petrov-Galerkin condition

$$\begin{aligned} R_B(\omega )\ \bot _F\ Span\{V_1,\ldots ,V_m\} \end{aligned}$$

where the residual \(R_B(\omega ) \) is given by

$$\begin{aligned} R_B(\omega ) = B-(\omega I_n -A)X_m. \end{aligned}$$

Therefore,

$$\begin{aligned} X_m = \mathscr {V}_m\left[ (\omega I_m -T_m)^{-1}(\mathscr {V}_m^T \diamond B)\otimes I_s\right] . \end{aligned}$$

Using the properties of the Kronecker product, we get

$$\begin{aligned} X_m=\mathscr {V}_m(\omega I_{ms}-(T_m \otimes I_s))^{-1}((\mathscr {V}_m^T \diamond B)\otimes I_s), \end{aligned}$$

and then the approximated transfer function \(H_m(\omega ) \) is defined by

$$\begin{aligned} H_m(\omega ) = C\, X_m, \end{aligned}$$

which gives

$$\begin{aligned} H_m(\omega ) = C\mathscr {V}_m(\omega I_{ms}-(T_m \otimes I_s))^{-1}((\mathscr {V}_m^T \diamond B)\otimes I_s) = C_m(\omega I_{ms}-A_m)^{-1}B_m, \end{aligned}$$
(18)

where

$$\begin{aligned} \begin{array}{ll} A_m = (\mathscr {V}_m^T \diamond A\mathscr {V}_m)\otimes I_s\in \mathbb {R}^{ms\times ms}, &\quad T_m=\mathscr {V}_m^T \diamond A\mathscr {V}_m \in \mathbb {R}^{m\times m},\\ B_m = (\mathscr {V}_m^T \diamond B)\otimes I_s\in \mathbb {R}^{ms\times s}, &\quad C_m = C\mathscr {V}_m \in \mathbb {R}^{s \times ms }. \end{array} \end{aligned}$$

The residual can be written as

$$\begin{aligned} R_B(\omega )=B-(\omega I_n -A)\mathscr {V}_m(\omega I_{ms}-A_m)^{-1}B_m. \end{aligned}$$
(19)
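To make these quantities concrete, the following sketch (ours; dense, unoptimized, with hypothetical names) assembles \(A_m\) and \(B_m\) from \(T_m\) and evaluates \(\Vert R_B(\omega )\Vert _2\) over a frequency sample via (19):

```python
# A minimal sketch (ours): the residual norms ||R_B(w)||_2 of (19) over a
# set of frequencies, given the F-orthonormal basis V (n x ms),
# T_m = V^T diamond A V (m x m) and the block width s.
import numpy as np

def residual_norms(A, B, V, Tm, s, omegas):
    m = Tm.shape[0]
    # B_m = (V^T diamond B) kron I_s, with (V^T diamond B)_i = <V_i, B>_F
    dB = np.array([np.sum(V[:, i*s:(i+1)*s] * B) for i in range(m)])
    Bm = np.kron(dB.reshape(m, 1), np.eye(s))       # ms x s
    Am = np.kron(Tm, np.eye(s))                     # A_m = T_m kron I_s
    AV = A @ V                                      # formed once
    norms = []
    for w in omegas:
        Ym = np.linalg.solve(w * np.eye(m * s) - Am, Bm)
        Rb = B - (w * V - AV) @ Ym                  # R_B(w) = B - (wI - A) V Y_m
        norms.append(np.linalg.norm(Rb, 2))
    return norms
```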

In the following, we describe an adaptive approach for choosing interpolation points and tangent directions.

4.3 An Adaptive Strategy for Choosing Interpolation Points and Tangential Directions

Using the expression of the transfer function (18), where \(\mathscr {V}_m =[V_1,\ldots ,V_m]\) has F-orthonormal block columns, we want to extend the subspace \(Span\{V_1,\ldots ,V_m\}\) by

$$\begin{aligned} V_{m+1} = (\sigma _{m+1}I_n-A)^{-1}BR_{m+1}, \end{aligned}$$

with a tangent direction \(R_{m+1}\in \mathbb {R}^{s\times s}\), \(\Vert R_{m+1}\Vert _2=1\), and a new interpolation point \(\sigma _{m+1}\in S_m\). Here \(S_m \subset \mathbb {R}^{+}\) is a set defined as in [7, 8], i.e., the convex hull of \(\{-\lambda _1,\ldots ,-\lambda _m\}\), where \(\{\lambda _i\}_{i=1}^m\) are the eigenvalues of \(A_m\).

The interpolation point \(\sigma _{m+1}\) and tangent direction \(R_{m+1}\) are computed as follows

$$\begin{aligned} (R_{m+1},\sigma _{m+1}) = \arg \max _{\omega \in S_m,\; R\in \mathbb {R}^{s\times s},\, \Vert R\Vert _2=1} \Vert R_B(\omega ) R\Vert _2, \end{aligned}$$
(20)

where \(R_B(\omega )\) is the residual given by (19). This choice of poles and directions is motivated by the following proposition.

Proposition 3

Let \(\mathscr {V}_m\in \mathbb {R}^{n\times ms}\) be the F-orthonormal matrix generated by the global tangential Arnoldi algorithm (GTAA), \(A\in \mathbb {R}^{n\times n}\), \(B\in \mathbb {R}^{n\times s}\) and \(C\in \mathbb {R}^{s\times n}\). The following relation holds

$$\begin{aligned} \Vert H(\omega )-H_m(\omega )\Vert _2 \le \Vert C(\omega I_n-A)^{-1}\Vert _2 \Vert R_B(\omega )\Vert _2. \end{aligned}$$
(21)

Proof

Using the expression of transfer functions \(H(\omega )\) and \(H_m(\omega )\), we have

$$\begin{aligned} \Vert H(\omega )-H_m(\omega )\Vert _2 &= \Vert C(\omega I_n-A)^{-1}B-C_m(\omega I_{ms}-A_m)^{-1}B_m\Vert _2 \\ &= \Vert C(\omega I_n-A)^{-1}(B-(\omega I_n-A)\mathscr {V}_m(\omega I_{ms}-A_m)^{-1}B_m)\Vert _2 \\ &= \Vert C(\omega I_n-A)^{-1}R_B(\omega )\Vert _2 \\ &\le \Vert C(\omega I_n-A)^{-1}\Vert _2 \Vert R_B(\omega )\Vert _2. \end{aligned}$$

\(\square \)

The parameter \(\sigma _{m+1}\) and the tangential direction block \(R_{m+1}\) are computed as follows. We first compute \(\sigma _{m+1}\) by maximizing the norm of the residual over the convex hull \(S_m\),

$$\begin{aligned} \sigma _{m+1} = \arg \max _{\omega \in S_m}\Vert R_B(\omega )\Vert _2. \end{aligned}$$
(22)

For small to medium problems, this is done by computing the norm of \(R_B(\omega )\) for each \(\omega \) in a sample of \(S_m\). The tangent direction \(R_{m+1}\) is then computed by solving the problem

$$\begin{aligned} R_{m+1} = \arg \max _{R\in \mathbb {R}^{s\times s},\,\Vert R\Vert _2=1}\Vert R_B(\sigma _{m+1})R\Vert _2. \end{aligned}$$
(23)

The tangential matrix direction \(R_{m+1}=[r_{m+1}^{(1)},\ldots ,r_{m+1}^{(s)}]\) can be determined by taking \(r_{m+1}^{(i)}\) to be the right singular vectors corresponding to the s largest singular values of \(R_B(\sigma _{m+1})\).
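A sketch of this step (ours, with a hypothetical function name): since \(R_B(\sigma _{m+1})\) has only s columns, its thin SVD already delivers the s right singular vectors, ordered by decreasing singular value.

```python
# A minimal sketch (ours) of solving (23): R_{m+1} collects the right
# singular vectors of the n x s residual R_B(sigma_{m+1}).
import numpy as np

def next_direction(Rb):
    _, _, Vh = np.linalg.svd(Rb, full_matrices=False)
    R = Vh.conj().T    # columns r^(1), ..., r^(s), by decreasing singular
    return R           # value; R is orthogonal, hence ||R_{m+1}||_2 = 1
```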

Notice that, since the adaptive tangential interpolation approximation becomes exact when

$$\begin{aligned} \displaystyle {\max _{\omega \in S_m }}\Vert R_B(\omega )\Vert _2= 0, \end{aligned}$$

it is reasonable to stop the process when \(\Vert R_B(\sigma _{m+1})\Vert _2\) is small enough.

Proposition 4

Let \(A_m = (\mathscr {V}_m^T \diamond A\mathscr {V}_m)\otimes I_s\). Then the residual \( R_B(\omega ) \) can be expressed as

$$\begin{aligned} R_B(\omega ) =(I-\mathscr {P}_m)(B) + (A\mathscr {V}_m-\mathscr {V}_mA_m)Y_m(\omega ), \end{aligned}$$
(24)

where \(Y_m(\omega ) = (\omega I_{ms}-A_m)^{-1}B_m\), \(\mathscr {V}_m=\left[ V_1,\ldots ,V_m\right] \) and \(\mathscr {P}_m(B)=\mathscr {V}_m\left[ (\mathscr {V}_m^T \diamond B)\otimes I_s\right] \).

Proof

We have

$$\begin{aligned} R_B(\omega ) &= B+A\mathscr {V}_mY_m(\omega )-\omega \mathscr {V}_mY_m(\omega ) \\ &= B+A\mathscr {V}_mY_m(\omega )-\mathscr {V}_m(\omega I_{ms}-A_m)(\omega I_{ms}-A_m)^{-1}B_m -\mathscr {V}_mA_m(\omega I_{ms}-A_m)^{-1}B_m \\ &= B+A\mathscr {V}_mY_m(\omega )-\mathscr {V}_mB_m -\mathscr {V}_mA_mY_m(\omega ) \\ &= B-\mathscr {P}_m(B)+(A\mathscr {V}_m -\mathscr {V}_mA_m)Y_m(\omega ) \\ &= (I-\mathscr {P}_m)(B)+(A\mathscr {V}_m -\mathscr {V}_mA_m)Y_m(\omega ). \end{aligned}$$

\(\square \)

Next we give a result obtained by using the global tangential Arnoldi algorithm (GTAA).

Proposition 5

Let \(\mathscr {R}_m = [R_1,\ldots ,R_m]\) and let \(\mathscr {V}_m\) and \(H_m\) be the F-orthonormal and the upper Hessenberg matrices obtained at step m of the global tangential Arnoldi algorithm. Then

$$\begin{aligned} A\mathscr {V}_m-\mathscr {V}_mA_m = -(I-\mathscr {P}_m)(B\mathscr {R}_m) \times (H_m^{-1}\otimes I_s), \end{aligned}$$
(25)

and

$$\begin{aligned} R_B(\omega ) =(I-\mathscr {P}_m)(B)- (I-\mathscr {P}_m)(B\mathscr {R}_m) \times (H_m^{-1}\otimes I_s)Y_m(\omega ), \end{aligned}$$
(26)

where \(\mathscr {P}_m(B\mathscr {R}_m)=\mathscr {V}_m\left[ (\mathscr {V}_m^T \diamond B\mathscr {R}_m)\otimes I_s\right] \).

Proof

Let \( K_m = [(\sigma _1I-A)^{-1}BR_1,\ldots ,(\sigma _mI-A)^{-1}BR_m]\). From the global tangential Arnoldi algorithm, one can show that \(K_m= \mathscr {V}_m(H_m\otimes I_s)\). Setting \(D_m =diag(\sigma _1,\ldots ,\sigma _m)\) and using the fact that \(A(\sigma _iI-A)^{-1}BR_i = -BR_i + \sigma _i( \sigma _iI-A)^{-1}BR_i\), it follows that

$$\begin{aligned} A\mathscr {V}_m = AK_m(H_m\otimes I_s)^{-1} = \left[ -B\mathscr {R}_m + K_m(D_m\otimes I_s)\right] (H_m\otimes I_s)^{-1}. \end{aligned}$$
(27)

Therefore, replacing \(A_m= (\mathscr {V}_m^T \diamond A\mathscr {V}_m)\otimes I_s\) and \(K_m= \mathscr {V}_m(H_m\otimes I_s)\) in (27), we get

$$\begin{aligned} A\mathscr {V}_m-\mathscr {V}_mA_m &= \left[ -B\mathscr {R}_m +\mathscr {V}_m(H_m\otimes I_s)(D_m\otimes I_s)\right] (H_m\otimes I_s)^{-1} - \mathscr {V}_m\left[ (\mathscr {V}_m^T \diamond A\mathscr {V}_m)\otimes I_s\right] \\ &= -B\mathscr {R}_m(H_m\otimes I_s)^{-1} +\mathscr {V}_m(H_m\otimes I_s)(D_m\otimes I_s)(H_m\otimes I_s)^{-1} \\ &\quad - \mathscr {V}_m\left[ \left( \mathscr {V}_m^T \diamond \left( (-B\mathscr {R}_m + \mathscr {V}_m(H_m\otimes I_s)(D_m\otimes I_s))(H_m\otimes I_s)^{-1}\right) \right) \otimes I_s\right] . \end{aligned}$$

It follows that

$$\begin{aligned} A\mathscr {V}_m-\mathscr {V}_mA_m &= -B\mathscr {R}_m(H_m\otimes I_s)^{-1} + \mathscr {V}_m(H_m\otimes I_s)(D_m\otimes I_s)(H_m\otimes I_s)^{-1} \\ &\quad +\,\mathscr {V}_m \left[ \left( \mathscr {V}_m^T \diamond (B\mathscr {R}_m (H_m^{-1} \otimes I_s))\right) \otimes I_s \right] \\ &\quad -\, \mathscr {V}_m\left[ \left( \mathscr {V}_m^T \diamond (\mathscr {V}_m(H_m\otimes I_s) (D_m\otimes I_s)(H_m^{-1} \otimes I_s))\right) \otimes I_s\right] . \end{aligned}$$

Using the relation

$$\begin{aligned} \mathscr {P}_m(B\mathscr {R}_m (H_m^{-1}\otimes I_s)) &= \mathscr {V}_m\left[ \left( \mathscr {V}_m^T \diamond (B\mathscr {R}_m(H_m^{-1} \otimes I_s))\right) \otimes I_s\right] \\ &= \mathscr {V}_m\left[ \left( (\mathscr {V}_m^T \diamond B\mathscr {R}_m) H_m^{-1}\right) \otimes I_s \right] \\ &= \mathscr {V}_m\left[ (\mathscr {V}_m^T \diamond B\mathscr {R}_m) \otimes I_s \right] (H_m^{-1} \otimes I_s) \\ &= \mathscr {P}_m(B\mathscr {R}_m) (H_m\otimes I_s)^{-1}, \end{aligned}$$

we obtain

$$\begin{aligned} A\mathscr {V}_m-\mathscr {V}_mA_m &= -(I-\mathscr {P}_m)(B\mathscr {R}_m) (H_m\otimes I_s)^{-1}+ \mathscr {V}_m(H_m\otimes I_s)(D_m\otimes I_s)(H_m\otimes I_s)^{-1} \\ &\quad -\, \mathscr {V}_m\left[ \left( (\mathscr {V}_m^T \diamond \mathscr {V}_m)H_mD_mH_m^{-1}\right) \otimes I_s\right] \\ &= -(I-\mathscr {P}_m)(B\mathscr {R}_m) \times (H_m^{-1}\otimes I_s). \end{aligned}$$

\(\square \)

The expression of \(R_B(\omega ) \) given in (26) allows us to reduce the computational cost when seeking the next pole and direction. Solving \( \displaystyle {\max _{\omega \in S_m}} \Vert R_B(\omega )\Vert \) over a large sample of values of \(\omega \) requires computing the tall matrix \(R_B(\omega )\) for each value of \(\omega \). Proposition 5 shows how this computational cost can be decreased as \(\omega \) varies. Indeed, considering the skinny QR decompositions \((I-\mathscr {P}_m)(B)= Q_1L_1\) and \((I-\mathscr {P}_m)(B\mathscr {R}_m)= Q_2 L_2\), it follows that

$$\begin{aligned} \Vert R_B(\omega ) \Vert _2 \le \ \Vert L_1 \Vert _2 + \Vert L_2 (H_m^{-1}\otimes I_s)Y_m(\omega ) \Vert _2. \end{aligned}$$

We notice that if the tangential subspace includes B as a first block, then \((I-\mathscr {P}_m)(B)=0\) and hence

$$\begin{aligned} R_B(\omega ) =- (I-\mathscr {P}_m)(B\mathscr {R}_m)(H_m^{-1}\otimes I_s)Y_m(\omega ), \end{aligned}$$

which shows that

$$\begin{aligned} \Vert R_B(\omega ) \Vert _2 = \Vert L_2 (H_m^{-1}\otimes I_s)Y_m(\omega ) \Vert _2. \end{aligned}$$

This means that, for each value of \(\omega \), one only has to compute the norm of a small matrix of size \(ms \times s\).
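A sketch of this cheap evaluation (ours; all names are hypothetical, and diamond() is the product of Definition 2): the skinny QR factors are formed once, after which each trial frequency costs only reduced-dimension operations.

```python
# A minimal sketch (ours) of the cheap residual evaluation based on (26):
# precompute L1, L2 once; each omega then needs only small dense algebra.
import numpy as np

def diamond(A, B, p):                        # the product of Definition 2
    m, l = A.shape[1] // p, B.shape[1] // p
    return np.array([[np.sum(A[:, i*p:(i+1)*p] * B[:, j*p:(j+1)*p])
                      for j in range(l)] for i in range(m)])

def residual_factors(B, BRm, V, s):
    # (I - P_m)(X) with P_m(X) = V [(V^T diamond X) kron I_s]
    proj = lambda X: X - V @ np.kron(diamond(V, X, s), np.eye(s))
    _, L1 = np.linalg.qr(proj(B))            # s x s factor
    _, L2 = np.linalg.qr(proj(BRm))          # ms x ms factor
    return L1, L2

def residual_norm_bound(L1, L2, Hm, s, Ym):
    # upper bound ||L1||_2 + ||L2 (H_m^{-1} kron I_s) Y_m(w)||_2 from the text
    G = np.kron(np.linalg.inv(Hm), np.eye(s))
    return np.linalg.norm(L1, 2) + np.linalg.norm(L2 @ G @ Ym, 2)
```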

In the following, we present the adaptive global tangential Arnoldi algorithm (AGTAA). This algorithm computes a low-dimensional dynamical system given by the matrices \(A_{m_{max}} = (\mathscr {V}_{m_{max}}^T \diamond A\mathscr {V}_{m_{max}})\otimes I_s\), \(B_{m_{max}} = (\mathscr {V}_{m_{max}}^T \diamond B)\otimes I_s\) and \(C_{m_{max}} = C\mathscr {V}_{m_{max}}\) for a fixed value of \(m_{max}\), where \(\mathscr {V}_{m_{max}}\) is the F-orthonormal matrix obtained by applying Algorithm 1. The algorithm is summarized as follows:

[Algorithm 2: The adaptive global tangential Arnoldi algorithm (AGTAA)]

At each iteration m, the total number of arithmetic operations is dominated by the computation of the matrix \(\widetilde{V}_{m+1}\) at Step 4 of Algorithm 2, which requires solving the multiple shifted linear system \( (\sigma _{m+1}I_n-A)V=BR_{m+1}\). For medium or large structured problems, this is done by using the LU factorization of \(\sigma _{m+1}I_n-A\). For very large problems, one can also use a Krylov solver such as standard or block GMRES with a suitable preconditioner. We notice that the well-known Iterative Rational Krylov Algorithm (IRKA) needs two LU factorizations at each iteration. In all our numerical tests, we solved these shifted linear systems using the LU factorization. At each iteration m, we also need to compute the block direction \(R_{m+1}\) by solving problem (23), which is done by computing the s leading right singular vectors of the \(n \times s\) matrix \(R_B(\sigma _{m+1})\).
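A sketch of Step 4 under these assumptions (ours; a direct sparse LU, reused for all s right-hand sides):

```python
# A minimal sketch (ours) of Step 4 of Algorithm 2: one sparse LU
# factorization of (sigma I - A) solves the s-column shifted system.
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_shifted_block(A, B, R, sigma):
    """Return V with (sigma I - A) V = B R, for sparse A and dense B, R."""
    n = A.shape[0]
    M = (sigma * sp.identity(n, format="csc") - A).tocsc()
    return spla.splu(M).solve(B @ R)     # n x s block for the next basis
```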

5 Numerical Experiments

In this section, we give some experimental results to show the effectiveness of the proposed approach. All the experiments were performed on a computer with an Intel Core i5 processor at 3.00 GHz and 8 GB of RAM. The algorithms were coded in Matlab 8.0. The following functions from LYAPACK [21] were used:

  • lp_lgfrq: generates a set of logarithmically distributed frequency sampling points.

  • lp_para: used for computing the first two initial shifts.

In our experiments, we used some matrices from LYAPACK and from the Oberwolfach collection. These test matrices are reported in Table 1. We used different values of s.

Table 1 Matrix tests

Example 1

The model of the first experiment describes stage 1R of the International Space Station (ISS). It has 270 states, three inputs and three outputs; for more details on this system, see [16]. Figure 1 shows the singular values versus the frequencies for the transfer function and its approximation. In Fig. 2, we plotted the 2-norm of the errors \(\Vert H(j\omega )-H_m(j\omega )\Vert _2\) versus the frequencies for \(m=15\). As can be seen from these plots, the AGTAA algorithm gives good results with a small value of m.

Fig. 1 The ISS model: singular values versus frequencies

Fig. 2 The ISS model: error-norms versus frequencies

Example 2

In this example, we applied the AGTAA method to three models: FDM, RAIL3113 and RAIL821. The matrix A of the FDM model is obtained from the centered finite difference discretization of the operator

Fig. 3 The FDM model: singular values versus frequencies

Fig. 4 The FDM model: error-norms versus frequencies

Fig. 5 The RAIL821 model: error-norms versus frequencies

Fig. 6 The RAIL3113 model: error-norms versus frequencies

$$\begin{aligned} L_A(u) = \varDelta u-f(x,y)\frac{\partial u}{\partial x}-g(x,y)\frac{\partial u}{\partial y}-h(x,y)u, \end{aligned}$$

on the unit square \([0, 1]\times [0, 1]\) with homogeneous Dirichlet boundary conditions with

$$\begin{aligned} \left\{ \begin{array}{l} f(x,y) = \log (x + 2y+1), \\ g(x,y) = e^{x + y}, \\ h(x,y) = x + y. \\ \end{array}\right. \end{aligned}$$

The matrices B and C were random matrices with entries uniformly distributed in [0, 1]. The number of inner grid points in each direction was \(n_0 = 100\), so the dimension of A is \(n = n_0^2 = 10{,}000\). The plots in Fig. 3 represent the sigma-plot (the singular values of the transfer function) of the original system (dashed line) and that of the reduced-order system (solid line). In Fig. 4, we plotted the error-norm \(\Vert H(j\omega ) - H_m(j\omega )\Vert _2\) versus the frequencies \(\omega \in [10^{-6},\; 10^6]\). For this experiment, we used \(m=20\).

Fig. 7 The \(\hbox {MNA}_1\) model: AGTAA (solid line) and IRKA (dashed-dotted line)

Fig. 8 The CDplayer model: AGTAA (solid line) and IRKA (dashed-dotted line)

Fig. 9 The \(\hbox {MNA}_3\) model: AGTAA (solid line) and IRKA (dashed-dotted line)

Fig. 10 The Flow model: AGTAA (solid line) and IRKA (dashed-dotted line)

In Figs. 5 and 6, we considered the RAIL821 and RAIL3113 models. These models describe a steel rail cooling problem (of order \( n = 821\) and \(n = 3113\), respectively) and come from the Oberwolfach collection. The plots in these figures represent the error-norm \(\Vert H(j\omega ) - H_m(j\omega )\Vert _2\) versus the frequencies.

Example 3

In this example, we compared the AGTAA algorithm with the Iterative Rational Krylov Algorithm (IRKA) [1]. We used four models: CDplayer, MNA\(_1\), MNA\(_3\) and Flow. The original CDplayer model describes the dynamics between a lens actuator and the radial arm position in a portable CD player; the model is relatively hard to reduce. For more details on this system, see [15]. The MNA\(_1\) and MNA\(_3\) models are Modified Nodal Analysis models; they have 578 and 4863 states, with \(s=9\) and \(s=22\), respectively, and were obtained from NICONET [18], while the Flow model is from the Oberwolfach collection.

Figures 7, 8, 9 and 10 show the exact error-norm \(\Vert H(j\omega ) - H_m(j\omega )\Vert _2\) versus the frequencies for AGTAA (solid line) and IRKA (dashed-dotted line) with \(m=20\). In Fig. 8, AGTAA gives good results for small frequencies, with a slight advantage for IRKA at large frequencies. In all the presented plots, the AGTAA algorithm generally compares well with the IRKA algorithm. We also compared the execution times of the two algorithms; the timings obtained are reported in Table 2.

Table 2 The execution times for AGTAA and IRKA

The results in Table 2 show that the cost of the IRKA method is much higher than the cost of the adaptive global tangential Arnoldi method.

6 Conclusion

In this paper, we proposed a new global tangential Arnoldi method for computing reduced-order dynamical systems that approximate the original large-scale multiple-input multiple-output (MIMO) system. The method generates sequences of matrix tangential rational Krylov subspaces by selecting, in an adaptive way, the interpolation shifts (poles) and tangential directions that maximize the residual norms. We gave some new algebraic properties and presented numerical experiments on benchmark examples.