1 Introduction

Consider the following linear time invariant (LTI) dynamical system:

$$\begin{aligned} \mathbf {(LTI)} \left\{ \begin{array}{lll} x^{\prime }(t) = A x(t) +Bu(t) ; \, x(t_0)=x_0\\ y(t) = Cx(t), \end{array} \right. \end{aligned}$$
(1)

where \(x(t) \in \mathbb {R}^n\), \(u(t) \in \mathbb {R}^r\), \( y(t) \in \mathbb {R}^s\), \(A \in \mathbb {R}^{n \times n}\), \(B \in \mathbb {R}^{n \times r}\) and \(C \in \mathbb {R}^{s \times n}\) with \(r,s \ll n\). The vector x is called the state vector and belongs to the state space, u is the input (or control) vector and y(t) is the output (to be measured). If \(s=r=1\), the LTI dynamical system (1) is called single-input single-output (SISO); otherwise it is called multiple-input multiple-output (MIMO). The control problem consists in acting on the input vector u(t) so that the output vector y(t) follows a desirable time trajectory; modifying the input u(t) according to the observed output y(t) or to the state x(t) is called feedback. The LTI dynamical system (1) can also be denoted as

$$\begin{aligned} \mathbf{(LTI)} \equiv \left[ \begin{array}{c|c} A &{} B\\ \hline C &{} 0 \end{array} \right] . \end{aligned}$$
(2)

In many applications, such as circuit simulation or time-dependent PDE control problems, the dimension n of the dynamical system (2) is quite large, while the number of inputs and outputs is small (\(r,s \ll n\)). In such large-scale settings, the system dimension makes computations infeasible owing to memory and time limitations as well as ill-conditioning. The goal is to produce a low-dimensional system that has response characteristics similar to those of the original system, with lower storage requirements and evaluation times.
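For concreteness, a small stable SISO instance of (1) can be assembled and simulated with SciPy's signal toolbox. This is a minimal sketch; the matrices below are arbitrary illustrative data, not taken from any benchmark.

```python
import numpy as np
from scipy import signal

# A small stable system matrix: eigenvalues -1±2i, -3, -0.5 (open left half-plane)
A = np.array([[-1.0,  2.0,  0.0,  0.0],
              [-2.0, -1.0,  0.0,  0.0],
              [ 0.0,  0.0, -3.0,  1.0],
              [ 0.0,  0.0,  0.0, -0.5]])
B = np.ones((4, 1))          # single input  (r = 1)
C = np.ones((1, 4))          # single output (s = 1)

sys = signal.StateSpace(A, B, C, np.zeros((1, 1)))

# Output y(t) for a unit step input u(t) = 1 with x(0) = 0
t = np.linspace(0.0, 20.0, 400)
_, y, x = signal.lsim(sys, U=np.ones_like(t), T=t)

# Steady-state output: F(0) = -C A^{-1} B
y_ss = float(-C @ np.linalg.solve(A, B))
```

Since A is stable, y(t) converges to the DC gain \(-CA^{-1}B\) of the transfer function as \(t \rightarrow \infty\).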

The reduced order dynamical system can be stated as follows:

$$\begin{aligned} \mathbf{(LTI)} _m \left\{ \begin{array}{lll} x_m^{\prime }(t) = A_m x_m(t) +B_mu(t) \\ y_m(t) = C_mx_m(t), \end{array} \right. \end{aligned}$$
(3)

where \(x_m(t) \in \mathbb {R}^m\), \( y_m(t) \in \mathbb {R}^s\), \(A_m \in \mathbb {R}^{m \times m}\), \(B_m \in \mathbb {R}^{m \times r}\) and \(C_m \in \mathbb {R}^{s \times m}\) with \(m \ll n\). The system (3) is also represented as

$$\begin{aligned} \mathbf{(LTI)} _m \equiv \left[ \begin{array}{c|c} A_m &{} B_m\\ \hline C_m &{} 0 \end{array} \right] . \end{aligned}$$
(4)

The reduced order dynamical system (3) should be constructed such that

  • the output \(y_m(t)\) of the reduced system approaches the output y(t) of the original system;

  • some properties of the original system, such as passivity and stability, are preserved (if possible); and

  • the computation methods are robust and efficient.

One of the best-known model reduction techniques is balanced model reduction, first introduced by Mullis and Roberts (1976) and later brought to the systems and control literature by Moore (1981). When applied to stable systems, Lyapunov balanced reduction preserves stability and provides a bound for the approximation error. For small-to-medium scale problems, Lyapunov balancing can be implemented efficiently. However, in large-scale settings, exact balancing is expensive because it requires dense matrix factorizations, with a computational complexity of \(O(n^3)\) and a storage requirement of \(O(n^2)\); see Antoulas (2005), Benner et al. (2008) and Gugercin and Antoulas (2004). For large problems, direct methods cannot be applied, and Krylov-based (El et al. 2002; Jaimoukha and Kasenally 1994; Jbilou 2006, 2010) or ADI-based methods (Benner et al. 2008; Penzl 2012) are required to compute the Gramians; these methods return the Gramians in factored form, which saves memory. Besides the Lyapunov balancing method, other types of balancing exist, such as stochastic balancing, bounded real balancing, positive real balancing, LQG balancing and frequency-weighted balancing, which require the solution of continuous-time algebraic Riccati equations; see Fortuna et al. (1992) and Mullis and Roberts (1976).

2 Lyapunov-balanced truncation

In the sequel, we assume that the initial system is stable, which means that A is a stable matrix (all its eigenvalues lie in the open left half of the complex plane).

2.1 The transfer function

The state-space representation is usually referred to as an internal representation of a dynamical system because it involves the state variables x, which are internal variables of the system. The input/output representation, also called the external representation, is obtained by eliminating the state vector between the state equation and the output equation, with zero initial conditions.

To get the frequency domain description we apply the Laplace transform

$$\begin{aligned} \mathscr {L}(f)(s) = \int _0^{\infty } f(t) e^{-st} \mathrm{d}t \end{aligned}$$

to the state equation (1), and we get

$$\begin{aligned} \left\{ \begin{array}{lll} sX(s) = AX(s)+BU(s)\\ Y(s)= CX(s), \end{array} \right. \end{aligned}$$

where \(X(s)=\mathscr {L}(x(t))\) and \(U(s)=\mathscr {L}(u(t))\). Therefore,

$$\begin{aligned} X(s) = (sI-A)^{-1}BU(s), \end{aligned}$$

and by substituting X(s) in the output equation of (1), we get

$$\begin{aligned} Y(s)=F(s)\,U(s), \end{aligned}$$
(5)

with

$$\begin{aligned} F(s)= C(sI-A)^{-1}B. \end{aligned}$$
(6)

The rational matrix-valued function F(s) is called the transfer function of the dynamical system (1). Its entries are rational functions with real coefficients. The transfer function F(.) is stable if all its poles lie in the open left half-plane \(\mathbb {C}^-\).

We recall that two LTI systems \( \left[ \begin{array}{c|c} A &{} B\\ \hline C &{} 0 \end{array} \right] \) and \(\left[ \begin{array}{c|c} {\widetilde{A}} &{} {\widetilde{B}}\\ \hline {\widetilde{C}} &{} 0 \end{array} \right] \) are called equivalent if they have the same transfer function. It is easy to verify that for any nonsingular \(n \times n\) matrix T, the LTI system

$$\begin{aligned} \left[ \begin{array}{c|c} T^{-1}AT &{} T^{-1} B\\ \hline C T &{} 0 \end{array} \right] \end{aligned}$$

is equivalent to the LTI system \(\left[ \begin{array}{c|c} A &{} B\\ \hline C &{} 0 \end{array} \right] .\) Therefore, if the main concern is the output under some specific inputs, we have many choices of the state-space description. The choice of the matrix T is very important and the states are connected by the relation \(x(t) = T {\widetilde{x}}(t)\).

2.2 Controllability and observability Gramians

We assume that the LTI dynamical system is stable.

Definition 1

The controllability Gramian associated with the LTI system (1) is defined as

$$\begin{aligned} P = \int _{0}^{\infty } e^{tA} BB^\mathrm{T} e^{tA^\mathrm{T}}\mathrm{d}t, \end{aligned}$$
(7)

and the observability Gramian is defined by

$$\begin{aligned} Q = \int _{0}^{\infty } e^{tA^\mathrm{T}} C^TC e^{tA}\mathrm{d}t. \end{aligned}$$
(8)

By using the Parseval relation, we obtain the following expressions of the Gramians:

$$\begin{aligned} P= & {} \frac{1}{2 \pi } \int _{-\infty }^{+\infty } (j\omega I-A)^{-1}BB^\mathrm{T} (-j\omega I-A^\mathrm{T})^{-1}\mathrm{d} \omega ,\end{aligned}$$
(9)
$$\begin{aligned} Q= & {} \frac{1}{2 \pi } \int _{-\infty }^{+\infty } (-j\omega I-A^\mathrm{T})^{-1}C^\mathrm{T}C (j\omega I-A)^{-1}\mathrm{d} \omega . \end{aligned}$$
(10)

The two Gramians are the unique solutions of the following coupled Lyapunov matrix equations:

$$\begin{aligned} AP+PA^\mathrm{T}+BB^\mathrm{T} = 0, \end{aligned}$$
(11)

and

$$\begin{aligned} A^\mathrm{T} Q+QA+C^TC = 0. \end{aligned}$$
(12)

We will see later that the product PQ plays an important role in model reduction.
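For moderate n, the Gramians can be computed directly from (11) and (12) with SciPy's Bartels–Stewart based Lyapunov solver. The following sketch uses arbitrary stable illustrative data:

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
n, r, s = 6, 2, 2
A = rng.standard_normal((n, n)) - 3.0 * n * np.eye(n)   # strongly shifted, hence stable
B = rng.standard_normal((n, r))
C = rng.standard_normal((s, n))

# solve_continuous_lyapunov(a, q) solves a x + x a^H = q, so the Lyapunov
# equations (11) and (12) correspond to q = -B B^T and q = -C^T C.
P = linalg.solve_continuous_lyapunov(A, -B @ B.T)
Q = linalg.solve_continuous_lyapunov(A.T, -C.T @ C)

res_P = np.linalg.norm(A @ P + P @ A.T + B @ B.T)
res_Q = np.linalg.norm(A.T @ Q + Q @ A + C.T @ C)

# Hankel singular values: square roots of the eigenvalues of the product P Q
hsv = np.sort(np.sqrt(np.abs(np.linalg.eigvals(P @ Q).real)))[::-1]
```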

Consider the new equivalent LTI dynamical system

$$\begin{aligned} {\widetilde{\mathbf{(LTI)}} } \, \equiv \, \left[ \begin{array}{c|c} T^{-1}AT &{} T^{-1} B\\ \hline C T &{} 0 \end{array} \right] \end{aligned}$$

where T is a nonsingular matrix. Then the associated controllability and observability Gramians \(\widetilde{P}\), and \(\widetilde{Q}\) are expressed as

$$\begin{aligned} \widetilde{P}= & {} \int _{0}^{\infty } e^{t {\widetilde{A}}} {\widetilde{B}}{\widetilde{B}} ^\mathrm{T} e^{t{\widetilde{A}}^T}\mathrm{d}t,\\ \widetilde{Q}= & {} \int _{0}^{\infty } e^{t {\widetilde{A}}^T} {\widetilde{C}}^T{\widetilde{C}} e^{t{\widetilde{A}}}\mathrm{d}t, \end{aligned}$$

where \({\widetilde{A}} = T^{-1}AT \), \({\widetilde{B}}=T^{-1} B\) and \({\widetilde{C}}= C T\). Hence, we obtain

$$\begin{aligned} \widetilde{P} =T^{-1} P T^\mathrm{-T},\quad \mathrm{and}\quad \widetilde{Q} =T^\mathrm{T} Q T. \end{aligned}$$
(13)

These relations show that the Gramians of two equivalent LTI systems are not related by a similarity transformation. However, similarity is preserved for the product of the controllability and observability Gramians, and we have

$$\begin{aligned} {\widetilde{P}} {\widetilde{Q}}=T^{-1} P QT. \end{aligned}$$

2.3 The \(\mathscr {H}_{\infty }\)-norm

In this subsection, we recall the well-known \(\mathscr {H}_{\infty }\)-norm of a transfer function.

Definition 2

The \(\mathscr {H}_{\infty }\) norm of the transfer function F(.) is defined as

$$\begin{aligned} \Vert F(.) \Vert _{{\mathscr {H}}_{\infty }} = \sup _{\omega \in \mathbb {R}} \sigma _\mathrm{max} (F(j\omega )), \end{aligned}$$
(14)

where \(\sigma _\mathrm{max}\) denotes the largest singular value.

To approximate the \(\mathscr {H}_{\infty }\)-norm, we choose a set of frequencies \(\Omega _N=\{\omega _1,\omega _2,\ldots ,\omega _N\}\) and search for

$$\begin{aligned} \sup _{1 \le k \le N} \sigma _\mathrm{max} (F(j\omega _k)) \approx \Vert F(.) \Vert _{{\mathscr {H}}_{\infty }}. \end{aligned}$$
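This grid search can be sketched directly from (6). A minimal example; note that the grid maximum is only a lower bound on the true \(\mathscr {H}_{\infty }\) norm:

```python
import numpy as np

def hinf_on_grid(A, B, C, omegas):
    """Approximate ||F||_Hinf by max_k sigma_max(F(j*omega_k)),
    with F(s) = C (sI - A)^{-1} B as in (6)."""
    n = A.shape[0]
    best = 0.0
    for w in omegas:
        F = C @ np.linalg.solve(1j * w * np.eye(n) - A, B)
        best = max(best, np.linalg.svd(F, compute_uv=False)[0])
    return best

# Example: F(s) = 1/(s + 1); the exact Hinf norm is F(0) = 1, attained at omega = 0.
A = np.array([[-1.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])
approx = hinf_on_grid(A, B, C, np.linspace(-10.0, 10.0, 2001))
```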

2.4 Lyapunov balanced truncation

A well-known model reduction scheme is Lyapunov balanced truncation, first introduced by Mullis and Roberts (1976) and later developed in the systems and control literature by Moore and Glover; see Glover (1984) and Moore (1981). We assume here that the LTI system is stable, controllable and observable (in this case, it is also called stable and minimal). Then the controllability and observability Gramians are unique and positive definite. The idea of balanced truncation is to transform the original LTI system into an equivalent one in which the states that are difficult to reach are also difficult to observe. This amounts to finding a nonsingular matrix T such that the new Gramians \({\widetilde{P}}\) and \({\widetilde{Q}}\) given by (13) satisfy

$$\begin{aligned} {\widetilde{P}} = {\widetilde{Q}}=\mathrm{diag}(\sigma _1,\ldots ,\sigma _n), \end{aligned}$$

where \(\sigma _i\) is the i-th Hankel singular value of the LTI system, i.e.

$$\begin{aligned} \sigma _i = \sqrt{\lambda _i(PQ)}. \end{aligned}$$

Let us see how to obtain the matrix T. Consider the Cholesky decompositions of the Gramians P and Q:

$$\begin{aligned} P=L_c{L_c}^T,\quad Q=L_o{L_o}^T, \end{aligned}$$
(15)

and consider also the singular value decomposition of \({L_c}^\mathrm{T} L_o\) as

$$\begin{aligned} {L_c}^\mathrm{T} L_o = Z \Sigma Y^\mathrm{T}, \end{aligned}$$
(16)

where Z and Y are unitary \(n \times n\) matrices and \(\Sigma \) is a diagonal matrix containing the singular values. Let T be the matrix defined by

$$\begin{aligned} T = L_cZ\Sigma ^{-1/2}; \end{aligned}$$
(17)

then it can be easily verified that

$$\begin{aligned} {\widetilde{P}} = {\widetilde{Q}} = \Sigma , \end{aligned}$$

where \(\Sigma \) is the diagonal matrix whose elements are the Hankel singular values \(\sqrt{\lambda _i(PQ)}\), since PQ is similar to \({\widetilde{P}} {\widetilde{Q}}\). There are other possible ways to construct the matrix T. It was remarked by Glover (1984) that the balanced transformation is not unique, but is unique up to a nonsingular transformation.
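The construction (15)–(17) can be checked numerically. A sketch with arbitrary illustrative data, using \(T = L_c Z \Sigma ^{-1/2}\) (whose inverse is \(T^{-1} = \Sigma ^{-1/2} Y^\mathrm{T} L_o^\mathrm{T}\)), so that (13) indeed yields \({\widetilde{P}} = {\widetilde{Q}} = \Sigma \):

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(1)
n = 4
A = -np.diag([1.0, 2.0, 3.0, 4.0])          # stable, for illustration
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

P = linalg.solve_continuous_lyapunov(A, -B @ B.T)    # (11)
Q = linalg.solve_continuous_lyapunov(A.T, -C.T @ C)  # (12)

Lc = np.linalg.cholesky(P)                  # (15): P = Lc Lc^T
Lo = np.linalg.cholesky(Q)                  # (15): Q = Lo Lo^T
Z, sig, Yt = np.linalg.svd(Lc.T @ Lo)       # (16): Lc^T Lo = Z Sigma Y^T

T = Lc @ Z @ np.diag(sig ** -0.5)           # balancing transformation
Tinv = np.diag(sig ** -0.5) @ Yt @ Lo.T     # T^{-1} = Sigma^{-1/2} Y^T Lo^T

P_bal = Tinv @ P @ Tinv.T                   # (13)
Q_bal = T.T @ Q @ T
```

Both balanced Gramians come out equal to the diagonal matrix of Hankel singular values.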

Since, after balancing, the states that are difficult to reach are simultaneously difficult to observe, a reduced model is obtained by truncating these states, i.e., those corresponding to small Hankel singular values \(\sigma _i\). We have the following theorem:

Theorem 1

Antoulas (2005) Assume that the LTI dynamical system (1) is stable and minimal, with the following balanced realization:

$$\begin{aligned} {\widetilde{\mathbf{(LTI)}} } \, \equiv \, \left[ \begin{array}{cc|c} A_{11} &{} A_{12} &{} B_1\\ A_{21} &{} A_{22} &{} B_2\\ \hline C_1 &{} C_2 &{} 0 \end{array} \right] , \end{aligned}$$

with \(P=Q=\mathrm{diag}(\Sigma _m,{\widetilde{\Sigma }}_m)\), \(\Sigma _m= \mathrm{diag}(\sigma _1,\ldots ,\sigma _m)\) and \({\widetilde{\Sigma }}_m = \mathrm{diag}(\sigma _{m+1},\ldots ,\sigma _n)\).

Then, the reduced order model represented by

$$\begin{aligned} {\widetilde{\mathbf{(LTI)}} }_m \, \equiv \, \left[ \begin{array}{c|c} A_{11} &{} B_1 \\ \hline C_1 &{} 0 \end{array} \right] \end{aligned}$$

is stable and we have

$$\begin{aligned} \Vert F(.)-F_m(.) \Vert _{\mathscr {H}_{\infty }} \le 2(\sigma _{m+1}+\cdots + \sigma _n). \end{aligned}$$

The preceding theorem shows that if the neglected singular values \(\sigma _{m+1},\ldots ,\sigma _n\) are small, then the reduced order LTI system is close to the original one.

Let us now see how to construct the low-order model \(\mathbf{(LTI)} _m\). We set

$$\begin{aligned} \mathscr {W}_m =L_o Y_m \Sigma _m^{-1/2}\quad \mathrm{and}\quad \mathscr {V}_m = L_c Z_m \Sigma _m^{-1/2}, \end{aligned}$$
(18)

where \(\Sigma _m=\mathrm{diag}(\sigma _1,\ldots ,\sigma _m)\) and \(Z_m\) and \(Y_m\) correspond to the leading m columns of the matrices Z and Y given by the singular value decomposition (16).

The matrices of the reduced LTI system

$$\begin{aligned} {\widetilde{\mathbf{(LTI)}} }_m \, \equiv \, \left[ \begin{array}{c|c} A_m &{} B_m \\ \hline C_m &{} 0 \end{array} \right] , \end{aligned}$$

are given by

$$\begin{aligned} A_m = \mathscr {W}_m^\mathrm{T} A \mathscr {V}_m,\, B_m=\mathscr {W}_m^\mathrm{T} B\, \mathrm{and} \, C_m = C\mathscr {V}_m. \end{aligned}$$
(19)

Notice that \(\mathscr {V}_m \mathscr {W}_m^\mathrm{T}\) is an oblique projector, \({\widetilde{P}} \mathscr {W}_m=\mathscr {V}_m \Sigma _m\) and \({\widetilde{Q}} \mathscr {V}_m =\mathscr {W}_m \Sigma _m\).
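On a small dense example, the whole square-root procedure (15)–(19) and the bound of Theorem 1 can be verified numerically. A sketch with arbitrary illustrative data; since a frequency grid only underestimates the true \(\mathscr {H}_{\infty }\) norm, the measured error must stay below the bound:

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(2)
n, m = 6, 3
A = -np.diag(np.arange(1.0, n + 1.0))       # stable, for illustration
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

P = linalg.solve_continuous_lyapunov(A, -B @ B.T)
Q = linalg.solve_continuous_lyapunov(A.T, -C.T @ C)
Lc = np.linalg.cholesky(P)
Lo = np.linalg.cholesky(Q)
Z, sig, Yt = np.linalg.svd(Lc.T @ Lo)       # sig holds the Hankel singular values

# Projection matrices (18) and reduced matrices (19)
Wm = Lo @ Yt[:m].T @ np.diag(sig[:m] ** -0.5)
Vm = Lc @ Z[:, :m] @ np.diag(sig[:m] ** -0.5)
Am, Bm, Cm = Wm.T @ A @ Vm, Wm.T @ B, C @ Vm

def Fjw(A_, B_, C_, w):
    # transfer function F(jw) = C (jw I - A)^{-1} B as in (6)
    return C_ @ np.linalg.solve(1j * w * np.eye(A_.shape[0]) - A_, B_)

omegas = np.linspace(-50.0, 50.0, 1001)
err = max(np.linalg.svd(Fjw(A, B, C, w) - Fjw(Am, Bm, Cm, w),
                        compute_uv=False)[0] for w in omegas)
bound = 2.0 * sig[m:].sum()                 # Theorem 1
```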

The use of Cholesky factors of the Gramians P and Q is not practical for large-scale problems. Instead, as we will see later, one can compute low-rank approximations of P and Q in factored form and use them to construct an approximate Lyapunov-balanced truncation model.

Let \(\widetilde{A}\), \(\widetilde{B}\) and \(\widetilde{C}\) be the following matrices:

$$\begin{aligned} {\widetilde{A}}= \left( \begin{array}{l@{\quad }l} A &{} 0\\ 0 &{} A_m \end{array} \right) ,\,\, {\widetilde{B}}=\left( \begin{array}{l} B\\ B_m \end{array} \right) ,\,\, {\widetilde{C}}= \left( \begin{array}{l@{\quad }l} C&C_m \end{array} \right) .\,\, \end{aligned}$$
(20)

Then, the Gramians corresponding to the error dynamical system

$$\begin{aligned} {\widetilde{\mathbf{(LTI)}}} \equiv \left( \begin{array}{c|c} {\widetilde{A}} &{} {\widetilde{B}}\\ \hline {\widetilde{C}} &{} 0 \end{array} \right) \end{aligned}$$

are the solutions of the following Lyapunov matrix equations:

$$\begin{aligned} {\widetilde{A}} {\widetilde{P}}+ {\widetilde{P}} {\widetilde{A}}^T+ {\widetilde{B}}{\widetilde{B}}^T= 0, \end{aligned}$$

and

$$\begin{aligned} \widetilde{A}^{T} {\widetilde{Q}}+ {\widetilde{Q}} {\widetilde{A}}+ {\widetilde{C}}^T{\widetilde{C}}=0. \end{aligned}$$

Therefore, the Hankel norm of the error can be expressed as

$$\begin{aligned} \Vert F(s)-F_m(s) \Vert _{\mathscr {H}} = \sqrt{\lambda _\mathrm{max} ({\widetilde{P}} {\widetilde{Q}})}. \end{aligned}$$

We note that other model reduction techniques, such as the cross-Gramian method (Antoulas 2005), require the solution of large Sylvester matrix equations to construct the reduced order model. Next, we present the rational block Arnoldi algorithm for solving the large Lyapunov (or, more generally, Sylvester) matrix equations that arise in the construction of reduced order models by balanced truncation techniques.

3 The rational block Arnoldi method for solving large Sylvester matrix equations

Consider the following Sylvester matrix equation:

$$\begin{aligned} AX+XD +EF^\mathrm{T} =0, \end{aligned}$$
(21)

where \( A \in \mathbb {R}^{n \times n} \) and \( D \in \mathbb {R}^{p \times p} \) are large and sparse stable matrices. We assume that \( E \in \mathbb {R}^{n \times r} \) and \( F \in \mathbb {R}^{p \times r} \) are of full rank r, with \(r \ll n,p\).

The Bartels–Stewart algorithm (Bartels and Stewart 1972) is the standard and widely used direct method for the solution of Sylvester equations of small to moderate size. However, this direct method is unsuitable when either of the matrices A or D is large and sparse. For medium and large coefficient matrices, iterative schemes have to be used. Krylov-type subspace methods, such as those based on the Arnoldi process (El et al. 2002; Jbilou 2006, 2010; Simoncini 2007), are attractive if the matrices are sparse and no information about the spectra of A and D is available. The Smith method (Penzl 2012) and alternating direction implicit (ADI) iterations can also be applied if spectral information about A and D is available. Note that ADI iterations converge quickly if sub-optimal shifts for A and D can be computed effectively and linear systems with shifted coefficient matrices can be solved at low cost. Here, we use a method based on rational Krylov subspaces; see Druskin and Simoncini (2011).

Let us first recall the following rational block Krylov subspaces:

$$\begin{aligned} {\mathbf {K}}_m(A,E) =\mathrm{Range} ( \{ E,(A-s_2 I)^{-1} E,\ldots ,\prod _{i=2}^{m}(A-s_{i} I)^{-1}E \}), \end{aligned}$$
(22)

and

$$\begin{aligned} {\mathbf {K}}_m(D^\mathrm{T} ,F) =\mathrm{Range} ( \{ F,(D^\mathrm{T}- \tilde{s}_2 I)^{-1} F,\ldots ,\prod _{i=2}^{m}(D^\mathrm{T}-{\tilde{s}_i} I)^{-1}F \}), \end{aligned}$$
(23)

where the shift parameters \(s_i\) and \(\tilde{s}_i\), \(i=2,\ldots ,m\), are either selected in advance or generated adaptively during the construction of the process. In our numerical tests, we used two strategies: the first is the a priori selection from Lyapack (Mehrmann and Penzl 1998); the second consists in selecting, at each iteration, a new shift \(s_{m+1}\) which is used to compute the next basis vector. For the latter, we used the adaptive selection of Abidi et al. (2016).

The rational block Arnoldi algorithm for the pair (A, V), where \(V \in \mathbb {R}^{n \times r}\), is summarized as follows:

[Algorithm 1: the rational block Arnoldi algorithm for the pair (A, V)]

After m steps, the rational block Arnoldi algorithm generates a block matrix \(\mathscr {V}_m= [V_1,\ldots , V_m] \in \mathbb {R}^{n\times mr}\) whose columns form an orthonormal basis of the rational block Krylov subspace \({\mathbf {K}}_m(A,V)\), and an upper \((m+1)r\times mr\) block Hessenberg matrix \(\overline{\mathscr {H}}_{A,m}\) whose blocks \(H_{i,j}^A\) are defined by Algorithm 1. The \(mr\times mr\) upper block Hessenberg matrix \(\mathscr {H}_{A,m}\) is obtained from \(\overline{\mathscr {H}}_{A,m}\) by deleting its last r rows.
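A minimal sketch of the basis construction behind Algorithm 1, under simplifying assumptions: real shifts, dense solves, and plain block Gram–Schmidt with QR for the orthonormalization; the Hessenberg coefficients \(H_{i,j}^A\) are not accumulated here.

```python
import numpy as np

def rational_block_arnoldi_basis(A, V, shifts):
    """Orthonormal basis of K_m(A, V) in (22), with m = len(shifts) + 1 blocks.
    Blocks follow the product form E, (A - s_2 I)^{-1} E, ... of (22)."""
    n, r = V.shape
    Q, _ = np.linalg.qr(V)                   # V_1
    blocks, W = [Q], V.copy()
    for s in shifts:                          # shifts s_2, ..., s_m
        W = np.linalg.solve(A - s * np.eye(n), W)
        Wo = W.copy()
        for _ in range(2):                    # two Gram-Schmidt passes for stability
            for Vi in blocks:
                Wo -= Vi @ (Vi.T @ Wo)
        Q, _ = np.linalg.qr(Wo)
        blocks.append(Q)
    return np.hstack(blocks)                  # n x (m*r), orthonormal columns

rng = np.random.default_rng(3)
n, r = 50, 2
A = rng.standard_normal((n, n)) - 5.0 * np.eye(n)
E = rng.standard_normal((n, r))
Vm = rational_block_arnoldi_basis(A, E, shifts=[-1.0, -2.0, -3.0])
```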

When applied to the pairs (A, E) and \((D^\mathrm{T},F)\), the rational block Arnoldi algorithm constructs systems of matrices \(\{V_1,\ldots , V_m\}\) and \(\{W_1,\ldots , W_m\}\) forming orthonormal bases of the rational block Krylov subspaces \(\mathbf {K}_m(A,E)\) and \(\mathbf {K}_m(D^\mathrm{T} ,F)\), respectively. Let

$$\begin{aligned} \mathscr {T}_{A,m}=\mathscr {V}_m^\mathrm{T} A \mathscr {V}_m, \quad \mathscr {T}_{D,m}=\mathscr {W}_m^\mathrm{T} D^\mathrm{T} \mathscr {W}_m, \end{aligned}$$
(24)

where \(\mathscr {V}_m=[V_1,\ldots , V_m ]\) and \(\mathscr {W}_m=[W_1,\ldots , W_m ]\). The matrices \(\mathscr {T}_{A,m}\) and \(\mathscr {T}_{D,m}\) can be obtained directly from \(\mathscr {H}_{A,m}\) and \(\mathscr {H}_{D,m}\), respectively (see Abidi et al. 2016); for example,

$$\begin{aligned} \mathscr {T}_{A,m}= (I_{mr} + \mathscr {H}_{A,m} S_m - \mathscr {V}_m^*AV_{m+1}H^A_{m+1,m}E_m^*)\mathscr {H}_{A,m}^{-1}, \end{aligned}$$

where \(S_m=\mathrm{diag}(s_2 I_r, \ldots , s_{m+1} I_r)\) and \(E_m^\mathrm{T}=[0_r,\ldots ,0_r,I_r] =(e_m^\mathrm{T} \otimes I_r)\).

From Algorithm 1, we can deduce the following relations:

$$\begin{aligned} A\mathscr {V}_m = \mathscr {V}_m \mathscr {T}_{A,m} - \Phi _{A,m}H_{m+1,m}^AE_m^\mathrm{T}\mathscr {H}_{A,m}^{-1}+ V_{m+1}H^A_{m+1,m}E_m^\mathrm{T} S_m \mathscr {H}_{A,m}^{-1}, \end{aligned}$$
(25)

where

$$\begin{aligned} \Phi _{A,m}=(I_n-\mathscr {V}_m\mathscr {V}_m^{*})AV_{m+1}. \end{aligned}$$
(26)

We also have

$$\begin{aligned} D^\mathrm{T} \mathscr {W}_m = \mathscr {W}_m \mathscr {T}_{D,m} - \Phi _{D,m}H_{m+1,m}^D E_m^\mathrm{T} \mathscr {H}_{D,m}^{-1}+ W_{m+1}H^D_{m+1,m}E_m^\mathrm{T} \tilde{S}_m \mathscr {H}_{D,m}^{-1}, \end{aligned}$$
(27)

where

$$\begin{aligned} \Phi _{D,m}=(I_p-\mathscr {W}_m\mathscr {W}_m^\mathrm{T})D^\mathrm{T} W_{m+1}, \end{aligned}$$
(28)

and \(\tilde{S}_m= \mathrm{diag}(\tilde{s}_2 I_r, \ldots , \tilde{s}_{m+1}I_r)\).

We have \(E_m^\mathrm{T} S_m=(e_m^\mathrm{T} \otimes I_r)(D_m \otimes I_r)\), where \(D_m=\mathrm{diag}(s_2 , \ldots , s_{m+1} )\); hence \(E_m^\mathrm{T} S_m=(e_m^\mathrm{T} D_m \otimes I_r)=s_{m+1}(e_m^\mathrm{T} \otimes I_r)=s_{m+1}E_m^\mathrm{T}\), and similarly \(E_m^\mathrm{T} \tilde{S}_m=\tilde{s}_{m+1}E_m^\mathrm{T} \).
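This Kronecker-product identity is easy to verify numerically. A sketch with \(r = 2\), \(m = 3\) and arbitrary shifts:

```python
import numpy as np

r, m = 2, 3
shifts = np.array([0.5, -1.5, 2.0])            # s_2, ..., s_{m+1}
e_m = np.zeros(m); e_m[-1] = 1.0               # last canonical basis vector
EmT = np.kron(e_m, np.eye(r))                  # E_m^T = e_m^T (x) I_r, shape (r, m*r)
Sm = np.kron(np.diag(shifts), np.eye(r))       # S_m = diag(s_2 I_r, ..., s_{m+1} I_r)

lhs = EmT @ Sm
rhs = shifts[-1] * EmT                         # s_{m+1} E_m^T
```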

Using the relations given in (25) and (27), we get

$$\begin{aligned} A\mathscr {V}_m = \mathscr {V}_m \mathscr {T}_{A,m} + (s_{m+1}V_{m+1}- \Phi _{A,m}) H_{m+1,m}^AE_m^\mathrm{T} \mathscr {H}_{A,m}^{-1}, \end{aligned}$$
(29)

and

$$\begin{aligned} D^\mathrm{T} \mathscr {W}_m = \mathscr {W}_m \mathscr {T}_{D,m} + (\tilde{s}_{m+1}W_{m+1}- \Phi _{D,m}) H_{m+1,m}^D E_m^\mathrm{T} \mathscr {H}_{D,m}^{-1}. \end{aligned}$$
(30)

When applying Krylov-based methods for solving the Sylvester matrix equation (21), one seeks a low-rank approximate solution of the form

$$\begin{aligned} \mathscr {X}_m=\mathscr {V}_m \mathscr {Y}_m\mathscr {W}_m^\mathrm{T}, \end{aligned}$$
(31)

such that the following Galerkin condition is satisfied:

$$\begin{aligned} \mathscr {V}_m^\mathrm{T} \mathscr {R}(\mathscr {X}_m)\mathscr {W}_m=0, \end{aligned}$$
(32)

where \(\mathscr {R}(\mathscr {X}_m)\) is the residual corresponding to the approximation \(\mathscr {X}_m\) and given by

$$\begin{aligned} \mathscr {R}(\mathscr {X}_m)= A\mathscr {X}_m+\mathscr {X}_mD +EF^\mathrm{T}. \end{aligned}$$
(33)

Therefore, replacing \(\mathscr {X}_m\) by \(\mathscr {V}_m \mathscr {Y}_m\mathscr {W}_m^\mathrm{T}\) in (32), we obtain

$$\begin{aligned} \mathscr {V}_m^\mathrm{T} A \mathscr {V}_m \mathscr {Y}_m \mathscr {W}_m^\mathrm{T} \mathscr {W}_m + \mathscr {V}_m^\mathrm{T} \mathscr {V}_m \mathscr {Y}_m\mathscr {W}_m^\mathrm{T} D \mathscr {W}_m +\mathscr {V}_m^\mathrm{T} EF^\mathrm{T} \mathscr {W}_m=0. \end{aligned}$$
(34)

We get the following low-dimensional Sylvester matrix equation:

$$\begin{aligned} \mathscr {T}_{A,m} \mathscr {Y}_m + \mathscr {Y}_m\mathscr {T}_{D,m}^\mathrm{T} +(\mathscr {V}_m^\mathrm{T} E )(\mathscr {W}_m^\mathrm{T} F)^T=0. \end{aligned}$$
(35)

Now, writing \([V_1,R ]=\mathrm{QR}(E)\) and \([W_1,S ]=\mathrm{QR}(F)\) (the QR factorizations of the matrices E and F, respectively), we have

$$\begin{aligned} \mathscr {V}_m^\mathrm{T} E=\mathscr {V}_m^\mathrm{T} V_1 R= E_1 R \quad \text {and} \quad \mathscr {W}_m^\mathrm{T} F=\mathscr {W}_m^\mathrm{T} W_1 S= E_1 S, \end{aligned}$$
(36)

with \(E_1=e_1 \otimes I_r\). The Sylvester matrix equation (35) is solved by a direct method such as the Bartels–Stewart algorithm (Bartels and Stewart 1972). In the next theorem, we give a computable expression for the norm of the residual that does not require forming the approximate solution, which is computed, in factored form, only at the end of the process.
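The whole Galerkin procedure (31)–(35) can be sketched with dense linear algebra. Illustrative data; the small equation (35) is solved with SciPy's `solve_sylvester` (which solves \(ax + xb = q\)), and only the Galerkin condition (32) is checked, since the residual norm itself depends on how well the subspaces capture the solution:

```python
import numpy as np
from scipy import linalg

def rat_basis(M, V, shifts):
    # orthonormal basis of the rational block Krylov subspace (22), as a sketch
    n = M.shape[0]
    Q, _ = np.linalg.qr(V)
    blocks, W = [Q], V.copy()
    for s in shifts:
        W = np.linalg.solve(M - s * np.eye(n), W)
        Wo = W.copy()
        for _ in range(2):                     # two Gram-Schmidt passes
            for Vi in blocks:
                Wo -= Vi @ (Vi.T @ Wo)
        Q, _ = np.linalg.qr(Wo)
        blocks.append(Q)
    return np.hstack(blocks)

rng = np.random.default_rng(4)
n, p, r = 40, 30, 2
A = rng.standard_normal((n, n)) - 5.0 * np.eye(n)
D = rng.standard_normal((p, p)) - 5.0 * np.eye(p)
E = rng.standard_normal((n, r)); F = rng.standard_normal((p, r))

V = rat_basis(A, E, [-1.0, -2.0, -4.0])
W = rat_basis(D.T, F, [-1.0, -2.0, -4.0])

TA = V.T @ A @ V                      # (24)
TD = W.T @ D.T @ W
# small Sylvester equation (35): TA Y + Y TD^T + (V^T E)(W^T F)^T = 0
Y = linalg.solve_sylvester(TA, TD.T, -(V.T @ E) @ (W.T @ F).T)
X = V @ Y @ W.T                       # low-rank approximation (31)

R = A @ X + X @ D + E @ F.T           # residual (33)
galerkin = np.linalg.norm(V.T @ R @ W)
```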

Theorem 2

Let \(\mathscr {V}_m=[V_1,\ldots , V_m ]\) and \(\mathscr {W}_m=[W_1,\ldots , W_m ]\) be the matrices whose columns form bases of the rational Krylov subspaces given by (22) and (23), respectively. Let \(\mathscr {X}_m=\mathscr {V}_m \mathscr {Y}_m\mathscr {W}_m^\mathrm{T}\) be the approximate solution of the Sylvester matrix equation (21); then the residual norm is given as follows:

$$\begin{aligned} ||\mathscr {R}(\mathscr {X}_m) ||_2= ||S_1JS_2^\mathrm{T} ||_2, \quad J=\begin{bmatrix} 0&\quad I_r \\ I_r&\quad 0 \end{bmatrix}, \end{aligned}$$

where \(S_1\) and \(S_2\) are the \(2r \times 2r\) upper triangular matrices obtained from the QR decompositions of the matrices

$$\begin{aligned} U_1 = \begin{bmatrix} \mathscr {V}_m \mathscr {Y}_m\mathscr {H}_{D,m}^\mathrm{-T} E_m H_{m+1,m}^{D^\mathrm{T} }&\,\,\,\,s_{m+1}V_{m+1}- \Phi _{A,m} \end{bmatrix} \end{aligned}$$

and

$$\begin{aligned} U_2 = \begin{bmatrix} \mathscr {W}_m \mathscr {Y}_m^\mathrm{T} \mathscr {H}_{A,m}^\mathrm{-T} E_m H_{m+1,m}^{A^\mathrm{T}}&\,\,\,\, \tilde{s}_{m+1}W_{m+1}- \Phi _{D,m} \end{bmatrix}. \end{aligned}$$

The quantities \(\Phi _{A,m}\) and \(\Phi _{D,m}\) are given by the expressions (26) and (28), respectively.

Proof

We have

$$\begin{aligned} \mathscr {R}(\mathscr {X}_m)= & {} A\mathscr {X}_m+\mathscr {X}_mD +EF^\mathrm{T} \\= & {} A\mathscr {V}_m \mathscr {Y}_m\mathscr {W}_m^\mathrm{T} +\mathscr {V}_m \mathscr {Y}_m\mathscr {W}_m^\mathrm{T} D +EF^\mathrm{T}. \end{aligned}$$

Replacing \(A\mathscr {V}_m\) and \(\mathscr {W}_m^\mathrm{T} D\) by (29) and (30), respectively, we get

$$\begin{aligned} \mathscr {R}(X_m)= & {} \big [ \mathscr {V}_m \mathscr {T}_{A,m} + (s_{m+1}V_{m+1}- \Phi _{A,m}) H_{m+1,m}^AE_m^\mathrm{T} \mathscr {H}_{A,m}^{-1} \big ] \mathscr {Y}_m\mathscr {W}_m^\mathrm{T} \\&+ \mathscr {V}_m \mathscr {Y}_m \big [ \mathscr {W}_m \mathscr {T}_{D,m} + (\tilde{s}_{m+1}W_{m+1}- \Phi _{D,m}) H_{m+1,m}^D E_m^\mathrm{T} \mathscr {H}_{D,m}^{-1} \big ]^\mathrm{T} +EF^\mathrm{T} \\= & {} \mathscr {V}_m \mathscr {T}_{A,m}\mathscr {Y}_m\mathscr {W}_m^\mathrm{T} + (s_{m+1}V_{m+1}- \Phi _{A,m}) H_{m+1,m}^AE_m^\mathrm{T} \mathscr {H}_{A,m}^{-1}\mathscr {Y}_m\mathscr {W}_m^\mathrm{T} \\&+ \mathscr {V}_m \mathscr {Y}_m \mathscr {T}_{D,m}^\mathrm{T} \mathscr {W}_m^\mathrm{T} + \mathscr {V}_m \mathscr {Y}_m \mathscr {H}_{D,m}^\mathrm{-T} E_m H_{m+1,m}^{D^\mathrm{T} }(\tilde{s}_{m+1}W_{m+1}^T- \Phi _{D,m}^T) +EF^\mathrm{T}. \end{aligned}$$

Using the fact that \(\mathscr {Y}_m\) solves the low-dimensional Sylvester equation (35), it follows that

$$\begin{aligned} \mathscr {R}(\mathscr {X}_m)= & {} (s_{m+1}V_{m+1}- \Phi _{A,m}) H_{m+1,m}^AE_m^\mathrm{T} \mathscr {H}_{A,m}^{-1}\mathscr {Y}_m\mathscr {W}_m^\mathrm{T} \\&+ \mathscr {V}_m \mathscr {Y}_m \mathscr {H}_{D,m}^\mathrm{-T} E_m H_{m+1,m}^{D^\mathrm{T} }(\tilde{s}_{m+1}W_{m+1}^T- \Phi _{D,m}^T) \\= & {} \begin{bmatrix} \mathscr {V}_m \mathscr {Y}_m \mathscr {H}_{D,m}^\mathrm{-T} E_m H_{m+1,m}^{D^\mathrm{T} }&\, s_{m+1}V_{m+1}- \Phi _{A,m} \end{bmatrix} \begin{bmatrix} 0&\quad I_r \\ I_r&\quad 0 \end{bmatrix} \begin{bmatrix} H_{m+1,m}^AE_m^\mathrm{T} \mathscr {H}_{A,m}^{-1}\mathscr {Y}_m\mathscr {W}_m^\mathrm{T} \\ \tilde{s}_{m+1}W_{m+1}^T- \Phi _{D,m}^T \end{bmatrix} \\= & {} U_1 J U_2^\mathrm{T}. \end{aligned}$$

Therefore, using the QR factorizations \(U_1=\widetilde{Q}_1 S_1\) and \(U_2=\widetilde{Q}_2 S_2\), the result follows.

The following result shows that \(\mathscr {X}_m\) is an exact solution of a perturbed Sylvester matrix equation.

Proposition 1

The approximate solution \(\mathscr {X}_m=\mathscr {V}_m \mathscr {Y}_m\mathscr {W}_m^\mathrm{T}\) solves the following perturbed Sylvester matrix equation:

$$\begin{aligned}{}[A-\Delta _{A,m}] \mathscr {X}_m + \mathscr {X}_m[D- \Delta _{D,m} ] + EF^\mathrm{T}=0, \end{aligned}$$

where

$$\begin{aligned} \Delta _{A,m}=(s_{m+1}V_{m+1}- \Phi _{A,m}) H_{m+1,m}^AE_m^\mathrm{T} \mathscr {H}_{A,m}^{-1} \mathscr {V}_m^\mathrm{T} , \end{aligned}$$

and

$$\begin{aligned} \Delta _{D,m}=\mathscr {W}_m \mathscr {H}_{D,m}^\mathrm{-T} E_m H_{m+1,m}^{D^\mathrm{T} }(\tilde{s}_{m+1}W_{m+1}^T- \Phi _{D,m}^T). \end{aligned}$$

Proof

Multiplying Eq. (35) on the left by \(\mathscr {V}_m\) and on the right by \(\mathscr {W}_m^\mathrm{T}\), and using the relations (29) and (30), the result follows.

An important issue when dealing with high-dimensional problems is the large amount of memory required for storage. In order to save memory, we can express the approximation \(\mathscr {X}_m\) in factored form. Let \(\mathscr {Y}_m=\widetilde{U} \widetilde{\Sigma } \widetilde{V}^\mathrm{T}\) be the SVD of \(\mathscr {Y}_m\), where \(\widetilde{\Sigma }\) is the matrix of the singular values of \(\mathscr {Y}_m\) sorted in decreasing order and \(\widetilde{U}\) and \(\widetilde{V}\) are unitary matrices. We choose a tolerance dtol and define \(\widetilde{U}_l\) and \(\widetilde{V}_l\) as the matrices of the first l columns of \(\widetilde{U}\) and \(\widetilde{V}\), corresponding to the l singular values of magnitude greater than dtol.

Setting \(\widetilde{\Sigma }_l={\mathrm{diag}}[\sigma _1,\ldots ,\sigma _l]\), we get the approximation \(\mathscr {Y}_m \approx \widetilde{U}_l \widetilde{\Sigma }_l \widetilde{V}_l^\mathrm{T}\) (which is the best l-rank approximation of \(\mathscr {Y}_m\)).

Then we have the low-rank approximation

$$\begin{aligned} \mathscr {X}_m \approx Z_m^{A}(Z_m^{D})^T, \end{aligned}$$
(37)

with \(Z_m^{A}=\mathscr {V}_m \widetilde{U}_l \widetilde{\Sigma }_l^{1/2}\) and \(Z_m^{D}=\mathscr {W}_m \widetilde{V}_l \widetilde{\Sigma }_l^{1/2}\).
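The truncation step alone can be sketched as follows, on an illustrative projected solution of exact rank 3 (only the SVD-based truncation of \(\mathscr {Y}_m\) is shown; the factors would then be multiplied by the bases \(\mathscr {V}_m\) and \(\mathscr {W}_m\) as in (37)):

```python
import numpy as np

rng = np.random.default_rng(5)
mr = 12                                    # dimension of the projected problem
G = rng.standard_normal((mr, 3))
Ym = G @ G.T                               # symmetric, exact rank 3

U, svals, Vt = np.linalg.svd(Ym)
dtol = 1e-10
l = int(np.sum(svals > dtol))              # number of retained singular values

Ul, Vl = U[:, :l], Vt[:l].T
Sl = np.diag(svals[:l])
# factored form as in (37): with the bases one would set
#   Z_A = Vm @ Ul @ sqrt(Sl),  Z_D = Wm @ Vl @ sqrt(Sl)   (names illustrative)
Ym_lr = (Ul @ np.sqrt(Sl)) @ (Vl @ np.sqrt(Sl)).T
```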

Remark 1

When considering the Lyapunov balanced truncation method, we have seen that one has to compute the Gramians P and Q by solving the two coupled Lyapunov matrix equations (11) and (12), respectively. For large problems, these Gramians are computed using the rational block Arnoldi algorithm and are given in factored form, \(P \approx Z_{P} {Z_P}^\mathrm{T}\) and \(Q \approx Z_{Q} {Z_Q}^\mathrm{T}\). These factorizations are then used in place of the Cholesky factors to build the balanced truncation reduced order model.

The rational block Arnoldi algorithm for solving large-scale Sylvester matrix equations is given as follows:

[Algorithm 2: the rational block Arnoldi algorithm for large-scale Sylvester matrix equations]

4 The Riccati-balanced truncation method

4.1 The LQG-Riccati method for model reduction

In this subsection, we no longer assume that the original system is stable, and we present a reduced order model method called the linear quadratic Gaussian (LQG) balanced truncation method; see Fortuna et al. (1992) and Mullis and Roberts (1976). The basic idea of LQG balanced truncation is to replace the Lyapunov Gramians P and Q used in classical balanced truncation for stable systems by the stabilizing solutions of the dual Kalman filtering algebraic Riccati equation (FARE) and the continuous-time algebraic Riccati equation (CARE), defined as follows:

$$\begin{aligned} AP+PA^\mathrm{T} -PC^TCP+BB^\mathrm{T}=0,\, \, (\mathrm{FARE}) \end{aligned}$$
(38)

and

$$\begin{aligned} A^TQ+QA -QBB^TQ+C^TC=0.\, \, (\mathrm{CARE}) \end{aligned}$$
(39)

Assuming that the classical conditions of controllability and detectability are satisfied, let \(P_+\) and \(Q_+\) be the stabilizing positive semidefinite solutions of the Riccati equations (FARE) and (CARE), respectively, which means that the eigenvalues of the closed-loop matrices \(A-P_+C^\mathrm{T}C\) and \(A^\mathrm{T}-Q_+BB^\mathrm{T}\) lie in the open left half-plane \(\mathbb {C}^-\). It is known (Fortuna et al. 1992) that, as for classical balanced truncation, the eigenvalues of the product \(P_+Q_+\) are invariant under any state coordinate transformation \(\tilde{x}(t) =Tx(t)\), where T is a nonsingular \(n \times n\) matrix, and we have

$$\begin{aligned} {\widetilde{P}}_+{\widetilde{Q}}_+ = TP_+Q_+T^{-1}. \end{aligned}$$

The Riccati Gramians \(P_+\) and \(Q_+\) are then used, as explained in Sect. 2.4, to construct the reduced order model in the same way as in balanced truncation via the Lyapunov Gramians obtained by solving two coupled Lyapunov matrix equations; here, those Lyapunov Gramians are replaced by the Riccati Gramians \(P_+\) and \(Q_+\).

As mentioned earlier, the solutions \(P_+\) and \(Q_+\) usually have low numerical rank and can therefore be approximated by low-rank factorizations \(P_+ \approx Z_m Z_m^\mathrm{T}\) and \(Q_+ \approx Y_m Y_m^\mathrm{T}\), where the factors \(Y_m\) and \(Z_m\) have low rank. As in the classical Lyapunov balanced truncation method, the factors \(Y_m\) and \(Z_m\) can be used to construct the LQG balanced truncation reduced order model.
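For moderate n, the two Riccati Gramians can be computed directly with SciPy. A sketch with illustrative data; `solve_continuous_are(a, b, q, r)` solves \(a^\mathrm{T}X+Xa-XbR^{-1}b^\mathrm{T}X+q=0\), so (39) is solved directly and (38) by duality:

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(6)
n, r, s = 6, 2, 2
A = rng.standard_normal((n, n))            # not necessarily stable
B = rng.standard_normal((n, r))
C = rng.standard_normal((s, n))

Qp = linalg.solve_continuous_are(A, B, C.T @ C, np.eye(r))     # CARE (39)
Pp = linalg.solve_continuous_are(A.T, C.T, B @ B.T, np.eye(s)) # FARE (38), dual problem

res_fare = np.linalg.norm(A @ Pp + Pp @ A.T - Pp @ C.T @ C @ Pp + B @ B.T)
res_care = np.linalg.norm(A.T @ Qp + Qp @ A - Qp @ B @ B.T @ Qp + C.T @ C)

# Stabilizing: the closed-loop matrices are stable
cl1 = np.linalg.eigvals(A - Pp @ C.T @ C).real
cl2 = np.linalg.eigvals(A - B @ B.T @ Qp).real
```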

Next, we show how to construct low-rank approximate solutions to algebraic Riccati equations (38) and (39) by using the rational block Arnoldi process.

4.2 The rational block Arnoldi for continuous-time algebraic Riccati equations

Consider the continuous-time algebraic Riccati equation

$$\begin{aligned} A^\mathrm{T} X + XA- XBB^\mathrm{T} X+ C^\mathrm{T} C=0, \end{aligned}$$
(40)

where \( A \in \mathbb {R}^{n \times n} \) is nonsingular, \( B \in \mathbb {R}^{n \times r} \) and \( C \in \mathbb {R}^{r \times n} \). The matrices B and C are assumed to be of full rank with \( r \ll n \).

Riccati equations play a fundamental role in many areas such as control, filter design theory, differential equations and robust control problems. For historical developments, applications and the importance of algebraic Riccati equations, we refer to Abou-Kandil et al. (2003), Bittanti et al. (1991), Datta (2003), Heyouni and Jbilou (2009), Jbilou (2003, 2006) and Kleinman (1968) and the references therein.

Under the hypotheses that the pair (A, B) is stabilizable (i.e., there exists a matrix S such that \( A-B\,S \) is stable) and the pair (C, A) is detectable (i.e., \( (A^\mathrm{T},C^\mathrm{T}) \) is stabilizable), Eq. (40) has a unique symmetric positive semidefinite and stabilizing solution.
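As an illustration of this existence result, the stabilizing solution of (40) can be computed for a small dense system with SciPy's `solve_continuous_are` (a random system is, generically, stabilizable and detectable). This is a sketch for small problems only, not the large-scale method developed below:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

rng = np.random.default_rng(1)
n, r = 8, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, r))
C = rng.standard_normal((r, n))

# Unique stabilizing PSD solution of A^T X + X A - X B B^T X + C^T C = 0
# (Q = C^T C, R = I_r in SciPy's convention).
X = solve_continuous_are(A, B, C.T @ C, np.eye(r))

res = A.T @ X + X @ A - X @ B @ B.T @ X + C.T @ C
assert np.linalg.norm(res) < 1e-6 * (1 + np.linalg.norm(X))
# Stabilizing: the closed loop A - B B^T X is stable.
assert np.linalg.eigvals(A - B @ B.T @ X).real.max() < 0
```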

To extract low-rank approximate solutions to the continuous-time algebraic Riccati equation (40), we project the initial problem onto the rational block Krylov subspace \( {\mathscr {K}}_{m}(A^\mathrm{T},C^\mathrm{T})\). Applying the Rational block Arnoldi process (Algorithm 1) to the pair \( (A^\mathrm{T},C^\mathrm{T}) \) gives us an orthonormal basis \( \{V_1,\ldots ,V_m\} \) of \( {\mathscr {K}}_m(A^\mathrm{T},C^\mathrm{T}) \).

We consider low-rank approximate solutions that have the form

$$\begin{aligned} \mathscr {X}_m = {\mathscr {V}}_m\, \mathscr {Y}_m\,{\mathscr {V}}_m^\mathrm{T}, \end{aligned}$$
(41)

where \( {\mathscr {V}}_m = [V_1,\ldots ,V_m] \) and \( \mathscr {Y}_m \in \mathbb {R}^{2mr \times 2mr} \).

From now on, the matrix \({\mathscr {T}}_m\) is defined by \({\mathscr {T}}_m={\mathscr {V}}_m^\mathrm{T}\, A^\mathrm{T}\, {\mathscr {V}}_m\).

Setting \( \mathscr {R}_m (\mathscr {X}_m) = A^\mathrm{T} \mathscr {X}_m + \mathscr {X}_mA- \mathscr {X}_mBB^\mathrm{T} \mathscr {X}_m+ C^\mathrm{T} C\) and imposing the Galerkin condition \(\mathscr {V}_m^\mathrm{T}\, \mathscr {R}_m (\mathscr {X}_m)\, \mathscr {V}_m=0\), we get the low-dimensional continuous-time algebraic Riccati equation:

$$\begin{aligned} {\mathscr {T}}_m\, \mathscr {Y}_m + \mathscr {Y}_m\,{\mathscr {T}}_m^\mathrm{T} - \mathscr {Y}_m\,{\tilde{B}}_m\,{\tilde{B}}_m^\mathrm{T}\, \mathscr {Y}_m + \tilde{C}_m^\mathrm{T}\,\tilde{C}_m = 0, \end{aligned}$$
(42)

where \(\tilde{B}_m= \mathscr {V}_m^\mathrm{T} B\) and \(\tilde{C}_m= C \mathscr {V}_m\). We assume that the projected algebraic Riccati equation (42) has a unique symmetric positive semidefinite and stabilizing solution \( \mathscr {Y}_m \). This solution can be obtained by a standard direct method such as the Schur method (Laub 1979).
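The projection-and-lift step can be sketched as follows. For simplicity, this sketch uses a plain (polynomial) block Krylov basis in place of the rational one, which would additionally require shift selection; the small projected equation (42) is solved by a dense direct solver, and the lifted approximation (41) is then formed:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

rng = np.random.default_rng(2)
n, r, m = 50, 2, 5
A = -2.0 * np.eye(n) + 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)  # stable test matrix
B = rng.standard_normal((n, r))
C = rng.standard_normal((r, n))

# Orthonormal basis of the (polynomial) block Krylov subspace K_m(A^T, C^T).
K = np.hstack([np.linalg.matrix_power(A.T, j) @ C.T for j in range(m)])
V, _ = np.linalg.qr(K)

# Projected small CARE (42): T_m Y + Y T_m^T - Y B~ B~^T Y + C~^T C~ = 0.
Tm = V.T @ A.T @ V
Bt = V.T @ B
Ct = C @ V
Y = solve_continuous_are(Tm.T, Bt, Ct.T @ Ct, np.eye(r))

Xm = V @ Y @ V.T  # lifted low-rank approximate solution, as in (41)
R = A.T @ Xm + Xm @ A - Xm @ B @ B.T @ Xm + C.T @ C  # residual of the full CARE
print(np.linalg.norm(R, 2))
```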

Proposition 2

Let \(\mathscr {V}_m=[V_1,\ldots , V_m ]\) be the matrix whose columns form an orthonormal basis of the rational block Krylov subspace \({\mathscr {K}}_m(A^\mathrm{T} ,C^\mathrm{T})\). Let \(\mathscr {X}_m=\mathscr {V}_m \mathscr {Y}_m\mathscr {V}_m^\mathrm{T}\) be the approximate solution given by (41) and let \(\mathscr {R}_m(\mathscr {X}_m)\) be the corresponding residual. Then

$$\begin{aligned} ||\mathscr {R}_m(\mathscr {X}_m) ||_2= ||\tilde{S}J\tilde{S}^\mathrm{T} ||_2, \quad J=\begin{bmatrix} 0&\quad I_r \\ I_r&\quad 0 \end{bmatrix}, \end{aligned}$$
(43)

where \(\tilde{S}\) is the \(2r \times 2r\) upper triangular matrix obtained from the QR decomposition of

$$\begin{aligned} \tilde{U}=\begin{bmatrix} \mathscr {V}_m \mathscr {Y}_m \mathscr {H}_{A^\mathrm{T},m}^\mathrm{-T} E_m H_{m+1,m}^{A^\mathrm{T}}&\,\,\, s_{m+1}V_{m+1}- \Phi _{A^\mathrm{T},m} \end{bmatrix} \end{aligned}$$
(44)

Proof

The proof is similar to the one given for Theorem 37.

Remark 2

The approximate solution \(\mathscr {X}_m\) can also be expressed as a product of two low-rank matrices. Consider the eigendecomposition \(\mathscr {Y}_m=\widetilde{U} \, \widetilde{D} \, \widetilde{U}^\mathrm{T}\), where \(\widetilde{D}\) is the diagonal matrix of the eigenvalues of the symmetric positive semidefinite solution \(\mathscr {Y}_m\), sorted in decreasing order. Let \(\widetilde{U}_l\) be the matrix of the first l columns of \(\widetilde{U}\), corresponding to the l eigenvalues greater than some tolerance toler. We obtain the truncated eigendecomposition \(\mathscr {Y}_m \approx \widetilde{U}_l\, D_l\, \widetilde{U}_l^\mathrm{T}\), where \(D_l = \mathrm{diag}[\lambda _1, \ldots , \lambda _l]\). Setting \(\mathscr {Z}_m={\mathscr {V}}_m \, \widetilde{U}_l\, D_l^{1/2}\), we get

$$\begin{aligned} \mathscr {X}_m \approx \mathscr {Z}_m \mathscr {Z}_m^\mathrm{T}. \end{aligned}$$
(45)

As we have seen earlier, the factor \(\mathscr {Z}_m \) is used to build the low-order (LQG) balanced truncation model.
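A minimal sketch of this truncation step (the helper name `low_rank_factor` is ours, for illustration):

```python
import numpy as np

def low_rank_factor(Vm, Ym, toler=1e-10):
    """Return Z_m with Vm Ym Vm^T ~ Z_m Z_m^T (sketch of Remark 2).

    Ym: symmetric positive semidefinite solution of the projected equation;
    Vm: orthonormal basis matrix. Eigenvalues below `toler` are discarded.
    """
    d, U = np.linalg.eigh(Ym)      # eigenvalues in increasing order
    d, U = d[::-1], U[:, ::-1]     # re-sort in decreasing order
    l = int(np.sum(d > toler))     # numerical rank of Ym
    return Vm @ U[:, :l] @ np.diag(np.sqrt(d[:l]))
```

Then `Zm = low_rank_factor(Vm, Ym)` gives \(\mathscr {X}_m \approx \mathscr {Z}_m \mathscr {Z}_m^\mathrm{T}\) as in (45).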

The rational block Arnoldi algorithm for computing low-rank approximate solutions to the continuous-time algebraic Riccati equation (40) is described as follows:

figure c

5 Numerical experiments

In this section, we present some numerical results to show the effectiveness of the proposed approaches. The experiments were performed on a computer with an Intel Core i5 processor at 1.3 GHz and 8 GB of RAM. The algorithms were coded in Matlab R2010a, and we used various well-known benchmark models listed in Table 1.

Table 1 Test matrices
Fig. 1 The maximum singular values for the balanced-rational block Arnoldi (solid) and IRKA (dashed) with \(\omega \in [10^{-5}, 10^5]\). Left: the CDplayer model with \(r=s=2\). Right: the FOM model with \(s=r=2\)

The matrices for the benchmark problems CDplayer and FOM were obtained from NICONET (Mehrmann and Penzl 1998), while the matrices for the Flow model were obtained from the discretization of a 2D convective thermal flow problem (flow meter model v0.5) from the Oberwolfach collection. Some information on these matrices is reported in Table 1. For the FDM model, the corresponding matrix A is obtained from the centered finite difference discretization of the operator

$$\begin{aligned} L_{A}(u) = \Delta u-f(x,y) \frac{\partial u}{\partial x}-g(x,y) \frac{\partial u}{\partial y}-h(x,y)u, \end{aligned}$$

on the unit square \([0, 1] \times [0, 1]\) with homogeneous Dirichlet boundary conditions with

$$\begin{aligned} \left\{ \begin{array}{lll} f(x,y) &{} = &{} \sin (x+2y),\\ g(x,y) &{} = &{} e^{x+y},\\ h(x,y) &{} = &{} x+y,\\ \end{array} \right. \end{aligned}$$

and the matrices B and C were random matrices with entries uniformly distributed in [0, 1]. The number of inner grid points in each direction was \(n_0=300\) and the dimension of A is \(n = n_0^2\).
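The FDM matrix described above can be assembled with Kronecker products of one-dimensional difference operators. The following sketch uses Python/SciPy (our experiments were in Matlab) and a small grid, \(n_0=20\), instead of \(n_0=300\):

```python
import numpy as np
import scipy.sparse as sp

def fdm_matrix(n0):
    """Centered FD discretization of L_A(u) = Du - f u_x - g u_y - h u
    on the unit square with homogeneous Dirichlet boundary conditions."""
    h = 1.0 / (n0 + 1)
    x = np.arange(1, n0 + 1) * h
    X, Y = np.meshgrid(x, x, indexing='ij')       # x along the first index
    f = np.sin(X + 2 * Y).ravel()
    g = np.exp(X + Y).ravel()
    c = (X + Y).ravel()                           # coefficient h(x, y)

    I = sp.identity(n0, format='csr')
    D2 = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n0, n0)) / h**2
    D1 = sp.diags([-1.0, 1.0], [-1, 1], shape=(n0, n0)) / (2 * h)

    lap = sp.kron(D2, I) + sp.kron(I, D2)         # 2D Laplacian
    Dx = sp.kron(D1, I)                           # d/dx
    Dy = sp.kron(I, D1)                           # d/dy
    return (lap - sp.diags(f) @ Dx - sp.diags(g) @ Dy - sp.diags(c)).tocsr()

A = fdm_matrix(20)   # 400 x 400; the paper's experiments use n0 = 300, n = 90000
```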

Example 1

In the first experiment, we considered the CDplayer and FOM models. For the FOM model, B and C were random matrices. Although the matrices of these models are of small size, they are commonly used as benchmark examples. In Fig. 1, we plot the maximum singular values of the exact (solid) and approximated (dashed) transfer functions for the CDplayer model (left), with a reduced space dimension of \(m=25\), and the FOM model (right), with a reduced space dimension of \(m=20\). For both models we used \(s=r=2\).
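Curves such as those in Fig. 1 are obtained by evaluating \(\sigma _{\max }(H(i\omega ))\), with \(H(\omega ) = C(i\omega I - A)^{-1}B\), on a logarithmic frequency grid. A dense sketch, practical only for moderate n (the function name and the small random test system are ours, for illustration):

```python
import numpy as np

def sigma_max(A, B, C, omegas):
    """Largest singular value of H(iw) = C (iw I - A)^{-1} B on a grid."""
    n = A.shape[0]
    return np.array([np.linalg.svd(
        C @ np.linalg.solve(1j * w * np.eye(n) - A, B),
        compute_uv=False)[0] for w in omegas])

# Small stable test system (random, for illustration only).
rng = np.random.default_rng(3)
n, r = 20, 2
A = -2.0 * np.eye(n) + rng.standard_normal((n, n)) / np.sqrt(n)
B = rng.standard_normal((n, r))
C = rng.standard_normal((r, n))
omegas = np.logspace(-5, 5, 50)   # the frequency range used in the experiments
sv = sigma_max(A, B, C, omegas)
```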

The plots in Fig. 2 show the norms of the errors for the balanced-rational block Arnoldi (solid) and IRKA (dashed) with \(\omega \in [10^{-5}, 10^5]\) for the CDplayer model (left) and the FOM model (right). The sizes of the reduced order dynamical systems were \(m=30\) for the CDplayer model and \(m=40\) for the FOM model.

Fig. 2 The norms of the errors for the balanced-rational block Arnoldi (solid) and IRKA (dashed) with \(\omega \in [10^{-5}, 10^5]\). Left: the CDplayer model with \(r=s=2\). Right: the FOM model with \(r=s=2\)

Example 2

For this example, we compared the performance of the balanced rational block Arnoldi and IRKA algorithms. In Table 2, we report the \(\mathscr {H}_{\infty }\) norm of the errors, the size of the reduced order system and the execution times. As seen from this table, IRKA has difficulties with large-scale problems and cannot converge within a maximum of 300 s. The \(\mathscr {H}_{\infty }\) norm of the errors was computed for the frequencies \(\omega \in [10^{-5},\, 10^{5}]\).

Table 2 The \({\mathscr {H}}_{\infty }\) norms of the errors, the sizes of the reduced order systems and the execution times for the rational balanced-truncation and IRKA methods

6 Conclusion

In this work, we proposed a new method based on the rational block Arnoldi algorithm to compute low-rank approximate solutions to large Sylvester (or Lyapunov) and continuous-time algebraic Riccati equations with low-rank right-hand sides. These approximate solutions are given in factored form and are used to build reduced order models that approximate the original large-scale linear dynamical system. We showed how the obtained approximate Gramians can be used in the balanced truncation method. We gave some theoretical results and presented numerical experiments on benchmark examples.