1 Introduction

This paper focuses on the numerical solution of large-scale continuous-time algebraic Riccati matrix equations (CARE):

$$\begin{aligned} A^{T}X+XA+Q-XGX=0, \end{aligned}$$
(1)

where \(Q=C^{T}C\) is symmetric positive semi-definite, \(G=BB^{T}\) is symmetric positive semi-definite, \(A\in {\mathbb {R}}^{n\times n},\ B\in {\mathbb {R}}^{n\times m}, \ C\in {\mathbb {R}}^{p\times n}\) are known matrices, and \(X\in {\mathbb {R}}^{n\times n}\) is the unknown matrix. Here, rank(C)=p, rank(B)=m and \(p,\ m\ll n\). The numerical treatment of this type of equation plays a significant role in various fields, for instance linear quadratic regulator problems (Anderson and Moore 1990), balancing-based linear model reduction (Roberts 1980), parabolic partial differential equations and transport theory (Jonckheere and Silverman 1983; Saak 2009), Wiener-Hopf factorization of Markov chains (Williams 1982), and factorization of rational matrix functions (Clancey and Gohberg 1981). In this paper, we assume that the coefficient matrix A is sparse. Typically, the stabilizing solution of Eq. (1) is desired, i.e., the solution X that is symmetric positive semi-definite and for which \(A-GX\) is stable, meaning all its eigenvalues have negative real parts. Such a solution exists and, under certain assumptions, is unique (Lancaster and Rodman 1995). When n is large, it is common to seek low-rank approximations of the symmetric positive semi-definite solution in the form \(ZZ^{T}\approx X\), where Z has low column rank, i.e., rank\((Z)\ll n\). Since storing the full matrix X requires a significant amount of memory, storing only the factor Z allows us to optimize resource utilization when dealing with large-scale problems.

Firstly, we present an application of Eq. (1) by considering a linear time-invariant control system (Kleinman 1968):

$$\begin{aligned} \left\{ \begin{array}{ll} {\dot{x}}(t)=Ax(t)+Bu(t),\ \ \ x(0)=x_{0},\\ y(t)=Cx(t), \end{array} \right. \end{aligned}$$

where \(x(t)\in {\mathbb {R}}^{n}\) is the state vector, \(u(t)\in \mathbb R^{m}\) is the control vector, and \(y(t)\in {\mathbb {R}}^{p}\) is the output vector. Quadratic optimal control aims to minimize

$$\begin{aligned} J(x_{0},u)=\frac{1}{2}\int _{0}^{+\infty }(y(t)^{T}y(t)+u(t)^{T}u(t))dt. \end{aligned}$$

Assume that (A, B) is stabilizable, i.e., there exists a matrix S such that \(A-BS\) is stable, and that (C, A) is detectable, i.e., \((A^{T},C^{T})\) is stabilizable. Then there exists a unique optimal control \(\bar{u}\) that minimizes the functional \(J(x_{0},u)\) (Wonham 1968), which can be determined through the feedback operator P, i.e., \(\bar{u}(t)=Px(t)\), where \(P=B^{T}X\) and \(X\in {\mathbb {R}}^{n\times n}\) is the unique symmetric positive semi-definite stabilizing solution of the matrix Eq. (1).

Many methods have been proposed for the numerical solution of Eq. (1), including the Schur method (Laub 1979), the matrix sign function (Roberts 1980; Bai and Demmel 1998), the structured doubling algorithm (Guo et al. 2005), the symplectic Lanczos method (Benner and Fassbender 1997), and projection methods based on the global Arnoldi process in Krylov subspaces (Jbilou 2003; Heyouni and Jbilou 2009). However, these methods often require many iterations to obtain an accurate approximate solution, leading to significant increases in computational time and memory requirements. To address this issue, several authors have investigated approximate low-rank solutions of large sparse matrix equations. Typically, Newton's iteration is combined with the alternating direction implicit (ADI) algorithm to solve such equations, taking advantage of the local quadratic convergence of Newton's method. However, at each Newton step a large Lyapunov matrix equation must be solved to obtain the next iterate. The continuous-time Lyapunov matrix equation (Lu and Wachspress 1991) reads

$$\begin{aligned} F^{T}X+XF=Q, \end{aligned}$$
(2)

where \(Q=C^{T}C\), \(F\in {\mathbb {R}}^{n\times n},\ C\in \mathbb R^{p\times n}\). From Eq. (1) we observe that Eq. (2) is the special case of Eq. (1) with \(G=0\). If the spectrum of the matrix F lies in the open right half-plane, then F and \(-F^{T}\) have no common eigenvalues (Rutherford 1932), and Eq. (2) has a unique symmetric solution X.

Heinkenschloss et al. (2016) proposed the low-rank Kleinman-Newton ADI iteration method to solve Eq. (1). In the computational process, the Lyapunov matrix equation is solved using a low-rank Cholesky factorization. This method is based on solving linear systems with shifted matrices \(A+\alpha I\), where \(\alpha \) is the ADI parameter. However, determining optimal ADI parameters and computing approximations of the Lyapunov equation increase the memory requirements and computational time. Wong and Balakrishnan (2005) introduced an algorithm called the quadratic ADI (qADI) method to solve the algebraic Riccati equation (1). Their method is a direct extension of the Lyapunov ADI method, and they also provided a low-rank variant of this algorithm. However, this variant has a significant drawback: in every step, all low-rank factors need to be reconstructed, which greatly affects the performance of the algorithm. In addition to the qADI method, several approaches for solving large-scale Riccati equations have appeared in the recent literature. For instance, Amodei and Buchot (2010) obtain approximate solutions by computing low-dimensional invariant subspaces of the associated Hamiltonian matrix. Benner et al. (2018) propose a novel ADI-type iteration called Riccati ADI (RADI), which expands each factor by several columns or rows per step while keeping the entries from previous steps unchanged; their formulas generalize the low-rank Lyapunov ADI iteration.

In addition, many algorithms are available for solving the Lyapunov Eq. (2), for instance fixed-point iteration (Astudillo and Gijzen 2016), Krylov subspace methods (Druskin et al. 2011), and rank-2 updates (Ren et al. 2018). Zhou et al. (2015) proposed a generalized Hermitian and skew-Hermitian splitting (GHSS) iterative method and demonstrated its convergence under certain assumptions. ADI iterative methods (Benner et al. 2009) converge quickly if the optimal shifts for A and \(A^{T}\) can be estimated effectively. Therefore, for the stable Lyapunov Eq. (2), ADI iteration methods are often preferred when solving large-scale sparse problems, as they preserve sparsity and are in most cases amenable to parallelization. Recent theoretical results (Penzl 2000; Simoncini 2008; Li and White 2002) indicate that the Cholesky factorization-alternating direction implicit (CF-ADI) algorithm computes effective low-rank approximations of the Lyapunov solution. Following Jiang et al. (2021), the GADI iteration for solving large-scale sparse linear systems can be described as follows:

$$\begin{aligned} Ax=b,\ \ A\in {\mathbb {R}}^{n\times n},\ \ x,b\in {\mathbb {R}}^{n}. \end{aligned}$$
(3)
  • Firstly, the matrix A is split, assuming that A can be represented as \(A=M+N\); introducing the parameters \(\alpha \) and \(\omega \) then gives

    $$\begin{aligned} \alpha x+Mx&=\alpha x-Nx+b,\\ \alpha x+Nx&=Nx-(1-\omega )\alpha x+(1-\omega )\alpha x+\alpha x=Nx-(1-\omega )\alpha x+(2-\omega )\alpha x. \end{aligned}$$
  • Next, the algorithm is obtained by alternating between these two splittings. Given an initial guess \(x_{0}=0\), the GADI iteration computes a sequence \(\{x_{k}\}\) as follows

    $$\begin{aligned} \left\{ \begin{array}{ll} (\alpha I+M)x_{k+\frac{1}{2}}=(\alpha I-N)x_{k}+b,\\ (\alpha I+N)x_{k+1}=(N-(1-\omega )\alpha I)x_{k}+(2-\omega )\alpha x_{k+\frac{1}{2}}, \end{array} \right. \end{aligned}$$
    (4)

    with \(\alpha >0,\ \ 0\le \omega <2\).

  • Specifically, if the matrix A is split as \(A=H+S\), where \(H=\frac{A+A^{*}}{2}\) is a Hermitian matrix and \(S=\frac{A-A^{*}}{2}\) is a skew-Hermitian matrix, the GADI-HS scheme is obtained; a small numerical sketch is given after (5).

    $$\begin{aligned} \left\{ \begin{array}{ll} (\alpha I+H)x_{k+\frac{1}{2}}=(\alpha I-S)x_{k}+b,\\ (\alpha I+S)x_{k+1}=(S-(1-\omega )\alpha I)x_{k}+(2-\omega )\alpha x_{k+\frac{1}{2}}. \end{array} \right. \end{aligned}$$
    (5)
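To make the scheme concrete, the following minimal NumPy sketch applies the GADI-HS iteration (5) to a small test problem. The test matrix, the shift \(\alpha \), and the relaxation parameter \(\omega \) are illustrative choices of ours, and a practical implementation would factor \(\alpha I+H\) and \(\alpha I+S\) once and reuse the factorizations instead of calling a dense solver in every step.

```python
import numpy as np

def gadi_hs(A, b, alpha, omega, tol=1e-12, max_iter=500):
    """Minimal sketch of the GADI-HS iteration (5) for A x = b."""
    n = A.shape[0]
    H = (A + A.conj().T) / 2          # Hermitian part of A
    S = (A - A.conj().T) / 2          # skew-Hermitian part of A
    I = np.eye(n)
    x = np.zeros(n)
    for k in range(max_iter):
        # (alpha I + H) x_{k+1/2} = (alpha I - S) x_k + b
        x_half = np.linalg.solve(alpha * I + H, (alpha * I - S) @ x + b)
        # (alpha I + S) x_{k+1} = (S - (1-omega) alpha I) x_k + (2-omega) alpha x_{k+1/2}
        x = np.linalg.solve(alpha * I + S,
                            (S - (1 - omega) * alpha * I) @ x
                            + (2 - omega) * alpha * x_half)
        if np.linalg.norm(A @ x - b) <= tol * np.linalg.norm(b):
            break
    return x, k + 1

# illustrative test problem with positive definite Hermitian part
n = 100
A = 4.0 * np.eye(n) + np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)
b = np.ones(n)
x, steps = gadi_hs(A, b, alpha=4.0, omega=0.5)
```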

In this paper, we propose a low-rank generalized alternating direction implicit iteration (R-GADI) algorithm, which improves on the GADI algorithm for solving the Lyapunov equation. We represent the solution as a low-rank approximation \(X\approx VW^{T}\), where rank(V) and rank\((W)\ll n\). During the computation, the R-GADI method provides a low-rank approximation of the solution X, eliminating the need to store X at each iteration and reducing storage requirements. Additionally, we combine the Kleinman-Newton method with R-GADI (referred to as Kleinman-Newton-RGADI) to solve the Riccati Eq. (1). This method is a variant of the Newton-GADI algorithm (Li et al. 2022); it significantly reduces the total number of ADI iterations and thus lowers the overall computational cost. Finally, numerical examples demonstrate the effectiveness of the proposed algorithm.

The remainder of this paper is organized as follows: in Sect. 2, we introduce the R-GADI iteration format for solving the Lyapunov equation and demonstrate the consistency between R-GADI and GADI iterations in terms of convergence. The selection of parameters, the algorithmic complexity, and a comparison with other methods are discussed, along with relevant numerical examples. In Sect. 3, we first transform the Riccati equation into a Lyapunov equation using the Kleinman-Newton method and then present the corresponding R-GADI iteration format. Convergence, algorithmic complexity, and additional numerical examples are also discussed to validate the effectiveness of the proposed algorithm. Finally, Sect. 4 concludes the paper with a summary and some remarks.

In this article, we use the following notation: \({\mathbb {R}}^{n\times m}\) denotes the set of all \(n\times m\) real matrices. If \(A\in {\mathbb {R}}^{n\times n}\), then \(A^{T}\) and \(A^{-1}\) represent the transpose and inverse of A, respectively. The sets of eigenvalues and singular values of A are denoted by \(\Lambda (A)=\{\lambda _{i}(A),\ i=1,2,\cdots ,n\}\) and \(\Sigma (A)=\{\sigma _{i}(A),\ i=1,2,\cdots ,n\}\), where \(\lambda _{i}(A)\) and \(\sigma _{i}(A)\) are the i-th eigenvalue and the i-th singular value of A, respectively. \(\rho (A)=\max _{1\le i\le n}\{|\lambda _{i}(A)|\}\) denotes the spectral radius of A. \(A>0\ (A\ge 0)\) indicates that A is positive definite (positive semi-definite), and \(\Vert A\Vert _{2}\) denotes the 2-norm of A. For an eigenvalue \(\lambda \) of A, Re\((\lambda )\) and Im\((\lambda )\) represent its real and imaginary parts. \(A\otimes I\) denotes the Kronecker product of A and I.

Definition 1

Let \(A_{1}=[a_{ij}]\in {\mathbb {C}}^{m\times n},\ \ B_{1}\in \mathbb C^{p\times q}\), then

$$\begin{aligned} A_{1}\otimes B_{1}=\left( \begin{array}{cccc} a_{11}B_{1} &{} a_{12}B_{1} &{} \cdots &{} a_{1n}B_{1} \\ a_{21}B_{1} &{} a_{22}B_{1} &{} \cdots &{} a_{2n}B_{1} \\ \vdots &{} \vdots &{} &{} \vdots \\ a_{m1}B_{1} &{} a_{m2}B_{1} &{} \cdots &{} a_{mn}B_{1} \\ \end{array} \right) \in {\mathbb {C}}^{mp\times nq}, \end{aligned}$$

which is called the Kronecker product of \(A_{1}\) and \(B_{1}\).

Definition 2

The operator vec\(:{\mathbb {C}}^{m\times n}\rightarrow {\mathbb {C}}^{mn}\) defined by

$$\begin{aligned} vec(X_{1})=(x_{1}^{T},x_{2}^{T},\cdots ,x_{n}^{T})^{T},\ \ X_{1}=[x_{1},x_{2},\cdots ,x_{n}]\in {\mathbb {C}}^{m\times n}, \end{aligned}$$

is called the vectorization (straightening) operator.

2 Low rank GADI for solving Lyapunov equation

2.1 Derivation of iterative format

Firstly, we consider the single-parameter ADI iterative method for solving the Lyapunov Eq. (2):

$$\begin{aligned} \left\{ \begin{array}{ll} (F^{T}+\alpha I)X_{k+\frac{1}{2}}=Q-X_{k}(F-\alpha I),\\ X_{k+1}(F+\alpha I)=Q-(F^{T}-\alpha I)X_{k+\frac{1}{2}}. \end{array} \right. \end{aligned}$$
(6)

From (6), we obtain the following iteration formula for \(X_{k+1}\):

$$\begin{aligned} X_{k+1}=(F^{T}-\alpha I)(F^{T}+\alpha I)^{-1}X_{k}(F-\alpha I)(F+\alpha I)^{-1}+2\alpha (F^{T}+\alpha I)^{-1}Q(F+\alpha I)^{-1}. \end{aligned}$$

Since \(F-\alpha I\) commutes with \((F+\alpha I)^{-1}\), we can derive a single-parameter low-rank ADI (R1-ADI) iterative formula.

$$\begin{aligned} \left\{ \begin{array}{lllll} V_{1}=\sqrt{2\alpha }(F^{T}+\alpha I)^{-1}C^{T},\ \ V_{1}\in {\mathbb {R}}^{n\times p},\\ V_{k}=[(F^{T}-\alpha I)(F^{T}+\alpha I)^{-1}V_{k-1},V_{1}],\ \ V_{k}\in {\mathbb {R}}^{n\times kp},\\ X_{k}=V_{k}V_{k}^{T},\ \ X_{k}\in R^{n\times n}. \end{array} \right. \end{aligned}$$
(7)
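As an illustration, the R1-ADI formula (7) can be transcribed directly into NumPy as follows; the function name is ours, and for a large sparse F the shifted systems would be solved with a sparse factorization rather than the dense solver used here.

```python
import numpy as np

def r1_adi(F, C, alpha, steps):
    """Single-parameter low-rank ADI (7): returns V_k with X_k ~ V_k V_k^T."""
    n = F.shape[0]
    shift_plus = F.T + alpha * np.eye(n)      # F^T + alpha I
    shift_minus = F.T - alpha * np.eye(n)     # F^T - alpha I
    V1 = np.sqrt(2 * alpha) * np.linalg.solve(shift_plus, C.T)   # n x p
    V = V1
    for _ in range(1, steps):
        # V_k = [(F^T - alpha I)(F^T + alpha I)^{-1} V_{k-1}, V_1]
        V = np.hstack([shift_minus @ np.linalg.solve(shift_plus, V), V1])
    return V
```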

Next, we consider the ADI iterative method with two parameters for solving the Lyapunov Eq. (2).

$$\begin{aligned} \left\{ \begin{array}{ll} (F^{T}+\alpha I)X_{k+\frac{1}{2}}=Q-X_{k}(F-\alpha I),\\ X_{k+1}(F+\beta I)=Q-(F^{T}-\beta I)X_{k+\frac{1}{2}}. \end{array} \right. \end{aligned}$$
(8)

Similarly, we can obtain a low-rank ADI (R2-ADI) iterative formula with two parameters.

$$\begin{aligned} \left\{ \begin{array}{lllll} V_{1}=\sqrt{\alpha +\beta }(F^{T}+\alpha I)^{-1}C^{T},\ \ V_{1}\in {\mathbb {R}}^{n\times p},\\ V_{k}=[(F^{T}-\beta I)(F^{T}+\alpha I)^{-1}V_{k-1},V_{1}],\ \ V_{k}\in {\mathbb {R}}^{n\times kp},\\ W_{1}=\sqrt{\alpha +\beta }(F^{T}+\beta I)^{-1}C^{T},\\ W_{k}=[(F^{T}+\beta I)^{-1}(F^{T}-\alpha I)W_{k-1},W_{1}],\\ X_{k}=V_{k}W_{k}^{T},\ \ X_{k}\in {\mathbb {R}}^{n\times n}. \end{array} \right. \end{aligned}$$
(9)

We now apply the GADI iterative framework to solve the Lyapunov Eq. (2). Firstly, applying the vectorization operator and the properties of the Kronecker product, we have

$$\begin{aligned} (F^{T}\otimes I+I\otimes F^{T})x=q,\ \ x=\text {vec}(X),\ \ q=\text {vec}(Q). \end{aligned}$$
(10)

Secondly, applying the GADI iterative method (4) to Eq. (10), we obtain the following expression.

$$\begin{aligned} \left\{ \begin{array}{ll} (\alpha I_{n^{2}}+I\otimes F^{T})x_{k+\frac{1}{2}}=(\alpha I_{n^{2}}-F^{T}\otimes I)x_{k}+q,\\ (\alpha I_{n^{2}}+F^{T}\otimes I)x_{k+1}=(F^{T}\otimes I-(1-\omega )\alpha I_{n^{2}})x_{k}+(2-\omega )\alpha x_{k+\frac{1}{2}}. \end{array} \right. \end{aligned}$$
(11)

We rewrite Eq. (11) into matrix form as

$$\begin{aligned} \left\{ \begin{array}{ll} (\alpha I+F^{T})X_{k+\frac{1}{2}}=X_{k}(\alpha I-F)+Q,\\ X_{k+1}(\alpha I+F)=X_{k}(F-(1-\omega )\alpha I)+(2-\omega )\alpha X_{k+\frac{1}{2}}. \end{array} \right. \end{aligned}$$
(12)

We select the appropriate parameter \(\alpha \) to ensure that both matrices \(\alpha I+F^{T}\) and \(\alpha I+F\) are invertible. From the first equation of (12), we can obtain

$$\begin{aligned} X_{k+\frac{1}{2}}=(\alpha I+F^{T})^{-1}X_{k}(\alpha I-F)+(\alpha I+F^{T})^{-1}C^{T}C, \end{aligned}$$

Substituting this into the second equation gives

$$\begin{aligned} \begin{aligned} X_{k+1}&=X_{k}(F-(1-\omega )\alpha I)(\alpha I+F)^{-1}\\&\quad +(2-\omega )\alpha X_{k+\frac{1}{2}}(\alpha I+F)^{-1}\\&=X_{k}(F-(1-\omega )\alpha I)(\alpha I+F)^{-1}\\&\quad + (2-\omega )\alpha [(\alpha I+F^{T})^{-1}X_{k}(\alpha I-F)\\&\quad +(\alpha I+F^{T})^{-1}C^{T}C](\alpha I+F)^{-1}\\&=X_{k}(F-(1-\omega )\alpha I)(\alpha I+F)^{-1}\\&\quad + (2-\omega )\alpha (\alpha I+F^{T})^{-1}X_{k}(\alpha I-F)(\alpha I+F)^{-1}\\&\quad + (2-\omega )\alpha (\alpha I+F^{T})^{-1}C^{T}C(\alpha I+F)^{-1}. \end{aligned} \end{aligned}$$

Taking the initial value \(X_{0}=0\), we have \(X_{1}=(2-\omega )\alpha (\alpha I+F^{T})^{-1}C^{T}C(\alpha I+F)^{-1}\).

Let

$$\begin{aligned} V_{1}=W_{1}=\sqrt{(2-\omega )\alpha }(\alpha I+F^{T})^{-1}C^{T}, \end{aligned}$$

so that \(X_{1}=V_{1}W_{1}^{T}\). We can further compute

$$\begin{aligned} X_{2}&=X_{1}(F-(1-\omega )\alpha I)(\alpha I+F)^{-1}\\&\quad +(2-\omega )\alpha (\alpha I+F^{T})^{-1}X_{1}(\alpha I-F)(\alpha I+F)^{-1}\\&\quad +(2-\omega )\alpha (\alpha I+F^{T})^{-1}C^{T}C(\alpha I+F)^{-1}\\&=V_{1}W_{1}^{T}(F-(1-\omega )\alpha I)(\alpha I+F)^{-1}\\&\quad +(2-\omega )\alpha (\alpha I+F^{T})^{-1}V_{1}W_{1}^{T}(\alpha I-F)(\alpha I+F)^{-1}+V_{1}W_{1}^{T}. \end{aligned}$$

Let

$$\begin{aligned} V_{2}&=[V_{1},\sqrt{(2-\omega )\alpha }(\alpha I+F^{T})^{-1}V_{1},V_{1}],\\ W_{2}&=[(\alpha I+F^{T})^{-1}(F^{T}-(1-\omega )\alpha I)W_{1},\sqrt{(2-\omega )\alpha }(\alpha I+F^{T})^{-1}(\alpha I-F^{T})W_{1},W_{1}], \end{aligned}$$

where \(X_{2}=V_{2}W_{2}^{T}\).

Based on the previous derivation, we obtain the R-GADI iterative format for solving the Lyapunov Eq. (2).

$$\begin{aligned} \left\{ \begin{array}{lllll} V_{1}=\sqrt{(2-\omega )\alpha }(\alpha I+F^{T})^{-1}C^{T},\\ V_{k}=[V_{k-1},\sqrt{(2-\omega )\alpha }(\alpha I+F^{T})^{-1}V_{k-1},V_{1}],\ \ V_{k}\in R^{n\times (2^{k}-1)p},\\ W_{1}=\sqrt{(2-\omega )\alpha }(\alpha I+F^{T})^{-1}C^{T},\\ W_{k}=[(\alpha I+F^{T})^{-1}(F^{T}-(1-\omega )\alpha I)W_{k-1},\sqrt{(2-\omega )\alpha }(\alpha I+F^{T})^{-1}(\alpha I-F^{T})W_{k-1},W_{1}],\\ X_{k}=V_{k}W_{k}^{T}. \end{array} \right. \nonumber \\ \end{aligned}$$
(13)

Next, the R-GADI algorithm is given as Algorithm 1:

Algorithm 1
figure a

R-GADI iteration for solving Lyapunov equation 2
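To make the factor updates in (13) concrete, the following NumPy sketch assembles \(V_{k}\) and \(W_{k}\) and stops on the relative residual; all names and the stopping test are ours, the dense solves stand in for a single reusable factorization of \(\alpha I+F^{T}\), and the \(n\times n\) product \(V_{k}W_{k}^{T}\) is formed here only to monitor the residual.

```python
import numpy as np

def rgadi_lyap(F, C, alpha, omega, tol=1e-12, max_iter=15):
    """Low-rank GADI sketch for F^T X + X F = Q with Q = C^T C, following (13)."""
    n = F.shape[0]
    I = np.eye(n)
    Ap = alpha * I + F.T                      # alpha I + F^T
    c = np.sqrt((2 - omega) * alpha)
    Q = C.T @ C
    V1 = c * np.linalg.solve(Ap, C.T)         # V_1 = W_1
    V, W = V1, V1.copy()
    for _ in range(1, max_iter):
        X = V @ W.T                           # formed only for the residual check
        if np.linalg.norm(F.T @ X + X @ F - Q, 2) <= tol * np.linalg.norm(Q, 2):
            break
        Vs = np.linalg.solve(Ap, V)                                   # (alpha I + F^T)^{-1} V_{k-1}
        Wa = np.linalg.solve(Ap, (F.T - (1 - omega) * alpha * I) @ W)
        Wb = np.linalg.solve(Ap, (alpha * I - F.T) @ W)
        V = np.hstack([V, c * Vs, V1])        # V_k as in (13)
        W = np.hstack([Wa, c * Wb, V1])       # W_k as in (13), with W_1 = V_1
    return V, W
```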

2.2 Convergence analysis

Some simple properties of the Kronecker product follow easily from Definitions 1 and 2.

Lemma 1

(Xu (2011)) Let \(\alpha \in {\mathbb {C}}\), \(A_{1}\in {\mathbb {C}}^{m\times n},\ \ B_{1}\in {\mathbb {C}}^{p\times q},\ \ X_{1}\in {\mathbb {C}}^{n\times p},\ \ C_{1}\in {\mathbb {C}}^{m\times n},\ \ D_{1}\in {\mathbb {C}}^{p\times q}\), then

  1. (a)

    \(\alpha (A_{1}\otimes B_{1})=(\alpha A_{1})\otimes B_{1}=A_{1}\otimes (\alpha B_{1});\)

  2. (b)

    \((A_{1}\otimes B_{1})(C_{1}\otimes D_{1})=(A_{1}C_{1})\otimes (B_{1}D_{1});\)

  3. (c)

    \((A_{1}\otimes B_{1})^{T}=A_{1}^{T}\otimes B_{1}^{T};\)

  4. (d)

    \(vec(A_{1}X_{1}B_{1})=(B_{1}^{T}\otimes A_{1})vec(X_{1});\)

  5. (e)

    \(\Lambda (I_{p}\otimes A_{1}-B_{1}^{T}\otimes I_{n})=\{\lambda -\mu :\lambda \in \Lambda (A_{1}),\ \ \mu \in \Lambda (B_{1})\}.\)

Lemma 2

Let \({\mathscr {R}}(X)=F^{T}X+XF-Q\), let \(\{X_{k}\}\) be the sequence of approximate solutions of the Lyapunov Eq. (2) generated by the GADI iteration (12), and let X be a symmetric positive definite solution of Eq. (2). Then for any \(k\ge 0\), we have

  1. (1)

    \((\alpha I+F^{T})(X_{k+\frac{1}{2}}-X)=(X_{k}-X)(\alpha I-F);\)

  2. (2)

    \((\alpha I+F^{T})(X_{k+\frac{1}{2}}-X_{k})=-{\mathscr {R}}(X_{k});\)

  3. (3)

    \({\mathscr {R}}(X_{k+\frac{1}{2}})=(X_{k+\frac{1}{2}}-X_{k})(F-\alpha I);\)

  4. (4)

    \((X_{k+1}-X)(\alpha I+F)=(X_{k}-X)F+(1-\omega )\alpha (X_{k+\frac{1}{2}}-X_{k})+\alpha (X_{k+\frac{1}{2}}-X);\)

  5. (5)

    \((X_{k+1}-X_{k+\frac{1}{2}})(\alpha I+F)=(X_{k+\frac{1}{2}}-X_{k})((1-\omega )\alpha I-F).\)

Proof

  1. (1)

    From the first equation of (12), we have

    $$\begin{aligned} (\alpha I+F^{T})(X_{k+\frac{1}{2}}-X)=X_{k}(\alpha I-F)+Q-(\alpha I+F^{T})X. \end{aligned}$$

    Since \(Q-F^{T}X=XF\), then

    $$\begin{aligned} (\alpha I+F^{T})(X_{k+\frac{1}{2}}-X)&=XF-X_{k}F+X_{k}\alpha -\alpha X\\&=(X_{k}-X)(\alpha I-F). \end{aligned}$$
  2. (2)
    $$\begin{aligned} (\alpha I+F^{T})(X_{k+\frac{1}{2}}-X_{k})&=X_{k}(\alpha I-F)+Q-(\alpha I+F^{T})X_{k}\\&=Q-X_{k}F-F^{T}X_{k}\\&=-{\mathscr {R}}(X_{k}). \end{aligned}$$
  3. (3)

    Since \(F^{T}X_{k+\frac{1}{2}}-Q=X_{k}(\alpha I-F)-\alpha X_{k+\frac{1}{2}},\) we have

    $$\begin{aligned} {\mathscr {R}}(X_{k+\frac{1}{2}})&=F^{T}X_{k+\frac{1}{2}}+X_{k+\frac{1}{2}}F-Q\\&=X_{k}(\alpha I-F)-\alpha X_{k+\frac{1}{2}}+X_{k+\frac{1}{2}}F\\&=(X_{k+\frac{1}{2}}-X_{k})(F-\alpha I). \end{aligned}$$
  4. (4)

    From the second equation of (12), we have

    $$\begin{aligned} (X_{k+1}-X)(\alpha I+F)&=X_{k}(F-(1-\omega )\alpha I)+(2-\omega )\alpha X_{k+\frac{1}{2}}-X(\alpha I+F)\\&=X_{k}F-X_{k}(1-\omega )\alpha I+(1-\omega )\alpha X_{k+\frac{1}{2}}+\alpha X_{k+\frac{1}{2}}-\alpha X-XF\\&=(X_{k}-X)F+(1-\omega )\alpha (X_{k+\frac{1}{2}}-X_{k})+\alpha (X_{k+\frac{1}{2}}-X). \end{aligned}$$
  5. (5)
    $$\begin{aligned} (X_{k+1}-X_{k+\frac{1}{2}})(\alpha I+F)&=X_{k}(F-(1-\omega )\alpha I)+(2-\omega )\alpha X_{k+\frac{1}{2}}-X_{k+\frac{1}{2}}(\alpha I+F)\\&=X_{k}F-X_{k+\frac{1}{2}}F+(1-\omega )\alpha (X_{k+\frac{1}{2}}-X_{k})\\&=(X_{k+\frac{1}{2}}-X_{k})((1-\omega )\alpha I-F). \end{aligned}$$

\(\square \)

Using Lemma 2, we can prove that Theorem 3 holds.

Theorem 3

Let X be the symmetric positive definite solution of the Lyapunov Eq. (2), let the initial matrix be \(X_{0}=0\), and let the parameters satisfy \(\alpha >0,\ 0\le \omega <2\). Then the matrix sequence \(\{X_{k}\}\) generated by the GADI iteration (12) satisfies the following inequality:

$$\begin{aligned} \Vert X_{k+1}-X\Vert _{2}\le \delta (\alpha )\Vert X_{k}-X\Vert _{2}+\eta (\alpha ,\omega )\Vert {\mathscr {R}}(X_{k})\Vert _{2}, \end{aligned}$$

where

$$\begin{aligned} \delta (\alpha )=\Vert F(\alpha I+F)^{-1}\Vert _{2}+\alpha \Vert (\alpha I+F^{T})^{-1}\Vert _{2}\Vert (\alpha I-F)(\alpha I+F)^{-1}\Vert _{2},\\ \eta (\alpha ,\omega )=|1-\omega |\alpha \Vert (\alpha I+F^{T})^{-1}\Vert _{2}\Vert (\alpha I+F)^{-1}\Vert _{2}. \end{aligned}$$

Proof

From Lemma 2, we can conclude that

$$\begin{aligned} X_{k+1}-X&=[(X_{k}-X)F+(1-\omega )\alpha (X_{k+\frac{1}{2}}-X_{k})+\alpha (X_{k+\frac{1}{2}}-X)](\alpha I+F)^{-1}\\&=(X_{k}-X)F(\alpha I+F)^{-1}-(1-\omega )\alpha (\alpha I+F^{T})^{-1}{\mathscr {R}}(X_{k})(\alpha I+F)^{-1}\\&\quad +\alpha (\alpha I+F^{T})^{-1}(X_{k}-X)(\alpha I-F)(\alpha I+F)^{-1}. \end{aligned}$$

Therefore, we can get

$$\begin{aligned} \Vert X_{k+1}-X\Vert _{2}&\le \Vert (X_{k}-X)F(\alpha I+F)^{-1}\Vert _{2}+|1-\omega |\alpha \Vert (\alpha I+F^{T})^{-1}{\mathscr {R}}(X_{k})(\alpha I+F)^{-1}\Vert _{2}\\&\quad +\alpha \Vert (\alpha I+F^{T})^{-1}(X_{k}-X)(\alpha I-F)(\alpha I+F)^{-1}\Vert _{2}\\&\le \Vert (X_{k}-X)\Vert _{2}\Vert F(\alpha I+F)^{-1}\Vert _{2}\\&\quad +|1-\omega |\alpha \Vert (\alpha I+F^{T})^{-1}\Vert _{2}\Vert {\mathscr {R}}(X_{k})\Vert _{2}\Vert (\alpha I+F)^{-1}\Vert _{2}\\&\quad +\alpha \Vert (\alpha I+F^{T})^{-1}\Vert _{2}\Vert X_{k}-X\Vert _{2}\Vert (\alpha I-F)(\alpha I+F)^{-1}\Vert _{2}\\&=[\Vert F(\alpha I+F)^{-1}\Vert _{2}+\alpha \Vert (\alpha I+F^{T})^{-1}\Vert _{2}\Vert (\alpha I-F)(\alpha I+F)^{-1}\Vert _{2}]\Vert X_{k}-X\Vert _{2}\\&\quad +|1-\omega |\alpha \Vert (\alpha I+F^{T})^{-1}\Vert _{2}\Vert (\alpha I+F)^{-1}\Vert _{2}\Vert {\mathscr {R}}(X_{k})\Vert _{2}. \end{aligned}$$

\(\square \)

Theorem 4

Let X be the symmetric positive definite solution of the Lyapunov Eq. (2), and let the parameters satisfy \(\alpha >0\) and \(0\le \omega <2\). Then for any \(k=0,1,2,\cdots \), the matrix \(V_{k}W_{k}^{T}\) defined by (13) converges to X; this is equivalent to the convergence of the iterative sequence \(\{X_{k}\}\) obtained from (12) to X.

Proof

Here, we only need to prove the convergence of the iterative format (12). Since \(\alpha >0\) and the eigenvalues of F have positive real parts, the matrices \(\alpha I_{n^{2}}+I\otimes F^{T}\) and \(\alpha I_{n^{2}}+F^{T}\otimes I\) are non-singular. From the iteration framework (11), we can obtain

$$\begin{aligned} x_{k+\frac{1}{2}}=(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}(\alpha I_{n^{2}}-F^{T}\otimes I)x_{k}+(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}q, \end{aligned}$$
$$\begin{aligned} x_{k+1}&=(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}[(F^{T}\otimes I-(1-\omega )\alpha I_{n^{2}})\\&\quad +(2-\omega )\alpha (\alpha I_{n^{2}}+I\otimes F^{T})^{-1}(\alpha I_{n^{2}}-F^{T}\otimes I)]x_{k}\\&\quad +(2-\omega )\alpha (\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}q\\&=(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}[\alpha ^{2}I_{n^{2}}\\&\quad +(I\otimes F^{T})(F^{T}\otimes I)-(1-\omega )\alpha (I\otimes F^{T}+F^{T}\otimes I)]x_{k}\\&\quad +(2-\omega )\alpha (\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}q. \end{aligned}$$

Let

$$\begin{aligned} M&=I\otimes F^{T}+F^{T}\otimes I, \\ G(\alpha ,\omega )&=(2-\omega )\alpha (\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}(\alpha I_{n^{2}}+I\otimes F^{T})^{-1},\\ T(\alpha ,\omega )&=(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}[\alpha ^{2}I_{n^{2}}\\&\quad +(I\otimes F^{T})(F^{T}\otimes I)-(1-\omega )\alpha M], \end{aligned}$$

then

$$\begin{aligned} x_{k+1}=T(\alpha ,\omega )x_{k}+G(\alpha ,\omega )q. \end{aligned}$$

Next, we need to prove that for \(\alpha >0,\ 0\le \omega <2\), we have \(\rho (T(\alpha ,\omega ))<1\). Since

$$\begin{aligned} 2\alpha M=-(\alpha I_{n^{2}}-I\otimes F^{T})(\alpha I_{n^{2}}-F^{T}\otimes I)+(\alpha I_{n^{2}}+I\otimes F^{T})(\alpha I_{n^{2}}+F^{T}\otimes I), \end{aligned}$$

then we can obtain

$$\begin{aligned} T(\alpha ,\omega )&=(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}[\alpha ^{2}I_{n^{2}}+(I\otimes F^{T})(F^{T}\otimes I)-(1-\omega )\alpha M]\\&=(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}[(\alpha I_{n^{2}}-I\otimes F^{T})(\alpha I_{n^{2}}-F^{T}\otimes I)+\omega \alpha M]\\&=\frac{1}{2}(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}[2(\alpha I_{n^{2}}-I\otimes F^{T})(\alpha I_{n^{2}}-F^{T}\otimes I)+2\omega \alpha M]\\&=\frac{1}{2}[(2-\omega )(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}(\alpha I_{n^{2}}-I\otimes F^{T})(\alpha I_{n^{2}}-F^{T}\otimes I)\\&\quad +\omega I_{n^{2}}]\\&=\frac{1}{2}[(2-\omega )T(\alpha )+\omega I_{n^{2}}], \end{aligned}$$

where

$$\begin{aligned} T(\alpha )=(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}(\alpha I_{n^{2}}-I\otimes F^{T})(\alpha I_{n^{2}}-F^{T}\otimes I). \end{aligned}$$

Since \(\lambda (T(\alpha ,\omega ))=\frac{1}{2}[(2-\omega )\lambda (T(\alpha ))+\omega ]\), we have

$$\begin{aligned} \rho (T(\alpha ,\omega ))\le \frac{1}{2}[(2-\omega )\rho (T(\alpha ))+\omega ]. \end{aligned}$$

Let

$$\begin{aligned} {\widetilde{T}}(\alpha )=(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}(\alpha I_{n^{2}}-I\otimes F^{T})(\alpha I_{n^{2}}-F^{T}\otimes I)(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}, \end{aligned}$$

it can be seen that \(T(\alpha )\) is similar to \({\widetilde{T}}(\alpha )\) through the matrix \(\alpha I_{n^{2}}+F^{T}\otimes I\). Therefore, we get

$$\begin{aligned} \rho (T(\alpha ))&\le \,\,\parallel (\alpha I_{n^{2}}+I\otimes F^{T})^{-1}(\alpha I_{n^{2}}-I\otimes F^{T})(\alpha I_{n^{2}}-F^{T}\otimes I)(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}\parallel _{2}\\&\le \,\,\parallel (\alpha I_{n^{2}}+I\otimes F^{T})^{-1}(\alpha I_{n^{2}}-I\otimes F^{T})\parallel _{2}\parallel (\alpha I_{n^{2}}-F^{T}\otimes I)(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}\parallel _{2}\\&=\,\,\parallel F_{L}\parallel _{2}\parallel F_{R}\parallel _{2}, \end{aligned}$$

where

$$\begin{aligned} F_{L}=(\alpha I_{n^{2}}+I\otimes F^{T})^{-1}(\alpha I_{n^{2}}-I\otimes F^{T}),\ \ F_{R}=(\alpha I_{n^{2}}-F^{T}\otimes I)(\alpha I_{n^{2}}+F^{T}\otimes I)^{-1}. \end{aligned}$$

For \(x\in {\mathbb {R}}^{n^{2}\times 1}\), we get

$$\begin{aligned} \parallel F_{L}\parallel _{2}^{2}&=\max _{\parallel x\parallel _{2}=1}\frac{\parallel (\alpha I_{n^{2}}-I\otimes F^{T})x\parallel _{2}^{2}}{\parallel (\alpha I_{n^{2}}+I\otimes F^{T})x\parallel _{2}^{2}}\\&=\max _{\parallel x\parallel _{2}=1}\frac{\parallel (I\otimes F^{T})x\parallel _{2}^{2}-\alpha x^{T}(I\otimes F+I\otimes F^{T})x+\alpha ^{2}}{\parallel (I\otimes F^{T})x\parallel _{2}^{2}+\alpha x^{T}(I\otimes F+I\otimes F^{T})x+\alpha ^{2}}\\&\le \max _{\parallel x\parallel _{2}=1}\frac{\parallel (I\otimes F^{T})x\parallel _{2}^{2}-2\alpha \min Re(\lambda (I\otimes F))+\alpha ^{2}}{\parallel (I\otimes F^{T})x\parallel _{2}^{2}+2\alpha \min Re(\lambda (I\otimes F))+\alpha ^{2}}\\&\le \frac{\parallel I\otimes F^{T}\parallel _{2}^{2}-2\alpha \min Re(\lambda (I\otimes F))+\alpha ^{2}}{\parallel I\otimes F^{T}\parallel _{2}^{2}+2\alpha \min Re(\lambda (I\otimes F))+\alpha ^{2}}\\&=\frac{\parallel I\otimes F^{T}\parallel _{2}^{2}-2\alpha \min Re(\lambda (F))+\alpha ^{2}}{\parallel I\otimes F^{T}\parallel _{2}^{2}+2\alpha \min Re(\lambda (F))+\alpha ^{2}}. \end{aligned}$$

Similarly, we obtain

$$\begin{aligned} \parallel F_{R}\parallel _{2}^{2}\le \frac{\parallel F^{T}\otimes I\parallel _{2}^{2}-2\alpha \min Re(\lambda (F))+\alpha ^{2}}{\parallel F^{T}\otimes I\parallel _{2}^{2}+2\alpha \min Re(\lambda (F))+\alpha ^{2}}, \end{aligned}$$

Since the eigenvalues of F have positive real parts and \(\alpha >0\), it follows that

$$\begin{aligned} \parallel F_{L}\parallel _{2}<1,\ \ \parallel F_{R}\parallel _{2}<1. \end{aligned}$$

Therefore, \(\rho (T(\alpha ))<1\) and \(\rho (T(\alpha ,\omega ))\le \frac{1}{2}[(2-\omega )\rho (T(\alpha ))+\omega ]<1\). \(\square \)

2.3 Selection of parameters

In this section, we first compare the convergence rates of the GADI method and the ADI method; the quantities \(\rho (T(\alpha ,\ \omega ))\) and \(\rho (T(\alpha ))\) used here are defined in the proof of Theorem 4. Afterwards, we provide a more practical method for selecting the parameters.

Theorem 5

Assuming that the eigenvalues of matrix F have positive real parts, define

$$\begin{aligned} \rho (T(\alpha ,\omega ))=|\xi |=|a+bi|,\ \ \rho (T(\alpha ))=|\eta |=|c+di|, \end{aligned}$$
  1. (i)

    if \(|\eta |^{2}\le c\), then

    $$\begin{aligned} \rho (T(\alpha ))<\rho (T(\alpha ,\omega ))<1; \end{aligned}$$
  2. (ii)

    if \(|\eta |^{2}>c\) and \(0<\omega<\frac{4(|\eta |^{2}-c)}{(1-c)^{2}+d^{2}}<2\), we have

    $$\begin{aligned} \rho (T(\alpha ,\omega ))<\rho (T(\alpha ))<1. \end{aligned}$$

Theorem 6

Let \(\sigma _{j}(F)\) and \(\lambda _{j}(F)\) denote the singular values and eigenvalues of the coefficient matrix F of the Lyapunov Eq. (2), respectively, with \(j=1,2,\cdots ,n\). Let

$$\begin{aligned} \mu =\max _{j}\sigma _{j}(F),\ \ \nu =\min _{j}Re(\lambda _{j}(F)), \end{aligned}$$

then we can obtain the optimal parameters \(\alpha ^{*}\) as follows

$$\begin{aligned} \alpha ^{*}=arg \min \limits _{\alpha }\frac{\mu ^{2}-2\alpha \nu +\alpha ^{2}}{\mu ^{2}+2\alpha \nu +\alpha ^{2}}=\mu . \end{aligned}$$

The proofs of Theorems 5 and 6 are similar to those of Theorems 2.7 and 2.9 in Li et al. (2022), and we do not repeat the details here.

For the parameter \(\omega \), since \(0\le \omega <2\), the special values 0 and 1 can be tried first. In general, a suitable \(\omega \) is selected by analyzing the residuals in the numerical examples; a simple automated sweep is sketched below.
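In practice this residual-based tuning can be automated by a short probe run per candidate value: fix \(\alpha \), run a few R-GADI steps for each \(\omega \), and keep the value with the smallest relative residual. The helper below is our own illustration and assumes the rgadi_lyap sketch from Sect. 2.1 (or any solver with the same signature) is passed in.

```python
import numpy as np

def pick_omega(F, C, alpha, solver, candidates=(0.0, 0.015, 0.1, 0.5, 1.0), probe_steps=5):
    """Choose omega by a short probe run per candidate, comparing relative residuals."""
    Q = C.T @ C
    best_omega, best_res = candidates[0], np.inf
    for omega in candidates:
        V, W = solver(F, C, alpha, omega, max_iter=probe_steps)
        X = V @ W.T
        res = np.linalg.norm(F.T @ X + X @ F - Q, 2) / np.linalg.norm(Q, 2)
        if res < best_res:
            best_omega, best_res = omega, res
    return best_omega
```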

2.4 Complexity analysis

According to the iterative scheme (13), this method is primarily intended for efficiently solving large sparse Lyapunov equations. The approach involves a Cholesky-type factorization and represents the solution in the form of low-rank factors. Before starting Algorithm 1, we need to determine the values of the parameters \(\alpha \) and \(\omega \). Finding suitable parameter values can be challenging, as the selection of parameters significantly affects the iterative results and the convergence speed. In Algorithm 1, a stopping criterion is needed to bound the number of iterations. One approach is to stop the iteration when the change in the approximate solution is small, i.e., \(\parallel X_{k}-X_{k-1}\parallel <\varepsilon \), where

$$\begin{aligned} \parallel X_{k}-X_{k-1}\parallel =\parallel V_{k}W_{k}^{T}-V_{k-1}W_{k-1}^{T}\parallel . \end{aligned}$$

However, computing this norm for high-dimensional matrices can be computationally expensive. We therefore adopt an alternative approach and measure the relative residual of the approximate solution \(X_{k}\) generated by the low-rank GADI iteration (13) at the k-th step. The relative residual \(\text {Res}(X_{k})\) is defined as follows:

$$\begin{aligned} \text {Res}(X_{k})=\frac{\Vert F^{T}X_{k}+X_{k}F-Q\Vert _{2}}{\Vert Q\Vert _{2}}. \end{aligned}$$

Therefore, with a given accuracy requirement \(\varepsilon \), we stop the iteration when Res\((X_{k})<\varepsilon \).
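For moderate n, the 2-norm in \(\text {Res}(X_{k})\) can be formed directly. For larger problems, one can avoid building \(X_{k}=V_{k}W_{k}^{T}\) by writing the residual as a product of thin factors and taking the norm of a small core obtained from economy-size QR factorizations. This is a standard trick rather than part of Algorithm 1; a sketch of ours follows.

```python
import numpy as np

def relative_residual(F, C, V, W):
    """Res(X_k) for X_k = V W^T without forming the n-by-n matrix X_k.

    The residual F^T V W^T + V W^T F - C^T C equals U Z^T with
    U = [F^T V, V, C^T] and Z = [W, F^T W, -C^T]; its 2-norm equals the
    2-norm of the small core R_U R_Z^T built from reduced QR factors.
    """
    U = np.hstack([F.T @ V, V, C.T])
    Z = np.hstack([W, F.T @ W, -C.T])
    RU = np.linalg.qr(U, mode='r')
    RZ = np.linalg.qr(Z, mode='r')
    res = np.linalg.norm(RU @ RZ.T, 2)
    return res / np.linalg.norm(C @ C.T, 2)   # ||Q||_2 = ||C^T C||_2 = ||C C^T||_2
```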

Next, we calculate the computational complexity of Algorithm  1. During the iteration process, the number of columns in matrices \(V_{k}\) and \(W_{k}\) increases with the number of iterations, i.e., \(V_{k},\ W_{k}\in {\mathbb {R}}^{n\times (2^{k}-1)p}\), where \(p\ll n\). Firstly, in the first iteration, the computational cost of obtaining \(V_{1}\) is \(2n^{2}p\), and at this point, \(W_{1}=V_{1}\). Then, in the k-th iteration, the computational cost of obtaining \(V_{k}\) is \(2n^{2}(2^{k-1}-1)p\), and the computational cost of obtaining \(W_{k}\) is \(4n^{3}+2n^{2}(2^{k-1}-1)p+2n^{2}p\). Therefore, the total computational complexity of this algorithm is \(4n^{3}+4n^{2}2^{k-1}p\).

2.5 Numerical experiments

In this section, we use two examples of linear systems to show the numerical feasibility and effectiveness of the R-GADI algorithm. The Lyapunov equation is widely used to determine the stability of linear systems: by solving the Lyapunov equation, a Lyapunov function can be found that certifies the stability of the system. In our numerical experiments, the iteration tolerance is set to \(\varepsilon =1\times 10^{-15}\). All computations are performed on a computer with an Intel Core 1.00GHz CPU, 8.00GB RAM, and MATLAB R2018a. IT denotes the number of iteration steps and CPU the computing time. Denoting the numerical symmetric solution by X, the final relative residual is

$$\begin{aligned} \text {Res}(X)=\frac{\Vert F^{T}X+XF-Q\Vert _{2}}{\Vert Q\Vert _{2}}. \end{aligned}$$

Example 2.5.1 Consider the linear system of the form \({\dot{x}}=Fx\). If \(V(z)=z^{T}Xz\), then

$$\begin{aligned} {\dot{V}}(z)=(Fz)^{T}Xz+z^{T}X(Fz)=z^{T}Qz, \end{aligned}$$

where

$$\begin{aligned} F= \left( \begin{array}{ccccc} 5 & 0.3 & 0 & \cdots & 0\\ 0.2 & 5 & 0.3 & \cdots & 0\\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0.2 & 5 & 0.3\\ 0 & \cdots & 0 & 0.2 & 5\\ \end{array} \right) _{n\times n},\qquad C=\left( \begin{array}{ccccc} 1, & 1, & \cdots , & 1, & 1 \\ \end{array} \right) _{1\times n},\ \ \ Q=C^{T}C. \end{aligned}$$

We set the parameter \(\omega =0.015\) and take \(\alpha \) to be either \(\sqrt{\lambda _{max}(F)\lambda _{min}(F)}\) or \(\max \sigma (F)\). We compare the numerical results for different matrix dimensions and different choices of \(\alpha \), listed in Table 1 below. From the data in the table, it can be observed that, for the same matrix dimension, choosing \(\alpha \) as the maximum singular value of F leads to a shorter total iteration time and hence better performance.
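The data of this example can be generated as follows. The construction script is ours, dense arrays are used only for brevity (a sparse tridiagonal format would be preferable for large n), and the two returned shifts correspond to the choices \(\sqrt{\lambda _{max}(F)\lambda _{min}(F)}\) and \(\max \sigma (F)\) compared in Table 1.

```python
import numpy as np

def example_2_5_1(n):
    """Tridiagonal F and rank-one Q = C^T C of Example 2.5.1, plus the two shifts."""
    F = (5.0 * np.eye(n)
         + 0.3 * np.diag(np.ones(n - 1), 1)
         + 0.2 * np.diag(np.ones(n - 1), -1))
    C = np.ones((1, n))
    lam = np.linalg.eigvals(F).real           # eigenvalues of this F are real and positive
    alpha_eig = np.sqrt(lam.max() * lam.min())
    alpha_sv = np.linalg.norm(F, 2)           # maximum singular value of F
    return F, C, alpha_eig, alpha_sv

F, C, alpha_eig, alpha_sv = example_2_5_1(256)
```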

Table 1 Numerical results for Example 2.5.1

Next, we set the parameters \(\alpha =\max \sigma (F),\ \omega =0.015\). As the matrix dimension is successively doubled, we solve the problem using both the GADI method and the R-GADI method. Table 2 presents the numerical results for the relative residual, iteration count, and CPU time. In terms of iteration count and time, the R-GADI method is relatively more efficient. For instance, when \(n=1024\), Fig. 1 shows the iteration count and relative residual obtained by the two methods, while Fig. 2 displays the computation time required by each method. It is evident from the figures that the R-GADI method is effective.

Table 2 Numerical results for Example 2.5.1
Fig. 1
figure 1

The residual curve of Example 2.5.1 \(n=1024\)

Fig. 2
figure 2

The time curve of Example 2.5.1

Example 2.5.2 Consider the linear system of the form \({\dot{x}}=Fx\), if \(V(z)=z^{T}Xz\), then

$$\begin{aligned} {\dot{V}}(z)=(Fz)^{T}Xz+z^{T}X(Fz)=z^{T}Qz, \end{aligned}$$

where

$$\begin{aligned} F= \left( \begin{array}{ccccc} 9 & 3 & 0 & \cdots & 0\\ -2 & 9 & 3 & \cdots & 0\\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & -2 & 9 & 3\\ 0 & \cdots & 0 & -2 & 9\\ \end{array} \right) _{n\times n},\qquad C=\left( \begin{array}{ccccc} 1, & 1, & \cdots , & 1, & 1 \\ \end{array} \right) _{1\times n},\ \ \ Q=C^{T}C. \end{aligned}$$

We utilize the GADI, R1-ADI, R2-ADI, and R-GADI methods to solve the problem. Table 3 presents the relative residual, iteration count, and time obtained with these four iterative methods as the matrix dimension increases. Comparing these data shows the effectiveness of the R-GADI method. Additionally, Fig. 3 illustrates the correspondence between iteration count and relative residual for these four iterative methods when \(n=1024\), and Fig. 4 displays the iteration time required by each method; it is evident that the R-GADI method achieves the better results.

Table 3 Numerical results for Example 2.5.2
Fig. 3
figure 3

The residual curve of Example 2.5.2 \(n=1024\)

Fig. 4
figure 4

The time curve of Example 2.5.2

From the above Fig. 3, we can see that in the initial iteration phase, these four iterative methods exhibit relatively fast convergence. However, as the iterations progress, the convergence speed slows down. Upon reaching a certain number of iterations, it becomes evident that the R-GADI method demonstrates better convergence speed and requires the least amount of time.

3 Low rank GADI for solving Riccati equation

3.1 Derivation of iterative format

In this section, we investigate the solution of the continuous algebraic Riccati Eq. (1) by the GADI method. The algebraic Riccati equation originates from numerical problems in control theory and finds wide application, particularly in the design of quadratic optimal controllers. For certain linear quadratic optimal control problems, the task can eventually be reduced to computing a stabilizing solution of the algebraic Riccati equation. In general, the solution of the algebraic Riccati Eq. (1) is not unique. Therefore, we first recall a sufficient condition for the existence of a unique solution.

Lemma 3

(Xu (2011)) If the matrix pair (AB) in the algebraic Riccati Eq.  (1) is stabilizable, meaning that for any \(\lambda \) such that Re\(\lambda \ge 0\), the matrix \([A-\lambda I\ B]\) has full row rank, and if the matrix pair \((C,\ A)\) is detectable, meaning that for any \(\lambda \) and x such that Re\(\lambda \ge 0\) and \(Ax=\lambda x\), we have \(Cx\ne 0\), then Eq. (1) has a unique positive semi-definite solution, and this solution is stable.

Firstly, we consider the Newton iteration method:

  • Define the mapping \(F:{\mathbb {R}}^{n\times n}\rightarrow \mathbb R^{n\times n}\) by

    $$\begin{aligned} F(X)=A^{T}X+XA+Q-XGX, \ \ X\in {\mathbb {R}}^{n\times n}, \end{aligned}$$

    then we have

    $$\begin{aligned} F(X+E)&=A^{T}(X+E)+(X+E)A+Q-(X+E)G(X+E)\\&=A^{T}X+XA-XGX+Q+A^{T}E+EA-EGX-XGE-EGE\\&=F(X)+L(E)+O(\Vert E\Vert _{2}^{2}) \ (E\rightarrow 0), \end{aligned}$$

    where

    $$\begin{aligned} L(E)&=A^{T}E+EA-EGX-XGE\\&=(A-GX)^{T}E+E(A-GX), \end{aligned}$$

    which is a linear operator. From the definition of Fréchet differentiability, it follows that F is Fréchet differentiable at any \(X\in {\mathbb {R}}^{n\times n}\), with

    $$\begin{aligned} F^{\prime }(X)(E)=(A-GX)^{*}E+E(A-GX),\ E\in {\mathbb {R}}^{n\times n}. \end{aligned}$$

Thus, by applying Newton's method \(F^{\prime }(X_{k})(E_{k})=-F(X_{k})\), we obtain the iterative format of the Newton method for solving Eq. (1).

  1. (a)

    Seeking \(E_{k}\) that satisfies the equation

    $$\begin{aligned} (A-GX_{k})^{T}E_{k}+E_{k}(A-GX_{k})=-F(X_{k}). \end{aligned}$$
    (14)
  2. (b)

    Calculate \(X_{k+1}=X_{k}+E_{k}\), where \(X_{0}\in \mathbb R^{n\times n}\) is the initial matrix. Rearranging Eq. (14) and using \(X_{k+1}=X_{k}+E_{k}\), then we can obtain the equation satisfied by \(X_{k+1}\)

$$\begin{aligned} (A-GX_{k})^{T}X_{k+1}+X_{k+1}(A-GX_{k})+X_{k}GX_{k}+Q=0. \end{aligned}$$
(15)
  • Next, we apply the Kleinman-Newton method (Kleinman 1968) with an initial feedback matrix \(K_{0}=0\). Let \(K_{k}=X_{k}B\) to obtain another equivalent form of Eq. (15).

    $$\begin{aligned} (A-BK_{k}^{T})^{T}X_{k+1}+X_{k+1}(A-BK_{k}^{T})=-K_{k}K_{k}^{T}-C^{T}C, \end{aligned}$$
    (16)

    with \(A_{k}=BK_{k}^{T}-A,\ M_{k}=[K_{k},\ C^{T}].\) Therefore, Eq. (16) can be rewritten as the following Lyapunov equation form

    $$\begin{aligned} A_{k}^{T}X_{k+1}+X_{k+1}A_{k}=M_{k}M_{k}^{T}. \end{aligned}$$
    (17)
  • In addition, we apply the GADI framework to solve the Lyapunov Eq. (17) and obtain an iterative format

    $$\begin{aligned} \left\{ \begin{array}{ll} (\alpha _{k}I+A_{k}^{T})X_{k+1}^{(l+\frac{1}{2})}=X_{k+1}^{(l)}(\alpha _{k}I-A_{k})+Q_{k},\\ X_{k+1}^{(l+1)}(\alpha _{k}I+A_{k})=X_{k+1}^{(l)}(A_{k}-(1-\omega _{k})\alpha _{k}I)+(2-\omega _{k})\alpha _{k}X_{k+1}^{(l+\frac{1}{2})}, \end{array} \right. \end{aligned}$$
    (18)

where

$$\begin{aligned} Q_{k}=M_{k}M_{k}^{T},\ \ l=0,1,\cdots ,s, \\ X_{k+1}^{(0)}=X_{k},\ \ X_{k+1}=X_{k+1}^{(s)},\ \ \alpha _{k}>0,\ \ 0\le \omega _{k}<2. \end{aligned}$$

Next, we further manipulate the iterative scheme (18) and obtain the following expression under the assumption that \(\alpha _{k}I+A_{k}^{T}\) and \(\alpha _{k}I+A_{k}\) are non-singular.

$$\begin{aligned} X_{k+1}^{(l+1)}&=X_{k+1}^{(l)}(A_{k}-(1-\omega _{k})\alpha _{k}I)(\alpha _{k}I+A_{k})^{-1}\\&\quad +(2-\omega _{k})\alpha _{k}(\alpha _{k}I+A_{k}^{T})^{-1}X_{k+1}^{(l)}(\alpha _{k}I-A_{k})(\alpha _{k}I+A_{k})^{-1}\\&\quad +(2-\omega _{k})\alpha _{k}(\alpha _{k}I+A_{k}^{T})^{-1}Q_{k}(\alpha _{k}I+A_{k})^{-1}. \end{aligned}$$

Due to \(X_{k+1}^{(0)}=X_{k}\), then we have

$$\begin{aligned} X_{k+1}^{(1)}&=X_{k}(A_{k}-(1-\omega _{k})\alpha _{k}I)(\alpha _{k}I+A_{k})^{-1}\\&\quad +(2-\omega _{k})\alpha _{k}(\alpha _{k}I+A_{k}^{T})^{-1}X_{k}(\alpha _{k}I-A_{k})(\alpha _{k}I+A_{k})^{-1}\\&\quad +(2-\omega _{k})\alpha _{k}(\alpha _{k}I+A_{k}^{T})^{-1}Q_{k}(\alpha _{k}I+A_{k})^{-1}. \end{aligned}$$

Taking the initial iterate \(X_{0}=0\), we get

$$\begin{aligned} X_{1}^{(1)}=(2-\omega _{0})\alpha _{0}(\alpha _{0}I+A_{0}^{T})^{-1}Q_{0}(\alpha _{0}I+A_{0})^{-1}=V_{1}^{(1)}(W_{1}^{(1)})^{T}, \end{aligned}$$

where

$$\begin{aligned} V_{1}^{(1)}&=\sqrt{(2-\omega _{0})\alpha _{0}}(\alpha _{0}I+A_{0}^{T})^{-1}M_{0}\\&=\sqrt{(2-\omega _{0})\alpha _{0}}(\alpha _{0}I-A^{T})^{-1}C^{T},\\ W_{1}^{(1)}&=V_{1}^{(1)}. \end{aligned}$$

Therefore, we obtain the following low-rank Kleinman-Newton GADI iterative scheme for the corresponding Riccati equation:

$$\begin{aligned} \left\{ \begin{array}{lllllllllll} A_{k}=BK_{k}^{T}-A,\ M_{k}=[K_{k},\ C^{T}],\ \ Q_{k}=M_{k}M_{k}^{T},\\ V_{1}^{(1)}=\sqrt{(2-\omega _{0})\alpha _{0}}(\alpha _{0}I-A^{T})^{-1}C^{T},\\ V_{k+1}^{(1)}=\sqrt{(2-\omega _{k})\alpha _{k}}(\alpha _{k}I+A_{k}^{T})^{-1}M_{k},\\ V_{k+1}^{(l)}=[V_{k+1}^{(l-1)},\sqrt{(2-\omega _{k})\alpha _{k}}(\alpha _{k}I+A_{k}^{T})^{-1}V_{k+1}^{(l-1)},V_{k+1}^{(1)}],\\ W_{1}^{(1)}=V_{1}^{(1)},\\ W_{k+1}^{(1)}=\sqrt{(2-\omega _{k})\alpha _{k}}(\alpha _{k}I+A_{k}^{T})^{-1}M_{k},\\ W_{k+1}^{(l)}=[(\alpha _{k}I+A_{k}^{T})^{-1}(A_{k}^{T}-(1-\omega _{k})\alpha _{k}I)W_{k+1}^{(l-1)},\\ \ \ \ \ \ \ \ \ \ \sqrt{(2-\omega _{k})\alpha _{k}}(\alpha _{k}I+A_{k}^{T})^{-1}(\alpha _{k}I-A_{k}^{T})W_{k+1}^{(l-1)},W_{k+1}^{(1)}],\\ X_{k+1}^{(l)}=V_{k+1}^{(l)}(W_{k+1}^{(l)})^{T},\\ K_{k+1}= V_{k+1}W_{k+1}^{T}B. \end{array} \right. \end{aligned}$$
(19)
Algorithm 2
figure b

Low rank GADI iteration for solving Lyapunov equation 17

Algorithm 3
figure c

A low rank Kleinman-Newton GADI method for solving Riccati equation 1
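To fix the structure of the outer loop, the dense sketch below runs the Kleinman-Newton iteration (16)-(17) with SciPy's solve_continuous_lyapunov standing in for the inner solver, and stops on the relative change of the feedback matrix used later in Sect. 3.3. It illustrates the outer iteration only; in Algorithm 3 the call to the dense Lyapunov solver is replaced by the low-rank GADI iteration (19).

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def kleinman_newton(A, B, C, tol=1e-12, max_outer=50):
    """Dense Kleinman-Newton sketch for A^T X + X A + C^T C - X B B^T X = 0."""
    n, m = B.shape
    K = np.zeros((n, m))                      # K_0 = 0
    X = np.zeros((n, n))
    for _ in range(max_outer):
        Ak = B @ K.T - A                      # A_k = B K_k^T - A
        Mk = np.hstack([K, C.T])              # M_k = [K_k, C^T]
        # inner step: A_k^T X_{k+1} + X_{k+1} A_k = M_k M_k^T  (Eq. (17)),
        # solved exactly here instead of by the low-rank GADI iteration
        X = solve_continuous_lyapunov(Ak.T, Mk @ Mk.T)
        K_new = X @ B                         # K_{k+1} = X_{k+1} B
        if np.linalg.norm(K_new - K, 2) <= tol * np.linalg.norm(K_new, 2):
            K = K_new
            break
        K = K_new
    return X, K
```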

3.2 Convergence analysis

Next, we provide the convergence results of using the Newton method to solve Eq. (15).

Theorem 7

(Xu (2011)) For the algebraic Riccati Eq. (1), assume that \((A,\ G)\) is stabilizable and \((Q,\ A)\) is detectable, and select a symmetric positive semi-definite matrix \(X_{0}\) such that \(A-GX_{0}\) is stable. Then the matrix sequence \(\{X_{k}\}\) generated by the Newton iteration converges quadratically to the unique positive semi-definite solution X of (1). In other words, there exists a constant \(\delta >0\), independent of k, such that for all positive integers k

$$\begin{aligned} \Vert X_{k}-X\Vert _{2}\le \delta \Vert X_{k-1}-X\Vert _{2}^{2}. \end{aligned}$$

Furthermore, the iteration sequence \(\{X_{k}\}\) exhibits monotonic convergence, i.e.,

$$\begin{aligned} 0\le X\le \cdots \le X_{k+1}\le X_{k}\le \cdots \le X_{1}. \end{aligned}$$

Let

$$\begin{aligned} R(X_{k+1})=A_{k}^{T}X_{k+1}+X_{k+1}A_{k}-M_{k}M_{k}^{T} \end{aligned}$$

be the residual matrix of the Lyapunov Eq. (17), then we have the following proposition.

Proposition 8

Let \(X_{k}\) be the k-th iterate generated by the Kleinman-Newton method described above. Then

$$\begin{aligned} F(X_{k+1})=-R(X_{k+1})-(X_{k}B-X_{k+1}B)(X_{k}B-X_{k+1}B)^{T}. \end{aligned}$$

Proof

From the residual equation defined above, we can obtain that

$$\begin{aligned} R(X_{k+1})&=A_{k}^{T}X_{k+1}+X_{k+1}A_{k}-M_{k}M_{k}^{T}\\&=(K_{k}B^{T}-A^{T})X_{k+1}+X_{k+1}(BK_{k}^{T}-A)-K_{k}K_{k}^{T}-C^{T}C\\&=K_{k}B^{T}X_{k+1}-A^{T}X_{k+1}+X_{k+1}BK_{k}^{T}-X_{k+1}A-K_{k}K_{k}^{T}-C^{T}C\\&=-F(X_{k+1})+K_{k}B^{T}X_{k+1}+X_{k+1}BK_{k}^{T}-X_{k+1}BB^{T}X_{k+1}-K_{k}K_{k}^{T}\\&=-F(X_{k+1})+X_{k}BB^{T}X_{k+1}+X_{k+1}BB^{T}X_{k}\\&\quad -X_{k+1}BB^{T}X_{k+1}-X_{k}BB^{T}X_{k}\\&=-F(X_{k+1})-(X_{k}-X_{k+1})BB^{T}(X_{k}-X_{k+1})\\&=-F(X_{k+1})-(X_{k}B-X_{k+1}B)(X_{k}B-X_{k+1}B)^{T}. \end{aligned}$$

Therefore, from Proposition 8 we can directly prove

$$\begin{aligned} \Vert F(X_{k+1})\Vert _{2}&\le \Vert R(X_{k+1})\Vert _{2}+\Vert (X_{k}B-X_{k+1}B)(X_{k}B-X_{k+1}B)^{T}\Vert _{2}\\&\le \Vert R(X_{k+1})\Vert _{2}+\Vert X_{k}B-X_{k+1}B\Vert _{2}^{2}. \end{aligned}$$

\(\square \)

Theorem 9

Let \(X_{k+1}\) be the symmetric positive semi-definite solution of Eq. (17), let \((A_{k},\ M_{k})\) be stable, and let the parameters satisfy \(\alpha >0,\ 0\le \omega <2\). Then for all \(k=0,1,\cdots \), the low-rank form \(V_{k+1}^{(l)}(W_{k+1}^{(l)})^{T}\) defined in Eq. (19) converges to \(V_{k+1}W_{k+1}^{T}\), which is equivalent to the convergence of the iteration sequence \(\{X_{k+1}^{(l)}\}_{l=0}^{\infty }\) defined in Eq. (18) to \(X_{k+1}\).

Proof

Here, we only need to prove the convergence of the GADI iteration format in Eq. (18). Applying the vectorization operator to Eq. (18) yields

$$\begin{aligned} \left\{ \begin{array}{lll} (\alpha _{k}I_{n^{2}}+I\otimes A_{k}^{T})\text {vec}(X_{k+1}^{(l+\frac{1}{2})})=(\alpha _{k}I_{n^{2}}-A_{k}^{T}\otimes I)\text {vec}(X_{k+1}^{(l)})+\text {vec}(Q_{k}),\\ (\alpha _{k}I_{n^{2}}+A_{k}^{T}\otimes I)\text {vec}(X_{k+1}^{(l+1)})=(A_{k}^{T}\otimes I-(1-\omega _{k})\alpha _{k}I_{n^{2}})\text {vec}(X_{k+1}^{(l)})\\ \qquad \qquad \qquad \qquad \qquad \qquad \quad +(2-\omega _{k})\alpha _{k}\text {vec}(X_{k+1}^{(l+\frac{1}{2})}), \end{array} \right. \end{aligned}$$
(20)

where \(x=\text {vec}(X)\) and \(q_{k}=\text {vec}(Q_{k})\); the equivalent iteration format is then

$$\begin{aligned} \left\{ \begin{array}{ll} (\alpha _{k}I_{n^{2}}+I\otimes A_{k}^{T})x_{k+1}^{(l+\frac{1}{2})}=(\alpha _{k}I_{n^{2}}-A_{k}^{T}\otimes I)x_{k+1}^{(l)}+q_{k},\\ (\alpha _{k}I_{n^{2}}+A_{k}^{T}\otimes I)x_{k+1}^{(l+1)}=(A_{k}^{T}\otimes I-(1-\omega _{k})\alpha _{k}I_{n^{2}})x_{k+1}^{(l)}+(2-\omega _{k})\alpha _{k}x_{k+1}^{(l+\frac{1}{2})}. \end{array} \right. \end{aligned}$$
(21)

The rest of the proof can be carried out as in Theorem 2.6 of Li et al. (2022). \(\square \)

3.3 Complexity analysis

Here, we use the Kleinman-Newton iteration to transform the Riccati equation into the Lyapunov Eq. (17) and then solve it iteratively with the low-rank GADI method. Each outer iteration requires the solution of a Lyapunov equation, which dominates the computational cost. The Kleinman-Newton iteration is an improved version of the Newton iteration in this setting: the right-hand side of Eq. (14) is usually indefinite and of full rank, whereas the method employed in this paper relies heavily on a low-rank right-hand side. The rank of the right-hand side of Eq. (17) is at most \(m+p\), and we compute a low-rank approximate solution, which makes the Kleinman-Newton iteration the more suitable choice. We use the product of two low-rank factors \(V_{k}^{(l)}(W_{k}^{(l)})^{T}\) in place of \(X_{k}^{(l)}\). Next, we describe the stopping criteria used in the algorithm.

Firstly, we define the residual matrix of the Riccati Eq. (1) as follows:

$$\begin{aligned} F(X_{k})=A^{T}X_{k}+X_{k}A-X_{k}GX_{k}+Q. \end{aligned}$$

During the iteration process, in order to reflect its relative approximation level, the relative residual

$$\begin{aligned} \text {Res}(X_{k})=\frac{\Vert A^{T}X_{k}+X_{k}A-X_{k}GX_{k}+Q\Vert _{2}}{\Vert Q\Vert _{2}} \end{aligned}$$

is commonly used as a stopping criterion for iteration. However, in Newton iterations, forming the residual matrix \(R(V_{k}W_{k}^{T})\) requires a significant amount of memory. If the number of columns in \(V_{k}\) is much smaller than the number of rows, the relative residual can be effectively used as the stopping criterion.

Additionally, we can observe the variation of the feedback matrix \(K_{k+1}\). We use the criterion

$$\begin{aligned} R_{s}=\frac{\Vert K_{k+1}-K_{k}\Vert _{2}}{\Vert K_{k+1}\Vert _{2}}<\varepsilon , \end{aligned}$$

where \(K_{k+1}=V_{k+1}W_{k+1}^{T}B\) and \(\varepsilon \) is a very small positive number. This criterion is computationally efficient since \(K\in {\mathbb {R}}^{n\times m}\) and \(m\ll n\).
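When the iterate is kept in factored form, both the feedback update and the test \(R_{s}\) can be evaluated without ever forming the \(n\times n\) product \(V_{k+1}W_{k+1}^{T}\), simply by grouping the multiplication as \(V_{k+1}(W_{k+1}^{T}B)\). The small helper below is our own illustration of this point.

```python
import numpy as np

def feedback_update(V, W, B, K_prev):
    """K_{k+1} = V W^T B computed as V (W^T B), together with the relative change R_s."""
    K = V @ (W.T @ B)            # cost O(n r m) for rank-r factors, not O(n^2 m)
    Rs = np.linalg.norm(K - K_prev, 2) / np.linalg.norm(K, 2)
    return K, Rs
```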

The direct iteration method has a memory requirement of \(O(n^{2})\) and a computational complexity of \(O(n^{3})\). First, the computational cost of \(X_{k+1}^{(l+\frac{1}{2})}\) is \(4n^{2}(2n-1)+3n^{2}\), and the cost of \(X_{k+1}^{(l+1)}\) is \(4n^{2}(2n-1)+4n^{2}\). Therefore, the total computational cost is \(16n^{3}-n^{2}\). By comparison, Algorithms 2 and 3 have smaller memory requirements; if A is a sparse matrix, the memory requirement can reach O(n). Next, we calculate their computational complexity. Since \(A\in {\mathbb {R}}^{n\times n}, \ B\in {\mathbb {R}}^{n\times m}\) and \(C\in {\mathbb {R}}^{p\times n}\), during the iteration process the column numbers of the matrices \(V_{k}\) and \(W_{k}\) increase with the number of iterations. Here, we have \(A_{k}\in {\mathbb {R}}^{n\times n},\ M_{k}\in {\mathbb {R}}^{n\times (m+p)}\). The computational cost of calculating \(V_{1}^{(1)}\) is \(2n^{2}p\), where \(W_{1}^{(1)}=V_{1}^{(1)}\). The cost of calculating \(V_{k}^{(1)}\) is \(2n^{2}(m+p)\), the cost of \(V_{k}^{(l)}\) is \(n^{2}(2^{l}-2)(m+p)\), and the cost of \(W_{k}^{(l)}\) is \(4n^{3}+2n^{2}(2^{l}-2)(m+p)\). Therefore, the total computational cost is \(4n^{3}+n^{2}(3\cdot 2^{l}(m+p)-2(2m+p))\).

3.4 Numerical experiments

In this section, we use two quadratic optimal control examples to show the numerical feasibility and effectiveness of the Kleinman-Newton-RGADI algorithm. The Riccati equation plays a central role in optimal control theory, particularly in linear quadratic regulator (LQR) problems. By solving the Riccati equation, optimal control strategies can be found that minimize the cost of control and state errors. Additionally, the Riccati equation can be utilized for stability analysis of linear systems and for designing stable feedback control systems. In our numerical experiments, the iteration tolerance is set to \(\varepsilon =1\times 10^{-12}\). All computations are performed on a computer with an Intel Core 1.00GHz CPU, 8.00GB RAM, and MATLAB R2018a. IT(inn) and IT(out) denote the numbers of inner and outer iteration steps, respectively, and CPU the computing time.

Example 3.4.1 Consider the linear time-invariant control system of the form

$$\begin{aligned} \left\{ \begin{array}{ll} {\dot{x}}(t)=Ax(t)+Bu(t),\\ y(t)=Cx(t). \end{array} \right. \end{aligned}$$

where

$$\begin{aligned} A= \left( \begin{array}{ccccc} -12 & -3 & 0 & \cdots & 0\\ 2 & -12 & -3 & \cdots & 0\\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 2 & -12 & -3\\ 0 & \cdots & 0 & 2 & -12\\ \end{array} \right) _{n\times n},\ \ B=\left( \begin{array}{c} 0.2 \\ 0.2 \\ \vdots \\ 0.2 \\ 0.2 \\ \end{array} \right) _{n\times 1},\\ C=\left( \begin{array}{ccccc} 0.1, & 0.1, & \cdots , & 0.1, & 0.1 \\ \end{array} \right) _{1\times n},\ \ \ G=BB^{T},\ \ Q=C^{T}C. \end{aligned}$$

When the matrix dimension increases in multiples, we choose the relative residual as the iteration stopping criterion and use the Kleinman-Newton-GADI (K-N-GADI), Kleinman-Newton-RADI (K-N-RADI), and Kleinman-Newton-RGADI (K-N-RGADI) methods for computation. These iteration methods all start from the initial value \(X_{0}=0\) and yield the numerical results shown in Table 4. From the table data, we can see that the K-N-RGADI method is more efficient in solving this example than the K-N-GADI and K-N-RADI methods. Additionally, Fig. 5 clearly shows the variation of iteration steps and relative residual for these three methods when \(n=1024\), while Fig. 6 displays the time consumption of these three iteration methods as the matrix dimension increases. These findings further demonstrate the effectiveness of the K-N-RGADI method.

Table 4 Numerical results for Example 3.4.1
Fig. 5
figure 5

The residual curve of Example 3.4.1 \(n=1024\)

Fig. 6
figure 6

The time curve of Example 3.4.1

Furthermore, as the order n of the coefficient matrix in the equation increases multiplicatively, we employ the K-N-RGADI method with the relative change of the feedback matrix \((R_{s})\) as the iteration stopping criterion. We compare this method with the K-N-GADI and K-N-RGADI methods that use the relative residual Res as the iteration stopping criterion. From Table 5, we can observe that when using the relative change of the feedback matrix as the iteration stopping criterion, the K-N-RGADI method significantly reduces the running time.

Table 5 Numerical results for Example 3.4.1

Figure 7 shows the iteration steps and the corresponding residual \(R_{s}\) obtained with the K-N-RGADI method for different matrix dimensions n. It can be clearly seen that as the matrix dimension increases, the residual \(R_{s}\) also increases.

Fig. 7
figure 7

The residual curve of Example 3.4.1

Example 3.4.2 Consider the linear time-invariant control system of the form

$$\begin{aligned} \left\{ \begin{array}{ll} {\dot{x}}(t)=Ax(t)+Bu(t),\\ y(t)=Cx(t). \end{array} \right. \end{aligned}$$

where

$$\begin{aligned} A= \left( \begin{array}{ccccccc} -12 & -3 & -2 & 0 & 0 & \cdots & 0\\ 2 & -12 & -3 & -2 & 0 & \cdots & 0\\ 1 & 2 & -12 & -3 & -2 & \cdots & 0\\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 1 & 2 & -12 & -3 & -2\\ 0 & \cdots & 0 & 1 & 2 & -12 & -3\\ 0 & \cdots & 0 & 0 & 1 & 2 & -12\\ \end{array} \right) _{n\times n},\ \ B=\left( \begin{array}{c} 0.2 \\ 0.2 \\ \vdots \\ 0.2 \\ 0.2 \\ \end{array} \right) _{n\times 1},\\ C=\left( \begin{array}{ccccc} 0.1, & 0.1, & \cdots , & 0.1, & 0.1 \\ \end{array} \right) _{1\times n},\ \ \ G=BB^{T},\ \ Q=C^{T}C. \end{aligned}$$

We first used the relative residual Res as the iterative stopping criterion and adopted K-N-GADI, K-N-RADI, and K-N-RGADI methods to solve this example problem. The initial iteration value is set to \(X_{0}=0\), and the numerical results obtained are shown in Table  6.

Table 6 Numerical results for Example 3.4.2

Next, we utilize the relative change of the feedback matrix \((R_{s})\) as the stopping criterion for the K-N-RGADI iteration method and compare its runtime with that of the K-N-GADI and K-N-RGADI iteration methods using the relative residual Res as the stopping criterion. As shown in Table 7, when using the relative change of the feedback matrix as the iteration stopping criterion, the K-N-RGADI method achieves a shorter runtime.

Table 7 Numerical results for Example 3.4.2

4 Conclusions

This paper presents a low-rank GADI algorithm for computing low-rank approximate solutions to large-scale Lyapunov and algebraic Riccati equations. For the algebraic Riccati equation, we combine the Kleinman-Newton method with the low-rank GADI algorithm to solve the Lyapunov equation at each Newton step, resulting in the Kleinman-Newton-RGADI algorithm. Additionally, we observe that the low-rank GADI method exhibits the same convergence properties as the GADI method when solving both the Lyapunov and algebraic Riccati equations with low-rank approximations. Furthermore, numerical examples compare the effectiveness of the low-rank ADI algorithm and the low-rank GADI algorithm, and the results demonstrate that the low-rank GADI method is more efficient. However, like other solvers, the performance of this algorithm heavily relies on the choice of shift parameters, which remains a challenging problem. Some new parameter selection methods have recently been developed; for instance, neural networks can be used to learn parameters during the solution of the Lyapunov equation to further improve the efficiency of the algorithm, and a Gaussian process regression (GPR) method based on Bayesian inference can be used to predict relatively optimal parameters within the GADI framework. Applying these parameter selection methods to our work will be a key focus of our upcoming efforts.