Abstract
The time-ordered exponential is defined as the function that solves a system of coupled first-order linear differential equations with generally non-constant coefficients. Despite being at the heart of many problems in system dynamics, control theory, and model reduction, the time-ordered exponential remains elusively difficult to evaluate. The \(*\)-Lanczos algorithm is a (symbolic) algorithm capable of evaluating it by producing a tridiagonalization of the original differential system. In this paper, we explain how the \(*\)-Lanczos algorithm is built from a generalization of Krylov subspaces, and we prove crucial properties, such as the matching moment property. A strategy for its numerical implementation is also outlined and will be the subject of future investigation.
1 Introduction
Let \(t'\ge t\in I\subseteq \mathbb {R}\) be variables—called times for convenience—in an interval I, and \(\mathsf {A}(t')\) be an \(N\times N\) time-dependent matrix. For a fixed t (usually \(t=0\)), the time-ordered exponential of \(\mathsf {A}(t')\) is defined as the unique solution \(\mathsf {U}(t',t)\) of the non-autonomous system of linear ordinary differential equations
$$\begin{aligned} \frac{\partial }{\partial t'}\mathsf {U}(t',t)=\mathsf {A}(t')\,\mathsf {U}(t',t),\qquad \mathsf {U}(t,t)=\mathsf {Id}, \end{aligned}$$
(1.1)
with \(\mathsf {Id}\) the identity matrix. Note that t represents the time on which the initial condition is given, and that the (unusual) notation \(\mathsf {U}(t',t)\) will be useful later. If the matrix \(\mathsf {A}\) commutes with itself at all times, i.e., \(\mathsf {A}(\tau _1)\mathsf {A}(\tau _2)-\mathsf {A}(\tau _2)\mathsf {A}(\tau _1)=0\) for all \(\tau _1,\tau _2 \in I\), then the time-ordered exponential is given by the matrix exponential \(\mathsf {U}(t',t)=\exp \left( \int _t^{t'} \mathsf {A}(\tau )\, \text {d}\tau \right) .\) However, when \(\mathsf {A}\) does not commute with itself at all times, the time-ordered exponential has no known explicit form in terms of \(\mathsf {A}\) and is rather denoted
$$\begin{aligned} \mathsf {U}(t',t)=\mathcal {T}\exp \left( \int _t^{t'}\mathsf {A}(\tau )\,\text {d}\tau \right) , \end{aligned}$$
with \(\mathcal {T}\) the time-ordering operator [1]. This expression, introduced by Dyson in 1952, is more a notation than an explicit form as the action of the time-ordering operator is very difficult to evaluate. In particular, \(\mathsf {U}(t',t)\) does not have a Cauchy integral representation, and it cannot be evaluated via ordinary diagonalization. It is unlikely that a closed form expression for \(\mathsf {U}(t',t)\) in terms of \(\mathsf {A}\) exists at all since even when \(\mathsf {A}\) is \(2\times 2\), \(\mathsf {U}\) can involve very complicated special functions [2, 3].
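The commuting-case reduction mentioned above can be checked numerically. The sketch below is our own illustration (not part of the original development): it integrates \(\partial _{t'}\mathsf {U}=\mathsf {A}(t')\mathsf {U}\) for the hypothetical choice \(\mathsf {A}(t')=\cos (t')\,\mathsf {M}\) with a fixed matrix \(\mathsf {M}\), so that \(\mathsf {A}\) commutes with itself at all times, and compares the result with \(\exp \big (\int _0^{t'}\mathsf {A}(\tau )\,\text {d}\tau \big )=\exp (\sin (t')\,\mathsf {M})\):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

# A(t) = cos(t) * M commutes with itself at all times, so the
# time-ordered exponential reduces to the ordinary matrix exponential
# exp( int_0^{t'} A(tau) dtau ) = exp( sin(t') * M ).
M = np.array([[0.0, 1.0], [1.0, 0.0]])

def rhs(t, u):
    # Flattened ODE  dU/dt' = A(t') U  with  U(0) = Id.
    return (np.cos(t) * M @ u.reshape(2, 2)).ravel()

tp = 1.3  # final time t'
sol = solve_ivp(rhs, (0.0, tp), np.eye(2).ravel(), rtol=1e-10, atol=1e-12)
U_ode = sol.y[:, -1].reshape(2, 2)
U_exp = expm(np.sin(tp) * M)

print(np.max(np.abs(U_ode - U_exp)))  # agreement up to integration tolerance
```

For a non-self-commuting \(\mathsf {A}\), the two quantities would differ, which is precisely the regime the paper addresses.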
Evaluating time-ordered exponentials is a central question in the field of system dynamics, in particular in quantum physics where \(\mathsf {A}\) is the Hamiltonian operator. Situations where this operator does not commute with itself are routinely encountered [4], and the departure of the time-ordered exponential from a straightforward matrix exponential is responsible for many peculiar physical effects [5,6,7]. Further applications are found via differential Lyapunov and Riccati matrix equations, which frequently appear in control theory, filter design, and model reduction problems [8,9,10,11,12]. Indeed, the solutions of such differential equations involve time-ordered exponentials [13,14,15,16].
In [17], we introduced a tridiagonal form for the matrix \(\mathsf {A}(t')\) from which it is possible to express a time-ordered exponential via path-sum continued fractions of finite depth. More precisely, the described procedure formulates each element of a time-ordered exponential in terms of a finite and tractable number of scalar integro-differential equations. Such a tridiagonal form is obtained by using the so-called \(*\)-Lanczos algorithm. The algorithm also appeared in [18] as that paper's motivation. Despite being at the core of the results in [17] and the motivation of [18], the construction of the \(*\)-Lanczos algorithm and the proofs of its main properties have not yet appeared in a scientific journal. The present paper aims to fill this gap in the literature: it constructs the \(*\)-Lanczos algorithm from a (generalized) Krylov subspace perspective, proves the related Matching Moment Property, introduces a bound for the approximation error, and adds further results on the breakdown issue that may affect the method.
1.1 Existing analytical approaches: Pitfalls and drawbacks
In spite of the paramount importance of the time-ordered exponential, it is usually omitted from the literature on matrix functions. Until 2015, only two families of analytical approaches existed (numerical methods will be discussed in Sect. 5). The first one to have been devised relies on Floquet theory and requires \(\mathsf {A}(t')\) to be periodic (see, e.g., [4]). This method transforms Eq. (1.1) into an infinite system of coupled linear differential equations with constant coefficients. This system is then solved by a perturbative expansion at very low order, relying on the presence of a small parameter. Orders higher than 2 or 3 are typically too involved to be treated. The second method was developed in 1954 by Wilhelm Magnus [19]. It produces an infinite series of nested commutators of \(\mathsf {A}\) with itself at different times, the ordinary matrix exponential of which provides the desired solution \(\mathsf {U}(t',t)\). Magnus series are very much in use nowadays [4], especially because they guarantee that the approximation to \(\mathsf {U}(t',t)\) is unitary in quantum mechanical calculations [4]. Nevertheless, the Magnus series for \(\mathsf {U}(t',t)\) has a small convergence domain; see [20] and also [21,22,23,24,25].
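For reference, the Magnus representation writes \(\mathsf {U}(t',t)=\exp \left( \Omega (t',t)\right) \) with \(\Omega =\sum _{k\ge 1}\Omega _k\); its first two terms take the standard form (see, e.g., [4])
$$\begin{aligned} \Omega _1(t',t)=\int _t^{t'}\mathsf {A}(\tau )\,\text {d}\tau ,\qquad \Omega _2(t',t)=\frac{1}{2}\int _t^{t'}\!\int _t^{\tau _1}\left[ \mathsf {A}(\tau _1),\mathsf {A}(\tau _2)\right] \text {d}\tau _2\,\text {d}\tau _1, \end{aligned}$$
with \(\left[ \cdot ,\cdot \right] \) the matrix commutator; all terms \(\Omega _k\) with \(k\ge 2\) vanish when \(\mathsf {A}\) commutes with itself at all times.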
In 2015, in the joint article [26], one of the authors presented a third method to obtain time-ordered exponentials using graph theory, requiring only that the entries \(\mathsf {A}(t')_{k\ell }\) be bounded functions of time. The method formulates any desired entry or group of entries of \(\mathsf {U}(t',t)\) as a branched continued fraction of finite depth and breadth. It has been successfully used to solve challenging quantum dynamics problems, see, e.g., [27, 28]. This approach is unconditionally convergent and it provides exact expressions in terms of a finite number of integrals and Volterra equations. However, it suffers from a complexity drawback. Indeed, it requires one to find all the simple cycles and simple paths of a certain graph G. These are the walks on G which are not self-intersecting. Unfortunately, the problem of enumerating such walks is \(\#\)P-complete [29], hindering the determination of exact solutions in large systems, which must instead be treated using a further property of analytical path-sums called scale-invariance [28]. The present work, together with the results in [17, 18], solves this issue by transforming the original matrix into a structurally simpler one on which the path-sum solution takes the form of an ordinary, finite, continued fraction.
1.2 The non-Hermitian Lanczos algorithm: background
Consider the simpler case in which \(\mathsf {A}\) is not time-dependent. The solution of (1.1) is given by the matrix function \(\exp (\mathsf {A}(t'-t))\) which can be numerically approximated in several different ways (see, e.g., [30,31,32]). One possible method is the (non-Hermitian) Lanczos algorithm. Computing the \((k,\ell )\) element of \(\exp (\mathsf {A})\) is equivalent to computing the bilinear form \(\varvec{e}_k^H\exp (\mathsf {A}) \,\varvec{e}_\ell \), with \(\varvec{e}_k, \varvec{e}_\ell \) vectors from the canonical Euclidean basis, and \(\varvec{e}_k^H\) the usual Hermitian transpose (here it coincides with the transpose since the vector is real). The non-Hermitian Lanczos algorithm (e.g., [33,34,35,36]) gives, when no breakdown occurs, the matrices
whose columns are biorthonormal bases respectively for the Krylov subspaces
Note that for \(\mathsf {A}\) Hermitian and \(k=\ell \) we can equivalently use the Hermitian Lanczos algorithm (getting \(\mathsf {V}_n = \mathsf {W}_n\)). The so-called (complex) Jacobi matrix \(\mathsf {J}_n\) is the tridiagonal symmetric matrix with generally complex elements obtained by
As described in [35], we can use the approximation
which relies on the so-called matching moment property, i.e.,
see, e.g., [35, 36] for the Hermitian case, and [37, 38] for the non-Hermitian one. The approximation (1.2) is a model reduction in two ways. First, the size of \(\mathsf {J}_n\) is much smaller than that of \(\mathsf {A}\). Second, the structure of the matrix \(\mathsf {J}_n\) is much simpler since it is tridiagonal.
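For concreteness, the following Python sketch (our own illustration) implements one common variant of the non-Hermitian Lanczos recurrence—the scaling of the off-diagonal coefficients below is a normalization choice, not prescribed by the references—and checks the matching moment property numerically on a random matrix with \(\varvec{v}=\varvec{w}=\varvec{e}_1\):

```python
import numpy as np

def nonhermitian_lanczos(A, v, w, n):
    """Biorthogonal (non-Hermitian) Lanczos; returns the tridiagonal J_n.

    Assumes w^H v = 1 and that no breakdown occurs; the beta/gamma
    scaling below is one common normalization choice.
    """
    N = A.shape[0]
    V = np.zeros((N, n + 1), dtype=complex)
    W = np.zeros((N, n + 1), dtype=complex)
    alpha = np.zeros(n, dtype=complex)
    beta = np.zeros(n + 1, dtype=complex)   # subdiagonal (beta[0] unused)
    gamma = np.zeros(n + 1, dtype=complex)  # superdiagonal (gamma[0] unused)
    V[:, 0], W[:, 0] = v, w
    for j in range(n):
        alpha[j] = W[:, j].conj() @ A @ V[:, j]
        r = A @ V[:, j] - alpha[j] * V[:, j]
        s = A.conj().T @ W[:, j] - alpha[j].conj() * W[:, j]
        if j > 0:
            r -= gamma[j] * V[:, j - 1]
            s -= beta[j].conj() * W[:, j - 1]
        d = s.conj() @ r
        beta[j + 1] = np.sqrt(abs(d))       # scaling choice; d = 0 => breakdown
        gamma[j + 1] = d / beta[j + 1]
        V[:, j + 1] = r / beta[j + 1]
        W[:, j + 1] = s / gamma[j + 1].conj()
    return np.diag(alpha) + np.diag(beta[1:n], -1) + np.diag(gamma[1:n], 1)

# Moments w^H A^j v match e_1^H J_n^j e_1 for j = 0, ..., 2n-1.
rng = np.random.default_rng(0)
N, n = 6, 3
A = rng.standard_normal((N, N)) + 0j
v = np.zeros(N, dtype=complex)
v[0] = 1.0                                  # v = w = e_1, so w^H v = 1
J = nonhermitian_lanczos(A, v, v.copy(), n)
for j in range(2 * n):
    m_A = (v.conj() @ np.linalg.matrix_power(A, j) @ v).real
    m_J = np.linalg.matrix_power(J, j)[0, 0].real
    print(j, m_A, m_J)
```

In exact arithmetic the printed pairs coincide for all \(j\le 2n-1\), regardless of the chosen normalization.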
Given a matrix \(\mathsf {A}\) of size N, the Lanczos algorithm can be used as a method for its tridiagonalization (see, e.g., [39]). Assuming no breakdown, the Nth iteration of the non-Hermitian Lanczos algorithm with input the matrix \(\mathsf {A}\) and a pair of vectors \(\varvec{v}, \varvec{w}\) produces the tridiagonal matrix \(\mathsf {J}_N\), and the biorthogonal square matrices \(\mathsf {V}_N, \mathsf {W}_N\) so that
giving the exact expression
Theorem 2.2, which we prove in this work, extends this result to time-ordered exponentials.
The Lanczos approximation (1.2) is connected with several further topics, such as (formal) orthogonal polynomials, Gauss quadrature, continued fractions, the moment problem, and many others. Information about these connections and references to the related rich and vast literature can be found, e.g., in the monographs [35, 36, 40] and the surveys [37, 38].
Inspired by approximation (1.2), the \(*\)-Lanczos algorithm produces a model reduction of a time-ordered exponential by providing a time-dependent tridiagonal matrix \(\mathsf {T}_n\) satisfying properties analogous to the ones described above [17]. Unlike in the classical case, the \(*\)-Lanczos algorithm works on vector distribution subspaces and has to deal with a non-commutative product.
The time-dependent framework in which the proposed method works is much more complicated than the time-independent Krylov subspace approximation given by the Lanczos algorithm. In this paper, we will not deal with the behavior of the \(*\)-Lanczos algorithm when taking into account approximations and finite-precision arithmetic problems.
1.3 Outline
The work is organized as follows: In Sect. 2, we build the \(*\)-Lanczos algorithm. The algorithm relies on a non-commutative \(*\)-product between generalized functions of two-time variables, which we describe in Sect. 2.1. Then, in Sect. 2.2, we state the main result, Theorem 2.1, which underpins the Lanczos-like procedure. Theorem 2.2 establishes that the first 2n \(*\)-moments of a certain tridiagonal matrix \(\mathsf {T}_n\) match the corresponding \(*\)-moments of the original matrix \(\mathsf {A}\). Theorem 2.1 is proved with the tools developed in the subsequent Sect. 2.3. Section 3 is devoted to the convergence and breakdown properties of the algorithm, while examples of its use are given in Sect. 4. In Sect. 5 we outline a way to implement the Lanczos-like procedure numerically and we evaluate its computational cost. Section 6 concludes the paper.
2 The \(*\)-Lanczos algorithm
2.1 The \(*\)-product and \(*\)-moments
In this section, we recall the definition and the main properties of the product introduced in [18, Section 1.2].
Let t and \(t'\) be two real variables in a real time interval I. We consider the class \(\text {D}(I)\) of all distributions which are linear superpositions of Heaviside theta functions and Dirac delta derivatives with smooth coefficients over \(I^2\). That is, a distribution d is in \(\text {D}(I)\) if and only if it can be written as
$$\begin{aligned} d(t',t)=\widetilde{d}(t',t)\,\Theta (t'-t)+\sum _{i=0}^{N}\widetilde{d}_i(t',t)\,\delta ^{(i)}(t'-t), \end{aligned}$$
(2.1)
where \(N\in \mathbb {N}\) is finite, \(\Theta (\cdot )\) stands for the Heaviside theta function (with the convention \(\Theta (0)=1\)) and \(\delta ^{(i)}(\cdot )\) is the ith derivative of the Dirac delta distribution \(\delta = \delta ^{(0)}\). Here and from now on, a tilde over a function (e.g., \(\widetilde{d}(t',t)\)) indicates that it is an ordinary function smooth in both \(t'\in I\) and \(t\in I\). Note that we consider distributions as defined by Schwartz [41]. Hence a distribution \(d \in \text {D}(I)\) should be interpreted as a linear functional applied to test functions.
We can endow the class \(\text {D}(I)\) with a non-commutative algebraic structure upon defining a product between its elements. For \(f_1, f_2 \in \text {D}(I)\) we define the convolution-like \(*\)-product between \(f_1(t',t)\) and \(f_2(t',t)\) as
$$\begin{aligned} (f_1*f_2)(t',t):=\int _{I} f_1(t',\tau )\,f_2(\tau ,t)\,\text {d}\tau , \end{aligned}$$
that has as identity element the Dirac delta distribution, \(1_*:=\delta (t'-t)\). When \(f(t',t) = f(t'-t)\) has bounded support, the \(*\)-product \(f *g\) (and \(g *f\)) is equivalent to the convolution product for distributions defined by Schwartz ([42, § 11] and [41, Chapter VI]). Since \(\delta ^{(i)}(t'-t)\) has bounded support, given \(f \in \text {D}(I)\), the \(*\)-products \(\delta ^{(i)}(t'-t) *f\) and \(f *\delta ^{(i)}(t'-t)\) are well defined and are both elements of \(\text {D}(I)\); see [18] for further details. Moreover, it holds
Consider the subclass \(\text {Sm}_\Theta (I)\) of \(\text {D}(I)\) comprising those distributions of the form
$$\begin{aligned} f(t',t)=\widetilde{f}(t',t)\,\Theta (t'-t). \end{aligned}$$
For \(f_1,\,f_2\in \text {Sm}_\Theta (I)\), the \(*\)-product between \(f_1,f_2\) simplifies to
$$\begin{aligned} (f_1*f_2)(t',t)=\Theta (t'-t)\int _{t}^{t'} \widetilde{f}_1(t',\tau )\,\widetilde{f}_2(\tau ,t)\,\text {d}\tau , \end{aligned}$$
which makes calculations involving such functions easier to carry out and shows that \(\text {Sm}_\Theta (I)\) is closed under \(*\)-multiplication. Together with the arguments above, this proves that \(\text {D}(I)\) is closed under \(*\)-multiplication. Hence, for \(f \in \text {D}(I)\), we can define its kth \(*\)-power \(f^{*k}\) as the k \(*\)-products \(f *f *\dots *f\), with the convention \(f^{*0} = \delta (t'-t)\). First examples of \(*\)-powers are
$$\begin{aligned} \Theta ^{*2}(t',t)=(t'-t)\,\Theta (t'-t),\qquad \Theta ^{*k}(t',t)=\frac{(t'-t)^{k-1}}{(k-1)!}\,\Theta (t'-t). \end{aligned}$$
Note that, for members of \(\text {Sm}_\Theta (I)\), the \(*\)-product reduces exactly to the Volterra composition, a product between smooth functions of two variables developed by Volterra and Pérès [43].
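Since on \(\text {Sm}_\Theta (I)\) the \(*\)-product is a Volterra composition, it is straightforward to probe numerically. The sketch below (our own illustration, not the paper's implementation) discretizes smooth parts on a uniform grid and checks the identity \(\Theta ^{*3}(t',t)=\tfrac{(t'-t)^2}{2}\,\Theta (t'-t)\):

```python
import numpy as np

# Sketch: Sm_Theta elements are represented by samples of their smooth
# parts; the *-product becomes the Volterra composition
#   (f1 * f2)(t', t) = int_t^{t'} f1(t', tau) f2(tau, t) dtau.

def trap(vals, h):
    """Composite trapezoidal rule on uniformly spaced samples."""
    return 0.0 if len(vals) < 2 else h * (vals[0] / 2 + vals[1:-1].sum() + vals[-1] / 2)

def star(F1, F2, grid):
    """F[i, j] holds the smooth part evaluated at (t' = grid[i], t = grid[j])."""
    h = grid[1] - grid[0]
    n = len(grid)
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1):  # the Theta factor confines everything to t' >= t
            out[i, j] = trap(F1[i, j:i + 1] * F2[j:i + 1, j], h)
    return out

grid = np.linspace(0.0, 1.0, 201)
TP, T = np.meshgrid(grid, grid, indexing="ij")
Theta = np.ones_like(TP)                 # smooth part of Theta(t'-t)
Theta3 = star(star(Theta, Theta, grid), Theta, grid)

# Theta^{*3}(t',t) = (t'-t)^2 / 2 on t' >= t; the trapezoidal rule is
# exact here because the integrands are constant and linear.
print(np.max(np.abs(Theta3 - np.tril((TP - T) ** 2 / 2))))
```

The same routine applies verbatim to any sampled smooth parts, at an \(O(n^3)\) cost per product on an n-point grid.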
We will not discuss the technicalities associated with the \(*\)-product by and between distributions such as Dirac delta derivatives, since doing so would take us too far from the paper's goals. More details on the distributions themselves can be found in [41], while illustrative examples of \(*\)-products involving such distributions can be found in [18].
The \(*\)-product extends directly to distributions of \(\text {D}(I)\) whose smooth coefficients depend on less than two variables. Indeed, consider a generalized function \(f_3(t',t)=\tilde{f}_3(t')\delta ^{(i)}(t'-t)\) with \(i\ge -1\) and \(\delta ^{(-1)}= \Theta \). Then
where \(f_1(t',t)\in \text {Sm}_\Theta (I)\) is as defined on p. 6. Hence the variable of \(\tilde{f}_3(t')\) is treated as the left variable of a smooth function of two variables. This observation extends straightforwardly should \(\tilde{f}_3\) be constant and, by linearity, to any distribution of \(\text {D}(I)\).
The \(*\)-product also naturally extends to matrices whose entries are distributions of \(\text {D}(I)\). Consider two such matrices \(\mathsf {A}_1(t',t)\) and \(\mathsf {A}_2(t',t)\in \text {D}(I)^{N\times N}\); then
where the sizes of \(\mathsf {A}_1, \mathsf {A}_2\) are compatible for the usual matrix product (here and in the following, we omit the dependency on \(t'\) and t when it is clear from the context) and the integral is treated component-wise. As earlier, the \(*\)-product is associative and distributive with respect to the addition, but it is non-commutative. The identity element with respect to this product is now \(\mathsf {Id}_*:=\mathsf {Id}\,1_*\), with \(\mathsf {Id}\) the identity matrix of appropriate size.
Given a square matrix \(\mathsf {A}(t',t)\) composed of elements from \(\text {D}(I)\), we define the k-th matrix \(*\)-power \(\mathsf {A}^{*k}\) as the k \(*\)-products \(\mathsf {A}*\mathsf {A} *\dots *\mathsf {A}\). In particular, by (2.4) we get the bound
with \(\Vert \cdot \Vert _\star \) any induced matrix norm. As a consequence, the \(*\)-resolvent of any matrix depending on at most two variables is well defined, as \(\mathsf {R}_{*}(\mathsf {A}):=\left( \mathsf {Id}_*-\mathsf {A}\right) ^{*-1}=\mathsf {Id}_{*}+\sum _{k\ge 1} \mathsf {A}^{*k}\) exists provided every entry of \(\mathsf {A}\) is bounded for all \(t',t \in I\) (see [26]). Then
$$\begin{aligned} \mathsf {U}(t',t)=\left( \Theta *\mathsf {R}_{*}(\mathsf {A})\right) (t',t)=\int _t^{t'}\mathsf {R}_{*}(\mathsf {A})(\tau ,t)\,\text {d}\tau \end{aligned}$$
is the time-ordered exponential of \(\mathsf {A}(t',t)\); see [26]. We recall that \(\Theta (\cdot )\) here stands for the Heaviside function under the convention that \(\Theta (0)=1\). Note that time-ordered exponentials are usually presented with only one time variable, corresponding to \(\mathsf {U}(t)=\mathsf {U}(t,0)\). Yet, in general \(\mathsf {U}(t',t)\ne \mathsf {U}(t'-t,0)\).
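Truncating the \(*\)-resolvent series and composing with \(\Theta \) yields the familiar iterated (Dyson/Picard) integrals, which can be compared against a direct numerical integration of (1.1). The sketch below rests on this identification (our own illustration) and uses the hypothetical non-self-commuting test matrix \(\widetilde{\mathsf {A}}(t)\) defined in the code:

```python
import numpy as np
from scipy.integrate import solve_ivp

def Atil(t):
    # Atil(t) does not commute with itself at different times, so
    # U(t', 0) is a genuine time-ordered exponential (test matrix).
    return np.array([[0.0, 1.0], [t, 0.0]])

grid = np.linspace(0.0, 1.0, 401)
h = grid[1] - grid[0]
Avals = np.array([Atil(t) for t in grid])

# Picard sweep U <- Id + int_0^{t'} Atil(tau) U(tau) dtau: each sweep adds
# one more iterated integral of the truncated series (trapezoidal rule).
U = np.broadcast_to(np.eye(2), (len(grid), 2, 2)).copy()
for _ in range(20):
    F = np.einsum("tij,tjk->tik", Avals, U)
    integral = np.zeros_like(U)
    integral[1:] = np.cumsum((F[1:] + F[:-1]) / 2 * h, axis=0)
    U = np.eye(2) + integral

# Reference: direct integration of dU/dt' = Atil(t') U, U(0) = Id.
sol = solve_ivp(lambda t, u: (Atil(t) @ u.reshape(2, 2)).ravel(),
                (0.0, 1.0), np.eye(2).ravel(), rtol=1e-10, atol=1e-12)
print(np.max(np.abs(U[-1] - sol.y[:, -1].reshape(2, 2))))  # quadrature-level error
```

The agreement is limited only by the quadrature step and the truncation order, illustrating why the series representation is usable but, on its own, not an efficient evaluation scheme.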
In the spirit of the Lanczos algorithm, given a time-dependent matrix \(\mathsf {A}(t',t)\), we will construct a matrix \(\mathsf {T}_n(t',t)\) of size \(n\le N\) with a simpler (tridiagonal) structure and so that, fixing the indices \(k,\ell \), it holds
compare it with (1.3). In particular, when \(n=N\) property (2.7) stands for every \(j\ge 0\), giving
Hence the solution is given by the path-sum techniques exploiting the fact that the graph having \(\mathsf {T}_N\) as its adjacency matrix admits self-loops. More generally, given time-independent vectors \(\varvec{v},\varvec{w}\) we call the jth \(*\)-moment of \(\mathsf {A},\varvec{v},\varvec{w}\) the scalar function \(\varvec{w}^H (\mathsf {A}^{*j}(t',t))\, \varvec{v}\), for \(j \ge 0\) (note that when the \(*\)-product symbol is omitted, it stands for the usual matrix-vector product). Then property (2.7) is an instance of the more general case
2.2 Building up the \(*\)-Lanczos process
Given a doubly time-dependent matrix \(\mathsf {A}(t',t)=\widetilde{\mathsf {A}}(t')\Theta (t'-t)\) and \(k+1\) scalar generalized functions \(\alpha _0(t',t), \alpha _1(t',t),\dots ,\alpha _k({t',t}) \in \text {D}(I)\) which play the role of the coefficients, we define the matrix \(*\)-polynomial \(p(\mathsf {A})(t',t)\) of degree k as
moreover, we define the corresponding dual matrix \(*\)-polynomial as
where, in general, \(\bar{d}\) is the conjugate of \(d \in \text {D}(I)\), defined by conjugating the functions \(\widetilde{d}\) and \(\widetilde{d}_i\) in (2.1). Let \(\varvec{v}\) be a time-independent vector; we can then define the set of time-dependent vectors \(p(\mathsf {A})\varvec{v}\), with p a matrix \(*\)-polynomial. Such a set is a vector space with respect to the product \(*\) and with scalars \(\alpha _j(t',t)\) (the addition is the usual addition between vectors). Similarly, given a vector \(\varvec{w}^H\) not depending on time, we can define the vector space given by the dual vectors \(\varvec{w}^H p^D(\mathsf {A})\). In particular, we can define the \(*\)-Krylov subspaces
The vectors \(\varvec{v}, \mathsf {A}\varvec{v}, \dots ,\mathsf {A}^{*(n-1)}\varvec{v}\) and \(\varvec{w}^H, \varvec{w}^H \mathsf {A}, \dots ,\varvec{w}^H \mathsf {A}^{*(n-1)}\) are bases respectively for \(\mathcal {K}_n(\mathsf {A}, \varvec{v})\) and \(\mathcal {K}^D_n(\mathsf {A}, \varvec{w})\). We aim to derive \(*\)-biorthonormal bases \(\varvec{v}_0, \dots , \varvec{v}_{n-1}\) and \(\varvec{w}_0^H, \dots , \varvec{w}_{n-1}^H\) for the \(*\)-Krylov subspaces, i.e., so that
with \(\delta _{ij}\) the Kronecker delta.
Assuming that \(\varvec{w}^H \varvec{v} = 1\), we can use a non-Hermitian-Lanczos-like biorthogonalization process for the triplet \((\varvec{w}, \mathsf {A}(t',t), \varvec{v})\). We shall call this method the \(*\)-Lanczos process. The first vectors of the biorthogonal bases are
so that \(\varvec{w}_0^H * \varvec{v}_0 = 1_*\). Consider now a vector \(\widehat{\varvec{v}}_1 \in \mathcal {K}_2(\mathsf {A},\varvec{v})\) given by
If the vector satisfies the \(*\)-biorthogonal condition \(\varvec{w}_0^H * \widehat{\varvec{v}}_1 = 0\), then
Similarly, we get the expression
with \(\alpha _0\) given by (2.9). Hence the \(*\)-biorthonormal vectors are defined as
with \(\beta _1=\widehat{\varvec{w}}_1^H*\widehat{\varvec{v}}_1\) and \(\beta _1^{*-1}\) its \(*\)-inverse, i.e., \(\beta _1^{*-1}*\beta _1 = \beta _1 *\beta _1^{*-1}= 1_*\). We give sufficient conditions for the existence of such \(*\)-inverses below. Assume now that we have the \(*\)-biorthonormal bases \(\varvec{v}_0, \dots , \varvec{v}_{n-1}\) and \(\varvec{w}_0^H, \dots , \varvec{w}_{n-1}^H\). Then we can build the vector
where the \(\gamma _i\) are determined by the condition \(\varvec{w}_j^H * \widehat{\varvec{v}}_n = \delta _{jn}1_*\), for \(j=0,\dots ,n-1\), giving
In particular, since \(\varvec{w}_j^H*\mathsf {A} \in \mathcal {K}_{j+1}^D(\mathsf {A},\varvec{w})\) we get \(\gamma _j = 0\) for \(j=0,\dots ,n-3\). This leads to the following three-term recurrences for \(n=1,2,\dots \) using the convention \(\varvec{v}_{-1} = \varvec{w}_{-1} = 0\),
with the coefficients given by
Should \(\beta _n\) not be \(*\)-invertible, we would get a breakdown in the algorithm, since it would be impossible to compute \(\varvec{v}_n\). We developed a range of general methods to determine the \(*\)-inverse of functions of two-time variables which are gathered in [18]. These methods constructively show the existence of \(\beta _n^{*-1}\) almost everywhere on I under the following conditions:
-
\(\beta _n\) is not identically zero in \(I^2\);
-
\(\beta _n\in \text {Sm}_\Theta (I)\).
The question of whether all \(\alpha _n, \beta _n\in \text {Sm}_\Theta (I)\) was settled affirmatively in [17].
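The first step of the recurrences above can be carried out symbolically. The SymPy sketch below is our own illustration, with the hypothetical choice \(\widetilde{\mathsf {A}}(t')\) given in the code and \(\varvec{v}=\varvec{w}=\varvec{e}_1\); the \(*\)-products with the \(\delta \)-type initial vectors are absorbed by hand, so only Volterra compositions of smooth parts remain:

```python
import sympy as sp

tp, t, tau = sp.symbols("tp t tau", real=True)  # tp plays the role of t'

def star(f, g):
    # Volterra composition of smooth parts:
    # (f * g)(t', t) = int_t^{t'} f(t', tau) g(tau, t) dtau
    return sp.integrate(f.subs(t, tau) * g.subs(tp, tau), (tau, t, tp))

A = sp.Matrix([[0, 1], [tp, 0]])   # smooth part of A(t',t); Theta factor implicit
v = sp.Matrix([1, 0])              # v = w = e_1 (real, so conjugation is omitted)
w = sp.Matrix([1, 0])

alpha0 = (w.T * A * v)[0]          # smooth part of alpha_0 = w_0^H * A * v_0
vhat1 = A * v - alpha0 * v         # smooth part of vhat_1 = A * v_0 - v_0 * alpha_0
what1 = (w.T * A) - alpha0 * w.T   # smooth part of what_1^H = w_0^H * A - alpha_0 * w_0^H
beta1 = sp.simplify(sum(star(what1[i], vhat1[i]) for i in range(2)))
print(alpha0, beta1)               # alpha_0 = 0 and beta_1 = (tp**2 - t**2)/2

# Matching moment check: since alpha_0 = 0 here, the (1,1) entry of
# T_2^{*2} is beta_1, which must equal the second *-moment m2 = w^H (A*A) v.
AA = sp.Matrix(2, 2, lambda i, j: sum(star(A[i, k], A[k, j]) for k in range(2)))
m2 = sp.simplify((w.T * AA * v)[0])
print(sp.simplify(beta1 - m2))     # 0
```

Continuing the recurrence would require the \(*\)-inverse \(\beta _1^{*-1}\), whose construction is exactly the issue addressed in [18] and in the conditions listed above.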
Since the issue of breakdowns of the \(*\)-Lanczos algorithm is connected with the behavior of (usual) Lanczos techniques, we proceed as is common when working with the non-Hermitian Lanczos algorithm. Thus, from now on, we assume all \(\beta _n\) to be \(*\)-invertible, while we come back to the issue of breakdowns in Sect. 3.2.
The \(*\)-orthogonalization process described above defines the \(*\)-Lanczos algorithm (Algorithm 1).
Let us define the tridiagonal matrix
and the matrices \(\mathsf {V}_n:=[\varvec{v}_0, \dots , \varvec{v}_{n-1}]\) and \(\mathsf {W}_n := [\varvec{w}_0, \dots , \varvec{w}_{n-1}]\). Then the three-term recurrences Eqs. (2.10) read, in matrix form,
Hence the tridiagonal matrix (2.12) can be expressed as
The following property of \(\mathsf {T}_n\) is fundamental for the time-ordered exponential approximation; its proof is given in Sect. 2.3.
Theorem 2.1
(Matching Moment Property) Let \(\mathsf {A},\varvec{w},\varvec{v}\) and \(\mathsf {T}_n\) be as described above, then
Consider the time-ordered exponential \(\mathsf {U}_n\) given by the differential equation
Theorem 2.1 and Eq. (2.6) justify the use of the approximation \(\varvec{w}^H\mathsf {U}(t',t)\,\varvec{v} \approx \varvec{e}_1^H\mathsf {U}_n(t',t)\,\varvec{e}_1\) [17]. The system (2.16) can be seen as a reduced-order model of the initial differential equation (1.1) from two points of view. First, n may be much smaller than the size of \(\mathsf {A}\); in this sense, in Sect. 3, we will discuss the convergence behavior of the approximation using Theorem 2.1. Second, looking at \(\mathsf {A}\) and \(\mathsf {T}_n\) as adjacency matrices, \(\mathsf {A}\) may correspond to a graph with a complex structure, while \(\mathsf {T}_n\) corresponds to a very simple graph composed of one path with possible self-loops. Then the path-sum method gives
see [26, 44]. This expression is analogous to the one for the first diagonal entry of the inverse of an ordinary tridiagonal matrix [45], except here all operations are taken with respect to the \(*\)-product.
For \(n=N\), we get
Theorem 2.2
Let \(\mathsf {A},\mathsf {V}_N,\mathsf {W}_N\) and \(\mathsf {T}_N\) be as described above, then
and thus
The theorem follows by using (2.18). Here, any entry of \(\mathsf {R}_*(\mathsf {T}_N)\) is computable using a path-sum continued fraction of depth at most N.
Remark 2.3
The Lanczos-like method presented here for the time-ordered exponential is immediately valid for the ordinary matrix exponential function, since the latter is obtained from the former in the situation where \(\mathsf {A}\) commutes with itself at all times,
$$\begin{aligned} \mathsf {U}(t',t)=\exp \left( \int _t^{t'}\mathsf {A}(\tau )\,\text {d}\tau \right) . \end{aligned}$$
This situation includes the case where \(\mathsf {A}\) is time-independent, in which case setting \(t=0\) and \(t'=1\) above yields the matrix exponential of \(\mathsf {A}\). However, the \(*\)-Lanczos algorithm cannot be considered a generalization of the Lanczos algorithm since its outputs on constant matrices are made of distributions and time-dependent functions.
2.3 Matching \(*\)-moments through \(*\)-biorthonormal polynomials
In order to prove Theorem 2.1, we will exploit the connection between the \(*\)-Lanczos algorithm and families of \(*\)-biorthonormal polynomials. Let us define the set of \(*\)-polynomials
with \(\gamma _j(t',t) \in \text {D}(I)\). Consider a \(*\)-sesquilinear form \([\cdot , \cdot ]: \mathcal {P}_* \times \mathcal {P}_* \rightarrow \text {D}(I)\), i.e., so that given \(p_1,p_2,q_1,q_2 \in \mathcal {P}_*\) and \(\alpha , \beta \in \text {D}(I)\), it satisfies
From now on we assume that every considered \(*\)-sesquilinear form \([\cdot , \cdot ]\) also satisfies
The \(*\)-sesquilinear form \([\cdot ,\cdot ]\) is determined by its \(*\)-moments defined as
We aim to build sequences of \(*\)-polynomials \(p_0, p_1, \dots \) and \(q_0, q_1, \dots \) so that they are \(*\)-biorthonormal with respect to \([\cdot ,\cdot ]\), i.e.,
where the subindex j in \(p_j\) and \(q_j\) corresponds to the degree of the \(*\)-polynomial. Here and in the following we assume \(m_0 = 1_*\), getting \(p_0 = q_0 = 1_*\). Consider the \(*\)-polynomial
The orthogonality conditions (2.20) give \(\alpha _0 = [\lambda *q_0, p_0]\). Similarly, we get the \(*\)-polynomial
with \(\alpha _0 = [q_0, \lambda *p_0], \beta _1 = [q_1, \lambda *p_0]\). Repeating the \(*\)-orthogonalization process, we obtain the three-term recurrences for \(n=1,2,\dots \)
with \(p_{-1} = q_{-1} = 0\) and
Note that deriving the recurrences needs property (2.19). The \(*\)-biorthonormal polynomials \(p_0, \dots , p_n\) and \(q_0, \dots , q_n\) exist under the assumption that \(\beta _1, \dots , \beta _n\) are \(*\)-invertible.
Let \(\mathsf {A}\) be a time-dependent matrix, and \(\varvec{w},\varvec{v}\) time-independent vectors such that \(\varvec{w}^H\varvec{v} = 1\). Consider the \(*\)-sesquilinear form \([\cdot , \cdot ]\) defined by
Assume that there exist \(*\)-polynomials \(p_0, \dots , p_n\) and \(q_0,\dots ,q_n\) \(*\)-biorthonormal with respect to \([\cdot , \cdot ]\). Defining the vectors
and using the recurrences (2.21) gives the \(*\)-Lanczos recurrences (2.10). Moreover, the coefficients in (2.22) are the \(*\)-Lanczos coefficients in (2.11).
Let \(\mathsf {T}_n\) be a tridiagonal matrix as in (2.12) composed of the coefficients (2.22) associated with the \(*\)-sesquilinear form \([\cdot ,\cdot ]\). Then we can define the \(*\)-sesquilinear form
The following lemmas show that
proving Theorem 2.1.
Lemma 2.4
Let \(p_0,\dots ,p_n\) and \(q_0,\dots ,q_n\) be \(*\)-biorthonormal polynomials with respect to the \(*\)-sesquilinear form \([\cdot ,\cdot ]\). Assume that the coefficients \(\beta _1,\dots , \beta _n\) in the related recurrences (2.21) are \(*\)-invertible. Then the \(*\)-polynomials are also \(*\)-biorthonormal with respect to the form \([\cdot ,\cdot ]_n\) defined above.
Proof
Consider the vectors \(\varvec{y}_j^H = \varvec{e}_1^H \, \mathsf {T}_n^{*j}\) and \(\varvec{x}_j= \mathsf {T}_n^{*j} \, \varvec{e}_1\). Since the matrix \(\mathsf {T}_n\) is tridiagonal, for \(j=1,\dots ,n-1\), we have
By assumption, the product \(\beta _j * \cdots * \beta _1\) is \(*\)-invertible. Therefore there exist \(*\)-polynomials \(\widehat{p}_0,\dots ,\widehat{p}_{n-1}\) and \(\widehat{q}_0,\dots ,\widehat{q}_{n-1}\) so that, for \(i=0,\dots ,n-1\), we get
Such \(*\)-polynomials are \(*\)-biorthonormal with respect to \([\cdot ,\cdot ]_n\) since they satisfy
Moreover, for \(i=0,\dots ,n-1\), the corresponding recurrence coefficients (2.22) are the same as the ones of the \(*\)-polynomials \(p_0,\dots ,p_{n-1}\) and \(q_0,\dots ,q_{n-1}\). Indeed,
Since \(\widehat{p}_0 = p_0 = \widehat{q}_0 = q_0 = 1_*\), we get \(\widehat{p}_i = p_i\) and \(\widehat{q}_i = q_i\) for \(i=0,\dots ,n-1\). \(\square \)
Lemma 2.5
Let \(p_0,\dots ,p_{n-1}\) and \(q_0,\dots ,q_{n-1}\) be \(*\)-biorthonormal polynomials with respect to a \(*\)-sesquilinear form \([\cdot ,\cdot ]_A\) and to a \(*\)-sesquilinear form \([\cdot ,\cdot ]_B\). If \([1_*,1_*]_A = [1_*,1_*]_B = 1_*\), then \([\lambda ^{*j},1_*]_A = [\lambda ^{*j},1_*]_B\) for \(j=0,\dots ,2n-1\).
Proof
We prove it by induction. Let \(m_j = [\lambda ^{*j},1_*]_A\) and \(\widehat{m}_j = [\lambda ^{*j},1_*]_B\) for \(j=0,1,\dots , 2n-1\). By the expression for the coefficients in (2.22) we get
Hence \(m_1 = \alpha _0 = \widehat{m}_1\). Assuming \(m_j = \widehat{m}_j\) for \(j=0,\dots ,2k-3\) we will prove that \(m_{2k-2} = \widehat{m}_{2k-2}\) and \(m_{2k-1} = \widehat{m}_{2k-1}\), for \(k = 2,\dots ,n\). The coefficient expressions in (2.22) give
which can be rewritten as
with \(a_i, b_j\) the coefficients respectively of \(q_{k-1}\) and \(p_{k-2}\). The inductive assumption implies
The leading coefficients of the \(*\)-polynomials \(q_{k-1}\) and \(p_{k-2}\) are respectively \(a_{k-1}= 1_*\) and \(b_{k-2}=(\beta _{k-2} * \cdots *\beta _{1})^{*-1}\). Hence \(m_{2k-2} = \widehat{m}_{2k-2}\). Repeating the same argument with the coefficient \(\alpha _{k-1}\) in (2.22) concludes the proof. \(\square \)
3 Convergence, breakdown, and related properties
3.1 The convergence behavior of intermediate approximations
Assuming no breakdown, the \(*\)-Lanczos algorithm in conjunction with the path-sum method converges to the solution \(\varvec{w}^H\mathsf {U}(t',t)\varvec{v}\) in N iterations, with N the size of \(\mathsf {A}\); see Theorem 2.2. Most importantly, intermediate \(*\)-Lanczos iterations provide a sequence of approximations \(\int _t^{t'} \mathsf {R}_{*}(\mathsf {T}_n)_{1,1}(\tau ,t)\, \text {d}\tau \), \(n=1,\dots ,N\), whose convergence behavior we analyze hereafter.
As already discussed, under the assumption that all entries of \(\mathsf {A}\) are smooth over I and that \(\beta _j^{(1,0)}(t,t)\ne 0\) for every \(t \in I\), all the \(\alpha _j\) and \(\beta _j\) distributions are elements of \(\text {Sm}_\Theta (I)\). The proof of this statement is very long and technical and serves only to establish the theoretical feasibility of tridiagonalization for systems of coupled linear differential equations with variable coefficients using smooth functions. It was therefore presented in a separate work. We refer the reader to [17] for a full exposition and proof, while here we only state some of the main results:
Theorem 3.1
[17] Let \(\mathsf {A}(t',t)= \widetilde{\mathsf {A}}(t')\Theta (t'-t)\) be an \(N\times N\) matrix composed of elements from \(\text {Sm}_\Theta (I)\). Let \(\alpha _{n-1}\) and \(\beta _n\) be the coefficients generated by Algorithm 1 running on \(\mathsf {A}\) and the time-independent vectors \(\varvec{w}, \varvec{v}\) (\(\varvec{w}^H \varvec{v}=1\)). Let
For any \(1 \le n \le N\), assuming that \(\beta _j^{(1,0)}(t,t) \ne 0\) for every \(t \in I\), \(j=1,\dots ,n-1\), we get \(\beta _n(t',t), \alpha _{n-1}(t',t) \in \text {Sm}_\Theta (I)\). In addition, all required \(*\)-inverses \(\beta _{j}^{*-1}\) exist and are of the form
with \(b_j^{L}\in \text {Sm}_\Theta (I)\).
All \(b_j^{L}\) have explicit expansions in terms of \(\beta _j\) which are given in [17] but not reproduced here owing to length concerns. When \(\beta _n \not \equiv 0\), but \(\beta _n^{(1,0)}(\rho ,\rho )=0\) for some \(\rho \in I\), the results of Theorem 3.1 hold if we restrict the initial interval I to a subinterval \(J \subset I\) so that \(\rho \notin J\).
Thanks to these regularity results, we can establish the following bound for the approximation error:
Proposition 3.2
Let us consider the setting and assumptions of Theorem 3.1. Moreover, let \(\mathsf {U}\) designate the time-ordered exponential of \(\mathsf {A}\) and let \(\mathsf {T}_n\) be the tridiagonal matrix (2.12) such that
Then, for \(t'\ge t\), with \(t', t \in I\),
Here
are both finite, with \(\Vert \cdot \Vert _\infty \) the matrix norm induced by the uniform norm.
Proof
Assume \(t' \ge t\). Observe that
so that
Now \(\sup _{t'\in I}\vert \varvec{w}^H{\mathsf {A}}(t')\varvec{v}\vert \le C\) and
We proceed similarly for the terms involving \(\mathsf {T}_n\). Theorem 3.1 implies the existence of \(\widehat{D}_n := \sup _{t',t\in I^2}\max _{0\le j\le n-1}\big \{\vert {\alpha }_j(t',t)\vert ,\vert {\beta }_j(t',t)\vert \big \} < +\infty \). The matrix element \((\mathsf {T}_n^{*j})_{1,1}\) is given by the sum of \(*\)-products of coefficients \(\alpha _i, \beta _i\) and \(1_*\). Replacing all the factors in those \(*\)-products with \(\widehat{D}_n \Theta (t'-t)\) gives an upper bound for \((\mathsf {T}_n^{*j})_{1,1}\). Hence we get
where \(\mathsf {P}_n\) is the \(n\times n\) tridiagonal matrix whose nonzero entries are equal to 1. Note that \(\Vert \mathsf {P}_n\Vert _{\infty }=3\). Hence the error can be bounded by
concluding the proof. \(\square \)
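As a quick sanity check of the constant appearing in this proof, the following sketch builds \(\mathsf {P}_n\) and confirms \(\Vert \mathsf {P}_n\Vert _\infty = 3\) for \(n\ge 3\) (for \(n<3\) the maximum absolute row sum is smaller, so 3 remains a valid bound):

```python
import numpy as np

def tridiagonal_ones(n):
    """n x n tridiagonal matrix whose nonzero entries all equal 1."""
    P = np.zeros((n, n))
    for k in range(n):
        P[k, k] = 1.0
        if k + 1 < n:
            P[k, k + 1] = 1.0
            P[k + 1, k] = 1.0
    return P

# The infinity norm is the maximum absolute row sum; interior rows
# contain three ones (sub-, main, and super-diagonal entries).
for n in range(3, 8):
    assert np.linalg.norm(tridiagonal_ones(n), np.inf) == 3.0
```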
Analogously to the classical non-Hermitian Lanczos algorithm, to obtain a meaningful bound we need the further assumption that \(D_n\) is not too large for \(n\ge 1\). Such an assumption can be verified a posteriori. The bound of Proposition 3.2 demonstrates that, under reasonable assumptions, the approximation error decays super-linearly. We also recall that, assuming no breakdown and exact arithmetic, the algorithm necessarily converges in at most N steps, independently of the error bound. The computational cost of the algorithm is discussed separately in Sect. 5.
3.2 Breakdown
In the classical non-Hermitian Lanczos algorithm, a breakdown appears either when an invariant Krylov subspace is produced (lucky breakdown) or when the last vectors of the biorthogonal bases \(\varvec{v}_n, \varvec{w}_n\) are nonzero, but \(\varvec{w}_n^H\varvec{v}_n = 0\) (serious breakdown); for further details refer, e.g., to [33, 34, 39, 46,47,48]. Analogously, in the \(*\)-Lanczos algorithm 1, a lucky breakdown arises when either \(\varvec{w}_n \equiv 0\) or \(\widehat{\varvec{v}}_n \equiv 0\). In such a case, the algorithm has converged to the solution, as the following proposition shows.
Proposition 3.3
Assume that the \(*\)-Lanczos algorithm 1 does not break down before the nth step, at which a lucky breakdown arises, i.e., \(\widehat{\varvec{v}}_n \equiv 0\) (or \(\varvec{w}_n \equiv 0\)). Then
Proof
Assuming \(\widehat{\varvec{v}}_n \equiv 0\), eq. (2.13) becomes
By \(*\)-multiplying (3.1) from the left by \(\mathsf {A}\), we get
which becomes
using (3.1). Repeating the argument gives
from which we get that \(\mathsf {W}_n^H * \mathsf {A}^{*j} \, * \mathsf {V}_n = \mathsf {T}_n^{*j}\), \(j\ge 0\). The proposition easily follows from here. An analogous proof holds for the case \(\varvec{w}_n \equiv 0\).
\(\square \)
The \(*\)-Lanczos algorithm construction and its polynomial interpretation in Sect. 2.3 suggest that it may be possible to deal with the serious breakdown issue by a look-ahead strategy analogous to the one for the non-Hermitian Lanczos algorithm; see, e.g., [33, 34]. Consider, for instance, the starting vectors \(\varvec{w} = \varvec{e}_i\) and \(\varvec{v} = \varvec{e}_j\), with \(\varvec{e}_i, \varvec{e}_j\) canonical basis vectors. If \(i\ne j\), then \(\varvec{w}^H\varvec{v} = 0\), which does not satisfy the \(*\)-Lanczos assumption. Moreover, if \(i=j\) and \(\mathsf {A}\) is a sparse non-Hermitian matrix, it may happen that \(\mathsf {A}_{ii} \equiv 0\) and \(\big (\mathsf {A}^{*2}\big )_{ii} \equiv 0\). As a consequence, we get \(\beta _1 \equiv 0\). We can try to fix these problems by rewriting the approximation of the time-ordered exponential \(\mathsf {U}\) as
with \(\varvec{e} = (1, \dots , 1)^H\). Then one can approximate \((\varvec{e} + \varvec{e}_i)^H \mathsf {U} \, \varvec{e}_j\) and \(\varvec{e}^H \mathsf {U} \, \varvec{e}_j\) separately; these quantities are less likely to suffer a breakdown thanks to the fact that \(\varvec{e}\) is a full vector; see, e.g., [35, Section 7.3]. Note that while this strategy can prevent a breakdown in the first iterations, a serious breakdown might still happen in later ones.
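The splitting just described rests on a plain linear-algebra identity, so it can be illustrated with any matrix standing in for (a time-discretized snapshot of) the propagator \(\mathsf {U}\). A minimal sketch, with hypothetical indices i, j:

```python
import numpy as np

# Stand-in for (a discretized snapshot of) the propagator U(t', t):
# the splitting identity below holds for any matrix.
rng = np.random.default_rng(0)
N, i, j = 5, 2, 4
U = rng.standard_normal((N, N))

e = np.ones(N)        # full vector e = (1, ..., 1)^H
e_i = np.eye(N)[i]    # canonical basis vectors
e_j = np.eye(N)[j]

# e_i^H U e_j = (e + e_i)^H U e_j - e^H U e_j, so each term on the right
# can be approximated by a separate run started from a full vector,
# which is less breakdown-prone.
lhs = e_i @ U @ e_j
rhs = (e + e_i) @ U @ e_j - e @ U @ e_j
assert np.isclose(lhs, rhs)
```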
4 Examples
In this section, we use the \(*\)-Lanczos algorithm 1 on examples in ascending order of difficulty. All the computations have been performed using Mathematica 11.
Example 4.1
(Ordinary matrix exponential) Let us first consider a constant matrix
Because \(\mathsf {A}\) commutes with itself at all times, its time-ordered exponential coincides with its ordinary exponential, \(\mathcal {T}e^{\int \mathsf {A}(\tau )\, \text {d}\tau }\equiv e^{\mathsf {A}(t')}\) (we set \(t=0\)). Note that the matrix chosen here is symmetric only so as to keep the resulting expressions concise enough for presentation, e.g.,
and that such symmetries are not a requirement of the \(*\)-Lanczos approach.
Now let us find the result of Eq. (4.1) with Algorithm 1. We define \(\varvec{w}:=\varvec{v}^H:=(1,0,0)\), \(\varvec{w}_0=\varvec{w}1_*\), \(\varvec{v}_0=\varvec{v}1_*\), from which it follows that \(\alpha _0(t',t) = -1\times \Theta (t'-t)\) and \(\varvec{w}_1=\widehat{\varvec{v}}_1^H=(0,1,1)\Theta (t'-t)\). Furthermore, since \(\mathsf {A}\) is a constant matrix times \(\Theta (t'-t)\), we have \(\mathsf {A}^{*n} = \tilde{\mathsf {A}}^n \times \Theta (t'-t)^{*n} = \tilde{\mathsf {A}}^n\times (t'-t)^{n-1}/(n-1)!\times \Theta (t'-t)\) and similarly \(\alpha _0^{*2}(t',t)=\tilde{\alpha }_0^2\times (t'-t)\Theta (t'-t)\). Thus
The \(*\)-inverse follows as \(\beta _1^{*-1}=\frac{1}{2} \delta ''\) [18], from which we get
Now it follows that
Then \(\beta _2^{*-1}=4 \delta ''\) and so
At this point we have determined the \(*\)-Lanczos matrices \(\mathsf {T}\), \(\mathsf {V}\) and \(\mathsf {W}\) entirely
In all of these expressions, \(\Theta \) is a short-hand notation for \(\Theta (t'-t)\), and \(\delta \), \(\delta '\) and \(\delta ''\) are to be evaluated in \(t'-t\). It is now straightforward to verify the matching moment property \(\big (\mathsf {T}^{*j}\big )_{11}=\big (\mathsf {A}^{*j}\big )_{11}\) for all \(j\in \mathbb {N}\). We can also check directly that the time-ordered exponential of \(\mathsf {A}\) is correctly determined from \(\mathsf {T}\): either using the general formula of Eq. (2.17) or, because the situation is simple enough that all entries depend only on \(t'-t\), using a Laplace transform with respect to \(t'-t\). The latter gives \(\mathsf {T}(s)\), and the inverse Laplace transform of the resolvent entry \(\big (\mathsf {I}-\mathsf {T}(s)\big )^{-1}_{11}\) is the desired quantity. Both procedures give the same result, namely the derivative of \(e^{\mathsf {A}t}\), as it should be [26], i.e.,
which is indeed the derivative of Eq. (4.1).
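The Laplace-transform route used in this example can be reproduced symbolically. The sketch below uses a hypothetical constant symmetric matrix \(\tilde{\mathsf {A}}\) (chosen purely for illustration; it is not the matrix of the example above) and checks that the resolvent entry \(\big (\mathsf {I}-\tilde{\mathsf {A}}/s\big )^{-1}_{11}\) equals the Laplace transform of \(\delta \) (the constant 1) plus that of the derivative of \(\big (e^{\tilde{\mathsf {A}}\tau }\big )_{11}\):

```python
import sympy as sp

s, tau = sp.symbols("s tau", positive=True)

# Hypothetical constant matrix (NOT the one from the example above).
A = sp.Matrix([[0, 1], [1, 0]])

# The Laplace transform of A(t',t) = A~ Theta(t'-t) w.r.t. t'-t is A~/s,
# so the *-resolvent entry becomes an ordinary resolvent entry:
R11 = sp.simplify(((sp.eye(2) - A / s).inv())[0, 0])   # s**2/(s**2 - 1)

# (e^{A~ tau})_{11} = cosh(tau); its derivative is sinh(tau).
deriv11 = sp.diff((A * tau).exp()[0, 0], tau)
L_deriv = sp.laplace_transform(deriv11, tau, s, noconds=True)

# R11 = 1 + L[sinh]: the leftover constant 1 is the transform of delta.
assert sp.simplify(R11 - 1 - L_deriv) == 0
```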
Example 4.2
(Time-ordered exponential of a time-dependent matrix) In this example, we consider the \(5\times 5\) time-dependent matrix \(\mathsf {A}(t',t)=\tilde{\mathsf {A}}(t')\Theta (t'-t)\) with
The matrix \(\tilde{\mathsf {A}}\) does not commute with itself at different times \(\tilde{\mathsf {A}}(t')\tilde{\mathsf {A}}(t)-\tilde{\mathsf {A}}(t)\tilde{\mathsf {A}}(t')\ne 0\), and the corresponding differential system Eq. (1.1) has no known analytical solution. We use Algorithm 1 to determine the tridiagonal matching moment matrix \(\mathsf {T}\) such that \(\big (\mathsf {A}^{*j}\big )_{11}=\big (\mathsf {T}^{*j}\big )_{11}\) for \(j \in \mathbb {N}\). We define \(\varvec{w}:=\varvec{v}^H:=(1,0,0,0,0)\), \(\varvec{w}_0=\varvec{w}1_*\), \(\varvec{v}_0=\varvec{v}1_*\), from which it follows that
Observing that \(\beta _1 = \Theta (t'-t)*t'\Theta (t'-t)\), we get \(\beta _1^{*-1} = \frac{1}{t} \delta '(t'-t)*\delta '(t'-t)=-\frac{1}{t^2}\delta '(t'-t)+\frac{1}{t}\delta ''(t'-t)\), so that
which terminates the initialization phase of the Algorithm. We proceed with
As we did for \(\beta _1\), we factorize \(\beta _2 = \Theta (t'-t) *t\, \Theta (t'-t)\) so that its \(*\)-inverse is \(\beta _2^{*-1} = \frac{1}{t'} \delta '(t'-t)*\delta '(t'-t)=\frac{1}{t'}\delta ''\). Then
Continuing in this fashion yields the tridiagonal output matrix \(\mathsf {T}_5\equiv \mathsf {T}\),
with
and the bases matrices
In all of these expressions, \(\Theta \) and \(\delta ^{(n)}\) are short-hand notations respectively for \(\Theta (t'-t)\) and \(\delta ^{(n)}(t'-t)\). All the required \(\beta _j^{*-1}\) were calculated using the strategies described in [18], getting the factorized \(*\)-inverses
We have also verified that \(\big (\mathsf {A}^{*j}\big )_{11}=\big (\mathsf {T}^{*j}\big )_{11}\) holds for j up to 9. The \(*\)-resolvent of \(\mathsf {T}\) has no known closed-form expression; its Neumann series likely converges to a hitherto undefined special function. Ultimately, such difficulties are connected with the propensity of systems of coupled linear ordinary differential equations with non-constant coefficients to produce transcendental solutions.
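The \(*\)-products appearing in these examples can be evaluated symbolically. The sketch below assumes the convention \((f*g)(t',t)=\int _t^{t'} f(t',\tau )g(\tau ,t)\, \text {d}\tau \) for the smooth parts of \(\Theta \)-class functions, and checks both the formula \(\Theta ^{*n} = (t'-t)^{n-1}/(n-1)!\, \Theta \) and the smooth parts of the factorizations \(\Theta *t'\,\Theta \) and \(\Theta *t\, \Theta \) used above:

```python
import sympy as sp

tp, t, tau = sp.symbols("tprime t tau", real=True)

# Smooth parts of Theta-class functions f(t',t)Theta(t'-t); on the
# support t <= t' their *-product is an ordinary integral.
def star(f, g):
    return sp.integrate(f.subs(t, tau) * g.subs(tp, tau), (tau, t, tp))

# Theta^{*n} = (t'-t)^{n-1}/(n-1)! Theta(t'-t) for the first few n.
power = sp.Integer(1)          # smooth part of Theta^{*1}
for n in range(2, 6):
    power = star(power, sp.Integer(1))
    assert sp.simplify(power - (tp - t)**(n - 1) / sp.factorial(n - 1)) == 0

# Smooth parts of the factorizations used in Example 4.2:
beta1 = star(sp.Integer(1), tp)   # Theta * t' Theta -> (t'^2 - t^2)/2
beta2 = star(sp.Integer(1), t)    # Theta * t  Theta -> t (t' - t)
assert sp.simplify(beta1 - (tp**2 - t**2) / 2) == 0
assert sp.simplify(beta2 - t * (tp - t)) == 0
```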
5 Outlook: numerical implementation
We do not expect closed forms to exist in most cases for the entries of time-ordered matrix exponentials, as these can involve complicated special functions [2]. Moreover, the algorithm must be able to treat very large matrices \(\mathsf {A}(t')\) to be relevant to most applications. For these reasons, it is fundamental to implement the \(*\)-Lanczos algorithm numerically, e.g., using time-discretization approximations.
As shown in [26], there exists an isometry \(\Phi \) between the algebra of distributions of \(\text {D}(I)\) equipped with the \(*\)-product and the algebra of time-continuous operators (for which the time variables \(t'\) and t serve as row and column indices). Consider, for simplicity, a discretization of the interval I with constant time step \(\Delta t\); then these operators become ordinary matrices. Specifically, given \(f, g \in \text {Sm}_\Theta (I)\), their discretization counterparts are the lower triangular matrices \(\mathsf {F}, \mathsf {G}\). Moreover, the function \(f *g\) corresponds to the matrix \(\mathsf {F} \, \mathsf {G} \, \Delta t\), with the usual matrix product. In other terms, the isometry \(\Phi \) followed by a time discretization sends the \(*\)-product to the ordinary matrix product times \(\Delta t\). Similarly, the Dirac delta distribution is sent to the identity matrix times \(1/(\Delta t)\), the kth Dirac delta derivative \(\delta ^{(k)}\) is sent to the finite difference matrix
and \(\Theta \) is sent to the matrix \( (\mathsf {M}_\Theta )_{ij}=1 \) if \(i\ge j\) and 0 otherwise. Most importantly, in this picture, the \(*\)-inverse of a function \(f(t',t)\in D(I)\) is given as \(\mathsf {F}^{-1}/(\Delta t)^2\), with \(\mathsf {F}\) the lower triangular matrix corresponding to f. Moreover, the time-discretized version of the path-sum formulation Eq. (2.17) only involves ordinary matrix resolvents. At the same time, the final integration of \(\mathsf {R}_*(\mathsf {T}_n)_{11}\) yielding \(\varvec{w}^H\mathsf {U}\,\varvec{v}\) becomes a left multiplication by \(\mathsf {M}_\Theta \). Therefore, a numerical implementation of the time-discretized \(*\)-Lanczos algorithm only requires ordinary operations on triangular matrices.
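A minimal numerical sketch of this discretization, assuming a uniform grid and the simple sampling choice \(\mathsf {F}_{ij} = f(t_i, t_j)\) (the \(O(\Delta t)\) quadrature error is inherent to this choice):

```python
import numpy as np

# Functions f(t',t) in Sm_Theta(I) become lower triangular matrices
# F_{ij} = f(t_i, t_j); the *-product becomes F @ G * dt.
Nt = 400
T = 1.0
dt = T / Nt
grid = np.linspace(dt, T, Nt)            # uniform grid (one simple choice)

# Theta(t'-t) -> lower triangular matrix of ones.
M_theta = np.tril(np.ones((Nt, Nt)))

# Discretized *-product: Theta * Theta should equal (t'-t) Theta(t'-t)
# up to the O(dt) quadrature error.
theta_star_theta = M_theta @ M_theta * dt
exact = np.tril(grid[:, None] - grid[None, :])
assert np.max(np.abs(theta_star_theta - exact)) <= dt + 1e-12

# The *-inverse of f corresponds to F^{-1} / dt**2: then
# F @ (F^{-1} / dt**2) * dt = I / dt, the discretization of delta.
delta_matrix = M_theta @ (np.linalg.inv(M_theta) / dt**2) * dt
assert np.allclose(delta_matrix, np.eye(Nt) / dt)
```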
We can now meaningfully evaluate the numerical cost of the time-discretized version of the algorithm. Let \(N_t\) be the number of time subintervals in the discretization of I for both the t and \(t'\) time variables. Then time-discretized \(*\)-multiplications or \(*\)-inversions cost \(O(N_t^3)\) operations. Considering a sparse time-dependent matrix \(\mathsf {A}(t)\) with \(N_{nz}\) nonzero elements, the \(*\)-Lanczos algorithm therefore requires \(O(N_i\times N_t^3\times N_{nz})\) operations to obtain the desired \(\varvec{w}^H\mathsf {U}\, \varvec{v}\), where \(N_i\) is the number of iterations needed to reach a given tolerance. Unfortunately, as explained in [36], computational errors can slow down the convergence of the (usual) Lanczos algorithm. Hence, in general, we cannot assume \(N_i \approx N\), since the \(*\)-Lanczos algorithm could analogously require more iterations. However, in many cases the (usual) Lanczos algorithm needs only a few iterations to reach the tolerance, even in finite precision arithmetic. We expect the \(*\)-Lanczos algorithm to behave analogously, giving \(N_i \ll N\) in many cases. Concerning \(N_t\), there is no reason to expect that it would depend on N, since \(N_t\) controls the quality of the individual generalized functions. We also remark that the \(*\)-Lanczos algorithm can exploit the sparsity structure of the matrix \(\mathsf {A}\), making it inherently competitive for the large sparse matrices typical of applications.
Classical numerical methods (e.g., Runge–Kutta methods) for approximating the solution of the system of ODEs (1.1) are known to perform poorly in certain cases, for example for very large systems or in the presence of highly oscillatory coefficients. Consequently, in the last decades novel techniques have been sought and proposed, many of which are based on the Magnus series; see, for instance, [4, 23, 49,50,51,52,53,54,55,56]. However, for large matrices these methods are known to be computationally expensive, which motivates the search for novel approaches, in particular for large-scale problems. Here, the guaranteed convergence of the \(*\)-Lanczos algorithm in a finite number of iterations, the sequence of approximations it produces, its ability to exploit matrix sparsity, and its relation to numerically well-studied Lanczos procedures are all promising elements justifying further work on concrete numerical implementations. More precise theoretical results about the overall approximation quality, together with further issues concerning numerical applications of the present algorithm, are beyond the scope of this work; they will appear together with a practical implementation of the algorithm in a future contribution.
6 Conclusion
In this work, we constructed the \(*\)-Lanczos algorithm as a biorthogonalization process for Krylov subspaces composed of distributions with respect to the \(*\)-product, a convolution-like product. The algorithm relies on a non-commutative operation and is analogous in spirit to the non-Hermitian Lanczos algorithm. The \(*\)-Lanczos algorithm can express an element of a time-ordered exponential of size \(N \times N\) by the path-sum continued fraction (2.17). To our knowledge, such an expression is the only one composed of \(\mathcal {O}(N)\) scalar integro-differential equations. This approximation of the time-ordered exponential relies on the matching moment property proved in this paper.
The overall approach generates a controllable sequence of time-ordered exponential approximations, offers an innovative perspective of the connection between numerical linear algebra and differential calculus, and opens the door to efficient numerical algorithms for large-scale computations.
Notes
The \(\#\)P class is the class of problems equivalent to counting the number of accepting paths of a polynomial-time non-deterministic Turing machine. \(\#\)P-complete problems are at least as hard as NP-complete problems.
References
Dyson, F.J.: Divergence of perturbation theory in quantum electrodynamics. Phys. Rev. 85(4), 631–632 (1952). https://doi.org/10.1103/PhysRev.85.631
Xie, Q., Hai, W.: Analytical results for a monochromatically driven two-level system. Phys. Rev. A 82, 032117 (2010)
Hortaçsu, M.: Heun functions and some of their applications in physics. Adv. High Energy Phys. 2018, 8621573 (2018)
Blanes, S., Casas, F., Oteo, J.A., Ros, J.: The Magnus expansion and some of its applications. Phys. Rep. 470(5), 151–238 (2009)
Autler, S.H., Townes, C.H.: Stark effect in rapidly varying fields. Phys. Rev. 100, 703–722 (1955)
Shirley, J.H.: Solution of the Schrödinger equation with a Hamiltonian periodic in time. Phys. Rev. 138, 979–987 (1965)
Lauder, M.A., Knight, P.L., Greenland, P.T.: Pulse-shape effects in intense-field laser excitation of atoms. Opt. Acta 33(10), 1231–1252 (1986)
Reid, W.T.: Riccati matrix differential equations and non-oscillation criteria for associated linear differential systems. Pac. J. Math. 13(2), 665–685 (1963)
Kwakernaak, H., Sivan, R.: Linear Optimal Control Systems, vol. 1. Wiley-interscience, New York (1972)
Corless, M., Frazho, A.: Linear Systems and Control: An Operator Perspective. Pure and Applied Mathematics. Marcel Dekker, New York (2003)
Blanes, S.: High order structure preserving explicit methods for solving linear-quadratic optimal control problems. Numer. Algor. 69(2), 271–290 (2015)
Benner, P., Cohen, A., Ohlberger, M., Willcox, K.: Model Reduction and Approximation: Theory and Algorithms. Computational Science and Engineering. SIAM, Philadelphia (2017)
Kučera, V.: A review of the matrix Riccati equation. Kybernetika 9(1), 42–61 (1973)
Abou-Kandil, H., Freiling, G., Ionescu, V., Jank, G.: Matrix Riccati Equations in Control and Systems. Theory Systems & Control: Foundations & Applications. Birkhäuser, Basel (2003)
Hached, M., Jbilou, K.: Numerical solutions to large-scale differential Lyapunov matrix equations. Numer. Algor. 79(3), 741–757 (2018)
Kirsten, G., Simoncini, V.: Order reduction methods for solving large-scale differential matrix Riccati equations. SIAM J. Sci. Comput. 42(4), 2182–2205 (2020). https://doi.org/10.1137/19m1264217
Giscard, P.-L., Pozza, S.: Tridiagonalization of systems of coupled linear differential equations with variable coefficients by a Lanczos-like method. Linear Algebra Appl. 624, 153–173 (2021). https://doi.org/10.1016/j.laa.2021.04.011
Giscard, P.-L., Pozza, S.: Lanczos-like algorithm for the time-ordered exponential: the \(\ast \)-inverse problem. Appl. Math. 65(6), 807–827 (2020). https://doi.org/10.21136/am.2020.0342-19
Magnus, W.: On the exponential solution of differential equations for a linear operator. Comm. Pure Appl. Math. 7(4), 649–673 (1954)
Casas, F.: Sufficient conditions for the convergence of the Magnus expansion. J. Phys. A 40(50), 15001–15017 (2007). https://doi.org/10.1088/1751-8113/40/50/006
Fel’dman, E.B.: On the convergence of the Magnus expansion for spin systems in periodic magnetic fields. Phys. Lett. 104A(9), 479–481 (1984)
Maricq, M.M.: Convergence of the Magnus expansion for time dependent two level systems. J. Chem. Phys. 86(10), 5647–5651 (1987)
Iserles, A., Munthe-Kaas, H.Z., Nørsett, S.P., Zanna, A.: Lie-group methods. Acta Numer. 9, 215–365 (2000). https://doi.org/10.1017/S0962492900002154
Moan, P.C., Niesen, J.: Convergence of the Magnus series. Found. Comput. Math. 8(3), 291–301 (2007). https://doi.org/10.1007/s10208-007-9010-0
Sánchez, S., Casas, F., Fernández, A.: New analytic approximations based on the Magnus expansion. J. Math. Chem. 49(8), 1741–1758 (2011)
Giscard, P.-L., Lui, K., Thwaite, S.J., Jaksch, D.: An exact formulation of the time-ordered exponential using path-sums. J. Math. Phys. 56(5), 053503 (2015)
Balasubramanian, V., DeCross, M., Kar, A., Parrikar, O.: Quantum complexity of time evolution with chaotic Hamiltonians. J. High Energ. Phys. 134 (2020)
Giscard, P.-L., Bonhomme, C.: Dynamics of quantum systems driven by time-varying Hamiltonians: solution for the Bloch–Siegert Hamiltonian and applications to NMR. Phys. Rev. Res. (2020). https://doi.org/10.1103/PhysRevResearch.2.023081
Flum, J., Grohe, M.: The parameterized complexity of counting problems. SIAM J. Comput. 33, 892–922 (2004)
Moler, C., Van Loan, C.: Nineteen dubious ways to compute the exponential of a matrix. SIAM Rev. 20(4), 801–836 (1978)
Moler, C., Van Loan, C.: Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev. 45(1), 3–49 (2003)
Higham, N.J.: Functions of Matrices. Theory and Computation, p. 425. SIAM, Philadelphia (2008)
Gutknecht, M.H.: A completed theory of the unsymmetric Lanczos process and related algorithms. I. SIAM J. Matrix Anal. Appl. 13(2), 594–639 (1992)
Gutknecht, M.H.: A completed theory of the unsymmetric Lanczos process and related algorithms. II. SIAM J. Matrix Anal. Appl. 15(1), 15–58 (1994)
Golub, G.H., Meurant, G.: Matrices, Moments and Quadrature with Applications. Princeton Ser. Appl. Math. Princeton University Press, Princeton, p. 363 (2010)
Liesen, J., Strakoš, Z.: Krylov Subspace Methods: Principles and Analysis. Numer. Math. Sci. Comput. Oxford University Press, Oxford (2013)
Pozza, S., Pranić, M.S., Strakoš, Z.: Gauss quadrature for quasi-definite linear functionals. IMA J. Numer. Anal. 37(3), 1468–1495 (2017)
Pozza, S., Pranić, M.S., Strakoš, Z.: The Lanczos algorithm and complex Gauss quadrature. Electron. Trans. Numer. Anal. 50, 1–19 (2018)
Parlett, B.N.: Reduction to tridiagonal form and minimal realizations. SIAM J. Matrix Anal. Appl. 13(2), 567–593 (1992)
Draux, A.: Polynômes Orthogonaux Formels. Lecture Notes in Math., vol. 974. Springer, Berlin, p. 625 (1983)
Schwartz, L.: Théorie Des Distributions, Nouvelle édition, entièrement corrigée, refondue et augmentée. Hermann, Paris (1978)
Halperin, I., Schwartz, L.: Introduction to the Theory of Distributions. University of Toronto Press, Toronto (2019). https://doi.org/10.3138/9781442615151. https://toronto.degruyter.com/view/title/550976
Volterra, V., Pérès, J.: Leçons sur la Composition et les Fonctions Permutables. Éditions Jacques Gabay, Paris (1928)
Giscard, P.-L., Thwaite, S.J., Jaksch, D.: Walk-sums, continued fractions and unique factorisation on digraphs (2012). arXiv:1202.5523 [cs.DM]
Kılıç, E.: Explicit formula for the inverse of a tridiagonal matrix by backward continued fractions. Appl. Math. Comput. 197(1), 345–357 (2008)
Taylor, D.R.: Analysis of the look ahead Lanczos algorithm. PhD thesis, University of California, Berkeley (1982)
Parlett, B.N., Taylor, D.R., Liu, Z.A.: A look-ahead Lanczos algorithm for unsymmetric matrices. Math. Comp. 44(169), 105–124 (1985)
Freund, R.W., Gutknecht, M.H., Nachtigal, N.M.: An implementation of the look-ahead Lanczos algorithm for non-Hermitian matrices. SIAM J. Sci. Comput. 14, 137–158 (1993)
Hochbruck, M., Lubich, C.: Exponential integrators for quantum-classical molecular dynamics. BIT Numer. Math. 39(4), 620–645 (1999). https://doi.org/10.1023/A:1022335122807
Budd, C.J., Iserles, A., Iserles, A., Nørsett, S.P.: On the solution of linear differential equations in lie groups. Philos. Trans. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 357(1754), 983–1019 (1999). https://doi.org/10.1098/rsta.1999.0362
Iserles, A.: On the global error of discretization methods for highly-oscillatory ordinary differential equations. BIT Numer. Math. 42(3), 561–599 (2002). https://doi.org/10.1023/A:1022049814688
Iserles, A.: On the method of Neumann series for highly oscillatory equations. BIT Numer. Math. 44(3), 473–488 (2004). https://doi.org/10.1023/B:BITN.0000046810.25353.95
Degani, I., Schiff, J.: RCMS: Right correction Magnus series approach for oscillatory ODEs. J. Comput. Appl. Math. 193(2), 413–436 (2006). https://doi.org/10.1016/j.cam.2005.07.001
Cohen, D., Jahnke, T., Lorenz, K., Lubich, C.: Numerical integrators for highly oscillatory Hamiltonian systems: A review. In: Mielke, A. (ed.) Analysis, Modeling and Simulation of Multiscale Problems, pp. 553–576. Springer, Berlin (2006)
Bader, P., Iserles, A., Kropielnicka, K., Singh, P.: Efficient methods for linear Schrödinger equation in the semiclassical regime with time-dependent potential. Proc. R. Soc. A Math. Phys. Eng. Sci. 472, 20150733 (2016)
Blanes, S., Casas, F.: A Concise Introduction to Geometric Numerical Integration. CRC Press, Boca Raton (2017)
Acknowledgements
We thank Francesca Arrigo, Des Higham, Jennifer Pestana, and Francesco Tudisco for their invitation to the University of Strathclyde, without which this work would not have come to fruition. The first author was supported in part by 2019 Alcohol project ANR-19-CE40-0006 and 2020 Magica project ANR-20-CE29-0007. The second author was supported by Charles University Research programs No. PRIMUS/21/SCI/009 and UNCE/SCI/023.
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Giscard, P.-L., Pozza, S.: A Lanczos-like method for non-autonomous linear ordinary differential equations. Boll. Unione Mat. Ital. 16, 81–102 (2023). https://doi.org/10.1007/s40574-022-00328-6