1 Introduction

Differential equations have extensive applications throughout mathematics, science, and engineering. Numerical methods for differential equations have been widely studied (see for example [25]), giving fast algorithms for solving them using classical computers.

Recent work has developed quantum algorithms with the potential to extract information about solutions of systems of differential equations even faster than is possible classically. This body of work grew from the quantum linear systems algorithm (QLSA) [20], which produces a quantum state proportional to the solution of a sparse system of d linear equations in time \({{\,\mathrm{poly}\,}}(\log d)\). Subsequent work improved the performance of that algorithm [1, 15] and applied it to develop similar quantum algorithms for differential equations.

To achieve this improvement, quantum algorithms must operate under different assumptions than those made for algorithms in classical numerical analysis. To represent the output using \({{\,\mathrm{poly}\,}}(\log d)\) qubits, the output is produced as a quantum state, not as an explicit description of a vector. Furthermore, to facilitate applying Hamiltonian simulation, these quantum algorithms use implicit access to the system of equations (say, through a matrix specified by a sparse Hamiltonian oracle and the ability to prepare quantum states encoding certain vectors). While these assumptions restrict the types of equations that can be addressed and the types of information that can be extracted from the final state, they nevertheless appear capable of producing useful information that cannot be efficiently computed classically.

In this paper, we focus on systems of first-order linear ordinary differential equations (ODEs). Such equations can be written in the form

$$\begin{aligned} \frac{\mathrm {d}{x(t)}}{\mathrm {d}{t}}=A(t)x(t)+f(t) \end{aligned}$$
(1.1)

where \(t \in [0,T]\) for some \(T>0\), the solution \(x(t)\in {\mathbb {C}}^d\) is a d-dimensional vector, and the system is determined by a time-dependent matrix \(A(t)\in {\mathbb {C}}^{d\times d}\) and a time-dependent inhomogeneity \(f(t)\in {\mathbb {C}}^d\). Provided A(t) and f(t) are continuous functions of t, the initial value problem (i.e., the problem of determining x(t) for a given initial condition x(0)) has a unique solution [2].

The Hamiltonian simulation problem is simply the special case of the quantum ODE problem where A is anti-Hermitian and f is zero. A substantial body of work has developed fast quantum algorithms for that special case [6,7,8,9, 11, 13, 27,28,29,30]. Hamiltonian simulation underlies the QLSA [15, 20], which in turn gives algorithms for more general differential equations.

Berry presented the first efficient quantum algorithm for general linear ODEs [5]. His algorithm represents the system of differential equations as a system of linear equations using a linear multistep method and solves that system using the QLSA. This approach achieves complexity logarithmic in the dimension d and, by using a high-order integrator, close to quadratic in the evolution time T. While this method could in principle be applied to handle time-dependent equations, the analysis of [5] only explicitly considers the time-independent case for simplicity.

Since it uses a finite difference approximation, the complexity of Berry’s algorithm as a function of the solution error \(\epsilon \) is \({{\,\mathrm{poly}\,}}(1/\epsilon )\) [5]. Reference [10] improved this to \({{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\) by using a high-precision QLSA based on linear combinations of unitaries [15] to solve a linear system that encodes a truncated Taylor series. However, this approach assumes that A(t) and f(t) are time-independent so that the solution of the ODE can be written as an explicit series, and it is unclear how to generalize the algorithm to time-dependent ODEs.

While we focus here on extending the above line of work, several other approaches have been proposed for addressing differential equations with quantum computers. Reference [26] used a quantum version of the Euler method to handle nonlinear ODEs with polynomial nonlinearities. This algorithm has complexity logarithmic in the dimension but exponential in the evolution time (as is inevitable for general nonlinear ODEs). Other work has developed quantum algorithms for partial differential equations (PDEs). Reference [16] described a quantum algorithm that applies the QLSA to implement a finite element method for Maxwell’s equations. Reference [17] applied Hamiltonian simulation to a finite difference approximation of the wave equation. Most recently, reference [3] presented a continuous-variable quantum algorithm for initial value problems with non-homogeneous linear PDEs.

Most of the aforementioned algorithms use a local approximation: they discretize the differential equations into small time intervals to obtain a system of linear equations or linear differential equations that can be solved by the QLSA or Hamiltonian simulation. For example, the central difference scheme approximates the time derivative at the point x(t) as

$$\begin{aligned} \frac{\mathrm {d}{x(t)}}{\mathrm {d}{t}}=\frac{x(t+h)-x(t-h)}{2h}+O(h^2). \end{aligned}$$
(1.2)

High-order finite difference or finite element methods can reduce the error to \(O(h^k)\), where \(k-1\) is the order of the approximation. However, when solving an equation over the interval [0, T], the number of iterations is \(T/h = \Theta (\epsilon ^{-1/k})\) for fixed k, giving a total complexity that is \({{\,\mathrm{poly}\,}}(1/\epsilon )\) even using high-precision methods for the QLSA or Hamiltonian simulation.
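
As a quick classical illustration of this scaling (our own example, not from the paper), the following snippet checks the second-order behavior of the central difference formula (1.2) on a smooth test function:

```python
import numpy as np

# Second-order convergence of the central difference formula (1.2),
# checked on f(t) = sin(t) at t = 1 (exact derivative cos(1)).
t0 = 1.0
exact = np.cos(t0)
errors = []
for h in [1e-1, 1e-2, 1e-3]:
    approx = (np.sin(t0 + h) - np.sin(t0 - h)) / (2 * h)
    errors.append(abs(approx - exact))

# Shrinking h by 10x should shrink the error by ~100x (second order).
ratios = [errors[i] / errors[i + 1] for i in range(2)]
```

Halving the error budget thus requires proportionally more time steps, which is the source of the \({{\,\mathrm{poly}\,}}(1/\epsilon )\) complexity discussed above.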

For ODEs with special structure, some prior results already show how to avoid a local approximation and thereby achieve complexity \({{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\). When A(t) is anti-Hermitian and \(f(t)=0\), we can directly apply Hamiltonian simulation [9]; if A and f are time-independent, then [10] uses a Taylor series to achieve complexity \({{\,\mathrm{poly}\,}}(\log ({1}/{\epsilon }))\). However, the case of general time-dependent linear ODEs had remained elusive.

In this paper, we use a nonlocal representation of the solution of a system of differential equations to give a new quantum algorithm with complexity \({{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\) even for time-dependent equations. While this is an exponential improvement in the dependence on \(\epsilon \) over previous work, it does not necessarily give an exponential runtime improvement in the context of an algorithm with classical output. In general, statistical error will introduce an overhead of \({{\,\mathrm{poly}\,}}(1/\epsilon )\) when attempting to measure an observable with precision \(\epsilon \). However, achieving complexity \({{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\) can result in a polynomial improvement in the overall running time. In particular, if an algorithm is used as a subroutine k times, we should ensure error O(1/k) for each subroutine to give an overall algorithm with bounded error. A subroutine with complexity \({{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\) can potentially give significant polynomial savings in such a case.

Time-dependent linear differential equations describe a wide variety of systems in science and engineering. Examples include the wave equation and the Stokes equation (i.e., creeping flow) in fluid dynamics [24], the heat equation and the Boltzmann equation in thermodynamics [19, 33], the Poisson equation and Maxwell’s equations in electromagnetism [22, 34], and of course Schrödinger’s equation in quantum mechanics. Moreover, some nonlinear differential equations can be studied by linearizing them to produce time-dependent linear equations (e.g., the linearized advection equation in fluid dynamics [12]).

We focus our discussion on first-order linear ODEs. Higher-order ODEs can be transformed into first-order ODEs by standard methods. Also, by discretizing space, PDEs with both time and space dependence can be regarded as sparse linear systems of time-dependent ODEs. Thus we focus on an equation of the form (1.1) with initial condition

$$\begin{aligned} x(0)=\gamma \end{aligned}$$
(1.3)

for some specified \(\gamma \in {\mathbb {C}}^d\). We assume that A(t) is s-sparse (i.e., has at most s nonzero entries in any row or column) for any \(t\in [0,T]\). Furthermore, we assume that A(t), f(t), and \(\gamma \) are provided by black-box subroutines (which serve as abstractions of efficient computations). In particular, following essentially the same model as in [10] (see also Section 1.1 of [15]), suppose we have an oracle \(O_A(t)\) that, for any \(t \in [0,T]\) and any given row or column specified as input, computes the locations and values of the nonzero entries of A(t) in that row or column. We also assume oracles \(O_x\) and \(O_f(t)\) that, for any \(t \in [0,T]\), prepare normalized states \(|\gamma \rangle \) and \(|f(t)\rangle \) proportional to \(\gamma \) and f(t), and that also compute \(\Vert \gamma \Vert \) and \(\Vert f(t)\Vert \), respectively. Given such a description of the instance, the goal is to produce a quantum state \(\epsilon \)-close to \(|x(T)\rangle \) (a normalized quantum state proportional to x(T)).
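
To make this input model concrete, here is a minimal classical mock of the three black boxes (the class and method names are ours, purely illustrative; in the algorithm itself these are quantum oracles acting in superposition):

```python
import numpy as np

# Minimal classical mock of the input model (class and method names are
# ours, purely illustrative): O_A returns the nonzero entries of a row of
# A(t); O_f and O_x return a normalized vector together with its norm.
class SparseODEInstance:
    def __init__(self, A_func, f_func, gamma):
        self.A_func, self.f_func = A_func, f_func
        self.gamma = np.asarray(gamma, dtype=float)

    def O_A(self, t, row):
        """Locations and values of the nonzero entries of row `row` of A(t)."""
        r = self.A_func(t)[row]
        cols = np.flatnonzero(r)
        return cols, r[cols]

    def O_f(self, t):
        """Unit vector proportional to f(t), plus ||f(t)||."""
        f = self.f_func(t)
        nrm = np.linalg.norm(f)
        return f / nrm, nrm

    def O_x(self):
        """Unit vector proportional to gamma, plus ||gamma||."""
        nrm = np.linalg.norm(self.gamma)
        return self.gamma / nrm, nrm

# Example: a 2x2 time-dependent instance.
inst = SparseODEInstance(
    A_func=lambda t: np.array([[-1.0, 0.5 * np.sin(t)], [0.0, -2.0]]),
    f_func=lambda t: np.array([np.cos(t), 0.0]),
    gamma=[3.0, 4.0],
)
cols, vals = inst.O_A(0.5, 0)
state, nrm = inst.O_x()
```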

As mentioned above, our main contribution is to implement a method that uses a global approximation of the solution. We do this by developing a quantum version of so-called spectral methods, a technique from classical numerical analysis that (approximately) represents the components of the solution \(x(t)_i \approx \sum _j c_{ij} \phi _j(t)\) as linear combinations of basis functions \(\phi _j(t)\) expressing the time dependence. Specifically, we implement a Chebyshev pseudospectral method [4, 21] using the QLSA. This approach approximates the solution by a truncated Chebyshev series with undetermined coefficients and solves for those coefficients using a linear system that interpolates the differential equations. According to the convergence theory of spectral methods, the solution error decreases exponentially provided the solution is sufficiently smooth [18, 31]. We use the LCU-based QLSA to solve this linear system with high precision [15]. To analyze the algorithm, we upper bound the solution error and condition number of the linear system and lower bound the success probability of the final measurement. Overall, we show that the total complexity of this approach is \({{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\) for general time-dependent ODEs. Informally, we show the following:

Theorem 1

(Informal). Consider a linear ODE (1.1) with given initial conditions. Assume A(t) is s-sparse and diagonalizable, and \(\mathop {\mathrm {Re}}(\lambda _i(t))\le 0\) for all eigenvalues of A(t). Then there exists a quantum algorithm that produces a state \(\epsilon \)-close in \(l_2\) norm to the exact solution, succeeding with probability \(\Omega (1)\), with query and gate complexity \(O\bigl (s\Vert A\Vert T\, {{\,\mathrm{poly}\,}}(\log (s\Vert A\Vert T/\epsilon ))\bigr )\).

In addition to initial value problems (IVPs), our approach can also address boundary value problems (BVPs). Given an oracle for preparing a state \(\alpha |x(0)\rangle +\beta |x(T)\rangle \) expressing a general boundary condition, the goal of the quantum BVP is to produce a quantum state \(\epsilon \)-close to \(|x(t)\rangle \) (a normalized state proportional to x(t)) for any desired \(t \in [0,T]\). We also give a quantum algorithm for this problem with complexity \({{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\), as follows:

Theorem 2

(Informal). Consider a linear ODE (1.1) with given boundary conditions. Assume A(t) is s-sparse and diagonalizable, and \(\mathop {\mathrm {Re}}(\lambda _i(t))\le 0\) for all eigenvalues of A(t). Then there exists a quantum algorithm that produces a state \(\epsilon \)-close in \(l_2\) norm to the exact solution, succeeding with probability \(\Omega (1)\), with query and gate complexity \(O\bigl (s\Vert A\Vert ^4T^4\, {{\,\mathrm{poly}\,}}(\log (s\Vert A\Vert T/\epsilon ))\bigr )\).

We give formal statements of Theorems 1 and  2 in Sects. 8 and 9, respectively. Note that the dependence of the complexity on \(\Vert A\Vert \) and T is worse for BVPs than for IVPs. This is because a rescaling approach that we apply for IVPs (introduced in Sect. 3) cannot be extended to BVPs.

The remainder of this paper is organized as follows. Section 2 introduces the spectral method and Sect. 3 shows how to encode it into a quantum linear system. Then Sect. 4 analyzes the exponential decrease of the solution error, Sect. 5 bounds the condition number of the linear system, Sect. 6 lower bounds the success probability of the final measurement, and Sect. 7 describes how to prepare the initial quantum state. We combine these bounds in Sect. 8 to establish the main result. We then extend the analysis for initial value problems to boundary value problems in Sect. 9. Finally, we conclude in Sect. 10 with a discussion of the results and some open problems.

2 Spectral Method

Spectral methods provide a way of solving differential equations using global approximations [18, 31]. The main idea of the approach is as follows. First, express an approximate solution as a linear combination of certain basis functions with undetermined coefficients. Second, construct a system of linear equations that such an approximate solution should satisfy. Finally, solve the linear system to determine the coefficients of the linear combination.

Spectral methods offer a flexible approach that can be adapted to different settings by careful choice of the basis functions and the linear system. A Fourier series provides an appropriate basis for periodic problems, whereas Chebyshev polynomials can be applied more generally. The linear system can be specified using Gaussian quadrature (giving a spectral element method or tau method), or one can simply interpolate the differential equations at quadrature nodes (giving a pseudospectral method) [31]. Since general linear ODEs are non-periodic, and interpolation facilitates constructing a straightforward linear system, we develop a quantum algorithm based on the Chebyshev pseudospectral method [4, 21].

In this approach, we consider a truncated Chebyshev approximation x(t) of the exact solution \({{\hat{x}}}(t)\), namely

$$\begin{aligned} x_i(t)=\sum _{k=0}^nc_{i,k}T_k(t),\quad i\in [{d}]_0:=\{0,1,\ldots ,d-1\} \end{aligned}$$
(2.1)

for any \(n \in {\mathbb {Z}}^+\). (See “Appendix A” for the definition of \(T_k(t)\) and a discussion of its properties.) The coefficients \(c_{i,k} \in {\mathbb {C}}\) for all \(i\in [{d}]_0\) and \(k\in [{n+1}]_0\) are determined by demanding that x(t) satisfies the ODE and initial conditions at a set of interpolation nodes \(\{t_l\}_{l=0}^n\) (with \(1=t_0>t_1>\cdots >t_n=-1\)), where \(x(t_0)\) and \(x(t_n)\) are the initial and final states, respectively. In other words, we require

$$\begin{aligned} \frac{\mathrm {d}{x(t_l)}}{\mathrm {d}{t}}=A(t_l)x(t_l)+f(t_l), \quad \forall \, l\in [{n+1}]_0,~ t_l\in [-1,1], \end{aligned}$$
(2.2)

and

$$\begin{aligned} x_i(t_0)=\gamma _i, \quad i\in [{d}]_0. \end{aligned}$$
(2.3)

We choose the domain \([-1,1]\) in (2.2) because this is the natural domain for Chebyshev polynomials. Correspondingly, in the following section, we rescale the domain of initial value problems to be \([-1,1]\). We would like to be able to increase the accuracy of the approximation by increasing n, so that

$$\begin{aligned} \Vert {\hat{x}}(t)-x(t)\Vert \rightarrow 0 \quad \text {as} \quad n\rightarrow \infty . \end{aligned}$$
(2.4)

There are many possible choices for the interpolation nodes. Here we use the Chebyshev-Gauss-Lobatto quadrature nodes, \(t_l=\cos \frac{l\pi }{n}\) for \(l \in [{n+1}]_0\), since these nodes achieve the highest convergence rate among all schemes with the same number of nodes [23, 25]. These nodes also have the convenient property that \(T_k(t_l)=\cos \frac{kl\pi }{n}\), making it easy to compute the values \(x_i(t_l)\).
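
The identity \(T_k(t_l)=\cos \frac{kl\pi }{n}\) follows from \(T_k(\cos \theta )=\cos k\theta \); here is a short numerical check of it on the Chebyshev-Gauss-Lobatto nodes (our own snippet, using NumPy's Chebyshev utilities):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Chebyshev-Gauss-Lobatto nodes t_l = cos(l*pi/n) and the identity
# T_k(t_l) = cos(k*l*pi/n) used in the text.
n = 8
l = np.arange(n + 1)
t = np.cos(l * np.pi / n)

for k in range(n + 1):
    Tk = C.Chebyshev([0] * k + [1])     # the degree-k Chebyshev polynomial
    assert np.allclose(Tk(t), np.cos(k * l * np.pi / n))
```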

To evaluate the condition (2.2), it is convenient to define coefficients \(c'_{i,k}\) for \(i\in [{d}]_0\) and \(k\in [{n+1}]_0\) such that

$$\begin{aligned} \frac{\mathrm {d}{x_i(t)}}{\mathrm {d}{t}}=\sum _{k=0}^n c'_{i,k}T_k(t). \end{aligned}$$
(2.5)

We can use the differential property of Chebyshev polynomials,

$$\begin{aligned} 2T_k(t)=\frac{T'_{k+1}(t)}{k+1}-\frac{T'_{k-1}(t)}{k-1}, \end{aligned}$$
(2.6)

to determine the transformation between \(c_{i,k}\) and \(c'_{i,k}\). As detailed in “Appendix A”, we have

$$\begin{aligned} c'_{i,k}=\sum _{j=0}^n[D_n]_{kj}c_{i,j},\quad i\in [{d}]_0,~k\in [{n+1}]_0, \end{aligned}$$
(2.7)

where \(D_n\) is the \((n+1) \times (n+1)\) upper triangular matrix with nonzero entries

$$\begin{aligned}{}[D_n]_{kj}=\frac{2j}{\sigma _k},\qquad k+j\text { odd},~ j>k, \end{aligned}$$
(2.8)

where

$$\begin{aligned} \sigma _k := {\left\{ \begin{array}{ll} 2 &{} k=0\\ 1 &{} k\in [{n}] := \{1,2,\ldots ,n\}. \end{array}\right. } \end{aligned}$$
(2.9)
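
The matrix \(D_n\) can be assembled directly from (2.8) and (2.9) and cross-checked against a standard Chebyshev-series derivative routine; the helper below is our own sketch:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# The (n+1)x(n+1) Chebyshev differentiation matrix D_n of Eqs. (2.8)-(2.9):
# [D_n]_{kj} = 2j/sigma_k for j > k with k + j odd, and 0 otherwise.
def chebyshev_D(n):
    D = np.zeros((n + 1, n + 1))
    for k in range(n + 1):
        sigma = 2.0 if k == 0 else 1.0
        for j in range(k + 1, n + 1):
            if (k + j) % 2 == 1:
                D[k, j] = 2.0 * j / sigma
    return D

n = 6
D = chebyshev_D(n)
c = np.random.default_rng(0).standard_normal(n + 1)   # coefficients c_{i,k}
c_prime = D @ c                                       # coefficients c'_{i,k}

# Cross-check against NumPy's Chebyshev-series derivative.
expected = np.zeros(n + 1)
expected[:n] = C.chebder(c)
assert np.allclose(c_prime, expected)
```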

Substituting (2.5), (2.7), and (2.8) into (2.2), we obtain the following linear equations:

$$\begin{aligned} \sum _{k=0}^nT_k(t_l)c'_{i,k}=\sum _{j=0}^{d-1}A_{ij} (t_l)\sum _{k=0}^nT_k(t_l)c_{j,k}+f(t_l)_i,\quad i\in [{d}]_0,~l\in [{n+1}]_0. \end{aligned}$$
(2.10)

We also demand that the Chebyshev series satisfies the initial condition \(x_i(1)=\gamma _i\) for all \(i\in [{d}]_0\). This system of linear equations gives a global approximation of the underlying system of differential equations. Instead of locally approximating the ODE at discretized times, these linear equations use the values of the differential equations at the \(n+1\) times \(\{t_l\}_{l=0}^n\) to capture their behavior over the entire interval \([-1,1]\).
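
As a classical sketch of this construction (our own toy example, solved with a dense classical solver rather than the QLSA), the following forms exactly these interpolation equations for the scalar ODE \(\mathrm {d}x/\mathrm {d}t=-x\) on \([-1,1]\) with \(x(1)=1\), whose exact solution is \(e^{1-t}\):

```python
import numpy as np

# Classical sketch of the Chebyshev pseudospectral method of Sect. 2 for the
# scalar ODE dx/dt = -x on [-1,1] with x(1) = 1 (exact solution e^{1-t}).
n = 16
l = np.arange(n + 1)
t = np.cos(l * np.pi / n)                # Gauss-Lobatto nodes, t_0 = 1
P = np.cos(np.outer(l, l) * np.pi / n)   # P[l, k] = T_k(t_l)

D = np.zeros((n + 1, n + 1))             # differentiation matrix, Eq. (2.8)
for k in range(n + 1):
    for j in range(k + 1, n + 1):
        if (k + j) % 2 == 1:
            D[k, j] = 2.0 * j / (2.0 if k == 0 else 1.0)

# Row 0 imposes x(t_0) = 1; rows 1..n impose dx/dt + x = 0 at the nodes.
M = np.empty((n + 1, n + 1))
b = np.zeros(n + 1)
M[0] = P[0]
b[0] = 1.0
M[1:] = (P @ D)[1:] + P[1:]
c = np.linalg.solve(M, b)

x_end = np.sum((-1.0) ** np.arange(n + 1) * c)   # x(-1) = sum_k (-1)^k c_k
err = abs(x_end - np.exp(2.0))
```

With only \(n=16\) coefficients the endpoint value \(x(-1)\) matches \(e^2\) to near machine precision, illustrating the exponential convergence discussed below.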

Our algorithm solves this linear system using the high-precision QLSA [15]. Given an encoding of the Chebyshev coefficients \(c_{ik}\), we can obtain the approximate solution x(t) as a suitable linear combination of the \(c_{ik}\), a computation that can also be captured within a linear system. The resulting approximate solution x(t) is close to the exact solution \({\hat{x}}(t)\):

Lemma 1

(Lemma 19 of [18]). Let \({\hat{x}}(t)\in C^{r+1}(-1,1)\) be the solution of the differential equations (1.1) and let x(t) satisfy (2.2) and (2.3) for \(\{t_l=\cos \frac{l\pi }{n}\}_{l=0}^n\). Then there is a constant C, independent of n, such that

$$\begin{aligned} \max _{t\in [-1,1]}\Vert {\hat{x}}(t)-x(t)\Vert \le C \max _{t\in [-1,1]}\frac{\Vert {\hat{x}}^{(r+1)}(t)\Vert }{n^{r-2}}. \end{aligned}$$
(2.11)

This shows that the convergence behavior of the spectral method is related to the smoothness of the solution. For a solution in \(C^{r+1}\), the spectral method approximates the solution to within error \(\epsilon \) using \(n={{\,\mathrm{poly}\,}}({1}/{\epsilon })\) terms. Furthermore, if the solution is smoother, we have an even tighter bound:

Lemma 2

(Eq. (1.8.28) of [31]). Let \({\hat{x}}(t)\in C^{\infty }(-1,1)\) be the solution of the differential equations (1.1) and let x(t) satisfy (2.2) and (2.3) for \(\{t_l=\cos \frac{l\pi }{n}\}_{l=0}^n\). Then

$$\begin{aligned} \max _{t\in [-1,1]}\Vert {\hat{x}}(t)-x(t)\Vert \le \sqrt{\frac{2}{\pi }} \max _{t\in [-1,1]}\Vert {\hat{x}}^{(n+1)}(t)\Vert \left( \frac{e}{2n}\right) ^n. \end{aligned}$$
(2.12)

For simplicity, we replace the value \(\sqrt{{2}/{\pi }}\) by the upper bound of 1 in the following analysis.

This result implies that if the solution is in \(C^{\infty }\), the spectral method approximates the solution to within \(\epsilon \) using only \(n={{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\) terms in the Chebyshev series. Consequently, this approach gives a quantum algorithm with complexity \({{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\).
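
Concretely, the factor \((e/2n)^n\) in (2.12) decays super-exponentially, so the truncation order needed for error \(\epsilon \) grows only like \(\log (1/\epsilon )/\log \log (1/\epsilon )\); a quick numerical check (our own):

```python
import math

# The error factor (e/(2n))^n from Lemma 2 decays super-exponentially,
# so reaching error eps needs only n ~ log(1/eps)/log(log(1/eps)) terms.
def factor(n):
    return (math.e / (2 * n)) ** n

ns = []
for eps in [1e-4, 1e-8, 1e-16]:
    n = 2
    while factor(n) > eps:
        n += 1
    ns.append(n)
```

Squaring the precision (from \(10^{-8}\) to \(10^{-16}\)) raises the required n only from 10 to 16, in contrast with the \({{\,\mathrm{poly}\,}}(1/\epsilon )\) growth of finite difference methods.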

3 Linear System

In this section we construct a linear system that encodes the solution of a system of differential equations via the Chebyshev pseudospectral method introduced in Sect. 2. We consider a system of linear, first-order, time-dependent ordinary differential equations, and focus on the following initial value problem:

Problem 1

In the quantum ODE problem, we are given a system of equations

$$\begin{aligned} \frac{\mathrm {d}{x(t)}}{\mathrm {d}{t}}=A(t)x(t)+f(t) \end{aligned}$$
(3.1)

where \(x(t) \in {\mathbb {C}}^d\), \(A(t) \in {\mathbb {C}}^{d\times d}\) is s-sparse, and \(f(t) \in {\mathbb {C}}^d\) for all \(t\in [0,T]\). We assume that \(A_{ij},f_i \in C^\infty (0,T)\) for all \(i,j \in [{d}]_0\). We are also given an initial condition \(x(0) = \gamma \in {\mathbb {C}}^d\). Given oracles that compute the locations and values of nonzero entries of A(t) for any t, and that prepare normalized states \(|\gamma \rangle \) proportional to \(\gamma \) and \(|f(t)\rangle \) proportional to f(t) for any \(t \in [0,T]\), the goal is to output a quantum state \(|x(T)\rangle \) that is proportional to x(T).

Without loss of generality, we rescale the interval [0, T] onto \([-1,1]\) by the linear map \(t \mapsto 1-2t/T\). Under this rescaling, we have \(\frac{\mathrm {d}}{\mathrm {d}{t}} \mapsto -\frac{T}{2}\frac{\mathrm {d}}{\mathrm {d}{t}}\), so \(A \mapsto -\frac{T}{2}A\), which can increase the spectral norm. To reduce the dependence on T—specifically, to give an algorithm with complexity close to linear in T—we divide the interval [0, T] into subintervals \([0,\Gamma _1],[\Gamma _1,\Gamma _2],\ldots ,[\Gamma _{m-1},T]\) with \(\Gamma _0:=0, \Gamma _m:=T\). Each subinterval \([\Gamma _h,\Gamma _{h+1}]\) for \(h\in [{m}]_0\) is then rescaled onto \([-1,1]\) with the linear map \(K_h :[\Gamma _h,\Gamma _{h+1}] \rightarrow [-1,1]\) defined by

$$\begin{aligned} \begin{aligned} K_h :t \mapsto 1-\frac{2(t-\Gamma _h)}{\Gamma _{h+1}-\Gamma _h}, \end{aligned} \end{aligned}$$
(3.2)

which satisfies \(K_h(\Gamma _h)=1\) and \(K_h(\Gamma _{h+1})=-1\). To solve the overall initial value problem, we simply solve the differential equations for each successive interval (as encoded into a single system of linear equations).

Now let \(\tau _h:=|\Gamma _{h+1}-\Gamma _h|\) and define

$$\begin{aligned} A_h(t)&:= -\frac{\tau _h}{2}A(K_h^{-1}(t)) \end{aligned}$$
(3.3)
$$\begin{aligned} x_h(t)&:= x(K_h^{-1}(t)) \end{aligned}$$
(3.4)
$$\begin{aligned} f_h(t)&:= -\frac{\tau _h}{2}f(K_h^{-1}(t)). \end{aligned}$$
(3.5)

Then, for each \(h\in [{m}]_0\), we have the rescaled differential equations

$$\begin{aligned} \frac{\mathrm {d}{x_h}}{\mathrm {d}{t}}=A_h(t)x_h(t)+f_h(t) \end{aligned}$$
(3.6)

for \(t\in [-1,1]\) with the initial conditions

$$\begin{aligned} x_h(1) = {\left\{ \begin{array}{ll} \gamma &{} h=0 \\ x_{h-1}(-1) &{} h\in [{m}]. \end{array}\right. } \end{aligned}$$
(3.7)

By taking

$$\begin{aligned} \tau _h \le \frac{2}{\max _{t \in [\Gamma _h,\Gamma _{h+1}]} \Vert {A(t)}\Vert } \end{aligned}$$
(3.8)

where \(\Vert {\cdot }\Vert \) denotes the spectral norm, we can ensure that \(\Vert {A_h(t)}\Vert \le 1\) for all \(t \in [-1,1]\). In particular, it suffices to take

$$\begin{aligned} \tau := \max _{h \in \{0,1,\ldots ,m-1\}} \tau _h \le \frac{2}{\max _{t \in [0,T]} \Vert {A(t)}\Vert }. \end{aligned}$$
(3.9)
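
A small classical sketch of this rescaling (equal-length subintervals are our simplifying choice, and the norm maximization is done on a grid):

```python
import numpy as np

# Sketch of the rescaling of Sect. 3: split [0, T] into m equal subintervals
# with tau <= 2 / max_t ||A(t)||, so each rescaled matrix -(tau_h/2) A(t)
# has spectral norm at most 1. (Equal subintervals and the grid search for
# the norm maximum are our simplifications.)
T = 10.0
A = lambda t: np.array([[-1.0, 0.5 * np.sin(t)], [0.0, -2.0]])

norm_max = max(np.linalg.norm(A(t), 2) for t in np.linspace(0, T, 1001))
tau = 2.0 / norm_max
m = int(np.ceil(T / tau))
Gamma = np.linspace(0.0, T, m + 1)       # Gamma_0 = 0, Gamma_m = T

norms = []
for h in range(m):
    tau_h = Gamma[h + 1] - Gamma[h]
    for t in np.linspace(Gamma[h], Gamma[h + 1], 101):
        norms.append(np.linalg.norm(-(tau_h / 2) * A(t), 2))
```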

Having rescaled the equations to use the domain \([-1,1]\), we now apply the Chebyshev pseudospectral method. Following Sect. 2, we substitute the truncated Chebyshev series of x(t) into the differential equations at the interpolation nodes \(\{t_l=\cos \frac{l\pi }{n} : l \in [{n}]\}\), giving the linear system

$$\begin{aligned} \frac{\mathrm {d}{x(t_l)}}{\mathrm {d}{t}}=A_h(t_l)x(t_l)+f_h(t_l), \quad h\in [{m}]_0,~l\in [{n}] \end{aligned}$$
(3.10)

with initial condition

$$\begin{aligned} x(t_0)=\gamma . \end{aligned}$$
(3.11)

Note that in the following, terms with \(l=0\) refer to this initial condition.

We now describe a linear system

$$\begin{aligned} L|X\rangle =|B\rangle \end{aligned}$$
(3.12)

that encodes the Chebyshev pseudospectral approximation and uses it to produce an approximation of the solution at time T.

The vector \(|X\rangle \in {\mathbb {C}}^{m+p+1} \otimes {\mathbb {C}}^d \otimes {\mathbb {C}}^{n+1}\) represents the solution in the form

$$\begin{aligned} |X\rangle =\sum _{h=0}^{m-1}\sum _{i=0}^{d-1} \sum _{l=0}^nc_{i,l}(\Gamma _{h+1})|hil\rangle +\sum _{h=m}^{m+p}\sum _{i=0}^{d-1}\sum _{l=0}^nx_i|hil\rangle \end{aligned}$$
(3.13)

where \(c_{i,l}(\Gamma _{h+1})\) are the Chebyshev series coefficients of \(x(\Gamma _{h+1})\) and \(x_i := x(\Gamma _m)_i\) is the ith component of the final state \(x(\Gamma _m)\).

The right-hand-side vector \(|B\rangle \) represents the input terms in the form

$$\begin{aligned} |B\rangle =\sum _{h=0}^{m-1}|h\rangle |B(f_h)\rangle \end{aligned}$$
(3.14)

where

$$\begin{aligned} |B(f_0)\rangle =\sum _{i=0}^{d-1}\gamma _i|i0\rangle +\sum _{i=0}^{d-1}\sum _{l=1}^nf_{0}(\cos \tfrac{l\pi }{n})_i|il\rangle ,\qquad |B(f_h)\rangle =\sum _{i=0}^{d-1}\sum _{l=1}^nf_{h}(\cos \tfrac{l\pi }{n})_i|il\rangle ,\quad h\in [{m-1}]. \end{aligned}$$
(3.15)

Here \(\gamma \) is the initial condition and \(f_{h}(\cos \frac{l\pi }{n})_i\) is the ith component of \(f_h\) at the interpolation point \(t_l=\cos \frac{l\pi }{n}\).

We decompose the matrix L in the form

$$\begin{aligned} L= & {} \sum _{h=0}^{m-1}|h\rangle \langle h|\otimes (L_1+L_2(A_h))+\sum _{h=1}^{m}|h\rangle \langle h-1|\otimes L_3\nonumber \\&+\sum _{h=m}^{m+p}|h\rangle \langle h|\otimes L_4+\sum _{h=m+1}^{m+p}|h\rangle \langle h-1|\otimes L_5. \end{aligned}$$
(3.16)

We now describe each of the matrices \(L_i\) for \(i \in [{5}]\) in turn.

The matrix \(L_1\) is a discrete representation of \(\frac{\mathrm {d}{x}}{\mathrm {d}{t}}\), satisfying

$$\begin{aligned} |h\rangle \langle h| \otimes L_1|X\rangle = \sum _{i=0}^{d-1}\sum _{k=0}^nT_k(t_0)c_{i,k} |hi0\rangle +\sum _{i=0}^{d-1}\sum _{l=1,k,r=0}^nT_k(t_l)[D_n]_{kr}c_{i,r} |hil\rangle \end{aligned}$$
(3.17)

(recall from (2.5) and (2.7) that \(D_n\) encodes the action of the time derivative on a Chebyshev expansion). Thus \(L_1\) has the form

$$\begin{aligned} L_1&=\sum _{i=0}^{d-1}\sum _{k=0}^nT_k(t_0)|i0\rangle \langle ik| +\sum _{i=0}^{d-1}\sum _{l=1,k,r=0}^n \cos \frac{kl\pi }{n}[D_n]_{kr}|il\rangle \langle ir| \end{aligned}$$
(3.18)
$$\begin{aligned}&=I_d\otimes (|0\rangle \langle 0|P_n+\sum _{l=1}^n|l\rangle \langle l|P_nD_n) \end{aligned}$$
(3.19)

where the interpolation matrix is a discrete cosine transform matrix:

$$\begin{aligned} P_n:=\sum _{l,k=0}^n \cos \frac{kl\pi }{n}|l\rangle \langle k|. \end{aligned}$$
(3.20)
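
One can verify numerically that the single-component block of (3.19) acts as intended, returning the value \(x(t_0)\) in row 0 and the derivative values \(x'(t_l)\) in rows \(1,\ldots ,n\) (our own check, using NumPy's Chebyshev utilities):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

# Check the action of L_1's single-component block (3.19): applied to a
# coefficient vector c, it returns x(t_0) in row 0 and x'(t_l) in rows 1..n.
n = 8
l = np.arange(n + 1)
t = np.cos(l * np.pi / n)
P = np.cos(np.outer(l, l) * np.pi / n)   # Eq. (3.20)

D = np.zeros((n + 1, n + 1))             # Eq. (2.8)
for k in range(n + 1):
    for j in range(k + 1, n + 1):
        if (k + j) % 2 == 1:
            D[k, j] = 2.0 * j / (2.0 if k == 0 else 1.0)

block = np.zeros((n + 1, n + 1))
block[0] = P[0]
block[1:] = (P @ D)[1:]

c = np.random.default_rng(1).standard_normal(n + 1)
series = C.Chebyshev(c)
out = block @ c
assert np.isclose(out[0], series(t[0]))             # value at t_0 = 1
assert np.allclose(out[1:], series.deriv()(t[1:]))  # derivative at t_l
```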

The matrix \(L_2(A_h)\) discretizes \(A_h(t)\), i.e.,

$$\begin{aligned} |h\rangle \langle h| \otimes L_2(A_h)|X\rangle =-\sum _{i,j=0}^{d-1}\sum _{l=1,k=0}^n A_h(t_l)_{ij}T_k(t_l)c_{j,k}|hil\rangle . \end{aligned}$$
(3.21)

Thus

$$\begin{aligned} L_2(A_h)&=-\sum _{i,j=0}^{d-1}\sum _{l=1,k=0}^n A_h(t_l)_{ij} \cos \frac{kl\pi }{n}|il\rangle \langle jk| \end{aligned}$$
(3.22)
$$\begin{aligned}&=-\sum _{l=1}^nA_h(t_l)\otimes |l\rangle \langle l|P_n. \end{aligned}$$
(3.23)

Note that if \(A_h\) is time-independent, then

$$\begin{aligned} L_2(A_h)=-A_h\otimes \sum _{l=1}^n|l\rangle \langle l|P_n. \end{aligned}$$
(3.24)

The matrix \(L_3\) passes the solution value at the end of one subinterval to the next block, combining the Chebyshev series coefficients \(c_{i,l}\) to produce \(x_i\) for each \(i\in [{d}]_0\). Since the final state \(x(-1)\) satisfies \(x_i(-1)=\sum _{k=0}^nc_{i,k}T_k(-1)=\sum _{k=0}^n(-1)^kc_{i,k}\), we take

$$\begin{aligned} L_3=-\sum _{i=0}^{d-1}\sum _{k=0}^n(-1)^k|i0\rangle \langle ik|. \end{aligned}$$
(3.25)

Notice that \(L_3\) has zero rows for \(l\in [{n}]\).

When \(h=m\), \(L_4\) is used to construct \(x_i\) from the output of \(L_3\) for \(l=0\), and to repeat \(x_i\) \(n\) times for \(l\in [{n}]\). When \(m+1\le h\le m+p\), \(L_4\) and \(L_5\) together repeat \(x_i\) a further \((n+1)p\) times. This repetition serves to increase the success probability of the final measurement. In particular, we take

$$\begin{aligned} L_4=-\sum _{i=0}^{d-1}\sum _{l=1}^n|il\rangle \langle il-1|+\sum _{i=0}^{d-1}\sum _{l=0}^n|il\rangle \langle il| \end{aligned}$$
(3.26)

and

$$\begin{aligned} L_5=-\sum _{i=0}^{d-1}|i0\rangle \langle in|. \end{aligned}$$
(3.27)

In summary, the linear system is as follows. For each \(h \in [{m}]_0\), \((L_1+L_2(A_h))|X\rangle =|B_h\rangle \) solves the differential equations over \([\Gamma _h,\Gamma _{h+1}]\), and the coefficients \(c_{i,l}(\Gamma _{h+1})\) are combined by \(L_3\) into the \((h+1)\)st block as initial conditions. When \(h=m\), the final coefficients \(c_{i,l}(\Gamma _m)\) are combined by \(L_3\) and \(L_4\) into the final state with coefficients \(x_i\), and this solution is repeated \((p+1)(n+1)\) times by \(L_4\) and \(L_5\).

To explicitly illustrate the structure of this system, we present a simple example in “Appendix B”.
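
As a further classical sanity check of this construction (our own toy instance, solved with a dense classical solver), the following assembles the full matrix L for \(d=1\), time-independent \(A=a\), and \(f=0\), so the exact answer is \(x(T)=\gamma e^{aT}\). The matching-condition blocks \(L_3\) and \(L_5\) are taken with minus signs so that each block inherits the endpoint value of its predecessor; sign conventions and all variable names here are ours:

```python
import numpy as np

# Toy instance of the linear system (3.12)-(3.16): d = 1, A = a = -1 (time-
# independent), f = 0, initial condition gamma, so x(T) = gamma * e^{aT}.
a, gamma, T = -1.0, 1.0, 2.0
m, p, n = 4, 1, 8
tau = T / m
Ah = -(tau / 2) * a                              # rescaled scalar A_h, Eq. (3.3)

l = np.arange(n + 1)
P = np.cos(np.outer(l, l) * np.pi / n)           # Eq. (3.20)
D = np.zeros((n + 1, n + 1))                     # Eq. (2.8)
for k in range(n + 1):
    for j in range(k + 1, n + 1):
        if (k + j) % 2 == 1:
            D[k, j] = 2.0 * j / (2.0 if k == 0 else 1.0)

L1 = np.vstack([P[0], (P @ D)[1:]])              # Eq. (3.19), d = 1
L2 = np.vstack([np.zeros(n + 1), -Ah * P[1:]])   # Eq. (3.23), rows l >= 1
L3 = np.zeros((n + 1, n + 1))
L3[0] = -((-1.0) ** np.arange(n + 1))            # pass x(-1) to the next block
L4 = np.eye(n + 1) - np.diag(np.ones(n), -1)     # Eq. (3.26)
L5 = np.zeros((n + 1, n + 1))
L5[0, n] = -1.0                                  # Eq. (3.27)

N = (m + p + 1) * (n + 1)
L = np.zeros((N, N))
B = np.zeros(N)
B[0] = gamma                                     # gamma enters the h = 0 block
for h in range(m):                               # ODE blocks
    L[h*(n+1):(h+1)*(n+1), h*(n+1):(h+1)*(n+1)] = L1 + L2
for h in range(1, m + 1):                        # chain the initial conditions
    L[h*(n+1):(h+1)*(n+1), (h-1)*(n+1):h*(n+1)] += L3
for h in range(m, m + p + 1):                    # repetition blocks
    L[h*(n+1):(h+1)*(n+1), h*(n+1):(h+1)*(n+1)] = L4
for h in range(m + 1, m + p + 1):
    L[h*(n+1):(h+1)*(n+1), (h-1)*(n+1):h*(n+1)] = L5

X = np.linalg.solve(L, B)
final = X[m * (n + 1):]                          # (p+1)(n+1) copies of x(T)
err = abs(final[0] - gamma * np.exp(a * T))
```

The last \((p+1)(n+1)\) entries of X are all equal to (an accurate approximation of) \(x(T)=e^{-2}\), which is the repetition that boosts the success probability of the final measurement.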

4 Solution Error

In this section, we bound how well the solution of the linear system defined above approximates the actual solution of the system of differential equations.

Lemma 3

For the linear system \(L|X\rangle =|B\rangle \) defined in (3.12), let x be the approximate ODE solution specified by the linear system and let \({\hat{x}}\) be the exact ODE solution. Then for n sufficiently large, the error in the solution at time T satisfies

$$\begin{aligned} \Vert {\hat{x}}(T)-x(T)\Vert \le m\max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)}(t)\Vert \frac{e^{n+1}}{(2n)^n}. \end{aligned}$$
(4.1)

Proof

First we carefully choose n satisfying

$$\begin{aligned} n\ge \frac{e}{2}\left\lfloor \frac{\log (\omega )}{\log (\log (\omega ))}\right\rfloor \end{aligned}$$
(4.2)

where

$$\begin{aligned} \omega :=\max _{t\in [0,T]}\frac{\Vert {\hat{x}}^{(n+1)}(t)\Vert }{\Vert \gamma \Vert }(m+1) \end{aligned}$$
(4.3)

to ensure that

$$\begin{aligned} \max _{t\in [0,T]}\frac{\Vert {\hat{x}}^{(n+1)}(t)\Vert }{\Vert \gamma \Vert } \Bigl (\frac{e}{2n}\Bigr )^n\le \frac{1}{m+1}. \end{aligned}$$
(4.4)

According to the quantum spectral method defined in Sect. 3, we solve

$$\begin{aligned} \frac{\mathrm {d}{x}}{\mathrm {d}{t}}=A_h(t)x(t)+f_h(t),\quad h\in [{m}]_0. \end{aligned}$$
(4.5)

We denote the exact solution by \({\hat{x}}(\Gamma _{h+1})\), and we let \(x(\Gamma _{h+1})\) denote the approximation with components \(x(\Gamma _{h+1})_i=\sum _{l=0}^n(-1)^lc_{i,l}(\Gamma _{h+1})\) for \(i\in [{d}]_0\), where \(c_{i,l}(\Gamma _{h+1})\) is defined in (3.13). Define

$$\begin{aligned} \Delta _{h+1}:=\Vert {\hat{x}}(\Gamma _{h+1})-x(\Gamma _{h+1})\Vert . \end{aligned}$$
(4.6)

For \(h=0\), Lemma 2 implies

$$\begin{aligned} \Delta _1=\Vert {\hat{x}}(\Gamma _1)-x(\Gamma _1) \Vert \le \max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)}(t)\Vert \Bigl (\frac{e}{2n}\Bigr )^n. \end{aligned}$$
(4.7)

For \(h \in [{m}]\), the error in the approximate solution of \(\frac{\mathrm {d}{x}}{\mathrm {d}{t}}=A_h(t)x(t)+f_h(t)\) has two contributions: the error from the linear system and the error in the initial condition. We let \({\widetilde{x}}(\Gamma _{h+1})\) denote the solution of the linear system \(\bigl (L_1 + L_2(A_h)\bigr ) |{\widetilde{x}}(\Gamma _{h+1})\rangle = |B(f_h)\rangle \) under the initial condition \({\hat{x}}(\Gamma _h)\). Then

$$\begin{aligned} \Delta _{h+1} \le \Vert {\hat{x}}(\Gamma _{h+1})-{\widetilde{x}}(\Gamma _{h+1})\Vert +\Vert {\widetilde{x}}(\Gamma _{h+1})-x(\Gamma _{h+1})\Vert . \end{aligned}$$
(4.8)

The first term can be bounded using Lemma 2, giving

$$\begin{aligned} \Vert {\hat{x}}(\Gamma _{h+1})-{\widetilde{x}}(\Gamma _{h+1})\Vert \le \max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)}(t)\Vert \Bigl (\frac{e}{2n}\Bigr )^n. \end{aligned}$$
(4.9)

The second term comes from the initial error \(\Delta _h\), which is transported through the linear system. Let

$$\begin{aligned} E_{h+1}={\hat{E}}_{h+1}+\delta _{h+1} \end{aligned}$$
(4.10)

where \(E_{h+1}\) is the solution of the linear system with the initial error \({\hat{x}}(\Gamma _h)-x(\Gamma _h)\) as input and \({\hat{E}}_{h+1}\) is the exact solution of the homogeneous equation \(\frac{\mathrm {d}{x}}{\mathrm {d}{t}}=A_h(t)x(t)\) with the same initial condition (the inhomogeneity \(f_h\) cancels when subtracting two solutions). Then by Lemma 2,

$$\begin{aligned} \Vert \delta _{h+1}\Vert =\Vert {\hat{E}}_{h+1}-E_{h+1}\Vert \le \frac{\Delta _h}{\Vert \gamma \Vert }\max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)} (t)\Vert \Bigl (\frac{e}{2n}\Bigr )^n, \end{aligned}$$
(4.11)

so

$$\begin{aligned} \Vert {\widetilde{x}}(\Gamma _{h+1})-x(\Gamma _{h+1})\Vert \le \Delta _h +\frac{\Delta _h}{\Vert \gamma \Vert }\max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)} (t)\Vert \Bigl (\frac{e}{2n}\Bigr )^n. \end{aligned}$$
(4.12)

Thus, we have an inequality recurrence for bounding \(\Delta _h\):

$$\begin{aligned} \Delta _{h+1}&\le \Bigl (1+\max _{t\in [0,T]} \frac{\Vert {\hat{x}}^{(n+1)}(t)\Vert }{\Vert \gamma \Vert } \Bigl (\frac{e}{2n}\Bigr )^n\Bigr )\Delta _h+\max _{t\in [0,T]} \Vert {\hat{x}}^{(n+1)}(t)\Vert \Bigl (\frac{e}{2n}\Bigr )^n. \end{aligned}$$
(4.13)

Now we iterate h from 1 to m. Equation (4.4) implies

$$\begin{aligned} \max _{t\in [0,T]}\frac{\Vert {\hat{x}}^{(n+1)}(t)\Vert }{\Vert \gamma \Vert } \Bigl (\frac{e}{2n}\Bigr )^n\le \frac{1}{m+1}\le \frac{1}{m}, \end{aligned}$$
(4.14)

so

$$\begin{aligned} \Bigl (1+\max _{t\in [0,T]}\frac{\Vert {\hat{x}}^{(n+1)}(t)\Vert }{\Vert \gamma \Vert }\Bigl (\frac{e}{2n}\Bigr )^n\Bigr )^{m-1}\le \Bigl (1+\frac{1}{m}\Bigr )^m\le e. \end{aligned}$$
(4.15)

Therefore

$$\begin{aligned} \begin{aligned} \Delta _m&\le \Bigl (1+\max _{t\in [0,T]}\frac{\Vert {\hat{x}}^{(n+1)} (t)\Vert }{\Vert \gamma \Vert }\Bigl (\frac{e}{2n}\Bigr )^n\Bigr )^{m-1}\Delta _1 \\&\quad +\sum _{h=1}^{m-1}\Bigl (1+\max _{t\in [0,T]} \frac{\Vert {\hat{x}}^{(n+1)}(t)\Vert }{\Vert \gamma \Vert } \Bigl (\frac{e}{2n}\Bigr )^n\Bigr )^{h-1}\max _{t\in [0,T]} \Vert {\hat{x}}^{(n+1)}(t)\Vert \Bigl (\frac{e}{2n}\Bigr )^n\\&\le \Bigl (1+\frac{1}{m}\Bigr )^{m-1}\Delta _1+(m-1) \Bigl (1+\frac{1}{m}\Bigr )^{m-1}\max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)}(t) \Vert \Bigl (\frac{e}{2n}\Bigr )^n\\&\le \max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)}(t)\Vert \frac{e^{n+1}}{(2n)^n}+(m-1) \max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)}(t)\Vert \frac{e^{n+1}}{(2n)^n}\\&=m\max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)}(t)\Vert \frac{e^{n+1}}{(2n)^n}, \end{aligned} \end{aligned}$$
(4.16)

which shows that the solution error decreases exponentially with n. In other words, the linear system approximates the solution with error \(\epsilon \) using \(n={{\,\mathrm{poly}\,}}(\log ({1}/{\epsilon }))\). \( \square \)
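The rapid decay of the truncation error in n can be observed numerically. The following pure-Python sketch is only an illustration, not the algorithm of this paper: it uses the standard node-space Chebyshev collocation matrix (in the style of Trefethen's `cheb`) rather than the coefficient-space system (3.12), and all names and parameter values are ours.

```python
import math

def cheb_matrix(n):
    """Node-space Chebyshev differentiation matrix on x_j = cos(j*pi/n)."""
    x = [math.cos(j * math.pi / n) for j in range(n + 1)]
    c = [(2 if j in (0, n) else 1) * (-1) ** j for j in range(n + 1)]
    D = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(n + 1):
            if i != j:
                D[i][j] = c[i] / (c[j] * (x[i] - x[j]))
        D[i][i] = -sum(D[i][j] for j in range(n + 1) if j != i)
    return D, x

def solve(A, b):
    """Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b_i] for row, b_i in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for col in range(k, n + 1):
                M[r][col] -= f * M[k][col]
    u = [0.0] * n
    for k in reversed(range(n)):
        u[k] = (M[k][n] - sum(M[k][j] * u[j] for j in range(k + 1, n))) / M[k][k]
    return u

def collocation_error(n, lam=-1.0):
    """Solve dx/dt = lam*x on [-1,1] with x(-1)=1 by collocation;
    compare the value at t=1 with the exact exp(2*lam)."""
    D, x = cheb_matrix(n)
    A = [[D[i][j] - (lam if i == j else 0.0) for j in range(n + 1)]
         for i in range(n + 1)]
    rhs = [0.0] * (n + 1)
    A[n] = [0.0] * n + [1.0]   # node x_n = -1 carries the initial condition
    rhs[n] = 1.0
    u = solve(A, rhs)
    return abs(u[0] - math.exp(2 * lam))   # node x_0 = +1

errs = {n: collocation_error(n) for n in (6, 10, 16)}
```

Even at n = 16 the endpoint error is already near machine precision, consistent with the claim that \(n={{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\) suffices.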

Note that for time-independent differential equations, we can directly estimate \(\Vert {\hat{x}}^{(n+1)}(t)\Vert \) using

$$\begin{aligned} {\hat{x}}^{(n+1)}(t)=A_h^{n+1}{\hat{x}}(t)+A_h^nf_h. \end{aligned}$$
(4.17)
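For intuition, (4.17) can be checked directly in the scalar, time-independent case: if \({\hat{x}}'=a{\hat{x}}+f\) with constants \(a\ne 0\) and f, then \({\hat{x}}(t)=e^{at}(\gamma +f/a)-f/a\) and \({\hat{x}}^{(k)}(t)=a^ke^{at}(\gamma +f/a)=a^k{\hat{x}}(t)+a^{k-1}f\). A quick numeric confirmation (the test values are arbitrary):

```python
import math

a, f, gamma = -0.7, 0.3, 1.2   # arbitrary test values
x = lambda t: math.exp(a * t) * (gamma + f / a) - f / a          # solves x' = a*x + f
dkx = lambda t, k: a ** k * math.exp(a * t) * (gamma + f / a)    # k-th derivative

# scalar instance of (4.17): x^(k) = a^k x + a^(k-1) f
checks = [abs(dkx(t, k) - (a ** k * x(t) + a ** (k - 1) * f))
          for t in (0.0, 0.5, 1.0) for k in (1, 2, 5)]
```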

Writing \(A_h=V_h\Lambda _hV_h^{-1}\) where \(\Lambda _h={{\,\mathrm{diag}\,}}(\lambda _0,\ldots ,\lambda _{d-1})\), we have \(e^{A_h}=V_he^{\Lambda _h}V_h^{-1}\). Thus the exact solution of the time-independent equation with initial condition \({\hat{x}}(1)=\gamma \) is

$$\begin{aligned} \begin{aligned} {\hat{x}}(t)&=e^{A_h(1-t)}\gamma +(e^{A_h(1-t)}-I)A_h^{-1}f_h\\&=V_he^{\Lambda _h(1-t)}V_h^{-1}\gamma +V_h(e^{\Lambda _h(1-t)}-I)\Lambda _h^{-1}V_h^{-1}f_h. \end{aligned} \end{aligned}$$
(4.18)

Since \(\mathop {\mathrm {Re}}(\lambda _i) \le 0\) for all eigenvalues \(\lambda _i\) of \(A_h\) for \(i \in [{d}]_0\), we have \(\Vert e^{\Lambda _h(1-t)}\Vert \le 1\) for all \(t\in [-1,1]\). Therefore

$$\begin{aligned} \Vert {\hat{x}}(t)\Vert \le \kappa _V(\Vert \gamma \Vert +2\Vert f_h\Vert ). \end{aligned}$$
(4.19)

Furthermore, since \(\max _{h,t}\Vert A_h(t)\Vert \le 1\), we have

$$\begin{aligned} \begin{aligned} \max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)}(t) \Vert&\le \max _{t\in [0,T]}(\Vert {\hat{x}}(t)\Vert +\Vert f_h(t)\Vert )\\&\le \kappa _V(\Vert \gamma \Vert +3\Vert f_h\Vert )\\&\le \kappa _V(\Vert \gamma \Vert +2\tau \Vert f\Vert ). \end{aligned} \end{aligned}$$
(4.20)

Thus the solution error satisfies

$$\begin{aligned} \Vert {\hat{x}}(T)-x(T)\Vert \le m\kappa _V(\Vert \gamma \Vert +2\tau \Vert f\Vert )\frac{e^{n+1}}{(2n)^n}. \end{aligned}$$
(4.21)

Note that, although we represent the solution differently, this bound is similar to the corresponding bound in [10, Theorem 6].

5 Condition Number

We now analyze the condition number of the linear system.

Lemma 4

Consider an instance of the quantum ODE problem as defined in Problem 1. For all \(t\in [0,T]\), assume A(t) can be diagonalized as \(A(t)=V(t)\Lambda (t)V^{-1}(t)\) for some \(\Lambda (t)={{\,\mathrm{diag}\,}}(\lambda _0(t),\ldots ,\lambda _{d-1}(t))\), with \(\mathop {\mathrm {Re}}(\lambda _i(t))\le 0\) for all \(i\in [{d}]_0\). Let \(\kappa _V:=\max _{t\in [0,T]}\kappa _V(t)\) be an upper bound on the condition number of V(t). Then for \(m,p\in {\mathbb {Z}}^+\) and n sufficiently large, the condition number of L in the linear system (3.12) satisfies

$$\begin{aligned} \kappa _L\le (\pi m+p+2)(n+1)^{3.5}(2\kappa _V+e\Vert \gamma \Vert ). \end{aligned}$$
(5.1)

Proof

We begin by bounding the norms of some operators that appear in the definition of L. First we consider the \(l_{\infty }\) norm of \(D_n\) since this is straightforward to calculate:

$$\begin{aligned} \Vert D_n\Vert _{\infty }:=\max _{1\le i\le n}\sum _{j=0}^n|[D_n]_{ij}|= {\left\{ \begin{array}{ll} \frac{n(n+2)}{2} &{} n\text { even},\\ \frac{(n+1)^2}{2}-2 &{} n \text { odd}. \end{array}\right. } \end{aligned}$$
(5.2)

Thus we have the upper bound

$$\begin{aligned} \Vert D_n\Vert \le \sqrt{n+1}\Vert D_n\Vert _{\infty }\le \frac{(n+1)^{2.5}}{2}. \end{aligned}$$
(5.3)
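Assuming \(D_n\) is the standard coefficient-space Chebyshev differentiation matrix (mapping Chebyshev coefficients \(a_p\) to derivative coefficients via \(a'_k=(2/c_k)\sum _{p>k,\ p+k\ \mathrm{odd}}p\,a_p\), with \(c_0=2\) and \(c_k=1\) otherwise), the closed form (5.2) for the maximal row sum can be verified numerically; this sketch is only a sanity check under that reading of \(D_n\):

```python
def cheb_coeff_diff(n):
    """Coefficient-space Chebyshev differentiation matrix:
    a'_k = (2/c_k) * sum of p*a_p over p > k with p+k odd (c_0=2, else c_k=1)."""
    return [[(2.0 / (2 if k == 0 else 1)) * p if p > k and (p + k) % 2 == 1 else 0.0
             for p in range(n + 1)] for k in range(n + 1)]

def inf_norm_formula(n):
    # the closed form claimed in (5.2)
    return n * (n + 2) / 2 if n % 2 == 0 else (n + 1) ** 2 / 2 - 2

def max_row_sum(D):
    # maximum absolute row sum over rows 1..n, matching the max in (5.2)
    return max(sum(abs(v) for v in row) for row in D[1:])

results = {n: (max_row_sum(cheb_coeff_diff(n)), inf_norm_formula(n))
           for n in range(4, 12)}
```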

Next we upper bound the spectral norm of the discrete cosine transform matrix \(P_n\):

$$\begin{aligned} \Vert P_n\Vert ^2\le \max _{0\le l\le n}\sum _{k=0}^n\cos ^2\frac{kl\pi }{n}\le \max _{0\le l\le n}\{n+1\}=n+1. \end{aligned}$$
(5.4)

Therefore

$$\begin{aligned} \Vert P_n\Vert \le \sqrt{n+1}. \end{aligned}$$
(5.5)

Thus we can upper bound the norm of \(L_1\) as

$$\begin{aligned} \Vert L_1\Vert \le \Vert D_n\Vert \Vert P_n\Vert \le \frac{(n+1)^3}{2}. \end{aligned}$$
(5.6)

Next we consider the spectral norm of \(L_2(A_h)\) for any \(h \in [{m}]_0\). We have

$$\begin{aligned} L_2(A_h)=-\sum _{l=1}^nA_h(t_l)\otimes |l\rangle \langle l|P_n. \end{aligned}$$
(5.7)

Since the eigenvalues of each \(A_h(t_l)\) for \(l \in [{n+1}]_0\) are among the eigenvalues of

$$\begin{aligned} \sum _{l=0}^nA_h(t_l)\otimes |l\rangle \langle l|, \end{aligned}$$
(5.8)

we have

$$\begin{aligned} \biggl \Vert \sum _{l=1}^nA_h(t_l)\otimes |l\rangle \langle l|\biggr \Vert \le \biggl \Vert \sum _{l=0}^nA_h(t_l)\otimes |l\rangle \langle l|\biggr \Vert \le \max _{t \in [-1,1]}\Vert A_h(t)\Vert \le 1 \end{aligned}$$
(5.9)

by (3.8). Therefore

$$\begin{aligned} \Vert L_2(A_h)\Vert \le \Vert P_n\Vert \le \sqrt{n+1}. \end{aligned}$$
(5.10)

By direct calculation, we have

$$\begin{aligned} \Vert L_3\Vert&= \sqrt{n+1}, \end{aligned}$$
(5.11)
$$\begin{aligned} \Vert L_4\Vert&\le 2, \end{aligned}$$
(5.12)
$$\begin{aligned} \Vert L_5\Vert&=1. \end{aligned}$$
(5.13)

Thus, for \(n\ge 5\), we find

$$\begin{aligned} \Vert L\Vert \le \frac{(n+1)^3}{2}+\sqrt{n+1}+\sqrt{n+1}+2+1\le (n+1)^3. \end{aligned}$$
(5.14)
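The final inequality in (5.14) is elementary arithmetic; a one-line scan (an illustration only) confirms it over a range of n:

```python
import math

def lhs(n):
    # left-hand side of (5.14): (n+1)^3/2 + sqrt(n+1) + sqrt(n+1) + 2 + 1
    return (n + 1) ** 3 / 2 + 2 * math.sqrt(n + 1) + 3

# collect any n >= 5 for which the claimed bound (n+1)^3 would fail
violations = [n for n in range(5, 101) if lhs(n) > (n + 1) ** 3]
```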

Next we upper bound \(\Vert L^{-1}\Vert \). By definition,

$$\begin{aligned} \Vert L^{-1}\Vert =\sup _{\Vert |B\rangle \Vert \le 1}\Vert L^{-1}|B\rangle \Vert . \end{aligned}$$
(5.15)

We express \(|B\rangle \) as

$$\begin{aligned} |B\rangle =\sum _{h=0}^{m+p}\sum _{l=0}^n \sum _{i=0}^{d-1}\beta _{hil}|hil\rangle =\sum _{h=0}^{m+p}\sum _{l=0}^n |b_{hl}\rangle \end{aligned}$$
(5.16)

where \(|b_{hl}\rangle := \sum _{i=0}^{d-1}\beta _{hil}|hil\rangle \) satisfies \(\Vert |b_{hl}\rangle \Vert ^2=\sum _{i=0}^{d-1}|\beta _{hil}|^2\le 1\). For any fixed \(h \in [{m+p+1}]_0\) and \(l \in [{n+1}]_0\), we first upper bound \(\Vert L^{-1}|b_{hl}\rangle \Vert \) and use this to upper bound the norm of \(L^{-1}\) applied to linear combinations of such vectors.

Recall that the linear system comes from (2.10), which is equivalent to

$$\begin{aligned} \sum _{k=0}^nT'_k(t_r)c_{i,k}(\Gamma _h)= \sum _{j=0}^{d-1}A_h(t_r)_{ij}\sum _{k=0}^nT_k(t_r)c_{j,k}(\Gamma _h)+f_h(t_r)_i,\quad i\in [{d}]_0,~r\in [{n+1}]_0. \nonumber \\ \end{aligned}$$
(5.17)

For fixed \(h \in [{m+p+1}]_0\) and \(r \in [{n+1}]_0\), define vectors \(x_{hr},x_{hr}' \in {\mathbb {C}}^d\) with

$$\begin{aligned} (x_{hr})_i := \sum _{k=0}^nT_k(t_r)c_{i,k}(\Gamma _h), \qquad (x'_{hr})_i:= \sum _{k=0}^nT'_k(t_r)c_{i,k}(\Gamma _h) \end{aligned}$$
(5.18)

for \(i \in [{d}]_0\). We claim that \(x_{hr} = x_{hr}' = 0\) for any \(r \ne l\). Combining only the equations from (5.17) with \(r \ne l\) gives the system

$$\begin{aligned} x'_{hr} = A_h(t_r)x_{hr} . \end{aligned}$$
(5.19)

Consider a corresponding system of differential equations

$$\begin{aligned} \frac{\mathrm {d}{{{\hat{x}}}_{hr}(t)}}{\mathrm {d}{t}} = A_h(t_r) {{\hat{x}}}_{hr}(t) + b \end{aligned}$$
(5.20)

where \({{\hat{x}}}_{hr}(t) \in {\mathbb {C}}^d\) for all \(t \in [-1,1]\). The solution of this system with \(b=0\) and initial condition \({{\hat{x}}}_{hr}(1) = 0\) is clearly \({{\hat{x}}}_{hr}(t)=0\) for all \(t \in [-1,1]\). Then the nth-order truncated Chebyshev approximation of (5.20), which satisfies the linear system (5.19) by (2.1) and (2.2), is exactly \(x_{hr}\). Using Lemma 3 and observing that \({\hat{x}}_{hr}^{(n+1)}(t)=0\), we have

$$\begin{aligned} x_{hr}={{\hat{x}}}_{hr}(t)=0. \end{aligned}$$
(5.21)

It remains to consider \(r=l\), where the nonzero vector \(|b_{hl}\rangle \) enters as the inhomogeneity. Combining the equations from (5.17) with \(r = l\) gives the system

$$\begin{aligned} x'_{hl} = A_h(t_l)x_{hl} . \end{aligned}$$
(5.22)

Consider a corresponding system of differential equations

$$\begin{aligned} \frac{\mathrm {d}{{{\hat{x}}}_{hl}(t)}}{\mathrm {d}{t}} = A_h(t_l) {{\hat{x}}}_{hl}(t) + b, \end{aligned}$$
(5.23)

with \(\gamma =b_{h0},b=0\) for \(l=0\); or \(\gamma =0,b=b_{hl}\) for \(l\in [{n}]\).

Using the diagonalization \(A_h(t_l)=V(t_l)\Lambda _h(t_l)V^{-1}(t_l)\), we have \(e^{A_h(t_l)}=V(t_l)e^{\Lambda _h(t_l)}V^{-1}(t_l)\). Thus the exact solution of the differential equations (5.23) with initial condition \({{\hat{x}}}_{hl}(1)=\gamma \) is

$$\begin{aligned} \begin{aligned} {\hat{x}}_{hl}(t)&=e^{A_h(t_l)(1-t)}\gamma +(e^{A_h(t_l)(1-t)}-I)A_h(t_l)^{-1}b\\&=V(t_l)e^{\Lambda _h(t_l)(1-t)}V^{-1}(t_l) \gamma +V(t_l)(e^{\Lambda _h(t_l)(1-t)}-I)\Lambda _h(t_l)^{-1}V^{-1}(t_l)b. \end{aligned} \end{aligned}$$
(5.24)

According to Eq. (4.4) in the proof of Lemma 3, we have

$$\begin{aligned} x_{hl}={\hat{x}}_{hl}(-1)+\delta _{hl} \end{aligned}$$
(5.25)

where

$$\begin{aligned} \Vert \delta _{hl}\Vert \le \max _{t\in [0,T]}\Vert {\hat{x}}_{hl}^{(n+1)}(t) \Vert \frac{e^{n+1}}{(2n)^n}\le \frac{e\Vert \gamma \Vert }{m+1}. \end{aligned}$$
(5.26)

Now for \(h \in [{m+1}]_0\), we take \(x_{hl}\) to be the initial condition \(\gamma \) for the next subinterval to obtain \(x_{(h+1)l}\). Using (5.24) and (5.25), starting from \(\gamma =b_{h0},b=0\) for \(l=0\), we find

$$\begin{aligned} x_{ml} =V(t_l) \biggl (\prod _{j=1}^{m-h+1}e^{2\Lambda _h(t_l)}\biggr )V^{-1}(t_l)\gamma +\sum _{k=0}^{m-h}V(t_l)\biggl (\prod _{j=1}^k e^{2\Lambda _h(t_l)}\biggr )V^{-1}(t_l)\delta _{(m-k)l}.\nonumber \\ \end{aligned}$$
(5.27)

Since \(\Vert \Lambda _h(t_l)\Vert \le \Vert \Lambda \Vert \le 1\) and \(\Lambda _h(t_l)={{\,\mathrm{diag}\,}}(\lambda _0,\ldots ,\lambda _{d-1})\) with \(\mathop {\mathrm {Re}}(\lambda _i)\le 0\) for \(i \in [{d}]_0\), we have \(\Vert e^{2\Lambda _h(t_l)}\Vert \le 1\). Therefore

$$\begin{aligned} \Vert x_{hl}\Vert&\le \Vert x_{ml}\Vert \le \kappa _V(t_l)\Vert b_{hl}\Vert +(m-h+1)\kappa _V(t_l)\Vert \delta _{hl}\Vert \le \kappa _V(t_l)+e\Vert \gamma \Vert \le \kappa _V+e\Vert \gamma \Vert . \end{aligned}$$
(5.28)

On the other hand, with \(\gamma =0, b=b_{hl}\) for \(l\in [{n}]\), we have

$$\begin{aligned} \begin{aligned} x_{ml}&=V(t_l)\biggl (\prod _{j=1}^{m-h} e^{2\Lambda _h(t_l)}\biggr )\bigl ((e^{2\Lambda _h(t_l)}-I) \Lambda _h(t_l)^{-1}\bigr )V^{-1}(t_l)b \\&\quad +\sum _{k=0}^{m-h}V(t_l)\biggl (\prod _{j=1}^ke^{2\Lambda _h(t_l)}\biggr ) V^{-1}(t_l)\delta _{(m-k)l}, \end{aligned} \end{aligned}$$
(5.29)

so

$$\begin{aligned} \Vert x_{hl}\Vert \le 2\kappa _V(t_l)\Vert b_{hl}\Vert +(m-h+1)\kappa _V(t_l)\Vert \delta _{hl}\Vert \le 2\kappa _V(t_l)+e\Vert \gamma \Vert \le 2\kappa _V+e\Vert \gamma \Vert . \nonumber \\ \end{aligned}$$
(5.30)

For \(h \in \{m,m+1,\ldots ,m+p\}\), according to the definition of \(L_4\) and \(L_5\), we similarly have

$$\begin{aligned} \Vert x_{hl}\Vert =\Vert x_{ml}\Vert \le 2\kappa _V+e\Vert \gamma \Vert . \end{aligned}$$
(5.31)

According to (5.24), \(\Vert {\hat{x}}_{hl}(t)\Vert \) is a monotonic function of \(t \in [-1,1]\), which implies

$$\begin{aligned} \Vert {\hat{x}}_{hl}(t)\Vert ^2\le \max \{\Vert {\hat{x}}_{hl}(-1) \Vert ^2,\Vert {\hat{x}}_{hl}(1)\Vert ^2\}\le (2\kappa _V+e\Vert \gamma \Vert )^2. \end{aligned}$$
(5.32)

Using the identity

$$\begin{aligned} \int _{-1}^1\frac{\mathrm {d}{t}}{\sqrt{1-t^2}}=\pi , \end{aligned}$$
(5.33)

we have

$$\begin{aligned} \int _{-1}^1\Vert {\hat{x}}_{hl}(t)\Vert ^2\frac{\mathrm {d}{t}}{\sqrt{1-t^2}} \le (2\kappa _V+e\Vert \gamma \Vert )^2\int _{-1}^1\frac{\mathrm {d}{t}}{\sqrt{1-t^2}} =\pi (2\kappa _V+e\Vert \gamma \Vert )^2.\nonumber \\ \end{aligned}$$
(5.34)

Consider the Chebyshev expansion of \({\hat{x}}_{hl}(t)\) as in (2.1):

$$\begin{aligned} {\hat{x}}_{hl}(t)=\sum _{i=0}^{d-1}\sum _{k=0}^{\infty }c_{i,k} (\Gamma _{h+1})T_k({\overline{t}}). \end{aligned}$$
(5.35)

By the orthogonality of Chebyshev polynomials (as specified in (A.7)), we have

$$\begin{aligned} \begin{aligned}&\int _{-1}^1\Vert {\hat{x}}_{hl}(t)\Vert ^2\frac{\mathrm {d}{t}}{\sqrt{1-t^2}} =\int _{-1}^1 \biggl \Vert \sum _{i=0}^{d-1}\sum _{k=0}^{\infty }c_{i,k} (\Gamma _{h+1})T_k({\overline{t}})\biggr \Vert ^2\frac{\mathrm {d}{t}}{\sqrt{1-t^2}}\\ =&\sum _{i=0}^{d-1}\sum _{k=1}^{\infty }c^2_{i,k}(\Gamma _{h+1}) +2\sum _{i=0}^{d-1}c^2_{i,0}(\Gamma _{h+1})\ge \sum _{i=0}^{d-1} \sum _{k=1}^nc^2_{i,k}(\Gamma _{h+1})+2\sum _{i=0}^{d-1}c^2_{i,0}(\Gamma _{h+1}). \end{aligned} \end{aligned}$$
(5.36)

Using (5.34), this gives

$$\begin{aligned} \sum _{i=0}^{d-1}\sum _{k=0}^nc^2_{i,k}(\Gamma _{h+1}) \le \int _{-1}^1\Vert {\hat{x}}_{hl}(t)\Vert ^2\frac{\mathrm {d}{t}}{\sqrt{1-t^2}} \le \pi (2\kappa _V+e\Vert \gamma \Vert )^2. \end{aligned}$$
(5.37)

Now we compute \(\Vert |X\rangle \Vert \) by summing the contributions from all \(c_{i,r}(\Gamma _h)\) and \(x_{mr}\), noting that \(c_{i,r}=0\) and \(x_{mr}=0\) for all \(r\ne l\). This gives

$$\begin{aligned} \begin{aligned} \Vert |X\rangle \Vert ^2&=\sum _{h=0}^{m-1}\sum _{i=0}^{d-1}c^2_{i,l} (\Gamma _{h+1})+(p+1)\Vert x_{ml}\Vert ^2 \\&\le \pi m(2\kappa _V+e\Vert \gamma \Vert )^2+(p+1)(2\kappa _V+e\Vert \gamma \Vert )^2 \\&\le (\pi m+p+1)(2\kappa _V+e\Vert \gamma \Vert )^2. \end{aligned} \end{aligned}$$
(5.38)

Finally, considering all \(h\in [{m+p+1}]_0\) and \(l\in [{n+1}]_0\), from (5.16) we have

$$\begin{aligned} \Vert |B\rangle \Vert ^2=\sum _{h=0}^{m+p}\sum _{l=0}^n\Vert |b_{hl}\rangle \Vert ^2\le 1, \end{aligned}$$
(5.39)

so, by the Cauchy–Schwarz inequality,

$$\begin{aligned} \begin{aligned} \Vert L^{-1}\Vert ^2&=\sup _{\Vert |B\rangle \Vert \le 1}\Vert L^{-1}|B\rangle \Vert ^2 \le \sup _{\Vert |B\rangle \Vert \le 1}(m+p+1)(n+1)\sum _{h=0}^{m+p}\sum _{l=0}^n\Vert L^{-1}|b_{hl}\rangle \Vert ^2\\&\le (\pi m+p+1)(m+p+1)(n+1)(2\kappa _V+e\Vert \gamma \Vert )^2\\&\le (\pi m+p+1)^2(n+1)(2\kappa _V+e\Vert \gamma \Vert )^2, \end{aligned} \end{aligned}$$
(5.40)

and therefore

$$\begin{aligned} \Vert L^{-1}\Vert \le (\pi m+p+1)(n+1)^{0.5}(2\kappa _V+e\Vert \gamma \Vert ). \end{aligned}$$
(5.41)

Finally, combining (5.14) and (5.41) gives

$$\begin{aligned} \kappa _L=\Vert L\Vert \Vert L^{-1}\Vert \le (\pi m+p+1)(n+1)^{3.5}(2\kappa _V+e\Vert \gamma \Vert ) \end{aligned}$$
(5.42)

as claimed. \(\square \)

6 Success Probability

We now evaluate the success probability of our approach to the quantum ODE problem.

Lemma 5

Consider an instance of the quantum ODE problem as defined in Problem 1 with the exact solution \({\hat{x}}(t)\) for \(t\in [0,T]\), and its corresponding linear system (3.12) with \(m,p\in {\mathbb {Z}}^+\) and n sufficiently large. When applying the QLSA to this system, the probability of measuring a state proportional to \(|x(T)\rangle =\sum _{i=0}^{d-1}x_i|i\rangle \) is

$$\begin{aligned} P_{\mathrm {measure}} \ge \frac{(p+1)(n+1)}{\pi mq^2+(p+1)(n+1)}, \end{aligned}$$
(6.1)

where \(x_i\) is defined in (3.13) and

$$\begin{aligned} q:=\max _{t\in [0,T]} \frac{\Vert {\hat{x}}(t)\Vert }{\Vert x(T)\Vert }. \end{aligned}$$
(6.2)

Proof

After solving the linear system (3.12) using the QLSA, we measure the first and third registers of \(|X\rangle \) (as defined in (3.13)). We decompose this state as

$$\begin{aligned} |X\rangle =|X_{\mathrm {bad}}\rangle +|X_{\mathrm {good}}\rangle , \end{aligned}$$
(6.3)

where

$$\begin{aligned} |X_{\mathrm {bad}}\rangle&=\sum _{h=0}^{m-1} \sum _{i=0}^{d-1}\sum _{l=0}^nc_{i,l}(\Gamma _{h+1})|hil\rangle , \end{aligned}$$
(6.4)
$$\begin{aligned} |X_{\mathrm {good}}\rangle&=\sum _{h=m}^{m+p}\sum _{i=0}^{d-1}\sum _{l=0}^nx_i|hil\rangle . \end{aligned}$$
(6.5)

When the first register is observed in some \(h \in \{m,m+1,\ldots ,m+p\}\) (no matter what outcome is seen for the third register), we output the second register, which is then in a normalized state proportional to the final state:

$$\begin{aligned} |X_{\mathrm {measure}}\rangle =\frac{|x(T)\rangle }{\Vert |x(T)\rangle \Vert }, \end{aligned}$$
(6.6)

with

$$\begin{aligned} |x(T)\rangle =\sum _{i=0}^{d-1}x_i|i\rangle =\sum _{i=0}^{d-1} \sum _{k=0}^n(-1)^kc_{i,k}(\Gamma _m)|i\rangle . \end{aligned}$$
(6.7)

Notice that

$$\begin{aligned} \Vert |x(T)\rangle \Vert ^2=\sum _{i=0}^{d-1}x_i^2 \end{aligned}$$
(6.8)

and

$$\begin{aligned} \Vert |X_{\mathrm {good}}\rangle \Vert ^2=(p+1)(n+1)\sum _{i=0}^{d-1} x_i^2=(p+1)(n+1)\Vert |x(T)\rangle \Vert ^2. \end{aligned}$$
(6.9)

Considering the definition of q, the contribution from time interval h under the rescaling (3.2), and the identity (5.33), we have

$$\begin{aligned} \begin{aligned} q^2\Vert x(T)\Vert ^2&=\max _{t\in [0,T]}\Vert {\hat{x}}(t)\Vert ^2 =\frac{1}{\pi }\int _{-1}^1 \frac{\mathrm {d}{\tau }}{\sqrt{1-\tau ^2}}\max _{t\in [0,T]}\Vert {\hat{x}}(t)\Vert ^2\\&\ge \frac{1}{\pi }\int _{-1}^1 \frac{\mathrm {d}{\tau }}{\sqrt{1-\tau ^2}} \max _{t\in [\Gamma _h,\Gamma _{h+1}]}\Vert {\hat{x}}(t)\Vert ^2\\&=\frac{1}{\pi }\int _{-1}^1 \frac{\mathrm {d}{\tau }}{\sqrt{1-\tau ^2}}\max _{{\overline{t}}\in [-1,1]}\Vert {\hat{x}}_h({\overline{t}})\Vert ^2 \ge \frac{1}{\pi }\int _{-1}^1 \Vert {\hat{x}}_h({\overline{t}})\Vert ^2 \frac{\mathrm {d}{{\overline{t}}}}{\sqrt{1-{\overline{t}}^2}}, \end{aligned} \end{aligned}$$
(6.10)

where \({\hat{x}}_h({\overline{t}})\) is the solution of (4.5) with the rescaling in (3.4). By the orthogonality of Chebyshev polynomials (as specified in (A.7)),

$$\begin{aligned} \begin{aligned} q^2\Vert x(T)\Vert ^2&\ge \frac{1}{\pi }\int _{-1}^1 \Vert {\hat{x}}_h({\overline{t}})\Vert ^2 \frac{\mathrm {d}{{\overline{t}}}}{\sqrt{1-{\overline{t}}^2}} =\frac{1}{\pi }\int _{-1}^1\biggl \Vert \sum _{i=0}^{d-1}\sum _{k=0}^{\infty } c_{i,k}(\Gamma _{h+1})T_k({\overline{t}})\biggr \Vert ^2 \frac{\mathrm {d}{{\overline{t}}}}{\sqrt{1-{\overline{t}}^2}}\\&=\frac{1}{\pi }\biggl (\sum _{i=0}^{d-1}\sum _{k=1}^{\infty }c^2_{i,k} (\Gamma _{h+1})+2\sum _{i=0}^{d-1}c^2_{i,0}(\Gamma _{h+1})\biggr ) \ge \frac{1}{\pi }\sum _{i=0}^{d-1}\sum _{k=0}^nc^2_{i,k}(\Gamma _{h+1}). \end{aligned} \end{aligned}$$
(6.11)

For all \(h\in [{m}]_0\), we have

$$\begin{aligned} mq^2\Vert x(T)\Vert ^2\ge \sum _{h=0}^{m-1}\frac{1}{\pi }\sum _{i=0}^{d-1} \sum _{k=0}^nc^2_{i,k}(\Gamma _{h+1}) =\frac{1}{\pi }\Vert |X_{\mathrm {bad}}\rangle \Vert ^2, \end{aligned}$$
(6.12)

and therefore

$$\begin{aligned} \Vert |X_{\mathrm {good}}\rangle \Vert ^2 =(p+1)(n+1)\Vert x(T)\Vert ^2 \ge \frac{(p+1)(n+1)}{\pi mq^2}\Vert |X_{\mathrm {bad}}\rangle \Vert ^2. \end{aligned}$$
(6.13)

Thus we see that the success probability of the measurement satisfies

$$\begin{aligned} P_{\mathrm {measure}} \ge \frac{(p+1)(n+1)}{\pi mq^2+(p+1)(n+1)} \end{aligned}$$
(6.14)

as claimed. \(\square \)
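With the choice \(p=\Theta (m)\) made in Sect. 8, the bound (6.1) simplifies: taking \(p=m-1\) gives \(P_{\mathrm {measure}}\ge (n+1)/(\pi q^2+n+1)\), independent of m, which is why \(O(q/\sqrt{n})\) rounds of amplitude amplification suffice. A quick check of this simplification (the parameter values are arbitrary):

```python
import math

def p_lower(m, p, n, q):
    """Lower bound (6.1) on the measurement success probability."""
    return (p + 1) * (n + 1) / (math.pi * m * q ** 2 + (p + 1) * (n + 1))

# with p = m - 1 the bound no longer depends on m
vals = [p_lower(m, m - 1, 100, 5.0) for m in (10, 50, 1000)]
simplified = (100 + 1) / (math.pi * 5.0 ** 2 + 100 + 1)
```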

7 State Preparation

We now describe a procedure for preparing the vector \(|B\rangle \) in the linear system (3.12) (defined in (3.14) and (3.15)) using the given ability to prepare the initial state of the system of differential equations. We also evaluate the complexity of this procedure.

Lemma 6

Consider state preparation oracles acting on a state space with basis vectors \(|h\rangle |i\rangle |l\rangle \) for \(h\in [{m}]_0,i\in [{d}]_0,l\in [{n+1}]_0\), where \(m,d,n \in {\mathbb {N}}\), encoding an initial condition \(\gamma \in {\mathbb {C}}^d\) and function \(f_{h}(\cos \frac{l\pi }{n}) \in {\mathbb {C}}^d\) as in (3.15). Specifically, for any \(h\in [{m}]_0\) and \(l\in [{n}]\), let \(O_x\) be a unitary oracle that maps \(|0\rangle |0\rangle |0\rangle \) to a state proportional to \(|0\rangle |\gamma \rangle |0\rangle \) and \(|h\rangle |\phi \rangle |l\rangle \) to \(|h\rangle |\phi \rangle |l\rangle \) for any \(|\phi \rangle \) orthogonal to \(|0\rangle \); let \(O_f(h,l)\) be a unitary that maps \(|h\rangle |0\rangle |l\rangle \) to a state proportional to \(|h\rangle |f_{h}(\cos \frac{l\pi }{n})\rangle |l\rangle \) and maps \(|0\rangle |\phi \rangle |0\rangle \) to \(|0\rangle |\phi \rangle |0\rangle \) for any \(|\phi \rangle \) orthogonal to \(|0\rangle \). Suppose \(\Vert \gamma \Vert \) and \(\Vert f_{h}(\cos \frac{l\pi }{n})\Vert \) are known. Then the normalized quantum state

$$\begin{aligned} |B\rangle \propto |0\rangle |\gamma \rangle |0\rangle +\sum _{h=0}^{m-1}\sum _{l=1}^n| h\rangle |f_{h}(\cos \tfrac{l\pi }{n})\rangle |l\rangle \end{aligned}$$
(7.1)

can be prepared with gate and query complexity O(mn).

Proof

We normalize the components of the state using the coefficients

$$\begin{aligned} \begin{aligned} b_{00}=&\frac{\Vert \gamma \Vert }{\sqrt{\Vert \gamma \Vert ^2+\sum _{h'=0}^{m-1}\sum _{l'=1}^n\Vert f_{h'}(\cos \frac{l'\pi }{n})\Vert ^2}},\\ b_{hl}=&\frac{\Vert f_h(\cos \frac{l\pi }{n})\Vert }{\sqrt{\Vert \gamma \Vert ^2+\sum _{h'=0}^{m-1}\sum _{l'=1}^n\Vert f_{h'}(\cos \frac{l'\pi }{n})\Vert ^2}}, \quad h\in [{m}]_0, l\in [{n}] \end{aligned} \end{aligned}$$
(7.2)

so that

$$\begin{aligned} b_{00}^2+\sum _{h=0}^{m-1}\sum _{l=1}^nb_{hl}^2=1. \end{aligned}$$
(7.3)

First we perform a unitary transformation mapping

$$\begin{aligned} |0\rangle |0\rangle |0\rangle \mapsto b_{00}|0\rangle |0\rangle |0\rangle +b_{01}|0\rangle |0\rangle |1\rangle +\cdots +b_{(m-1)n}|m-1\rangle |0\rangle |n\rangle . \end{aligned}$$
(7.4)

This can be done in time complexity O(mn) by standard techniques [32]. Then we perform \(O_x\) and \(O_f(h,l)\) for all \(h\in [{m}]_0, l\in [{n}]\), giving

$$\begin{aligned} |0\rangle |\gamma \rangle |0\rangle +\sum _{h=0}^{m-1}\sum _{l=1}^n |h\rangle |f_{h}(\cos \tfrac{l\pi }{n})\rangle |l\rangle \end{aligned}$$
(7.5)

using O(mn) queries. \(\square \)
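As a classical sanity check of the normalization (7.2)–(7.3), the following sketch assembles the coefficients \(b_{hl}\) from given norms \(\Vert \gamma \Vert \) and \(\Vert f_h(\cos \frac{l\pi }{n})\Vert \) (the sample values below are arbitrary stand-ins) and confirms that they form a unit vector:

```python
import math

m, n = 3, 4
norm_gamma = 2.0
# arbitrary stand-ins for ||f_h(cos(l*pi/n))||, h in [m]_0, l in [n]
norm_f = [[0.5 + 0.1 * h + 0.05 * l for l in range(1, n + 1)] for h in range(m)]

# normalization constant runs over all h and l, as in (7.2)
Z = math.sqrt(norm_gamma ** 2 + sum(v ** 2 for row in norm_f for v in row))
b00 = norm_gamma / Z                                     # coefficient of |0>|gamma>|0>
b = [[norm_f[h][l - 1] / Z for l in range(1, n + 1)] for h in range(m)]

total = b00 ** 2 + sum(v ** 2 for row in b for v in row)  # should be 1, cf. (7.3)
```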

8 Main Result

Having analyzed the solution error, condition number, success probability, and state preparation procedure for our approach, we are now ready to establish the main result.

Theorem 1

Consider an instance of the quantum ODE problem as defined in Problem 1. Assume A(t) can be diagonalized as \(A(t)=V(t)\Lambda (t)V^{-1}(t)\) where \(\Lambda (t)={{\,\mathrm{diag}\,}}(\lambda _0(t),\ldots ,\lambda _{d-1}(t))\) with \(\mathop {\mathrm {Re}}(\lambda _i(t))\le 0\) for each \(i \in [{d}]_0\) and \(t\in [0,T]\). Then there exists a quantum algorithm that produces a state \(x(T)/\Vert x(T)\Vert \)\(\epsilon \)-close to \({\hat{x}}(T)/\Vert {\hat{x}}(T)\Vert \) in \(l^2\) norm, succeeding with probability \(\Omega (1)\), with a flag indicating success, using

$$\begin{aligned} O\bigl (\kappa _Vs\Vert A\Vert Tq \, {{\,\mathrm{poly}\,}}(\log (\kappa _Vs\Vert A\Vert g'T/\epsilon g))\bigr ) \end{aligned}$$
(8.1)

queries to oracles \(O_A(h,l)\) (a sparse matrix oracle for \(A_h(t_l)\) as defined in (3.3)) and \(O_x\) and \(O_f(h,l)\) (as defined in Lemma 6). Here \(\Vert A\Vert := \max _{t \in [0,T]}\Vert A(t)\Vert \); \(\kappa _V:=\max _t\kappa _V(t)\), where \(\kappa _V(t)\) is the condition number of V(t); and

$$\begin{aligned} g&:=\Vert {\hat{x}}(T)\Vert ,&g'&:=\max _{t\in [0,T]} \max _{n \in {\mathbb {N}}}\Vert {\hat{x}}^{(n+1)}(t)\Vert ,&q&:=\max _{t\in [0,T]} \frac{\Vert {\hat{x}}(t)\Vert }{\Vert x(T)\Vert }. \end{aligned}$$
(8.2)

The gate complexity is larger than the query complexity by a factor of \({{\,\mathrm{poly}\,}}(\log (\kappa _Vds\Vert A\Vert g'T/\epsilon ))\).

Proof

We first present the algorithm and then analyze its complexity.

Statement of the algorithm. First, we choose m to guarantee

$$\begin{aligned} \frac{\Vert A\Vert T}{2m}\le 1. \end{aligned}$$
(8.3)

Then, as in Sect. 3, we divide the interval [0, T] into small subintervals \([0,\Gamma _1],[\Gamma _1,\Gamma _2],\ldots ,[\Gamma _{m-1},T]\) with \(\Gamma _0=0, \Gamma _m=T\), and define

$$\begin{aligned} \tau&:=\max _{0\le h\le m-1}\{\tau _h\},&\tau _h&:=|\Gamma _{h+1}-\Gamma _h|=\frac{T}{m}. \end{aligned}$$
(8.4)

Each subinterval \([\Gamma _h,\Gamma _{h+1}]\) for \(h\in [{m}]_0\) is mapped onto \([-1,1]\) with a linear mapping \(K_h\) satisfying \(K_h(\Gamma _h)=1,K_h(\Gamma _{h+1})=-1\):

$$\begin{aligned} K_h:t \mapsto {\overline{t}}=1-\frac{2(t-\Gamma _h)}{\Gamma _{h+1}-\Gamma _h}. \end{aligned}$$
(8.5)

We choose

$$\begin{aligned} n=\frac{e}{2}\max \biggl \{\biggl \lfloor \frac{\log (\Omega )}{\log (\log (\Omega ))}\biggr \rfloor ,\biggl \lfloor \frac{\log (\omega )}{\log (\log (\omega ))}\biggr \rfloor \biggr \} \end{aligned}$$
(8.6)

where

$$\begin{aligned} \Omega :=\frac{g'em}{\delta }=\frac{g'em(1+\epsilon )}{g\epsilon } \end{aligned}$$
(8.7)

and

$$\begin{aligned} \omega :=\frac{g'}{\Vert \gamma \Vert }(m+1). \end{aligned}$$
(8.8)

Since \(\max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)}(t)\Vert \le g'\), by Lemma 3, this choice guarantees

$$\begin{aligned} \Vert {\hat{x}}(T)-x(T)\Vert \le m\max _{t\in [0,T]}\Vert {\hat{x}}^{(n+1)}(t)\Vert \frac{e^{n+1}}{(2n)^n}\le \delta \end{aligned}$$
(8.9)

and

$$\begin{aligned} \max _{t\in [0,T]}\frac{\Vert {\hat{x}}^{(n+1)}(t)\Vert }{\Vert \gamma \Vert } \Bigl (\frac{e}{2n}\Bigr )^n\le \frac{1}{m+1}. \end{aligned}$$
(8.10)

Now \(\Vert {\hat{x}}(T)-x(T)\Vert \le \delta \) implies

$$\begin{aligned} \biggl \Vert \frac{{\hat{x}}(T)}{\Vert {\hat{x}}(T)\Vert }-\frac{x(T)}{\Vert x(T)\Vert }\biggr \Vert \le \frac{\delta }{\min \{\Vert {\hat{x}}(T)\Vert ,\Vert x(T)\Vert \}} \le \frac{\delta }{g-\delta }=:\epsilon , \end{aligned}$$
(8.11)

so we can choose such n to ensure that the normalized output state is \(\epsilon \)-close to \({\hat{x}}(T)/\Vert {\hat{x}}(T)\Vert \).
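The relation between \(\delta \) and \(\epsilon \) used in (8.7) and (8.11) is the algebraic inversion \(\delta =g\epsilon /(1+\epsilon )\); a quick check with arbitrary values:

```python
g, eps = 1.7, 0.25            # arbitrary test values
delta = g * eps / (1 + eps)   # inverting eps = delta/(g - delta) from (8.11)
recovered_eps = delta / (g - delta)
```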

Following Sect. 3, we build the linear system \(L|X\rangle =|B\rangle \) (see (3.12)) that encodes the quantum spectral method. By Lemma 4, the condition number of this linear system is at most \((\pi m+p+1)(n+1)^{3.5}(2\kappa _V+e\Vert \gamma \Vert )\). Then we use the QLSA from reference [15] to obtain a normalized state \(|X\rangle \) and measure the first and third registers of \(|X\rangle \) in the standard basis. If the measurement outcome for the first register belongs to

$$\begin{aligned} S=\{m,m+1,\ldots ,m+p\}, \end{aligned}$$
(8.12)

we output the state of the second register, which is a normalized state \(| x(T)\rangle /\Vert | x(T)\rangle \Vert \) satisfying (8.11). By Lemma 5, the probability of this event happening is at least \(\frac{(p+1)(n+1)}{\pi mq^2+(p+1)(n+1)}\). To ensure \(m+p=O(\Vert A\Vert T)\), we can choose

$$\begin{aligned} p=O(m)=O(\Vert A\Vert T), \end{aligned}$$
(8.13)

so we can achieve success probability \(\Omega (1)\) with \(O(q/\sqrt{n})\) repetitions of the above procedure.

Analysis of the complexity. The matrix L is an \((m+p+1)d(n+1)\times (m+p+1)d(n+1)\) matrix with O(ns) nonzero entries in any row or column. By Lemma 4 and our choice of parameters, the condition number of L is \(O\bigl (\kappa _V(m+p)n^{3.5}\bigr )\). Consequently, by Theorem 5 of [15], the QLSA produces the state \(|x(T)\rangle \) with

$$\begin{aligned} O\bigl (\kappa _V(m+p)n^{4.5}s\, {{\,\mathrm{poly}\,}}(\log (\kappa _Vmns/\delta ))\bigr )=O\bigl (\kappa _Vs\Vert A\Vert T\, {{\,\mathrm{poly}\,}}(\log (\kappa _Vs\Vert A\Vert g'T/\epsilon g))\bigr ) \nonumber \\ \end{aligned}$$
(8.14)

queries to the oracles \(O_A(h,l)\), \(O_x\), and \(O_f(h,l)\), and its gate complexity is larger by a factor of \({{\,\mathrm{poly}\,}}(\log (\kappa _Vmnds/\delta ))\). Using \(O(q/\sqrt{n})\) steps of amplitude amplification to achieve success probability \(\Omega (1)\), the overall query complexity of our algorithm is

$$\begin{aligned} O\bigl (\kappa _V(m+p)n^4sq \, {{\,\mathrm{poly}\,}}(\log (\kappa _Vmns/\delta ))\bigr )=O\bigl (\kappa _Vs\Vert A\Vert Tq \, {{\,\mathrm{poly}\,}}(\log (\kappa _Vs\Vert A\Vert g'T/\epsilon g))\bigr ),\nonumber \\ \end{aligned}$$
(8.15)

and the gate complexity is larger by a factor of

$$\begin{aligned} {{\,\mathrm{poly}\,}}(\log (\kappa _Vds\Vert A\Vert g'T/\epsilon g)) \end{aligned}$$
(8.16)

as claimed. \(\square \)

In general, \(g'\) could be unbounded as \(n\rightarrow \infty \). However, even in that case we can obtain a useful bound by solving the implicit equations (8.9) and (8.10).

Note that for time-independent differential equations, we can replace \(g'\) by \(\Vert \gamma \Vert +2\tau \Vert f\Vert \) as shown in (4.20). In place of (8.7) and (8.8), we choose

$$\begin{aligned} \Omega :=\frac{(\Vert \gamma \Vert +2\tau \Vert f\Vert )em\kappa _V}{\delta } =\frac{(\Vert \gamma \Vert +2\tau \Vert f\Vert )em\kappa _V(1+\epsilon )}{g\epsilon } \end{aligned}$$
(8.17)

and

$$\begin{aligned} \omega :=\frac{\Vert \gamma \Vert +2\tau \Vert f\Vert }{\Vert \gamma \Vert }(m+1)\kappa _V. \end{aligned}$$
(8.18)

By Lemma 3, this choice guarantees

$$\begin{aligned} \Vert {\hat{x}}(T)-x(T)\Vert \le \max _{t\in [-1,1]}\Vert {\hat{x}}(t)-x(t)\Vert \le m\kappa _V(\Vert \gamma \Vert +2\tau \Vert f\Vert )\frac{e^{n+1}}{(2n)^n}\le \delta \nonumber \\ \end{aligned}$$
(8.19)

and

$$\begin{aligned} \max _{t\in [0,T]}\frac{\Vert {\hat{x}}^{(n+1)}(t)\Vert }{\Vert \gamma \Vert } \Bigl (\frac{e}{2n}\Bigr )^n\le \frac{\kappa _V(\Vert \gamma \Vert +2\tau \Vert f\Vert )}{\Vert \gamma \Vert }\Bigl (\frac{e}{2n}\Bigr )^n\le \frac{1}{m+1}. \end{aligned}$$
(8.20)

Thus we have the following:

Corollary 1

For time-independent differential equations, under the same assumptions as Theorem 1, there exists a quantum algorithm using

$$\begin{aligned} O\bigl (\kappa _Vs\Vert A\Vert Tq \, {{\,\mathrm{poly}\,}}(\log (\kappa _Vs\Vert \gamma \Vert \Vert A\Vert \Vert f\Vert T/\epsilon g))\bigr ) \end{aligned}$$
(8.21)

queries to \(O_A(h,l)\), \(O_x\), and \(O_f(h,l)\). The gate complexity of this algorithm is larger than its query complexity by a factor of \({{\,\mathrm{poly}\,}}(\log (\kappa _Vds\Vert \gamma \Vert \Vert A\Vert \Vert f\Vert T/\epsilon ))\).

The complexity of our algorithm depends on the parameter q defined in (8.2), which characterizes the decay of the final state relative to the initial state. As discussed in Section 8 of [10], it is unlikely that the dependence on q can be significantly improved: renormalization of the state effectively implements postselection, and an efficient procedure for performing it would have the unlikely consequence \(\mathrm {BQP} = \mathrm {PP}\).

We also require the real parts of the eigenvalues of A(t) to be non-positive for all \(t\in [0,T]\) so that the solution cannot grow exponentially. This requirement is essentially the same as in the time-independent case considered in [10] and improves upon the analogous condition in [5] (which requires an additional stability condition). Also as in [10], our algorithm can produce approximate solutions for non-diagonalizable A(t), although the dependence on \(\epsilon \) degrades to \({{\,\mathrm{poly}\,}}(1/\epsilon )\). For further discussion of these considerations, see Sections 1 and 8 of [10].

9 Boundary Value Problems

So far we have focused on initial value problems (IVPs). Boundary value problems (BVPs) are another widely studied class of differential equations that appear in many applications but can be harder to solve than IVPs.

Consider a sparse, linear, time-dependent system of differential equations as in Problem 1 but with a constraint on some linear combination of the initial and final states:

Problem 2

In the quantum BVP, we are given a system of equations

$$\begin{aligned} \frac{\mathrm {d}{x(t)}}{\mathrm {d}{t}}=A(t)x(t)+f(t), \end{aligned}$$
(9.1)

where \(x(t) \in {\mathbb {C}}^d\), \(A(t) \in {\mathbb {C}}^{d\times d}\) is s-sparse, and \(f(t) \in {\mathbb {C}}^d\) for all \(t\in [0,T]\), and a boundary condition \(\alpha x(0)+\beta x(T)=\gamma \) with \(\alpha , \beta , \gamma \in {\mathbb {C}}^d\). Suppose there exists a unique solution \({\hat{x}}\in C^{\infty }(0,T)\) of this boundary value problem. Given oracles that compute the locations and values of nonzero entries of A(t) for any t, and that prepare a quantum state \(|\gamma \rangle \) proportional to the boundary data \(\gamma \) and states \(|f(t)\rangle \) proportional to f(t) for any t, the goal is to output a quantum state \(|x(t^*)\rangle \) that is proportional to \(x(t^*)\) for some specified \(t^*\in [0,T]\).

As before, we can rescale [0, T] onto \([-1,1]\) by a linear mapping. However, since we have boundary conditions at \(t=0\) and \(t=T\), we cannot divide [0, T] into small subintervals. Instead, we directly map [0, T] onto \([-1,1]\) with a linear map K satisfying \(K(0)=1\) and \(K(T)=-1\):

$$\begin{aligned} K:t \mapsto {\overline{t}}=1-\frac{2t}{T}. \end{aligned}$$
(9.2)

Now the new differential equations are

$$\begin{aligned} \frac{\mathrm {d}{x}}{\mathrm {d}{\overline{t}}}=-\frac{T}{2}\bigl (A(t)x+f(t)\bigr ). \end{aligned}$$
(9.3)

If we define \(A_K({\overline{t}}):=-\frac{T}{2}A(t)\) and \(f_K({\overline{t}})=-\frac{T}{2}f(t)\), we have

$$\begin{aligned} \frac{\mathrm {d}{x}}{\mathrm {d}{\overline{t}}}=A_K({\overline{t}}) x({\overline{t}})+f_K({\overline{t}}) \end{aligned}$$
(9.4)

for \({\overline{t}}\in [-1,1]\). Now the boundary condition takes the form

$$\begin{aligned} \alpha x(1)+\beta x(-1)=\gamma . \end{aligned}$$
(9.5)
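The rescaling is elementary, but the orientation is easy to get wrong: the map sends \(t=0\) to \({\overline{t}}=1\), not \(-1\). A minimal sketch of the change of variables (the function names and the scalar test function are illustrative, not part of the algorithm), with the chain-rule factor \(-T/2\) behind \(A_K\) and \(f_K\) checked by a finite difference:

```python
import numpy as np

def K(t, T):
    """Linear map (9.2): sends [0, T] onto [-1, 1] with K(0) = 1, K(T) = -1."""
    return 1 - 2 * t / T

def K_inv(tbar, T):
    """Inverse map: sends [-1, 1] back onto [0, T]."""
    return T * (1 - tbar) / 2

# Chain rule: if y(tbar) = x(K_inv(tbar)), then dy/dtbar = -(T/2) dx/dt,
# which is the origin of the -T/2 factors in A_K and f_K.
T = 3.0
x = np.exp          # illustrative solution x(t) = e^t, so dx/dt = x
tbar, h = 0.2, 1e-6
dy_dtbar = (x(K_inv(tbar + h, T)) - x(K_inv(tbar - h, T))) / (2 * h)
expected = -(T / 2) * x(K_inv(tbar, T))
```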

Since we only have one solution interval, we need to choose a larger order n of the Chebyshev series to reduce the solution error. In particular, we take

$$\begin{aligned} n=\frac{e}{2}\Vert A\Vert T\max \biggl \{\biggl \lfloor \frac{\log (\Omega )}{\log (\log (\Omega ))}\biggr \rfloor ,\biggl \lfloor \frac{\log (\omega )}{\log (\log (\omega ))}\biggr \rfloor \biggr \} \end{aligned}$$
(9.6)

where \(\Omega \) and \(\omega \) are the same as in Theorem 1.

As in Sect. 3, we approximate x(t) by a finite Chebyshev series with interpolating nodes \(\{t_l=\cos \frac{l\pi }{n} : l \in [{n}]\}\) and thereby obtain a linear system

$$\begin{aligned} \frac{\mathrm {d}{x(t_l)}}{\mathrm {d}{\overline{t}}}=A_K(t_l)x(t_l)+f_K(t_l), \quad l\in [{n}] \end{aligned}$$
(9.7)

with the boundary condition

$$\begin{aligned} \alpha x(t_0)+\beta x(t_n)=\gamma . \end{aligned}$$
(9.8)
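To make the structure of this linear system concrete, here is a hedged classical sketch for the scalar case \(d=1\) with constant coefficients (the function name and test problem are illustrative assumptions, not the paper's quantum encoding): the unknowns are the Chebyshev coefficients \(c_k\), rows \(l=1,\dots ,n\) collocate the rescaled ODE at \(t_l=\cos \frac{l\pi }{n}\), and row \(l=0\) encodes the boundary condition.

```python
import math
import numpy as np
from numpy.polynomial import chebyshev as C

def solve_bvp_cheb(a, f, T, alpha, beta, gamma, n, t_star):
    """Classical sketch of the collocation system for d = 1 with constant
    a and f (illustrative only)."""
    # Rescaled operators A_K = -(T/2) a and f_K = -(T/2) f, as in (9.4).
    aK, fK = -T / 2 * a, -T / 2 * f
    nodes = np.cos(np.arange(1, n + 1) * np.pi / n)  # t_l = cos(l*pi/n), l = 1..n
    M = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    # Row l = 0: boundary condition, sum_k (alpha + (-1)^k beta) c_k = gamma.
    M[0, :] = alpha + (-1.0) ** np.arange(n + 1) * beta
    b[0] = gamma
    # Rows l = 1..n: collocate dx/dtbar = aK x + fK, i.e.
    # sum_k [T_k'(t_l) - aK T_k(t_l)] c_k = fK.
    for k in range(n + 1):
        e = np.zeros(n + 1)
        e[k] = 1.0  # coefficient vector selecting T_k
        M[1:, k] = C.chebval(nodes, C.chebder(e)) - aK * C.chebval(nodes, e)
    b[1:] = fK
    c = np.linalg.solve(M, b)
    # Read off x(t*) = sum_k c_k T_k(K(t*)) with K(t*) = 1 - 2 t*/T.
    return C.chebval(1 - 2 * t_star / T, c)

# Illustrative problem: x' = -x + 1 on [0, 1] with x(0) = 2,
# whose exact solution is x(t) = 1 + e^{-t}.
x_half = solve_bvp_cheb(a=-1.0, f=1.0, T=1.0, alpha=1.0, beta=0.0,
                        gamma=2.0, n=16, t_star=0.5)
```

A boundary condition genuinely coupling both endpoints (\(\beta \ne 0\)) works the same way; only row \(l=0\) changes.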

Observe that the linear equations have the same form as in (3.10). Instead of (3.11), the term with \(l=0\) encodes the condition (9.8) expanded in a Chebyshev series, namely

$$\begin{aligned} \alpha _i\sum _{k=0}^nc_{i,k}T_k(t_0)+\beta _i\sum _{k=0}^nc_{i,k}T_k(t_n)=\gamma _i \end{aligned}$$
(9.9)

for each \(i\in [{d}]_0\). Since \(T_k(t_0)=1\) and \(T_k(t_n)=(-1)^k\), this can be simplified as

$$\begin{aligned} \sum _{k=0}^n(\alpha _i+(-1)^k\beta _i)c_{i,k}=\gamma _i. \end{aligned}$$
(9.10)
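The endpoint identities used here, \(T_k(1)=1\) and \(T_k(-1)=(-1)^k\), follow from \(T_k(\cos \theta )=\cos k\theta \) at \(\theta =0\) and \(\theta =\pi \). A quick numerical check (purely illustrative):

```python
import numpy as np
from numpy.polynomial import chebyshev as C

for k in range(10):
    e = np.zeros(k + 1)
    e[k] = 1.0  # coefficient vector selecting T_k
    assert np.isclose(C.chebval(1.0, e), 1.0)            # T_k(1) = 1
    assert np.isclose(C.chebval(-1.0, e), (-1.0) ** k)   # T_k(-1) = (-1)^k
```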

If \(\alpha _i+(-1)^k\beta _i=0\), the \(|i0\rangle \langle ik|\) element of \(L_2(A_K)\) is zero; if \(\alpha _i+(-1)^k\beta _i\ne 0\), both sides of this equality can be divided by \(\alpha _i+(-1)^k\beta _i\), so that the terms with \(l=0\) can be encoded as in (3.11).

Now this system can be written in the form of equation (3.12) with \(m=1\). Here L, \(|X\rangle \), and \(|B\rangle \) are the same as in (3.16), (3.13), and (3.14), respectively, with \(m=1\), except for adjustments to \(L_3\) that we now describe.

The matrix \(L_3\) represents the linear combination \(x_i(t^*)=\sum _{k=0}^nc_{i,k}T_k(t^*)\). Thus we take

$$\begin{aligned} L_3=\sum _{i=0}^{d-1}\sum _{k=0}^nT_k(t^*)|i0\rangle \langle ik|. \end{aligned}$$
(9.11)

Since \(|T_k(t^*)|\le 1\), we have

$$\begin{aligned} \Vert L_3\Vert \le n+1, \end{aligned}$$
(9.12)

and it follows that Lemma 3 also holds for boundary value problems. Similarly, Lemma 4 still holds with \(m=1\).
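To illustrate (9.11) and the bound (9.12), here is a small dense sketch of \(L_3\) (the helper name and block layout are assumptions for illustration, not the paper's encoding): within each \((n+1)\)-dimensional coefficient block, the row \(|i0\rangle \) reads off \(x_i(t^*)=\sum _{k=0}^nc_{i,k}T_k(t^*)\).

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def build_L3(d, n, tbar_star):
    """Dense sketch of L_3 = sum_i sum_k T_k(t*) |i0><ik| acting on d
    stacked blocks of n+1 Chebyshev coefficients each."""
    Tk = np.array([C.chebval(tbar_star, np.eye(n + 1)[k]) for k in range(n + 1)])
    L3 = np.zeros((d * (n + 1), d * (n + 1)))
    for i in range(d):
        # Row |i0> of block i holds (T_0(t*), ..., T_n(t*)).
        L3[i * (n + 1), i * (n + 1):(i + 1) * (n + 1)] = Tk
    return L3

d, n = 2, 8
L3 = build_L3(d, n, tbar_star=0.3)
c = np.random.default_rng(0).standard_normal(d * (n + 1))  # stacked c_{i,k}
x0_star = (L3 @ c)[0]  # component x_0(t*) appears in position |00>
```

Since the nonzero rows have disjoint column supports and entries bounded by 1, the spectral norm is in fact at most \(\sqrt{n+1}\), and in particular at most \(n+1\) as in (9.12).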

We are now ready to analyze the complexity of the quantum BVP algorithm. The matrix L defined above is a \((p+2)d(n+1)\times (p+2)d(n+1)\) matrix with O(ns) nonzero entries in any row or column, with condition number \(O(\kappa _Vpn^{3.5})\). By Lemma 5 with \(p=O(1)\), \(O(q/\sqrt{n})\) repetitions suffice to ensure success probability \(\Omega (1)\). By (9.6), n is linear in \(\Vert A\Vert T\) and poly-logarithmic in \(\Omega \) and \(\omega \). Therefore, we have the following:

Theorem 2

Consider an instance of the quantum BVP as defined in Problem 2. Assume A(t) can be diagonalized as \(A(t)=V(t)\Lambda (t)V^{-1}(t)\) where \(\Lambda (t)={{\,\mathrm{diag}\,}}(\lambda _1(t),\ldots ,\lambda _d(t))\) with \(\mathop {\mathrm {Re}}(\lambda _i(t))\le 0\) for each \(i \in [{d}]_0\) and \(t\in [0,T]\). Then there exists a quantum algorithm that produces a state \(x(t^*)/\Vert x(t^*)\Vert \) that is \(\epsilon \)-close to \({\hat{x}}(t^*)/\Vert {\hat{x}}(t^*)\Vert \) in the \(l^2\) norm, succeeding with probability \(\Omega (1)\), with a flag indicating success, using

$$\begin{aligned} O\bigl (\kappa _Vs\Vert A\Vert ^4T^4q \, {{\,\mathrm{poly}\,}}(\log (\kappa _Vs\Vert A\Vert g'T/\epsilon g))\bigr ) \end{aligned}$$
(9.13)

queries to \(O_A(h,l)\), \(O_x\), and \(O_f(h,l)\). Here \(\Vert A\Vert \), \(\kappa _V\), g, \(g'\) and q are defined as in Theorem 1. The gate complexity is larger than the query complexity by a factor of \({{\,\mathrm{poly}\,}}(\log (\kappa _Vds\Vert A\Vert g'T/\epsilon ))\).

As for initial value problems, we can simplify this result in the time-independent case.

Corollary 2

For a time-independent boundary value problem, under the same assumptions as Theorem 2, there exists a quantum algorithm using

$$\begin{aligned} O\bigl (\kappa _Vs\Vert A\Vert ^4T^4q \, {{\,\mathrm{poly}\,}}(\log (\kappa _Vs\gamma \Vert A\Vert \Vert f\Vert T/\epsilon g))\bigr ) \end{aligned}$$
(9.14)

queries to \(O_A(h,l)\), \(O_x\), and \(O_f(h,l)\). The gate complexity of this algorithm is larger than its query complexity by a factor of \({{\,\mathrm{poly}\,}}(\log (\kappa _Vds\gamma \Vert A\Vert \Vert f\Vert T/\epsilon ))\).

10 Discussion

In this paper, we presented a quantum algorithm to solve linear, time-dependent ordinary differential equations. Specifically, we showed how to employ a global approximation based on the spectral method as an alternative to the more straightforward finite difference method. Our algorithm handles time-independent differential equations with almost the same complexity as [10], but unlike that approach, can also handle time-dependent differential equations. Compared to [5], our algorithm improves the complexity of solving time-dependent linear differential equations from \({{\,\mathrm{poly}\,}}(1/\epsilon )\) to \({{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\).

This work raises several natural open problems. First, our algorithm must assume that the solution is smooth. If the solution is in \(C^r\), the solution error is \(O(\frac{1}{n^{r-2}})\) by Lemma 1. Can we improve the complexity to \({{\,\mathrm{poly}\,}}(\log (1/\epsilon ))\) under such weaker smoothness assumptions?

Second, the complexity of our algorithm is logarithmic in the parameter \(g'\) defined in (8.2), which characterizes the amount of fluctuation in the solution. However, the query complexity of Hamiltonian simulation is independent of that parameter [7, 30]. Can we develop quantum algorithms for general differential equations with query complexity independent of \(g'\)?

Third, our algorithm has nearly optimal dependence on T, scaling as \(O(T {{\,\mathrm{poly}\,}}(\log T))\). According to the no-fast-forwarding theorem [6], the complexity must be at least linear in T, and indeed linear complexity is achievable for the case of Hamiltonian simulation [14]. Can we handle general differential equations with complexity linear in T? Furthermore, can we achieve an optimal tradeoff between T and \(\epsilon \) as shown for Hamiltonian simulation in [28]?

Finally, can the techniques developed here be applied to give improved quantum algorithms for linear partial differential equations, or even for nonlinear ODEs or PDEs?