Abstract
Nonlinear differential equations exhibit rich phenomena in many fields but are notoriously challenging to solve. Recently, Liu et al. (in: Proceedings of the National Academy of Sciences 118(35), 2021) demonstrated the first efficient quantum algorithm for dissipative quadratic differential equations under the condition \(R < 1\), where R measures the ratio of nonlinearity to dissipation using the \(\ell _2\) norm. Here we develop an efficient quantum algorithm based on Liu et al. (2021) for reaction–diffusion equations, a class of nonlinear partial differential equations (PDEs). To achieve this, we improve upon the Carleman linearization approach introduced in Liu et al. (2021) to obtain a faster convergence rate under the condition \(R_D < 1\), where \(R_D\) measures the ratio of nonlinearity to dissipation using the \(\ell _{\infty }\) norm. Since \(R_D\) is independent of the number of spatial grid points n while R increases with n, the criterion \(R_D<1\) is significantly milder than \(R<1\) for high-dimensional systems and can stay convergent under grid refinement for approximating PDEs. As applications of our quantum algorithm we consider the Fisher-KPP and Allen-Cahn equations, which have interpretations in classical physics. In particular, we show how to estimate the mean square kinetic energy in the solution by postprocessing the quantum state that encodes it to extract derivative information.
1 Introduction
Nonlinear partial differential equations (PDEs) of reaction–diffusion type are widespread and have many applications, ranging from biology and ecology to data science. Exhibiting rich phenomena, reaction–diffusion equations have been applied to describe biological transport networks such as leaf venations and blood flow [1,2,3,4,5,6,7,8,9,10], predator–prey dynamics in interacting populations [11,12,13], prediction of brain functions and tumor growth [14, 15], the formation of the Turing patterns in tissues and organs [16,17,18,19], dendritic colony growth [20, 21], complex chemical processes such as combustion [22,23,24,25] and calcium dynamics [26]. Reaction–diffusion equations have also been applied to data classification [27,28,29], and image segmentation and inpainting [30,31,32,33]. In many cases, the underlying model can be viewed as an energy optimization procedure, with the reaction–diffusion equations as the gradient flow. Such reaction–diffusion equations inherit the property that energy decays with time. Moreover, the maximum principle is satisfied, which states that if the initial and boundary conditions are bounded by a certain constant, then the entire solution remains bounded (in the \(L^\infty \) sense) for all time. When designing numerical approximation schemes it is often of great interest to maintain these properties exactly.
Attempts to solve such PDEs on classical computers are hampered by the so-called curse of dimensionality, in which computational complexity grows exponentially with spatial dimension [34]. For example, in d dimensions, if each coordinate is discretized by n grid points, the grid will have \(\Omega (n^d)\) grid points.
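To make this scaling concrete, the following short calculation (illustrative only; the 8-bytes-per-value memory figure and the qubit count \(d\lceil \log _2 n\rceil \) are assumptions for illustration, not quantities from this paper) contrasts the exponential growth of \(n^d\) with a \({{\,\textrm{poly}\,}}(d, \log n)\) resource:

```python
import math

n, bytes_per_value = 100, 8
for d in (1, 2, 3, 6):
    points = n ** d                          # classical grid size n^d
    print(d, points, points * bytes_per_value / 1e9, "GB")

# A poly(d, log n) resource, e.g. a qubit count of d * ceil(log2(n)):
qubits = 6 * math.ceil(math.log2(n))
print(qubits)                                # 42 qubits can index 10^12 grid points
```

Already at \(d = 6\) the classical grid requires terabytes of storage, while the corresponding quantum register is a few dozen qubits.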
Recent advances in quantum computing offer a fresh approach to the efficient solution of such high-dimensional problems. Quantum algorithms have been developed to prepare a quantum state encoding the solution to an \(n^d\)-dimensional linear system, while in some cases only requiring quantum circuits of complexity \({{\,\textrm{poly}\,}}(d,\log n)\) [35,36,37,38,39,40,41,42,43]. Such quantum algorithms have been applied to address high-dimensional problems governed by linear ODEs [44,45,46,47] and PDEs [48,49,50,51,52,53,54].
It has been a longstanding open problem to understand the capability of quantum computers to solve nonlinear differential equations. An early work proposed a quantum algorithm for ODEs that simulates polynomial nonlinearities by employing multiple copies of the solution [55]. For n-dimensional systems of polynomial ordinary differential equations, this quantum algorithm scales as \({{\,\textrm{poly}\,}}(\log n, 1/\epsilon ^{T})\). Finding quantum algorithms with polynomial scaling in T for solving nonlinear differential equations remained an open problem. Furthermore, complexity-theoretic arguments indicate that this should not be achievable in the most general case [56,57,58], but rather will require exploiting specific properties of restricted classes of nonlinear differential equations.
Recently, in [59], a quantum algorithm based on Carleman linearization [60,61,62] was proposed for solving a class of nonlinear differential equations
Here we assume \(F_1\) is dissipative, i.e., all the eigenvalues of \(F_1\) are negative. We are given an initial condition \(U(0) = U_{\textrm{in}}\), an error tolerance \(\epsilon \), and a time duration T. The efficiency of the algorithm depends on R, defined as
where \(\lambda _1\) is the largest eigenvalue of \(F_1\). The quantity R is used to quantify the relative strength of the nonlinearity and forcing to the linear dissipation according to the \(\ell _2\) norm. A more general definition of R for polynomial differential equations is given in (3.10). Under the condition \(R < 1\), the algorithm has complexity \(\mathcal {O}( \frac{T^2 q}{g \epsilon } {{\,\textrm{poly}\,}}(\log T, \log n, \log 1/\epsilon ) )\), where \(q = \Vert U_{\textrm{in}}\Vert /\Vert U(T)\Vert \), and \(g = \Vert U(T)\Vert \). This quadratic scaling with T was an exponential improvement over prior quantum algorithms for solving nonlinear differential equations. The error dependence of quantum Carleman linearization was subsequently improved from \({{\,\textrm{poly}\,}}(1/\epsilon )\) to \({{\,\textrm{poly}\,}}(\log 1/\epsilon )\) in [63], by assuming the log-norm of the dissipation matrix is negative rather than exploiting a diagonalizability condition.
Various quantum algorithms for nonlinear differential equations have also been investigated recently based on Koopman-von Neumann linearization [64,65,66,67,68] and the related level set formalism [69]. Others have been proposed based on the non-Hermitian Hamiltonian approach [64, 66, 70] and the homotopy perturbation approach [71]. Carleman linearization can be treated as a particular Koopman-von Neumann linearization, while the non-Hermitian Hamiltonian approach is inspired by quantum simulation. The quantum algorithm of [71] combines the homotopy perturbation method with the high-precision quantum linear ODE solver of [45], achieving complexity that scales linearly in T and polylogarithmically in \(1/\epsilon \). This is shown under the condition \(K = 4R < 1\), which is stricter than the condition \(R < 1\) used in [59]. The quantum algorithm of [69] is based on the level set method, which maps a nonlinear differential equation into a linear differential equation describing the dynamics of the level sets of the solution to the original nonlinear differential equation. Given a specific construction for the encoding of initial data, the quantum algorithm of [69] encodes the level set function from which physical observables can be estimated corresponding to multiple initial conditions.
Many of the quantum algorithms proposed for solving differential equations depend on solving a high-dimensional linear system, and their complexities are thus determined by the condition number of this system. Deriving bounds on this condition number from the properties of the original differential equations is challenging and largely unsolved. Quantum complexity lower bounds on simulating nonlinear quantum dynamics [58] or classical dynamics [59] show that this condition number becomes exponential in the worst case.
In this paper, we extend the quantum algorithm of [59] by adapting the Carleman linearization approach to reaction–diffusion PDEs. We also show that reaction–diffusion equations remain tractable on quantum computers even for larger R, using the maximum principle to rule out the worst-case exponential time complexity of [59]. Finally, we conduct several numerical experiments on Fisher-KPP and Allen-Cahn equations to verify the convergence rate and efficiency of the improved Carleman linearization.
We compare our improved quantum Carleman linearization algorithm to the original one [59] in Table 1. Both quantum Carleman linearization methods solve an \(n_d\)-dimensional (\(n_d = n^d\)) system of ordinary differential equations with initial condition \(U_{\textrm{in}}\), for a given evolution time T and normalized \(\ell _2\) error tolerance \(\epsilon \). (In the present work, we consider this set of ODEs as arising from the discretization of a PDE.) The quantum algorithm in [59] solves nonlinear dissipative differential equations of the form in (1.1). Our new quantum algorithm solves reaction–diffusion equations, where \(F_1\) is a discretized Laplacian and \(F_0 = 0\), but the \(F_2 U^{\otimes 2}\) term is instead allowed to be a high-degree polynomial. Thus, the class of problems we consider here is neither a strict generalization nor a strict special case of that considered in [59].
Our new quantum algorithm produces a Feynman-Kitaev history state encoding the full time-evolution of the solution \(U(t): t \in [0,T]\), whereas the algorithm of [59] produces a quantum state encoding the final value U(T). The history state we produce corresponds to the gradient flow of the energy functional. We can post-process this state to extract derivative information which can be interpreted as kinetic energy in classical physical systems modeled by the nonlinear PDE.
The quantum algorithm of [59] has time-complexity proportional to q/g where \(q = \Vert U_{\textrm{in}}\Vert /\Vert U(T)\Vert \) and \(g = \Vert U(T)\Vert \) is the final norm of the solution. Our new algorithm instead has complexity proportional to \(\Vert U_{\textrm{in}}\Vert /G\), where \(G = \frac{1}{T}\int _0^T \Vert U(t)\Vert \,\textrm{d}t\) is the time-averaged norm of the solution. In some cases, 1/G can be much smaller than 1/g. For example, the solution \(u(t) = \text {e}^{-t}\) arises in many homogeneous dissipative differential equations. In this case \(1/g = \text {e}^T\) while \(1/G = \Omega (T)\).
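The example above can be checked in closed form (a worked illustration only, for the scalar solution \(u(t) = \text {e}^{-t}\)):

```python
import math

def g_final(T):
    # g = ||u(T)|| for the solution u(t) = e^{-t}
    return math.exp(-T)

def G_avg(T):
    # G = (1/T) * integral_0^T e^{-t} dt = (1 - e^{-T}) / T
    return (1.0 - math.exp(-T)) / T

T = 20.0
print(1.0 / g_final(T))   # grows like e^T (~4.85e8 here)
print(1.0 / G_avg(T))     # grows like T   (~20 here)
```

The overhead 1/g is exponential in T, while 1/G grows only linearly, matching the comparison in the text.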
Polynomial complexity is here shown under the assumption \(R_D < 1\), where \(R_D\) is a ratio of nonlinearity to dissipation in \(\ell _\infty \) norm, whereas the algorithm of [59] requires \(R < 1\), where R is a ratio of nonlinearity to dissipation in \(\ell _2\) norm. The latter is a stronger assumption, not well suited for the solution of high-dimensional PDEs because it grows under grid refinement, whereas \(R_D\) converges to a constant. Specifically, in the limit where each spatial dimension is discretized into \(n \rightarrow \infty \) steps, the number of lattice sites scales as \(n^d\) and the \(\ell _2\) norm of the discretized solution vector scales as \(n^{d/2}\), which leads to divergent R (see (1.12) and (1.13) for detailed discussion).
The solutions to general nonlinear differential equations can have exponentially growing norms, which would result in an exponential complexity for the algorithm introduced here. However, we rule this out for reaction–diffusion equations by establishing an upper bound on the \(\ell _{\infty }\) norm of the solution independent of T as shown in Theorem 5.1.
We also study the extraction of classical information of practical interest from the history state produced by our algorithm. First, from the quantum state, we can directly estimate the mean square amplitude over a specific sub-domain, which can be understood as the portion of a physical observable on this sub-domain. Our approach is a direct application of the amplitude estimation technique [72] and achieves a quadratic speedup in precision over standard classical Monte Carlo sampling. Second, we show how to estimate the portion of the kinetic energy on a specific sub-domain by developing a quantum algorithm that transforms a quantum state encoding function values into a quantum state encoding its partial derivatives. This algorithm is based on the discrete Fourier transform. It requires only \(\mathcal {O}(1)\) uses of the quantum Fourier transform (QFT) and of an input oracle for a diagonal matrix, and may be of independent interest in other problems such as quantum optimization algorithms. Our main results are summarized in Table 2. Finally, we briefly discuss the potential advantages of the history state compared to the final state. In particular, the history state structure allows us to estimate the time when the system reaches equilibrium and to run a pre-diagnosis procedure that avoids the possible exponential overhead caused by fast decay of the solution.
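As a purely classical illustration of the derivative-extraction idea (a sketch only: the classical FFT stands in for the QFT, and the test function is hypothetical), one can multiply the Fourier coefficients of a periodic function by the diagonal factor \(2\pi i k\) to obtain samples of its derivative, and then average the squared derivative as a kinetic-energy proxy:

```python
import numpy as np

n = 64
x = np.arange(n) / n                 # uniform grid on [0, 1), periodic BCs
u = np.sin(2 * np.pi * x)            # hypothetical test function

# Differentiate by scaling Fourier modes by 2*pi*i*k: the classical
# counterpart of the QFT + diagonal-oracle step described in the text.
k = np.fft.fftfreq(n, d=1.0 / n)     # integer wavenumbers
du = np.fft.ifft(2j * np.pi * k * np.fft.fft(u)).real

exact = 2 * np.pi * np.cos(2 * np.pi * x)
assert np.max(np.abs(du - exact)) < 1e-10

# Mean square of the derivative over the grid, a kinetic-energy proxy
kinetic = np.mean(du**2)             # equals 2*pi^2 for this test function
```

The whole pipeline uses one forward and one inverse transform plus one diagonal multiplication, mirroring the \(\mathcal {O}(1)\) QFT/oracle count quoted above.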
The paper is organized as follows. Section 2 introduces the background of reaction–diffusion equations. Section 3 develops the Carleman linearization with \(\ell _2\) and \(\ell _{\infty }\) convergence analysis. Section 4 presents the problem model and gives the quantum algorithm with a detailed complexity analysis. Section 5 establishes lower bound results. Section 6 describes how our approach could be applied to kinetic energy estimation problems. Finally, we conclude with a discussion of the results and some possible future directions in Sect. 7.
1.1 Preliminaries
Here we denote the domain, boundaries, functions, and norms as follows.
We consider a d-dimensional hypercube as the spatial domain, denoted \(\mathscr {D}:=[0,1]^d\). We denote the space-time domain as \(\mathscr {D}_T:=[0,1]^d\times (0,T]\). We also write \(\partial \mathscr {D}\) and \(\partial \mathscr {D}_T\) for the boundaries of \(\mathscr {D}\) and \(\mathscr {D}_T\), respectively.
We consider a uniform spatial discretization on \(\mathscr {D}\) and introduce n discretization nodes for each coordinate. To represent it, we denote \([{n}]_0 :=\{0,1,\ldots ,n-1\}\) and a set of multi-indices as
We then denote the set of uniform nodes as
where \(\chi _l\) maps index l to the discretization node. The exact expression for \(\chi _l\) depends on the boundary conditions. For periodic boundary conditions, \(\chi _l\) is defined as
while for Dirichlet boundary conditions, it is given by
For convenience, we also introduce the set of boundary indices, which is defined as
Let \(u:\mathscr {D}_T \rightarrow \mathbb {R}\) be the solution to a PDE. We can discretize u(x, t) on the set of uniform nodes \(\chi \) to obtain an \(n_d\)-dimensional vector U(t), where \(n_d = n^d\). The vector’s entries \(U_1(t), U_2(t), \ldots , U_{n_d}(t)\) are the elements of \(\{u(\chi _l,t)|l \in \mathcal {I}\}\), arranged according to the lexicographic order on \(\mathcal {I}\).
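The indexing can be made concrete with a small sketch (illustrative only: the node map below uses the Dirichlet-style convention \(\chi _l = (l+1)/(n+1)\) per coordinate, assumed here to be consistent with the discrete Laplacian spectrum quoted in Sect. 2.2, and the smooth field u is hypothetical). It builds \(\mathcal {I}\) in lexicographic order and flattens a sampled function into the vector U(t):

```python
import itertools
import numpy as np

d, n = 2, 3                          # spatial dimension and nodes per coordinate

# Multi-index set I = [n]_0^d, generated in lexicographic order
indices = list(itertools.product(range(n), repeat=d))

def chi(l):
    # Dirichlet-style interior nodes chi_l = (l+1)/(n+1) in each coordinate
    # (assumed convention; the paper's exact map depends on the BCs)
    return tuple((li + 1) / (n + 1) for li in l)

def u(x, t):
    # hypothetical smooth field vanishing on the Dirichlet boundary
    return np.sin(np.pi * x[0]) * np.sin(np.pi * x[1]) * np.exp(-t)

# Discretized solution vector U(t) with n_d = n**d entries
U = np.array([u(chi(l), 0.0) for l in indices])
assert U.shape == (n**d,)
```

Note that `(0, 1)` precedes `(1, 0)` in lexicographic order; by the symmetry of this test field, the corresponding entries of U coincide.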
We now discuss our notations for norms. For a vector \(a = [a_1, a_2, \ldots , a_n]\in \mathbb {R}^n\), we denote the vector \(\ell _p\) norm as
For a matrix \(A\in \mathbb {R}^{n\times n}\), we denote the operator norm \(\left\Vert \cdot \right\Vert _{p,q}\) induced by the vector \(\ell _p\) and \(\ell _q\) norms as
For a continuous scalar function \(f: [0,T]\rightarrow \mathbb {R}\), we denote the \(L^\infty \) norm as
For a continuous scalar function \(u: \overline{\mathscr {D}}_T\rightarrow \mathbb {R}\), for a fixed t, the \(L^p\) norm of \(u(\cdot ,t)\) is given by
In particular, when no subscript is used, we mean \(\left\Vert \cdot \right\Vert = \left\Vert \cdot \right\Vert _2\) for vector and matrix norms by default, and \(\left\Vert \cdot \right\Vert = \left\Vert \cdot \right\Vert _{L^2}\) for function norm by default.
For a continuous scalar function \(u: [0,1]\times [0,T]\rightarrow \mathbb {R}\) that is discretized in space using uniform interpolation nodes, we obtain an estimate of its \(L^p\) norm as a Riemann sum:
If we denote \(U(t) = [u(\frac{0}{n},t), u(\frac{1}{n},t), \ldots , u(\frac{n-1}{n},t)]\), the RHS of (1.12) is \(\frac{1}{n}\Vert U(t)\Vert _{p}^p\). This indicates that
Thus, if \(u(\cdot ,t)\) is a given function in continuous one-dimensional space, then the \(\ell _p\) norm of its spatially discretized function vector U(t) increases under grid refinement as \(n^{1/p}\). Similarly, for a \(u: \overline{\mathscr {D}}_T\rightarrow \mathbb {R}\) with a general spatial dimension d, the \(\ell _p\) norm of the spatially discretized function vector U(t) increases as \(n^{d/p}\).
Note that when \(p=\infty \),
That means the \(\ell _{\infty }\) norm of the spatially discretized function vector U(t) stays convergent in the continuum limit of \(n \rightarrow \infty \).
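These scalings are easy to verify numerically (a sketch with a hypothetical test function \(u(x) = \sin (2\pi x)\) on a one-dimensional periodic-style grid): the \(\ell _2\) norm of the discretized vector grows like \(n^{1/2}\), while the \(\ell _{\infty }\) norm and the rescaled Riemann-sum estimate both converge:

```python
import numpy as np

f = lambda x: np.sin(2 * np.pi * x)     # hypothetical test function

for n in (16, 64, 256):
    x = np.arange(n) / n                # uniform grid on [0, 1)
    U = f(x)
    l2 = np.linalg.norm(U)              # grows like n^{1/2}
    linf = np.max(np.abs(U))            # converges (here: exactly 1)
    riemann = l2 / np.sqrt(n)           # n^{-1/2} ||U||_2 -> ||f||_{L^2} = 1/sqrt(2)
    print(n, l2, linf, riemann)
```

Doubling the grid multiplies the raw \(\ell _2\) norm by \(\sqrt{2}\), which is exactly the divergence of R under grid refinement discussed above, while the \(\ell _{\infty }\)-based quantities stay fixed.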
For real functions \(f,g: \mathbb {R}\rightarrow \mathbb {R}\), we write \(f=\mathcal {O}(g)\) if there exists \(c>0\), such that \(|f(\tau )|\leqslant c|g(\tau )|\) for all \(\tau \in \mathbb {R}\). We write \(f=\Omega (g)\) if \(g=\mathcal {O}(f)\), and \(f=\Theta (g)\) if both \(f=\mathcal {O}(g)\) and \(g=\mathcal {O}(f)\). We use \({\widetilde{\mathcal {O}}}\) to suppress logarithmic factors in the asymptotic expression, i.e., \(f={\widetilde{\mathcal {O}}} (g)\) if \(f=\mathcal {O}\Bigl (g{{\,\textrm{poly}\,}}(\log g)\Bigr )\). We write \(f=o(g)\) if \(\limsup _{\tau \rightarrow \infty } \frac{|f(\tau )|}{|g(\tau )|}=0\), where g is nonzero.
2 Problem Settings
In this section, we introduce the class of nonlinear PDEs that we focus on and discuss its spatial discretization, as well as a priori bounds on its solutions. We then introduce the problem statement with input and output settings.
2.1 Reaction–diffusion equation
We focus on a class of nonlinear PDEs—the reaction–diffusion equations
where u is a real-valued scalar function at position \(x \in \mathscr {D}= [0,1]^d\) and time \(t\in \mathbb {R}^+\), f(u) is the nonlinear term and D is a positive number. Without loss of generality, denoting \(\mathscr {D}= \mathscr {D}_1\times \mathscr {D}_2\), we are given homogeneous Dirichlet boundary conditions imposed on \(d_1\) dimensions (\(x\in \mathscr {D}_1\)) and periodic boundary conditions imposed on \(d_2\) dimensions (\(x\in \mathscr {D}_2\), \(d_1 + d_2 = d\))
The solution u(x, t) in (2.1) is the \(L^2\) gradient flow of the free energy functional
with a potential satisfying
In other words, f is a field driven by the potential F.
In this paper, we focus on the following specific reaction–diffusion equations
with the integer \(M\geqslant 2\). Without loss of generality, we assume \(|a|, |b| = o(d)\). The motivation to consider this type of PDE is twofold: first, in physical and biological applications, a nonlinearity of this form is frequently encountered. For example, in the phase transition model (the so-called Allen-Cahn equation), \(M = 3\) [73], while \(M = 2\) corresponds to the Fisher-KPP equation [16, 74]. Second, on quantum computers, it is a reasonable task to construct tensor powers of a quantum state, such as \(u^{\otimes M}\), which exactly corresponds to the polynomial nonlinearity in (2.6). Although we do not do so in this paper, one might also consider an input model in which a more general nonlinearity f is specified by an oracle \(U_f\).
2.2 Spatial discretization
Our approach to solving reaction–diffusion PDEs (2.6) on quantum computers starts by performing spatial discretization to reduce to a problem of solving a system of nonlinear ODEs. Specifically, we apply the central difference discretization on (2.1) to obtain the \(n^d\)-dimensional polynomial ODE
where \(\Delta _h\) stands for the central difference of the Laplacian with homogeneous Dirichlet boundary condition or periodic boundary condition, defined as
Here \(D_h\) is the one-dimensional discrete Laplacian operator. For homogeneous Dirichlet boundary conditions, \(D_h\) is
We denote the eigenvalues of \(D_h\) as \(\mu _1\ge \mu _2\ge \cdots \ge \mu _n\). Specifically, \(\mu _1=-4(n+1)^2\sin ^2\left( \frac{\pi }{2n+2}\right) \). For periodic boundary conditions, the one-dimensional discrete Laplacian operator is
In this case, the largest eigenvalue of \(D_h^\text {per}\) is \(\mu _1^\text {per} = 0\).
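Both spectra can be checked numerically. The sketch below (assuming n interior points with spacing \(1/(n+1)\) for Dirichlet and n points with spacing 1/n for periodic, the conventions matching the eigenvalues quoted above) builds the two one-dimensional operators:

```python
import numpy as np

n = 8

# Homogeneous Dirichlet: n interior nodes, spacing h = 1/(n+1)
h = 1.0 / (n + 1)
Dh = (np.diag(-2 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
      + np.diag(np.ones(n - 1), -1)) / h**2
mu1 = np.max(np.linalg.eigvalsh(Dh))
# Largest eigenvalue matches -4(n+1)^2 sin^2(pi / (2n+2))
assert np.isclose(mu1, -4 * (n + 1)**2 * np.sin(np.pi / (2 * n + 2))**2)

# Periodic: n nodes, spacing 1/n, with wrap-around couplings
hp = 1.0 / n
Dh_per = (np.diag(-2 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
          + np.diag(np.ones(n - 1), -1)) / hp**2
Dh_per[0, -1] += 1 / hp**2
Dh_per[-1, 0] += 1 / hp**2
mu1_per = np.max(np.linalg.eigvalsh(Dh_per))
assert np.isclose(mu1_per, 0.0, atol=1e-8)   # constant vector is in the kernel
```

In the Dirichlet case all eigenvalues are strictly negative, which is the dissipativity used throughout; in the periodic case the constant mode gives \(\mu _1^\text {per} = 0\).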
It is worth pointing out that, besides serving as a numerical discretization of the corresponding PDE, the discrete reaction–diffusion equation is of interest in its own right. For example, the discrete Allen-Cahn equation has been applied to unsupervised and semi-supervised graph classification, graph cut minimization, social network segmentation, and image inpainting [27,28,29, 75,76,77].
We also introduce bounds on the solution to the discrete reaction–diffusion equations (2.7), which are used in the proof of later theorems.
Lemma 2.1
(\(\ell _\infty \) A priori Bounds on the Solution). Assume \(f \in C^\infty (\mathbb {R})\) and has at least two distinct real-valued roots. Denote any two distinct roots of f as \(\gamma _1, \gamma _2\) with \(\gamma _1 < \gamma _2\). Consider the solution \(U(t) = (U_j)_{j \in \mathcal {I}}\) to (2.7) with initial condition \(U_j(0) = u_0(x_j)\) for all \(j \in \mathcal {I}\).
(i) Comparison Principle. If the initial condition satisfies
and so does the solution on the boundary indices \(\mathcal {B}\), then the solution \(U_j(t)\) remains bounded, that is,
(ii) Maximum Principle. In particular, we denote \(\gamma \) as the largest absolute value of roots of f. If the initial condition satisfies
then
We present the proof of Lemma 2.1 in Sect. A. It is worth remarking that this result asserts that the solution U(t) stays in the invariant set \([\gamma _1, \gamma _2]\), which differs from estimates of the form \(\left\Vert U(t)\right\Vert _\infty \leqslant \left\Vert U(0)\right\Vert _\infty \). In fact, \(\left\Vert U(t)\right\Vert _\infty \) can increase in time, as depicted in Fig. 1. We also point out that the formation of this invariant region is precisely due to the nonlinear terms in f; hence the nonlinear parts of the differential equation cannot be neglected, even if the solution remains small.
For the case of the specific reaction–diffusion equation (2.6), solving \(f(u) = au + bu^{M} = 0\) gives the explicit expression \(\gamma = (\frac{|a|}{|b|})^{\frac{1}{M-1}}\).
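A quick numerical check of this bound (an illustrative sketch with arbitrarily chosen parameters, integrating the semidiscrete system by explicit Euler on a one-dimensional periodic grid): with \(f(u) = 0.16u - u^3\) one gets \(\gamma = 0.4\), and an initial condition bounded by \(\gamma \) stays bounded by \(\gamma \) at every step:

```python
import numpy as np

# Allen-Cahn-type nonlinearity f(u) = a*u + b*u**M with a = 0.16, b = -1, M = 3
a, b, M, D = 0.16, -1.0, 3, 0.1
gamma = (abs(a) / abs(b)) ** (1.0 / (M - 1))    # invariant-region bound, = 0.4

n = 32
h = 1.0 / n
U = 0.2 * np.sin(2 * np.pi * np.arange(n) / n)  # ||U(0)||_inf = 0.2 <= gamma

dt = 0.2 * h**2 / D                             # well within Euler stability
for _ in range(2000):
    lap = (np.roll(U, 1) - 2 * U + np.roll(U, -1)) / h**2   # periodic Laplacian
    U = U + dt * (D * lap + a * U + b * U**M)
    assert np.max(np.abs(U)) <= gamma + 1e-12   # maximum principle (Lemma 2.1)
```

Each Euler step is a convex combination of neighboring values plus a reaction term that vanishes at \(\pm \gamma \) and points inward, so the discrete trajectory never leaves \([-\gamma , \gamma ]\).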
2.3 Problem statement
We are interested in solving high-dimensional reaction–diffusion equations with quantum computers. Given the initial condition described by a quantum state, we aim to provide a quantum state description of the solution given the evolution time T.
The main computational problem we consider is as follows.
Problem 1
We consider an initial value problem of an \(n_d\)-dimensional polynomial ODE on [0, T] as in (2.14)
Here \(U=[U_1, \ldots , U_{n_d}]^{T}\in \mathbb {R}^{n_d}\), \(U^{\otimes M}=[U_1^M, U_1^{M-1}U_2, \ldots , U_{n_d}^{M-1}U_{n_d-1}, U_{n_d}^M]^{T}\in \mathbb {R}^{n_d^M}\). We assume \(F_1\), \(F_M\) are s-sparse,Footnote 1 \(F_1\) is symmetric and diagonalizable with negative eigenvalues, and \(\Vert F_M\Vert \leqslant |\lambda _1|\) by rescaling.Footnote 2 We also know an a priori bound on the solution, \(\max _t\Vert U(t)\Vert _{\infty }\leqslant \gamma \) as in (2.13). We have oracles \(O_{F_1}\), \(O_{F_M}\) that provide the locations and values of the nonzero entries of \(F_1\), \(F_M\). We also know \(\Vert U_{\textrm{in}}\Vert =\Vert U(0)\Vert \) and have an oracle \(O_x\) that maps \(|00\ldots 0\rangle \in \mathbb {C}^{n_d}\) to a quantum state proportional to \(U_{\textrm{in}}\). Our goal is to produce a quantum state as a superposition of the solution at different timesteps
with a sufficiently large m, where \({\hat{y}}_1(t)\in \mathbb {R}^{n_d}\) is a vector function that approximates U(t), \({\hat{Y}}_{\textrm{evo}} = \sum _{k=0}^{m}{\hat{y}}_1(kh)|k\rangle \) is a superposition of \({\hat{y}}_1(t)\) at different timesteps, and \(\Vert {\hat{Y}}_{\textrm{evo}}\Vert = \sqrt{\sum _{k=0}^{m}\Vert {\hat{y}}_1(kh)\Vert ^2}\) is the normalization factor.
For the reaction–diffusion equation, we have the specific form of (2.14) as
The corresponding \(F_1\) and \(F_M\) in (2.14) satisfy \(F_1 = D \Delta _h + aI \in \mathbb {R}^{n_d\times n_d}\), and \(F_M \in \mathbb {R}^{n_d\times n_d^M}\) maps \(U^{\otimes M}\) to \(bU^{.M}\), where \(U^{.M}=[U_1^M, U_2^M, \ldots , U_{n_d}^M]^{T}\in \mathbb {R}^{n_d}\) denotes the entrywise Mth power; hence \(F_1\) and \(F_M\) are s-sparse with \(s=O(1)\). The representation of the Laplacian matrix \(\Delta _h\) with mixed boundary conditions is given in (2.3), with \(d_1\)-dimensional Dirichlet boundary conditions and \(d_2\)-dimensional periodic boundary conditions (\(d_1+d_2 = d\)). We require the eigenvalues \(\lambda _j\) of \(F_1\) to be negative, i.e., \( 4 D d_1 (n+1)^2\sin ^2\left( \frac{\pi }{2n+2}\right) > a\).
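The matrix \(F_M\) is specified only by its action; one concrete (hypothetical) realization, sketched below, places a single nonzero entry b in each row at the flat index of the "diagonal" Kronecker-power entry \(U_i^M\), confirming the \(\mathcal {O}(1)\) sparsity:

```python
import numpy as np

nd, M, b = 3, 3, -1.0
U = np.array([0.3, -0.2, 0.5])

# Build U^{(x)M} by repeated Kronecker products
UM = U
for _ in range(M - 1):
    UM = np.kron(UM, U)

# The entry U_i^M sits at flat index i * (nd^{M-1} + ... + nd + 1)
FM = np.zeros((nd, nd**M))
stride = (nd**M - 1) // (nd - 1)          # geometric-series stride
for i in range(nd):
    FM[i, i * stride] = b                 # one nonzero per row: O(1)-sparse

assert np.allclose(FM @ UM, b * U**M)     # F_M U^{(x)M} = b U^{.M}
```

Each row and each column of this \(F_M\) contains at most one nonzero entry, consistent with the sparsity assumption in Problem 1.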
The quantum state (2.15), corresponding to \([U(0), U(h), \ldots , U(mh)]\) and also known as the history state [78], describes a quantum-state evolution for the gradient flow of the free energy (2.4).
3 Carleman Linearization
We aim to perform Carleman linearization on the discretized nonlinear PDEs (2.14) and then use quantum linear system solvers to obtain quantum states proportional to the solutions. In this section, we revisit the Carleman linearization procedure and then introduce the improved convergence result.
Defining \({{\widetilde{y}}}_j :=U^{\otimes j}\) for \(j = 1, 2, \ldots \), one has
where
Therefore, the Carleman linearization procedure gives rise to the following infinite-dimensional system \({{\widetilde{y}}}'(t) = {\widetilde{A}} {\widetilde{y}}(t),\) where \({\widetilde{A}}\) is an infinite-dimensional block upper-triangular matrix
It follows from the definition of \(A_k^j\) that the following inequalities are satisfied.
Lemma 3.1
For all \(j\ge 1\),
where \(\lambda _1 :=D d_1 \mu _1 + a\), \(\mu _1=-4(n+1)^2\sin ^2\left( \frac{\pi }{2n+2}\right) \).
We then truncate the above infinite-dimensional system of linear ODEs at order N, thereby obtaining a finite system
with the upper triangular block structure
Here, \({\hat{y}}_j\in \mathbb {R}^{n_d^j}\) approximates \(U^{\otimes j}\), \({\hat{y}}_{\textrm{in}}=[U_{\textrm{in}}; U_{\textrm{in}}^{\otimes 2}; \ldots ; U_{\textrm{in}}^{\otimes N}]\), and \(A_j^j \in \mathbb {R}^{n_d^j\times n_d^j}\), \(A_{j+M-1}^j \in \mathbb {R}^{n_d^j\times n_d^{j+M-1}}\) for \(j\in [{N}]\) are defined as in (3.2). Note that A is an (Ns)-sparse matrix, where s is the maximum number of nonzero entries in any row or column of \(F_1\) and \(F_M\). The dimension of (3.6) is denoted as
Denote the solution to the truncated system as \({\hat{y}}_j\) (\(j = 1, \ldots , N\)) and define the error resulting from the truncation as
In particular, \(\eta _1(t) = {{\widetilde{y}}}_1(t) - \hat{y}_1(t) = U(t) - \hat{y}_1(t)\) is the error due to the Carleman linearization procedure.
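For intuition, the truncation error can be observed directly in the simplest scalar case \(n_d = 1\), \(M = 2\) (an illustrative sketch, not the paper's quantum algorithm: the truncated linear system is integrated classically with RK4 and compared against the closed-form solution of the scalar quadratic ODE \(u' = au + bu^2\)):

```python
import numpy as np

a, b, u0, T = -1.0, 0.3, 0.4, 1.0

# Exact solution of u' = a*u + b*u^2 via the substitution w = 1/u,
# which satisfies the linear ODE w' = -a*w - b
w0 = 1.0 / u0
u_exact = 1.0 / ((w0 + b / a) * np.exp(-a * T) - b / a)

def carleman_solution(N, steps=1000):
    # Truncated Carleman system for y_j ~ u^j:  y_j' = j*a*y_j + j*b*y_{j+1}
    A = np.zeros((N, N))
    for j in range(1, N + 1):
        A[j - 1, j - 1] = j * a
        if j < N:
            A[j - 1, j] = j * b
    y = np.array([u0**j for j in range(1, N + 1)])
    dt = T / steps
    for _ in range(steps):                    # classical RK4 on the linear system
        k1 = A @ y
        k2 = A @ (y + 0.5 * dt * k1)
        k3 = A @ (y + 0.5 * dt * k2)
        k4 = A @ (y + dt * k3)
        y = y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y[0]

errors = [abs(carleman_solution(N) - u_exact) for N in (1, 2, 3, 4)]
assert all(e2 < e1 for e1, e2 in zip(errors, errors[1:]))   # error shrinks with N
```

With these parameters the nonlinearity-to-dissipation ratio \(|b| u_0 / |a| = 0.12\) is well below 1, and the error indeed drops by roughly this factor with each additional truncation level.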
Theorem 3.2
(\(\ell _2\) Convergence of the Carleman Linearization). For the discrete reaction–diffusion equation (2.16) with mixed boundary conditions in (2.3), following the approach originally proposed in [59], we define
Suppose that the largest eigenvalue of \(D\Delta _h+a I\), denoted by \(\lambda _1\), is negative. Then the approximation error of the Carleman linearization satisfies
for \(t\ge 0\). In particular, if N is some integer multiple of \(M-1\), then the error of the solution \(\eta _1(t) = {{\widetilde{y}}}_1(t) - \hat{y}_1(t) = U(t) - {\hat{U}}(t)\) satisfies
for \(t\ge 0\). Furthermore, if \(R\le 1\), then \(R = {\overline{R}}\).
The detailed proof is presented in Sect. D. This theorem for polynomial differential equations is a straightforward extension of the quadratic case in [59, Corollary 1] and implies exponential convergence in the truncation order N in terms of the \(\ell _2\) norm as long as \(R \le 1\), that is
as shown in (3.10). However, if \(R>1\), we only have \(R \leqslant {\overline{R}}\), and \({\overline{R}}\) is the exact convergence radius.
Theorem 3.3
(\(\ell _{\infty }\) Convergence of the Carleman Linearization). For the discrete reaction–diffusion equation (2.16) with mixed boundary conditions in (2.3), we define
Here \(\gamma = (\frac{|a|}{|b|})^{\frac{1}{M-1}}\), and \(\gamma \) upper bounds \(\Vert U(t)\Vert _{\infty }\) for all \(t\geqslant 0\), as stated in Lemma 2.1; the constant \(C(\lambda )\) has the form
where \(\mu _1=-4(n+1)^2\sin ^2\left( \frac{\pi }{2n+2}\right) \), \(\lambda _1 = D d_1 \mu _1 + a < 0\), and \(\lambda \) is an arbitrary value satisfying \(\lambda _1<\lambda < 0\). Then the approximation error of the Carleman linearization satisfies
for \(t\ge 0\), with \(R_{D}\) defined as in (3.14). In particular, if N is some integer multiple of \(M-1\), then the error of the solution \(\eta _1(t) = {{\widetilde{y}}}_1(t) - \hat{y}_1(t) = U(t) - {\hat{U}}(t)\) satisfies
for \(t\ge 0\).
This theorem implies an alternative exponential convergence in the truncation order N in terms of the \(\ell _{\infty }\) norm as long as \(R_{D} <1\), that is,
as shown in (3.14).
Proof
The truncation error \(\eta _j\) satisfies the equation
Applying the variation of constants formula [79] to (3.19), one has
Note that it follows from Lemma 2.1 that
Therefore, we have for \(N-M+2 \leqslant j \leqslant N\),
For simplicity, we denote
such that for \(N-M+2 \leqslant j \leqslant N\),
Next, for \(N-2M+3 \leqslant j \leqslant N-M+1\),
One can continue by mathematical induction for every group of \(M-1\) terms and obtain
We now consider an upper bound on \(C_j(t)\). By computing the integration of \(\Vert \text {e}^{j(t-s) (D \Delta _h + a)}\Vert _{\infty }\) in Lemma B.6, we have
Therefore,
where the right-hand side is exactly \(C(\lambda )\) in (3.15) and is independent of both j and t.
Finally, substituting \(C_j(t)\) with its upper bound \(C(\lambda )\) in (3.25) gives
where we use \(R_D = \frac{|b|}{|\lambda _1|}\gamma ^{M-1}C(\lambda )\) as defined in (3.14). Therefore, a sufficient condition for the convergence of \(\left\Vert \eta _j(t)\right\Vert _\infty \) in N is \(R_D<1\).
In practice, we can set N as some integer multiple of \(M-1\), and thus
This completes the proof of the desired result. \(\square \)
Remark According to Theorems 3.2 and 3.3, the Carleman linearized solution \({\hat{y}}_1(t)\) approximates the exact solution U(t) of the original reaction–diffusion equation with an exponential convergence rate governed by the convergence radius R or \(R_D\).
The quantities R and \(R_{D}\) characterize the ratio of reaction to diffusion strength in terms of the \(\ell _2\) and \(\ell _{\infty }\) norms, respectively. Here we briefly discuss the relationship between them. In particular, we are interested in the case where the convergence radius \(R_D\) has an advantage over R. Note that \(R_D\le R \) is equivalent to \( C(\lambda ) \le \left( \frac{\Vert U_{\textrm{in}}\Vert }{\gamma }\right) ^{M-1}\) and that \(\lambda \) can be an arbitrary number between \(\lambda _1\) and 0. Hence the regime of our interest turns out to be
which holds true for a large regime of parameters in high-dimensional or finely discretized scenarios, because the right-hand side is likely to grow rapidly in n and d while the left-hand side depends on them only weakly.
Specifically, according to Sect. E, solving the optimization problem on the left-hand side of (3.30) helps to obtain a sharper estimate of \(R_D\). When \(a=0\), the minimum of \(C(\lambda )\) has an explicit expression. As for \(a\ne 0\), we advise tuning \(\lambda \) for a sharper estimate of \(R_D\) in real applications, since the optimization problem is hard to solve explicitly. Nevertheless, in both cases, we can show that there exists an upper bound of \(\min _{\lambda _1<\lambda <0} C(\lambda )\) which is also independent of n. For sufficiently large n, the quantity \(\min _{\lambda _1<\lambda <0} C(\lambda )\) is \(\mathcal {O}(d)\), where d is the spatial dimension.
For \(n^d\)-dimensional vectors, \(\Vert U_{\textrm{in}}\Vert \) can be significantly larger than \(\gamma \) due to the inequality \(\Vert U_{\textrm{in}}\Vert _{\infty } \leqslant \Vert U_{\textrm{in}}\Vert \leqslant \sqrt{n^d}\Vert U_{\textrm{in}}\Vert _{\infty }\). First, when d is fixed, \(\Vert U_{\textrm{in}}\Vert \) grows polynomially with n in the worst case, so \(R_D<R\) for large n. Second, when n is large enough and fixed, \(\Vert U_{\textrm{in}}\Vert \) increases exponentially with d in the worst case, while \(\min _{\lambda _1<\lambda <0}C(\lambda )\) grows at most linearly with d. Therefore \(R_D\) is smaller than R for large d as well.
In the case of grid refinement, the above results also show that \(R_D\) stays bounded in the continuum limit \(n^d\rightarrow \infty \), while R diverges.
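The n-dependence of the two norms can be seen directly in a small numerical sketch (the sinusoidal profile below is an illustrative choice): the \(\ell _\infty \) norm of a discretized profile is essentially independent of the grid, while its \(\ell _2\) norm grows like \(\sqrt{n}\), which is exactly why R grows under grid refinement while \(R_D\) does not.

```python
import numpy as np

ratios = []
for n in (16, 64, 256, 1024):
    x = np.arange(1, n + 1) / (n + 1)          # interior grid points of (0, 1)
    U_in = 0.2 * np.sin(2 * np.pi * x)         # illustrative initial profile
    l2, linf = np.linalg.norm(U_in), np.linalg.norm(U_in, np.inf)
    ratios.append(l2 / linf)
    print(f"n={n:5d}  ||U||_2={l2:8.4f}  ||U||_inf={linf:.4f}  ratio={l2/linf:8.3f}")
```

The printed ratio grows like \(\sqrt{n/2}\), while the \(\ell _\infty \) column stays near 0.2.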
3.1 Numerical results
In this subsection, we present some numerical results to examine the effectiveness of Carleman linearization.
In order to demonstrate the convergence of Carleman linearization (Theorem 3.3), we apply our algorithm to (2.1) with different types of nonlinearity f(u). In the first example, the nonlinear term is \(f(u)=0.2u-u^2\), of Fisher-KPP type. We assume \(D=0.2\), choose \(u(x,0)=0.1\left( 1-\cos (2\pi x)\right) \) as the initial condition, and impose homogeneous Dirichlet boundary conditions. In our second example, \(f(u)=0.16u-u^3\), \(D=0.1\), \(u(x,0)=0.2\sin (2\pi x)\), and homogeneous Dirichlet boundary conditions are again used. The numerical results for these two examples are depicted in Fig. 2 and Fig. 3, respectively. We see from the error convergence plots that the absolute error, maximized over \(t \in [0,1]\), decreases exponentially as the truncation level N is increased. As a function of the time t, the absolute error first increases and then decreases exponentially due to the decay of the exact solution. In particular, in Fig. 3, the absolute error curve for \(N=1\) agrees with that for \(N=2\). That is because, according to (3.7), only \({\hat{y}}_1\) takes part in the time evolution of \({\hat{y}}_1\), whether \(N=1\) or 2. A similar argument explains the agreement of the curves for \(N=3\) and \(N=4\).
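A classical sketch of this convergence test for the Fisher-KPP example above: we discretize on a deliberately coarse grid (a small n keeps the Kronecker blocks tiny), assemble the truncated Carleman system, and compare its first block at time T against a tightly resolved nonlinear reference solve. The grid size and tolerances are illustrative choices, not those used to produce the figures.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

# Fisher-KPP example: u_t = D u_xx + a u - u^2, homogeneous Dirichlet BCs
n, D, a, T = 7, 0.2, 0.2, 1.0
xg = np.arange(1, n + 1) / (n + 1)
u0 = 0.1 * (1 - np.cos(2 * np.pi * xg))

lap = ((n + 1) ** 2) * (np.diag(-2.0 * np.ones(n))
                        + np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1))
F1 = D * lap + a * np.eye(n)          # linear part of the right-hand side
F2 = np.zeros((n, n * n))             # quadratic part: (F2 (u (x) u))_i = -u_i^2
for i in range(n):
    F2[i, i * n + i] = -1.0

def transfer(M, j):
    """Carleman transfer block: sum_{k=1}^{j} I^{(x)(k-1)} (x) M (x) I^{(x)(j-k)}."""
    return sum(np.kron(np.eye(n ** k), np.kron(M, np.eye(n ** (j - 1 - k))))
               for k in range(j))

def carleman_solution(N):
    """First block of exp(A T) y_in for the level-N truncated linear system."""
    sizes = [n ** j for j in range(1, N + 1)]
    starts = np.concatenate(([0], np.cumsum(sizes)))
    A = np.zeros((starts[-1], starts[-1]))
    for j in range(1, N + 1):
        A[starts[j - 1]:starts[j], starts[j - 1]:starts[j]] = transfer(F1, j)
        if j < N:
            A[starts[j - 1]:starts[j], starts[j]:starts[j + 1]] = transfer(F2, j)
    blocks, power = [u0], u0
    for _ in range(N - 1):
        power = np.kron(power, u0)
        blocks.append(power)
    yT = expm(A * T) @ np.concatenate(blocks)
    return yT[:n]

ref = solve_ivp(lambda t, u: F1 @ u - u ** 2, [0, T], u0,
                rtol=1e-10, atol=1e-12).y[:, -1]
errors = [np.max(np.abs(carleman_solution(N) - ref)) for N in (1, 2, 3)]
print(errors)
```

The listed errors shrink as N grows, mirroring the exponential convergence in the truncation level shown in the figures.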
Based on our numerical tests, we also find that Carleman linearization works in more general settings. In Fig. 4a, we relax the requirement that the eigenvalues of \(F_1\) are all negative. We test (2.1) with homogeneous boundary conditions and choose \(D=0.01\), \(f(u)=0.25u-u^3\), and \(u(x,0)=0.08\sin (2\pi x)\) as the initial condition. Carleman linearization still performs well numerically, as the absolute error plot shows. Figure 4b illustrates the advantage of \(R_D\) over R. In that example, we consider (2.1) with homogeneous boundary conditions and assume \(D=0.012\), \(f(u)=0.0196u-u^3\) and \( u(x,0)=0.14\sin (2\pi x)\). We discretize the spatial domain into 15 sub-intervals, i.e., the value of n is 16. By computation, \(R=1.4924\) and \(R_D= 0.9299\), where we choose \(\lambda =\lambda _1/2.3\). This illustrates that the error remains well-controlled under \(R_D<1\), which is a milder condition than \(R<1\).
4 Quantum Algorithm
We now describe an efficient quantum algorithm for computing the numerical solution of the linearized ODEs (3.6). Our main algorithmic result is stated as follows.
Theorem 4.1
(Quantum Carleman linearization). Consider an instance of the quantum ODE problem as defined in Problem 1, with its N-th Carleman linearization as defined in (3.6). We denote a parameter
which parameterizes the average \(\ell _2\) norm of the evolution of the N-th Carleman linearized solution \({\hat{Y}}_{\textrm{evo}}\). Assume \(R_D<1\). Then there exists a quantum algorithm that produces a state approximating \({\hat{Y}}_{\textrm{evo}}\) with \(\ell _2\) normalized error \(\epsilon \leqslant 1/2\), succeeding with probability \(\Omega (1)\), with a flag indicating success. The query complexity (to the oracles \(O_{F_1},O_{F_M}\), and \(O_x\)) is
The gate complexity is larger than its query complexity by logarithmic factors.
We next describe the quantum algorithm in detail, including ingredients such as state preparation, quantum linear system algorithm, and measurement. We conclude the proof of Theorem 4.1 at the end of this section.
Remark We note the prefactor \(\Vert U_{\textrm{in}}\Vert ^{2N}\) in the complexity of the N-th Carleman linearization. This cost is similar to the prefactor \(5^{2k}\) in the complexity of the k-th order product formula [80]. In practice, we usually choose a suitably small value of N, such as 1, 2, or 3, to reduce the cost.
4.1 State preparation
We first recall a lemma used in [59] for preparing a quantum state corresponding to the initial vector \({\hat{y}}_{\textrm{in}}=[U_{\textrm{in}}; U_{\textrm{in}}^{\otimes 2}; \ldots ; U_{\textrm{in}}^{\otimes N}]\), given the value \(\Vert U_{\textrm{in}}\Vert \) and the ability to prepare a quantum state proportional to \(U_{\textrm{in}}\).
Lemma 4.2
(Lemma 5 of [59]). Assume we are given the value \(\Vert U_{\textrm{in}}\Vert \), and let \(O_x\) be an oracle that maps \(|00\ldots 0\rangle \in \mathbb {C}^n\) to a normalized quantum state \(|U_{\textrm{in}}\rangle \) proportional to \(U_{\textrm{in}}\). Then the quantum state \(|{\hat{y}}_{\textrm{in}}\rangle \) proportional to \({\hat{y}}_{\textrm{in}}\) can be prepared using \(\mathcal {O}(N)\) queries to \(O_x\), and the gate complexity is larger by an \(\mathcal {O}(\text {poly}(\log N, \log n))\) factor.
We remark that, in fact, we embed \({\hat{y}}_{\textrm{in}}\) into a slightly larger space with a more convenient tensor product structure. For further details, we refer to Section 4.3 of [59].
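Classically, the vector prepared in Lemma 4.2 is just the stack of Kronecker powers of \(U_{\textrm{in}}\). The following toy sketch (with a hypothetical two-component input) builds it and checks its norm via the identity \(\Vert U^{\otimes j}\Vert = \Vert U\Vert ^j\).

```python
import numpy as np

def carleman_initial_vector(U_in, N):
    """Stack [U; U^(x)2; ...; U^(x)N], the classical analogue of |y_in>."""
    blocks, power = [U_in], U_in
    for _ in range(N - 1):
        power = np.kron(power, U_in)
        blocks.append(power)
    return np.concatenate(blocks)

U = np.array([3.0, 4.0])              # toy input with ||U|| = 5
y_in = carleman_initial_vector(U, 3)
# ||y_in||^2 = ||U||^2 + ||U||^4 + ||U||^6 = 25 + 625 + 15625
assert np.isclose(np.linalg.norm(y_in) ** 2, 16275.0)
```

This is also why knowing the scalar \(\Vert U_{\textrm{in}}\Vert \) suffices to set the relative weights of the N blocks in the quantum state.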
4.2 Quantum linear system algorithm
After the state preparation of the initial condition, we perform the forward Euler method to discretize the time interval [0, T] into \(m = T/h\) sub-intervals, and construct a system of linear equations as
where \(y^k \in \mathbb {R}^{\mathcal {N}_{d,N}}\) approximates \({\hat{y}}(kh)\) for each \(k \in [{m+1}]_0 :=\{0,1,\ldots ,m\}\), with \(y^0 = y_{\textrm{in}} :={\hat{y}}(0) = {\hat{y}}_{\textrm{in}}\). This gives an \((m+1)\mathcal {N}_{d,N}\times (m+1)\mathcal {N}_{d,N}\) linear system
where
(4.4) encodes (4.3) and uses it to produce a numerical solution at time T. Observe that the system (4.4) has the lower triangular structure
For each \(\mathcal {N}_{d,N}\)-dimensional vector \(y^k\) with \(k\in [{m+1}]_0\), its first n components (i.e., \(y_1^k\)) approximate the exact solution U(kh), up to normalization. We apply the high-precision quantum linear system algorithm (QLSA) [37] to (4.4) and postselect to produce \(\sum _k|y_1^k\rangle |k\rangle \), representing the gradient flow evolution. We note that a more advanced QLSA with block-encoding input models was recently proposed in [81]. However, for technical simplicity, in this work we still employ the algorithm described in [37]: the improvements of [81] over [37] are relatively minor, affecting only logarithmic factors, and the input model used in this work, which involves sparse input oracles, is more consistent with that of [37].
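Classically, the lower-bidiagonal structure can be assembled and checked directly. The following sketch (with a hypothetical stable matrix A and random initial data) verifies that solving \(LY=b\) reproduces exactly the forward Euler iterates that the QLSA operates on.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, h = 4, 50, 0.01
A = -np.eye(d) + 0.1 * rng.standard_normal((d, d))   # hypothetical stable generator
y0 = rng.standard_normal(d)

# Assemble L: identity blocks on the diagonal, -(I + A h) on the subdiagonal,
# mirroring the structure of (4.4)-(4.5).
E = np.eye(d) + A * h
L = np.eye((m + 1) * d)
for k in range(m):
    L[(k + 1) * d:(k + 2) * d, k * d:(k + 1) * d] = -E

b = np.zeros((m + 1) * d)
b[:d] = y0                                           # right-hand side carries y^0
Y = np.linalg.solve(L, b).reshape(m + 1, d)          # history [y^0; ...; y^m]

# Direct Euler iteration for comparison
y, direct = y0.copy(), [y0.copy()]
for _ in range(m):
    y = E @ y
    direct.append(y.copy())
assert np.allclose(Y, np.array(direct))
```

Solving the single linear system thus encodes the whole time history at once, which is what makes the history state \(\sum _k|y_1^k\rangle |k\rangle \) accessible.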
In Theorem 4.1, the solution error has two contributions: the error in the time discretization of (3.6) by the forward Euler method, and the error from the QLSA. Since the QLSA produces a solution with error at most \(\epsilon \) with complexity \({{\,\textrm{poly}\,}}(\log (1/\epsilon ))\) [37], we focus on bounding the first contribution.
We provide an upper bound for the error incurred by approximating (3.6) with the forward Euler method. The following proof closely follows [59, Lemma 3].
Lemma 4.3
Consider an instance of the quantum ODE problem as defined in Problem 1, with \(R_D<1\) as defined in (3.14). Choose a time step
Then the global error from the forward Euler method (4.3) on the interval [0, T] for (3.6) satisfies
Proof
First of all, we establish the following bound
We decompose \(I+Ah\) as
where
All eigenvalues of \(\frac{1}{N}I+|j\rangle \langle j|\otimes A_j^jh\) range from \(\frac{1}{N}+j\lambda _nh\) to \(\frac{1}{N}+j\lambda _1h\). Here we require that these eigenvalues lie in [0, 1], which is guaranteed by \(N^2h\Vert F_1\Vert = N^2h[4Dd(n+1)^2 + a] \leqslant 1\) in (4.7). The norm of \(A_{j+M-1}^j\) is bounded by \(j\Vert F_M\Vert \). So we have the bound
Then \(\Vert F_M\Vert \leqslant |\lambda _1|\) in Problem 1 gives
The same bound holds for the case \(j\in [{N-1}] \backslash [{N-M+1}]\). Hence,
We then define the global error by
where \({\hat{y}}(kh)\) is the exact solution of (3.6), and \(y^k\) is the numerical solution of (4.3). Note that \(g^m = \Vert {\hat{y}}(T)-y^m\Vert \).
The stability condition (4.15) implies that the local truncation error from the forward Euler method is non-increasing, and that the global error grows at most linearly in time. Following the standard procedure for the global error estimate (we refer to the proof of [59, Lemma 3]), the global error is bounded by
where we have the following estimate
Consequently, we conclude that
\(\square \)
Given the above linear system, we can upper bound the condition number, which affects the complexity of the quantum linear system algorithm. Under the same construction of the matrix L, we can follow the estimate of [59, Lemma 4] to claim the following result.
Lemma 4.4
Consider an instance of the quantum ODE problem as defined in Problem 1. Apply the forward Euler method (4.3) with time step size (4.7) to the Carleman linearization (3.6). Then the condition number of the matrix L defined in (4.5) satisfies
Proof
We begin by upper bounding \(\Vert L\Vert \). We write
where
Clearly \(\Vert L_1\Vert = 1\). Furthermore, \(\Vert L_2\Vert \leqslant \Vert I+Ah\Vert \leqslant 1\) by (4.15). Therefore,
Next we upper bound \(\Vert L^{-1}\Vert \). We notice that \(L^{-1}\) can be directly written as
So that
Since \(\Vert I+Ah\Vert \leqslant 1\) by (4.15), we have
Finally, combining (4.25) with (4.28), we conclude
as claimed. \(\square \)
4.3 Measurement probability
After applying the QLSA to (4.4), we perform a measurement to extract a final state of the desired form. We now consider the probability of this measurement succeeding. Unlike [59, Lemma 6], we are interested in producing a history state \(\sum _{k}|y_1^k\rangle |k\rangle \) instead of a final state \(|y_1^m\rangle \). Thus the measurement probability involves neither the \(\ell _2\) norm of the final state nor the \(\ell _2\) scaling of the initial and final states (i.e., the parameters g and q in [59, Lemma 6]).
Lemma 4.5
Consider an instance of the quantum ODE problem defined in Problem 1, with the QLSA applied to the linear system (4.4) using the forward Euler method (4.3) with time step (4.7). Suppose the global error from the forward Euler method as defined in Lemma 4.3 is bounded by
Then the probability of measuring a quantum state \(|y^k_1\rangle \) for \(k\in [{m+1}]_0\) satisfies
Proof
The quantum state produced by the QLSA applied to (4.4) has the form
where the normalization factor satisfies \(\Vert |Y\rangle \Vert ^2 = \sum _{k=0}^{m}\Vert y^k\Vert ^2 = \sum _{k=0}^{m}\sum _{j=1}^N\Vert y_j^k\Vert ^2\).
We aim to obtain the target quantum state as the form
which corresponds to the gradient flow evolution state (2.15). We measure the register \(|j\rangle \), \(j\in [{N}]\), and extract \(|Y_{\textrm{target}}\rangle \) from \(|Y\rangle \) when the outcome is \(j=1\). The success probability is lower bounded as follows.
According to the Cauchy-Schwarz inequality,
so that
Summing k from 0 to m, and using the definition of G in (4.1), we have
Second, we use (4.30) and the parallelogram inequality again to upper bound \(\Vert y^k\Vert ^2\) by
Therefore
Finally, using (4.36) and (4.38), we have
as claimed. \(\square \)
Using amplitude amplification, \(\mathcal {O}(\sqrt{1/P_{\textrm{measure}}})\) iterations suffice to succeed with constant probability.
4.4 Proof of Theorem 4.1
Proof
We first present the quantum Carleman linearization (QCL) algorithm and then analyze its complexity.
The QCL algorithm. We introduce the choice of parameters as follows. Given an error bound \(\epsilon \leqslant 1\) and G, we define
which satisfies \(\delta \leqslant G/2\) for any \(t\in [0,T]\).
Now we discuss the choice of h. On the one hand, h must follow (4.7) to satisfy the conditions of Lemma 4.3 and Lemma 4.4. On the other hand, we choose
Then according to the requirement (4.7) in Lemma 4.3, and for \(k\in [{m+1}]_0\),
It also leads to \(\Vert {\hat{y}}(kh)-y^k\Vert \leqslant G/2\) used as a condition in Lemma 4.5.
We now consider the error between the exact and numerical gradient flow evolutions
and (as denoted in (4.33))
where \(\Vert {\hat{Y}}_{\textrm{evo}}\Vert \) and \(\Vert Y_{\textrm{evo}}\Vert \) are normalization factors. Recall the definition of G in (4.1)
The \(\ell _2\) normalized error can be controlled by
Then using (4.42), since
we have
which gives
i.e., \(\epsilon \) upper bounds the \(\ell _2\) normalized error between \(|{\hat{Y}}_{\textrm{evo}}\rangle \) and \(|Y_{\textrm{evo}}\rangle \).
We follow the procedure in Lemma 4.2 to prepare the initial state \(|{\hat{y}}_{\textrm{in}}\rangle \). We apply the QLSA [37] to the linear system (4.4) with \(m=\lceil T/h \rceil \), giving a solution \(|Y\rangle \). By Lemma 4.5, the probability of obtaining the state \(\sum _{k=0}^{m}|y_1^k\rangle |k\rangle \) is
By amplitude amplification, we can achieve success probability \(\Omega (1)\) with \(\mathcal {O}(\max _{t\in [0,T]}\Vert {\hat{y}}(t)\Vert /G)\) repetitions of the above procedure.
Analysis of the complexity By Lemma 4.2, the initial state \(|{\hat{y}}_{\textrm{in}}\rangle \) can be prepared with \(\mathcal {O}(N)\) queries to \(O_x\), with gate complexity larger by an \(\mathcal {O}(\text {poly}(\log N, \log n))\) factor. The matrix L is an \((m+1)\mathcal {N}_{d,N}\times (m+1)\mathcal {N}_{d,N}\) matrix with \(\mathcal {O}(Ns)\) nonzero entries in any row or column. By Lemma 4.4 and our choice of parameters, the condition number of L is at most
Here we use \((\Vert F_1\Vert +\Vert F_M\Vert )^2 = [4Dd(n+1)^2 + a + b]^2\), and \(|a|, |b| = o(d)\). Consequently, by Theorem 5 of [37], the QLSA produces the state \(|Y\rangle \) with
queries to the oracles \(O_{F_1}\) and \(O_{F_M}\). Using \(\mathcal {O}(\max _{t\in [0,T]}\Vert {\hat{y}}(t)\Vert /G)\) steps of amplitude amplification to achieve success probability \(\Omega (1)\), the overall query complexity of our algorithm is
and its gate complexity is larger than its query complexity only by logarithmic factors, based on the gate-efficient algorithm in Theorem 5 of [37].
We now estimate the quantity \(\max _{t\in [0,T]}\Vert {\hat{y}}(t)\Vert \). By the definition of \(\eta _j(t)\), Theorem 3.3, and \(R_D<1\), we have
so that
Based on the \(\ell _2\) estimate of the solution in Lemma C.4, when \(R_D < 1\), we have
Therefore, the overall query complexity of our algorithm is
and the gate complexity is larger than its query complexity by logarithmic factors as claimed. \(\square \)
5 Efficiency and Lower Bound Results
The reference [59] established a limitation on the ability of quantum computers to solve the quadratic ODE problem when the nonlinearity is sufficiently strong: general nonlinear differential equations are intractable on quantum computers when \(R \geqslant \sqrt{2}\). However, we can rule out such worst cases by assuming that the initial condition of the reaction–diffusion equations fulfills the maximum principle, and thus show that the problem remains tractable on quantum computers.
In the following, we state and prove our hardness and efficiency results. Part (i) focuses on the hardness result when \(R\ge \sqrt{2}\), which was studied in [59, Theorem 2] by leveraging the hardness of quantum state discrimination. However, there is a technical flaw in the original proof in [59]. The hardness result for quantum state discrimination used there only assumes multiple copies of the input states at the beginning and does not allow access to the states during the algorithm. But most quantum ODE algorithms, including the Carleman linearization method, make the stronger assumption of access to the state preparation oracle for the input state and its inverse, and apply them frequently during the algorithm. Therefore the existing lower bound in [59] has not fully ruled out the possibility of efficient algorithms under strong oracle assumptions. We fix this gap in part (i) by applying a recent lower bound for amplifiers [82, Theorem 13], in which state preparation oracles are assumed. Part (ii) shows that the worst-case scenario can be precluded by assuming the maximum principle, implying that our maximum principle analysis captures the underlying reason for the efficiency of the Carleman linearization method.
Theorem 5.1
We consider the same assumptions in Problem 1.
-
(i)
Assume \(R \geqslant \sqrt{2}\), and the initial condition satisfies \(\Vert U_{\textrm{in}}\Vert _{\infty }>\gamma \). Then there is an instance of the quantum quadratic ODE problem defined in Problem 1 such that any quantum algorithm for producing a quantum state approximating \(u(T)/\Vert u(T)\Vert \) with bounded error must have worst-case query complexity exponential in T to the input state preparation oracle.
-
(ii)
If the initial condition satisfies the maximum principle \(\Vert U_{\textrm{in}}\Vert _{\infty }\leqslant \gamma \) as in (2.12), then such a worst-case example can be precluded even when \(R \geqslant \sqrt{2}\).
Proof
We consider the lower bound result when \(R \geqslant \sqrt{2}\) and \(\Vert U_{\textrm{in}}\Vert _{\infty }>\gamma \). As in Theorem 2 of [59], we consider a 2-dimensional system of the form
with two single-qubit states as initial conditions
and
where \(\theta \in (0,\pi /4)\), \(2\sin ^2\frac{\theta }{2} = \epsilon \), with an arbitrary small \(\epsilon >0\). Then the overlap between the two initial states is
We let \(U(t)=[v(t);w(t)]\) denote the solution evolved from \(U(0)=[v_0;w_0]\). According to Lemma 8 of [59], \(|\phi (t)\rangle =|\phi (0)\rangle \) is the fixed state; but if \(R\geqslant \sqrt{2}\), w(t) increases with t and goes to infinity after
The overlap of \(|\phi (T)\rangle \) and \(|\psi (T)\rangle \) is no larger than a constant (e.g., \(\frac{3}{\sqrt{10}}\) used in [59]) after a short evolution time
It was shown in [82, Theorem 13] that, if a quantum algorithm with an oracle input model can amplify the infidelity of two quantum states from \(\epsilon \) to a constant level, then it must use \(\Omega (1/\sqrt{\epsilon })\) queries in the worst case. Applying this result and noticing that \(1/\sqrt{\epsilon } = \text {e}^{\Omega (T)}\), we directly obtain that when \(R \geqslant \sqrt{2}\), there is an instance of the quantum quadratic ODE problem for which any quantum algorithm must have worst-case time complexity exponential in T.
In our paper, the ODE system (5.1) is a reduced example of the reaction–diffusion equations (2.6) with \(D=0\), \(a=-1\), \(b=R\), \(M=2\), and \(R_D = 1\). Moreover, \(\Vert U_{\textrm{in}}\Vert _{\infty } = \sin \Bigl (\theta +\frac{\pi }{4}\Bigr )\) satisfies
where \(\sin \Bigl (\theta +\frac{\pi }{4}\Bigr )\) is close to \(1/\sqrt{2}\) when \(\epsilon \) is close to 0. Notice that this example disobeys (2.12), the condition of the Maximum Principle Lemma 2.1, because
Secondly, we consider an upper bound on \(\Vert U_{\textrm{in}}\Vert \) given the maximum principle \(\Vert U_{\textrm{in}}\Vert _{\infty }\leqslant \gamma \). Then we have
Substituting this estimate into the complexity in Theorem 4.1, we can upper bound the query complexity by
Notice that the upper bound of the complexity still depends exponentially on N. However, according to Theorem 3.3, the Carleman error converges exponentially in N and can be bounded independently of T. So N can be chosen independently of T as well, therefore our algorithm does not have the exponential overhead in T stated in part (i). \(\square \)
The upper bounds on the query and gate complexity demonstrate that the quantum algorithm we develop has a roughly quadratic dependence on T when \(R_D<1\) and \(\Vert U_{\textrm{in}}\Vert _{\infty }\leqslant \gamma \), regardless of the value of R. Such a loose upper bound includes a polynomial dependence on n, and so does not by itself exhibit a potential exponential speedup in the dimension.
But if we are given an additional assumption
then we can upper bound the query complexity by
and the gate complexity has an upper bound that is larger by logarithmic factors as claimed. In this case, our algorithm still maintains the potential exponential speedup in the dimension over classical algorithms.
6 Applications
In this section, we show how the quantum state obtained by solving Problem 1 can be used to compute quantities of practical interest. For generality, we consider the applications of a quantum state in the form specified in Problem 1, not limited to the output of particular algorithms. More specifically, let \(l=(l_1,\ldots ,l_d)\) with \(l_j\in [{n}]\), and let f(t, x) be a function defined on \([0,T]\times [0,1]^d\). We assume that there exists a quantum algorithm \(\mathcal {A}(\epsilon )\) which can prepare the quantum state
proportional to the vector \((f(kT/m,l_1/n,\ldots ,l_d/n))\) within some prescribed error tolerance \(\epsilon >0\) in \(\ell ^2\) norm. Here \({\hat{f}}_{k,l}\) represents an approximation of the function f evaluated at \((kT/m, l_1/n,\ldots , l_d/n)\). In the context of this paper, the algorithm \(\mathcal {A}\) is the quantum Carleman linearization method, but the discussion in this section works for any quantum algorithm which can encode a function evaluated at discrete grid points.
6.1 Mean square amplitude
One quantity of potential practical interest is the fraction of the squared amplitude contained in a sub-domain \(\mathscr {D}_t \times \mathscr {D}_x\) defined by \(\mathscr {D}_t \subset [0,T]\) and \(\mathscr {D}_x \subset [0,1]^d\). This can be described as the ratio
In the context of quantum mechanics such quantities are motivated by Born’s rule, whereas in the context of classical wave mechanics such quantities are motivated by notions of energy.
In the spatial discretization context, we can approximate the integrals via numerical quadrature with equidistant nodes, and thus we are interested in computing the ratio
Note that the difference between (6.2) and (6.3) scales as \(\mathcal {O}(T(T/m+d/n))\) [83], so it can be reduced to \(\mathcal {O}(\epsilon )\) for arbitrarily small \(\epsilon \) by refining the time and spatial discretizations as \(m = \mathcal {O}(T^2/\epsilon )\), \(n = \mathcal {O}(dT/\epsilon )\). For simplicity, here we fix the discretization grids and focus on computing the discretized ratio (6.3). This quantity can be easily estimated given the quantum state in the form of (6.1). Let P be the projector onto the space spanned by the computational basis states corresponding to the sub-domain \(\mathscr {D}_t \times \mathscr {D}_x\), i.e., \(P = \sum _{kT/m\in \mathscr {D}_t}\sum _{(l_1/n,\ldots ,l_d/n)\in \mathscr {D}_x} ({|{k}\rangle }{\langle {k}|})\otimes ({|{l_1}\rangle }{\langle {l_1}|})\otimes \cdots \otimes ({|{l_d}\rangle }{\langle {l_d}|})\). Then
and thus the ratio can be estimated by the amplitude estimation technique [72].
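A classical analogue of the discretized ratio (6.3), with a hypothetical separable \(f(t,x)\) in one spatial dimension: the quantum algorithm estimates the same quantity as \(\langle Y|P|Y\rangle \) via amplitude estimation, whereas here we simply evaluate it on the grid.

```python
import numpy as np

m, n, T = 64, 64, 1.0
t = np.arange(m + 1) * T / m
x = np.arange(n) / n
f = np.exp(-t[:, None]) * np.sin(2 * np.pi * x[None, :])   # sample f(t, x)

mask_t = t <= 0.5                        # D_t = [0, 0.5]
mask_x = (x >= 0.25) & (x <= 0.75)       # D_x = [0.25, 0.75]
P = np.outer(mask_t, mask_x)             # grid points inside D_t x D_x

ratio = np.sum(f[P] ** 2) / np.sum(f ** 2)
print(ratio)
```

The ratio is the fraction of the squared amplitude carried by the sub-domain, and it is exactly the success probability of the projective measurement P on the normalized history state.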
The complexity of such an algorithm is given in the following theorem, and the proof can be found in the appendix Sect. G.
Theorem 6.1
Assume that we are given an algorithm \(\mathcal {A}(\epsilon ')\) preparing (6.1) and a unitary transform \((I-2P)\) for the projector P associated with the domain \(\mathscr {D}\). Then, for any \(0< \epsilon < 1\), \(0< \delta < 1\), there exists a quantum algorithm which can output an approximation of (6.3) within tolerated error \(\epsilon \) and probability at least \((1-\delta )\), using \(\mathcal {A}(\epsilon /4)\) and \((I-2P)\) for \(\mathcal {O}((1/\epsilon )\log (1/\delta ))\) times.
Theorem 6.1 establishes an almost linear sampling scaling in the precision for the quantum approach to estimating the ratio. Compared with the most widely used classical approach for estimating high-dimensional integrals, standard Monte Carlo methods, which typically scale quadratically in the precision, we obtain a quadratic speedup for the sampling step.
6.2 Derivatives and kinetic energy
As a particular example of practical interest, we would like to study the dynamical kinetic energy ratio of the system on the domain \(\mathscr {D}_t\times \mathscr {D}_x \subset [0,T]\times [0,1]^d\), which is defined to be
To apply our result in Theorem 6.1, we first study how to prepare a quantum state which is proportional to the partial derivative of the function f. That is, our goal is to prepare a quantum state which is a good approximation of
Compared to (6.1), here we introduce one more register to simultaneously encode all the partial derivatives in a single quantum state. In this subsection, we further assume that the function f(t, x) satisfies periodic boundary conditions for spatial variable x. Notice that the periodic boundary condition here is only for technical simplicity and not essential because a general function can be smoothly extended to a larger space domain with periodic boundary conditions. Our algorithm will also work by requiring access to another projector connecting the original and the extended space domains.
We first study how to prepare a quantum state encoding the derivative information within amplitudes. The idea is to use the discrete Fourier transform to move to the frequency domain, multiply by the frequency there, and then transform back to the space domain by the inverse discrete Fourier transform. More specifically, let \(\mathcal {F}\) denote the one-dimensional quantum Fourier transform with n nodes. Furthermore, for any positive integer \(\theta \le n/2\), let \(l = (l_1,\ldots ,l_d)\) and \(D_{j,\theta }\) be a diagonal matrix
where
Notice that we define \(D_{j,\theta }(l)\) such that \(|D_{j,\theta }| \le 1\). Then, for any smooth function g(x) defined on \([0,1]^d\), we have
Such an approach is widely used and regarded as the standard way to compute derivatives in classical scientific computing; we briefly review the reasoning in the appendix, Sect. F.
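The classical version of this Fourier differentiation can be sketched in a few lines (the test function is an illustrative smooth periodic choice): transform to frequency space, multiply by the frequency, transform back. The quantum circuit realizes exactly this pipeline with the QFT and the diagonal \(D_{j,\theta }\); here NumPy's FFT plays both roles.

```python
import numpy as np

n = 64
x = np.arange(n) / n
g = np.sin(2 * np.pi * x) + 0.3 * np.cos(4 * np.pi * x)    # smooth periodic sample
exact = 2 * np.pi * np.cos(2 * np.pi * x) - 1.2 * np.pi * np.sin(4 * np.pi * x)

k = 2 * np.pi * np.fft.fftfreq(n, d=1.0 / n)               # angular frequencies
dg = np.real(np.fft.ifft(1j * k * np.fft.fft(g)))          # spectral derivative

err = np.max(np.abs(dg - exact))
print(err)
```

For a band-limited function the spectral derivative is exact up to round-off, which is why the quantum analogue only needs the mild smoothness condition \(\sup _{j,p} (\Vert \partial _{x_j}^p f\Vert _{\infty })^{1/p} < \pi n\).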
Now we discuss how to implement this approach on a quantum device. In the general high-dimensional case, we introduce another ancilla register with \(\mathcal {O}(\log d)\) qubits, which we will refer to as the dimension register later, and start with the state
Applying the Hadamard gates to the dimension register, we obtain
The discrete Fourier transform can be efficiently implemented via quantum Fourier transform. Specifically, we apply the operation \((\sum _{j=0}^{d-1}{|{j}\rangle }{\langle {j}|} \otimes I_{n_s}\otimes \mathcal {F}_j^{-1})\) and denote the resulting state as
For the multiplication of the matrix \(D_{j,\theta }\) (we will show later that \(\theta \) can be chosen as an \(\mathcal {O}(1)\) parameter for smooth functions), we assume that we are given an oracle of the mapping
Then the multiplication of the matrix \(D_{j,\theta }\) can be implemented as follows. We first add two ancilla registers, one as the rotation register on which we will perform conditional rotation later and the other as the D-register for encoding \(D_{j,\theta }\). Applying \(O_D\) to encode \(D_{j, \theta }(l)\) in the D-register gives
Performing a rotation on the rotation register conditioned on \({|{D_{j,\theta }(l)}\rangle }\) yields
Uncomputing the D-register gives the state
where the first part is the desired outcome after the diagonal transformation, and \({|{\perp }\rangle }\) represents a quantum state with the rotation register being \({|{1}\rangle }\). Finally, applying \((\sum _{j=0}^{d-1} {|{j}\rangle }{\langle {j}|} \otimes \mathcal {F}_j)\) on the registers \(({|{j}\rangle }\otimes {|{l}\rangle })\) completes the operation for computing the partial derivatives as discussed before, which yields an approximation of
By measuring the ancilla rotation register to get 0 and discarding the D-register, we get a quantum state approximately proportional to
which encodes the partial derivatives in the amplitude controlled by a dimension register. The entire quantum circuit is summarized in Fig. 5, and the overall complexity estimate is given in the following theorem, of which the proof can be found in the appendix Sect. G.
Theorem 6.2
Let f(t, x) be a smooth function and \(\textbf{f}\) be a possibly unnormalized vector
Assume that f satisfies periodic boundary condition for x and \(\sup _{j,p} (\Vert \partial _{x_j}^p f\Vert _{\infty })^{1/p} < \pi n\). Then for any \(0< \epsilon < 1\), \(0< \delta < 1\), there exists a quantum algorithm which, with probability at least \((1-\delta )\), outputs an \(\epsilon \)-approximation of the state proportional to
using queries to \(\mathcal {A}(\epsilon /Q)\) and \(O_D\) for \(\mathcal {O}(Q\log (1/\delta ))\) times and additional \(\mathcal {O}(d (\log n)^2)\) gates, where
We briefly compare our result with the standard classical approach of computing gradients in terms of the dimension parameters n and d. On the one hand, the classical cost of computing the gradient at all discrete grid points typically scales as \(\mathcal {O}(d n^d)\), since the sizes of both the vector storing the gradients and the matrices related to finite differences and the discrete Fourier transform scale as \(\mathcal {O}(d n^d)\). On the other hand, the corresponding scaling of our quantum approach is subtler, since it depends on the scaling of the quantity Q. Notice that this quantity can be very large if the function f is close to a constant function. However, it can also be \(\mathcal {O}(1)\) if at least one of the non-trivial Fourier components of f is of order \(\Omega (1)\), since in this case \(\Vert \textbf{f}\Vert \sim \mathcal {O}(\sqrt{mn^d})\) and \(\Vert \mathbf {\nabla f}\Vert \sim \Omega (\sqrt{dmn^d})\). In this scenario, the overall complexity scales only polynomially in d and poly-logarithmically in n, which achieves exponential speedup in terms of the dimension parameters.
The cost of computing the ratio of the kinetic energy can be directly estimated by combining Theorems 6.1 and 6.2. As discussed before, we obtain a quadratic speedup in the precision compared to the classical Monte Carlo method. The result is summarized in the following corollary.
Corollary 6.3
Let f(t, x) be a smooth function such that f satisfies periodic boundary condition for x, and \(\sup _{j,p} (\Vert \partial _{x_j}^p f\Vert _{\infty })^{1/p} < \pi n\). Assume that we are given a unitary transform \((I-2P)\) for the projector P associated with the domain \(\mathscr {D}\). Then for any \(0< \epsilon < 1\), \(0< \delta < 1\), there exists a quantum algorithm which can output an approximation of (6.5) within tolerated error \(\epsilon \) and probability at least \((1-\delta )\), using queries to \((I-2P)\) for \(\mathcal {O}((1/\epsilon )\log (1/\delta ))\) times, queries to \(\mathcal {A}(\epsilon /(4Q))\) and \(O_D\) for \(\mathcal {O}( Q(\log (1/\delta ))^2/\epsilon )\) times and additional \(\mathcal {O}(d (\log n )^2 \log (1/\delta ) /\epsilon )\) gates, where
6.3 History state and decay of kinetic energy
Unlike existing quantum differential equation solvers that typically output a final state encoding the solution at the final time, our Carleman linearization algorithm, and the input model assumed in this section, produce a more general history state encoding the solutions at all time steps. In this subsection, we briefly discuss how the history state structure may broaden the applications of our algorithm.
One potential application is to study the kinetic energy curve. As discussed in [10, 85], applied mathematicians are interested in the curve describing the decay of the kinetic energy of the gradient flow (2.4), and particularly in the time of reaching equilibrium, i.e., the time when the kinetic energy almost stops changing. We define this to be the equilibrium time \(t^*\); it can be easily estimated by combining the history state with our algorithm for computing gradients in Fig. 5. Specifically, we first run the algorithm in Fig. 5 to obtain an approximation of the history state of gradients of the solution as in (6.6), and then measure the time register \({|{k}\rangle }\) to obtain an integer k. Notice that after the equilibrium time \(t^*\), when the kinetic energy almost stops changing, the corresponding partial derivatives are very close to 0. This implies that we have almost no measurement outcomes after \(t^*\). By repeating this procedure and taking the maximum \(k_{\max }\) of the measurement outcomes, we can use \(k_{\max }h\) to estimate \(t^*\) with high probability.
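A toy simulation of this estimator, with a hypothetical exponentially decaying kinetic-energy profile standing in for \(\Vert \nabla u(kh)\Vert ^2\): measuring the time register returns outcome k with Born-rule probability proportional to that weight, and the largest observed k estimates the equilibrium time.

```python
import numpy as np

rng = np.random.default_rng(1)
m, h = 200, 0.01
t = np.arange(m + 1) * h
grad_norm2 = np.exp(-40 * t)              # hypothetical fast-decaying kinetic energy
p = grad_norm2 / grad_norm2.sum()         # Born-rule probabilities of outcome k

samples = rng.choice(m + 1, size=500, p=p)
t_star_est = samples.max() * h            # estimate of the equilibrium time
print(t_star_est)
```

Outcomes beyond the decay scale are exponentially unlikely, so the maximum of a modest number of samples concentrates near the time when the kinetic energy has effectively vanished.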
The history state structure also allows us to overcome a potential exponential cost in differential equation solvers caused by the decay of the solution. To the best of our knowledge, most existing quantum differential equation solvers that output a final state [44,45,46, 59] scale at least linearly in the parameter \(\Vert u_{\text {in}}\Vert /\Vert u_{\text {out}}\Vert \), where \(u_{\text {in}}\) and \(u_{\text {out}}\) denote the unnormalized solutions of the differential equation at the initial and final time, respectively. Such a linear dependence is typically caused by post-processing a quantum state obtained from solving a specific linear system of equations to get the desired final state, and it may introduce an extra exponential time dependence if the solution of the differential equation decays rapidly. A simple example is the imaginary time evolution
where H is a positive definite matrix. Here \(\Vert u(T)\Vert = \Vert \exp (-HT)u(0)\Vert \le \exp (-\lambda T) \Vert u(0)\Vert \) with \(\lambda \) being the smallest eigenvalue of H, and thus \(\Vert u(0)\Vert /\Vert u(T)\Vert \ge \exp (\lambda T)\), leading to an extra exponentially large term in T. Another example is the nonlinear ordinary differential equation with no constant term, which can be explicitly written down as
It is proved in [59] that, if \(F_1\) only has negative eigenvalues and the nonlinearity is relatively weak in a certain sense (namely, the parameter R is smaller than 1), then \(\Vert u(t)\Vert \) decays exponentially in t, which also leads to an exponentially large \(\Vert u_{\text {in}}\Vert /\Vert u_{\text {out}}\Vert \) in T.
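As a numerical illustration of the imaginary time evolution example, the sketch below (with an arbitrary random positive definite H, chosen purely for demonstration) checks that \(\Vert u(0)\Vert /\Vert u(T)\Vert \ge \text {e}^{\lambda T}\), where \(\lambda \) is the smallest eigenvalue of H.

```python
import numpy as np

# Imaginary time evolution u(T) = exp(-H*T) u(0) with H positive definite:
# ||u(T)|| <= exp(-lambda*T) ||u(0)||, so the norm ratio is exponential in T.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
H = A @ A.T + np.eye(6)                  # symmetric positive definite
w, V = np.linalg.eigh(H)
lam = w.min()                            # smallest eigenvalue (>= 1 here)

u0 = rng.standard_normal(6)
T = 3.0
uT = V @ (np.exp(-w * T) * (V.T @ u0))   # exp(-H*T) @ u0 via the eigenbasis

ratio = np.linalg.norm(u0) / np.linalg.norm(uT)
print(ratio, np.exp(lam * T))            # ratio >= exp(lam * T)
```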
With a history state, we can run a “pre-diagnosis” to first identify whether the final state is close to 0. The key observation is that if the solution has decayed sufficiently close to 0, then the success probability of obtaining the corresponding time step by measuring the time register is exponentially small. In particular, assume that we are interested in obtaining an approximation of the final solution u(T) when \(\Vert u(T)\Vert \) is very small, or the corresponding state \(u(T)/\Vert u(T)\Vert \) when \(\Vert u(T)\Vert \) is reasonably away from 0. We can repeatedly prepare a history state, measure the time register, and obtain the outcome k. If \(kh < T\) for all k’s, then the final solution is expected to be exponentially small with high probability, so we can stop and directly use 0 as the approximation of the solution. On the other hand, if there exists a k such that \(kh = T\), then this is a reasonable indication that the quantity \(\Vert u_{\text {in}}\Vert /\Vert u_{\text {out}}\Vert \) is not too large, and we can follow the standard post-processing procedure to obtain the final state \(u(T)/\Vert u(T)\Vert \). The history state thus helps us determine which scenario the differential equation falls into, without exponential cost in the evolution time T.
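The pre-diagnosis can be sketched classically as follows. Sampling the time register is modeled by drawing step k with probability proportional to \(\Vert u(kh)\Vert ^2\); the two norm profiles are illustrative stand-ins.

```python
import numpy as np

rng = np.random.default_rng(2)

def final_step_observed(norms, n_samples, rng):
    """Classical sketch of the pre-diagnosis: sample the time register of a
    history state whose step-k amplitude is proportional to ||u(kh)||, and
    report whether the final step ever appears among the samples."""
    p = norms ** 2 / np.sum(norms ** 2)
    samples = rng.choice(len(norms), size=n_samples, p=p)
    return bool(np.any(samples == len(norms) - 1))

decaying = np.exp(-0.5 * np.arange(50))   # rapid decay: final step unseen
flat = np.ones(50)                        # no decay: final step shows up

obs_decay = final_step_observed(decaying, 100, rng)
obs_flat = final_step_observed(flat, 2000, rng)
print(obs_decay, obs_flat)  # essentially always: False True
```

When the final step is never observed, one reports u(T) ≈ 0 and skips the costly post-processing; when it is observed, one proceeds to extract \(u(T)/\Vert u(T)\Vert \) as usual.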
7 Discussion
We have presented an efficient quantum algorithm for the gradient flow evolution of reaction–diffusion equations. We improved the Carleman linearization to converge under the condition \(R_D < 1\), which relaxes the previous condition \(R<1\) in [59] for high-dimensional systems of nonlinear differential equations. In addition, we discussed estimating the mean square amplitude and ratios of the kinetic energy of the gradient flow as potential applications.
This work raises several natural open problems. The first concerns further improvement of our algorithm. Though our work focuses on improving the convergence condition for the Carleman linearization, it is also interesting to seek further improvements in the dependence on other parameters in the complexity, such as the evolution time and the error tolerance. Another related topic is to obtain meaningful classical outputs with super-polynomial quantum speedups over the best-known classical algorithms. It is also an interesting question to generalize our algorithms, or design new quantum algorithms, for other types of nonlinear differential equations. Our analysis relies heavily on the maximum principle and good regularity of the solutions, which are essential properties of reaction–diffusion equations. At a high level, the maximum principle controls the norm of the solution, in the same spirit as the norm-preserving property of Hamiltonian simulation. It is thus an interesting direction to consider other norm-controlled problems, such as gradient flows structured on norm-preserving manifolds, and to see whether such confinement helps tame the nonlinearity at hand.
However, relaxing the regularity assumptions on the solutions seems to be a fundamentally difficult problem. All linearization-based techniques require the solutions to be well-posed and regular, which is not the case for applications such as conservation laws, fluid dynamics, and Hamilton–Jacobi equations, where the solutions blow up in finite time. Therefore, other approaches beyond the linearization framework, requiring new insights, are desired for such nonlinear problems. We also point out that quantum Carleman linearization based approaches may suffer from an overhead sensitive to spatial grid refinement for partial differential equations. Though our improved \(\ell _\infty \)-framed convergence criterion overcomes the sensitivity to grid refinement, this sensitivity can be reintroduced through the \(\ell _2\) norm dependence when implementing the quantum algorithm, since the \(\ell _2\) norm of the solution discretized by finite differences still grows as more grid points are used. In particular, this dependence appears when extracting the solution from the large Carleman state vector. Making the quantum algorithm fully insensitive to spatial grid refinement, perhaps via a spatial discretization that preserves the \(\ell _2\) norm under refinement such as a Fourier discretization, is left for future work.
In the application section, we use the amplitude estimation technique to obtain classical information beyond the quantum state output, and we study the kinetic energy distribution by preparing a state with derivative information. The techniques we propose in this section do not rely on specific models or differential equations. It is thus interesting to find applications of our techniques to output classical information for other problems, such as phase separation and transition, chemical reactions, and self-organized biological patterns. We remark that all the classical outputs we study in this work are ratios. While obtaining the absolute value rather than the ratio seems to require accurate computation of the observable on the entire domain, which might incur exponential overhead, it remains very interesting to study whether the absolute value of observables can be approximated with only a relatively small overhead. Our algorithm for preparing quantum states of derivatives requires regularity of the function, and we would like to understand its performance for non-smooth functions as well. The derivative states may also be of interest for quantum optimization problems that require gradient information to optimize the objective function; this will be our future work.
Notes
Have at most s nonzero entries in each column and row.
Given any nonlinear ODE, we can rescale \(u\rightarrow \alpha u\) with a proper \(\alpha \) to ensure \(\Vert F_M\Vert \leqslant |\lambda _1|\).
Note that, following the standard convention of choosing signs in [84], the quantum Fourier transform exactly corresponds to the inverse discrete Fourier transform in the classical setting.
References
Lu, Y., Hu, D.: Optimisation of biological transport networks. East Asian J. Appl. Math. 12(1), 72–95 (2022)
Hu, D., Cai, D.: Adaptation and optimization of biological transport networks. Phys. Rev. Lett. 111, 138701 (2013)
Haskovec, J., Markowich, P., Perthame, B.: Mathematical analysis of a PDE system for biological network formation. Commun. Partial Differ. Equ. 40(5), 918–956 (2015)
Haskovec, J., Markowich, P., Perthame, B., Schlottbom, M.: Notes on a PDE system for biological network formation. Nonlinear Anal. 138, 127–155 (2016)
Albi, G., Artina, M., Fornasier, M., Markowich, P.A.: Biological transportation networks: modeling and simulation. Anal. Appl. 14(1), 185–206 (2016)
Haskovec, J., Kreusser, L.M., Markowich, P.: ODE and PDE based modeling of biological transportation networks. (2018). arXiv:1805.08526
Burger, M., Haskovec, J., Markowich, P., Ranetbauer, H.: A mesoscopic model of biological transportation networks (2018). arXiv:1806.00120
Haskovec, J., Kreusser, L.M., Markowich, P.: Rigorous continuum limit for the discrete network formation problem (2018). arXiv:1808.01526
Albi, G., Burger, M., Haskovec, J., Markowich, P., Schlottbom, M.: Continuum modeling of biological network formation. In: Modeling and Simulation in Applied Sciences, Engineering, and Technology, pp. 1–48. Birkhäuser/Springer, Cham (2017)
Fang, D., Jin, S., Markowich, P., Perthame, B.: Implicit and semi-implicit numerical schemes for the gradient flow of the formation of biological transport networks. SMAI J. Comput. Math. 5, 229–249 (2019)
Garvie, M.R.: Finite-difference schemes for reaction–diffusion equations modeling predator–prey interactions in MATLAB. Bull. Math. Biol. 69(3), 931–956 (2007)
Malchow, H.: Spatiotemporal Patterns in Ecology and Epidemiology: Theory, Models, and Simulation. Chapman and Hall/CRC, London (2007)
Petrovskii, S.V., Malchow, H.: A minimal model of pattern formation in a prey–predator system. Math. Comput. Model. 29(8), 49–63 (1999)
Lefèvre, J., Mangin, J.-F.: A reaction–diffusion model of human brain development. PLoS Comput. Biol. 6(4), e1000749 (2010)
Habib, S., Molina-París, C., Deisboeck, T.S.: Complex dynamics of tumors: modeling an emerging brain tumor system with coupled reaction–diffusion equations. Phys. A 327(3–4), 501–524 (2003)
Murray, J.D.: Mathematical Biology II: Spatial Models and Biomedical Applications, vol. 3. Springer, New York (2001)
Murray, J.D.: Mathematical Biology I: An Introduction. Interdisciplinary Applied Mathematics. Springer, Berlin (2002)
Genieys, S., Volpert, V., Auger, P.: Pattern and waves for a model in population dynamics with nonlocal consumption of resources. Math. Model. Nat. Phenom. 1(1), 63–80 (2006)
Meinhardt, H.: Models of Biological Pattern Formation. Academic Press, New York (1982)
Golding, I., Kozlovsky, Y., Cohen, I., Ben-Jacob, E.: Studies of bacterial branching growth using reaction–diffusion models for colonial development. Phys. A 260(3–4), 510–554 (1998)
Mimura, M., Sakaguchi, H., Matsushita, M.: Reaction–diffusion modelling of bacterial colony patterns. Phys. A 282(1–2), 283–303 (2000)
Berestycki, H., Nicolaenko, B., Scheurer, B.: Traveling wave solutions to combustion models and their singular limits. SIAM J. Math. Anal. 16(6), 1207–1242 (1985)
Zeldovich, I.A., Barenblatt, G.I., Librovich, V.B., Makhviladze, G.M.: Mathematical Theory of Combustion and Explosions. Springer, New York (1985)
Poinsot, T., Veynante, D.: Theoretical and Numerical Combustion. RT Edwards Inc., Philadelphia (2005)
Perthame, B.: Growth, Reaction, Movement and Diffusion from Biology. Lecture Notes, Université Paris 6 (2012)
Means, S., Smith, A.J., Shepherd, J., Shadid, J., Fowler, J., Wojcikiewicz, R.J.H., Mazel, T., Smith, G.D., Wilson, B.S.: Reaction diffusion modeling of calcium dynamics with realistic ER geometry. Biophys. J. 91(2), 537–557 (2006)
Bertozzi, A.L., Flenner, A.: Diffuse interface models on graphs for classification of high dimensional data. Multiscale Model. Simul. 10(3), 1090–1118 (2012)
Bertozzi, A.L., Flenner, A.: Diffuse interface models on graphs for classification of high dimensional data. SIAM Rev. 58(2), 293–328 (2016)
Merkurjev, E., Kostic, T., Bertozzi, A.L.: An MBO scheme on graphs for classification and image processing. SIAM J. Imaging Sci. 6(4), 1903–1930 (2013)
Bertozzi, A.L., Esedoglu, S., Gillette, A.: Inpainting of binary images using the Cahn–Hilliard equation. IEEE Trans. Image Process. 16(1), 285–291 (2006)
Dobrosotskaya, J.A., Bertozzi, A.L.: A wavelet-Laplace variational technique for image deconvolution and inpainting. IEEE Trans. Image Process. 17(5), 657–663 (2008)
Esedoglu, S., March, R.: Segmentation with depth but without detecting junctions. J. Math. Imaging Vis. 18(1), 7–15 (2003)
Esedoglu, S., Tsai, Y.-H.R.: Threshold dynamics for the piecewise constant Mumford–Shah functional. J. Comput. Phys. 211(1), 367–384 (2006)
Bellman, R.: Dynamic Programming. Princeton University Press, Princeton (1957)
Ambainis, A.: Variable time amplitude amplification and quantum algorithms for linear algebra problems. In: 29th Symposium on Theoretical Aspects of Computer Science, vol. 14, pp. 636–647. LIPIcs (2012). arXiv:1010.4458
An, D., Lin, L.: Quantum linear system solver based on time-optimal adiabatic quantum computing and quantum approximate optimization algorithm. ACM Trans. Quantum Comput. 3(2), 1–28 (2022)
Childs, A.M., Kothari, R., Somma, R.D.: Quantum algorithm for systems of linear equations with exponentially improved dependence on precision. SIAM J. Comput. 46(6), 1920–1950 (2017). arXiv:1511.02306
Gilyén, A., Su, Y., Low, G.H., Wiebe, N.: Quantum singular value transformation and beyond: exponential improvements for quantum matrix arithmetics. In Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, pp. 193–204 (2019). arXiv:1806.01838
Harrow, A.W., Hassidim, A., Lloyd, S.: Quantum algorithm for linear systems of equations. Phys. Rev. Lett. 103(15), 150502 (2009). arXiv:0811.3171
Lin, L., Tong, Y.: Optimal quantum eigenstate filtering with application to solving quantum linear systems. Quantum 4, 361 (2020). arXiv:1910.14596
Subaşı, Y., Somma, R.D., Orsucci, D.: Quantum algorithms for systems of linear equations inspired by adiabatic quantum computing. Phys. Rev. Lett. 122(6), 060504 (2019). arXiv:1805.10549
Tong, Yu., An, D., Wiebe, N., Lin, L.: Fast inversion, preconditioned quantum linear system solvers, fast Green’s-function computation, and fast evaluation of matrix functions. Phys. Rev. A 104(3), 032422 (2021). arXiv:2008.13295
Costa, P.C.S., An, D., Sanders, Y.R., Su, Y., Babbush, R., Berry, D.W.: Optimal scaling quantum linear systems solver via discrete adiabatic theorem (2021). arXiv:2111.08152
Berry, D.W.: High-order quantum algorithm for solving linear differential equations. J. Phys. A: Math. Theor. 47(10), 105301 (2014). arXiv:1010.2745
Berry, D.W., Childs, A.M., Ostrander, A., Wang, G.: Quantum algorithm for linear differential equations with exponentially improved dependence on precision. Commun. Math. Phys. 356(3), 1057–1081 (2017). arXiv:1701.03684
Childs, A.M., Liu, J.-P.: Quantum spectral methods for differential equations. Commun. Math. Phys. 375, 1427–1457 (2020). arXiv:1901.00961
Fang, D., Lin, L., Tong, Yu.: Time-marching based quantum solvers for time-dependent linear differential equations. Quantum 7, 955 (2023)
Clader, B.D., Jacobs, B.C., Sprouse, C.R.: Preconditioned quantum linear system algorithm. Phys. Rev. Lett. 110(25), 250504 (2013). arXiv:1301.2340
Cao, Y., Papageorgiou, A., Petras, I., Traub, J., Kais, S.: Quantum algorithm and circuit design solving the Poisson equation. New J. Phys. 15(1), 013021 (2013). arXiv:1207.2485
Montanaro, A., Pallister, S.: Quantum algorithms and the finite element method. Phys. Rev. A 93(3), 032324 (2016). arXiv:1512.05903
Costa, P.C.S., Jordan, S., Ostrander, A.: Quantum algorithm for simulating the wave equation. Phys. Rev. A 99(1), 012323 (2019). arXiv:1711.05394
Childs, A.M., Liu, J.-P., Ostrander, A.: High-precision quantum algorithms for partial differential equations. Quantum 5, 574 (2021). arXiv:2002.07868
Engel, A., Smith, G., Parker, S.E.: Quantum algorithm for the Vlasov equation. Phys. Rev. A 100(6), 062315 (2019). arXiv:1907.09418
Linden, N., Montanaro, A., Shao, C.: Quantum vs. classical algorithms for solving the heat equation (2020). arXiv:2004.06516
Leyton, S.K., Osborne, T.J.: A quantum algorithm to solve nonlinear differential equations (2008). arXiv:0812.4423
Abrams, D.S., Lloyd, S.: Nonlinear quantum mechanics implies polynomial-time solution for NP-complete and #P problems. Phys. Rev. Lett. 81(18), 3992 (1998). arXiv:quant-ph/9801041
Aaronson, S.: NP-complete problems and physical reality. ACM SIGACT News 36(1), 30–52 (2005). arXiv:quant-ph/0502072
Childs, A.M., Young, J.: Optimal state discrimination and unstructured search in nonlinear quantum mechanics. Phys. Rev. A 93(2), 022314 (2016). arXiv:1507.06334
Liu, J.-P., Kolden, H.Ø., Krovi, H.K., Loureiro, N.F., Trivisa, K., Childs, A.M.: Efficient quantum algorithm for dissipative nonlinear differential equations. Proc. Natl. Acad. Sci. 118(35) (2021). arXiv:2011.03185
Carleman, T.: Application de la théorie des équations intégrales linéaires aux systèmes d’équations différentielles non linéaires. Acta Math. 59(1), 63–87 (1932)
Kowalski, K., Steeb, W.-H.: Nonlinear Dynamical Systems and Carleman Linearization. World Scientific, Singapore (1991)
Forets, M., Pouly, A.: Explicit error bounds for Carleman linearization (2017). arXiv:1711.02552
Krovi, H.: Improved quantum algorithms for linear and nonlinear differential equations. Quantum 7, 913 (2023)
Dodin, I.Y., Startsev, E.A.: On applications of quantum computing to plasma simulations. Phys. Plasmas 28(9), 092101 (2021). arXiv:2005.14369
Joseph, I.: Koopman-von Neumann approach to quantum simulation of nonlinear classical dynamics. Phys. Rev. Res. 2(4), 043102 (2020). arXiv:2003.09980
Lloyd, S., De Palma, G., Gokler, C., Kiani, B., Liu, Z.-W., Marvian, M., Tennie, F., Palmer, T.: Quantum algorithm for nonlinear differential equations (2020). arXiv:2011.06571
Engel, A., Smith, G., Parker, S.E.: Linear embedding of nonlinear dynamical systems and prospects for efficient quantum algorithms. Phys. Plasmas 28(6), 062305 (2021). arXiv:2012.06681
Tronci, C., Joseph, I.: Koopman wavefunctions and Clebsch variables in Vlasov–Maxwell kinetic theory (2021). arXiv:2105.00294
Jin, S., Liu, N.: Quantum algorithms for computing observables of nonlinear partial differential equations (2022). arXiv:2202.07834
Dodin, I.Y., Startsev, E.A.: Quantum computation of nonlinear maps (2021). arXiv:2105.07317
Xue, C., Yu-Chun, W., Guo, G.-P.: Quantum homotopy perturbation method for nonlinear dissipative ordinary differential equations. New J. Phys. 23(12), 123035 (2021). arXiv:2111.07486
Brassard, G., Hoyer, P., Mosca, M., Tapp, A.: Quantum amplitude amplification and estimation. Contemp. Math. 305, 53–74 (2002). arXiv:quant-ph/0005055
Allen, S.M., Cahn, J.W.: A microscopic theory for antiphase boundary motion and its application to antiphase domain coarsening. Acta Metall. 27(6), 1085–1095 (1979)
Fisher, R.A.: The wave of advance of advantageous genes. Ann. Eugen. 7(4), 355–369 (1937)
Van Gennip, Y., Bertozzi, A.L., et al.: \(\Gamma \)-convergence of graph Ginzburg–Landau functionals. Adv. Differ. Equ. 17(11/12), 1115–1180 (2012). arXiv:1204.5220
Van Gennip, Y., Guillen, N., Osting, B., Bertozzi, A.L.: Mean curvature, threshold dynamics, and phase field theory on finite graphs. Milan J. Math. 82(1), 3–65 (2014). https://doi.org/10.1007/s00032-014-0216-8
Luo, X., Bertozzi, A.L.: Convergence of the graph Allen–Cahn scheme. J. Stat. Phys. 167(3–4), 934–958 (2017)
Kiani, B.T., De Palma, G., Englund, D., Kaminsky, W., Marvian, M., Lloyd, S.: Quantum advantage for differential equation analysis (2020). arXiv:2010.15776
Brauer, F., Nohel, J.A.: The Qualitative Theory of Ordinary Differential Equations: An Introduction. Dover Books on Mathematics. Dover Publications, Mineola (2012)
Berry, D.W., Ahokas, G., Cleve, R., Sanders, B.C.: Efficient quantum algorithms for simulating sparse Hamiltonians. Commun. Math. Phys. 270, 359–371 (2007). arXiv:quant-ph/0508139
Costa, P.C.S., An, D., Sanders, Y.R., Su, Y., Babbush, R., Berry, D.W.: Optimal scaling quantum linear-systems solver via discrete adiabatic theorem. PRX Quantum 3, 040303 (2022)
An, D., Liu, J.-P., Wang, D., Zhao, Q.: A theory of quantum differential equation solvers: limitations and fast-forwarding (2023)
Burden, R.L., Faires, J.D., Reynolds, A.C.: Numerical Analysis. Brooks Cole, Pacific Grove (2000)
Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information. Cambridge University Press, Cambridge (2000)
Perthame, B.: Parabolic Equations in Biology: Growth, Reaction, Movement and Diffusion. Lecture Notes on Mathematical Modelling in the Life Sciences. Springer, Cham (2015)
Jerrum, M.R., Valiant, L.G., Vazirani, V.V.: Random generation of combinatorial structures from a uniform distribution. Theor. Comput. Sci. 43, 169–188 (1986)
Acknowledgements
We thank Andrew M. Childs and Lin Lin for valuable discussions. DA acknowledges the hospitality of the Simons Institute for the Theory of Computing in Berkeley. DA, DF and JW acknowledge support from the Challenge Institute for Quantum Computation funded by NSF through grant number OMA-2016245, and from the Department of Energy under grant No. DE-SC0017867. DA acknowledges support from the Department of Defense through the Hartree Postdoctoral Fellowship at QuICS, and from the NSF under Grant No. DMS-1652330. DF is supported by NSF grant number DMS-2208416. JPL acknowledges support from the Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Quantum Algorithms Teams and Accelerated Research in Quantum Computing programs, the National Science Foundation awards (CCF-1813814, DMS-2008568), and the National Science Foundation Quantum Information Science and Engineering Network (QISE-NET) triplet award (DMR-1747426).
Author information
Authors and Affiliations
Corresponding author
Additional information
Communicated by G. Chiribella.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Proof of Lemma 2.1
We now discuss the proof of the a priori estimate of the solution. Before that, we introduce the comparison principle lemma for the discrete reaction–diffusion equation (2.7), which implies the a priori estimate as a direct consequence.
Lemma A.1
(Comparison principle). Assume \(f(u) \in C^\infty (\mathbb {R})\) and \(U(t) = [U_1, \ldots , U_n]^T\), \(V(t)= [V_1, \ldots , V_n]^T\) are continuous functions that satisfy
for \(t \in (0, T]\) and all the multi-indices \(j \in \mathcal {I}\). Furthermore, \(U_j(0) \leqslant V_j(0)\) for all \(j \in \mathcal {I}\) and \(U_j(t) \leqslant V_j(t)\) for \(j \in \mathcal {B}\) and \(t\in [0, T]\). Then it holds that
for all j and all time \(t\in (0, T]\).
We remark that Lemma A.1 immediately implies Lemma 2.1: both \(U_j(t) = \gamma _1\) and \(U_j(t) = \gamma _2\) (for all j and t) are solutions to (2.7), and hence the comparison principle can be applied to the solution of interest and these constant-valued equilibrium solutions, which yields the desired result.
Proof
Our proof can be split into the following three steps. In the first two steps, we consider a linear operator of U given by \(\frac{\textrm{d}U_j}{\textrm{d}t} - D \left( \Delta _h U \right) _j + {\tilde{C}}_j U_j\) for some vector \({\tilde{C}} = \left( \tilde{C_{j}}\right) \), and show a maximum principle result for this linear operator. In the last step, we prove the comparison principle for the nonlinear problem. It is worth pointing out that although a linear problem is considered at first, there is no linearization procedure introducing any extra error here.
First, we claim that if
for some \({\tilde{C}}\) with all entries positive, then it holds that
where \(\mathcal {B}\) denotes the boundary indices and \(V^+_{j}: = \max (V_j, 0)\).
Suppose the claim does not hold. Then there exist some time \(t_0 \in (0, T]\) and \(j_0 \not \in \mathcal {B}\) such that \(V_j(t)\) attains its positive maximum value at \((t_0, j_0)\). Note that the definition (2.8) of \(\Delta _h\) can be written in the following equivalent form
and hence one has \(-\left( \Delta _h V \right) _{j_0} \ge 0\). Meanwhile, if \(t_0\in (0,T)\), then
otherwise, \(t_0 = T\), and one has
Therefore, one has
which is a contradiction. This completes the proof of the claim.
In the second step, we relax the conditions of the claim in two ways: the condition on C is weakened from positive to bounded from below, and equality is allowed. To be precise, we shall show that for C satisfying \(C_j \ge c_{\textrm{min}}\) for all \(j \in \mathcal {I}\), if
then
To prove this, we define \({\tilde{V}}(t)\) by the change of variable \({\tilde{V}}(t) = \text {e}^{c_{\textrm{min}} t} V(t) - \delta t\). A straightforward calculation yields
Thus \({\tilde{V}}_j\) satisfies (A.1). Letting \(\delta \rightarrow 0_+\) yields the desired result.
The last part of the proof is to show the comparison principle. Let \(W: = U-V\). One has \(W_j(0) \leqslant 0\) for \(j \in \mathcal {I}\) and \(W_j(t) \leqslant 0\) for all \(t\in (0,T]\) and \(j \in \mathcal {B}\), so that the right-hand side of (A.4) is non-positive. Moreover, W satisfies
and since by the mean value theorem \(f(U_j) - f(V_j) = f'(\xi _j)(U_j - V_j) = f'(\xi _j) W_j\) with \(\xi _j\) in between \(U_j\) and \(V_j\), we arrive at
Applying the result of the second step yields
for all \(j \in \mathcal {I}\) and \(t \in [0, T]\), which completes the proof of this lemma. \(\square \)
Matrix Inequality
Lemma B.1
(Discrete maximum principle). Let \(D_h\) be the discrete 1-dimensional Laplacian operator with homogeneous Dirichlet boundary conditions, then
Proof
In this proof, we will apply the following results,
First, we restrict \(t\le t_0:=\frac{1}{(n+1)^2}\). Then when k is large enough, i.e., \(k\ge 2t_0 (n+1)^2\), one has
Hence,
By continuity, one obtains \(\left\Vert \text {e}^{tD_h}\right\Vert _{\infty }\le 1\). As for \(t>t_0\), we choose a positive integer l such that \(t/l\le t_0\) and get
In this way, we obtain the desired results. \(\square \)
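A quick numerical sanity check (by no means a substitute for the proof) of the bound \(\left\Vert \text {e}^{tD_h}\right\Vert _{\infty }\le 1\) for the Dirichlet Laplacian:

```python
import numpy as np

# Numerical illustration of Lemma B.1: for the 1D discrete Laplacian D_h with
# homogeneous Dirichlet boundary conditions and grid spacing h = 1/(n+1),
# the infinity-norm of exp(t*D_h) stays <= 1.
n = 16
h2 = (n + 1) ** 2
Dh = h2 * (np.diag(-2.0 * np.ones(n))
           + np.diag(np.ones(n - 1), 1)
           + np.diag(np.ones(n - 1), -1))

w, V = np.linalg.eigh(Dh)               # D_h is symmetric, so this is exact

def inf_norm_expm(t):
    E = V @ np.diag(np.exp(t * w)) @ V.T
    return np.max(np.sum(np.abs(E), axis=1))   # infinity-norm = max row sum

max_norm = max(inf_norm_expm(t) for t in [1e-4, 1e-2, 0.1, 1.0])
print(max_norm)                          # <= 1 up to roundoff
```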
Lemma B.2
Let \(D_h^\text {per}\) be the discrete 1-dimensional Laplacian operator with periodic boundary conditions, then
Proof
Using the same argument as in Lemma B.1, we can prove that
Now we only need to prove the inequality in the opposite direction. Let v be the eigenvector of \(D_h^\text {per}\) associated with eigenvalue 0. Then \(\text {e}^{tD_h^\text {per}} v=v\), which implies \(\left\Vert \text {e}^{tD_h^\text {per}}\right\Vert _{\infty }\ge 1\). In this way, we obtain \(\left\Vert \text {e}^{tD_h^\text {per}}\right\Vert _{\infty }= 1\). \(\square \)
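The equality \(\left\Vert \text {e}^{tD_h^\text {per}}\right\Vert _{\infty }= 1\) can likewise be checked numerically; the grid spacing \(h = 1/n\) on the periodic grid is an assumption of this sketch.

```python
import numpy as np

# Numerical check of Lemma B.2: the row sums of the periodic discrete
# Laplacian vanish, so the all-ones vector is fixed by exp(t*D_h^per) and
# its infinity-norm equals 1.
n = 16
h2 = n ** 2                              # 1/h^2 with h = 1/n (assumed here)
Dper = h2 * (-2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1))
Dper[0, -1] = Dper[-1, 0] = h2           # wrap-around couplings

w, V = np.linalg.eigh(Dper)
E = V @ np.diag(np.exp(0.3 * w)) @ V.T   # exp(t*Dper) at t = 0.3
inf_norm = np.max(np.sum(np.abs(E), axis=1))
print(inf_norm)                          # ~1.0
```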
Lemma B.3
Let \(D_h^\text {Dir}\) be the discrete 1-dimensional Laplacian operator with homogeneous Dirichlet boundary conditions, and denote \(\mu _1\ge \mu _2\ge \cdots \ge \mu _n\) to be the eigenvalues of \(D_h^\text {Dir}\). Then
Proof
We use the decomposition of \(D_h^\text {Dir}\):
Here \(\mu _k=-4(n+1)^2\left( \sin \left( \frac{k\pi }{2n+2}\right) \right) ^2\). Moreover, we know that for any \(3\le k\le n\),
In the last step, we use the inequality \(\frac{2x}{\pi }\le \sin (x)\le x\) for \(x\in [0,\frac{\pi }{2}]\). Note that
For any \(1\le k\le n\), we have
We notice that for any k,
where we use the equality \(\sum _{l=1}^n\sin \left( \frac{l \pi }{n+1}\right) =\frac{1}{\tan \left( \frac{ \pi }{2n+2}\right) }\) and \(\tan (x)\ge x\) for \(x\in [0,\frac{\pi }{2})\).
We also notice that
In this way, we obtain
\(\square \)
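The trigonometric identity \(\sum _{l=1}^n\sin \left( \frac{l \pi }{n+1}\right) =\frac{1}{\tan \left( \frac{ \pi }{2n+2}\right) }\) used in the proof above is easy to verify numerically:

```python
import numpy as np

# Check sum_{l=1}^{n} sin(l*pi/(n+1)) = 1/tan(pi/(2n+2)) for several n.
max_err = 0.0
for n in [4, 16, 101]:
    lhs = np.sum(np.sin(np.arange(1, n + 1) * np.pi / (n + 1)))
    rhs = 1.0 / np.tan(np.pi / (2 * n + 2))
    max_err = max(max_err, abs(lhs - rhs))
    print(n, lhs, rhs)

print(max_err)    # agrees to floating-point accuracy
```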
Combining Lemmas B.1 and B.3, we obtain the following results.
Lemma B.4
Let \(\mu _1\) be the largest eigenvalue of 1-dimensional \(D_h^\text {Dir}\) with homogeneous Dirichlet boundary conditions. Given \(t\in \mathbb {R}^+\), for arbitrary \(\mu >\mu _1\), we have the decay estimate
Proof
The case when \(t < \frac{\ln (3)}{2(\mu -\mu _1)}\) directly follows from Lemma B.1. When \(t \ge \frac{\ln (3)}{2(\mu -\mu _1)}\), recall that
Then we have
Notice that
for any \(\mu >\mu _1\), we get \(\left\Vert \text {e}^{tD_h^\text {Dir}}\right\Vert _{\infty }\le \text {e}^{t\mu }\). Together with Lemma B.5, we obtain the desired bound. \(\square \)
We introduce a lemma about the decay estimate of the discrete heat semigroup.
Lemma B.5
(Decay of Discrete Heat Semigroup). Let \(\Delta _h\) be the discrete d-dimensional Laplacian operator defined in (2.8) with homogeneous Dirichlet boundary conditions imposed on \(d_1\) dimensions and periodic boundary conditions imposed on \(d_2\) dimensions (\(d_1 + d_2 = d\)). Given \(t\in \mathbb {R}^+\), for arbitrary \(\mu >\mu _1\), we have the decay estimate
where \(\mu _1=-4(n+1)^2\sin ^2\left( \frac{\pi }{2n+2}\right) \).
Proof
Thanks to (2.8), one has
It suffices to consider the heat kernel for each dimension. For the dimensions with homogeneous Dirichlet boundary conditions, we have
Similarly, for the dimensions with periodic boundary conditions, we have
Combining these, we obtain
\(\square \)
Lemma B.6
Let \(\Delta _h\) be the discrete d-dimensional Laplacian operator defined in (2.8) with homogeneous Dirichlet boundary conditions imposed on \(d_1\) dimensions and periodic boundary conditions imposed on \(d_2\) dimensions (\(d_1 + d_2 = d\)). Given \(j\in \mathbb {Z}^+\), \(t\in \mathbb {R}^+\), and \(a\in \mathbb {R}\) such that \(\lambda _1 = D d_1 \mu _1 + a < 0\), for arbitrary \(\lambda \) such that \(\lambda _1<\lambda < 0\), we have the decay estimate
where \(\mu _1=-4(n+1)^2\sin ^2\left( \frac{\pi }{2n+2}\right) \).
Proof
Given any \(\lambda \), define \(\mu :=\frac{\lambda -a}{D d_1}\). Then \(\lambda > \lambda _1\) implies that \(\mu >\mu _1\). According to Lemma B.5, we have
We now integrate the upper bound of \(\Vert \text {e}^{jt (D \Delta _h + a)}\Vert _{\infty }\). When \(a\ne 0\), we have the estimate
When \(a=0\), the integration is reduced to
Since \(\int _0^t \Vert \text {e}^{j(t-s) (D \Delta _h + a)}\Vert _{\infty }\,\textrm{d}s\) is non-decreasing in t, we consider the upper bound for \(t \ge \frac{\ln (3)}{2jD(\mu -\mu _1)}\) to obtain
Then, noticing \(\text {e}^{\frac{\ln (3)}{2D(\mu -\mu _1)} \lambda } < 1\),
Exploiting the definition of \(\mu \), we get
\(\square \)
A Naive Estimate of \(\left\Vert U(t)\right\Vert \)
In this section, we present two estimates of \(\left\Vert U(t)\right\Vert \). Note that these estimates are by no means sharp; tighter bounds would depend on the specific forms of the nonlinearity and initial data.
Lemma C.1
Let U(t) be a solution to (2.14) that satisfies the assumptions of Lemma 2.1, i.e., the initial condition \(\left\Vert U(0)\right\Vert _{\infty }\le \gamma \). Then one has
where \(\lambda _1\) is the largest eigenvalue of \(D\Delta _h+a I\).
Proof
Consider the derivative of \(\left\Vert U(t)\right\Vert \),
In the last inequality, we use the maximum principle described in Lemma 2.1, that is, \(\left\Vert U(t)\right\Vert _{\infty }\le \gamma \) for any \(t>0\). Hence, we obtain
\(\square \)
Remark C.2
Given U(t) satisfying the assumptions of Lemma C.1, we have the following estimate
which follows from the maximum principle.
Remark C.3
When \(R_D<1\), we notice that \(C(\lambda )\ge 1\). So \(\frac{|b|}{|\lambda _1|}\gamma ^{M-1} \le R_D<1\), which implies that \(\lambda _1 +|b|\gamma ^{M-1}<0\). Therefore, we know that the 2-norm of the solution decays.
Lemma C.4
Let U(t) be a solution to (2.14). Suppose that the largest eigenvalue \(\lambda _1\) of \(D\Delta _h+a I\) is negative. If \(\left|b\right|\left\Vert U(0)\right\Vert _2^{M-1}+\lambda _1\ge 0\), then one has
Proof
Similarly, we consider the derivative of \(\left\Vert U(t)\right\Vert \),
In the last inequality, we use \(\left\Vert u\right\Vert _{M+1}\le \left\Vert u\right\Vert _2\) for any \(M\ge 1\). For simplicity, we denote \(\left\Vert U(t)\right\Vert _2\) by y and study the following equation instead,
This is a Bernoulli differential equation, and we only consider its positive solution. Define \(z:=\frac{1}{(M-1)y^{M-1}}\), which satisfies
By using the integrating factor, one gets
If \(\left|b\right|\left\Vert U(0)\right\Vert ^{M-1}+\lambda _1>0\), then
In terms of \(\left\Vert U(t)\right\Vert \), one has
The coefficient \(\frac{1}{\left\Vert U(0)\right\Vert ^{M-1}} +\frac{\left|b\right|}{\lambda _1}\) is negative, and since \(\text {e}^{\lambda _1 (M-1)t}\) is monotonically decreasing, one obtains the desired result.
If \(\left|b\right|\left\Vert U(0)\right\Vert ^{M-1}+\lambda _1=0\), (C.8) implies \(z \ge -\frac{\left|b\right|}{\lambda _1 (M-1)}\). Hence \(\left\Vert U(t)\right\Vert \le \left( \frac{-\lambda _1}{\left|b\right|}\right) ^{\frac{1}{M-1}}=\left\Vert U(0)\right\Vert \), which completes the proof. \(\square \)
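The substitution used in the proof can be checked numerically. Below is a minimal sketch (with illustrative parameters \(\lambda _1=-1\), \(\left|b\right|=2\), \(M=3\), \(y(0)=1\), which are not from the paper), comparing a Runge–Kutta integration of the Bernoulli equation \(y' = \lambda _1 y + \left|b\right| y^M\) against the closed form obtained from the substitution \(z=\frac{1}{(M-1)y^{M-1}}\) and the integrating factor.

```python
import numpy as np

# Illustrative parameters (assumed): lam1 < 0 and |b|*y0**(M-1) + lam1 > 0
lam1, b, M, y0 = -1.0, 2.0, 3, 1.0

def rhs(y):
    # Bernoulli equation y' = lam1*y + |b|*y**M
    return lam1 * y + b * y**M

# fixed-step RK4 integration up to t = 0.3 (before the finite-time blow-up)
h, steps = 1e-4, 3000
y = y0
for _ in range(steps):
    k1 = rhs(y); k2 = rhs(y + h*k1/2); k3 = rhs(y + h*k2/2); k4 = rhs(y + h*k3)
    y += h * (k1 + 2*k2 + 2*k3 + k4) / 6

# closed form from z = 1/((M-1)*y**(M-1)) solved by the integrating factor:
#   y(t)**(1-M) = (y0**(1-M) + b/lam1) * exp(-lam1*(M-1)*t) - b/lam1
t = h * steps
closed = ((y0**(1 - M) + b/lam1) * np.exp(-lam1*(M - 1)*t) - b/lam1) ** (-1/(M - 1))
```

The two values agree to roughly the accuracy of the time integrator, confirming the algebra behind (C.8).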
Proof of Theorem 3.2
Proof
The truncation error \(\eta _j\) satisfies the equation
Applying the variation of constant formula to (D.1), one has
Note that it follows from Lemma 2.1 that
Therefore, according to Lemma 3.1, we have for \(N-M+2 \leqslant j \leqslant N\),
For \(N-2M+3 \leqslant j \leqslant N-M+1\),
One can continue by mathematical induction for every group of \(M-1\) terms and arrive at
where we use \({\overline{R}}=\frac{\Vert F_M\Vert }{|\lambda _1|}\max _t\Vert U(t)\Vert ^{j+M-1}\) as in (3.10).
In practice, we set \(N-1\) to be an integer multiple of \(M-1\), so that
Finally, if \(R\le 1\), according to Lemma C.4, we have \(\Vert U(t)\Vert \le \Vert U_{\textrm{in}}\Vert \) for \(t>0\), and then \(R = {\overline{R}}\). This completes the proof of the desired result. \(\square \)
Estimate of the Preconstant
According to its definition, the value of \(R_D\) depends on the choice of \(\lambda \). In order to obtain a sharper estimate of the approximation error, we seek the value of \(\lambda \) that minimizes \(C(\lambda )\). When \(a=0\), this optimization problem is easy to solve. For any \(\lambda _1<\lambda <0\), one has
and the equality holds when \(\lambda =\frac{\lambda _1}{\sqrt{\frac{\ln (3)}{2}d_1}+1}\). The minimum of \(C(\lambda )\) is \(\left( \sqrt{\frac{\ln (3)}{2}d_1}+1\right) ^2\), which is O(d), and the corresponding value of \(R_D\) is
When \(a\ne 0\), the optimal value of \(\lambda \) can be obtained by solving the following equation
In real applications, we suggest tuning the parameter \(\frac{\lambda }{\lambda _1}\) to obtain a value of \(R_D\) close to the optimum, instead of directly solving the above equation. In addition, we have the following theoretical result regarding \(\min _{\lambda _1<\lambda <0} C(\lambda )\).
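The tuning of \(\frac{\lambda }{\lambda _1}\) can be sketched as a simple one-dimensional scan. In the toy example below, `C_of_lambda` is a hypothetical stand-in with the same qualitative shape as \(C(\lambda )\) (blowing up at both endpoints of \((\lambda _1,0)\) with a unique interior minimum); in an actual computation it should be replaced by the formula for \(C(\lambda )\) from the main text, which depends on \(D\), a, and \(d_1\).

```python
import numpy as np

lam1 = -10.0  # assumed largest eigenvalue of D*Delta_h + a*I (illustrative)

def C_of_lambda(lam):
    # toy surrogate for C(lambda): diverges as lambda -> lam1 and lambda -> 0,
    # with a unique minimum inside (lam1, 0); NOT the paper's actual formula
    r = lam / lam1
    return 1.0 / r + 1.0 / (1.0 - r)

ratios = np.linspace(0.05, 0.95, 181)   # candidate values of lambda/lambda_1
vals = C_of_lambda(ratios * lam1)
best_ratio = ratios[np.argmin(vals)]    # tuned value of lambda/lambda_1
```

For this surrogate the scan recovers the minimizer \(\lambda /\lambda _1 = 1/2\); with the true \(C(\lambda )\) the same scan locates a near-optimal \(\lambda \) without solving the stationarity equation.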
Lemma E.1
Suppose that \(\lambda ^*:=-\pi ^2Dd_1+a<0\) and \(a\ne 0\). Then there exists an upper bound of \(\min _{\lambda _1<\lambda <0} C(\lambda )\), where \(\lambda _1=Dd_1 \mu _1 +a\) and \(\mu _1=-4(n+1)^2\sin ^2\left( \frac{\pi }{2n+2}\right) \). The upper bound is independent of n.
Proof
The inequality \(\sin (x)<x\) implies that \(-\pi ^2<\mu _1\) for any n. Besides, \(\lim _{n\rightarrow \infty } \mu _1=-\pi ^2\). Then there exists a positive integer \(n^*\) such that for any \(n\ge n^*\), \(\mu _1 \le -0.9\pi ^2 -\frac{a}{Dd_1}\), which leads to \(\lambda ^*\le \lambda _1\le 0.9 \lambda ^* \).
When \(n\ge n^*\), one has
Since \(C\left( \frac{\lambda _1}{2}\right) \) is continuous over \([\lambda ^*, 0.9\lambda ^*]\), \(C_1\) is finite. Let \(C_2\) be the maximum of \(C\left( \frac{\lambda _1}{2}\right) \) for \(1\le n\le n^*\). Then \(\max (C_1, C_2)\) is the desired upper bound. \(\square \)
Remark E.2
From the proof of Lemma E.1, we know that when \(a\ne 0\) and \(n>n^*\), \(C_1\) is an upper bound of \(\min _{\lambda _1<\lambda <0} C(\lambda )\), where
Here, we notice that \(\frac{\ln (3) d_1}{0.9(\pi ^2Dd_1-a)}a\) is o(1) for large d. Combined with the discussion for \(a=0\), we conclude that for n large enough, \(\min _{\lambda _1<\lambda <0} C(\lambda )\) is \(\mathcal {O}(d)\) regardless of a.
An Illustration on Approximating Derivatives Using Discrete Fourier Transform
Here we briefly explain why the discrete Fourier transform can be used to compute derivatives of a function on a classical computer. For simplicity, we use a one-dimensional example. Let f(x) denote a smooth function defined on the interval [0, 1], and our goal is to transform the vector
to the vector
Let \(\mathcal {F}\) denote the one-dimensional quantum Fourier transform operator. Then the inverse transform \(\mathcal {F}^{-1}\) acts on a vector \(v = (v_0,\ldots , v_{n-1})\) and maps it to another vector according to the formula
where \(\omega _n = \text {e}^{2\pi i/n}\). \(\mathcal {F}^{-1} v\) can be interpreted as the set of discrete Fourier coefficients of the vector v. To see this, let the function f admit the following complex Fourier series expansion
for a positive integer \(\theta \). The error of this approximation is exponentially small in \(\theta \) for any smooth function f. Then the Fourier transform of the vector \(\textbf{f} = (f(k/n))_{k=0}^{n-1}\) becomes
Noticing that
we have
This implies that, up to a normalization factor, each non-zero entry of the transformed vector matches one of the Fourier coefficients.
The Fourier transform of the derivative can be computed similarly. Starting from the Fourier series again, we have
Replacing the coefficient \(c_m\) by \(2\pi i m c_m\) in (F.5), we have
This is just a multiplication of the diagonal matrix \(\hat{D}\) on the vector \(\mathcal {F}^{-1} \textbf{f}\), where
Therefore,
which implies that the derivative vector \(\mathbf {f}'\) can be computed numerically by first performing an inverse quantum Fourier transform, then multiplying from the left by the diagonal matrix \(\hat{D}\), and finally performing a quantum Fourier transform.
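The classical analogue of this pipeline is standard spectral differentiation, sketched below with NumPy for \(f(x)=\sin (2\pi x)\). Note that NumPy's `fft`/`ifft` use the opposite sign convention in the exponent from the \(\mathcal {F}\), \(\mathcal {F}^{-1}\) above, so the roles of the forward and inverse transforms are swapped in the code.

```python
import numpy as np

n = 64
x = np.arange(n) / n                 # uniform grid on [0, 1)
f = np.sin(2 * np.pi * x)

# extract discrete Fourier coefficients; NumPy's fft plays the role of
# F^{-1} in the text because of its opposite exponent sign convention
c = np.fft.fft(f)
m = np.fft.fftfreq(n, d=1.0 / n)     # integer frequencies 0,...,n/2-1,-n/2,...,-1
df = np.fft.ifft(2j * np.pi * m * c).real   # apply the diagonal D-hat, transform back

exact = 2 * np.pi * np.cos(2 * np.pi * x)
```

For this smooth periodic function the spectral derivative `df` matches \(f'(x)=2\pi \cos (2\pi x)\) to machine precision.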
Proofs of the Results in Sect. 6
In this section we present the technical details of the proofs of the results in Sect. 6.
1.1 Proof of Theorem 6.1
Proof
The error in estimating the ratio can be decomposed into two parts: the error due to the approximate quantum state, and the error from amplitude estimation. The first part can be directly bounded as
given that \({|{\hat{f}}\rangle }\) is produced by \(\mathcal {A}(\epsilon ')\) with tolerated error \(\epsilon '\), to be determined later. Let E be the quantity obtained by the amplitude estimation algorithm. Then, according to [72], by querying \(\mathcal {A}\) and \((I-2P)\) \(\mathcal {O}(q)\) times, we can bound the error with probability at least \(8/\pi ^2\) as
By the powering lemma [86], we can boost the success probability to at least \((1-\delta )\) by repeating the procedure \(\mathcal {O}(\log (1/\delta ))\) times and taking the median, leading to a total of \(\mathcal {O}(q\log (1/\delta ))\) queries to \(\mathcal {A}\) and \((I-2P)\). Combining (G.1) and (G.2), we have, with probability at least \((1-\delta )\),
The proof can then be completed by choosing \(\epsilon ' = \epsilon /4\) and \(q = 4\pi /\epsilon \). \(\square \)
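The median trick behind the powering lemma can be illustrated with a small Monte Carlo sketch (the failure model below is illustrative, not from the paper): each run succeeds, i.e. lands within the target tolerance, with probability at least \(8/\pi ^2\approx 0.81\), and the median of r independent runs fails only if a majority of runs fail, which is exponentially unlikely in r.

```python
import numpy as np

rng = np.random.default_rng(0)
p_single = 8 / np.pi**2              # per-run success probability (~0.81)
trials, r = 20000, 15                # number of experiments, repetitions per median

ok = rng.random((trials, r)) < p_single
# model a successful run as a value in [-1, 1] (within tolerance)
# and a failed run as the far-away value 10
runs = np.where(ok, rng.uniform(-1, 1, (trials, r)), 10.0)
medians = np.median(runs, axis=1)
boosted = np.mean(np.abs(medians) <= 1)   # empirical success rate of the median
```

With \(r=15\) the median is spoiled only when at least 8 of 15 runs fail, so the boosted success rate is far closer to 1 than the per-run rate.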
1.2 Proof of Theorem 6.2
The proof of Theorem 6.2 can be decomposed into two main steps. The first step is to bound the classical error of using the discrete Fourier transform to compute the derivatives, which is given in Lemma G.1. Then, we can apply this error bound to estimate the overall complexity of constructing the desired quantum state within error \(\epsilon \). Here we present and prove a more general result, Theorem G.2, which can be viewed as a generalization of Theorem 6.2 under a weaker regularity assumption.
Lemma G.1
Let f(x) be a \(C^p\) function with \(p \ge 3\) defined on \([0,1]^d\),
be the possibly unnormalized vector encoding f(x) evaluated at the discrete grid points, and let \(\mathcal {F}\) denote the one-dimensional quantum Fourier transform with n nodes. Furthermore, for any positive integer \(\theta \le n/2\), let \(D_{j,\theta }\) be the diagonal matrix
Then
where
Proof
We start with the definition of discrete Fourier transform acting on a vector \(\textbf{v} = \sum _l v_{l} {|{l_1}\rangle }\cdots {|{l_d}\rangle }\) that
where \(\omega _n = \text {e}^{2\pi i/n}\). \(\mathcal {F}^{-1} \textbf{v}\) can be interpreted as the set of discrete Fourier coefficients of the vector \(\textbf{v}\). To see this, let the function f admit the following complex Fourier series expansion along the j-th direction
Then the Fourier transform of the vector \(\textbf{f}\) becomes
Noticing that
we have
The Fourier transform of the derivative can be computed similarly. Starting from the Fourier series again, we have
Replacing the coefficient c by \(2\pi i m_j c\) in (G.11), we have
(G.11) and (G.13) differ only by multiplication by the corresponding frequency factors, and multiplying (G.11) by the matrix \(D_{j,\theta }\) removes this difference for bounded frequencies. Based on this observation, for each \(k = (k_1,\ldots ,k_d)\), we can compute the difference between the entries as
We now bound this difference, first estimating the decay rate of the Fourier coefficients. By the definition of the Fourier coefficients, for any \(m \ne 0\),
and by applying the integration by parts formula p times, we obtain
Then we have, for \(0 \le k_j \le \theta \),
for \(n-\theta \le k_j \le n-1\),
and for \(\theta +1\le k_j \le n-\theta -1\),
Combining these three estimates, we can bound
This completes the proof by further using the fact that the quantum Fourier transform operator has unit 2-norm. \(\square \)
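The decay rate of Fourier coefficients for functions of finite smoothness can be checked numerically. Below is a sketch with an example of my own choosing (not from the paper): the 1-periodic extension of \(f(x)=x^2(1-x)^2\) is \(C^2\) with a piecewise-continuous third derivative, and since \(f = B_4(x)+\tfrac{1}{30}\) for the Bernoulli polynomial \(B_4\), its nonzero Fourier coefficients are exactly \(|c_m| = \frac{3}{2\pi ^4 m^4}\), exhibiting the algebraic decay the integration-by-parts argument predicts.

```python
import numpy as np

n = 1024
x = np.arange(n) / n
f = x**2 * (1 - x)**2        # C^2 when extended 1-periodically; f''' jumps at 0

c = np.abs(np.fft.fft(f)) / n            # discrete Fourier coefficients
m = np.arange(1, 9, dtype=float)
scaled = c[1:9] * m**4                   # should be the constant 3/(2*pi^4)
target = 3 / (2 * np.pi**4)
```

The rescaled coefficients `scaled` are flat at \(3/(2\pi ^4)\approx 0.0154\), confirming the fourth-order algebraic decay; smoother functions decay correspondingly faster.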
Theorem G.2
Let f(t, x) be a function such that f satisfies periodic boundary conditions for x, and its spatial partial derivatives exist and are continuous up to order \(p \ge 3\). Let the vector \(\textbf{f}\) be
and the vector \(\textbf{g}\) be
Then,
1.
we have
$$\begin{aligned}&\left\| \textbf{g} - \sum _{j=0}^{d-1}\sum _{k=0}^{m-1}\sum _{l_1,\ldots ,l_d=0}^{n-1} \partial _{x_j} f(kT/m,l_1/n,\ldots ,l_d/n) {|{j}\rangle }{|{k}\rangle }{|{l_1}\rangle }\cdots {|{l_d}\rangle }\right\| \\&\quad \le \left( \frac{8}{\pi ^{p-1}}\frac{m}{n^{p-2}} + \frac{2\sqrt{2}}{(2\pi )^{p-1}} \frac{mn}{\theta ^{p-1}}\right) \sum _{j=0}^{d-1} \Vert \partial _{x_j}^p f\Vert _{\infty }, \end{aligned}$$(G.23)
2.
for any \(0< \epsilon < 1\), \(0< \delta <1\), there exists a quantum algorithm which outputs an \(\epsilon \)-approximation of \(\textbf{g}/\Vert \textbf{g}\Vert \) with probability at least \((1-\delta )\), using queries to \(\mathcal {A}(\epsilon /Q)\) and \(O_D\) for \(2Q\log (1/\delta )\) times and additional \(\mathcal {O}(d^3(\log n)^2)\) gates, where \(Q = \frac{4\pi \theta \sqrt{d}\Vert \textbf{f}\Vert }{\Vert \textbf{g}\Vert }\).
Proof
Let \(\epsilon '\) denote the tolerated error in the algorithm \(\mathcal {A}\). Then, since all the operations are unitary, the obtained final quantum state (before measurement) is also an \(\epsilon '\)-approximation of the exact state. We write the exact final state as
where
According to Lemma G.1, we have
It remains to estimate the error in the quantum state after a successful measurement and the success probability. For this purpose, we need a linear algebra result, which we state and prove here as a brief detour from the main proof.
Result. Let \(\left\{ e_i,f_j\right\} \) form an orthonormal basis of a Hilbert space, and let \(\psi = a+b, \widetilde{\psi } = {\widetilde{a}} + {\widetilde{b}}\) with \(\Vert \psi \Vert = \Vert \widetilde{\psi }\Vert = 1\), \(a,{\widetilde{a}} \in \text {span}\left\{ e_i\right\} \), and \(b,{\widetilde{b}} \in \text {span}\left\{ f_j\right\} \). If \(\Vert \psi -\widetilde{\psi }\Vert < \epsilon \), then
1.
\(\Vert a/\Vert a\Vert - {\widetilde{a}}/\Vert {\widetilde{a}}\Vert \Vert < 2\epsilon /\Vert a\Vert \),
2.
\(\Vert {\widetilde{a}}\Vert > \Vert a\Vert - \epsilon \).
This result can be proved straightforwardly by direct computation:
and
The error in the quantum state after a successful measurement and the success probability can be directly bounded using this result together with amplitude amplification, viewing \(\psi \) as the exact state and \(\widetilde{\psi }\) as the obtained state. Specifically, for a single run and measurement, the error in the quantum state after a successful measurement can be bounded by
and can be further bounded by \(\epsilon \) by choosing
The success probability for a single run, after amplitude amplification, is bounded from below by
The overall probability of getting success at least once can be boosted to \((1-\delta )\) by repeating the algorithm M times with
\(\square \)
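The auxiliary linear algebra result used above can be sanity-checked numerically; the sketch below (an illustration, not part of the proof) splits random unit vectors into components on complementary coordinate subspaces, with coordinates 0–3 standing in for \(\text {span}\{e_i\}\) and coordinates 4–7 for \(\text {span}\{f_j\}\).

```python
import numpy as np

rng = np.random.default_rng(1)
violations = 0
for _ in range(200):
    # random unit vector psi and a nearby unit vector psi~ (small perturbation)
    psi = rng.normal(size=8)
    psi /= np.linalg.norm(psi)
    pert = rng.normal(size=8)
    pert *= 0.05 / np.linalg.norm(pert)
    psit = psi + pert
    psit /= np.linalg.norm(psit)

    eps = np.linalg.norm(psi - psit)
    a, at = psi[:4], psit[:4]            # components a, a~ on span{e_i}
    na, nat = np.linalg.norm(a), np.linalg.norm(at)
    # claim 1: || a/||a|| - a~/||a~|| || < 2*eps/||a||
    if np.linalg.norm(a / na - at / nat) > 2 * eps / na + 1e-12:
        violations += 1
    # claim 2: ||a~|| > ||a|| - eps
    if nat <= na - eps - 1e-12:
        violations += 1
```

Both claimed inequalities hold in every random trial, as the triangle-inequality computation in the proof guarantees.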
Finally, under the additional regularity assumption specified in Theorem 6.2, we can obtain a simpler complexity estimate, which shows a poly-logarithmic dependence on the precision.
Proof of Theorem 6.2
According to Theorem G.2, the successful output of the algorithm is an \(\epsilon \)-approximation of \(\textbf{g}/\Vert \textbf{g}\Vert \) where, for any \(p \ge 3\),
Let \(c = \sup _{j,p} (\Vert \partial _{x_j}^p f\Vert _{\infty })^{1/p}\), and by choosing \(\theta = c/\pi +1\), we have
Since \(c/(\pi n) < 1\), we obtain \(\textbf{g} = \mathbf {\nabla f}\) by taking \(p \rightarrow \infty \). Therefore the claims in Theorem 6.2 directly follow from Theorem G.2. \(\square \)
Liu, JP., An, D., Fang, D. et al. Efficient Quantum Algorithm for Nonlinear Reaction–Diffusion Equations and Energy Estimation. Commun. Math. Phys. 404, 963–1020 (2023). https://doi.org/10.1007/s00220-023-04857-9