
1 Introduction

With the appearance of fast and cheap computing devices in recent decades, digital computers have become increasingly important as a simulation tool in many fields, ranging from weather forecasting to finance. The idea underlying such simulations is simple: pick a system which we want to study and simulate it on a computer using some numerical method. Quite often we can obtain in this manner information about the system which we could not collect otherwise; think, for example, of weather forecasting.

However, this poses a fundamental question: how reliable are these simulations? The truth is that, although such simulations have historically given fundamental insights (like suggesting that dynamical systems can have strange attractors [6]), due to phenomena like sensitive dependence on initial conditions, in general not much is known about the overall error committed in such simulations.

It therefore seems sensible to develop numerical methods with the property that we can rigorously determine the error incurred when we apply them. This is in contrast to what usually happens in numerical analysis, where, at best, only estimates of the error are presented. On the other hand, to obtain rigorous bounds on the error, we need to use more sophisticated methods, which are more amenable to analysis but which are usually slower or might even be unfeasible for practical implementation. In general, it is not trivial to devise numerical methods which are practical to use and for which the error can be rigorously determined.

To achieve a balance between these contradicting requirements, it makes sense to consider restricted classes of problems, which are nonetheless general enough to be of practical importance.

In this paper, we consider initial-value problems (IVPs) defined with polynomial ordinary differential equations (ODEs)

$$\begin{aligned} \left\{ \begin{array}{@{}r@{\,\,}l} y^{\prime }(t)&{}=p(y) \\ y(t_0)&{}=y_0 \end{array} \right. \end{aligned}$$
(1)

where p is a vector of polynomials. We consider, without loss of generality, that the system is autonomous, since the independent variable t can always be introduced as an extra variable \(y_{d+1}\) satisfying \(y_{d+1}^{\prime }=1\). We note that almost any IVP written with the “usual” functions of Analysis (trigonometric functions, exponentials, their inverses and compositions, etc.) can be rewritten as a polynomial ODE (see e.g. [3, 9]).
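For instance (to give a standard example of this rewriting), the non-polynomial IVP \(y^{\prime }=\sin y\), \(y(0)=y_0\) becomes polynomial after introducing \(y_1=y\), \(y_2=\sin y_1\) and \(y_3=\cos y_1\):

$$\begin{aligned} \left\{ \begin{array}{@{}r@{\,\,}l} y_1^{\prime }&{}=y_2 \\ y_2^{\prime }&{}=y_2y_3 \\ y_3^{\prime }&{}=-y_2^2 \end{array} \right. \qquad \left\{ \begin{array}{@{}r@{\,\,}l} y_1(0)&{}=y_0 \\ y_2(0)&{}=\sin y_0 \\ y_3(0)&{}=\cos y_0 \end{array} \right. \end{aligned}$$

since \(y_2^{\prime }=(\cos y_1)y_1^{\prime }=y_3y_2\) and \(y_3^{\prime }=-(\sin y_1)y_1^{\prime }=-y_2^2\).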

Therefore, IVPs of the form (1) are sufficiently broad to include a wide range of IVPs of practical interest. Moreover, since the right-hand side consists of relatively simple functions (polynomials), we are able to rigorously analyze the error committed when solving (1) numerically, by exploiting properties of polynomials.

2 Solving IVPs Over Unbounded Domains

It is standard practice to analyze numerical methods which solve IVPs only over a compact time interval [0, T]. This is true both in the Numerical Analysis literature (see e.g. [1]) and in the Theoretical Computer Science literature (see e.g. [5]).

However, in practice, people seldom fix a time interval [0, T] before implementing a numerical procedure, if only for the simple reason that they often do not know which value of T is relevant before performing some numerical simulations.

Therefore, it seems desirable to devise numerical methods which make no prior assumptions on the values which T might take. Of course, the time needed to execute the algorithm (the computational complexity) depends on T: in general, the larger T is, the more time the algorithm will take to execute. However, determining exactly how the execution time of the algorithm depends on T is a non-trivial task.

There is a “conventional wisdom” that the unbounded-time case can be reduced to the bounded-time one, for which many results exist (see e.g. [4, 5]). However, this is not true, since in the bounded case many parameters which are important for the (unbounded-time) complexity are hidden in the constants of the “big-O” notation. A very simple example illustrates this problem. Assume that \(y:I\rightarrow \mathbb {R}^d\) is the solution of

$$ \left\{ \begin{array}{@{}r@{\,\,}l}y_1(0)&{}=1\\ y_2(0)&{}=1\\ \ldots &{}\\ y_d(0)&{}=1 \end{array}\right. \qquad \left\{ \begin{array}{@{}r@{\,\,}l}y_1'(t)&{}=y_1(t)\\ y_2'(t)&{}=y_1(t)y_2(t)\\ \ldots &{}\\ y_d'(t)&{}=y_1(t)\cdots y_d(t) \end{array}\right. $$

It follows from [7] that for any fixed, compact I, y is polynomial time computable. On the other hand, this system can be solved explicitly and yields:

$$\begin{aligned} y_1(t)=e^t,\qquad y_{i+1}(t)=e^{y_i(t)-1},\quad i=1,\ldots ,d-1. \end{aligned}$$

One immediately sees that, since \(y_d\) is a tower of exponentials of height d, y cannot be polynomial time computable over \(\mathbb {R}\).
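As a quick sanity check (not part of the paper's argument), one can integrate this system numerically for a small d and compare with the closed form above. The following Python snippet is our own illustration, using SciPy's solve_ivp:

```python
# Sanity check of the tower-of-exponentials example:
# y_i' = y_1 * ... * y_i with y_i(0) = 1,
# closed form y_1(t) = e^t and y_{i+1}(t) = e^{y_i(t) - 1}.
import numpy as np
from scipy.integrate import solve_ivp

d = 3  # keep d and t small: the solution grows as a tower of exponentials

def rhs(t, y):
    # y_i' = prod_{j <= i} y_j
    return [np.prod(y[: i + 1]) for i in range(d)]

sol = solve_ivp(rhs, (0.0, 1.0), np.ones(d), rtol=1e-10, atol=1e-12)

# Build the closed-form tower at t = 1
y = np.exp(1.0)              # y_1(1) = e
tower = [y]
for _ in range(d - 1):
    y = np.exp(y - 1.0)      # y_{i+1} = e^{y_i - 1}
    tower.append(y)

print("numerical:  ", sol.y[:, -1])
print("closed form:", tower)
```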

Note that this discrepancy arises because, in the bounded-time case, the size of the compact interval I is not taken as a parameter of the problem (because it is fixed). Also note that the dimension d of the system is hardly ever taken into account, although it has a huge influence on the resulting complexity. More precisely, if I is bounded then the complexity of computing y(t) can be seen to be polynomial in t, but more than exponential in the length of the interval I and in d: this dependence is usually hidden in the constants of the “big-O” notation.

3 Contributions

The main contribution of this abstract is to show that there is a numerical method, denoted \({\text {SolvePIVPEx}}\) (see Sect. 4 for more details), which can rigorously solve IVPs (1) over unbounded domains.

Theorem 1

(Complexity and Correctness of \({\text {SolvePIVPEx}}\) ). Let \(t\in \mathbb {R}\), \(\varepsilon >0\), and assume that y satisfies (1) over \([t_0,t]\). Let

$$\begin{aligned} x={\text {SolvePIVPEx}}(t_0,y_0,p,t,\varepsilon ) \end{aligned}$$

where \({\text {SolvePIVPEx}}\) is a numerical method described in Sect. 4. Then

  • \(\parallel {x-y(t)}\parallel \leqslant \varepsilon \)

  • the arithmetic complexity of the algorithm is bounded by

    $$\begin{aligned} {\text {poly}}(k^d,{\text {Len}}(t_0,t),\log \parallel {y_0}\parallel ,-\log \varepsilon ) \end{aligned}$$
  • the bit complexity of the algorithm is bounded by

    $$\begin{aligned} {\text {poly}}(k,{\text {Len}}(t_0,t),\log \parallel {y_0}\parallel ,\log {\varSigma {p}},-\log \varepsilon )^d \end{aligned}$$

where k is the maximum degree of the components of p, d is the number of components of p, \({\varSigma {p}}\) is the sum of the absolute values of the coefficients of p, and \({\text {Len}}(t_0,t)\) is a bound on the length of the curve \(y(\cdot )\) from the point \((t_0,y(t_0))\) to the point \((t,y(t))\).
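To make these parameters concrete, the following sketch computes k, d and \({\varSigma {p}}\) for a hypothetical sparse representation of p, where each component is a dict mapping a monomial's exponent tuple to its coefficient. This representation is our own illustration and is not fixed by the paper:

```python
# Hypothetical sparse representation: p is a list of d dicts, each mapping
# an exponent tuple (one entry per variable) to the monomial's coefficient.
def parameters(p):
    d = len(p)                                                # components of p
    k = max(sum(mono) for comp in p for mono in comp)         # max total degree
    sigma = sum(abs(c) for comp in p for c in comp.values())  # sum of |coeffs|
    return k, d, sigma

# Example: p(y) = (y1*y2, y1^2 - 2*y2) in two variables
p = [{(1, 1): 1.0}, {(2, 0): 1.0, (0, 1): -2.0}]
print(parameters(p))  # (2, 2, 4.0)
```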

4 The Numerical Method \(\text {SolvePIVPEx}\) and Sketch of the Proof of Theorem 1

For reasons of space, we will not present the algorithm defining the numerical method \({\text {SolvePIVPEx}}\) nor the detailed proof of Theorem 1 (see [8] for more details). However, in this section, we briefly sketch the ideas underlying \({\text {SolvePIVPEx}}\) and the proof of Theorem 1.

The numerical method \({\text {SolvePIVPEx}}\) is based on a generic adaptive Taylor meta-algorithm which numerically solves (1). This is a meta-algorithm in the sense that, in a first approach, we leave open the question of how some of the parameters of the algorithm are chosen. The goal of this meta-algorithm is, given as input \(t\in \mathbb {Q}\), \(0<\varepsilon <1\), and the initial condition of (1), to compute \(x\in \mathbb {Q}^d\) such that \(\parallel {x-y(t)}\parallel <\varepsilon \).

We assume that the meta-algorithm uses the following values:

  • \(n\in \mathbb {N}\) is the number of steps of the algorithm

  • \(t_0<t_1<\ldots <t_n=t\) are the intermediate times

  • \(\delta t_i=t_{i+1}-t_i\in \mathbb {Q}\) are the time steps

  • for \(i\in \{0,\ldots ,n-1\}\), \(\omega _i\in \mathbb {N}\) is the order at time \(t_i\) and \(\mu _i>0\) is the rounding error at time \(t_i\)

  • \(\tilde{y}_i\in \mathbb {Q}^d\) is the approximation of y at time \(t_i\).

This meta-algorithm works by solving the ODE (1) with initial condition \(y(t_i)=\tilde{y}_i\) over a small time interval \([t_i,t_{i+1}]\), yielding as a result the approximation \(\tilde{y}_{i+1}\) of \(y(t_{i+1})\). The approximation over this small time interval is obtained using a Taylor approximation of order \(\omega _i\), computed with the polynomial-time algorithm given in [2] (in a first approach we also leave the value \(\omega _i\) open, in order to analyze its influence on the error and on the time complexity of the algorithm; once this analysis is done, we can choose appropriate values for \(\omega _i\)). This procedure is repeated over \([t_0,t_1], [t_1,t_2],\dots ,[t_i,t_{i+1}],\dots \) until we reach the desired time \(t_n=t\). This introduces three potential sources of error: (i) a global error due to the fact that, on the interval \([t_i,t_{i+1}]\), we do not solve \(y^{\prime }=p(y)\) with the initial value \(y(t_i)\) but instead with the initial value \(\tilde{y}_i\); (ii) a truncation error over \([t_i,t_{i+1}]\), because we only compute a truncated Taylor series of the solution instead of the full Taylor series; (iii) a rounding error, because we might only have a finite number of bits to store partial results.
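To make the above concrete, here is a minimal Python sketch of a single Taylor step for \(y^{\prime }=p(y)\). It is our own illustration, not the actual \({\text {SolvePIVPEx}}\): it omits the adaptive choice of \(\delta t_i\), \(\omega _i\) and \(\mu _i\) as well as rounding control. It exploits the fact that, p being polynomial, the Taylor coefficients of the solution satisfy a simple recurrence computable with truncated series arithmetic:

```python
# Sketch of one Taylor step of the meta-algorithm. Each component y_i of the
# solution is stored as a numpy array of Taylor coefficients around t_i; since
# p is polynomial, evaluating p on such series only needs addition and
# truncated multiplication (a plain convolution).
import numpy as np

def smul(a, b):
    """Product of two truncated Taylor series (1-D coefficient arrays)."""
    return np.convolve(a, b)[: len(a)]

def taylor_step(p, y0, dt, order):
    """Advance y' = p(y), y(t_i) = y0, by dt using a Taylor series of the
    given order. `p` maps a list of d coefficient arrays to such a list,
    built from smul and + only."""
    d = len(y0)
    c = [np.zeros(order + 1) for _ in range(d)]
    for i in range(d):
        c[i][0] = y0[i]
    for m in range(order):
        rhs = p(c)                            # correct up to order m
        for i in range(d):
            c[i][m + 1] = rhs[i][m] / (m + 1)  # y' = p(y), coefficient-wise
    return np.array([np.polyval(ci[::-1], dt) for ci in c])

# Example: y1' = y1, y2' = y1*y2 (the start of the tower system above)
p = lambda c: [c[0], smul(c[0], c[1])]
print(taylor_step(p, [1.0, 1.0], 0.5, 20))  # ~ (e^0.5, e^{e^0.5 - 1})
```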

Using the crucial fact that the right-hand side of (1) consists of polynomials, at each time step \(t_i\) one can present an argument based on Cauchy majorants to establish a lower bound on the local radius of convergence. We can choose the step length \(t_{i+1}-t_i\) to be a constant fraction of the estimated radius of convergence, and the majorants can also be used to select a suitable truncation order \(\omega _i\). One can also show, using Gronwall's Lemma, that the propagation of errors from one step to the next can be controlled, and that it depends on a bound on the length of the curve \(y(\cdot )\) over the domain under consideration. This last parameter needs to be fed to the algorithm, but a suitable value for it can be determined automatically, since we can decide whether a given (rational) value is large enough to serve as a bound on the length of the curve. Starting from an arbitrary initial guess (say, the value 1) and restarting the method with a larger guess whenever the current guess is rejected, we eventually obtain a value which is high enough to be used as a bound for the length of the curve, as sketched below.
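The restart strategy can be sketched as follows; here solve_with_length_bound is a hypothetical stand-in for the bounded-length version of the method, assumed to report failure whenever the observed curve length exceeds the guess:

```python
# Sketch of the guess-and-restart strategy for the curve-length bound.
# `solve_with_length_bound` is a hypothetical helper, not part of the paper:
# it is assumed to return (True, x) when L was a valid length bound and
# (False, _) when the method detects that the curve is longer than L.
def solve_unbounded(t0, y0, p, t, eps):
    L = 1.0                                  # arbitrary initial guess
    while True:
        ok, x = solve_with_length_bound(t0, y0, p, t, eps, L)
        if ok:                               # L bounded Len(t0, t): done
            return x
        L *= 2.0                             # restart with a larger guess
```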

Proceeding in this manner, we fix all the parameters of the meta-algorithm (length of the time steps, order of the Taylor approximation at each step, etc.) and obtain an algorithm \({\text {SolvePIVPEx}}\) which satisfies the conditions of Theorem 1.