1 Introduction

In this paper, we present a novel proof of the Cauchy–Kovalevskaya theorem in the ordinary differential equation (ODE) setting. The general theorem, first proved by Sonya Kovalevskaya in 1874, gives sufficient conditions for a Cauchy problem to have a unique analytic solution. Unfortunately, spaces of analytic functions typically do not provide the right regularity for studying solutions of partial differential equations (PDE), so the Cauchy–Kovalevskaya theorem is rarely practically applicable in this setting. On the other hand, the Cauchy–Kovalevskaya theorem is often applicable to initial value problems (IVP) arising from ODEs, which is the focus of the present work. We begin by stating the theorem in this setting. A statement of the general theorem and its classical proof can be found in most introductory PDE texts, e.g. [1].

Theorem 1

(Cauchy–Kovalevskaya). Suppose \(V \subset \mathbb {R}^n\) is an open subset and \(f :V \rightarrow \mathbb {R}^n\) is an analytic vector field. Then the initial value problem

$$\begin{aligned} \dot{x} = f(x) \quad x(0) = x_0 \in V \end{aligned}$$
(1)

has a unique solution which is analytic on some open interval, \(J(x_0)\), containing zero.

There are several proofs of this theorem in the literature. The classical proof provides a prototypical example of the method of majorants. To illustrate the constructive aspect of our approach, we sketch a version of the classical proof for the case \(n = 1\).

The main idea in the classical proof is to use the Taylor coefficients of f to dominate the Taylor coefficients of x. Roughly speaking, f is analytic if its Taylor coefficients decay rapidly enough. The classical proof follows from showing that this condition forces the Taylor coefficients of any solution to decay rapidly as well. Note that the existence and uniqueness of a solution on some open interval, \(J(x_0)\), containing zero follows from the Picard–Lindelöf theorem. In fact, by the usual bootstrap argument, this theorem shows that this solution is as smooth as f. Hence, we may take for granted the existence of \(x \in C^\infty (J(x_0))\) satisfying Eq. (1).

The Cauchy–Kovalevskaya theorem asserts that, in fact, \(x \in C^\omega (J(x_0))\). Equivalently, there exists \(\tau > 0\), such that the series

$$\begin{aligned} x(t) = \sum _{j = 0}^\infty \frac{x^{(j)}(0)}{j!} t^j \quad \left| t \right| < \tau \end{aligned}$$
(2)

converges. The crux of the classical argument arises from applying the Faà di Bruno formula for the iterated chain rule with the assumption that x satisfies Eq. (1), to obtain the formula

$$\begin{aligned} x^{(j)}(0) = p_j \left( f(0), f'(0), \dotsc , f^{(j-1)}(0) \right) \quad j \in \mathbb {N}, \end{aligned}$$
(3)

where each \(p_j\) is a polynomial in j variables with non-negative coefficients. Then one defines the non-negative sequence \(\left\{ u_j\right\} \) by \(u_j \,{:=}\, \left| f^{(j)}(0) \right| \) for \(j \in \mathbb {N}\), so that we have the bound

$$\begin{aligned} \left| x^{(j)}(0) \right| \le p_j(u_0,\dotsc ,u_{j-1}) \quad \text {for all} \ j \in \mathbb {N}. \end{aligned}$$
(4)
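For instance, in the normalization \(x_0 = 0\), differentiating \(\dot{x} = f(x)\) repeatedly and evaluating at zero gives

$$\begin{aligned} x'(0) = f(0), \quad x''(0) = f'(0)f(0), \quad x'''(0) = f''(0)f(0)^2 + f'(0)^2 f(0), \end{aligned}$$

so that \(p_1, p_2, p_3\) evaluate as \(u_0\), \(u_1 u_0\), and \(u_2 u_0^2 + u_1^2 u_0\) respectively, each visibly a polynomial with non-negative coefficients.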

This bound implies that the function

$$\begin{aligned} \tilde{x}(t) \,{:=}\, \sum _{j = 0}^\infty \frac{p_j \left( u_0, \dotsc , u_{j-1} \right) }{j!} t^j \end{aligned}$$
(5)

is a majorant for x. The classical proof is concluded by showing that \(\tilde{x}\) is analytic, which ultimately follows as a consequence of the fact that f is analytic.

The classical proof is quite beautiful; however, we note that it is not constructive. In contrast with this approach, our proof of the Cauchy–Kovalevskaya theorem is based on analyzing the coefficients of the solution and proving directly that they decay sufficiently fast. Our proof is inspired by the so-called “radii polynomial approach”, which provides a constructive framework for proving theorems in nonlinear analysis with the assistance of a digital computer. While our proof does not have a numerical aspect, it is carried out in the same style, so we briefly review the method.

1.1 The radii polynomial approach

The radii polynomial approach is a modern methodology combining functional analytic tools with rigorous numerical computations to study nonlinear problems. The method first appeared in [2] as a modification of the technique presented in [3] for rigorously proving the existence of solutions of zero-finding problems using Newton’s method. Since then, the radii polynomial approach has played an important role in a number of results in dynamical systems such as existence of spontaneous periodic solutions in the Navier–Stokes equations [4], chaos in the circular restricted four body problem [5], coexistence of hexagonal patterns and rolls in the Swift–Hohenberg equations [6], and the proof of Wright’s conjecture [7], to name just a few. This is a small subset of the growing collection of results which utilize the radii polynomial approach as the basis for rigorous numerical algorithms for computation and continuation of equilibria, periodic orbits, connecting orbits, solutions of initial/boundary value problems, and invariant manifolds (see, e.g. [8,9,10,11,12,13,14,15,16]). A more detailed exposition on rigorous numerical techniques and various applications of radii polynomial approach can be found in [17, 18].

The main idea is to first recast the problem at hand as a zero-finding problem on a Banach space. Then a Newton-like operator is introduced which has fixed points in one-to-one correspondence with solutions of the zero-finding problem. By combining careful “pencil and paper” estimates with rigorous computations, one tries to prove that this Newton-like operator has a fixed point by an application of the Banach fixed point theorem. If successful, the existence of a zero for the original problem is concluded.

In [19], the radii polynomial approach was generalized to a rigorous numerical IVP solver for polynomial vector fields in which the fixed-point problem does not arise from a Newton-like operator. This approach was based on modifying the methodology in [20] in which one looks for a fixed point of a “Picard-like” operator. In this work, we follow a similar approach. The main idea is to associate any instance of Eq. (1) with a mapping, \(T : X \rightarrow X\), where X is an appropriate space of rapidly decaying real sequences. We provide an explicit construction for X and T depending only on f and \(x_0\), and we prove that if T has a fixed point, then the solution of Eq. (1) is analytic. The Cauchy–Kovalevskaya theorem follows after proving that if f is analytic, then our construction always produces a map with a fixed point.

We begin by describing the main theorem utilized in our approach which is a constructive version of the Banach fixed point theorem.

Theorem 2

Suppose that X is a Banach space with norm \(\Vert \cdot \Vert _{X}\), \(U\subset X\) is an open subset, and \(T:U \rightarrow X\) is a Fréchet differentiable map. Fix \(\bar{x} \in U\) and let \(r^*>0\) be given such that \(\overline{B_{r^*}(\bar{x})} \subset U\). Let Y be a positive constant satisfying

$$\begin{aligned} \Vert T(\bar{x})-\bar{x}\Vert _{X} \le Y, \end{aligned}$$
(6)

and let \(Z:(0,r^*) \rightarrow [0,\infty )\) be a non-negative function satisfying

$$\begin{aligned} \sup _{x\in \overline{B_r(\bar{x})}} \Vert DT(x)\Vert _{X} \le Z(r) \quad \text {for all} \quad r\in (0,r^*), \end{aligned}$$
(7)

where DT(x) denotes the Fréchet derivative of T at \(x \in U\) and \(\Vert DT(x)\Vert _{X}\) denotes the operator norm induced by \(\Vert \cdot \Vert _{X}\). We define the radii polynomial, \(p: (0, r^*) \rightarrow \mathbb {R}\), by the formula

$$\begin{aligned} p(r)\,{:=}\,Z(r)r-r+Y. \end{aligned}$$
(8)

If there exists \(r_0\in (0,r^*)\) such that \(p(r_0)<0\), then there exists a unique \(x \in \overline{B_{r_0}(\bar{x})}\) so that \(T(x)=x\).

The version presented in Theorem 2 first appeared in [19] where its proof can also be found. Our proof of the Cauchy–Kovalevskaya theorem will follow from applying Theorem 2 in two steps. First, we construct a fixed point problem, which amounts to defining \(X, U, r^*\), and T appropriately, and proving that if our construction has a solution then Eq. (1) has an analytic solution. Then we construct \(\bar{x}\) and the bounds, Y and Z, and prove that we can always find a positive value \(r_0\) which makes the corresponding radii polynomial negative.
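The inequality driving Theorem 2 is elementary to check once Y and Z are in hand. The following minimal Python sketch (with hypothetical values for Y, Z, and \(r^*\); the helper name is ours, not part of any library) illustrates the verification:

```python
def radii_polynomial(Y, Z, r):
    """Evaluate p(r) = Z(r)*r - r + Y from Eq. (8); Z is a function of r."""
    return Z(r) * r - r + Y

# Hypothetical bounds for illustration: Y = 0.1, constant Z(r) = 0.5, r* = 1.
# Any r0 in (0, r*) with p(r0) < 0 certifies a unique fixed point in the
# closed ball of radius r0 about x_bar (Theorem 2).
Y, Z, r_star = 0.1, lambda r: 0.5, 1.0
r0 = 0.5
assert 0 < r0 < r_star and radii_polynomial(Y, Z, r0) < 0   # p(0.5) = -0.15
```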

The remainder of the paper is organized as follows. In Sect. 2, we introduce notation and describe the construction of the fixed point problem in case f is a scalar, i.e. \(n=1\). Then we prove that our construction has a fixed point if f is analytic by applying Theorem 2. In Sect. 3, we generalize the construction to the vector field case. As in the scalar case, we prove that our fixed point problem always has a solution when f is analytic. Finally, we prove that any fixed point of our construction implies the existence of an analytic solution for Eq. (1) which proves the Cauchy–Kovalevskaya theorem for ODEs.

2 The fixed point problem for scalar equations

In this section, we consider Eq. (1) for the case that f is an arbitrary analytic scalar function. Specifically, we assume that \(n = 1\) and for some \(b > 0\), \(f : (x_0 - b,x_0 + b) \rightarrow \mathbb {R}\) is analytic. Therefore, f(x) may be written as a convergent Taylor series of the form

$$\begin{aligned} f(x) = \sum _{k=0}^{\infty } c_k (x-x_0)^k \quad \text {where} \quad c_k = \frac{f^{(k)}(x_0)}{k!} \quad \text {for} \quad k \in \mathbb {N}. \end{aligned}$$

We begin by defining some notation and reviewing necessary prerequisites from complex and functional analysis.

2.1 Preliminaries

We will work with the collection of real-valued sequences denoted by

$$\begin{aligned} S \,{:=}\, \left\{ \left\{ u_j\right\} _{j = 0}^\infty : u_j \in \mathbb {R}, \ 0 \le j < \infty \right\} . \end{aligned}$$
(9)

Let \(S^\omega _\nu \subset S\) denote the collection of sequences which arise as Taylor coefficients of functions in \(C^\omega (\mathbb {D}_\nu )\), where \(\mathbb {D}_\nu = \left\{ z \in \mathbb {C}: \left| z \right| < \nu \right\} \) is the complex disc of radius \(\nu > 0\). Though we are interested specifically in real analytic functions, we are only concerned with the property that a function is represented by a convergent power series. Thus, we do not make a distinction between a real analytic function converging, say, on an interval of radius \(r> 0\), and its continuation to a complex analytic function converging on a complex disc of radius r.

To apply Theorem 2, we require a Banach space in which to work. With this goal in mind, we start by equipping S with an appropriate norm.

Definition 1

Fix a weight, \(\nu > 0\) and define the space of weighted, absolutely summable sequences

$$\begin{aligned} \ell ^1_\nu \,{:=}\, \left\{ u \in S : \sum _{j = 0}^{\infty } \nu ^j \left| u_j \right| < \infty \right\} . \end{aligned}$$

This is a normed vector space and we denote the norm of \(u \in \ell ^1_\nu \) by

$$\begin{aligned} \Vert u\Vert _{1, \nu } \,{:=}\, \sum _{j = 0}^{\infty } \nu ^j \left| u_j \right| . \end{aligned}$$

We note the obvious inclusions \(\ell ^1_\nu \subset S^\omega _\nu \subset S\), each of which is strict. The following proposition provides a connection between \(S^\omega _\nu \) and \(\ell ^1_\nu \).

Proposition 3

Fix \(\nu > 1\) and suppose \(g \in C^\omega (\mathbb {D}_{\nu })\) with Taylor coefficients given by \(u \in S^\omega _\nu \). Then \(u \in \ell ^1_{\nu '}\) for any \(\nu ' < \nu \). In fact, since \(g^{(k)} \in C^\omega (\mathbb {D}_{\nu })\) for any \(k \in \mathbb {N}\), it follows that the Taylor coefficients of \(g^{(k)}\) belong to \(\ell ^1_{\nu '}\) as well.

The proof can be found in [21]. Roughly speaking, Proposition 3 says we can pass from analytic functions to \(\ell ^1_\nu \) sequences provided we “give up some domain”. This trick is commonly used in rigorous numerical algorithms to obtain bounds on rounding and truncation errors for Taylor series. In our setting, the proposition gives us license to work with sequences in \(\ell ^1_\nu \) as opposed to \(S^\omega _\nu \).
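To see that the loss of domain is genuine, consider the geometric series

$$\begin{aligned} g(z) = \frac{\nu }{\nu - z} = \sum _{j = 0}^{\infty } \nu ^{-j} z^j, \end{aligned}$$

which defines a function in \(C^\omega (\mathbb {D}_\nu )\) whose coefficient sequence satisfies \(\sum _{j = 0}^{\infty } (\nu ')^j \nu ^{-j} = \frac{\nu }{\nu - \nu '} < \infty \) for every \(\nu ' < \nu \), while the sum diverges for \(\nu ' = \nu \). In particular, this example shows that the inclusion \(\ell ^1_\nu \subset S^\omega _\nu \) is strict. The next proposition shows that it suffices to consider the case \(\nu = 1\).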

Proposition 4

Suppose \(V \subset \mathbb {R}\) is an open subset and \(f: V \rightarrow \mathbb {R}\). For any \(\tau , \nu > 0\), the initial value problem

$$\begin{aligned} \frac{dx}{dt} = f(x) \quad x(0) = x_0 \end{aligned}$$
(10)

has a solution with Taylor coefficients in \(\ell ^1_\tau \) if and only if the initial value problem

$$\begin{aligned} \frac{d y}{d s}= \frac{\tau }{\nu } f(y) \quad y(0) = x_0 \end{aligned}$$
(11)

has a solution with Taylor coefficients in \(\ell ^1_\nu \).
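Although we omit the formal proof, the equivalence follows from the substitution \(y(s) = x(\tau s / \nu )\): if \(x(t) = \sum _{j = 0}^{\infty } \alpha _j t^j\) solves Eq. (10), then y solves Eq. (11) with Taylor coefficients \(\alpha _j (\tau /\nu )^j\), and

$$\begin{aligned} \sum _{j = 0}^{\infty } \nu ^j \left| \alpha _j \left( \frac{\tau }{\nu } \right) ^j \right| = \sum _{j = 0}^{\infty } \tau ^j \left| \alpha _j \right| , \end{aligned}$$

so one coefficient sequence is summable in its weighted norm exactly when the other is.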

Proposition 4 says that choosing \(\nu \) is equivalent to rescaling time in Eq. (1). We exploit this equivalence by making an a priori choice for our function space. Specifically, we will work exclusively in the space \(\ell _1^1\) and thus, we will omit \(\nu \) from the notation for the remainder of the paper and simply write \(\ell ^1\) in place of \(\ell _1^1\). Similarly, we let \(\mathbb {D}\,{:=}\, \mathbb {D}_1\) denote the complex unit disc and our discussion of analytic functions of a scalar variable will always refer to the set \(C^\omega (\mathbb {D})\). The trade-off for fixing \(\nu = 1\) is that we must work with a modified form of Eq. (1) given by

$$\begin{aligned} \dot{x} = \tau f(x) \quad x(0) = x_0, \end{aligned}$$
(12)

where \(\tau \) is a time rescaling parameter.

Finally, we note that \(C^\omega (\mathbb {D})\) is closed under point-wise multiplication. This gives rise to a multiplication operation on \(\ell ^1\) called the Cauchy product. Specifically, the Cauchy product of \(u,v \in \ell ^1\) is denoted as \(u*v\) and given explicitly by the formula

$$\begin{aligned} (u *v )_n\,{:=}\,\sum _{k=0}^{n}u_{n-k}v_k. \end{aligned}$$
(13)

In fact, Mertens’ theorem implies that the Cauchy product makes \(\ell ^1\) into a Banach algebra. In particular, suppose \(f,g \in C^\omega (\mathbb {D})\) are analytic functions with Taylor coefficients given by \(u,v \in \ell ^1\), and let \(w = u*v\). Then \(w \in \ell ^1\) as well, and the function

$$\begin{aligned} h(t) = \sum _{j = 0}^{\infty }w_j t^j \quad t \in \mathbb {D}\end{aligned}$$

is well defined and satisfies \(h(t) = f(t) g(t)\) as expected. Since \(\ell ^1\) is closed under products we define finite powers for Cauchy products in the obvious way by

$$\begin{aligned} u^k \,{:=}\, \underbrace{u *u \dots *u}_{k \ \text {copies}}. \end{aligned}$$

Evidently, it follows that if \(u \in \ell ^1\) then \(u^k \in \ell ^1\) for any \(k\in \mathbb {N}\). To simplify some formulas involving powers of Cauchy products, we define

$$\begin{aligned} u^0 = \left( 1, 0, 0, \dotsc \right) \end{aligned}$$

for any \(u \in \ell ^1\).
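Since the Cauchy product and its powers are used throughout, a minimal Python sketch on truncated sequences may be helpful; the helper names are ours, and truncated lists stand in for the full sequences of \(\ell ^1\):

```python
def cauchy_product(u, v):
    """Cauchy product (u*v)_n = sum_{k=0}^n u_{n-k} v_k, per Eq. (13).

    u, v are lists holding leading coefficients; the result is truncated
    to the shorter length, which suffices for finite computations.
    """
    n = min(len(u), len(v))
    return [sum(u[m - k] * v[k] for k in range(m + 1)) for m in range(n)]

def cauchy_power(u, k):
    """k-fold Cauchy product u^k, with the convention u^0 = (1, 0, 0, ...)."""
    result = [1.0] + [0.0] * (len(u) - 1)
    for _ in range(k):
        result = cauchy_product(result, u)
    return result

# Spot check of the Banach algebra inequality ||u*v||_1 <= ||u||_1 ||v||_1:
u, v = [1.0, -0.5, 0.25, 0.0], [0.5, 0.5, 0.0, 0.0]
w = cauchy_product(u, v)
assert sum(map(abs, w)) <= sum(map(abs, u)) * sum(map(abs, v))
```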

2.2 Taylor expansion of IVP solutions

To motivate the construction of a fixed point problem, we consider the method of solving Eq. (12) by power series expansion. We begin by considering an ansatz for the solution to Eq. (12) of the form

$$\begin{aligned} x(t)=\sum _{j=0}^{\infty } a_jt^j \quad a_j \in \mathbb {R}. \end{aligned}$$
(14)

We want to prove that Eq. (14) defines an analytic function on some open interval containing zero by analyzing the coefficient sequence, \(a(\tau ) \,{:=}\, \left\{ a_j\right\} _{j\in \mathbb {N}} \in S\). Combining Proposition 3 and Proposition 4, this is equivalent to proving that for some choice of \(\tau \), \(a(\tau ) \in \ell ^1\).

For the moment, we suppose \(\tau > 0\) is fixed and we suppress the dependence of a on \(\tau \). We formally plug Eq. (14) into Eq. (12) to obtain

$$\begin{aligned} \sum _{j=1}^{\infty } ja_j t^{j-1} = \tau f(x(t))=\tau \sum _{k=0}^{\infty } c_k \left( \sum _{j=0}^{\infty } a_j t^j - x_0 \right) ^k. \end{aligned}$$
(15)

Now, we impose \(a_0 = x_0\) to satisfy the initial condition, and define the sequence

$$\begin{aligned} \tilde{a} \,{:=}\, \left( 0, a_1,a_2,\dotsc \right) \end{aligned}$$

so the right-hand side of Eq. (15) has the form

$$\begin{aligned} \tau f(x(t)) = \tau \sum _{k=0}^{\infty } c_k \left( \sum _{j=1}^{\infty } a_j t^j \right) ^k = \tau \sum _{k=0}^{\infty } c_k \sum _{j=0}^{\infty } \tilde{a}_j^k t^j, \end{aligned}$$
(16)

where the expressions of the form \(\tilde{a}_j^k\) appearing in Eq. (16), and throughout this work, represent the \(j^\mathrm{th}\) term of the k-fold convolution. Specifically,

$$\begin{aligned} \tilde{a}_j^k \,{:=}\, (\underbrace{\tilde{a} *\tilde{a} \dots *\tilde{a}}_{k \text { copies}})_j \end{aligned}$$

as opposed to the \(k^\mathrm{th}\) power of the real number, \(\tilde{a}_j\). This should not lead to confusion as the latter will not appear in this paper.

Now, after matching like powers of t in Eq. (16) with those on the left-hand side of Eq. (15), we obtain a recursive formula for the terms in a given by

$$\begin{aligned} a_j \,{:=}\, {\left\{ \begin{array}{ll} x_0 &{}\quad j = 0 \\ \frac{\tau }{j}\sum \nolimits _{k=0}^{j-1} c_k \tilde{a}^k_{j-1} &{}\quad j \ge 1. \end{array}\right. } \end{aligned}$$
(17)
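The recursion (17) is straightforward to evaluate numerically. The following hedged Python sketch (a naive implementation assuming f is polynomial, with helper names of our own choosing) computes the leading coefficients:

```python
def taylor_coefficients(c, x0, tau, num_terms):
    """Compute a_0, ..., a_{num_terms-1} from the recursion in Eq. (17).

    c holds the Taylor coefficients c_k of f centered at x0 (finitely many,
    i.e. f is assumed polynomial here).
    """
    a = [x0]
    for j in range(1, num_terms):
        # a_tilde = (0, a_1, a_2, ...), zero-padded to length num_terms
        a_tilde = [0.0] + a[1:] + [0.0] * (num_terms - len(a))
        s = 0.0
        for k in range(min(j, len(c))):
            # (a_tilde^k)_{j-1}: k-fold Cauchy product, computed naively
            power = [1.0] + [0.0] * (num_terms - 1)   # a_tilde^0 = (1,0,0,...)
            for _ in range(k):
                power = [sum(power[m - i] * a_tilde[i] for i in range(m + 1))
                         for m in range(num_terms)]
            s += c[k] * power[j - 1]
        a.append(tau / j * s)
    return a

# Logistic equation of Sect. 3.5: f(x) = 1/4 - (x - 1/2)^2, i.e. c = [1/4, 0, -1]
print(taylor_coefficients([0.25, 0.0, -1.0], 0.5, 1.0, 5))
# -> [0.5, 0.25, 0.0, -0.0208333..., 0.0], matching a_j(1) in Example 1
```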

2.3 Constructing the fixed point problem

Now, we want to construct appropriate choices for XU, and T as in Theorem 2. We start with a definition.

Definition 2

For any \(N \in \mathbb {N}\) we define the tail subspace of S to be

$$\begin{aligned} S_{\text {tail}}= \{u \in S : u_j=0 \ \text { for } 0 \le j \le N\}. \end{aligned}$$
(18)

Similarly, we define the tail subspace of \(\ell ^1\) by \(X = S_{\text {tail}}\cap \ell ^1\) and we note that X is a closed subspace of \(\ell ^1\). Hence, X is a Banach space under the norm inherited from \( \ell ^1\). We will denote this norm by \(\Vert \cdot \Vert _{X}\) to emphasize when we are working in this subspace.

Now, we define a Banach space to work in by supposing that \(N \in \mathbb {N}\) is fixed and \(S_{\text {tail}}, X\) denote the tail subspaces as defined in Definition 2. Let \(a(\tau )\) denote the sequence satisfying Eq. (17) where now we emphasize the dependence of this sequence on the choice of \(\tau \) explicitly. Let \(\hat{a}(\tau )\) denote the truncation of \(a(\tau )\) embedded into \(\ell ^1\) defined explicitly by

$$\begin{aligned} \hat{a}(\tau )_j \,{:=}\, {\left\{ \begin{array}{ll} 0 &{}\quad j = 0, \ \text {or} \ j > N \\ a_j(\tau ) &{}\quad 1 \le j \le N. \end{array}\right. } \end{aligned}$$
(19)

Equation (17) leads us to define the \(\tau \)-parameterized family of maps, \(T_\tau : X \rightarrow S_{\text {tail}}\), by the formula

$$\begin{aligned} T_\tau (u)_j \,{:=}\, {\left\{ \begin{array}{ll} 0 &{}\quad 0\le j \le N\\ \frac{\tau }{j}\sum \nolimits _{k=1}^{j-1} c_k \left( \hat{a}(\tau ) + u\right) ^k_{j-1} &{}\quad j > N. \end{array}\right. } \end{aligned}$$
(20)

We will show in the next section that the tail of \(a(\tau )\) is the unique fixed point of \(T_\tau \). However, we ultimately want to show that this tail lies in X (equivalently, that \(a(\tau ) \in \ell ^1\)), and we note that the map defined in Eq. (20) does not necessarily map back into X as required for Theorem 2. As a consequence, we must first define an appropriate open subset, \(U \subset X\), on which to restrict T.

With this in mind, we note that since f is analytic on the interval \((x_0-b, x_0 + b)\), for any constant \(b_* \in (0, b)\), there exist positive real constants C, \(C^*\) and \(C^{**}\), which satisfy the bounds

$$\begin{aligned} \sum _{k=0}^{\infty } \left| c_k \right| b_*^k&< C \end{aligned}$$
(21)
$$\begin{aligned} \sum _{k=1}^{\infty } k\left| c_k \right| b_*^{k-1}&< C^*\end{aligned}$$
(22)
$$\begin{aligned} \sum _{k=2}^{\infty } k(k-1)\left| c_k \right| b_*^{k-2}&< C^{**}. \end{aligned}$$
(23)

This is a simple consequence of Cauchy’s integral formula combined with Proposition 3. A proof can be found in [21]. Next, we note that \(\Vert \hat{a}(\tau )\Vert _{1}\) is monotonically increasing as a function of \(\tau \) and by a simple computation we have the limits

$$\begin{aligned} \lim \limits _{\tau \rightarrow 0} \Vert \hat{a}(\tau )\Vert _{1} = 0 \quad \lim \limits _{\tau \rightarrow \infty } \Vert \hat{a}(\tau )\Vert _{1} = \infty . \end{aligned}$$

Hence, there exists a unique \(\tau _0\) such that

$$\begin{aligned} \Vert \hat{a}(2\tau _0)\Vert _{1} = b_*, \end{aligned}$$

and therefore, \(\Vert \hat{a}(\tau )\Vert _{1} < b_*\) for all \(0<\tau \le \tau _0\). Define positive constants

$$\begin{aligned}&r^* \,{:=}\, b_*-\Vert \hat{a}(\tau _0)\Vert _{1} > 0 \end{aligned}$$
(24)
$$\begin{aligned}&\tau ^* \,{:=}\, \min \left( \tau _0,\frac{Nr^*}{C+r^* C^*} \right) \end{aligned}$$
(25)

and define the open subset

$$\begin{aligned} U\,{:=}\,\left\{ u\in X : \Vert u\Vert _{X} < \frac{1}{2}r^* \right\} . \end{aligned}$$
(26)

Note that the choice of \(b_*\) is not unique. However, for any \(b_* \in (0,b)\), this construction produces an appropriate subset \(U \subset X\).

Next, we will prove that the restriction of \(T_\tau \) to U satisfies the requirements of Theorem 2. We start by defining some notation.

Definition 3

Let \(u \in S\) be any real sequence. The pointwise positive sequence associated to u, denoted by \(\left| u \right| \in S\), is the sequence with terms defined by

$$\begin{aligned} \left| u \right| _j = \left| u_j \right| . \end{aligned}$$

With this notation defined, we have the following lemma.

Lemma 5

Fix \(N \in \mathbb {N}, b_* \in (0, b)\) with corresponding constant \(\tau ^*\) as defined by Eq. (25), and \(U \subset X\) as defined by Eq. (26). Suppose \(\tau \in (0, \tau ^*]\) is fixed, and let \(\hat{a}\) denote the corresponding sequence defined in Eq. (19) where the dependence on \(\tau \) is suppressed. Let T denote the corresponding map defined by Eq. (20). Then

  1. (i)

    \(T(U) \subset X\)

  2. (ii)

    \(T: U \rightarrow X\) is Fréchet differentiable.

Proof

To prove (i), note that T maps into \(S_{\text {tail}}\) by definition, so it suffices to show that for any \(u \in U\), \(T(u) \in \ell ^1\). By a direct computation, we have

$$\begin{aligned} \begin{aligned} \sum _{j=0}^{\infty } \left| T(u)_j \right|&= \sum _{j=N+1}^{\infty } \left| \frac{\tau }{j}\sum _{k=1}^{j-1} c_k \left( \hat{a} + u\right) ^k_{j-1} \right| \\&\le \frac{\tau }{N+1} \sum _{j=N+1}^{\infty } \sum _{k=1}^{j-1} \left| c_k \right| \left| \left( \hat{a} + u\right) ^k_{j-1} \right| \\&\le \frac{\tau }{N+1} \sum _{k=1}^{\infty } \left| c_k \right| \Vert \hat{a} + u\Vert _1^k\\&< \frac{\tau }{N+1} \sum _{k=1}^{\infty } \left| c_k \right| b_*^k\\&\le \frac{\tau C}{N+1}, \end{aligned} \end{aligned}$$

where the second to last line follows from Eq. (24) combined with the bound \(\Vert u\Vert _X < \frac{1}{2}r^*\), and the last line from Eq. (21). Hence, \(T(u) \in \ell ^1\) as required.

Now, we show that T is Fréchet differentiable. Fix \(u \in U\) and define a linear operator, \(A(u): U \rightarrow X\), by its action on \(h \in U\) given by the formula

$$\begin{aligned} \left( A(u) h\right) _j = \left\{ \begin{array}{ll} 0 &{}0\le j \le N\\ \frac{\tau }{j} \sum \limits _{k=1}^{j-1}k c_k \left( h * \left( \hat{a} + u \right) ^{k-1} \right) _{j-1} &{} j > N. \end{array} \right. \end{aligned}$$
(27)

The claim that A(u) maps U into X follows from a computation similar to the proof of (i) by applying Eq. (22). We want to show that A(u) is the Fréchet derivative of T at \(u \in U\). Let \(h \in U\) be arbitrary such that \(u+h\in U\) as well. By directly applying the formulas for T(u) and A(u), we have

$$\begin{aligned}&\left| T(u+h)-T(u)-A(u)h \right| _j\\&\quad = \left| \frac{\tau }{j}\sum _{k=0}^{j-1} c_k\left( \left( \left( \hat{a} + u +h \right) ^k\right) _{j-1}-\left( \left( \hat{a} + u \right) ^k\right) _{j-1}-k \left( h * \left( \hat{a} + u \right) ^{k-1} \right) _{j-1}\right) \right| \\&\quad = \left| \frac{\tau }{j}\sum _{k=2}^{j-1} c_k\sum _{i=2}^{k}\frac{k(k-1)}{i(i-1)}\left( {\begin{array}{c}k-2\\ i-2\end{array}}\right) \left( h^i * (\hat{a} +u)^{k-i}\right) _{j-1} \right| . \end{aligned}$$

Now, passing to the pointwise positive sequences for \(\hat{a} + u\) and h and summing over \(j \in \mathbb {N}\) we obtain the estimate

$$\begin{aligned}&\Vert T(u+h)-T(u)-A(u)h\Vert _X \\&\quad \le \sum _{j=N+1}^\infty \frac{\tau }{j}\sum _{k=2}^{j-1} k(k-1)\left| c_k \right| \sum _{i=0}^{k-2}\left( {\begin{array}{c}k-2\\ i\end{array}}\right) \left| \left( \left| h \right| ^{i+2} * \left| \hat{a} + u \right| ^{k-2 - i}\right) _{j-1} \right| \\&\quad = \sum _{j=N+1}^\infty \frac{\tau }{j}\sum _{k=2}^{j-1} k(k-1)\left| c_k \right| \left( \left| h \right| ^2 * \left( \left| \hat{a} + u \right| + \left| h \right| \right) ^{k-2} \right) _{j-1}\\&\quad \le \frac{\tau \Vert h\Vert ^2_X}{N+1}\sum _{k=2}^{\infty } k(k-1)\left| c_k \right| \left( \Vert \hat{a}\Vert _1 + \Vert u\Vert _X + \Vert h\Vert _X\right) ^{k-2}\\&\quad < \frac{\tau \Vert h\Vert ^2_X}{N+1}\sum _{k=2}^{\infty } k(k-1)\left| c_k \right| b_*^{k-2}\\&\quad \le \frac{\tau C^{**}}{N+1} \Vert h\Vert ^2_{X}, \end{aligned}$$

where the second to last line follows from Eq. (24) combined with the bounds \(\Vert h\Vert _X < \frac{1}{2}r^*\) and \(\Vert u\Vert _X < \frac{1}{2}r^*\), and the last line follows from Eq. (23). It follows that

$$\begin{aligned} \lim _{\Vert h\Vert _{X}\rightarrow 0}\frac{\Vert T(u+h)-T(u)-A(u)h\Vert _{X}}{\Vert h\Vert _{X}} = 0 \end{aligned}$$
(28)

which proves that T is Fréchet differentiable. Moreover, since \(0 < \tau \le \tau ^*\) was arbitrary, we have shown that \(T_\tau \) is Fréchet differentiable for the entire family of \(\tau \)-parameterized maps defined by Eq. (20).

Lemma 5 shows that \(T_\tau \) is Fréchet differentiable and, moreover, that its derivative is given by the formula in Eq. (27). For the remainder of this work, we let \(DT_\tau (u)\) denote the Fréchet derivative of \(T_\tau \) at \(u \in U\).

2.4 Constructing the bounds

To construct the bounds required for Theorem 2, we begin by defining \(\bar{x} \,{:=}\, 0_{\ell ^1} \in X\), the identically zero sequence. This choice is made independently of N and \(\tau \). We are left with constructing \(r_0\), \(Y_\tau \), and \(Z_\tau : (0, r^*) \rightarrow [0, \infty )\) such that the corresponding radii polynomial satisfies \(p_\tau (r_0) < 0\). Here the \(\tau \) subscript emphasizes that these bounds depend on \(\tau \). The next lemma establishes the required bounds for \(Y_\tau \) and \(Z_\tau \).

Lemma 6

Fix \(N \in \mathbb {N}\) and let \(S_{\text {tail}}\) be the tail subspace of order N. Fix \(b_* \in (0, b)\) with corresponding constants \(C, C^*, r^*\) and \(\tau ^*\) as defined in Eqs. (21), (22), (24), (25), and \(U \subset X = S_{\text {tail}}\cap \ell ^1\) as defined in Eq. (26). For \(\tau \in (0, \tau ^*]\), let \(\hat{a}(\tau )\) denote the truncation defined in Eq. (19), and let \(T_\tau : U \rightarrow X\) denote the parameterized family of maps defined in Eq. (20). Define the constant

$$\begin{aligned} Y_\tau \,{:=}\,\frac{\tau C}{N+1} \end{aligned}$$
(29)

and the constant function, \(Z_\tau : (0, r^*) \rightarrow [0, \infty )\), by the formula

$$\begin{aligned} Z_\tau (r)\,{:=}\,\frac{\tau C^* }{N+1} \quad \text {for all} \quad r \in (0, r^*). \end{aligned}$$
(30)

Then the following bounds hold

$$\begin{aligned}&\qquad \qquad \qquad \Vert T_\tau (0)\Vert _{X}\le Y_\tau \end{aligned}$$
(31)
$$\begin{aligned}&\sup \limits _{u \in \overline{B_r(0)}} \Vert DT_\tau (u)\Vert _{X} \le Z_\tau (r) \quad \text {for all} \quad r \in (0, r^*). \end{aligned}$$
(32)

Proof

To establish the bound for \(Y_\tau \), we compute

$$\begin{aligned} \begin{aligned} \Vert T(0)\Vert _{X}&=\sum _{j=N+1}^{\infty } \left| \frac{\tau }{j}\sum _{k=0}^{j-1} c_k \hat{a}^k_{j-1} \right| \\&\le \frac{\tau }{N+1} \sum _{k=1}^{\infty }\sum _{j=N+1}^{\infty }\left| c_k \right| \left| \hat{a}^k_{j-1} \right| \\&\le \frac{\tau }{N+1} \sum _{k=1}^{\infty } \left| c_k \right| \Vert \hat{a}^k \Vert _1\\&\le \frac{\tau C}{N+1} \end{aligned} \end{aligned}$$

which proves the bound in Eq. (31).

Next, we fix \(0< r < r^*\) and \(u \in \overline{B_r(0)}\), and suppose \(h \in U\) is arbitrary. Then we have the bound

$$\begin{aligned} \begin{aligned} \Vert DT_\tau (u)h\Vert _{X}&=\sum _{j=N+1}^{\infty }\frac{\tau }{j}\left| \sum _{k=1}^{\infty }k c_k\left( h*\left( \hat{a} + u \right) ^{k-1}\right) _{j-1} \right| \\&\le \frac{\tau }{N+1}\sum _{k=1}^{\infty } k \left| c_k \right| \Vert h*\left( \hat{a} + u \right) ^{k-1}\Vert _1\\&\le \frac{\tau \Vert h\Vert _{X}}{N+1} \sum _{k=1}^{\infty } k \left| c_k \right| \left( \Vert \hat{a}\Vert _1 + \Vert u\Vert _X \right) ^{k-1}. \end{aligned} \end{aligned}$$

Dividing through by \(\Vert h\Vert _X\), we obtain the operator norm bound

$$\begin{aligned} \Vert DT_\tau (u)\Vert _X \le \frac{\tau }{N+1} \sum _{k=1}^{\infty } k \left| c_k \right| \left( \Vert \hat{a}\Vert _1 + \Vert u\Vert _X \right) ^{k-1}. \end{aligned}$$

Upon taking the supremum over all \(u \in \overline{B_r(0)}\), we obtain the bound

$$\begin{aligned} \sup _{u \in \overline{B_r(0)}} \Vert DT_\tau (u)\Vert _{X} \le \frac{\tau }{N+1} \sum _{k=1}^{\infty } k \left| c_k \right| \left( \Vert \hat{a}\Vert _1 + r \right) ^{k-1}, \end{aligned}$$

and finally, we obtain a bound which holds for any \(r \in (0,r^*)\) given by

$$\begin{aligned} \sup _{u \in \overline{B_r(0)}} \Vert DT_\tau (u)\Vert _{X}&\le \frac{\tau }{N+1} \sum _{k=1}^{\infty } k \left| c_k \right| \left( \Vert \hat{a}\Vert _1 + r^* \right) ^{k-1} \end{aligned}$$
(33)
$$\begin{aligned}&\le \frac{\tau C^*}{N+1}, \end{aligned}$$
(34)

where the last line follows from Eqs. (22) and (24).

We note that our definition of \(Z_\tau \) in Lemma 6 is, in fact, a constant function with no dependence on r. However, the statement of Theorem 2 allows for Z to depend on r. In practical applications of the radii polynomial approach, one often bounds higher order derivatives of \(T_\tau \) to obtain sharper estimates, and in that case Z does indeed depend on r. In order to highlight the similarity between these practical applications and our proof in the present work, we will continue to consider \(Z_\tau \) as a function defined on the interval \((0, r^*)\), and write \(Z_\tau (r)\) despite the fact that it is constant.

We have now constructed all of the necessary ingredients for applying Theorem 2 which we apply to prove a precursor to the Cauchy–Kovalevskaya theorem for the scalar case.

Theorem 7

(Cauchy–Kovalevskaya precursor). Suppose \(V \subset \mathbb {R}\) is an open subset and \(f : V \rightarrow \mathbb {R}\) is analytic with a Taylor expansion centered at \(x_0 \in V\) given by the formula

$$\begin{aligned} f(x) = \sum _{k=0}^\infty c_k (x-x_0)^k \end{aligned}$$

which converges for \(x \in (x_0 - b, x_0 + b)\subseteq V\). For any \(N \in \mathbb {N}\), there exists \(\tau > 0\) such that the map defined by Eq. (20) has a fixed point.

Proof

Let \(S_{\text {tail}}\) be the tail subspace of order N and let \(X = S_{\text {tail}}\cap \ell ^1\). Fix \(b_* \in (0, b)\) with corresponding constants \(r^*\) and \(\tau ^*\) as defined by Eqs. (24), (25), and \(U \subset X\) as defined by Eq. (26). Let \(\hat{a}(\tau ^*)\) denote the truncation defined in Eq. (19), and let \(T_{\tau ^*} : U \rightarrow X\) denote the map defined in Eq. (20). Define the radii polynomial

$$\begin{aligned} \begin{aligned} p(r) \,{:=}\, Z_{\tau ^*}(r)r - r + Y_{\tau ^*} \quad \text {for} \quad r \in (0, r^*), \end{aligned} \end{aligned}$$

where \(Y_{\tau ^*}\) and \(Z_{\tau ^*}\) are the norm bounds for \(T_{\tau ^*}\) and \(DT_{\tau ^*}\) proved in Lemma 6. Applying the formulas for \(Y_{\tau ^*}, Z_{\tau ^*}\), we obtain the bound

$$\begin{aligned} p(r)&= \frac{\tau ^* C^*}{N+1}r - r + \frac{\tau ^* C}{N+1} \\&\le \frac{N r^*}{(N+1)(C + r^* C^*)} \left( rC^* + C \right) - r \end{aligned}$$

for all \(r \in (0,r^*)\).

Setting \(r_0 \,{:=}\, \frac{N}{N+1}r^* \in (0, r^*)\), we obtain the bound

$$\begin{aligned} p(r_0)&< \frac{N r^*}{(N+1)(C + r^* C^*)} \left( r^*C^* + C \right) - \frac{N}{N+1}r^* \\&= 0. \end{aligned}$$

By Theorem 2, we conclude that \(T_{\tau ^*}\) has a fixed point in U.

Note that Theorem 7 implies the Cauchy–Kovalevskaya theorem once we know that fixed points of our construction correspond to analytic solutions of Eq. (1), which we prove in the next section.

3 The Cauchy–Kovalevskaya theorem for analytic vector fields

We begin by extending the construction in Sect. 2 to the case for which f is a vector field. The main technical results are already handled in the scalar case and much of the work here amounts to setting up appropriate notation so that the previous fixed point problem is meaningful. Once this is accomplished, our proof of the Cauchy–Kovalevskaya theorem follows by first proving that fixed points of our construction imply analytic solutions of (1), and then proving a general version of Theorem 7 for analytic vector fields. We begin by recalling the definition of analyticity for vector fields.

Definition 4

Let \(V \subset \mathbb {R}^n\) be an open subset and suppose \(g: V \rightarrow \mathbb {R}\) is a scalar function of the n variables, \(\left\{ x_1,\dotsc ,x_n\right\} \), which we write as components of a vector, \(x \in \mathbb {R}^n\). To avoid confusion over the meaning of indices we will index the components of a vector with superscripts by writing \(x = \left( x^{(1)}, \dotsc , x^{(n)}\right) \). Then g is analytic if for every \(x = \left( x^{(1)}, \dotsc , x^{(n)}\right) \in V\), and for each \(1 \le j \le n\), there exists an open neighborhood, \(V_{x,j} \subset \mathbb {R}\), containing \(x^{(j)}\) such that the formula

$$\begin{aligned} g_{x,j}(t) \,{:=}\, g\left( x^{(1)}, \dotsc , x^{(j-1)}, t , x^{(j+1)}, \dotsc , x^{(n)}\right) \quad t \in V_{x,j}, \end{aligned}$$

defines an analytic function.

This definition generalizes to vector fields as follows. Suppose \(g: V \rightarrow \mathbb {R}^n\) is a vector field which we write as a vector of component functions, \(g(x) = \left( g^{(1)}(x), \dotsc , g^{(n)}(x)\right) \in \mathbb {R}^n\). Then we define g to be analytic if for each \(1 \le i \le n\), the component function, \(g^{(i)} : V \rightarrow \mathbb {R}\), is analytic.

In this setting, the analog of Eq. (12) is the initial value problem

$$\begin{aligned} \dot{x} = \tau f(x) \quad x(0) = x_0 \in V, \end{aligned}$$
(35)

where \(V \subset \mathbb {R}^n\) is an open subset, \(f: V \rightarrow \mathbb {R}^n\) is an analytic vector field, and \(\tau > 0\) is a time rescaling parameter. The solution of Eq. (35) is a function, \(x : \mathbb {R}\rightarrow \mathbb {R}^n\), which parameterizes a trajectory of the ODE initially passing through the point \(x_0\) at time \(t = 0\). Our goal is to prove that if f is analytic, then for each \(x_0 \in V\), there exists an open interval, \(J(x_0) \subset \mathbb {R}\) containing 0, such that \(x: J(x_0) \rightarrow \mathbb {R}^n\) defines an analytic curve.

We will construct a fixed point problem similar to the one in the scalar case. This time we describe the operator at a higher level of abstraction, of which the construction in Sect. 2 is a special case. Next, we introduce a Banach space to work in and define some additional notation.

3.1 Products of sequence spaces

We start by generalizing the sequence spaces introduced for scalar functions in Sect. 2.1 to the vector field setting. We consider coefficient sequences in the product

$$\begin{aligned} S^n \,{:=}\, \underbrace{S \times S \times \dots \times S}_{n \ \text {copies}}. \end{aligned}$$
(36)

For arbitrary \(u \in S^n\), we write \(u = \left( u^{(1)}, \dotsc , u^{(n)}\right) \) with \(u^{(i)} \in S\) for \(1 \le i \le n\). If \(g : \mathbb {D}\rightarrow \mathbb {R}^n\) is an analytic curve, then g is defined by a convergent Taylor series of the form

$$\begin{aligned} g(z) = \begin{pmatrix} g^{(1)}(z) \\ \vdots \\ g^{(n)}(z) \end{pmatrix} = \begin{pmatrix} \sum _{j = 0}^{\infty }u^{(1)}_j z^j\\ \vdots \\ \sum _{j = 0}^{\infty }u^{(n)}_j z^j \end{pmatrix} \quad u^{(i)}_j \in \mathbb {R}\quad \text {for all} \quad j\in \mathbb {N}, \ 1 \le i \le n. \end{aligned}$$
(37)

Hence, g is naturally identified with an element, \(u \in S^n\), where \(u^{(i)} \in S\) is the sequence of Taylor coefficients for the analytic scalar function, \(g^{(i)} : \mathbb {D}\rightarrow \mathbb {R}\).

Often, it is advantageous to consider an alternative description of \(S^n\) in which we define elements of \(S^n\) as sequences of vectors in \(\mathbb {R}^n\). Specifically, we have the following equivalent characterization:

$$\begin{aligned} S^n = \left\{ \left\{ u_j\right\} _{j = 0}^\infty : u_j \in \mathbb {R}^n, \ j \in \mathbb {N}\right\} . \end{aligned}$$
(38)

In this case, the equivalent expression for Eq. (37) can be written as

$$\begin{aligned} g(z) = \sum _{j = 0}^{\infty }u_j z^j \quad u_j \in \mathbb {R}^n \quad \text {for all} \quad j \in \mathbb {N}. \end{aligned}$$
(39)

For arbitrary \(u \in S^n\) we write \(u^{(i)} \in S\) to express the \(i^\mathrm{th}\) component sequence, and we write \(u_j \in \mathbb {R}^n\) to denote the \(j^\mathrm{th}\) term when we consider u to be an infinite sequence of real vectors.

Following the radii polynomial approach and the constructions in Sect. 2, we want to work in a Banach space of absolutely summable sequences. The appropriate space for representing analytic curves would be a product of the form \(\ell ^1_{\nu _1} \times \ell ^1_{\nu _2} \times \dots \times \ell ^1_{\nu _n}\). By an easy generalization of Proposition 4, we can take \(\nu _i = 1\) for \(1 \le i \le n\). With this in mind, we define the product

$$\begin{aligned} (\ell ^1)^n \,{:=}\, \underbrace{\ell ^1 \times \ell ^1 \times \dots \times \ell ^1}_{n \ \text {copies}}, \end{aligned}$$

where we note the inclusion, \((\ell ^1)^n \subset S^n\). We equip \((\ell ^1)^n\) with the norm defined by

$$\begin{aligned} \Vert u\Vert _{\infty } \,{:=}\, \max \left\{ \Vert u^{(1)}\Vert _1, \Vert u^{(2)}\Vert _1 , \dotsc , \Vert u^{(n)}\Vert _1\right\} \end{aligned}$$

which makes \((\ell ^1)^n\) into a Banach space. Before continuing to the construction of the fixed point operator, we introduce notation to connect analytic functions and their Taylor coefficient sequences.

Definition 5

Let \(C^\omega (\mathbb {D}, \mathbb {R}^n)\) denote the space of parameterized curves which are analytic on \(\mathbb {D}\). The Taylor coefficient map, \(\mathcal {T}: C^\omega (\mathbb {D}, \mathbb {R}^n) \rightarrow S^n\), is the linear operator which maps an analytic function to its sequence of Taylor coefficients. Specifically, \(u = \mathcal {T}g \in S^n\) is the sequence defined by the formula

$$\begin{aligned} u_j = {\left\{ \begin{array}{ll} g(0) &{}\quad j = 0 \\ \frac{g^{(j)}(0)}{j!} &{}\quad j \ge 1. \end{array}\right. } \end{aligned}$$

We define the “inverse” Taylor coefficient map by the formula

$$\begin{aligned} \mathcal {T}^{-1} u = \sum _{j = 0}^{\infty }u_j z^j, \end{aligned}$$

where we note that strictly speaking, \(\mathcal {T}^{-1}\) is not a true inverse since \(\mathcal {T}^{-1} u\) does not generally define an analytic function. Nevertheless, \(\mathcal {T}^{-1} u\) is well defined as a formal power series and as we make no assumption about its convergence this notation should not present any ambiguity.

Now, we have all of the necessary ingredients to describe the construction of the fixed point operator.

3.2 Constructing the fixed point problem

Our first goal is to construct a fixed point problem to which we will apply Theorem 2. We start by noting that Eq. (35) has a unique smooth solution, \(x: J(x_0) \rightarrow \mathbb {R}^n\), which follows from the same bootstrap argument as in the scalar case. Therefore, the sequence \(\mathcal {T}(x) \in S^n\) is well defined.

Following the radii polynomial approach, we want to identify a fixed point problem which has a solution if and only if there exists some \(\tau \) such that \(a(\tau ) \in (\ell ^1)^n\). Next, we extend Definition 2 to \(S^n\).

Definition 6

For a fixed \(N \in \mathbb {N}\), we define the tail subspace of order N to be

$$\begin{aligned} S_{\text {tail}}^n \,{:=}\, \left\{ u \in S^n : u^{(i)}_j = 0 \quad \text {for} \quad 0 \le j \le N, \ 1 \le i \le n\right\} . \end{aligned}$$
(40)

We let \(X \,{:=}\, S_{\text {tail}}^n \cap (\ell ^1)^n\) denote the space of absolutely summable tails. Note that X is a closed subspace of \((\ell ^1)^n\) which makes X into a Banach space under the norm inherited from \((\ell ^1)^n\) and we denote this norm by \(\Vert \cdot \Vert _X\).

Our fixed point problem will be formulated on the Banach space, X, given in Definition 6. Specifically, we describe a parameterized family of maps, \(T_\tau : X \rightarrow S^n_{\text {tail}}\), whose fixed points characterize the solutions of Eq. (35). Our construction for \(T_\tau \) in the general case is decomposed as a composition of maps defined on \(S^n\) which simplifies its analysis. We begin by defining a functional analytic extension of a smooth function defined on \(\mathbb {R}^n\), to a corresponding induced map on \(S^n\).

Definition 7

Let g be a formal power series in the variables \(\left\{ x^{(1)}, \dotsc , x^{(n)}\right\} \) defined with multi-indices by the formula

$$\begin{aligned} g(x) = \sum _{\alpha \in \mathbb {N}^n} u_\alpha x^\alpha \quad \text {where} \quad u_\alpha \in \mathbb {R}, \ x^\alpha = \prod _{i=1}^{n} \left( x^{(i)}\right) ^{\alpha ^{(i)}}. \end{aligned}$$

Formally, \(g : \mathbb {R}^n \rightarrow \mathbb {R}\) defines a scalar valued function on \(\mathbb {R}^n\), and we note that evaluating g requires only sums and products. Hence, g induces a map, \(\phi _g : S^n \rightarrow S\), defined by the formula

$$\begin{aligned} \phi _g(u) \,{:=}\, \mathcal {T}\circ g(\mathcal {T}^{-1} u). \end{aligned}$$

We refer to this induced map as the S-extension of g. This generalizes to vector fields in the obvious way. If \(g(x) = \left( g^{(1)}(x), \dotsc , g^{(n)}(x) \right) \) is a vector field where for \(1 \le i \le n\), \(g^{(i)}(x)\) is given by a power series, then the \(S^n\)-extension of g denoted by \(\phi _g : S^n \rightarrow S^n\), is defined by the formula

$$\begin{aligned} \phi ^{(i)}_g(u) = \mathcal {T}\circ g^{(i)}(\mathcal {T}^{-1} u) \quad \text {for} \quad 1 \le i \le n. \end{aligned}$$

Next, we define two operators on \(S^n\) which are important for our fixed point construction.

Definition 8

The integration map, denoted by \(I: S^n \rightarrow S^n\), is the function whose action on \(u \in S^n\) is defined by

$$\begin{aligned} I(u)_j = \left\{ \begin{array}{ll} 0 &{} j = 0 \\ \frac{1}{j}u_{j-1} &{} j \ge 1. \end{array}\right. \end{aligned}$$
(41)

Definition 9

For any \(N \in \mathbb {N}\), let \(S_{\text {tail}}^n\) denote the corresponding tail subspace of order N. We define the tail projection map, \(\pi _N : S^n \rightarrow S_{\text {tail}}^n\), by its action on \(u \in S^n\) given by the formula

$$\begin{aligned} \pi _N(u)_j = {\left\{ \begin{array}{ll} 0_{\mathbb {R}^n} &{}\quad 0 \le j \le N \\ u_j &{}\quad j > N. \end{array}\right. } \end{aligned}$$

Note that the restriction of \(\pi _N\) to \((\ell ^1)^n\) induces a map, \(\pi _N : (\ell ^1)^n \rightarrow X\).
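Both maps act coordinate-wise, so they admit one-line implementations. Here is a minimal Python sketch on truncated scalar sequences (the vector case applies the same maps componentwise; the names are ours):

```python
def integrate(u):
    """Integration map of Definition 8: I(u)_0 = 0, I(u)_j = u_{j-1}/j."""
    return [0.0] + [u[j - 1] / j for j in range(1, len(u) + 1)]

def tail_projection(u, N):
    """Tail projection of Definition 9: zero out entries 0..N, keep the rest."""
    return [0.0] * min(N + 1, len(u)) + u[N + 1:]

# Term-by-term integration of 1/(1-t) = sum t^j gives -log(1-t) = sum t^j/j:
u = [1.0, 1.0, 1.0, 1.0]
print(integrate(u))            # [0.0, 1.0, 0.5, 0.333..., 0.25]
print(tail_projection(u, 1))   # [0.0, 0.0, 1.0, 1.0]
```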

Now, we describe the fixed point problem construction for vector fields. Let \(\tilde{x}_0\) denote the embedding of \(x_0\) into \((\ell ^1)^n\) defined by

$$\begin{aligned} \tilde{x}_0 \,{:=}\, \left( x_0, 0_{\mathbb {R}^n}, 0_{\mathbb {R}^n}, \dotsc \right) . \end{aligned}$$

Suppose \(\tau > 0\) and define the parameterized sequence \(a(\tau ) \in S^n\) by the formula

$$\begin{aligned} a(\tau )^{(i)}_j \,{:=}\, {\left\{ \begin{array}{ll} x_0^{(i)} &{}\quad j = 0 \\ \frac{\tau }{j} \left( \phi _{f^{(i)}}(a(\tau ) - \tilde{x}_0)\right) _{j-1} &{}\quad j \ge 1 \end{array}\right. } \quad \text {for} \quad 1 \le i \le n. \end{aligned}$$
(42)

Fix \(N \in \mathbb {N}\), and define the truncation

$$\begin{aligned} \hat{a}(\tau ) \,{:=}\, a(\tau ) - \tilde{x}_0 - \pi _N(a(\tau )) \in (\ell ^1)^n, \end{aligned}$$
(43)

and the parameterized family of maps, \(T_\tau : X \rightarrow S_{\text {tail}}^n\), by the formula

$$\begin{aligned} T_\tau (u) = \tau \pi _N \circ I \circ \phi _f \left( \hat{a}(\tau ) + u \right) . \end{aligned}$$
(44)

Note that the construction in Sect. 2 is a special case of this map when \(n = 1\). Expressing \(T_\tau \) as a composition of operators makes it easy to provide an explicit formula for \(T_\tau \). However, it is no longer obvious that the tail of the Taylor coefficient sequence of our IVP solution must be a fixed point of \(T_\tau \). The next lemma proves this is the case.
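Before proceeding, the following minimal Python sketch (assuming \(n = 1\) and a polynomial f; truncated lists stand in for elements of X, and all names are ours) traces the composition in Eq. (44) and recovers the formula in Eq. (20):

```python
def T(u, a_hat, c, tau, N):
    """Scalar instance of Eq. (44): T(u) = tau * pi_N(I(phi_f(a_hat + u))).

    u and a_hat are truncated coefficient sequences of equal length; c holds
    the Taylor coefficients c_k of the (polynomial) f centered at x_0.
    """
    v = [x + y for x, y in zip(a_hat, u)]              # a_hat + u
    L = len(v)
    # phi_f(v): Taylor coefficients of f composed with the series for v
    phi, power = [0.0] * L, [1.0] + [0.0] * (L - 1)    # power = v^0
    for c_k in c:
        phi = [p + c_k * q for p, q in zip(phi, power)]
        power = [sum(power[m - i] * v[i] for i in range(m + 1)) for m in range(L)]
    integrated = [0.0] + [phi[j - 1] / j for j in range(1, L)]   # the map I
    return [0.0 if j <= N else tau * integrated[j] for j in range(L)]

# Evaluate on the zero tail with the data of Example 1 in Sect. 3.5
# (c = [1/4, 0, -1], tau = 1, N = 5); entries with j <= N vanish by construction.
a_hat = [0.0, 0.25, 0.0, -1.0 / 48, 0.0, 0.0, 0.0, 0.0]
print(T([0.0] * 8, a_hat, [0.25, 0.0, -1.0], 1.0, 5))
```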

Lemma 8

Fix \(N \in \mathbb {N}\), let \(T_\tau : X \rightarrow S^n\) be the map defined by Eq. (44) and suppose that for some \(\tau > 0\), \(T_\tau \) has a fixed point. Then Eq. (35) has a unique solution which is analytic on the open interval \((-1,1)\).

Proof

Let \(a(\tau )\) denote the sequence defined by Eq. (42). By construction, if u is any fixed point of \(T_\tau \), then \(u + \hat{a}(\tau ) + \tilde{x}_0\) satisfies the recursive formula in Eq. (42). It follows that \(u = \pi _N(a(\tau ))\) since Eq. (42) is completely determined by a choice of \(\tau , x_0\). Therefore, \(\pi _N(a(\tau )) \in X\) is the unique fixed point of \(T_\tau \). Since this tail lies in \((\ell ^1)^n\) and the remaining entries are finite in number, we have \(a(\tau ) \in (\ell ^1)^n\), and therefore \(\mathcal {T}^{-1} (a(\tau ))\) defines an analytic function on \((-1,1)\) given by the formula

$$\begin{aligned} x(t) \,{:=}\, \mathcal {T}^{-1} (a(\tau )) =\sum _{j=0}^{\infty } a(\tau )_j t^j. \end{aligned}$$

Since f is analytic, it has a convergent power series expansion centered at \(x_0\) of the form

$$\begin{aligned} f(x) = \sum _{\alpha \in \mathbb {N}^n} c_\alpha (x-x_0)^\alpha . \end{aligned}$$

By composing x with \(\tau f\), we obtain the formula

$$\begin{aligned} \tau f(x(t)) = \tau \sum _{\alpha \in \mathbb {N}^n} c_\alpha \left( \sum _{j=1}^{\infty } a(\tau )_j t^j \right) ^\alpha , \end{aligned}$$
(45)

where we have used the fact that \(a(\tau )_0 = x_0\) by definition. By applying \(\mathcal {T}\) to the right-hand side of (45) and expressing it in terms of the \(\phi \) operator, we obtain the formula

$$\begin{aligned} \mathcal {T}\left( \tau f(x(t))\right) _{j-1} = \tau (\phi _f (a(\tau )-\tilde{x}_0))_{j-1} \quad \text {for all} \quad j \ge 1. \end{aligned}$$

On the other hand, we can differentiate the series defining x term by term to obtain the formula

$$\begin{aligned} \left( \mathcal {T}\dot{x}\right) _{j-1} = j a(\tau )_j \quad \text {for all} \quad j \ge 1. \end{aligned}$$

It follows from Eq. (42) that

$$\begin{aligned} \mathcal {T}\left( \tau f(x(t))\right) = \left( \mathcal {T}\dot{x}\right) \end{aligned}$$

proving that x satisfies Eq. (35).

The last ingredient in our fixed point problem is to define an appropriate open subset, \(U \subset X\), on which we will apply Theorem 2. If \(f : V \rightarrow \mathbb {R}^n\) is analytic and \(x_0 \in V\), then each component of f can be defined by power series converging (at least) for all

$$\begin{aligned} x \in \left( x_0^{(1)} - b_1, x_0^{(1)} + b_1\right) \times \dots \times \left( x_0^{(n)} - b_n, x_0^{(n)} + b_n\right) , \end{aligned}$$

where \(b_i > 0\) for \(1 \le i \le n\). We define \(b_0 \,{:=}\, \min \left\{ b_i : 1 \le i \le n\right\} \), and note that for \(1 \le i \le n\), the component, \(f^{(i)} : V \rightarrow \mathbb {R}\), defines an analytic function. Hence, \(f^{(i)}\) has a power series centered at \(x_0\) of the form

$$\begin{aligned} f^{(i)}(x) = \sum _{\alpha \in \mathbb {N}^n} c_\alpha ^{(i)} (x - x_0)^\alpha , \end{aligned}$$

converging at least for \(x - x_0 \in (-b_0, b_0)^n\). We also note the following multi-variable analog of Eqs. (21), (22), and (23). For any \(b_* < b_0\), there exist positive constants \(C_i, C_i^*\) and \(C_i^{**}\), possibly depending on \(b_{*}\), satisfying the bounds

$$\begin{aligned} \sum _{\alpha \in \mathbb {N}^n} \left| c_\alpha ^{(i)} \right| b_*^{\left| \alpha \right| }&< C_i \\ \sum _{\alpha \in \mathbb {N}^n} \sum _{m=1}^n \alpha _{m} \left| c_\alpha ^{(i)} \right| b_*^{\left| \alpha \right| -1}&< C_i^*\\ \sum _{\alpha \in \mathbb {N}^n} \sum _{m_1=1}^n \sum _{m_2=1}^n \alpha _{m_1} \alpha _{m_2} \left| c_\alpha ^{(i)} \right| b_*^{\left| \alpha \right| -2}&< C_i^{**}. \end{aligned}$$

The proof follows immediately from Proposition 3 and the multivariate Cauchy integral formula which can be found in [21]. We let \(C, C^*\), and \(C^{**}\) denote the maximum values for these constants taken over \(1 \le i \le n\). Then we have the bounds

$$\begin{aligned} \sum _{\alpha \in \mathbb {N}^n} \left| c_\alpha ^{(i)} \right| b_*^{\left| \alpha \right| } < C \end{aligned}$$
(46)
$$\begin{aligned} \sum _{\alpha \in \mathbb {N}^n} \sum _{m=1}^n \alpha _{m} \left| c_\alpha ^{(i)} \right| b_*^{\left| \alpha \right| -1} < C^* \end{aligned}$$
(47)
$$\begin{aligned} \sum _{\alpha \in \mathbb {N}^n} \sum _{m_1=1}^n \sum _{m_2=1}^n \alpha _{m_1} \alpha _{m_2} \left| c_\alpha ^{(i)} \right| b_*^{\left| \alpha \right| -2} < C^{**} \end{aligned}$$
(48)

which hold for all \(1 \le i \le n\). We apply these bounds to define an appropriate subset, \(U \subset X\), on which to restrict \(T_\tau \), proceeding as in the scalar case. Note that \(\Vert \hat{a}(\tau )\Vert _\infty \) is monotonically increasing as a function of \(\tau \) since each component has this property. Moreover, we have the limits

$$\begin{aligned} \lim \limits _{\tau \rightarrow 0} \Vert \hat{a}(\tau )\Vert _\infty = 0 \quad \lim \limits _{\tau \rightarrow \infty } \Vert \hat{a}(\tau )\Vert _\infty = \infty \end{aligned}$$

and we define \(\tau _0 > 0\) to be the unique real number satisfying \(\Vert \hat{a}(2\tau _0)\Vert _\infty = b_*\). As in the scalar case, we define the following:

$$\begin{aligned} r^* \,{:=}\, b_*-\Vert \hat{a}(\tau _0)\Vert _{\infty } \end{aligned}$$
(49)
$$\begin{aligned} \tau ^* \,{:=}\, \min \left( \tau _0,\frac{Nr^* }{C+r^* C^*}\right) \end{aligned}$$
(50)

and the open subset

$$\begin{aligned} U\,{:=}\,\left\{ u\in X : \Vert u\Vert _\infty < \frac{1}{2}r^* \right\} . \end{aligned}$$
(51)

This completes the construction of the fixed point problem for the vector field case. Next, we have a generalization of Lemma 5 to vector fields.

Lemma 9

Fix \(N \in \mathbb {N}\) and \(b_* \in (0, b_0)\), with corresponding constants \(r^*\) and \(\tau ^*\) as defined by Eqs. (49), (50), and \(U \subset X\) as defined by Eq. (51). Let \(\hat{a}(\tau )\) denote the sequence defined in Eq. (43), and let \(T_\tau \) denote the map defined by Eq. (44). Then for all \(\tau \in (0, \tau ^*]\), the following statements hold

  1. (i)

    \(T_\tau (U) \subset X\).

  2. (ii)

    \(T_\tau : U \rightarrow X\) is Fréchet differentiable. In particular, the action of \(DT_\tau (u)\) on \(h = \left( h^{(1)}, \dotsc , h^{(n)}\right) \in U\) is given by the formula

$$\begin{aligned} \left( DT_\tau (u) h\right) ^{(i)} = \tau \left( \pi _N \circ I\left( \sum _{m = 1}^{n} \left( \phi _{\nabla f^{(i)}}(\hat{a}(\tau ) + u)\right) ^{(m)}*h^{(m)} \right) \right) , \end{aligned}$$

    where \(\nabla f^{(i)}(x) = \left( \frac{\partial f^{(i)}}{\partial x_1},\frac{\partial f^{(i)}}{\partial x_2},\cdots , \frac{\partial f^{(i)}}{\partial x_n}\right) \) denotes the gradient vector of \(f^{(i)}\).

The proof is an easy generalization of the proof in Lemma 5 where the bound in Eq. (48) is now applied to control all of the \(2^\mathrm{nd}\) order (and higher) partial derivatives of f. We note that the formula for \(DT_\tau (u)\) is nothing more than the operator obtained by applying the \(S^n\)-extension map to each component of the Jacobian matrix for f.

3.3 Constructing the bounds

Now, we construct the bounds required for applying Theorem 2. Similar to the scalar case, we choose \(\bar{x} = \left( 0_{\mathbb {R}^n},0_{\mathbb {R}^n},0_{\mathbb {R}^n},\dotsc \right) \in (\ell ^1)^n\). The necessary bounds are provided by the following generalization of Lemma 6.

Lemma 10

Fix \(N \in \mathbb {N}\) and \(b_* \in (0, b_0)\) with corresponding constants \(r^*, \tau ^*\) as defined by Eqs. (49) and (50), and \(U \subset X\) as defined by Eq. (51). Let \(\hat{a}(\tau )\) denote the truncation defined in Eq. (43), and let \(T_\tau : U \rightarrow X\) denote the parameterized family of maps defined in Eq. (44). For \(\tau \in (0, \tau ^*]\), define the constant

$$\begin{aligned} Y_\tau \,{:=}\,\frac{\tau C}{N+1} \end{aligned}$$
(52)

and the constant function, \(Z_\tau : (0, r^*) \rightarrow [0, \infty )\), by the formula

$$\begin{aligned} Z_\tau (r)\,{:=}\,\frac{\tau C^* }{N+1}. \end{aligned}$$
(53)

Then the following bounds hold:

$$\begin{aligned}&\Vert T_\tau (0)\Vert _\infty \le Y_\tau . \end{aligned}$$
(54)
$$\begin{aligned}&\sup \limits _{u \in \overline{B_r(0)}} \Vert DT_\tau (u)\Vert _\infty \le Z_\tau (r) \quad \text {for all} \quad r \in (0, r^*). \end{aligned}$$
(55)

The proof is similar to the proof of Lemma 6 with Eqs. (46), (47) providing the necessary bounds in this case.

3.4 The constructive proof of the Cauchy–Kovalevskaya theorem

At last, we have all ingredients necessary to give a constructive proof of the Cauchy–Kovalevskaya theorem.

Theorem 11

(Cauchy–Kovalevskaya Theorem). Suppose \(V \subset \mathbb {R}^n\) is an open subset, \(f : V \rightarrow \mathbb {R}^n\) is analytic, and \(x_0 \in V\). Then the initial value problem

$$\begin{aligned} \dot{x} = f(x), \quad x(0) = x_0 \end{aligned}$$
(56)

has a unique analytic solution.

Proof

Suppose \(N \in \mathbb {N}\), let \(S_{\text {tail}}^n\) be the tail subspace of order N, and \(X = S_{\text {tail}}^n \cap (\ell ^1)^n\). Fix \(b_* \in (0, b_0)\) with corresponding constants \(r^*, \tau ^*\) as defined by Eqs. (49) and (50), and \(U \subset X\) as defined by Eq. (51).

We will consider the radii polynomial obtained from the bounds in Lemma 10 for the parameter value \(\tau = \tau ^*\). In particular, let \(\hat{a} \,{:=}\, \hat{a}(\tau ^*)\) denote the truncation defined in Eq. (43), let \(T_{\tau ^*} : U \rightarrow X\) denote the map defined in Eq. (44), and define the radii polynomial

$$\begin{aligned} \begin{aligned} p(r) \,{:=}\, Z_{\tau ^*}(r)r - r + Y_{\tau ^*} \quad \text {for} \quad r \in (0, r^*), \end{aligned} \end{aligned}$$

where \(Y_{\tau ^*}\) and \(Z_{\tau ^*}(r)\) are the norm bounds for \(T_{\tau ^*}\) and \(DT_{\tau ^*}\) proved in Lemma 10. We define \(r_0 \,{:=}\, \frac{Nr^*}{N+1} \in (0, r^*)\) and by a direct computation similar to the proof of Theorem 7, we have \(p(r_0)<0\). It follows from Theorem 2 that \(T_{\tau ^*}\) has a unique fixed point. By Lemma 8, this fixed point is the tail of the coefficient sequence of an analytic solution to Eq. (35). By Proposition 4, this sequence is, in fact, a rescaled coefficient sequence for an analytic solution of Eq. (56), which completes the proof.

3.5 An example

The goal of this work is not to present a practical algorithm for verifying that any particular initial value problem has an analytic solution. Nevertheless, it may be instructive to demonstrate a constructive proof for an example, especially considering that the approach is inspired by rigorous numerical algorithms which do have this exact goal in mind.

Therefore, we conclude this paper by presenting an example of the constructive proof for a toy problem. We have intentionally chosen a rather simple example in an effort to focus on the proof itself. Additionally, the bounds chosen to demonstrate the proof in this example are intended to make the computations easy to follow rather than minimizing the approximation error as one would probably do in practice.

Example 1

Define the function \(f : \mathbb {R}\rightarrow \mathbb {R}\) by the formula \(f(x) = x(1-x)\) and consider the scalar initial value problem

$$\begin{aligned} \dot{x} = \tau f(x) = \tau x(1-x), \quad x_0 = \frac{1}{2}. \end{aligned}$$
(57)

In this example, f is polynomial and therefore analytic. Hence, the Cauchy–Kovalevskaya theorem implies that Eq. (57) has a unique analytic solution (in fact, the exact solution is well known to be \(x(t) = (1 + \exp (-\tau t))^{-1}\)). We will prove this following the constructive approach described in this paper.

We begin by rewriting f centered at \(x_0\) as \(f(x) = \frac{1}{4} - (x - \frac{1}{2})^2\). So the coefficients for f are \(c_0 = \frac{1}{4}\), \(c_2 = -1\), and \(c_j = 0\) for all \(j \ne 0,2\). Since f is polynomial we have \(b = \infty \) and, therefore, we can choose \(b_*\) arbitrarily.

For this example, we let \(b_* = \frac{1}{2}\) and we take \(N = 5\). Applying the formula in Eq. (17), we obtain the first N coefficients which are

$$\begin{aligned} a_0(\tau ) = \frac{1}{2}, \quad a_1(\tau ) = \frac{\tau }{4}, \quad a_2(\tau ) = 0, \quad a_3(\tau ) = \frac{-\tau ^3}{48}, \quad a_4(\tau ) = 0. \end{aligned}$$

Therefore, \(\hat{a}(\tau )\) is the sequence

$$\begin{aligned} \hat{a}(\tau ) = \left( 0, \frac{\tau }{4}, 0, \frac{-\tau ^3}{48}, 0, 0, \dots \right) , \end{aligned}$$

which is in \(\ell ^1\) for all finite \(\tau \). Next, we define \(\tau _0\) as the solution to the equation \(\Vert \hat{a}(2 \tau _0)\Vert _1 = b_*\). For this example, this amounts to solving \(\tau _0 + \frac{1}{3} \tau _0^3 - 1 = 0\). As expected, this equation has a unique real solution which has the exact value

$$\begin{aligned} \tau _0 = \left( \frac{3 + \sqrt{13}}{2} \right) ^\frac{1}{3} - \left( \frac{2}{3 + \sqrt{13}} \right) ^\frac{1}{3} \approx 0.8177. \end{aligned}$$

Following the definition in Eq. (24), we find, after a bit of algebra using the relation \(\tau _0^3 = 3 - 3\tau _0\), that \(r^* = b_* - \Vert \hat{a}(\tau _0)\Vert _1 = \frac{7 - 3\tau _0}{16}\) is the unique real root of the cubic polynomial \(4096z^3 - 5376z^2 + 2784z - 451\). The exact value is given by

$$\begin{aligned} r^* = \frac{7}{16} + \frac{3}{16} \left( \frac{\sqrt{13} - 3}{2} \right) ^\frac{1}{3} - \frac{3}{16}\left( \frac{2}{\sqrt{13} - 3} \right) ^\frac{1}{3} \approx 0.2842. \end{aligned}$$

Next, we define \(C = 1, C^* = 2\) and observe that

$$\begin{aligned} C&> \frac{1}{2} = \left| c_0 \right| + \left| c_2 \right| b_*^2 \\ C^*&> 1 = 2 \left| c_2 \right| b_*, \end{aligned}$$

implying C and \(C^*\) satisfy the bounds required by Eqs. (21) and (22) respectively. Consequently, for this choice of \(b_*, N, C\) and \(C^*\), we have that

$$\begin{aligned} \tau _0 < \frac{Nr^*}{C + r^* C^*} \approx 0.9060, \end{aligned}$$

and, therefore, we set \(\tau ^* = \tau _0\) as defined in Eq. (25).

Continuing with the construction, we compute \(Y_{\tau ^*}\) and \(Z_{\tau ^*}\) according to the formulas defined in Lemma 6. For this example, we obtain the bounds

$$\begin{aligned} Y_{\tau ^*}&= \frac{\tau ^*}{6} = \frac{1}{6} \left( \left( \frac{3 + \sqrt{13}}{2} \right) ^\frac{1}{3} - \left( \frac{2}{3 + \sqrt{13}} \right) ^\frac{1}{3} \right) \approx 0.1363. \\ Z_{\tau ^*}(r)&= \frac{\tau ^*}{3} = \frac{1}{3} \left( \left( \frac{3 + \sqrt{13}}{2} \right) ^\frac{1}{3} - \left( \frac{2}{3 + \sqrt{13}} \right) ^\frac{1}{3} \right) \\&\approx 0.2726 \quad \text {for all} \ r \in (0, r^*). \end{aligned}$$

As expected, the radii polynomial, \(p : (0, r^*) \rightarrow \mathbb {R}\) is given by the formula

$$\begin{aligned} p(r) = Z_{\tau ^*}(r)r - r + Y_{\tau ^*}, \end{aligned}$$

which is linear in r. The conclusion of Theorem 2 is that if p(r) is negative for some \(r \in (0, r^*)\), then \(T_{\tau ^*}\) must have a fixed point and consequently, Eq. (57) has an analytic solution. As in the proof of Theorem 7, we choose

$$\begin{aligned} r_0 = \frac{N}{N+1}r^* = \frac{35}{96} + \frac{15}{96} \left( \frac{\sqrt{13} - 3}{2} \right) ^\frac{1}{3} - \frac{15}{96}\left( \frac{2}{\sqrt{13} - 3} \right) ^\frac{1}{3} \approx 0.2368, \end{aligned}$$

and indeed we find that \(p(r_0) \approx -0.0360\), which completes the proof for this example.
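For readers who wish to retrace these numbers, the following minimal Python sketch reproduces the computation in plain floating point arithmetic (not the interval arithmetic one would use in a rigorous computer-assisted proof; all helper names are ours):

```python
# Data for Example 1: f(x) = 1/4 - (x - 1/2)^2, b_* = 1/2, N = 5, C = 1, C* = 2
b_star, N, C, C_star = 0.5, 5, 1.0, 2.0

def a_hat_norm(tau):
    """||a_hat(tau)||_1 = |a_1(tau)| + |a_3(tau)| = tau/4 + tau^3/48."""
    return tau / 4 + tau**3 / 48

# Solve ||a_hat(2*tau_0)||_1 = b_*, i.e. tau_0 + tau_0^3/3 = 1, by bisection.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if mid + mid**3 / 3 < 1.0:
        lo = mid
    else:
        hi = mid
tau_0 = lo                                                  # ~0.8177

r_star = b_star - a_hat_norm(tau_0)                         # ~0.2842, Eq. (24)
tau_star = min(tau_0, N * r_star / (C + r_star * C_star))   # = tau_0, Eq. (25)
Y = tau_star * C / (N + 1)                                  # ~0.1363, Eq. (29)
Z = tau_star * C_star / (N + 1)                             # ~0.2726, Eq. (30)
r_0 = N * r_star / (N + 1)                                  # ~0.2368

print(Z * r_0 - r_0 + Y)                                    # p(r_0) ~ -0.0360 < 0
```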