Abstract
We give a constructive proof of the classical Cauchy–Kovalevskaya theorem for ordinary differential equations which provides a sufficient condition for an initial value problem to have a unique, analytic solution. Our proof is inspired by a modern numerical technique for rigorously solving nonlinear problems known as the radii polynomial approach. The main idea is to recast the existence and uniqueness of analytic solutions as a fixed point problem on an appropriately chosen Banach space, and then prove a fixed point exists via a constructive version of the Banach fixed point theorem. A key aspect of this method is the use of an approximate solution which plays a crucial role in the theoretical proof. Our proof is constructive in the sense that we provide an explicit recipe for constructing the fixed point problem, an approximate solution, and the bounds necessary to prove the existence of the fixed point.
1 Introduction
In this paper, we present a novel proof of the Cauchy–Kovalevskaya theorem in the ordinary differential equation (ODE) setting. The general theorem, first proved by Sonya Kovalevskaya in 1874, gives sufficient conditions for a Cauchy problem to have a unique analytic solution. Unfortunately, spaces of analytic functions typically do not provide the right regularity class for studying solutions of partial differential equations (PDE), so the Cauchy–Kovalevskaya theorem is rarely practically applicable in that setting. On the other hand, the Cauchy–Kovalevskaya theorem is often applicable to initial value problems (IVP) arising from ODEs, which are the focus of the present work. We begin by stating the theorem in this setting. A statement of the general theorem and its classical proof can be found in most introductory PDE texts, e.g. [1].
Theorem 1
(Cauchy–Kovalevskaya). Suppose \(V \subset \mathbb {R}^n\) is an open subset and \(f :V \rightarrow \mathbb {R}^n\) is an analytic vector field. Then the initial value problem
$$\begin{aligned} \dot{x} = f(x), \qquad x(0) = x_0 \in V, \end{aligned}$$
(1)
has a unique solution which is analytic on some open interval, \(J(x_0)\), containing zero.
There are several proofs of this theorem in the literature. The classical proof provides a prototypical example of the method of majorants. To illustrate the constructive aspect of our approach, we sketch a version of the classical proof for the case \(n = 1\).
The main idea in the classical proof is to use the Taylor coefficients of f to dominate the Taylor coefficients of x. Roughly speaking, f is analytic if its Taylor coefficients decay rapidly enough. The classical proof follows from showing that this condition forces the Taylor coefficients of any solution to decay rapidly as well. Note that the existence and uniqueness of a solution on some open interval, \(J(x_0)\), containing zero follows from the Picard–Lindelöf theorem. In fact, by the usual bootstrap argument, this theorem shows that this solution is as smooth as f. Hence, we may take for granted the existence of \(x \in C^\infty (J(x_0))\) satisfying Eq. (1).
The Cauchy–Kovalevskaya theorem asserts that, in fact, \(x \in C^\omega (J(x_0))\). Equivalently, there exists \(\tau > 0\) such that for \(\left| t \right| < \tau \), the series
$$\begin{aligned} \sum _{j=0}^{\infty } \frac{x^{(j)}(0)}{j!} t^j \end{aligned}$$
converges. The crux of the classical argument arises from applying the Faà di Bruno formula for the iterated chain rule with the assumption that x satisfies Eq. (1), to obtain the formula
where each \(p_j\) is a polynomial in j variables with non-negative coefficients. Then one defines the non-negative sequence, \(\left\{ u_j\right\} = \left\{ \left| f^{(j)}(0) \right| : j \in \mathbb {N}\right\} \), so that we have the bound
This bound implies that the function
is a majorant for x. The classical proof is concluded by showing that \(\tilde{x}\) is analytic which ultimately follows as a consequence of the fact that f is analytic.
The classical proof is quite beautiful; however, we note that it is not constructive. In contrast with this approach, our proof of the Cauchy–Kovalevskaya theorem is based on analyzing the coefficients of the solution and proving directly that they decay sufficiently fast. Our proof is inspired by the so-called “radii polynomial approach”, which provides a constructive framework for proving theorems in nonlinear analysis with the assistance of a digital computer. While our proof does not have a numerical aspect, it is carried out in the same style, so we briefly review the method.
1.1 The radii polynomial approach
The radii polynomial approach is a modern methodology combining functional analytic tools with rigorous numerical computations to study nonlinear problems. The method first appeared in [2] as a modification of the technique presented in [3] for rigorously proving the existence of solutions of zero-finding problems using Newton’s method. Since then, the radii polynomial approach has played an important role in a number of results in dynamical systems such as existence of spontaneous periodic solutions in the Navier–Stokes equations [4], chaos in the circular restricted four body problem [5], coexistence of hexagonal patterns and rolls in the Swift–Hohenberg equations [6], and the proof of Wright’s conjecture [7], to name just a few. This is a small subset of the growing collection of results which utilize the radii polynomial approach as the basis for rigorous numerical algorithms for computation and continuation of equilibria, periodic orbits, connecting orbits, solutions of initial/boundary value problems, and invariant manifolds (see, e.g. [8,9,10,11,12,13,14,15,16]). A more detailed exposition on rigorous numerical techniques and various applications of radii polynomial approach can be found in [17, 18].
The main idea is to first recast problems as a zero-finding problem on a Banach space. Then a Newton-like operator is introduced which has fixed points in one-to-one correspondence with solutions of the zero-finding problem. By combining careful “pencil and paper” estimates with rigorous computations, one tries to prove that this Newton-like operator has a fixed point by an application of the Banach fixed point theorem. If successful, the existence of a zero for the original problem is concluded.
In [19], the radii polynomial approach was generalized to a rigorous numerical IVP solver for polynomial vector fields in which the fixed point problem does not arise from a Newton-like operator. This approach was based on modifying the methodology in [20] in which one looks for a fixed point of a “Picard-like” operator. In this work, we follow a similar approach. The main idea is to associate any instance of Eq. (1) with a mapping, \(T : X \rightarrow X\), where X is an appropriate space of rapidly decaying real sequences. We provide an explicit construction for X and T depending only on f and \(x_0\), and we prove that if T has a fixed point, then the solution of Eq. (1) is analytic. The Cauchy–Kovalevskaya theorem follows after proving that if f is analytic, then our construction always produces a map with a fixed point.
We begin by describing the main theorem utilized in our approach which is a constructive version of the Banach fixed point theorem.
Theorem 2
Suppose that X is a Banach space with norm \(\Vert \cdot \Vert _{X}\), \(U\subset X\) is an open subset, and \(T:U \rightarrow X\) is a Fréchet differentiable map. Fix \(\bar{x} \in U\) and let \(r^*>0\) be given such that \(\overline{B_{r^*}(\bar{x})} \subset U\). Let Y be a positive constant satisfying
$$\begin{aligned} \Vert T(\bar{x}) - \bar{x}\Vert _{X} \le Y, \end{aligned}$$
and let \(Z:(0,r^*) \rightarrow [0,\infty )\) be a non-negative function satisfying
$$\begin{aligned} \sup _{x \in \overline{B_{r}(\bar{x})}} \Vert DT(x)\Vert _{X} \le Z(r) \quad \text {for all } r \in (0, r^*), \end{aligned}$$
where DT(x) denotes the Fréchet derivative of T at \(x \in U\) and \(\Vert DT(x)\Vert _{X}\) denotes the operator norm induced by \(\Vert \cdot \Vert _{X}\). We define the radii polynomial, \(p: (0, r^*) \rightarrow \mathbb {R}\), by the formula
$$\begin{aligned} p(r) \,{:=}\, Z(r)r - r + Y. \end{aligned}$$
If there exists \(r_0\in (0,r^*)\) such that \(p(r_0)<0\), then there exists a unique \(x \in \overline{B_{r_0}(\bar{x})}\) so that \(T(x)=x\).
The version presented in Theorem 2 first appeared in [19] where its proof can also be found. Our proof of the Cauchy–Kovalevskaya theorem will follow from applying Theorem 2 in two steps. First, we construct a fixed point problem, which amounts to defining \(X, U, r^*\), and T appropriately, and proving that if our construction has a solution, then Eq. (1) has an analytic solution. Then we construct \(\bar{x}\) and the bounds Y and Z, and prove that we can always find a positive value of r which makes the corresponding radii polynomial negative.
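As a toy numerical illustration outside the proof, the final step can be sketched as a search for a radius at which the radii polynomial is negative. The function below, with names and sample constants of our own choosing, assumes constant bounds Y and Z and the standard form \(p(r) = Zr - r + Y\):

```python
def radii_check(Y, Z, r_star, samples=1000):
    """Search (0, r_star) for a radius r0 with p(r0) = Z*r0 - r0 + Y < 0.

    Here Y bounds the defect ||T(xbar) - xbar|| and Z bounds the operator
    norm of DT on the ball, so p(r0) < 0 certifies a unique fixed point
    in the closed ball of radius r0 about xbar.
    """
    for k in range(1, samples):
        r = r_star * k / samples
        if Z * r - r + Y < 0:
            return r
    return None

# With Z = 0.5 and Y = 0.01, any sampled r above 0.02 is certified.
r0 = radii_check(Y=0.01, Z=0.5, r_star=1.0)
```

When Z is at least 1, or Y is too large relative to \(r^*\), the search fails and no conclusion can be drawn, mirroring the role of the hypothesis \(p(r_0) < 0\) in Theorem 2.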
The remainder of the paper is organized as follows. In Sect. 2, we introduce notation and describe the construction of the fixed point problem in case f is a scalar, i.e. \(n=1\). Then we prove that our construction has a fixed point if f is analytic by applying Theorem 2. In Sect. 3, we generalize the construction to the vector field case. As in the scalar case, we prove that our fixed point problem always has a solution when f is analytic. Finally, we prove that any fixed point of our construction implies the existence of an analytic solution for Eq. (1) which proves the Cauchy–Kovalevskaya theorem for ODEs.
2 The fixed point problem for scalar equations
In this section, we consider Eq. (1) for the case that f is an arbitrary analytic scalar function. Specifically, we assume that \(n = 1\) and for some \(b > 0\), \(f : (x_0 - b,x_0 + b) \rightarrow \mathbb {R}\) is analytic. Therefore, f(x) may be written as a convergent Taylor series of the form
$$\begin{aligned} f(x) = \sum _{k=0}^{\infty } c_k (x - x_0)^k . \end{aligned}$$
We begin by defining some notation and reviewing necessary prerequisites from complex and functional analysis.
2.1 Preliminaries
We will work with the collection of real-valued sequences denoted by
$$\begin{aligned} S \,{:=}\, \left\{ u = \left\{ u_j \right\} _{j \in \mathbb {N}} : u_j \in \mathbb {R}\right\} . \end{aligned}$$
Let \(S^\omega _\nu \subset S\) denote the collection of sequences which are Taylor coefficient sequences of functions in \(C^\omega (\mathbb {D}_\nu )\), where \(\mathbb {D}_\nu = \left\{ z \in \mathbb {C}: \left| z \right| < \nu \right\} \) is the complex disc of radius \(\nu > 0\). Though we are interested specifically in real analytic functions, we are only concerned with the property that a function is represented by a convergent power series. Thus, we do not make a distinction between a real analytic function converging, say, on an interval of radius \(r> 0\), and its continuation to a complex analytic function converging on a complex disc of radius r.
To apply Theorem 2, we require a Banach space in which to work. With this goal in mind, we start by equipping S with an appropriate norm.
Definition 1
Fix a weight, \(\nu > 0\), and define the space of weighted, absolutely summable sequences
$$\begin{aligned} \ell ^1_\nu \,{:=}\, \left\{ u \in S : \sum _{j=0}^{\infty } \left| u_j \right| \nu ^j < \infty \right\} . \end{aligned}$$
This is a normed vector space and we denote the norm of \(u \in \ell ^1_\nu \) by
$$\begin{aligned} \Vert u\Vert _{1, \nu } \,{:=}\, \sum _{j=0}^{\infty } \left| u_j \right| \nu ^j . \end{aligned}$$
We note the obvious inclusions \(\ell ^1_\nu \subset S^\omega _\nu \subset S\), and each inclusion is strict. The following proposition provides a connection between \(S^\omega _\nu \) and \(\ell ^1_\nu \).
Proposition 3
Fix \(\nu > 1\) and suppose \(g \in C^\omega (\mathbb {D}_{\nu })\) with Taylor coefficients given by \(u \in S^\omega _\nu \). Then \(u \in \ell ^1_{\nu '}\) for any \(\nu ' < \nu \). In fact, since \(g^{(k)} \in C^\omega (\mathbb {D}_{\nu })\) for any \(k \in \mathbb {N}\), it follows that the Taylor coefficients of \(g^{(k)}\) lie in \(\ell ^1_{\nu '}\) as well.
The proof can be found in [21]. Roughly speaking, Proposition 3 says we can pass from analytic functions to \(\ell ^1_\nu \) sequences provided we “give up some domain”. This trick is commonly used in rigorous numerical algorithms to obtain bounds on rounding and truncation errors for Taylor series. In our setting, the proposition gives us license to work with sequences in \(\ell ^1_\nu \) as opposed to \(S^\omega _\nu \). The next proposition shows that it suffices to consider the case \(\nu = 1\).
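A quick numerical illustration (ours, not from the paper) of the trade-off in Proposition 3: the coefficients \(u_j = \nu ^{-j}\) represent the function \(1/(1 - z/\nu ) \in C^\omega (\mathbb {D}_\nu )\), and the weighted sum \(\sum _j \left| u_j \right| (\nu ')^j\) is a convergent geometric series precisely when \(\nu ' < \nu \):

```python
def weighted_l1_norm(u, nu):
    """Partial sum of the ell^1_nu norm: sum_j |u_j| * nu**j."""
    return sum(abs(uj) * nu ** j for j, uj in enumerate(u))

nu = 2.0
u = [nu ** (-j) for j in range(200)]   # Taylor coefficients of 1/(1 - z/2)

norm_inside = weighted_l1_norm(u, 1.5)   # nu' = 1.5 < 2: geometric series, sums to 4
norm_at_edge = weighted_l1_norm(u, 2.0)  # nu' = nu = 2: every term equals 1, no convergence
```

Giving up the annulus \(1.5 \le |z| < 2\) of the domain buys membership in \(\ell ^1_{3/2}\); at \(\nu ' = \nu \) the partial sums grow linearly in the truncation length.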
Proposition 4
Suppose \(V \subset \mathbb {R}\) is an open subset and \(f: V \rightarrow \mathbb {R}\). For any \(\tau , \nu > 0\), the initial value problem
has a solution with Taylor coefficients in \(\ell ^1_\tau \) if and only if the initial value problem
has a solution with Taylor coefficients in \(\ell ^1_\nu \).
Proposition 4 says that choosing \(\nu \) is equivalent to rescaling time in Eq. (1). We exploit this equivalence by making an a priori choice for our function space. Specifically, we will work exclusively in the space \(\ell _1^1\), and thus we will omit \(\nu \) from the notation for the remainder of the paper and simply write \(\ell ^1\) in place of \(\ell _1^1\). Similarly, we let \(\mathbb {D}\,{:=}\, \mathbb {D}_1\) denote the complex unit disc and our discussion of analytic functions of a scalar variable will always refer to the set \(C^\omega (\mathbb {D})\). The trade-off for fixing \(\nu = 1\) is that we must work with a modified form of Eq. (1) given by
$$\begin{aligned} \dot{x} = \tau f(x), \qquad x(0) = x_0, \end{aligned}$$
(12)
where \(\tau \) is a time rescaling parameter.
Finally, we note that \(C^\omega (\mathbb {D})\) is closed under point-wise multiplication. This gives rise to a multiplication operation on \(\ell ^1\) called the Cauchy product. Specifically, the Cauchy product of \(u,v \in \ell ^1\) is denoted by \(u*v\) and given explicitly by the formula
$$\begin{aligned} (u*v)_j \,{:=}\, \sum _{i=0}^{j} u_i v_{j-i} . \end{aligned}$$
In fact, Mertens’ theorem implies that the Cauchy product makes \(\ell ^1\) into a Banach algebra. In particular, suppose \(f,g \in C^\omega (\mathbb {D})\) are analytic functions with Taylor coefficients given by \(u,v \in \ell ^1\), and let \(w = u*v\). Then \(w \in \ell ^1\) as well, and the function
$$\begin{aligned} h(t) \,{:=}\, \sum _{j=0}^{\infty } w_j t^j \end{aligned}$$
is well defined and satisfies \(h(t) = f(t) g(t)\) as expected. Since \(\ell ^1\) is closed under products we define finite powers for Cauchy products in the obvious way by
It follows that if \(u \in \ell ^1\), then \(u^k \in \ell ^1\) for any \(k\in \mathbb {N}\). To simplify some formulas involving powers of Cauchy products, we define
for any \(u \in \ell ^1\).
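On truncated sequences, the Cauchy product and its powers can be sketched directly (an illustration of ours; truncation only discards terms from the norm sums, so the Banach algebra inequality \(\Vert u * v\Vert _1 \le \Vert u\Vert _1 \Vert v\Vert _1\) survives for the truncated sums):

```python
def cauchy_product(u, v):
    """Truncated Cauchy product: (u*v)_j = sum_{i=0}^{j} u_i * v_{j-i}."""
    n = min(len(u), len(v))
    return [sum(u[i] * v[j - i] for i in range(j + 1)) for j in range(n)]

def cauchy_power(u, k):
    """k-fold Cauchy product u * u * ... * u for k >= 1."""
    w = u
    for _ in range(k - 1):
        w = cauchy_product(w, u)
    return w

def l1_norm(u):
    return sum(abs(uj) for uj in u)

u = [1.0, -0.5, 0.25, 0.3]
v = [0.2, 0.1, -0.4, 0.0]
w = cauchy_product(u, v)   # coefficients of the product of the two polynomials
```

The coefficient lists stand for polynomials, i.e. sequences with finitely many nonzero terms, so the first `len(u)` product coefficients are exact.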
2.2 Taylor expansion of IVP solutions
To motivate the construction of a fixed point problem, we consider the method of solving Eq. (12) by power series expansion. We begin by considering an ansatz for the solution to Eq. (12) of the form
$$\begin{aligned} x(t) = \sum _{j=0}^{\infty } a_j t^j . \end{aligned}$$
(14)
We want to prove that Eq. (14) defines an analytic function on some open interval containing zero by analyzing the coefficient sequence, \(a(\tau ) \,{:=}\, \left\{ a_j\right\} _{j\in \mathbb {N}} \in S\). By combining Propositions 3 and 4, this is equivalent to proving that \(a(\tau ) \in \ell ^1\) for some choice of \(\tau \).
For the moment, we suppose \(\tau > 0\) is fixed and we suppress the dependence of a on \(\tau \). We formally plug Eq. (14) into Eq. (12) to obtain
$$\begin{aligned} \sum _{j=0}^{\infty } (j+1) a_{j+1} t^j = \tau f \left( \sum _{j=0}^{\infty } a_j t^j \right) . \end{aligned}$$
(15)
Now, we impose \(a_0 = x_0\) to satisfy the initial condition, and define the sequence
$$\begin{aligned} \tilde{a}_j \,{:=}\, {\left\{ \begin{array}{ll} 0 &{} j = 0, \\ a_j &{} j \ge 1, \end{array}\right. } \end{aligned}$$
so the right-hand side of Eq. (15) has the form
$$\begin{aligned} \tau \sum _{j=0}^{\infty } \left( \sum _{k=0}^{\infty } c_k \tilde{a}_j^k \right) t^j , \end{aligned}$$
(16)
where the expressions of the form \(\tilde{a}_j^k\) appearing in Eq. (16), and throughout this work, represent the \(j^\mathrm{th}\) term of the k-fold convolution. Specifically,
as opposed to the \(k^\mathrm{th}\) power of the real number, \(\tilde{a}_j\). This should not lead to confusion as the latter will not appear in this paper.
Now, after matching like powers of Eq. (16) with the left-hand side of Eq. (15), we obtain a recursive formula for the terms in a given by
$$\begin{aligned} a_{j+1} = \frac{\tau }{j+1} \sum _{k=0}^{\infty } c_k \tilde{a}_j^k . \end{aligned}$$
(17)
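To make the recursion concrete, consider the hypothetical scalar example \(f(x) = x^2\) (our choice, not from the text), for which matching powers in \(\dot{x} = \tau x^2\) gives \(a_{j+1} = \tau (a * a)_j / (j+1)\), while the exact solution \(x(t) = x_0/(1 - x_0 \tau t)\) has coefficients \(a_j = x_0^{j+1} \tau ^j\):

```python
def taylor_coeffs_quadratic(x0, tau, N):
    """Recursively compute a_0, ..., a_N for x' = tau * x**2, x(0) = x0,
    via power matching: a_{j+1} = tau * (a*a)_j / (j + 1)."""
    a = [x0]
    for j in range(N):
        conv_j = sum(a[i] * a[j - i] for i in range(j + 1))  # (a*a)_j
        a.append(tau * conv_j / (j + 1))
    return a

a = taylor_coeffs_quadratic(x0=0.5, tau=1.0, N=10)
exact = [0.5 ** (j + 1) * 1.0 ** j for j in range(11)]  # a_j = x0**(j+1) * tau**j
```

The same power-matching pattern underlies the recursion for a general analytic f, with the k-fold convolutions \(\tilde{a}_j^k\) replacing the single convolution \((a*a)_j\).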
2.3 Constructing the fixed point problem
Now, we want to construct appropriate choices for X, U, and T as in Theorem 2. We start with a definition.
Definition 2
For any \(N \in \mathbb {N}\), we define the tail subspace of S to be
$$\begin{aligned} S_{\text {tail}}\,{:=}\, \left\{ u \in S : u_j = 0 \text { for all } 0 \le j \le N \right\} . \end{aligned}$$
Similarly, we define the tail subspace of \(\ell ^1\) by \(X = S_{\text {tail}}\cap \ell ^1\) and we note that X is a closed subspace of \(\ell ^1\). Hence, X is a Banach space under the norm inherited from \( \ell ^1\). We will denote this norm by \(\Vert \cdot \Vert _{X}\) to emphasize when we are working in this subspace.
Now, we define a Banach space to work in by supposing that \(N \in \mathbb {N}\) is fixed and \(S_{\text {tail}}, X\) denote the tail subspaces as defined in Definition 2. Let \(a(\tau )\) denote the sequence satisfying Eq. (17) where now we emphasize the dependence of this sequence on the choice of \(\tau \) explicitly. Let \(\hat{a}(\tau )\) denote the truncation of \(a(\tau )\) embedded into \(\ell ^1\) defined explicitly by
$$\begin{aligned} \hat{a}(\tau )_j \,{:=}\, {\left\{ \begin{array}{ll} a(\tau )_j &{} 0 \le j \le N, \\ 0 &{} j > N. \end{array}\right. } \end{aligned}$$
(19)
Equation (17) leads us to define the \(\tau \)-parameterized family of maps, \(T_\tau : X \rightarrow S_{\text {tail}}\), by the formula
We will show in the next section that \(a(\tau )\) is the unique fixed point of \(T_\tau \). However, we ultimately want to show that \(\hat{a}(\tau ) \in X\) and we note that the map defined in Eq. (20) does not necessarily map back into X as required for Theorem 2. As a consequence, we must first define an appropriate open subset, \(U \subset X\), on which to restrict T.
With this in mind, we note that since f is analytic on the interval \((x_0-b, x_0 + b)\), for any constant \(b_* \in (0, b)\), there exist positive real constants C, \(C^*\), and \(C^{**}\) which satisfy the bounds
This is a simple consequence of Cauchy’s integral formula combined with Proposition 3. A proof can be found in [21]. Next, we note that \(\Vert \hat{a}(\tau )\Vert _{1}\) is monotonically increasing as a function of \(\tau \), and by a simple computation we have the limits
Hence, there exists a unique \(\tau _0\) such that
and therefore, \(\Vert \hat{a}(\tau )\Vert _{1} < b_*\) for all \(0<\tau \le \tau _0\). Define positive constants
and define the open subset
$$\begin{aligned} U \,{:=}\, \left\{ u \in X : \Vert u\Vert _X < \tfrac{1}{2} r^* \right\} . \end{aligned}$$
(26)
Note that the choice of \(b_*\) is not unique. However, for any \(b_* \in (0,b)\), this construction produces an appropriate subset \(U \subset X\).
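The existence of \(\tau _0\) rests only on continuity and monotonicity; numerically, one could locate such a parameter by bisection. The following sketch uses a hypothetical increasing stand-in for \(\tau \mapsto \Vert \hat{a}(\tau )\Vert _1\) (the function `g` and the values of `x0`, `N`, and `b_star` are all our own choices):

```python
def bisect_increasing(g, target, lo=0.0, hi=1.0):
    """Locate tau0 with g(tau0) = target for a continuous, increasing g
    with g(lo) < target; the bracket [lo, hi] grows until it contains tau0."""
    while g(hi) < target:
        hi *= 2.0
    while hi - lo > 1e-12:
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical stand-in for tau -> ||a_hat(tau)||_1: increasing in tau,
# with value |x0| at tau = 0 (our own choice of coefficients).
x0, N, b_star = 0.5, 10, 0.9
g = lambda tau: sum(abs(x0) ** (j + 1) * tau ** j for j in range(N + 1))
tau0 = bisect_increasing(g, b_star)
```

The doubling loop mirrors the limit argument in the text: monotonicity plus divergence as \(\tau \rightarrow \infty \) guarantees the bracket eventually straddles the target value \(b_*\).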
Next, we will prove that the restriction of \(T_\tau \) to U satisfies the requirements of Theorem 2. We start by defining some notation.
Definition 3
Let \(u \in S\) be any real sequence. The pointwise positive sequence associated to u, denoted by \(\left| u \right| \in S\), is the sequence with terms defined by
$$\begin{aligned} \left| u \right| _j \,{:=}\, \left| u_j \right| . \end{aligned}$$
With this notation defined, we have the following lemma.
Lemma 5
Fix \(N \in \mathbb {N}, b_* \in (0, b)\) with corresponding constant \(\tau ^*\) as defined by Eq. (25), and \(U \subset X\) as defined by Eq. (26). Suppose \(\tau \in (0, \tau ^*]\) is fixed, and let \(\hat{a}\) denote the corresponding sequence defined in Eq. (19) where the dependence on \(\tau \) is suppressed. Let T denote the corresponding map defined by Eq. (20). Then
- (i) \(T(U) \subset X\);
- (ii) \(T: U \rightarrow X\) is Fréchet differentiable.
Proof
To prove (i), note that T maps into \(S_{\text {tail}}\) by definition, so it suffices to show that for any \(u \in U\), \(T(u) \in \ell ^1\). By a direct computation, we have
where the second to last line follows from Eq. (24) combined with the bound \(\Vert u\Vert _X < \frac{1}{2}r^*\), and the last line from Eq. (21). Hence, \(T(u) \in \ell ^1\) as required.
Now, we show that T is Fréchet differentiable. Fix \(u \in U\) and define a linear operator, \(A(u): U \rightarrow X\), by its action on \(h \in U\) given by the formula
The claim that A(u) maps U into X follows from a computation similar to the proof of (i) by applying Eq. (22). We want to show that A(u) is the Fréchet derivative of T at \(u \in U\). Let \(h \in U\) be arbitrary such that \(u+h\in U\) as well. By directly applying the formulas for T(u) and A(u), we have
Now, passing to the pointwise positive sequences for \(\hat{a} + u\) and h and summing over \(j \in \mathbb {N}\) we obtain the estimate
where the second to last line follows from Eq. (24) combined with the bounds \(\Vert h\Vert _X < \frac{1}{2}r^*\) and \(\Vert u\Vert _X < \frac{1}{2}r^*\), and the last line follows from Eq. (23). It follows that
which proves that T is Fréchet differentiable. Moreover, since \(0 < \tau \le \tau ^*\) was arbitrary, we have shown that \(T_\tau \) is Fréchet differentiable for the entire family of \(\tau \)-parameterized maps defined by Eq. (20).
Lemma 5 proves that \(T_\tau \) is Fréchet differentiable, and moreover, its derivative is given by the formula in Eq. (27). For the remainder of this work, we let \(DT_\tau (u)\) denote the Fréchet derivative of \(T_\tau \) at \(u \in U\).
2.4 Constructing the bounds
To construct the bounds required for Theorem 2, we begin by defining \(\bar{x} \,{:=}\, 0_{\ell ^1} \in X\), the identically zero sequence. This choice is made independently of N and \(\tau \). We are left with constructing \(r_0\), \(Y_\tau \), and \(Z_\tau : (0, r^*) \rightarrow [0, \infty )\), such that the corresponding radii polynomial satisfies \(p_\tau (r_0) < 0\). Here the \(\tau \) subscript emphasizes that these bounds depend on \(\tau \). The next lemma establishes the required bounds for \(Y_\tau \) and \(Z_\tau \).
Lemma 6
Fix \(N \in \mathbb {N}\) and let \(S_{\text {tail}}\) be the tail subspace of order N. Fix \(b_* \in (0, b)\) with corresponding constants \(C, C^*, r^*\) and \(\tau ^*\) as defined in Eqs. (21), (22), (24), (25), and \(U \subset X = S_{\text {tail}}\cap \ell ^1\) as defined in Eq. (26). For \(\tau \in (0, \tau ^*]\), let \(\hat{a}(\tau )\) denote the truncation defined in Eq. (19), and let \(T_\tau : U \rightarrow X\) denote the parameterized family of maps defined in Eq. (20). Define the constant
and the constant function, \(Z_\tau : (0, r^*) \rightarrow [0, \infty )\), by the formula
Then the following bounds hold:
$$\begin{aligned} \Vert T_\tau (\bar{x}) - \bar{x}\Vert _{X} \le Y_\tau , \end{aligned}$$
(31)
$$\begin{aligned} \sup _{u \in \overline{B_{r}(\bar{x})}} \Vert DT_\tau (u)\Vert _{X} \le Z_\tau (r) \quad \text {for all } r \in (0, r^*). \end{aligned}$$
(32)
Proof
To establish the bound for \(Y_\tau \), we compute
which proves the bound in Eq. (31).
Next, we fix \(0< r < r^*\) and \(u \in \overline{B_r(0)}\), and suppose \(h \in U\) is arbitrary. Then we have the bound
Dividing through by \(\Vert h\Vert _X\), we obtain the operator norm bound
Upon taking the supremum over all \(u \in \overline{B_r(0)}\), we obtain the bound
and finally, we obtain a bound which holds for any \(r \in (0,r^*)\) given by
where the last line follows from Eqs. (22) and (24).
We note that our definition of \(Z_\tau \) in Lemma 6 is, in fact, a constant function with no dependence on r. However, the statement of Theorem 2 allows for Z to depend on r. In practical applications of the radii polynomial approach, bounding higher order derivatives of \(T_\tau \) yields more accurate approximations, and in this case Z does indeed depend on r. In order to highlight the similarity between these practical applications and our proof in the present work, we will continue to consider \(Z_\tau \) as a function defined on the interval \((0, r^*)\), and write \(Z_\tau (r)\) despite the fact that it is constant.
We have now constructed all of the necessary ingredients for Theorem 2, which we apply to prove a precursor to the Cauchy–Kovalevskaya theorem in the scalar case.
Theorem 7
(Cauchy–Kovalevskaya precursor). Suppose \(V \subset \mathbb {R}\) is an open subset and \(f : V \rightarrow \mathbb {R}\) is analytic with a Taylor expansion centered at \(x_0 \in V\) given by the formula
which converges for \(x \in (x_0 - b, x_0 + b)\subseteq V\). For any \(N \in \mathbb {N}\), there exists \(\tau > 0\) such that the map defined by Eq. (20) has a fixed point.
Proof
Let \(S_{\text {tail}}\) be the tail subspace of order N and let \(X = S_{\text {tail}}\cap \ell ^1\). Fix \(b_* \in (0, b)\) with corresponding constants \(r^*\) and \(\tau ^*\) as defined by Eqs. (24), (25), and \(U \subset X\) as defined by Eq. (26). Let \(\hat{a}(\tau ^*)\) denote the truncation defined in Eq. (19), and let \(T_{\tau ^*} : U \rightarrow X\) denote the map defined in Eq. (20). Define the radii polynomial
$$\begin{aligned} p_{\tau ^*}(r) \,{:=}\, Z_{\tau ^*}(r) r - r + Y_{\tau ^*}, \end{aligned}$$
where \(Y_{\tau ^*}\) and \(Z_{\tau ^*}\) are the norm bounds for \(T_{\tau ^*}\) and \(DT_{\tau ^*}\) proved in Lemma 6. Applying the formulas for \(Y_{\tau ^*}, Z_{\tau ^*}\), we obtain the bound
for all \(r \in (0,r^*)\).
Define \(r_0 \,{:=}\, \frac{N}{N+1}r^* \in (0, r^*)\); then we obtain the bound
By Theorem 2, we conclude that \(T_{\tau ^*}\) has a fixed point in U.
Note that Theorem 7 implies the Cauchy–Kovalevskaya theorem once we show, in addition, that fixed points of our construction correspond to analytic solutions of Eq. (1), which we prove in the next section.
3 The Cauchy–Kovalevskaya theorem for analytic vector fields
We begin by extending the construction in Sect. 2 to the case in which f is a vector field. The main technical results are already handled in the scalar case and much of the work here amounts to setting up appropriate notation so that the previous fixed point problem is meaningful. Once this is accomplished, our proof of the Cauchy–Kovalevskaya theorem follows by first proving that fixed points of our construction imply the existence of analytic solutions of Eq. (1), and then proving a general version of Theorem 7 for analytic vector fields. We begin by recalling the definition of analyticity for vector fields.
Definition 4
Let \(V \subset \mathbb {R}^n\) be an open subset and suppose \(g: V \rightarrow \mathbb {R}\) is a scalar function of the n variables, \(\left\{ x_1,\dotsc ,x_n\right\} \), which we write as components of a vector, \(x \in \mathbb {R}^n\). To avoid confusion over the meaning of indices we will index the components of a vector with superscripts by writing \(x = \left( x^{(1)}, \dotsc , x^{(n)}\right) \). Then g is analytic if for every \(x = \left( x^{(1)}, \dotsc , x^{(n)}\right) \in V\), and for each \(1 \le j \le n\), there exists an open neighborhood, \(V_{x,j} \subset \mathbb {R}\), containing \(x^{(j)}\) such that the formula
$$\begin{aligned} s \mapsto g \left( x^{(1)}, \dotsc , x^{(j-1)}, s, x^{(j+1)}, \dotsc , x^{(n)} \right) , \qquad s \in V_{x,j}, \end{aligned}$$
defines an analytic function.
This definition generalizes to vector fields as follows. Suppose \(g: V \rightarrow \mathbb {R}^n\) is a vector field which we write as a vector of component functions, \(g(x) = \left( g^{(1)}(x), \dotsc , g^{(n)}(x)\right) \in \mathbb {R}^n\). Then we define g to be analytic if for each \(1 \le i \le n\), the component function, \(g^{(i)} : V \rightarrow \mathbb {R}\), is analytic.
In this setting, the analog of Eq. (12) is the initial value problem
$$\begin{aligned} \dot{x} = \tau f(x), \qquad x(0) = x_0, \end{aligned}$$
(35)
where \(V \subset \mathbb {R}^n\) is an open subset, \(f: V \rightarrow \mathbb {R}^n\) is an analytic vector field, and \(\tau > 0\) is a time rescaling parameter. The solution of Eq. (35) is a function, \(x : J(x_0) \rightarrow \mathbb {R}^n\), defined on some open interval, \(J(x_0) \subset \mathbb {R}\), containing 0, which parameterizes a trajectory of the ODE initially passing through the point \(x_0\) at time \(t = 0\). Our goal is to prove that if f is analytic, then for each \(x_0 \in V\), this interval can be chosen so that \(x: J(x_0) \rightarrow \mathbb {R}^n\) defines an analytic curve.
We will construct a fixed point problem similar to the scalar case, but described at a higher level of abstraction, for which the construction in Sect. 2 is a special case. Next, we introduce a Banach space to work in and define some additional notation.
3.1 Products of sequence spaces
We start by generalizing the sequence spaces introduced for scalar functions in Sect. 2.1 to the vector field setting. We consider coefficient sequences in the product
$$\begin{aligned} S^n \,{:=}\, \underbrace{S \times \dotsb \times S}_{n \text { times}} . \end{aligned}$$
For arbitrary \(u \in S^n\), we write \(u = \left( u^{(1)}, \dotsc , u^{(n)}\right) \) with \(u^{(i)} \in S\) for \(1 \le i \le n\). If \(g : \mathbb {D}\rightarrow \mathbb {R}^n\) is an analytic curve, then g is defined by a convergent Taylor series of the form
Hence, g is naturally identified with an element, \(u \in S^n\), where \(u^{(i)} \in S\) is the sequence of Taylor coefficients for the analytic scalar function, \(g^{(i)} : \mathbb {D}\rightarrow \mathbb {R}\).
Often, it is advantageous to consider an alternative description of \(S^n\) in which we define elements of \(S^n\) as sequences of vectors in \(\mathbb {R}^n\). Specifically, we have the following equivalent characterization:
$$\begin{aligned} S^n = \left\{ u = \left\{ u_j \right\} _{j \in \mathbb {N}} : u_j \in \mathbb {R}^n \right\} . \end{aligned}$$
In this case, the equivalent expression for Eq. (37) can be written as
For arbitrary \(u \in S^n\) we write \(u^{(i)} \in S\) to express the \(i^\mathrm{th}\) component sequence, and we write \(u_j \in \mathbb {R}^n\) to denote the \(j^\mathrm{th}\) term when we consider u to be an infinite sequence of real vectors.
Following the radii polynomial approach and the constructions in Sect. 2, we want to work in a Banach space of absolutely summable sequences. The appropriate space for representing analytic curves would be a product of the form \(\ell ^1_{\nu _1} \times \ell ^1_{\nu _2} \times \dotsb \times \ell ^1_{\nu _n}\). By an easy generalization of Proposition 4, we can take \(\nu _i = 1\) for \(1 \le i \le n\). With this in mind, we define the product
$$\begin{aligned} (\ell ^1)^n \,{:=}\, \underbrace{\ell ^1 \times \dotsb \times \ell ^1}_{n \text { times}} , \end{aligned}$$
where we note the inclusion, \((\ell ^1)^n \subset S^n\). We equip \((\ell ^1)^n\) with the norm defined by
which makes \((\ell ^1)^n\) into a Banach space. Before continuing to the construction of the fixed point operator, we introduce notation to connect analytic functions and their Taylor coefficient sequences.
Definition 5
Let \(C^\omega (\mathbb {D}, \mathbb {R}^n)\) denote the space of parameterized curves which are analytic on \(\mathbb {D}\). The Taylor coefficient map, \(\mathcal {T}: C^\omega (\mathbb {D}, \mathbb {R}^n) \rightarrow S^n\), is the linear operator which maps an analytic function to its sequence of Taylor coefficients. Specifically, \(u = \mathcal {T}g \in S^n\) is the sequence defined by the formula
$$\begin{aligned} u_j \,{:=}\, \frac{1}{j!} \frac{d^j g}{dt^j}(0) \in \mathbb {R}^n . \end{aligned}$$
We define the “inverse” Taylor coefficient map by the formula
$$\begin{aligned} \left( \mathcal {T}^{-1} u \right) (t) \,{:=}\, \sum _{j=0}^{\infty } u_j t^j , \end{aligned}$$
where we note that strictly speaking, \(\mathcal {T}^{-1}\) is not a true inverse since \(\mathcal {T}^{-1} u\) does not generally define an analytic function. Nevertheless, \(\mathcal {T}^{-1} u\) is well defined as a formal power series and as we make no assumption about its convergence this notation should not present any ambiguity.
Now, we have all of the necessary ingredients to describe the construction of the fixed point operator.
3.2 Constructing the fixed point problem
Our first goal is to construct a fixed point problem to which we will apply Theorem 2. We start by noting that Eq. (35) has a unique smooth solution, \(x: J(x_0) \rightarrow \mathbb {R}^n\), which follows from the same bootstrap argument as in the scalar case. Therefore, the sequence \(\mathcal {T}(x) \in S^n\) is well defined.
Following the radii polynomial approach, we want to identify a fixed point problem which has a solution if and only if, for some \(\tau > 0\), the coefficient sequence \(a(\tau )\) defined below in Eq. (42) belongs to \((\ell ^1)^n\). Next, we extend Definition 2 to \(S^n\).
Definition 6
For a fixed \(N \in \mathbb {N}\), we define the tail subspace of order N to be
$$\begin{aligned} S_{\text {tail}}^n \,{:=}\, \left\{ u \in S^n : u_j = 0 \in \mathbb {R}^n \text { for all } 0 \le j \le N \right\} . \end{aligned}$$
We let \(X \,{:=}\, S_{\text {tail}}^n \cap (\ell ^1)^n\) denote the space of absolutely summable tails. Note that X is a closed subspace of \((\ell ^1)^n\) which makes X into a Banach space under the norm inherited from \((\ell ^1)^n\) and we denote this norm by \(\Vert \cdot \Vert _X\).
Our fixed point problem will be formulated on the Banach space, X, given in Definition 6. Specifically, we describe a parameterized family of maps, \(T_\tau : X \rightarrow S^n_{\text {tail}}\), whose fixed points characterize the solutions of Eq. (35). Our construction for \(T_\tau \) in the general case is expressed as a composition of maps defined on \(S^n\), which simplifies its analysis. We begin by defining a functional analytic extension of a smooth function defined on \(\mathbb {R}^n\) to a corresponding induced map on \(S^n\).
Definition 7
Let g be a formal power series in the variables \(\left\{ x^{(1)}, \dotsc , x^{(n)}\right\} \) defined with multi-indices by the formula
$$\begin{aligned} g(x) = \sum _{\alpha \in \mathbb {N}^n} c_\alpha x^\alpha , \qquad x^\alpha \,{:=}\, \left( x^{(1)}\right) ^{\alpha _1} \dotsb \left( x^{(n)}\right) ^{\alpha _n} . \end{aligned}$$
Formally, \(g : \mathbb {R}^n \rightarrow \mathbb {R}\) defines a scalar valued function on \(\mathbb {R}^n\), and we note that evaluation of g only requires evaluating sums and products. Hence, g induces a map, \(\phi _g : S^n \rightarrow S\), defined by the formula
$$\begin{aligned} \phi _g(u) \,{:=}\, \sum _{\alpha \in \mathbb {N}^n} c_\alpha \left( u^{(1)}\right) ^{\alpha _1} * \dotsb * \left( u^{(n)}\right) ^{\alpha _n} , \end{aligned}$$
where the powers are Cauchy powers.
We refer to this induced map as the S-extension of g. This generalizes to vector fields in the obvious way. If \(g(x) = \left( g^{(1)}(x), \dotsc , g^{(n)}(x) \right) \) is a vector field where, for \(1 \le i \le n\), \(g^{(i)}(x)\) is given by a power series, then the \(S^n\)-extension of g, denoted by \(\phi _g : S^n \rightarrow S^n\), is defined by the formula
$$\begin{aligned} \phi _g(u) \,{:=}\, \left( \phi _{g^{(1)}}(u), \dotsc , \phi _{g^{(n)}}(u) \right) . \end{aligned}$$
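For a polynomial g, the S-extension can be sketched by replacing each real multiplication with a Cauchy product of component sequences. A minimal example of ours for the hypothetical scalar function \(g(x, y) = xy\) (names and truncation length are our own):

```python
def cauchy_product(u, v):
    """Truncated Cauchy product of two coefficient sequences."""
    n = min(len(u), len(v))
    return [sum(u[i] * v[j - i] for i in range(j + 1)) for j in range(n)]

def phi_g(u):
    """S-extension of g(x, y) = x*y: real multiplication is replaced
    by the Cauchy product of the two component sequences."""
    u1, u2 = u
    return cauchy_product(u1, u2)

# Taylor coefficients of the curve t |-> (1 + t, t); composing with g
# gives (1 + t)*t = t + t^2, so the output coefficients are [0, 1, 1].
u = ([1.0, 1.0, 0.0], [0.0, 1.0, 0.0])
w = phi_g(u)
```

The test curve is arbitrary; the point is only that \(\phi _g\) applied to the Taylor coefficients of a curve yields the Taylor coefficients of g composed with that curve.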
Next, we define two operators on \(S^n\) which are important for our fixed point construction.
Definition 8
The integration map, denoted by \(I: S^n \rightarrow S^n\), is the function whose action on \(u \in S^n\) is defined by
$$\begin{aligned} \left( I u \right) _j \,{:=}\, {\left\{ \begin{array}{ll} 0 \in \mathbb {R}^n &{} j = 0, \\ \frac{1}{j} u_{j-1} &{} j \ge 1, \end{array}\right. } \end{aligned}$$
so that \(\mathcal {T}^{-1}(Iu)\) is the term-by-term antiderivative of \(\mathcal {T}^{-1}u\) vanishing at zero.
Definition 9
For any \(N \in \mathbb {N}\), let \(S_{\text {tail}}^n\) denote the corresponding tail subspace of order N. We define the tail projection map, \(\pi _N : S^n \rightarrow S_{\text {tail}}^n\), by its action on \(u \in S^n\) given by the formula
$$\begin{aligned} \left( \pi _N u \right) _j \,{:=}\, {\left\{ \begin{array}{ll} 0 \in \mathbb {R}^n &{} 0 \le j \le N, \\ u_j &{} j > N. \end{array}\right. } \end{aligned}$$
Note that restricting \(\pi _N\) to \((\ell ^1)^n\) yields an induced map, \(\pi _N : (\ell ^1)^n \rightarrow X\).
Now, we describe the fixed point problem construction for vector fields. Let \(\tilde{x}_0\) denote the embedding of \(x_0\) into \((\ell ^1)^n\) defined by
Suppose \(\tau > 0\) and define the parameterized sequence \(a(\tau ) \in S^n\) by the formula
Fix \(N \in \mathbb {N}\), and define the truncation
and the parameterized family of maps, \(T_\tau : X \rightarrow S_{\text {tail}}^n\), by the formula
Note that the construction in Sect. 2 is a special case of this map when \(n = 1\). Expressing \(T_\tau \) as a composition of operators makes it easy to provide an explicit formula for \(T_\tau \). However, it is no longer obvious that the Taylor coefficients of our IVP solution must be a fixed point of \(T_\tau \). The next lemma proves this is the case.
Lemma 8
Fix \(N \in \mathbb {N}\), let \(T_\tau : X \rightarrow S^n\) be the map defined by Eq. (44) and suppose that for some \(\tau > 0\), \(T_\tau \) has a fixed point. Then Eq. (35) has a unique solution which is analytic on the open interval \((-1,1)\).
Proof
Let \(a(\tau )\) denote the sequence defined by Eq. (42). By construction, if u is any fixed point of \(T_\tau \), then \(u + \hat{a}(\tau ) + \tilde{x}_0\) satisfies the recursive formula in Eq. (42). It follows that \(u = a(\tau )_{\text {tail}}\), since the recursion in Eq. (42) is completely determined by the choice of \(\tau \) and \(x_0\). Therefore, \(a(\tau )_{\text {tail}}\in X\) is the unique fixed point of \(T_\tau \). Observe that \(\mathcal {T}^{-1} (a(\tau ))\) defines an analytic function on \((-1,1)\) given by the formula
Since f is analytic, it has a convergent power series expansion centered at \(x_0\) of the form
Substituting x into \(\tau f\), we obtain the formula
where we have used the fact that \(a(\tau )_0 = x_0\) by definition. By applying \(\mathcal {T}\) to the right-hand side of (45) and expressing it in terms of the \(\phi \) operator, we obtain the formula
On the other hand, we can differentiate Eq. (45) term by term to obtain the formula
It follows from Eq. (42) that
proving that x satisfies Eq. (35).
The last ingredient in our fixed point problem is to define an appropriate open subset, \(U \subset X\), on which we will apply Theorem 2. If \(f : V \rightarrow \mathbb {R}^n\) is analytic and \(x_0 \in V\), then each component of f can be defined by power series converging (at least) for all
where \(b_i > 0\) for \(1 \le i \le n\). We define \(b_0 \,{:=}\, \min \left\{ b_i : 1 \le i \le n\right\} \), and note that for \(1 \le i \le n\), each component, \(f^{(i)} : V \rightarrow \mathbb {R}\), defines an analytic function. Hence, \(f^{(i)}\) has a power series centered at \(x_0\) of the form
converging at least for \(x \in (-b_0, b_0)^n\). We also note the following multi-variable analog of Eqs. (21), (22), and (23). For any \(b_* < b_0\), there exist positive constants \(C_i, C_i^*\) and \(C_i^{**}\), possibly depending on \(b_{*}\), satisfying the bounds
The proof follows immediately from Proposition 3 and the multivariate Cauchy integral formula, which can be found in [21]. We let \(C, C^*\), and \(C^{**}\) denote the maximum values of these constants taken over \(1 \le i \le n\). Then we have the bounds
which hold for all \(1 \le i \le n\). We apply these bounds to define an appropriate subset, \(U \subset X\), on which to restrict \(T_\tau \) which is similar to the scalar case. Note that \(\Vert \hat{a}(\tau )\Vert _\infty \) is monotonically increasing as a function of \(\tau \) since each component has this property. Moreover, we have the limits
and we define \(\tau _0 > 0\) to be the unique real number satisfying \(\Vert \hat{a}(2\tau _0)\Vert _\infty = b_*\). As in the scalar case, we define the following:
and the open subset
This completes the construction of the fixed point problem for the vector field case. Next, we have a generalization of Lemma 5 to vector fields.
Lemma 9
Fix \(N \in \mathbb {N}\) and \(b_* \in (0, b_0)\), with corresponding constants \(r^*\) and \(\tau ^*\) as defined by Eqs. (49), (50), and \(U \subset X\) as defined by Eq. (51). Let \(\hat{a}(\tau )\) denote the truncation defined in Eq. (43), and let \(T_\tau \) denote the map defined by Eq. (44). Then, for all \(\tau \in (0, \tau ^*]\), the following statements hold:
-
(i)
\(T_\tau (U) \subset X\).
-
(ii)
\(T_\tau : U \rightarrow X\) is Fréchet differentiable. In particular, the action of \(DT_\tau (u)\) on \(h = \left( h^{(1)}, \dotsc , h^{(n)}\right) \in X\) is given by the formula
$$\begin{aligned} \left( DT_\tau (u) h\right) ^{(i)} = \sum _{m = 1}^{n} \left( \tau \pi _N \circ I \circ \phi _{\nabla f^{(i)}}(\hat{a}(\tau ) + u)\right) ^{(m)}*h^{(m)}, \end{aligned}$$where \(\nabla f^{(i)}(x) = \left( \frac{\partial f^{(i)}}{\partial x_1},\frac{\partial f^{(i)}}{\partial x_2},\cdots , \frac{\partial f^{(i)}}{\partial x_n}\right) \) denotes the gradient vector of \(f^{(i)}\).
The proof is an easy generalization of the proof of Lemma 5, where the bound in Eq. (48) is now applied to control all second and higher order partial derivatives of f. We note that the formula for \(DT_\tau (u)\) is nothing more than the operator obtained by applying the \(S^n\)-extension map to each component of the Jacobian matrix of f.
3.3 Constructing the bounds
Now, we construct the bounds required for applying Theorem 2. Similar to the scalar case, we choose \(\bar{x} = \left( 0_{\mathbb {R}^n},0_{\mathbb {R}^n},0_{\mathbb {R}^n},\dotsc \right) \in (\ell ^1)^n\). The necessary bounds are provided by the following generalization of Lemma 6.
Lemma 10
Fix \(N \in \mathbb {N}\) and \(b_* \in (0, b_0)\) with corresponding constants \(r^*, \tau ^*\) as defined by Eqs. (49) and (50), and \(U \subset X\) as defined by Eq. (51). Let \(\hat{a}(\tau )\) denote the truncation defined in Eq. (43), and let \(T_\tau : U \rightarrow X\) denote the parameterized family of maps defined in Eq. (44). For \(\tau \in (0, \tau ^*]\), define the constant
and the constant function, \(Z_\tau : (0, r^*) \rightarrow [0, \infty )\), by the formula
Then the following bounds hold:
The proof is similar to the proof of Lemma 6 with Eqs. (46), (47) providing the necessary bounds in this case.
3.4 The constructive proof of the Cauchy–Kovalevskaya theorem
At last, we have all the ingredients necessary to give a constructive proof of the Cauchy–Kovalevskaya theorem.
Theorem 11
(Cauchy–Kovalevskaya Theorem). Suppose \(V \subset \mathbb {R}^n\) is an open subset, \(f : V \rightarrow \mathbb {R}^n\) is analytic, and \(x_0 \in V\). Then the initial value problem
has a unique analytic solution.
Proof
Fix \(N \in \mathbb {N}\), let \(S_{\text {tail}}^n\) be the tail subspace of order N, and let \(X = S_{\text {tail}}^n \cap (\ell ^1)^n\). Fix \(b_* \in (0, b_0)\) with corresponding constants \(r^*, \tau ^*\) as defined by Eqs. (49) and (50), and \(U \subset X\) as defined by Eq. (51).
We will consider the radii polynomial obtained from the bounds in Lemma 10 for the parameter value \(\tau = \tau ^*\). In particular, let \(\hat{a} \,{:=}\, \hat{a}(\tau ^*)\) denote the truncation defined in Eq. (43), let \(T_{\tau ^*} : U \rightarrow X\) denote the map defined in Eq. (44), and define the radii polynomial
where \(Y_{\tau ^*}\) and \(Z_{\tau ^*}(r)\) are the norm bounds for \(T_{\tau ^*}\) and \(DT_{\tau ^*}\) proved in Lemma 10. We define \(r_0 \,{:=}\, \frac{Nr^*}{N+1} \in (0, r^*)\), and a direct computation similar to the proof of Theorem 7 shows that \(p(r_0)<0\). It follows from Theorem 2 that \(T_{\tau ^*}\) has a unique fixed point. By Lemma 8, this fixed point is the tail of an analytic solution to Eq. (35). By Proposition 4, this sequence is, in fact, a rescaled coefficient sequence for an analytic solution of Eq. (56), which completes the proof.
3.5 An example
The goal of this work is not to present a practical algorithm for verifying that any particular initial value problem has an analytic solution. Nevertheless, it may be instructive to demonstrate a constructive proof for an example, especially considering that the approach is inspired by rigorous numerical algorithms which do have this exact goal in mind.
Therefore, we conclude this paper by presenting an example of the constructive proof for a toy problem. We have intentionally chosen a rather simple example in an effort to focus on the proof itself. Additionally, the bounds chosen to demonstrate the proof in this example are intended to make the computations easy to follow rather than minimizing the approximation error as one would probably do in practice.
Example 1
Define the function \(f : \mathbb {R}\rightarrow \mathbb {R}\) by the formula \(f(x) = x(1-x)\) and consider the scalar initial value problem
In this example, f is polynomial and therefore analytic. Hence, the Cauchy–Kovalevskaya theorem implies that Eq. (57) has a unique analytic solution (in fact, the exact solution is well known to be \(x(t) = (1 + \exp (-\tau t))^{-1}\)). We will prove this following the constructive approach described in this paper.
We begin by rewriting f centered at \(x_0\) as \(f(x) = \frac{1}{4} - (x - \frac{1}{2})^2\). So the coefficients for f are \(c_0 = \frac{1}{4}\), \(c_2 = -1\), and \(c_j = 0\) for all \(j \ne 0,2\). Since f is polynomial we have \(b = \infty \) and, therefore, we can choose \(b_*\) arbitrarily.
For this example, we let \(b_* = \frac{1}{2}\) and we take \(N = 5\). Applying the formula in Eq. (17), we obtain the first N coefficients which are
Therefore, \(\hat{a}(\tau )\) is the sequence
which is in \(\ell ^1\) for all finite \(\tau \). Next, we define \(\tau _0\) as the solution to the equation \(\Vert \hat{a}(2 \tau _0)\Vert _1 = b_*\). For this example, this amounts to solving \(\tau _0 + \frac{1}{3} \tau _0^3 - 1 = 0\). As expected, this equation has a unique real solution which has the exact value
Following the definition in Eq. (24), we find, after a bit of algebra, that \(r^* = b_* - \Vert \hat{a}(\tau _0)\Vert _1\) is the unique real root of the cubic polynomial \(4096z^3 - 6912z^2 + 5088z - 1029\). The exact value is given by
Next, we define \(C = 1, C^* = 2\) and observe that
implying C and \(C^*\) satisfy the bounds required by Eqs. (21) and (22) respectively. Consequently, for this choice of \(b_*, N, C\) and \(C^*\), we have that
and, therefore, we set \(\tau ^* = \tau _0\) as defined in Eq. (25).
Continuing with the construction, we compute \(Y_{\tau ^*}\) and \(Z_{\tau ^*}\) according to the formulas defined in Lemma 6. For this example, we obtain the bounds
As expected, the radii polynomial, \(p : (0, r^*) \rightarrow \mathbb {R}\), is given by the formula
which is linear in r. The conclusion of Theorem 7 is that if p(r) is negative for some \(r \in (0, r^*)\), then \(T_{\tau ^*}\) must have a fixed point and consequently, Eq. (57) has an analytic solution. As in the proof of Theorem 7, we choose
and indeed we find that \(p(r_0) \approx -0.0499\) which completes the proof for this example.
References
Ebert, M.R., Reissig, M.: Basics for Partial Differential Equations. Springer, New York (2018)
Day, S., Lessard, J.-P., Mischaikow, K.: Validated continuation for equilibria of PDEs. SIAM J. Numer. Anal. 45(4), 1398–1424 (2007) (electronic)
Yamamoto, N.: A numerical verification method for solutions of boundary value problems with local uniqueness by Banach’s fixed-point theorem. SIAM J. Numer. Anal. 35(5), 2004–2013 (1998) (electronic)
van den Berg, J.B., Breden, M., Lessard, J.-P., van Veen, L.: Spontaneous periodic orbits in the Navier–Stokes flow (2019). https://arxiv.org/abs/1902.00384
Kepley, S., Mireles James, J.D.: Chaotic motions in the restricted four body problem via Devaney’s saddle-focus homoclinic tangle theorem. J. Differ. Equ. 266(4), 1709–1755 (2015)
van den Berg, J.B., Deschênes, A., Lessard, J.-P., Mireles James, J.D.: Stationary coexistence of hexagons and rolls via rigorous computations. SIAM J. Appl. Dyn. Syst. 14(2), 942–979 (2015)
van den Berg, J.B., Jaquette, J.: A proof of Wright’s conjecture. J. Differ. Equ. 264(12), 7412–7462 (2018)
Gameiro, M., Lessard, J.-P.: Analytic estimates and rigorous continuation for equilibria of higher-dimensional PDEs. J. Differ. Equ. 249(9), 2237–2268 (2010)
van den Berg, J.B., Queirolo, E.: A general framework for validated continuation of periodic orbits in systems of polynomial ODEs. J. Comput. Dyn. 8(1), 59–97 (2021). https://doi.org/10.3934/jcd.2021004
Murray, M., Mireles James, J.D.: Chebyshev–Taylor parameterization of stable/unstable manifolds for periodic orbits: implementation and applications. Int. J. Bifurc. Chaos 27(14), 1–32 (2017)
Lessard, J.-P., Reinhardt, C.: Rigorous numerics for nonlinear differential equations using Chebyshev series. SIAM J. Numer. Anal. 52(1), 1–22 (2014)
Mireles James, J.D., Mischaikow, K.: Rigorous a-posteriori computation of (un)stable manifolds and connecting orbits for analytic maps. SIAM J. Appl. Dyn. Syst. 12(2), 957–1006 (2013)
Kalies, W.D., Kepley, S., Mireles James, J.D.: Analytic continuation of local (un)stable manifolds with rigorous computer assisted error bounds. SIAM J. Appl. Dyn. Syst. 17(1), 157–202 (2018)
van den Berg, J.B., Mireles James, J.D., Reinhard, C.: Computing (un)stable manifolds with validated error bounds: non-resonant and resonant spectra. J. Nonlinear Sci. 26, 1055–1095 (2016)
van den Berg, J.B., Mireles James, J.D.: Parameterization of slow-stable manifolds and their invariant vector bundles: theory and numerical implementation. Discrete Contin. Dyn. Syst. Ser. A 36(9), 4637–4664 (2016)
Gonzalez, J.L., Mireles James, J.D.: High-order parameterization of stable/unstable manifolds for long periodic orbits of maps. SIAM J. Appl. Dyn. Syst. 16(3), 1748–1795 (2017)
van den Berg, J.B., Lessard, J.-P.: Rigorous numerics in dynamics. Not. Am. Math. Soc. 62(9), 1057–1061 (2015)
van den Berg, J.B., Mireles James, J.D., Lessard, J.-P., Wanner, T., Day, S., Mischaikow, K.: Rigorous Numerics in Dynamics. American Mathematical Society, Providence (2018)
Mireles James, J.D.: Validated numerics for equilibria of analytic vector fields: invariant manifolds and connecting orbits. Rigorous Numerics in Dynamics. Proceedings of Symposia in Applied Mathematics, vol 74. pp. 80 (2018). https://doi.org/10.1090/psapm/074/02
Arioli, G., Koch, H.: Existence and stability of traveling pulse solutions of the FitzHugh–Nagumo equation. Nonlinear Anal. 113, 51–70 (2015)
Scheidemann, V.: Introduction to Complex Analysis in Several Variables. Birkhäuser, Basel (2005)
Acknowledgements
The authors wish to thank Konstantin Mischaikow for helpful discussions. S.K. was partially supported by NSF grant 1839294, by NIH-1R01GM126555-01 as part of the Joint DMS/NIGMS Initiative to Support Research at the Interface of the Biological and Mathematical Science and DARPA contract HR001117S0003-SD2-FP-011. T.Z. was partially supported by the Training Program for Top Students in Mathematics from Zhejiang University.
Kepley, S., Zhang, T. A constructive proof of the Cauchy–Kovalevskaya theorem for ordinary differential equations. J. Fixed Point Theory Appl. 23, 7 (2021). https://doi.org/10.1007/s11784-020-00841-1