Abstract
We study stochastic model reduction for evolution equations in infinite-dimensional Hilbert spaces and show the convergence to the reduced equations via abstract results of Wong–Zakai type for stochastic equations driven by a scaled Ornstein–Uhlenbeck process. Both weak and strong convergence are investigated, depending on the presence of quadratic interactions between reduced variables and driving noise. Finally, we are able to apply our results to a class of equations used in climate modeling.
1 Introduction
In this paper we study stochastic model reduction for a system of nonlinear evolution equations in infinite-dimensional Hilbert spaces which is general enough to cover well-established systems of equations used in climate modeling. The main advantage of such a procedure is the lower complexity of the reduced equations; complexity remains one of the major issues when predicting the evolution of systems over time spans which are typical for climate rather than meteorology.
Following [9, 17], we assume that the climate variables of the system, i.e., those more relevant to climate prediction, evolve on longer time scales than the unresolved variables, which can be modeled stochastically and have a typical time scale much shorter than the climate variables. To be able to close the equation for the climate variables, the task is to understand the effects of unresolved variables when stretching time to climate time. In what follows, we also refer to climate variables as resolved variables.
Climate modeling typically starts with equations containing quadratic nonlinearities which can describe many features of oceanic and atmospheric dynamics at meteorological time—see [18, 25]. In abstract mathematical terms, such equations would look like
where \(A:H \rightarrow H\) is a linear operator, \(B:H\times H \rightarrow H\) is a bilinear operator, and f is an external forcing term. Here, the variable Z taking values in H is supposed to be a complex mix of climate and unresolved variables, and hence, the space H has to be ‘big enough’ to ‘host’ variables of that type. We therefore choose H to be a separable infinite-dimensional Hilbert space.
Now, there is a variety of procedures to identify climate variables in practice which we will not discuss in this paper. We rather assume that climate variables have been identified spanning a Hilbert subspace \(H_d\subset H\), and we further assume that the orthogonal complement \(H_\infty ,\,H = H_d \oplus H_\infty \), gives the space of unresolved variables. When projecting Z onto \(H_d\), \(H_\infty \) via the projection maps \(\pi _d\), \(\pi _\infty \), Eq. (1) gives rise to two equations
and
for the collection of climate variables \(X=\pi _d(Z)\) and unresolved variables \(Y=\pi _\infty (Z)\), respectively.
The next step, called stochastic climate modeling, consists in replacing the complicated nonlinear self-interaction term in (3) by a linear random term. Such a replacement could be justified by the assumption that quickly varying fluctuations of small-scale unresolved variables are more or less indistinguishable from the combined effect of a large number of weakly coupled factors, usually leading to Gaussian driving forces via the central limit theorem. But such effects would only become visible at climate time and not at the meteorological time used in (2) and (3), so that we look to replace \(B^2_{22}(Y_{\varepsilon ^{-1}t},Y_{\varepsilon ^{-1}t})\) by a linear random term, stretching meteorological time to \(\varepsilon ^{-1}t\), using a small parameter \(\varepsilon \ll 1\).
In this work, following [17, 22], we suppose that
where \(\mu ,\sigma \) are positive constants, and \({\dot{W}}\) is Gaussian noise, white in time, and colored in space. This way, the parameter \(\varepsilon \) is used to scale time, but also to adjust for the size of the involved variables when scaling time.
Another assumption made in [17] is that climate variables at climate time have small forcing and self-interaction, and hence, we also suppose that
avoiding so-called fast forcing and fast waves.
All in all, when introducing the notation \(X^\varepsilon _t = X_{\varepsilon ^{-1}t}\) for climate variables at climate time, and \(Y^\varepsilon _t = \varepsilon ^{-1} Y_{\varepsilon ^{-1}t}\) for the effect of unresolved variables at climate time, Eqs. (2) and (3) translate into
where we have set \(\mu =\sigma =1\) for the sake of simplicity.
The hope is now that, when \(\varepsilon \) tends to zero, climate variables at climate time can be approximated by a random variable \({\bar{X}}\) which solves a closed stochastic equation with new coefficients not depending on unresolved variables any more. Of course, these new coefficients will be functions of the coefficients of Eqs. (4) and (5), and the process of finding these new coefficients is called stochastic model reduction.
Stochastic model reduction of finite-dimensional systems similar to (4), (5) was extensively discussed in [17]. However, one of the key steps, i.e., proving the convergence \(X^\varepsilon \rightarrow {\bar{X}},\,\varepsilon \downarrow 0\), was kept rather short. Indeed, the authors first sketch a perturbation method based on a theorem by T.G. Kurtz, [16], which is their general method, and they then briefly describe a so-called direct averaging method for special cases based on limits of solutions to stochastic differential equations. In particular, the latter method lacks a certain amount of rigor because the convergence of the involved stochastic processes is not shown, and this gap has not been closed in follow-up papers—see [6, 7, 13] for example.
In this paper we not only close this gap, but also develop a new method of proof.
We first identify the limit process \({\bar{X}}\), and then study the convergence \(X^\varepsilon \rightarrow {\bar{X}}\) as \(\varepsilon \downarrow 0\), when \(X^\varepsilon \) solves a general evolution equation of type
where \(Y^\varepsilon \) is a decoupled infinite-dimensional Ornstein–Uhlenbeck process satisfying
Since Eq. (6) is more general than (4), once stochastic model reduction is established for the system (6), (7) with decoupled unresolved variables, it also follows for an interesting subclass of systems of type (4), (5) with coupled unresolved variables—basically those systems for which \(B^2_{12}=0\), see Theorem 5.3. Part (ii) of this theorem deals with the case of linear scattering, that is \(B^1_{22} =0\), and in this case we are able to show ‘strong’ convergence in probability:
on a given climate time interval [0, T]. When the quadratic interaction term \(B^1_{22}\) is non-trivial, we can only show convergence in law, as stated in Theorem 5.3(i). We refer to Remark 4.3(ii) for an argument which suggests that one cannot expect much more than a weak-type convergence in the general case. This insight of course sheds new light on the results given in [17] and follow-up papers.
At this point it should be mentioned that throughout this paper we assume that \(H_d\) is finite-dimensional which seems to be a natural choice when it comes to climate modeling. However, our arguments are general and can be adapted to infinite-dimensional subspaces, see [5].
In the case of the more abstract system (6), (7), the process \(Y^\varepsilon \) will eventually behave like white noise, as \(\varepsilon \downarrow 0\). This limiting behavior is fundamental for finding the limit of Eq. (6) because it opens the door for using arguments similar to those of Wong and Zakai in [26]. Of course, Wong and Zakai formulated their results in a finite-dimensional setting. There have been earlier attempts at proving similar results in infinite dimensions; we refer to [2, 23, 24], for example. However, we would like to emphasize that these earlier attempts dealt with piecewise linear approximations of the noise rather than an infinite-dimensional Ornstein–Uhlenbeck process. Note that it is typical for Wong–Zakai results that stochastic integral terms of limiting equations are interpreted in the sense of Stratonovich.
Finally, it is worth comparing our results with those in the literature concerning averaging principles, see, for instance, [8, Sect. 7.9], [20, 21] and references therein. Roughly speaking, in those results the unresolved variables satisfy the equation \(\mathrm{d}Y^\varepsilon _t = - \varepsilon ^{-2} Y^\varepsilon _t \mathrm{d}t + \varepsilon ^{-1} \mathrm{d}W_t\), with a weaker noise intensity compared to ours, and therefore, the resolved variables only undergo a change of drift in the limit \(\varepsilon \downarrow 0\). In our setting, by contrast, a diffusion term also appears in the limit, see (13) below.
The paper is structured as follows.
In Sect. 2, we formulate our main results on the convergence of solutions to (6), (7). First, the limiting equation for \({\bar{X}}\) is identified, and then conditions for weak convergence \(X^\varepsilon \rightarrow {\bar{X}}\) are stated in Theorem 2.2(i). However, when (6) is a simpler equation, i.e., \(\beta =0\), even the stronger convergence (8) can be shown under the same conditions—see Theorem 2.2(ii).
In Sect. 3, we give the proof of Theorem 2.2(ii). The proof relies on preliminary localization and discretization arguments which allow us to consider, instead of (8), its discrete version
for only finitely many \(t_k \in [0,T]\).
In Sect. 4, we give the proof of Theorem 2.2(i) which, at the beginning, requires a careful analysis of the quadratic term \(\beta (Y^\varepsilon _t,Y^\varepsilon _t)\), but otherwise is an adaptation of the proof given in the previous section.
In Sect. 5, we eventually use the results of Sect. 2 to prove Theorem 5.3 under quite natural conditions, thus making the connection to our main applications in climate modeling.
2 Notation and main result
Let \(H_d\), \(H_\infty \) be real separable Hilbert spaces. Assume that \(H_d\) is finite-dimensional, \(\dim H_d = d\), with given orthonormal basis \({\mathbf {e}}_1,\dots ,{\mathbf {e}}_d\), and that \(H_\infty \) is infinite-dimensional with given orthonormal basis \({\mathbf {f}}_1,{\mathbf {f}}_2,\dots \)
Given two Banach spaces U, V, let \({\mathcal {L}}(U,V)\) denote the Banach space of continuous linear operators mapping U to V, endowed with the operator norm.
For each \(\varepsilon >0\), consider the pair of stochastic processes \((X^\varepsilon ,Y^\varepsilon )\), taking values in \(H_d \times H_\infty \), where \(X^\varepsilon \) satisfies (6) over a fixed finite time interval [0, T], and \(Y^\varepsilon \) is given by
where W is a Wiener process in \(H_\infty \), with real-valued time parameter and self-adjoint trace class covariance operator \(Q \in {\mathcal {L}}(H_\infty ,H_\infty )\).
Remark 2.1
-
(i)
A Wiener process with real-valued time parameter can be obtained in the following way: given two independent Wiener processes \((W^+_t)_{t\ge 0}\) and \((W^-_t)_{t\ge 0}\) defined on filtered probability spaces \((\Omega ^+,({\mathcal {F}}^+_t),\mathbb {P}^+)\) and \((\Omega ^-,({\mathcal {F}}^-_t),\mathbb {P}^-)\), respectively, set \(W_t = W^+_t\), for \(t \ge 0\), and \(W_t = W^-_{-t}\), for \(t<0\).
-
(ii)
Using such a representation of W, we can also write
$$\begin{aligned} Y^\varepsilon _t = - \int _0^\infty \varepsilon ^{-2}e^{-\varepsilon ^{-2}(t+s)} \mathrm{d}W^-_s + \int _0^t \varepsilon ^{-2}e^{-\varepsilon ^{-2}(t-s)} \mathrm{d}W^+_s, \quad t\ge 0, \end{aligned}$$which clearly is a stationary Ornstein–Uhlenbeck process on \((\Omega ,{\mathcal {F}}^-_\infty \otimes {\mathcal {F}}^+_\infty ,\mathbb {P})\), where \(\Omega =\Omega ^- \times \Omega ^+\) and \(\mathbb {P}=\mathbb {P}^- \otimes \mathbb {P}^+\), see [3]. Furthermore, setting up the stochastic basis for our processes \((X^\varepsilon ,Y^\varepsilon )\), let \((\Omega ,{\mathcal {F}},\mathbb {P})\) be the completion of \((\Omega ,{\mathcal {F}}^-_\infty \otimes {\mathcal {F}}^+_\infty ,\mathbb {P})\), and \(({\mathcal {F}}_t)_{t \ge 0}\) be the augmentation of the filtration \(({\mathcal {F}}^-_\infty \otimes {\mathcal {F}}^+_t)_{t \ge 0}\). Note that this filtration satisfies the usual conditions.
-
(iii)
Since Q is trace class, both W and \(Y^\varepsilon \) take values in \(H_\infty \). Without loss of generality, we can assume that Q is diagonal with respect to the chosen basis \(\{{\mathbf {f}}_m\}_{m \in \mathbb {N}}\) of \(H_\infty \), that the eigenvalues of Q form a sequence \(\{q_m\}_{m \in \mathbb {N}}\) satisfying \(\sum _{m} q_m < \infty \), and that \(\mathbb {E}\left[ \langle W_t, {\mathbf {f}}_m \rangle _{H_\infty } ^2\right] = |t| q_m\), for every \(t \in \mathbb {R}\) and \(m \in \mathbb {N}\). Moreover, since
$$\begin{aligned} \langle Y^\varepsilon _t, {\mathbf {f}}_m \rangle _{H_\infty } = \int _{-\infty }^t \varepsilon ^{-2}e^{-\varepsilon ^{-2}(t-s)} d\langle W_s, {\mathbf {f}}_m \rangle _{H_\infty } \end{aligned}$$we also have \(\mathbb {E}\left[ \langle Y^\varepsilon _t, {\mathbf {f}}_m \rangle _{H_\infty } ^2\right] = \frac{\varepsilon ^{-2}}{2} q_m\) for every \(t \ge 0\) and \(m \in \mathbb {N}\).
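For the reader's convenience, the last identity is a direct application of Itô's isometry to the stochastic convolution above:

$$\begin{aligned} \mathbb {E}\left[ \langle Y^\varepsilon _t, {\mathbf {f}}_m \rangle _{H_\infty } ^2\right] = \int _{-\infty }^t \varepsilon ^{-4}e^{-2\varepsilon ^{-2}(t-s)} q_m \,\mathrm{d}s = \frac{\varepsilon ^{-4}}{2\varepsilon ^{-2}}\, q_m = \frac{\varepsilon ^{-2}}{2}\, q_m. \end{aligned}$$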
-
(iv)
Let Z be an \(\varepsilon \)-independent stationary Ornstein–Uhlenbeck process solving \(\mathrm{d}Z_t = -Z_t \mathrm{d}t + \mathrm{d}W_t\), which is explicitly given by the formula
$$\begin{aligned} Z_t = \int _{-\infty }^t e^{-(t-s)} \mathrm{d}W_s,\quad t\ge 0. \end{aligned}$$(9)Due to the self-similarity of W, it is easy to check that the process \((Y^\varepsilon _t)_{t \ge 0}\) equals in law the process \((\varepsilon ^{-1} Z_{t \varepsilon ^{-2}})_{t\ge 0}\), thus making more transparent why we expect the process \(Y^\varepsilon \) to behave like a white noise as \(\varepsilon \downarrow 0\), see, for instance, [1].
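The white-noise behavior described in this remark can be illustrated numerically. The following sketch is not part of the paper's argument, and the parameters \(\varepsilon \), \(q_m\), T below are arbitrary choices of ours: it simulates one coordinate of \(Y^\varepsilon \) by exact transition sampling and checks that its stationary variance is \(\varepsilon ^{-2}q_m/2\), as computed in Remark 2.1(iii), while the time integral \(\int _0^t Y^{\varepsilon ,m}_s \,\mathrm{d}s\) has approximately the variance \(q_m t\) of the underlying Wiener coordinate.

```python
import numpy as np

# Illustration of Remark 2.1(iv) (not from the paper; eps, q, T are
# arbitrary choices).  We simulate one coordinate of the fast
# Ornstein-Uhlenbeck process
#     dY = -eps^{-2} Y dt + eps^{-2} dW,   Var(<W_t, f_m>) = q*t,
# by exact transition sampling, and check two facts from Sect. 2:
#   * the stationary variance of Y^eps equals eps^{-2} q / 2,
#   * the integral int_0^t Y^eps_s ds behaves like the Wiener coordinate
#     itself, i.e. its variance at t = T is close to q*T.
rng = np.random.default_rng(0)

eps, q, T, n, paths = 0.1, 0.5, 1.0, 2000, 4000
dt = T / n
a = eps**-2                                   # mean-reversion speed eps^{-2}
e = np.exp(-a * dt)
stat_std = np.sqrt(a * q / 2.0)               # stationary standard deviation
step_std = np.sqrt(a * q / 2.0 * (1 - e**2))  # exact one-step noise size

y = rng.normal(0.0, stat_std, size=paths)     # start in the stationary law
w_eps = np.zeros(paths)                       # running integral of Y^eps
for _ in range(n):
    w_eps += y * dt                           # left-point quadrature
    y = e * y + step_std * rng.normal(size=paths)

# y.var() is close to eps^{-2} q / 2 = 25, w_eps.var() is close to q*T = 0.5
```

The exact one-step update avoids any stiffness issue coming from the large drift \(\varepsilon ^{-2}\); a naive Euler scheme would require a much smaller time step.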
Adopting the useful notation \(W^\varepsilon _t = \int _0^t Y^\varepsilon _s \mathrm{d}s\), we can write (6) in integral form as
where \(x_0 \in H_d\) is a deterministic initial condition, as well as \(F:[0,T] \times H_d \rightarrow H_d\), \(\sigma :[0,T] \times H_d \rightarrow {\mathcal {L}}(H_\infty ,H_d)\), \(\beta :H_{\infty } \times H_{\infty } \rightarrow H_d\). We make the following assumptions on these coefficients:
- (A1):
-
\(F \in C([0,T] \times H_d , H_d)\), and \(F(t,\cdot ) \in {Lip}_{loc}(H_d,H_d)\), uniformly in \(t \in [0,T]\);
- (A2):
-
\(\sigma \in C^{1,\gamma }([0,T] \times H_d,{\mathcal {L}}(H_\infty ,H_d))\), the space of \(C^1\) functions with \(\gamma \)-Hölder derivative, for some \(\gamma \in (0,1)\), and its space-differential \(D\sigma (t,\cdot ) \in {Lip}_{loc}(H_d,{\mathcal {L}}(H_d,{\mathcal {L}}(H_\infty ,H_d)))\), uniformly in \(t \in [0,T]\);
- (A3):
-
\(\beta :H_{\infty } \times H_{\infty } \rightarrow H_d\) is a continuous bilinear map.
Of course, by standard theory (see [3], for example), Eq. (10) admits a unique local strong solution, for each \(\varepsilon >0\).
Next, we introduce the limiting equation for the wanted limit \({\bar{X}}\) of the processes \(X^\varepsilon \), when \(\varepsilon \downarrow 0\). First, define the so-called Stratonovich correction term \(C:[0,T] \times H_d \rightarrow H_d\) by
where
is matrix notation for the linear map \(\sigma (s,x)\in {\mathcal {L}}(H_\infty ,H_d)\) with respect to our chosen basis vectors; second, let
Then, our limiting equation reads
where W is the same Wiener process used to define \(Y^\varepsilon \) in Remark 2.1, while \(\{{\bar{W}}^{\ell ,m}\}_{\ell ,m \in \mathbb {N}}\) is a family of independent one-dimensional standard Wiener processes, which are also independent of W.
Like (10), Eq. (13) admits a unique local strong solution. However, in view of the interpretation of our results with respect to climate modeling, it is natural to further assume that
Another assumption specific to climate modeling, which has been advocated in [17], is the zero-mean property of \(\beta (Y^\varepsilon _s,Y^\varepsilon _s)\), \(s \ge 0\). Since all \(Y^\varepsilon \) are stationary under \(\mathbb {P}\), see Remark 2.1(ii), this assumption would translate into
where \(Y_s^{\varepsilon ,\ell }\) is short notation for the coordinates \(\langle Y_s^{\varepsilon } , {\mathbf {f}}_\ell \rangle _{H_\infty }\), \(\ell =1,2,\dots ,\,s\in [0,T]\). As a consequence, we also impose the zero-mean condition
- (A5):
-
\(\sum _{\ell \in \mathbb {N}} \langle \beta ({\mathbf {f}}_\ell ,{\mathbf {f}}_\ell ) , {\mathbf {e}}_i \rangle _{H_d}\, q_\ell \,=0\), for all \(i=1,\dots ,d\),
which is usually true for equations from fluid dynamics and can in general be understood as a renormalization procedure for the quadratic term.
The following theorem is the main result of this paper.
Theorem 2.2
-
(i)
Assume (A1)–(A5). Then, \(X^\varepsilon \) converges to \({\bar{X}}\), in law, \(\varepsilon \downarrow 0\).
-
(ii)
If, in addition to (A1)–(A4), assumption (A5) holds trivially because \(\beta =0\), then the stronger convergence (8) holds true.
In what follows, to keep notation light in proofs, when no confusion may occur, the norms in both spaces \(H_d\) and \(H_\infty \) will be denoted by \(|\cdot |\), and their scalar products by \(\langle \cdot ,\cdot \rangle \). The symbol \(\lesssim \) means inequality up to a multiplicative constant, possibly depending on the parameters of our equations, but not on \(\varepsilon \).
3 Strong convergence
In this section we give the proof of Theorem 2.2(ii), which is divided into several steps.
First, by localization, we argue that we can restrict ourselves to \(|X^\varepsilon _t|\), \(|{\bar{X}}_t| \le R\), for some large R, which effectively leads to Lipschitz continuity of the coefficients of (10).
Second, we discretize the problem, which allows us to reduce the proof of Theorem 2.2(ii) to its discrete version:
for only finitely many \(t_k \in [0,T]\). Here, we choose \(t_k = k\Delta \), where \(\Delta =\Delta _\varepsilon \) is a positive parameter whose \(\varepsilon \)-dependence has to be carefully chosen in the proof—see Remark 3.9.
Third, we prove the above discretized version.
3.1 Localization
Fix \(\varepsilon >0,\,\delta \in (0,1)\), and define
so that
Therefore, since (A4) implies
to prove (8), it is sufficient to show the convergence of the second summand on the right-hand side of (14), when \(\varepsilon \downarrow 0\), for fixed \(\delta \in (0,1),\,R>0\). Furthermore, by Markov's inequality,
for every \(p>0\), \(\delta \in (0,1)\), and hence it is enough to show convergence of the right-hand side above. To keep notation light, we are going to use \(\tau ^\varepsilon \) instead of \(\tau ^\varepsilon _R\), as \(R>0\) will be fixed in what follows.
3.2 Discretization
Fix \(\varepsilon >0\). We show that the expectation on the right-hand side of (15) can be replaced by an expectation of the same quantity, but with the supremum taken over a finite number (diverging to \(\infty \), as \(\varepsilon \downarrow 0\)) of times \(t_k\), see Corollary 3.7 below.
To start with, we have the following useful a priori estimate.
Lemma 3.1
For any \(p>1\), the Ornstein–Uhlenbeck process \(Y^\varepsilon \) satisfies
Proof
First, using the decomposition \(Y^\varepsilon _t = Y^\varepsilon _0 + \left( Y^\varepsilon _t - Y^\varepsilon _0 \right) \), Gaussian estimates on \(Y^\varepsilon _0\) and [15, Theorem 2.2], the result is true in one dimension.
In the infinite-dimensional case, by Hölder’s inequality, we can suppose \(p>2\). Therefore, since Q is trace class with eigenvalues satisfying \(\sum _{m\in \mathbb {N}}q_m<\infty \), when \(\alpha = (p-2)/p\), we obtain that
having used the one-dimensional result for the coordinates \(Y^{\varepsilon ,m}_t = \langle Y^\varepsilon _t , {\mathbf {f}}_m \rangle ,\,m=1,2,\dots \) \(\square \)
Remark 3.2
In view of Remark 2.1(iv), the previous result could also be obtained from the analogous result for (9) and parabolic scaling. Indeed, it would be sufficient to prove \( \mathbb {E}\left[ \sup _{t \le T} |Z_t|^p\right] \lesssim \log ^{p/2}(1+T) \) for every \(p>1\).
Now, we introduce the discretization of the time interval [0, T]. Let \(\Delta >0\), and let \([T/\Delta ]\) be the largest integer less than or equal to \(T/\Delta \). In what follows, \(\Delta \) will also depend on \(\varepsilon \), in a way to be determined later. Also, to make it easier to bound terms by powers of \(\varepsilon \) or \(\Delta \), without loss of generality, we will always assume that both \(\varepsilon \) and \(\Delta \) are less than one.
The next two lemmas control the excursion of \(X^\varepsilon \) between adjacent nodes in terms of the ratio \(\Delta /\varepsilon \).
Lemma 3.3
For any \(p>1\), and any deterministic time \(\tau >0\),
Proof
Since \(\beta =0\), by (10), the increment \({X}^\varepsilon _{t+k\Delta } - {X}^\varepsilon _{k\Delta }\) can be written as
Therefore, using (A1), (A2), boundedness of \(X^\varepsilon \) on \([0,\tau ^\varepsilon ]\), and Lemma 3.1, we obtain that
where \(W^\varepsilon _t = \int _0^t Y^\varepsilon _s \mathrm{d}s\) was defined in Sect. 2. \(\square \)
Lemma 3.4
For any \(p>1\), and any fixed \(k \in \{0,1,\dots ,[T/\Delta ]\}\) such that \(k\Delta \le T\),
Proof
It suffices to bound every single term on the right-hand side of the equation
First, by (A1) and boundedness of \(X^\varepsilon \) on \([0,\tau ^\varepsilon ]\), we have that
Second, using Hölder’s inequality with \(q'>1/p\) and Lemma 3.1,
Since \(pq'>1\) by assumption, we can estimate the integral above using Hölder’s inequality with exponents \(pq'\) and \(pq' /(pq'-1)\), (A2) and Lemma 3.3 to obtain
Finally,
because, for every \(t_2 > t_1 \ge 0\),
\(\square \)
The next lemma controls the excursion of the limiting process \({\bar{X}}\) between adjacent nodes.
Lemma 3.5
For any \(p>1\), any deterministic time \(\tau \in (0,1)\), and any fixed \(k \in \{0,1,\dots ,[T/\Delta ]\}\),
Proof
Since \(\beta =0\), by (13), the increment \({\bar{X}}_{t+k\Delta } - {\bar{X}}_{k\Delta }\) can be written as
Therefore, using (A1), (A2), boundedness of \({\bar{X}}\) on \([0,\tau ^\varepsilon ]\), and Burkholder–Davis–Gundy’s inequality, we obtain that
which proves the lemma since \(\tau <1\). \(\square \)
Corollary 3.6
For any \(p>1\),
Proof
The claim easily follows from Lemma 3.5 with \(\tau =\Delta \), and the inequality
\(\square \)
Corollary 3.7
Let \(\Delta =\Delta _\varepsilon >0\) depend on \(\varepsilon \) such that \(\Delta /\varepsilon \rightarrow 0\), as \(\varepsilon \downarrow 0\). Then,
Proof
First, by Hölder’s inequality with \(q>1\) and Corollary 3.6, we have that
since we have taken \(q>1\). Thus, the proof can easily be completed by combining the above and Lemma 3.3, while taking into account
where \([t/\Delta ]\) is again our notation for the floor of \(t/\Delta \). \(\square \)
3.3 Proof of the discretized version
We now discuss our strategy to prove part (ii) of Theorem 2.2. Recall that we want
for every fixed \(\delta >0\), as \(\varepsilon \downarrow 0\). As we have seen, by (14), (15) and Corollary 3.7, it suffices to prove
for some \(\Delta = \Delta _\varepsilon = o(\varepsilon )\). The proof is inspired by [11, Sect. VI.7].
Hereafter, \(\partial \sigma \) denotes the derivative of \(\sigma \) with respect to its first variable, and \(D \sigma \) denotes the derivative of \(\sigma \) with respect to its second variable. To start with, by (10) without \(\beta \)-term, (A2), and (16), we have that
for any \(k=0,\dots ,[T/\Delta ]\) such that \((k+1)\Delta \le T\).
Similarly, using (13) instead of (10), the process \({\bar{X}}\) satisfies
Having in mind to apply Gronwall’s lemma, it turns out to be useful to summarize the contributions of the right-hand sides of (18), (19) as follows:
for any \(h=1,\dots ,[T/\Delta ]\), which splits the difference \(X^\varepsilon _{h\Delta } - {\bar{X}}_{h\Delta }\) into five sums.
We first prove that the 2nd and the 5th sums can be neglected when proving (17). The summands of the 5th sum are discussed in Lemma 3.8 below. The contribution of the 2nd sum, however, is more delicate and requires a martingale argument similar to that of [11, Theorem VI.7.1].
The remaining sums will be controlled in terms of the difference \(X^\varepsilon - {\bar{X}}\) itself, which allows them to be estimated via Gronwall’s lemma.
Of course, under assumption (A1), the function F is uniformly continuous when restricted to \([0,T] \times B_R(0)\), where \(B_R(0)\) is the closed ball of radius R in \(H_d\). In what follows, we will denote by \(\omega _F:[0,T] \rightarrow [0,\infty )\) the (local) modulus of continuity of \(F(\cdot ,x)\):
Obviously, the function \(\omega _F\) vanishes at zero, and without loss of generality, it can be chosen to be both non-decreasing and continuous.
Denote by \(\omega _\sigma \) the corresponding modulus of continuity of the derivative \(D\sigma (\cdot ,x)\), and let \(\omega _{F,\sigma } = \omega _F + \omega _\sigma \). Recall that, under assumption (A2), one can take \(\omega _\sigma (t) = C t^{\gamma }\) for some positive constant C and \(\gamma \in (0,1)\).
Lemma 3.8
For any \(p>1\):
Proof
Throughout this proof, we will frequently make use of (A1), (A2) without explicitly mentioning them.
For \(\sum I^k_1\), by Hölder’s inequality and Lemma 3.3,
For \(\sum I^k_3\), by Hölder’s inequality and Lemma 3.1,
For \(\sum I^k_4\), by Hölder’s inequality, Lemmas 3.1 and 3.3,
We now consider \(\sum I^k_8\). Here, the idea is to convert \(Y^\varepsilon \)-increments into \(X^\varepsilon \)-increments via integration by parts since \(X^\varepsilon \)-increments are easier to control. This way, applying Lemmas 3.1 and 3.4,
In a similar way, for \(\sum J^k_1\) and \(\sum J^k_3\), now applying Lemma 3.5,
For the last sum \(\sum J^k_5\), by Burkholder–Davis–Gundy’s inequality and Lemma 3.5,
\(\square \)
Remark 3.9
The estimates given in Lemma 3.8 motivate the following choice of how \(\Delta =\Delta _\varepsilon \) should behave when \(\varepsilon \) goes to zero:
Such a choice is always possible. Indeed, under assumption (A2), one can take \(\omega _\sigma (t) = C t^{\gamma }\) for some positive constant C and \(\gamma \in (0,2/3)\), and therefore the choice \(\Delta _\varepsilon = \varepsilon ^{\frac{2}{1+\gamma /2}}\) satisfies all the requirements above. We will maintain this choice of \(\Delta \) in the remainder of the paper.
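For instance, the requirement \(\Delta _\varepsilon /\varepsilon \rightarrow 0\) from Corollary 3.7 can be verified directly for this choice:

$$\begin{aligned} \frac{\Delta _\varepsilon }{\varepsilon } = \varepsilon ^{\frac{2}{1+\gamma /2}-1} = \varepsilon ^{\frac{1-\gamma /2}{1+\gamma /2}} \rightarrow 0, \quad \varepsilon \downarrow 0, \end{aligned}$$

since the exponent \(\frac{1-\gamma /2}{1+\gamma /2}\) is positive for every \(\gamma \in (0,2)\).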
We now discuss the 2nd sum on the right-hand side of (20), that is
the i-th component of which, when plugging in (11), reads
where \(c_{\ell ,m}^k(\Delta ,\varepsilon )\) is given by
Taking the conditional expectation of \(c_{\ell ,m}^k(\Delta ,\varepsilon )\) with respect to \({\mathcal {F}}_{k\Delta }\) yields
where the following representation of \(Y^\varepsilon \),
has been used, and this conditional expectation can easily be calculated as
Now, since \(\sum _{j=1,\dots ,d} D_j \sigma ^{i,m}(k\Delta ,{\bar{X}}_{\tau ^\varepsilon \wedge (k\Delta )}) \sigma ^{j,\ell }(k\Delta ,{\bar{X}}_{\tau ^\varepsilon \wedge (k\Delta )})\) is \({\mathcal {F}}_{k\Delta }\)-measurable, for every \(\ell ,m \in \mathbb {N}\), \(i=1,\dots ,d\), each process \(M^i_h,\,h=1,\dots ,[T/\Delta ]\), given by
is a discrete martingale with respect to the filtration \(({\mathcal {F}}_{h\Delta })_{h=1}^{[T/\Delta ]}\).
Lemma 3.10
For each \(i=1,\dots ,d\),
Proof
Combining Doob’s maximal inequality and martingale property gives
where
for each \(k=0,\dots ,[T/\Delta ]-1\), because the conditional expectation is an \(L^2\)-projection. Thus, by independence of \(Y^{\varepsilon ,\ell }\) and \(Y^{\varepsilon ,m}\), for every \(\ell \ne m\), we can estimate
\(\square \)
To eventually cover the remainder of the 2nd sum on the right-hand side of (20), after subtracting the martingale term \(M_h\), we introduce
Lemma 3.11
For each \(i=1,\dots ,d\),
Proof
The proof is an easy consequence of (21). Indeed,
\(\square \)
All in all, Lemmas 3.10 and 3.11 together imply
showing that the 2nd sum on the right-hand side of (20) can be neglected, like the 5th one, when \(\varepsilon \downarrow 0\), and \(\Delta =\Delta _\varepsilon \) behaves as described in Remark 3.9.
Recall that we wanted to control the remaining sums in terms of the difference \(X^\varepsilon - {\bar{X}}\) itself, which is obvious for the first and third sums on the right-hand side of (20). However, in case of the fourth sum, applying almost the same martingale argument used for the 2nd sum, each term \(I^k_5\) can be formally replaced by \(\int _{k\Delta }^{(k+1)\Delta } \left( C(k\Delta ,X^\varepsilon _{k\Delta }) - C(k\Delta ,{\bar{X}}_{k\Delta }) \right) \mathrm{d}s \), subject to a sufficiently small \(\varepsilon \)-correction, eventually leading to the desired contraction argument in this case, too.
On the whole, we have justified that, if \(\Delta =\Delta _\varepsilon \) behaves as described in Remark 3.9, then
where \(r(\Delta ,\varepsilon ) \rightarrow 0,\,\varepsilon \downarrow 0\), finally proving (17), by Gronwall’s lemma.
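The discrete version of Gronwall's lemma used in this last step is standard; we record it for convenience: if non-negative numbers \(a_0,\dots ,a_{[T/\Delta ]}\) satisfy

$$\begin{aligned} a_h \le r + C \Delta \sum _{k=0}^{h-1} a_k, \quad h=0,\dots ,[T/\Delta ], \end{aligned}$$

for some constants \(r,C \ge 0\), then \(a_h \le r(1+C\Delta )^h \le r\, e^{CT}\) for all h. Hence, \(r = r(\Delta ,\varepsilon ) \rightarrow 0\) forces the left-hand sides to vanish uniformly, as \(\varepsilon \downarrow 0\).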
The proof of Theorem 2.2(ii) is thus complete.
4 Weak convergence
In this section we prove part (i) of Theorem 2.2. The idea of the proof is similar to that of part (ii), except that now \(\beta \not =0\) is possible. It is the existence of this bilinear term which prevents us from proving convergence in probability—we only succeed in showing convergence in law (see Remark 4.3(ii)).
First, we prove weak convergence of the bilinear term.
Second, we prove convergence in law of \(X^\varepsilon ,\,\varepsilon \downarrow 0\), using bounds similar to those obtained in Sect. 3.
4.1 Weak convergence of the bilinear term
For any \(\varepsilon >0\), define the process \(U^\varepsilon \) by
where \(Y^\varepsilon \) is the stationary Ornstein–Uhlenbeck process introduced in Remark 2.1. By (A5), the process \(U^\varepsilon \) has zero mean, and, using (A3), its second moments,
can be calculated to be
for \(i,j=1,\dots ,d\), and \(\ell ,m \in \mathbb {N}\).
Recalling (12), using the above short notation, we also have that
Next, since \(\mathrm{d}Y^{\varepsilon ,\ell }_t = -\varepsilon ^{-2}Y^{\varepsilon ,\ell }_t \mathrm{d}t + \varepsilon ^{-2} d\langle W_t , {\mathbf {f}}_\ell \rangle \), Itô’s formula implies
for any \(\ell ,m\in \mathbb {N}\), and hence
where \(M^\varepsilon \) is a d-dimensional continuous local martingale, while the process \(V^\varepsilon \) satisfies
by combining (A3) and Lemma 3.1.
Remark 4.1
Using \(\sum _{\ell ,m \in \mathbb {N}} \beta ^i_{\ell ,m} q_\ell q_m < \infty \) for every \(i=1,\dots ,d\), it is possible to prove that \(M^\varepsilon \) is a square integrable martingale for every \(\varepsilon >0\). However, we will not need this in the following.
The above representation of \(U^\varepsilon \), though very simple, has been used fruitfully in a variety of cases, see for instance [19] or [10]. Observe that, by (A5), the Itô-correction actually cancels out, being otherwise a contribution of order \(\varepsilon ^{-1}\). The process \(U^\varepsilon \), nevertheless, has an interesting limit in law:
Proposition 4.2
The pair of processes \((U^\varepsilon ,W)\) converges in law, \(\varepsilon \downarrow 0\), to a pair of processes \((\eta ,\omega )\), where \(\eta \) is a d-dimensional Wiener process with covariance \((\sum _{\ell ,m \in \mathbb {N}} b^i_{\ell ,m}b^j_{\ell ,m})_{i,j=1}^d\), and \(\omega \) is a Q-Wiener process, like W. Furthermore, \(\eta \) and \(\omega \) are independent.
Proof
First, by (23), it is sufficient to prove the proposition for \((M^\varepsilon ,W)\) instead of \((U^\varepsilon ,W)\).
Since all components of the processes \(M^\varepsilon ,\,\varepsilon >0\), and of W are continuous local martingales, the distributional properties of the limit \((\eta ,\omega )\) would follow from [4, Chapter VII, Theorem 1.4], if
for each \(t \in [0,T]\), and \(i,j =1,\dots , d\), as well as
for each \(t\in [0,T],\,i=1,\dots ,d\), and \(m\in \mathbb {N}\).
First, fix \(t \in [0,T]\), as well as \(i,j =1,\dots , d\). Then, the quadratic covariation \(\left[ M^{\varepsilon ,i} , M^{\varepsilon ,j} \right] _{t}\) is given by
so that
Now, using the easily verified identity \(\mathbb {E}\left[ Y^{\varepsilon ,\ell }_s Y^{\varepsilon ,\ell '}_s \right] = \frac{\varepsilon ^{-2}}{2} q_\ell \delta _{\ell ,\ell '}\), it follows from Isserlis–Wick’s theorem, see [14, Theorem 1.28], that
which yields
proving (24).
Second, fix \(t\in [0,T]\), as well as \(i=1,\dots ,d,\,m\in \mathbb {N}\). Then,
where, using Lemma 3.1,
finishing the proof of the proposition. \(\square \)
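The variance computation behind Proposition 4.2 can be illustrated numerically in a scalar toy model. Everything below—the discretization, the parameter values, the variable names—is a choice for this sketch, not an object from the paper: for a stationary Ornstein–Uhlenbeck process with \(\mathrm{d}Y_t = -\varepsilon ^{-2}Y_t\,\mathrm{d}t + \varepsilon ^{-2}\,\mathrm{d}W_t\) and \(\mathrm{Var}(W_t)=qt\), the stationary variance is \(\varepsilon ^{-2}q/2\), so the expected quadratic variation of \(M^\varepsilon _t = \int _0^t \beta \varepsilon Y_s \,\mathrm{d}W_s\) tends to \(\beta ^2 q^2 t/2\), the variance of the limit \(\eta _t\).

```python
import numpy as np

# Scalar toy check of the limiting variance in Proposition 4.2 (illustrative
# only): Y is a stationary OU process with
#   dY = -eps^{-2} Y dt + eps^{-2} dW,   Var(W_t) = q t,
# whose stationary variance is eps^{-2} q / 2.  Hence
#   E[[M^eps]_t] = beta^2 eps^2 q E[int_0^t Y_s^2 ds]  ->  beta^2 q^2 t / 2.

rng = np.random.default_rng(0)
eps, q, beta, T = 0.2, 1.5, 0.7, 1.0
theta = eps**-2                      # OU relaxation rate eps^{-2}
var_stat = theta * q / 2.0           # stationary variance eps^{-2} q / 2

dt, n_steps = 0.01, 200_000          # long path for time-averaging
alpha = np.exp(-theta * dt)          # exact one-step OU transition
y = np.empty(n_steps)
y[0] = np.sqrt(var_stat) * rng.standard_normal()
noise = np.sqrt(var_stat * (1 - alpha**2)) * rng.standard_normal(n_steps)
for k in range(1, n_steps):
    y[k] = alpha * y[k - 1] + noise[k]

mean_sq = np.mean(y**2)                          # ergodic average of Y^2
qv_mean = beta**2 * eps**2 * q * mean_sq * T     # approx E[[M^eps]_T]
limit = beta**2 * q**2 * T / 2.0                 # limiting variance of eta_T

print(mean_sq, var_stat)
print(qv_mean, limit)
assert abs(mean_sq - var_stat) / var_stat < 0.05
assert abs(qv_mean - limit) / limit < 0.05
```

With the seed fixed as above, the ergodic average of \(Y^2\) matches the stationary variance, and hence the mean quadratic variation matches the limiting covariance, to within a few percent.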
Remark 4.3
-
(i)
Of course, a d-dimensional Wiener process with covariance \((\sum _{\ell ,m \in \mathbb {N}} b^i_{\ell ,m}b^j_{\ell ,m})_{i,j=1}^d\) can always be represented by \(\sum _{\ell ,m \in \mathbb {N}} b_{\ell ,m} {\bar{W}}^{\ell ,m}\), where \(\{{\bar{W}}^{\ell ,m}\}_{\ell ,m \in \mathbb {N}}\) is a family of independent one-dimensional standard Wiener processes.
-
(ii)
We would like to stress that we do not expect a much stronger convergence of \(U^\varepsilon \), when \(\varepsilon \downarrow 0\), than the one stated in the above proposition. Indeed, it turns out that the sequence \(\{M^{\varepsilon }\}_{\varepsilon >0}\) is not even a Cauchy sequence in \(L^2(\Omega ;\mathbb {R}^d)\). To see this, for fixed \(0<\varepsilon <\underline{\varepsilon }\), and some \(1\le i\le d\), consider
$$\begin{aligned} {\mathbb {E}} \left[ \sup _{t \le T} \bigg | M^{\varepsilon ,i}_t - M^{\underline{\varepsilon },i}_t \bigg |^2 \right]&= {\mathbb {E}} \left[ \sup _{t \le T} \bigg | \int _0^t \sum _{\ell ,m \in \mathbb {N}} \beta ^i_{\ell ,m} \left( \varepsilon Y^{\varepsilon ,\ell }_s - \underline{\varepsilon } Y^{\underline{\varepsilon },\ell }_s \right) d\langle W_s , {\mathbf {f}}_m \rangle \bigg |^2 \right] . \end{aligned}$$But, by Burkholder–Davis–Gundy’s inequality, the above expectation can be bounded from below by
$$\begin{aligned}&{\mathbb {E}} \left[ \int _0^T \sum _{m \in \mathbb {N}} \left( \sum _{\ell \in \mathbb {N}} \beta ^i_{\ell ,m} \left( \varepsilon Y^{\varepsilon ,\ell }_s - \underline{\varepsilon } Y^{\underline{\varepsilon },\ell }_s \right) \right) ^2 q_m \mathrm{d}s \right] \\&\quad = T \sum _{\ell ,m \in \mathbb {N}} (\beta ^i_{\ell ,m})^2 q_\ell q_m \left( 1 - \frac{2\varepsilon ^{-1}\underline{\varepsilon }^{-1}}{\varepsilon ^{-2}+\underline{\varepsilon }^{-2}} \right) , \end{aligned}$$where
$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \left( 1 - \frac{2\varepsilon ^{-1}\underline{\varepsilon }^{-1}}{\varepsilon ^{-2}+\underline{\varepsilon }^{-2}} \right) = 1, \quad \hbox {for every fixed } \underline{\varepsilon }>0, \end{aligned}$$so that \(\{M^{\varepsilon ,i}\}_{\varepsilon >0}\) cannot be Cauchy in \(L^2(\Omega )\).
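The factor \(1 - 2\varepsilon ^{-1}\underline{\varepsilon }^{-1}/(\varepsilon ^{-2}+\underline{\varepsilon }^{-2})\) appearing in the last two displays comes from the stationary covariance of the two Ornstein–Uhlenbeck processes driven by the same noise; a sketch of the computation, using the stationary representation \(Y^{\varepsilon ,\ell }_s = \int _{-\infty }^s \varepsilon ^{-2} e^{-\varepsilon ^{-2}(s-u)} \,\mathrm{d}\langle W_u , {\mathbf {f}}_\ell \rangle \):
$$\begin{aligned} \mathbb {E}\left[ \varepsilon Y^{\varepsilon ,\ell }_s \, \underline{\varepsilon } Y^{\underline{\varepsilon },\ell }_s \right] = \varepsilon \underline{\varepsilon } \int _{-\infty }^s \varepsilon ^{-2}\underline{\varepsilon }^{-2} e^{-(\varepsilon ^{-2}+\underline{\varepsilon }^{-2})(s-u)} q_\ell \,\mathrm{d}u = \frac{\varepsilon ^{-1}\underline{\varepsilon }^{-1}}{\varepsilon ^{-2}+\underline{\varepsilon }^{-2}} \, q_\ell , \end{aligned}$$so, since \(\mathbb {E}[(\varepsilon Y^{\varepsilon ,\ell }_s)^2] = \mathbb {E}[(\underline{\varepsilon } Y^{\underline{\varepsilon },\ell }_s)^2] = q_\ell /2\),
$$\begin{aligned} \mathbb {E}\left[ \left( \varepsilon Y^{\varepsilon ,\ell }_s - \underline{\varepsilon } Y^{\underline{\varepsilon },\ell }_s \right) ^2 \right] = q_\ell \left( 1 - \frac{2\varepsilon ^{-1}\underline{\varepsilon }^{-1}}{\varepsilon ^{-2}+\underline{\varepsilon }^{-2}} \right) . \end{aligned}$$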
4.2 Weak convergence of solutions
We now prove \(X^\varepsilon \rightarrow {\bar{X}}\), in law, when \(\varepsilon \downarrow 0\).
First, for each \(\varepsilon >0\), let \({\hat{X}}^\varepsilon \) be the solution of
where \(U^\varepsilon \) is given by (22), and let \(\tau ^\varepsilon _R = \inf \{t\ge 0 : |X^\varepsilon _t|\ge R \} \wedge \inf \{t\ge 0 : |{\hat{X}}^\varepsilon _t|\ge R \} \).
Note that, if (A4) holds, then the coefficients \(F,C,\sigma ,\beta \) have properties ensuring that each of the above equations admits a global solution on [0, T], too.
Next, taking into account the bound \(\mathbb {E}\left[ \bigg | \sup _{s \in [0,T]} \varepsilon \beta (Y^\varepsilon _s,Y^\varepsilon _s) \bigg |^p \right] \lesssim \varepsilon ^{-p} \log ^p(1+\varepsilon ^{-2})\), we can estimate the increments of \(U^\varepsilon \) by
As a consequence, one easily verifies, on the one hand, that the analogues of Lemmas 3.3 and 3.4 remain valid for the process \(X^\varepsilon \), despite \(\beta \not =0\), and, on the other hand, that the following versions
and
of Lemmas 3.5 and 3.6, respectively, hold true when replacing \({\bar{X}}\) by \({\hat{X}}^\varepsilon \). We point out that the proof of this claim differs from the proofs in Sect. 3 only in the treatment of the term \(U^\varepsilon \), which, however, can be controlled by (26).
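The moment bound on \(\sup _{s \in [0,T]} |\varepsilon \beta (Y^\varepsilon _s,Y^\varepsilon _s)|\) used above is of the size expected for suprema of stationary Gaussian processes; a heuristic sketch, not a proof: each coordinate of \(\varepsilon Y^\varepsilon \) is stationary Gaussian with variance \(q_\ell /2\) and correlation time \(\varepsilon ^2\), so its supremum over [0, T] is of order \(\sqrt{\log (T\varepsilon ^{-2})}\), whence
$$\begin{aligned} \sup _{s \in [0,T]} \left| \varepsilon \beta (Y^\varepsilon _s,Y^\varepsilon _s) \right| = \varepsilon ^{-1} \sup _{s \in [0,T]} \left| \beta (\varepsilon Y^\varepsilon _s,\varepsilon Y^\varepsilon _s) \right| \lesssim \varepsilon ^{-1} \log (1+\varepsilon ^{-2}); \end{aligned}$$see [15] for sharp maximal inequalities for Ornstein–Uhlenbeck processes.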
Therefore, when expanding \(X^\varepsilon \) and \({\hat{X}}^\varepsilon \) as in (18) and (19), but including the \(\beta \)-term, and then arguing as in the proof of Theorem 2.2(ii) in Sect. 3, it immediately follows that \(X^\varepsilon _{\cdot \wedge \tau ^\varepsilon _R} - {\hat{X}}^\varepsilon _{\cdot \wedge \tau ^\varepsilon _R} \rightarrow 0\), in probability, as \(\varepsilon \downarrow 0\), for any \(R>0\), once the following lemma is available.
Lemma 4.4
Assume that \(\Delta =\Delta _\varepsilon \) behaves as described in Remark 3.9. Then,
Proof
To start with, write
which splits the expression into two summands, for any fixed \(0\le k\le [T/\Delta ]-1\).
We estimate the impact of each summand separately.
First, using \(|D\sigma (r,X^\varepsilon _r)-D\sigma ({k\Delta },X^\varepsilon _{k\Delta })| \lesssim |X^\varepsilon _r-X^\varepsilon _{k\Delta }| + \omega _\sigma (\Delta )\), we obtain that
Second, we approach
following the method used when discussing the second sum on the right-hand side of (20) in the proof of Theorem 2.2(ii), but now for triple moments of \(Y^\varepsilon \).
Indeed, define
and take the conditional expectation with respect to \({\mathcal {F}}_{k\Delta }\), that is
Since
we have that
Next, for each \(i=1,\dots ,d\), the process \(M^i_h,\,h=1,\dots ,[T/\Delta ]\), given by
is a martingale with respect to the filtration \(({\mathcal {F}}_{h\Delta })_{h=1}^{[T/\Delta ]}\), and arguing as in the proof of Lemma 3.10 yields
So, it remains to prove that the remainder, after subtracting the martingale term \(M_h\) from (27), also vanishes, when \(\varepsilon \downarrow 0\). For \(i=1,\dots ,d\), the ith coordinate of this remainder reads
and we can easily derive the bound
finishing the proof of the lemma. \(\square \)
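For the reader’s convenience, the moment computations above ultimately rest on Isserlis–Wick’s theorem [14, Theorem 1.28] for centered jointly Gaussian variables: all odd moments vanish, e.g.
$$\begin{aligned} \mathbb {E}\left[ Y^{\varepsilon ,\ell }_s Y^{\varepsilon ,m}_r Y^{\varepsilon ,n}_u \right] = 0, \end{aligned}$$while even moments reduce to sums over pairings, e.g. \(\mathbb {E}[Y_1Y_2Y_3Y_4] = \mathbb {E}[Y_1Y_2]\mathbb {E}[Y_3Y_4] + \mathbb {E}[Y_1Y_3]\mathbb {E}[Y_2Y_4] + \mathbb {E}[Y_1Y_4]\mathbb {E}[Y_2Y_3]\).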
Corollary 4.5
For any \(R>0\), if \(\Delta =\Delta _\varepsilon \) behaves as described in Remark 3.9,
and hence \(X^\varepsilon _{\cdot \wedge \tau ^\varepsilon _R} - {\hat{X}}^\varepsilon _{\cdot \wedge \tau ^\varepsilon _R} \rightarrow 0\), in probability, \(\varepsilon \downarrow 0\), in particular.
The above corollary suggests that it is sufficient to show that \({\hat{X}}_{\cdot \wedge \tau ^\varepsilon _R}^\varepsilon \rightarrow {\bar{X}}_{\cdot \wedge \tau ^\varepsilon _R}\), in law, when \(\varepsilon \downarrow 0\), provided there is a procedure allowing to let R go to infinity afterwards. So we first prove the weak convergence for fixed R, and then discuss the limit procedure for \(R\rightarrow \infty \).
Modify the coefficients \(F,\sigma \) outside the set \(\{(t,x): |x|<R\}\) in such a way that the new coefficients \(F_R,\,\sigma _R\), but also \(D\sigma _R\), are globally bounded, and that both functions \(F_R(t,\cdot )\) and \(D\sigma _R(t,\cdot )\) are globally Lipschitz, uniformly in \(t \in [0,T]\).
Of course, \({\hat{X}}^\varepsilon _{\cdot \wedge \tau ^\varepsilon _R}\) coincides with \({\hat{X}}^{\varepsilon ,R}_{\cdot \wedge \tau ^\varepsilon _R}\), where \({\hat{X}}^{\varepsilon ,R}\) denotes the solution to the equation obtained when replacing the coefficients of (25) by \(F_R,\,\sigma _R\), and the Stratonovich correction \(C_R\) associated with \(\sigma _R\). Also, let \({\bar{X}}^R\) denote the solution to the equation obtained when replacing the coefficients of (13) by \(F_R,\,\sigma _R,\,C_R\).
Proposition 4.6
Fix \(R>0\). Then, \({\hat{X}}^{\varepsilon ,R}\) converges to \({\bar{X}}^R\), in law, when \(\varepsilon \downarrow 0\).
Proof
Since
by boundedness of the coefficients on the above right-hand side, we obtain that
where Burkholder–Davis–Gundy’s inequality gives \(\mathbb {E}\left[ \sup _{t\le T}|\int _0^t \sigma _R(s,{\hat{X}}^{\varepsilon ,R}_s ) \mathrm{d}W_s|\right] \,\lesssim \,T^{1/2}\).
Similarly, \(\mathbb {E}\left[ |({\hat{X}}^{\varepsilon ,R}_{t_2} - U^\varepsilon _{t_2}) - ({\hat{X}}^{\varepsilon ,R}_{t_1} - U^\varepsilon _{t_1})|^p\right] \lesssim |t_2-t_1|^{p/2}\), for any \(|t_2-t_1|<1\), and any \(p>1\). Thus, by Kolmogorov–Chentsov’s theorem, for every \(\alpha \in (0,1)\), one can find \(\Delta \in (0,1)\) such that
where the constant depends on \(\gamma \), but not on \(\varepsilon \), and \(\gamma \in (0,1/2)\) can be chosen freely.
We therefore have equi-boundedness and equi-continuity of \(\{{\hat{X}}^{\varepsilon ,R} - U^\varepsilon \}_{\varepsilon >0}\) with arbitrarily high probability, and hence the family \(\{{\hat{X}}^{\varepsilon ,R} - U^\varepsilon \}_{\varepsilon >0}\) is tight with respect to the uniform topology of \(C([0,T],\mathbb {R}^d)\), by first applying the Arzelà–Ascoli theorem and then Prokhorov’s theorem. Moreover, \(\{U^\varepsilon \}_{\varepsilon > 0}\) is trivially tight by Proposition 4.2, so that summing \({\hat{X}}^{\varepsilon ,R} - U^\varepsilon \) and \(U^\varepsilon \) makes \(\{{\hat{X}}^{\varepsilon ,R}\}_{\varepsilon > 0}\) tight, too.
All in all, the family of triples \(\{\left( \right. {\hat{X}}^{\varepsilon ,R},U^\varepsilon ,{W} \left. \right) \}_{\varepsilon > 0}\) is tight.
Next, for \(\varepsilon >0\), let \(\mathbb {P}^{R,\varepsilon }\) be the pushforward measure \(\mathbb {P}\circ ({\hat{X}}^{\varepsilon ,R},U^\varepsilon , {W} )^{-1}\) on the space
equipped with the Borel-\(\sigma \)-algebra \({\mathcal {B}}\), and let \((\xi ,\eta ,\omega )\) denote the coordinate process on \({\tilde{\Omega }}\).
By tightness of \(\{({\hat{X}}^{\varepsilon ,R},U^\varepsilon ,{W})\}_{\varepsilon > 0}\), there exists a subsequence \((\varepsilon _n)_{n\in \mathbb {N}}\) such that \(\mathbb {P}^{R,\varepsilon _n}\) converges weakly to a probability measure \(\mathbb {P}^R\) on \(({\tilde{\Omega }},{\mathcal {B}})\), when \(n\uparrow \infty \).
Let \(\tilde{{\mathcal {F}}}\) be the \(\mathbb {P}^R\)-completion of \({\mathcal {B}}\), and let \((\tilde{{\mathcal {F}}}_t)_{t\in [0,T]}\) be the smallest filtration to which the process \((\xi ,\eta ,\omega )\) is adapted and which satisfies the usual conditions with respect to \(\mathbb {P}^R\). Also, introduce \(\tilde{{\mathcal {F}}}^n,\,(\tilde{{\mathcal {F}}}_t^n)_{t\in [0,T]}\) in a similar way with respect to \(\mathbb {P}^{R,\varepsilon _n},\,n\in \mathbb {N}\).
Now, it easily follows from Proposition 4.2 that, on \(({\tilde{\Omega }},\tilde{{\mathcal {F}}},\mathbb {P}^R)\), the following distributional properties must hold for the pair of processes \((\eta ,\omega )\): \(\eta \) is a d-dimensional Wiener process with covariance \((\sum _{\ell ,m \in \mathbb {N}} b^i_{\ell ,m}b^j_{\ell ,m})_{i,j=1}^d\), \(\omega \) is a Q-Wiener process, \(\eta \) and \(\omega \) are independent.
Introduce
and observe that each component of both processes \(M^R\) and \(\omega \), but also
are continuous local martingales with respect to \((\tilde{{\mathcal {F}}}_t^n)_{t\in [0,T]}\) on \(({\tilde{\Omega }},\tilde{{\mathcal {F}}}^n,\mathbb {P}^{R,\varepsilon _n})\), for any \(n\in \mathbb {N}\), and hence they are continuous local martingales with respect to \((\tilde{{\mathcal {F}}}_t)_{t\in [0,T]}\) on \(({\tilde{\Omega }},\tilde{{\mathcal {F}}},\mathbb {P}^{R})\), too, by [12, IX. Cor.1.19].
Therefore, applying [3, Theorem 8.2] to the pair of processes \((M^R,\omega )\) yields
on \(({\tilde{\Omega }},\tilde{{\mathcal {F}}},\mathbb {P}^R)\), or an enlargement of this space which we still denote by \(({\tilde{\Omega }},\tilde{{\mathcal {F}}},\mathbb {P}^R)\), where \(W^R\) is another Q-Wiener process which, by the above representation, even \(\mathbb {P}^R\)-almost surely coincides with \(\omega \), so that
Thus, Eq. (28) can be written as
where \(\omega \) is a Q-Wiener process, while \(\eta \) is a d-dimensional Wiener process, independent of \(\omega \), and with covariance \((\sum _{\ell ,m \in \mathbb {N}} b^i_{\ell ,m}b^j_{\ell ,m})_{i,j=1}^d\). Observe that the process \({\bar{X}}^R\) satisfies the same type of equation, as \(\sum _{\ell , m \in \mathbb {N}} b_{\ell ,m} {\bar{W}}^{\ell ,m}\) from (13) is a d-dimensional Wiener process with covariance \((\sum _{\ell ,m \in \mathbb {N}} b^i_{\ell ,m}b^j_{\ell ,m})_{i,j=1}^d\), too. But, since this type of equation admits a unique strong solution, the laws of \(\xi \) and \({\bar{X}}^R\) must be the same, proving \({\hat{X}}^{\varepsilon _n,R}\rightarrow {\bar{X}}^R\), in law, when \(n\uparrow \infty \). However, the same argument applies to any converging subsequence, and the limit will always be the same, finally proving \({\hat{X}}^{\varepsilon ,R}\rightarrow {\bar{X}}^R\), in law, when \(\varepsilon \downarrow 0\). \(\square \)
It remains to discuss how R can be taken to infinity.
Recall that \({\bar{X}}\) is the solution of (13), and it is not difficult to see that \({\bar{X}}^R\) converges to \({\bar{X}}\), in law, as \(R \rightarrow \infty \).
Now take a function \(\varphi _R \in C(C([0,T],\mathbb {R}^d),[0,1])\), such that \(\varphi _R(u)=0\), if \(\sup _{t \in [0,T]} |u_t| \le R-1\), and \(\varphi _R(u)=1\), if \(\sup _{t \in [0,T]} |u_t| > R\).
Then,
and because \({\hat{X}}^{\varepsilon ,R} \rightarrow {\bar{X}}^R\), in law, when \(\varepsilon \downarrow 0\), we deduce that
where the last probability converges to zero, when \(R\rightarrow \infty \), because \({\bar{X}}\) is a global solution.
As a consequence, for any \(\psi \in C_b(C([0,T],\mathbb {R}^d),\mathbb {R})\),
Here, when taking R large enough, we can make all the summands on the right-hand side, except for the second and fourth, arbitrarily small, uniformly in \(\varepsilon \), and, for fixed R, the remaining terms go to zero, when \(\varepsilon \downarrow 0\).
Thus, by a diagonal argument, the convergence in law of \(X^\varepsilon \rightarrow {\bar{X}},\,\varepsilon \downarrow 0\), follows, completing the proof of the theorem.
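To make the weak convergence concrete, here is a scalar toy simulation in the spirit of Wong–Zakai. Everything below—the model, the parameters, the variable names—is an illustration chosen for this sketch, not the system treated above: for \(\dot{x}^\varepsilon _t = -x^\varepsilon _t + x^\varepsilon _t Y^\varepsilon _t\) with \(Y^\varepsilon \) the stationary Ornstein–Uhlenbeck process \(\varepsilon ^2 \,\mathrm{d}Y = -Y \,\mathrm{d}t + \mathrm{d}W\), one has \(\int _0^t Y^\varepsilon _s \,\mathrm{d}s = W_t - \varepsilon ^2 (Y^\varepsilon _t - Y^\varepsilon _0) \rightarrow W_t\), so the candidate limit is the Stratonovich equation \(\mathrm{d}X = -X \,\mathrm{d}t + X\circ \mathrm{d}W\), whose solution \(X_t = x_0 e^{-t+W_t}\) satisfies \(\mathbb {E}[X_1] = x_0 e^{-1/2}\).

```python
import numpy as np

# Scalar Wong-Zakai toy model (illustrative only):
#   dx/dt = -x + x * Y_t,   eps^2 dY = -Y dt + dW   (stationary OU, q = 1),
# so int_0^t Y ds = W_t - eps^2 (Y_t - Y_0) -> W_t, and the candidate limit
# is the Stratonovich SDE dX = -X dt + X o dW with E[X_1] = x0 * exp(-1/2).

rng = np.random.default_rng(1)
eps, x0, T = 0.05, 1.0, 1.0
theta = eps**-2                          # fast relaxation rate eps^{-2}
dt = eps**2 / 10.0                       # time step resolving the fast scale
n_steps = int(round(T / dt))
n_paths = 4000

alpha = np.exp(-theta * dt)              # exact one-step OU transition
var_stat = theta / 2.0                   # stationary variance of Y (q = 1)
y = np.sqrt(var_stat) * rng.standard_normal(n_paths)   # stationary start
int_y = np.zeros(n_paths)                # trapezoidal approx of int_0^T Y ds
for _ in range(n_steps):
    y_new = alpha * y + np.sqrt(var_stat * (1.0 - alpha**2)) \
        * rng.standard_normal(n_paths)
    int_y += 0.5 * dt * (y + y_new)
    y = y_new

# Exact solution of the random ODE given the path of Y:
x_T = x0 * np.exp(-T + int_y)
limit_mean = x0 * np.exp(-T / 2.0)       # Stratonovich prediction E[X_1]

print(x_T.mean(), limit_mean)
assert abs(x_T.mean() - limit_mean) / limit_mean < 0.1
```

The exact pathwise formula \(x_T = x_0 \exp (-T + \int _0^T Y_s \,\mathrm{d}s)\) sidesteps any stiffness issues from the fast scale \(\varepsilon ^{-2}\); with the fixed seed, the Monte Carlo mean agrees with the Stratonovich prediction rather than with the Itô one (\(x_0 e^{-T}\)).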
5 Application to climate models
We now apply Theorem 2.2 to perform stochastic model reduction for a subclass of the stochastic climate models given by (4), (5) in the introduction: we restrict ourselves to a simpler version of (5), omitting the fast forcing \(\varepsilon ^{-2}f^2_{\varepsilon ^{-1}t}\) and the term \(\varepsilon ^{-1} A^2_2 Y^\varepsilon _t\), on the one hand, but also neglecting the interaction \(B^2_{12}(X^\varepsilon _t,Y^\varepsilon _t)\), on the other. While the first two omitted terms are technically demanding but look tractable from a wider perspective, which is beyond this paper, the term \(\varepsilon ^{-1}B^2_{12}(X^\varepsilon _t,Y^\varepsilon _t)\) involving the neglected interaction is notoriously hard and beyond our understanding at present.
For each \(\varepsilon >0\), let \((X^\varepsilon ,Y^\varepsilon )\) be a pair of processes satisfying
where \(A^1_1:H_d \rightarrow H_d\), \(A^1_2:H_\infty \rightarrow H_d\), \(A^2_1:H_d \rightarrow H_\infty \) are bounded linear operators, \(B^1_{11}:H_d \times H_d \rightarrow H_d\), \(B^1_{12}:H_d \times H_\infty \rightarrow H_d\), \(B^1_{22}:H_\infty \times H_\infty \rightarrow H_d\), \(B^2_{11}:H_d \times H_d \rightarrow H_\infty \) are continuous bilinear maps, and \(F^1:[0,T] \rightarrow H_d\) is a deterministic continuous external force. The stochastic basis and the Wiener process W are taken to be the same as in Remark 2.1.
In what follows, the above equations will always have initial conditions \((x_0,y_0)\), where \(x_0 \in H_d\) can be chosen arbitrarily, while \(y_0 = \int _{-\infty }^0 \varepsilon ^{-2} e^{\varepsilon ^{-2}s} \mathrm{d}W_s\) will be fixed to ensure pseudo-stationarity of the scaled unresolved variables. Note that fixing \(y_0\in H_\infty \) this way would not restrict the initial data of the reduced equations.
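With this choice, each mode of the unresolved variables starts in its stationary law; a one-line check, in the notation of Sect. 4:
$$\begin{aligned} \mathbb {E}\left[ \langle y_0 , {\mathbf {f}}_\ell \rangle ^2 \right] = \int _{-\infty }^0 \varepsilon ^{-4} e^{2\varepsilon ^{-2}s} q_\ell \,\mathrm{d}s = \frac{\varepsilon ^{-2}}{2} q_\ell , \end{aligned}$$which coincides with the stationary variance \(\mathbb {E}\left[ (Y^{\varepsilon ,\ell }_t)^2 \right] = \frac{\varepsilon ^{-2}}{2} q_\ell \) used in Sect. 4.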
In fluid dynamics settings like (1), it is customary to assume that A is self-adjoint, and that the full nonlinearity is skew-symmetric: \(\langle B(z',z),z \rangle _H = 0\), \(z,z' \in H\), see [18]. We therefore make the following assumptions on the projected coefficients:
- (C1):
-
\(A^2_1 = (A^1_2)^*\);
- (C2):
-
\(\langle B^1_{11}(x',x),x \rangle _{H_d} = 0\), for all \(x,x' \in H_d\);
- (C3):
-
\(\langle B^1_{12}(x',y),x \rangle _{H_d} = - \langle B^2_{11}(x',x),y \rangle _{H_\infty } \), for all \(x,x' \in H_d\), \(y \in H_\infty \).
Also, without loss of generality, we can assume that \(B^1_{22}\) is symmetric in the sense of \(\langle B^1_{22}({\mathbf {f}}_\ell ,{\mathbf {f}}_m) , {\mathbf {e}}_i \rangle _{H_d} = \langle B^1_{22}({\mathbf {f}}_m,{\mathbf {f}}_\ell ) , {\mathbf {e}}_i \rangle _{H_d}\), for all \(i,\ell ,m\); and finally, we will need the analogue of (A5), that is
- (C4):
-
\(\sum _{\ell \in \mathbb {N}} \langle B^1_{22}({\mathbf {f}}_\ell ,{\mathbf {f}}_\ell ) , {\mathbf {e}}_i \rangle _{H_d}\, q_\ell \,=0\), for all \(i=1,\dots ,d\).
Note that the latter condition is indeed satisfied by many fluid dynamics models—it usually holds independently of the structure of the noise, because the diagonal terms \(\langle B^1_{22}({\mathbf {f}}_\ell ,{\mathbf {f}}_\ell ) , {\mathbf {e}}_i \rangle _{H_d}\) vanish for all i.
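For instance—a standard computation, sketched here for a single real trigonometric mode of an incompressible velocity field: if \({\mathbf {f}}_\ell (x) = a_\ell \sin (k_\ell \cdot x)\) with \(a_\ell \cdot k_\ell = 0\), then
$$\begin{aligned} ({\mathbf {f}}_\ell \cdot \nabla ){\mathbf {f}}_\ell = \sin (k_\ell \cdot x)\cos (k_\ell \cdot x)\,(a_\ell \cdot k_\ell )\, a_\ell = 0, \end{aligned}$$so the diagonal terms of the advection nonlinearity vanish identically, for every choice of the \(q_\ell \).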
Next, we bring Eqs. (29), (30) into a form which makes them comparable to (6), (7).
Using the definition of \(y_0\), we have the following mild formulation of (30),
where
is a stationary Ornstein–Uhlenbeck process. Plugging (31) into (29), \(X^\varepsilon \) alternatively satisfies
when using the abbreviation
Since \(Z^\varepsilon _s\) is close to \(A^2_1 X^\varepsilon _s + B^2_{11}(X^\varepsilon _s,X^\varepsilon _s)\), for small \(\varepsilon \), and since both terms \(B^1_{22}({\tilde{Y}}^\varepsilon _s,Z^\varepsilon _s ),\, B^1_{22}\left( Z^\varepsilon _s,Z^\varepsilon _s\right) \) will be shown to vanish with \(\varepsilon \), too, the process \(X^\varepsilon \) should be close to \({\tilde{X}}^\varepsilon \) satisfying
which is an equation of type (6) with
Thus, in this setting, the analogue of (13) would read
where the Stratonovich correction term \(C: H_d \rightarrow H_d\) simplifies to
and
Proposition 5.1
When assuming (C1)–(C3), Eq. (34) admits a unique global strong solution on [0, T].
Proof
First, the regularity of the coefficients guarantees the existence of a unique local strong solution. Second, by Itô’s formula,
for any fixed \(t\in [0,T]\), and any stopping time \(\tau \) smaller than a possible explosion time.
Applying (C1)–(C3), we have the identities
leading to
again using the regularity of the coefficients combined with Burkholder–Davis–Gundy’s inequality. Thus, by Gronwall’s lemma, the local solution \({\bar{X}}\) must be global on [0, T]. \(\square \)
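Schematically—suppressing the noise and lower-order terms of (34), which are handled by linear growth, (C1) and (C3)—the key cancellation is
$$\begin{aligned} \langle B^1_{11}({\bar{X}},{\bar{X}}), {\bar{X}} \rangle _{H_d} = 0 \end{aligned}$$by (C2), so the quadratic drift does not contribute to the growth of \(|{\bar{X}}_t|^2\), and the estimate closes in the form
$$\begin{aligned} \mathbb {E}\left[ \sup _{s\le t\wedge \tau } |{\bar{X}}_s|^2 \right] \lesssim 1 + \int _0^t \mathbb {E}\left[ \sup _{r\le s\wedge \tau } |{\bar{X}}_r|^2 \right] \mathrm{d}s, \end{aligned}$$to which Gronwall’s lemma applies.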
Remark 5.2
In a very similar way, it can be shown that both Eqs. (32) and (33) admit unique global strong solutions on [0, T], too; we therefore omit those proofs. As a consequence, simply substituting the solution of (32) into (31), for each \(\varepsilon >0\), there is a unique pair of processes \((X^\varepsilon ,Y^\varepsilon )\) satisfying (29), (30) on [0, T].
Theorem 5.3
Assume (C1)–(C3), and, for each \(\varepsilon >0\), let \((X^\varepsilon ,Y^\varepsilon )\) be the unique pair of processes satisfying (29), (30) on a given climate time interval [0, T].
-
(i)
If (C4), then \(X^\varepsilon \) converges in law, \(\varepsilon \downarrow 0\), to the unique process \({\bar{X}}\) satisfying (34).
-
(ii)
If, however, (C4) holds because \(B^1_{22}=0\), then the stronger convergence (8) holds true.
Proof
Recall the process \({\tilde{X}}^\varepsilon \) satisfying (33), which is an equation of type (6) with coefficients \(F,\sigma ,\beta \) satisfying (A1)–(A3). Furthermore, by Proposition 5.1 and Remark 5.2, condition (A4) is satisfied, too, while (A5) and (C4) actually are the same condition.
All in all, Theorem 2.2 implies that both parts (i) and (ii) of Theorem 5.3 hold true when replacing \({X}^\varepsilon \) by \({\tilde{X}}^\varepsilon \).
Thus, it is sufficient to prove convergence in probability of \(X^\varepsilon - {\tilde{X}}^\varepsilon \) to zero, \(\varepsilon \downarrow 0\), uniformly on compact subsets of a localizing stochastic interval, which can easily be shown following the lines of the proof of Theorem 2.2.
Indeed, by localization and discretization arguments, one would first derive
where \(\tau ^\varepsilon _R = \inf \{t \ge 0: |X^\varepsilon _t| \ge R \} \wedge \inf \{t \ge 0: |{\tilde{X}}^\varepsilon _t| \ge R \}\), and \(r(\Delta ,\varepsilon ) \rightarrow 0,\,\varepsilon \downarrow 0\), for a suitable choice of \(\Delta =\Delta _\varepsilon \). Then, combining Gronwall’s lemma and Markov’s inequality, one would obtain
which yields the convergences stated in parts (i) and (ii) of Theorem 5.3 up to time \(\tau ^\varepsilon _R\). Since \({\bar{X}}\) is globally defined, both types of convergence can be extended to the whole interval [0, T], using arguments similar to those given in the proof of the corresponding parts of Theorem 2.2. \(\square \)
References
R.N. Bhattacharya. On the functional central limit theorem and the law of the iterated logarithm for Markov processes. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 60:185–201, 1982.
Z. Brzeźniak, M. Capiński, and F. Flandoli. A convergence result for stochastic partial differential equations. Stochastics, 24(4):423–445, 1988.
G. Da Prato and J. Zabczyk. Stochastic equations in infinite dimensions, volume 152 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, second edition, 2014.
S. N. Ethier and T. G. Kurtz. Markov processes—characterization and convergence. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. John Wiley & Sons Inc., New York, 1986.
F. Flandoli and U. Pappalettera. 2D Euler equations with Stratonovich transport noise as a large scale stochastic model reduction. J. Nonlinear Sci., 31:24, 2021.
C. Franzke and A. J. Majda. Low-order stochastic mode reduction for a prototype atmospheric GCM. J. Atmos. Sci., 63(2):457–479, 2006.
C. Franzke, A. J. Majda, and E. Vanden-Eijnden. Low-order stochastic mode reduction for a realistic barotropic model climate. J. Atmos. Sci., 62(6):1722–1745, 2005.
M.I. Freidlin and A.D. Wentzell. Random Perturbations of Dynamical Systems. Grundlehren der mathematischen Wissenschaften. Springer New York, 2012.
K. Hasselmann. Stochastic climate models Part I. Theory. Tellus, 28:473–485, 1976.
B. Iftimie, É. Pardoux, and A. Piatnitski. Homogenization of a singular random one-dimensional PDE. Ann. Inst. H. Poincaré Probab. Statist., 44(3):519–543, 2008.
N. Ikeda and S. Watanabe. Stochastic Differential Equations and Diffusion Processes. North-Holland Mathematical Library. North-Holland Publishing Co., second edition, 1989.
J. Jacod and A. N. Shiryaev. Limit Theorems for Stochastic Processes, volume 288 of Grundlehren der mathematischen Wissenschaften. Springer, Berlin, second edition, 2002.
A. Jain, I. Timofeyev, and E. Vanden-Eijnden. Stochastic mode-reduction in models with conservative fast sub-systems. Commun. Math. Sci., 13(2):297–314, 2015.
S. Janson. Gaussian Hilbert spaces, volume 129 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 1997.
C. Jia and G. Zhao. Moderate maximal inequalities for the Ornstein-Uhlenbeck process. Proc. Amer. Math. Soc., https://doi.org/10.1090/proc/14804, 2020.
T. G. Kurtz. A limit theorem for perturbed operator semigroups with applications to random evolutions. J. Functional Analysis, 12:55–67, 1973.
A. J. Majda, I. Timofeyev, and E. Vanden-Eijnden. A mathematical framework for stochastic climate models. Comm. Pure Appl. Math., 54(8):891–974, 2001.
A. J. Majda and X. Wang. Non-linear dynamics and statistical theories for basic geophysical flows. Cambridge University Press, Cambridge, 2006.
S. Olla. Homogenization of diffusion processes in random fields, 1994.
E. Pardoux and Yu. Veretennikov. On the Poisson Equation and Diffusion Approximation. I. Ann. Probab., 29(3):1061–1085, 2001.
E. Pardoux and Yu. Veretennikov. On the Poisson Equation and Diffusion Approximation. II. Ann. Probab., 31(3):1166–1192, 2003.
C. Penland and L. Matrosova. A balance condition for stochastic numerical models with application to the El Niño-Southern Oscillation. Journal of Climate, 7(9):1352–1372, 1994.
G. Tessitore and J. Zabczyk. Wong-Zakai approximations of stochastic evolution equations. Journal of Evolution Equations, 6(4):621–655, 2006.
K. Twardowska. Approximation theorems of Wong-Zakai type for stochastic differential equations in infinite dimensions. Dissertationes Math. (Rozprawy Mat.), 325, 1993.
G. K. Vallis. Atmospheric and oceanic fluid dynamics: fundamentals and large-scale circulation. Cambridge University Press, first edition, 2006.
E. Wong and M. Zakai. On the convergence of ordinary integrals to stochastic integrals. Ann. Math. Statist., 36(5):1560–1564, 1965.
Funding
Open access funding provided by Scuola Normale Superiore within the CRUI-CARE Agreement.
Assing, S., Flandoli, F. & Pappalettera, U. Stochastic model reduction: convergence and applications to climate equations. J. Evol. Equ. 21, 3813–3848 (2021). https://doi.org/10.1007/s00028-021-00708-z