Abstract
In this work we study the strong order of convergence in the averaging principle for slow-fast systems of stochastic evolution equations in Hilbert spaces with additive noise. The stochastic perturbations are general Wiener processes, i.e. their covariance operators are not required to be trace class. We prove that the slow component converges strongly to the averaged one with order of convergence 1/2, which is known to be optimal. Moreover we apply this result to a slow-fast stochastic reaction-diffusion system where the stochastic perturbation is given by a white noise both in time and space.
1 Introduction
Consider the following slow-fast system of abstract stochastic evolution equations with additive noise
where \(\epsilon >0\) is a small parameter representing the ratio of time scales between the slow component of the system \(U^\epsilon \) and the fast one \(V^\epsilon \). Here H, K are Hilbert spaces, \(A_1,A_2\) are unbounded linear operators on H, K respectively and \(W^{Q_1}, W^{Q_2}\) are Wiener processes on H, K respectively.
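For orientation, in the standard formulation of such systems, and consistently with the data listed in Sect. 2 (\(F :H\times K\rightarrow H\), \(G :K\rightarrow K\), additive noise), system (1) would read as in the sketch below. The initial data \(u \in H\), \(v \in K\) and the placement of the factors \(1/\epsilon \) and \(1/\sqrt{\epsilon }\) in the fast equation are assumptions based on that standard formulation.

```latex
\begin{aligned}
dU^\epsilon_t &= \big[A_1 U^\epsilon_t + F(U^\epsilon_t, V^\epsilon_t)\big]\,dt + dW^{Q_1}_t,
  & U^\epsilon_0 &= u \in H,\\
dV^\epsilon_t &= \frac{1}{\epsilon}\big[A_2 V^\epsilon_t + G(V^\epsilon_t)\big]\,dt
  + \frac{1}{\sqrt{\epsilon}}\,dW^{Q_2}_t,
  & V^\epsilon_0 &= v \in K.
\end{aligned}
```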
Slow-fast systems are widely used in applications, since real-world systems naturally exhibit very different time scales. We refer the reader, for example, to [23] for applications to physics, [32] to chemistry, [34] to neurophysiology, [1, 13, 21, 22] to mathematical finance (see also [12] for a slightly different financial model) and the references therein.
A natural idea is then to study the behaviour of the system as \(\epsilon \rightarrow 0\). In particular, under certain hypotheses it is known that the slow component \(U^\epsilon \) converges to the solution U of the so-called averaged equation
where
and \(\mu \) is the invariant measure related to the fast motion, i.e.
Note that the equation for U is decoupled from \(V^\epsilon \). This fact is known as the averaging principle and it is fundamental in applications, since U captures the effective dynamics of \(U^\epsilon \) (which is usually the most interesting variable in applications) and thus provides a rigorous dimensionality reduction of the original system.
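The averaging principle can be illustrated numerically on a scalar toy model. The sketch below is not the infinite-dimensional system of this paper: the drift \(F(u,v)=-u+v\), the Ornstein-Uhlenbeck fast motion, the Euler-Maruyama discretization and the function name `sup_error` are all assumptions chosen for illustration. The fast motion has invariant law N(0, 1/2), so the averaged drift is \(-u\), and both slow equations are driven by the same Brownian increments so that the supremum of their difference can be measured pathwise.

```python
import math
import random


def sup_error(eps, n_paths=200, T=1.0, dt=1e-3, seed=0):
    """Monte Carlo estimate of E[sup_t |U^eps_t - U_t|] for a scalar toy model.

    Toy model (an assumption for illustration, not the paper's system):
        dU^eps = (-U^eps + V^eps) dt + dW1,
        dV^eps = -(1/eps) V^eps dt + (1/sqrt(eps)) dW2,
    whose fast motion has invariant law N(0, 1/2), so the averaged drift is
    F_bar(u) = -u and the averaged equation is dU = -U dt + dW1.
    """
    rng = random.Random(seed)
    n_steps = int(round(T / dt))
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        u_eps, u_bar, v = 1.0, 1.0, 0.0
        sup_diff = 0.0
        for _ in range(n_steps):
            dw1 = sqrt_dt * rng.gauss(0.0, 1.0)
            dw2 = sqrt_dt * rng.gauss(0.0, 1.0)
            # Explicit Euler-Maruyama step; both slow equations share dW1.
            u_eps_new = u_eps + (-u_eps + v) * dt + dw1
            u_bar_new = u_bar + (-u_bar) * dt + dw1
            v = v - (v / eps) * dt + dw2 / math.sqrt(eps)
            u_eps, u_bar = u_eps_new, u_bar_new
            sup_diff = max(sup_diff, abs(u_eps - u_bar))
        total += sup_diff
    return total / n_paths


if __name__ == "__main__":
    # The time step must stay well below eps for the explicit scheme to be stable.
    for eps in (0.16, 0.04, 0.01):
        print(eps, sup_error(eps))
```

In this toy setting the printed errors should shrink as \(\epsilon \) decreases; dividing \(\epsilon \) by 4 should roughly halve the error if the strong order is 1/2, up to Monte Carlo and discretization error.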
The first general result for the averaging principle for finite-dimensional stochastic differential equations can be found in [27]. For generalizations and improvements see [13, 17, 18, 23, 31, 40, 41] and the references therein. It is important to mention that the drift of the fast equation is allowed to depend also on the slow component (i.e. fully coupled system) and the stochastic perturbations of the slow and fast equations are allowed to be multiplicative, i.e. the diffusion coefficient of the slow equation can depend on both the slow and fast variables. Moreover when the diffusion coefficient of the slow equation is independent of the fast variable then a strong convergence in probability is obtained. Otherwise only a weak convergence can be proved.
The averaging principle for infinite dimensional systems requires more delicate arguments: for this we refer to [5, 6, 8, 9, 14, 15, 24, 39] and the references therein. The previous comment about the dependence of the diffusion coefficients also holds for infinite dimensional systems. See also [19, 20, 38] for optimal control problems of slow-fast systems in infinite dimension.
However, for numerical applications it is very important to know the rate at which \(U^\epsilon \rightarrow U\), e.g. see [3, 29]. For the study of the order of convergence for finite dimensional systems we refer to [16, 25, 28, 30, 35] and the references therein. It is important to mention that the order of convergence can be studied in two ways: in the strong sense and in the weak sense. Moreover the optimal orders of strong and weak convergence are known to be 1/2 and 1 respectively.
Recently the problem of estimating the order of convergence in the averaging principle for infinite dimensional systems has been addressed by several authors:
In [4] the author (generalizing his previous work [2]) considers a slow-fast stochastic reaction-diffusion system with additive noise. Both the weak and strong orders of convergence are obtained: in particular under strong regularity of the noise (it is for example assumed that the covariance operator is trace class but for the precise statement see [4]) it is proved that the strong order of convergence is 1/2 and the weak order is 1, with both orders being optimal. Under more general assumptions on the noise, instead, only weaker orders of convergence are obtained for both the strong and weak convergence.
In [34] a 1-dimensional fully coupled reaction-diffusion system is considered and the strong order of convergence is proved to be 1/2 under very strong assumptions on the covariance operators of the noises, i.e. \(Tr (\Delta ^{1/2} Q_i)< \infty \), where \(\Delta \) is the Laplacian.
In [36] the strong order of convergence for a fully coupled slow-fast stochastic system is studied. Here it is assumed that the covariance operators of the noises are trace class and moreover that \(Tr (-A_1 Q_1) < \infty \).
See also [26] where the weak order of convergence for a stochastic wave equation with fast oscillation given by a fast reaction-diffusion stochastic system is proved to be 1. Here too it is assumed that \(Tr ( Q_i)< \infty \).
Indeed none of these papers can treat the case \(Tr ( Q_i)= \infty \), which is very important for applications as it arises very naturally, for example when the stochastic perturbation is a white noise, i.e. \(Q_i=I\).
In this manuscript we are therefore interested in studying the strong order of convergence for the slow-fast infinite-dimensional system of stochastic evolution equations (1) where \(W^{Q_1}, W^{Q_2}\) are general Wiener processes on H, K respectively with covariance operators \(Q_1,Q_2\), possibly with \(Tr(Q_1)=+ \infty \), \(Tr(Q_2) = +\infty \). Under some hypotheses, see Hypotheses 1, 2, 3, 4, 5 below, we prove that the strong order of convergence is 1/2, which is known to be optimal. In particular we show in Theorem 1 that
where \(U_t\) is the solution of the averaged equation. Notice that this result is much stronger than [4, 36] where \(\sup _{t \in [0,T]}\) is outside the expectation.
The key tool in the proof of Theorem 1 is Proposition 3. The proof of this proposition is based on a technical result, i.e. Lemma 6.3, which is inspired by [7], and it is a consequence of the mixing properties of the fast motion, i.e. Lemmas 4.5, 4.6, 4.7. We recall that [7] studies the normal deviations, i.e. the weak convergence of \(Z^\epsilon :=(U^\epsilon -U)/ \sqrt{\epsilon }\), when the equation for the slow component has no stochastic perturbation (\(Q_1=0\)).
Finally we discuss an application of our theory to a 1-dimensional slow-fast stochastic reaction-diffusion system where the stochastic perturbation is given by a white noise both in time and space, which, to the best of our knowledge and as said before, cannot be treated by the existing literature.
The paper is organized as follows: in Sect. 2 we introduce the problem in a formal way and we state the assumptions that we will use. In Sect. 3 we prove some a-priori estimates. In Sect. 4 we prove some results related to the fast motion. In Sect. 5 we study the well posedness of the averaged equation. In Sect. 6 we prove some preliminary results. In Sect. 7 we prove that the order of convergence is 1/2 and we give an application of our theory.
2 Setup and Assumptions
In this section we define the notation and the assumptions for the rest of the paper.
H, K will be Hilbert spaces with scalar products \(\langle \cdot ,\cdot \rangle _H, \langle \cdot ,\cdot \rangle _K\) and \(|\cdot |_H, |\cdot |_K\) the induced norms.
\(B_B(H)\) will denote the space of bounded functions \(\phi :H \rightarrow \mathbb {R}\) with the sup norm \(|\cdot |_{H,\infty }\).
Lip(H) will denote the set of Lipschitz functions \(\phi :H \rightarrow \mathbb {R}\) and set
\(\mathcal {L}(H)\) will denote the space of linear bounded operators from H to H, endowed with the operator norm
Next denote by \(\mathcal {L}_2(H)\) the space of Hilbert-Schmidt operators endowed with the norm
The analogous spaces \(B_B(K),Lip(K),\mathcal {L}(K),\mathcal {L}_2(K)\) are defined for the Hilbert space K with the corresponding norms \(|\cdot |_{K,\infty },[\cdot ]_{\text{ K,Lip } },\left\| \cdot \right\| _K,\left\| \cdot \right\| _{\mathcal {L}_2(K)}\).
In order to simplify the notation we will omit the subscripts K and H in the various norms when no confusion is possible.
\(\mathcal {B}(H)\) and \(\mathcal {B}(K)\) will denote the Borel sigma-algebra in H and K respectively.
Consider now the following infinite dimensional system for \(0 \le t \le T < \infty \)
where
-
\(\epsilon >0\) is a parameter,
-
\(A_1 :D(A_1) \subset H \rightarrow H\), \(A_2 :D(A_2) \subset K \rightarrow K\) are linear operators,
-
\(F :H\times K\rightarrow H\), \(G :K\rightarrow K\),
-
\(W^{Q_1}, W^{Q_2}\) are independent cylindrical Wiener processes on H, K respectively with covariance operators \(Q_1,Q_2\) respectively, defined on some probability space \((\Omega ,\mathcal {F},\mathbb {P})\) with a normal filtration \(\mathcal {F}_t\), \(t \ge 0 \).
We now state the assumptions that we will use throughout the work:
Hypothesis 1
\(A_1 :D(A_1) \subset H \rightarrow H\) is a linear operator which generates an analytic semigroup \(e^{A_1 t}\) on H, \(t \ge 0 \).
Moreover there exist an orthonormal basis \(\{e_k\}_{k \in \mathbb {N}}\) of H and \(\{\alpha _k\}_{k \in \mathbb {N}} \subset (0, +\infty )\) such that
Moreover we assume that there exist \(\zeta >0,n \ge 2\) integer and \( 1/(2n)< \beta < 1/3\) such that
and
Remark 2.1
Hypothesis 1 is necessary in the proof of Proposition 3. It holds for example when \(A_1= \Delta \) is the Laplacian on [0, L]. Indeed in this case it is well known that \(\alpha _k = C k^2\) and then the two series converge for example by choosing \(\zeta =\frac{3}{5}\), \(n=3\), \(\beta =\frac{1}{5}\).
From Hypothesis 1 we have the following spectral representation:
Moreover we can define the fractional powers of \(-A_1\) denoted by \((-A_1)^{\theta }\) for \(\theta \ge 0\) with domain \(D((-A_1)^\theta )\). We will denote by \(|\cdot |_{\theta }\) the norm \(|\cdot |_{D((-A_1)^\theta )}=|(-A_1)^\theta \cdot |\).
The following standard results hold, e.g. see [33, Chapter 2, Theorem 6.13],
for some \(\nu >0\) and
Hypothesis 2
\(A_2 :D(A_2) \subset K \rightarrow K\) is a linear operator generator of a \(\mathcal {C}_0\)-semigroup \(e^{A_2 t}\) on K, \(t \ge 0 \).
Moreover there exists \(\lambda >0\) such that
for every \(t\ge 0\).
Hypothesis 3
There exist \(L_F, L_G>0\) such that
for every \(x_1,x_2 \in H, y_1,y_2 \in K\).
Moreover we assume that
We remark that this implies that \(A_2+G(\cdot )\) is strongly dissipative, i.e. set
then it holds:
for every \(y_1,y_2 \in D(A_2)\).
Hypothesis 4
There exist \(C>0\), \(\gamma \in (0,1/2)\) such that
for every \(t>0\).
Hypothesis 5
Assume that \(Q_2\) is invertible (with inverse \(Q_2^{-1} \in {\mathcal {L}}(K)\)).
Remark 2.2
Hypotheses 4, 5 hold for example choosing \(A_1= A_2=A\) to be the Laplacian on [0, L] and \(Q_1=Q_2=Q=I\).
Indeed Hypothesis 5 is immediately satisfied. Moreover by setting \(H=K=L^2([0,L])\), we have that \(A e_k =-C k^2 e_k\) for some orthonormal basis \(\{ e_k\}\) of eigenvectors of A. Thus, by the spectral representation (3) with \(\alpha _k=C k^2\), we have:
where the inequality follows since \(\forall t > 0\) the function \(h \mapsto e^{-Ch^2 t}\) is non-increasing. This shows that Hypothesis 4 holds with \(\gamma =1/4.\)
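The comparison between the series and the integral used in Remark 2.2 can be checked numerically. The sketch below assumes that Hypothesis 4 is a bound of the type \(\Vert e^{tA}Q^{1/2}\Vert _{\mathcal {L}_2} \le C t^{-\gamma }\), so that for \(Q=I\) the squared Hilbert-Schmidt norm is \(\sum _{k \ge 1} e^{-2Ck^2 t}\); since the summand is non-increasing in k, this sum is dominated by \(\int _0^\infty e^{-2Ch^2 t}\,dh = \sqrt{\pi /(8Ct)}\), which gives \(\gamma =1/4\). The function names and the truncation level are illustrative choices.

```python
import math


def hs_norm_squared(t, c=1.0, n_terms=20000):
    """Truncated sum_{k>=1} exp(-2 c k^2 t), i.e. ||e^{tA}||_{HS}^2 when Q = I."""
    return sum(math.exp(-2.0 * c * k * k * t) for k in range(1, n_terms + 1))


def integral_bound(t, c=1.0):
    """Closed form of int_0^inf exp(-2 c h^2 t) dh = sqrt(pi / (8 c t))."""
    return math.sqrt(math.pi / (8.0 * c * t))


if __name__ == "__main__":
    # The sum stays below a constant multiple of t^(-1/2), so the
    # Hilbert-Schmidt norm itself is bounded by a constant times t^(-1/4).
    for t in (1e-4, 1e-2, 1.0):
        print(t, hs_norm_squared(t), integral_bound(t))
```

For small t the two quantities are close, which suggests that the exponent 1/4 cannot be improved in this example.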
Proposition 1
Let \(u \in H, v \in K\), under Hypotheses 1, 2, 3, 4, 5 there exists a unique mild solution of (2) given by
for every \(t \in [0,T]\).
Proof
See e.g. [10]. \(\square \)
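The mild formulation in Proposition 1 can be sketched as follows. This is the standard variation-of-constants form in the sense of [10], written under the assumption that the fast equation carries the usual factors \(1/\epsilon \) in the drift and \(1/\sqrt{\epsilon }\) in the noise; it is meant as a reading aid for the stochastic convolutions of Sect. 3 rather than as the exact display of the original.

```latex
\begin{aligned}
U^\epsilon_t &= e^{A_1 t}u + \int_0^t e^{A_1 (t-s)} F(U^\epsilon_s, V^\epsilon_s)\,ds
  + \int_0^t e^{A_1 (t-s)}\,dW^{Q_1}_s,\\
V^\epsilon_t &= e^{A_2 t/\epsilon}v
  + \frac{1}{\epsilon}\int_0^t e^{A_2 (t-s)/\epsilon} G(V^\epsilon_s)\,ds
  + \frac{1}{\sqrt{\epsilon}}\int_0^t e^{A_2 (t-s)/\epsilon}\,dW^{Q_2}_s.
\end{aligned}
```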
In the sequel we will always assume that Hypotheses 1, 2, 3, 4, 5 hold. Moreover \(C>0\) will denote a generic constant independent of \(\epsilon \) which may change from line to line.
3 A Priori Estimates
In this section we prove some classical a-priori estimates for the slow and fast components.
In the following for every \(t \ge 0\) denote by
and
the stochastic convolutions.
First we prove some estimates related to \(\Gamma ^{1}_t\) and \(\Gamma ^{2 \epsilon }_t\).
Lemma 3.1
for every \(0 \le \theta < 1/2-\gamma \), \(p \ge 1\).
Proof
Fix \(0<\eta < 1/2\), by the factorization method, e.g. see [10, Chapter 5, Section 3], we have:
where
Now fix \(0 \le \theta < 1/2-\gamma \), then by Holder’s inequality:
for every \(p> 1/\eta >2\).
Now we estimate \(\mathbb {E} |Y_\rho |^p_\theta \); since \(Y_\rho \) is a Gaussian random variable, by Ito’s isometry, (4) and Hypothesis 4 we have:
for every \(\rho \le T\) and \(\theta , \eta \) such that \(0 \le \theta + \eta < 1/2 -\gamma \).
Next inserting the last inequality in (10) and recalling that \(p > 1/\eta \), which yields \( p > (1/2 - \gamma - \theta )^{-1}\) we obtain the thesis of the Lemma for \(0 \le \theta < 1/2-\gamma \), \(p> (1/2 - \gamma - \theta )^{-1}>2\).
Finally by Holder’s inequality we have the thesis of the Lemma. \(\square \)
Lemma 3.2
Let \(p\ge 2\), then there exists \(C=C(p)>0\) such that
for every \(\epsilon >0\).
Proof
For \(t>0\), by [10, Theorem 4.36] and by our hypotheses we have
so that the thesis is proved. \(\square \)
Lemma 3.3
Let \(p \ge 2\) then there exists \(C=C(T,p)>0\) such that
and
for every \(\epsilon >0\).
Proof
Define
so that
By Young’s inequality and Hypotheses 1, 3 we have:
for every \(t \le T\).
Then by the comparison Theorem we have
for every \(t \le T\).
Then by the definition of \(\Lambda ^{1\epsilon }\) and this last inequality it follows
for every \(t \le T\).
Now by Lemma 3.1 we have:
for every \(\tau \le T\).
Now we proceed in a similar way to [19, proof of Lemma 3.10], i.e. set
so that
Now by (8) it follows
Now similarly to [19, proof of Lemma 3.10] fix \(\theta >0\), differentiate \(f(x)= \sqrt{x+ \theta }\) and use the previous inequality, then:
Now by dominated convergence for \(\theta \rightarrow 0\) we have:
so that by recalling the definition of \(\Lambda ^{2\epsilon }_t\) we have:
Then, by Holder’s inequality and Lemma 3.2, we have:
This proves (12).
Finally inserting (12) into (15) we have (11). \(\square \)
Lemma 3.4
Let \(0< \alpha < 1/2 - \gamma \), \(u \in D((-A_1)^\alpha )\), \(v \in K\), then there exists \(C=C(T,\alpha )>0\) such that
for every \(\epsilon >0\).
Proof
Consider for \(t \le T\)
First as \(u \in D((-A_1)^\alpha )\) we have:
Moreover by (4) and by Lemma 3.3 we have:
Finally by Lemma 3.1 we have:
By considering (16), calculating \( \mathbb {E}\left[ \sup _{t\in [0,T]} \left| U^\epsilon _t\right| _{\alpha }^2 \right] \) and using the last three inequalities we have the thesis. \(\square \)
Lemma 3.5
Let \(0< \alpha < 1/2 - \gamma \), \(u \in D((-A_1)^\alpha )\), \(v \in K\), then there exists \(C=C(T,\alpha )>0\) such that
for every \(\epsilon >0\), \(0 \le t \le T\), \(h\ge 0\) such that \(t+h \le T\).
Proof
For \(0 \le t \le T\), \(h\ge 0\) such that \(t+h \le T\) we have
Consider the first term on the right-hand-side, as \(u \in D((-A_1)^\alpha )\) by (5) and by Lemma 3.4 we have:
Consider now the second term on the right-hand-side, then by Lemma 3.3 we have:
Finally for the third term on the right-hand-side by Ito’s isometry and Hypothesis 4 we have:
Since by assumption \(2 \alpha \wedge 2 \wedge (1-2 \gamma )= 2\alpha \), we have the thesis. \(\square \)
4 Fast Motion
In this section we study some classical properties of the fast motion. Consider
for every \(s \ge 0\) and for some \(Q_2\)-Wiener process \(w_s^{Q_2}\).
First define the semigroup related to the fast motion by
for every \(\phi \in B_B(K)\), \(s \ge 0\).
Next recall that \(\delta \) was defined by (7), then we have:
Lemma 4.1
for every \(s \ge 0\), \(v_1,v_2 \in K\).
Proof
Define \(\rho _s=v_s^{v_1}-v_s^{v_2}\), then by (8) we have:
for every \(s \ge 0\), \(v_1,v_2 \in K\).
Then by taking the expectation and applying the comparison Theorem we have the thesis. \(\square \)
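The contraction mechanism behind Lemma 4.1 can be observed on a scalar example. In the sketch below two copies of a fast motion are started from different points and driven by the same noise, so the noise cancels in their difference and only the dissipative drift remains. The equation \(dv = (-\lambda v + g(v))\,ds + dw\), the nonlinearity \(g(v)=L\sin v\) with \(L<\lambda \), the rate \(\delta =\lambda -L\) and the function name `coupled_gap` are assumptions made for illustration.

```python
import math
import random


def coupled_gap(v1, v2, lam=2.0, lip=1.0, s_max=5.0, dt=1e-3, seed=1):
    """Euler paths of dv = (-lam*v + g(v)) ds + dw from two starts, same noise.

    g(v) = lip * sin(v) is a hypothetical nonlinearity with Lipschitz constant
    lip < lam. Returns a list of (time, |v1_s - v2_s|) pairs.
    """
    rng = random.Random(seed)
    gaps = [(0.0, abs(v1 - v2))]
    n_steps = int(round(s_max / dt))
    for n in range(1, n_steps + 1):
        dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)  # shared by both copies
        v1 = v1 + (-lam * v1 + lip * math.sin(v1)) * dt + dw
        v2 = v2 + (-lam * v2 + lip * math.sin(v2)) * dt + dw
        gaps.append((n * dt, abs(v1 - v2)))
    return gaps


if __name__ == "__main__":
    delta = 2.0 - 1.0  # lam - lip for the default parameters
    for s, gap in coupled_gap(2.0, -1.0)[::1000]:
        print(s, gap, 3.0 * math.exp(-delta * s))
```

For this discretization the gap satisfies \(|\rho _{n+1}| \le (1-\delta \,dt)|\rho _n|\) whenever \(\lambda \,dt \le 1\), so each printed gap should lie below the exponential envelope in the last column.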
Next we can show:
Lemma 4.2
Let \(p\ge 1\), then there exists \(C=C(p)>0\) such that:
for every \(s \ge 0\), \(v \in K\).
Proof
Define
First by Burkholder’s inequality and Hypotheses 2 and 4 similarly to what is done for Lemma 3.2 we have:
for every \(s \ge 0\).
Now set \(\rho _s=v^{v}_s-{\tilde{\Gamma }}_s^{Q_2}\). Then for \(p\ge 2\) we have:
Then by the comparison Theorem and (20) we have:
\(\square \)
Now by [11, Theorem 6.3.3] there exists a unique invariant measure \(\mu \) for the semigroup \(P_t\). Moreover we have:
Lemma 4.3
We have:
Proof
Fix \(N>0\), then by definition of invariant measure and Lemma 4.2 we have for every \(s>0\)
where we have used the fact that \((a+b)\wedge c \le a \wedge c + b \wedge c\) for every \(a,b,c \ge 0.\) By choosing \(s>0\) large enough we have
for some \(C>0\) independent of N. Letting \(N \rightarrow \infty \) by the monotone convergence Theorem we have the result. \(\square \)
Next we study the convergence to equilibrium of the semigroup of the fast motion, i.e. we prove:
Lemma 4.4
There exists \(C>0\) such that
for every \(s \ge 0\), \(v \in K\), \(\phi \in Lip(K)\).
Moreover there exists \(C>0\) such that
for every \(s > 0\), \(\phi \in B_B(K)\), \(v \in K\).
Proof
First for every \(\phi \in Lip(K)\) by Lemma 4.1 we have:
for every \(s\ge 0\), \(v_1,v_2 \in K\).
Now let \(s>0\), by definition of invariant measure, (22) and Lemma 4.3 we have:
for every \(v \in K\), \(\phi \in Lip(K)\) so that we have the first inequality.
Now thanks to Hypothesis 5 we can apply [10, Theorem 9.32] to have the Bismut-Elworthy formula:
for every \(s>0\), \(\phi \in B_B(K)\).
Now by the semigroup property, the regularizing property of the semigroup (24) and by (22) we have:
for every \(s>0\), \(v_1,v_2 \in K\).
Finally similarly to before by (25) for \(s>0\) we have:
for every \(v \in K\), \(\phi \in B_B(K)\). \(\square \)
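Convergence to equilibrium can be made explicit in a scalar case where everything is available in closed form. The sketch below assumes the Ornstein-Uhlenbeck fast motion \(dv=-\lambda v\,ds+dw\) (i.e. \(G=0\)) and the test function \(\phi =\cos \), and compares \(|P_s\phi (v)-\int \phi \,d\mu |\) with an envelope of the form \(e^{-\lambda s}(1+|v|)\); this envelope mimics a bound of the type stated for Lipschitz \(\phi \) in Lemma 4.4, whose exact form is an assumption here. The function names are illustrative.

```python
import math


def semigroup_cos(s, v, lam=1.0):
    """P_s phi(v) for phi = cos and the OU process dv = -lam*v ds + dw.

    Uses v_s ~ N(exp(-lam*s)*v, (1 - exp(-2*lam*s)) / (2*lam)) and the
    Gaussian identity E[cos(Z)] = cos(m) * exp(-var/2) for Z ~ N(m, var).
    """
    mean = math.exp(-lam * s) * v
    var = (1.0 - math.exp(-2.0 * lam * s)) / (2.0 * lam)
    return math.cos(mean) * math.exp(-var / 2.0)


def invariant_cos(lam=1.0):
    """Integral of cos against the invariant law N(0, 1/(2*lam))."""
    return math.exp(-1.0 / (4.0 * lam))


def equilibrium_gap(s, v, lam=1.0):
    """Distance between P_s cos(v) and its average under the invariant law."""
    return abs(semigroup_cos(s, v, lam) - invariant_cos(lam))


if __name__ == "__main__":
    for s in (0.0, 1.0, 2.0, 5.0):
        print(s, equilibrium_gap(s, 2.0), math.exp(-s) * 3.0)
```

With \(\lambda =1\) the gap is bounded by \(e^{-s}(|v|+1/4)\), which follows from \(|\cos m-1|\le |m|\) and \(|e^{-a}-e^{-b}|\le |a-b|\), so the second printed column should stay below the third.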
Now we study the mixing properties of the semigroup of the fast motion. To this purpose define for \(0 \le s \le t \le \infty \), \(v \in K\)
Then a classical consequence of Lemmas 4.2, 4.4 is the following mixing result, whose proof is the same as that of [7, Lemma 3.2] and is reported in the appendix for completeness.
Lemma 4.5
There exists \(C>0\) such that
for every \(0 \le s \le t\), \(v \in K\).
Now Lemma 4.5 implies the following classical result, e.g. see [37] (see also [7, Proposition 3.3]). The proof can be found in the appendix for completeness.
Lemma 4.6
There exists \(C>0\) such that for every \(0\le s_1 \le t_1 < s_2 \le t_2\) and \(\xi _i\) \(\mathcal {H}_{s_i}^{t_i}(v)\)-measurable with \(|\xi _i|\le 1\) a.s., \(i=1,2\),
Since in our case \(|\xi _i|\) will not be bounded by 1 we need the following result, which is similar to [7, Proposition 3.3]. Also in this case we postpone the proof to the appendix.
Lemma 4.7
Let \(\rho \in (0,1)\). Then there exists \(C=C(\rho )>0\) such that for every \(0\le s_1 \le t_1 < s_2 \le t_2\) and \(\xi _i\) \(\mathcal {H}_{s_i}^{t_i}(v)\)-measurable, \(i=1,2\) satisfying for some \(K_i=K_i(\rho )>0\)
then:
5 Averaged Equation
In this section we introduce the averaged equation and we prove its well-posedness.
for every \(u\in H\) and consider the so-called averaged equation:
for every \(t\le T\).
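For reference, the objects of this section can be written as in the sketch below, which follows the description given in the Introduction (the averaged drift is the integral of F against the invariant measure \(\mu \) of the fast motion, and the averaged equation keeps the operator and the noise of the slow equation). The initial condition \(U_0=u\) is an assumption consistent with Proposition 1.

```latex
\begin{aligned}
\overline{F}(u) &:= \int_K F(u,v)\,\mu(dv), \qquad u \in H,\\
dU_t &= \big[A_1 U_t + \overline{F}(U_t)\big]\,dt + dW^{Q_1}_t, \qquad U_0 = u.
\end{aligned}
```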
Now we prove the well-posedness of the averaged equation:
Proposition 2
Equation (28) admits a unique mild solution given by:
for every \(t \in [0,T]\).
Moreover for every \(p>0\) there exists \(C=C(T,p)>0\) such that
Proof
In order to prove the first part of the Proposition it is sufficient to prove that \({\overline{F}}\) is Lipschitz (e.g. see [10]). But this follows from the Lipschitz continuity of F, indeed:
Hence \({\overline{F}}\) is Lipschitz continuous, which gives the first claim of the Proposition.
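The computation behind this step can be sketched as follows, assuming that \({\overline{F}}(u)=\int _K F(u,v)\,\mu (dv)\), that \(\mu \) is a probability measure, and that \(L_F\) is the Lipschitz constant of F in its first argument as in Hypothesis 3:

```latex
\big|\overline{F}(u_1)-\overline{F}(u_2)\big|
 = \Big|\int_K \big[F(u_1,v)-F(u_2,v)\big]\,\mu(dv)\Big|
 \le \int_K \big|F(u_1,v)-F(u_2,v)\big|\,\mu(dv)
 \le L_F\,|u_1-u_2| .
```

The integrals are well defined because F has at most linear growth in v and \(\mu \) has finite moments (compare Lemma 4.3).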
The proof of the second claim of the Proposition follows by a standard application of Gronwall’s Lemma. \(\square \)
6 Preliminary Results
In this section we prove a technical result, i.e. Lemma 6.3, which is inspired by [7, Lemma 4.2] and follows by the mixing properties of the fast motion studied in Sect. 4, in particular Lemma 4.7. In order to prove it we proceed with similar techniques to the ones of the proof of [7, Lemma 4.2].
Fix \({\overline{\xi }} \in {\mathcal {C}}([0,T];H)\), \(v \in K, h\in H\) with \(|h|=1\), and define
and
Moreover let \(n \in \mathbb {N}\) and set:
for every \(1\le j\le n\), \(0 \le r_1 \le \ldots \le r_{2n} \le T/\epsilon \).
First we show the following result:
Lemma 6.1
Let \(0< \rho < 1\) then there exists \(C=C(T,\rho ,n)>0\) and \(\eta =\eta (\rho ,j_2-j_1)>0\) such that
for every \(u \in H,v \in K\), \(1\le j_1 \le j_2 \le n\), \(0 \le r_1 \le \ldots \le r_{2n} \le T/\epsilon \).
Moreover there exists \(C=C(T,\rho ,n)>0\) and \(\eta = \eta (\rho ,n) >0\) such that
where
for every \(1\le j\le n\), \(0 \le r_1 \le \ldots \le r_{2n} \le T/\epsilon \).
Remark 6.2
Note that the dependence of the exponents with respect to \(j_2-j_1\) and n is not restrictive: once n has been fixed (together with \(\rho \) and T) we are allowed to take any \(0 \le j_1 \le j_2 \le n\). In this sense \(\eta =\eta (\rho , j_2-j_1)\). Of course since \(j_2-j_1 \le n\) we could choose \(\eta '=\eta '(\rho ,n) \ge \eta \) and replace \(\eta \) with \(\eta '\) in the estimates of the Lemma. However the estimate with \(\eta \) is more precise.
Proof
By the sublinearity of \(\Psi _h\) and Lemma 4.2 for every \(p\ge 1\) we have:
for every \(1 \le j_1 \le j_2 \le 2n\).
Notice that \(V^{\epsilon }_{\epsilon r}\) and \(v^v_r\), defined by (18) for \(w_r^{Q_2}= W^{Q_2}_{\epsilon r}/ \sqrt{\epsilon }\), are indistinguishable for \(r \in [0,T/\epsilon ]\). Then by setting \(p=2/(1-\rho )\) and applying Lemma 4.7 to \(\xi _1=\prod _{i=1}^{j} \Psi _h \left( r_{i}\right) \), \(\xi _2=\prod _{i=j+1}^{2 n} \Psi _h \left( r_{i}\right) \) with \(K_1=C(1+\sup _{r \le T}|{\overline{\xi }}_r|^j+|v|^j)\), \(K_2=C(1+\sup _{r \le T}|{\overline{\xi }}_r|^{2n-j}+|v|^{2n-j})\) for every \(0 \le j_1< j < j_2 \le 2n\) we have:
where \(\eta =j_2-j_1 + \frac{\rho ({j_2}-j_1-1)}{(\rho +2)}\).
By definitions of \(\Psi _h\), the indistinguishability of \(V^{\epsilon }_{\epsilon r}\), \(v^v_r\) and by Lemma 4.4 we have:
Now by the last three inequalities we have:
where \( \eta =(j_2-j_1 +1)+ \frac{\rho (j_2-j_1)}{(\rho +2)}\). This implies (30).
Now by (30) we have:
for every \(j_1 <2n.\)
Now fix any \(j < 2n\). Then by the last inequality and (32) we have:
Finally consider (33) with \(j_1=1,j_2=2n\) and (36). Since the function \(f(s)=e^{-\delta s} s^{-1/2}\) is decreasing we have
This implies (31). \(\square \)
Let \(0<\alpha \), \(0 \le \beta < 1/3\) and set
Now we can state and prove the main result of this section.
Lemma 6.3
Let \(n \in \mathbb {N}\) and \(0 \le \beta < 1/3\). Then there exists a constant \(C=C(T,n,\beta )>0\) and \(\eta =\eta (n)>0\) such that for every \(\epsilon >0\), \(\alpha >0\), \(u \in H,v \in K\), \(h\in H\) with \(|h|=1\) we have:
Proof
Recall the definition of \(\Psi _h(r)\), then by a change of variable we have:
where we have defined:
By symmetry we have:
We proceed by induction on n and to this purpose fix some \(\rho \in (0,1)\). Consider \(n=1\) then by the definition of \(\theta _{\alpha ,\beta }=e^{-r \alpha }r^{-\beta }\), (30) and some changes of variables we have
so that by (37) we have the thesis for \(n=1\).
Now assume that
for every even \(j<2n\) where \(\eta _j=j + \frac{\rho (j-1)}{(\rho +2)}\).
We prove that then it holds for \(j=2n\).
Set for \(r=(r_1,\ldots r_{2n}) \in (s,t)^{2n}\) with \(s \le r_1 \le \ldots \le r_{2n} \le t\) the integer j(r) such that
and consider \(H_\epsilon (s,t)\) given by (38). Recalling the definition of \(J_j(r_1,\ldots r_{2n})\) given by (29) we have:
Note that by definition of j(r), for every \(s \le r_1 \le \ldots \le r_{2n} \le t\), we have
It follows:
Now by (31) and the definition of j(r) we have:
where \(\bar{\rho }=\frac{\rho }{2(2+\rho )}\).
Recall that \(r_{i+1} \ge r_{i}\) for every i and note that \(\max \left( r_{2 n}-r_{2 n-1}, r_{2n-1}-r_{2n-2}\right) \ge 1/2 (r_{2 n}-r_{2 n-2})\) and \(\max \left( r_{2 n}-r_{2 n-1}, r_{2n-1}-r_{2n-2},\cdots ,r_{3}-r_{2}\right) \ge \frac{1}{n} \left( r_{2 n}-r_{2 n-1}+\sum _{i=1}^{n-1} r_{2 i+1}-r_{2 i}\right) .\) Hence, since \(g(t)=1/t^{{\bar{\rho }}}\), \(f(t)=e^{-\alpha t}\) for \(\alpha >0\) are decreasing, we have (recall that C may change from line to line):
where in the last equality we have defined \(\delta _{n}=\frac{\delta \rho }{n(2+\rho )}\) and we have used the following identity: \(\sum _{i=1}^{n-1} r_{2 i+1}-r_{2 i}=r_{2n-1}-r_{2n-2}+ \sum _{i=1}^{n-2} r_{2 i+1}-r_{2 i}\).
We can apply this last inequality in order to estimate \(I_{1,\epsilon }(s,t)\), i.e.
Now for \(k=1,2,3\), \(i=1,3, \ldots 2n-1\) we obtain:
Now consider \(i=2,4,\ldots 2n\), we have:
Now consider the remaining term in the inequality for \(I_{1,\epsilon }(s,t)\): by (40) and (41) we have
Applying now (42) and (43) to the inequality for \(I_{1,\epsilon }(s,t)\) we have:
In addition by the inductive hypothesis we have:
Finally applying the last two inequalities to (39) and then going back to (37) we have the thesis. \(\square \)
7 Order of Convergence
In this section we finally investigate the order of convergence. We first prove the following proposition which will be crucial in the derivation of the order of convergence.
Proposition 3
There exist \(C=C(T)>0\), \(\iota >0\) such that
for every \(0 \le \tau \le T\), \(\epsilon >0\), \(u \in H,v \in K\).
Proof
We proceed with similar techniques to the ones in the proof of [7, Theorem 4.1].
Define
We proceed using the factorization method, e.g. see [10, Chapter 5, Section 3]: fix \(\zeta >0,n \ge 2\) integer and \( 1/(2n)< \beta < 1/3\) as in Hypothesis 1, then:
where
By Holder’s inequality we have:
We now claim that there exist \(C=C(T)>0\) and \(\eta >0\) such that
for every \(0 \le t_0 \le s \le T\), \(\epsilon >0\), \(u \in H,v \in K\).
Indeed first recall the spectral representation (3). Then by Parseval’s identity, Holder’s inequality, Hypothesis 1 and the properties of conditional expectations we have:
Now consider the conditional expectation on the right-hand-side. Fixing \(s \ge 0\), due to the independence of the averaged component \(\{U_r, \forall r \le s \}\) (which is \(\sigma \{ W_r^{Q_1}, \forall r \le s \} -\)measurable) and the fast component \(\{V_r^\epsilon , \forall r \le s \}\) (which is \(\sigma \{ W_r^{Q_2}, \forall r \le s \}- \)measurable being independent of \(\sigma \{ W_r^{Q_1}, \forall r \le s \} \)), for every \({\overline{\xi }} \in C([0,s]; H)\) we have
where the last inequality follows by Lemma 6.3.
Hence we have:
Inserting this inequality in the one for \(\mathbb {E} \left| Y^{\epsilon }_s\right| ^{2 n}\) we have:
The series on the right-hand-side is convergent by Hypothesis 1 and we have (47), so that the claim is proved.
Inserting (47) into (46) we have:
Finally by Holder’s inequality we have the thesis:
for \(\iota =\eta / n\). \(\square \)
We can now state and prove the main Theorem of this work:
Theorem 1
Let \(u \in H\), \(v \in K\) and assume Hypotheses 1, 2, 3, 4, 5. Then there exists \(C=C(T,|u|,|v|)>0\) such that
for every \(\epsilon >0\).
Proof
For \(t \in [0,T]\) we have:
so that:
Now let \(0 \le \tau \le T\) and compute
For the first term on the right-hand-side by the Lipschitz continuity of F we have:
For the second term on the right-hand-side by Proposition 3 we have:
Putting everything together we have:
for every \(\tau \le T\).
Then by Gronwall’s Lemma we have the thesis of the Theorem:
\(\square \)
Finally we can provide an application to which our theory can be applied and which is not covered by the existing literature.
Example 1
Consider the following fully coupled slow-fast stochastic reaction-diffusion system:
where
-
\(t \in [0,T], \xi \in [0,L]\),
-
\(\epsilon \in (0,1]\) is a small parameter representing the ratio of time-scales between the two variables of the system \(u_\epsilon \) and \(v_\epsilon \),
-
\(u_\epsilon \) and \(v_\epsilon \) are the slow and fast components respectively,
-
\(u,v \in H= L^2[0,L]\) are the initial conditions,
-
\(\lambda >0\),
-
\(f,g :[0,L] \times \mathbb {R} \rightarrow \mathbb {R}\) are Lipschitz functions, uniformly with respect to \(\xi \), with Lipschitz constants \(L_f,L_G\) respectively and \(L_G < \lambda \),
-
\({\dot{w}}_1, {\dot{w}}_2\) are independent white noises both in time and space.
Then it is well known [6] that (49) can be rewritten in the abstract form (2) where \(H=K=L^2[0,L]\), \(F:H\times H\rightarrow H\), \(G:H \rightarrow H\) are the Nemytskii operators of f, g respectively, i.e.
In this setting the hypotheses of Theorem 1 are satisfied (recall Remarks 2.1, 2.2) so that the result can be applied.
References
Bardi, M., Cesaroni, A., Manca, L.: Convergence by viscosity methods in multiscale financial models with stochastic volatility. SIAM J. Financ. Math. 1(1), 230–265 (2010). https://doi.org/10.1137/090748147
Bréhier, C.E.: Strong and weak orders in averaging for SPDEs. Stoch. Process. Appl. 122(7), 2553–2593 (2012)
Bréhier, C.E.: Analysis of an HMM time-discretization scheme for a system of stochastic PDEs. SIAM J. Numer. Anal. 51, 1185–1210 (2013)
Bréhier, C.E.: Orders of convergence in the averaging principle for SPDEs: the case of a stochastically forced slow component. Stoch. Process. Appl. 130(6), 3325–3368 (2020)
Cerrai, S., Freidlin, M.: Averaging principle for stochastic reaction-diffusion equations. Prob. Theory Relat. Fields 144, 137–177 (2009)
Cerrai, S.: A Khasminskii type averaging principle for stochastic reaction-diffusion equations. Ann. Appl. Probab. 19(3), 899–948 (2009). https://doi.org/10.1214/08-AAP560
Cerrai, S.: Normal deviations from the averaged motion for some reaction-diffusion equations with fast oscillating perturbation. J. Math. Pures Appl. 91(6), 614–647 (2009)
Cerrai, S.: Averaging principle for systems of reaction-diffusion equations with polynomial nonlinearities perturbed by multiplicative noise. SIAM J. Math. Anal. 43, 2482–2518 (2011)
Cerrai, S., Lunardi, A.: Averaging principle for nonautonomous slow-fast systems of stochastic reaction-diffusion equations: the almost periodic case. SIAM J. Math. Anal. 49, 2843–2884 (2017). https://doi.org/10.1137/16M1063307
Da Prato, G., Zabczyk, J.: Stochastic Equations in Infinite Dimensions. Cambridge University Press, Cambridge (2014)
Da Prato, G., Zabczyk, J.: Ergodicity for Infinite Dimensional Systems, vol. 229. Cambridge University Press, Cambridge (1996)
de Feo, F.: The averaging principle for stochastic differential equations and a financial application. Master’s Thesis, Politecnico di Milano (2020). https://www.politesi.polimi.it/handle/10589/165293
de Feo, F.: The averaging principle for non-autonomous slow-fast stochastic differential equations and an application to a local stochastic volatility model. J. Differ. Equ. 302, 406–443 (2021)
Dong, Z., Sun, X., Xiao, H., Zhai, J.: Averaging principle for one dimensional stochastic Burgers equation. J. Differ. Equ. 265, 4749–4797 (2018)
Fu, H., Wan, L., Liu, J.: Strong convergence in averaging principle for stochastic hyperbolic-parabolic equations with two time-scales. Stoch. Process. Appl. 125(8), 3255–3279 (2015)
Givon, D.: Strong convergence rate for two-time-scale jump-diffusion stochastic differential systems. Multiscale Modeling Simul. 6(2), 577–594 (2007)
Golec, J.: Stochastic averaging principle for systems with pathwise uniqueness. Stoch. Anal. Appl. 13, 307–322 (1995)
Golec, J., Ladde, G.: Averaging principle and systems of singularly perturbed stochastic differential equations. J. Math. Phys. 31, 1116–1123 (1990)
Guatteri, G., Tessitore, G.: Singular limit of BSDEs and optimal control of two scale stochastic systems in infinite dimensional spaces. Appl. Math. Optim. 83(2), 1025–1051 (2021)
Guatteri, G., Tessitore, G.: Singular limit of two-scale stochastic optimal control problems in infinite dimensions by vanishing noise regularization. SIAM J. Control Optim. 60(1), 575–596 (2022)
Fouque, J.P., Papanicolaou, G., Sircar, R.: Derivatives in Financial Markets with Stochastic Volatility. Cambridge University Press, Cambridge (2000)
Fouque, J.P., Papanicolaou, G., Sircar, R., Solna, K.: Multiscale Stochastic Volatility for Equity, Interest Rate, and Credit Derivatives. Cambridge University Press, Cambridge (2011)
Freidlin, M.I., Wentzell, A.D.: Random Perturbations of Dynamical Systems. Springer, New York (1998)
Fu, H., Wan, L., Liu, J.: Strong convergence in averaging principle for stochastic hyperbolic-parabolic equations with two time-scales. Stoch. Process. Appl. 125(8), 3255–3279 (2015). https://doi.org/10.1016/j.spa.2015.03.004
Fu, H., Wan, L., Liu, J., Zhang, B.: Weak order in averaging principle for stochastic differential equations with jumps. Adv. Differ. Equ. Paper No. 197, 20 pp (2018)
Fu, H., Wan, L., Liu, J., Liu, X.: Weak order in averaging principle for stochastic wave equation with a fast oscillation. Stoch. Process. Appl. 128(8), 2557–2580 (2018)
Khasminskii, R.Z.: On an averaging principle for Itô stochastic differential equations. Kibernetika 4(3), 260–279 (1968). (Russian)
Khasminskii, R.Z., Yin, G.: On averaging principles: an asymptotic expansion approach. SIAM J. Math. Anal. 35(6), 1534–1560 (2004)
Liu, D., Weinan, E., Vanden-Eijnden, E.: Analysis of multiscale methods for stochastic differential equations. Commun. Pure Appl. Math. 58(11), 1544–1585 (2005)
Liu, D.: Strong convergence of principle of averaging for multiscale stochastic dynamical systems. Commun. Math. Sci. 8, 999–1020 (2010)
Liu, W., Röckner, M., Sun, X., Xie, Y.: Averaging principle for slow-fast stochastic differential equations with time dependent locally Lipschitz coefficients. J. Differ. Equ. 268(6), 2910–2948 (2020). https://doi.org/10.1016/j.jde.2019.09.047
Wu, F., Tian, T., Rawlings, J.B., Yin, G.: Approximate method for stochastic chemical kinetics with two-time scales by chemical Langevin equations. J. Chem. Phys. 144(17), 174112 (2016). https://doi.org/10.1063/1.4948407
Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations. Springer, New York (1983)
Roberts, A.J., Wang, W.: Average and deviation for slow-fast stochastic partial differential equations. J. Differ. Equ. 253(5), 1265–1286 (2012)
Röckner, M., Sun, X., Xie, L.: Strong and weak convergence in the averaging principle for SDEs with Hölder coefficients. arXiv preprint arXiv:1907.09256 (2019)
Röckner, M., Xie, L., Yang, L.: Asymptotic behavior of multiscale stochastic partial differential equations. arXiv preprint arXiv:2010.14897 (2020)
Rozanov, I.A.: Stationary Random Processes. Holden-Day (1967)
Święch, A.: Singular perturbations and optimal control of stochastic systems in infinite dimension: HJB equations and viscosity solutions. ESAIM Control Optim. Calculus Variations 27, 6 (2021)
Uda, K.: Averaging principle for stochastic differential equations in the random periodic regime. Stoch. Process. Appl. 139, 1–36 (2021). https://doi.org/10.1016/j.spa.2021.04.017
Veretennikov, A.Yu.: On the averaging principle for systems of stochastic differential equations. Math. USSR-Sbornik 69(1), 271 (1991)
Xu, J., Liu, J., Miao, Y.: Strong averaging principle for two-time-scale SDEs with non-Lipschitz coefficients. J. Math. Anal. Appl. 468(1), 116–140 (2018)
Acknowledgements
The author warmly thanks his PhD supervisor Giuseppina Guatteri for her precious and constant support during the preparation of this manuscript. He also thanks Sandra Cerrai and Gianmario Tessitore for some relevant conversations related to the content of the present paper. Finally, he thanks the two anonymous referees for their very careful reading of the manuscript and their precious remarks and suggestions.
Funding
This research was partially financed by the INdAM - GNAMPA Project “Riduzione del modello in sistemi dinamici stocastici in dimensione infinita a due scale temporali”. Open access funding provided by Politecnico di Milano within the CRUI-CARE Agreement.
Ethics declarations
Conflict of interest
The author declares that there is no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendices
Proof of Lemma 4.5
Proof
First observe that
where
is the family of cylindrical sets.
Consider \(B_{1} \in \mathcal {C}_{0}^{t}(v)\) and \(B_{2} \in \mathcal {C}_{s+t}^{\infty }(v)\), i.e.
for \(0 \le r_{1,1}< \cdots <r_{1, k_{1}} \le t\) and \(s+t \le r_{2,1}<\cdots<r_{2, k_{2}}<\infty \) and \(A_{j, i} \in \mathcal {B}(K)\), for \(j=1,2\) and \(i=1, \ldots , k_{j}\).
First by the tower property of conditional expectations we have:
where \(1_{A_{j,i}}(\cdot )\) is the indicator function of the set \(A_{j,i}\).
Now, as \(t+s \leqslant r_{2,1}<\cdots <r_{2, k_{2}}\), we have:
By iteration we have:
so that by (51) we have:
In a similar way we have:
so that
It follows
for \(\phi {:}{=}1_{A_{2,1}} P_{r_{2,2}-r_{2,1}}\left( 1_{A_{2,2}} P_{r_{2,3}-r_{2,2}}\left( \cdots 1_{A_{2,k_2-1}} P_{r_{2,k_2}-r_{2,k_2-1}}\left( 1_{A_{2,k_2}} \right) \cdots \right) \right) \).
Then by Lemmas 4.2 and 4.4 and as \(f(s)=e^{-\delta s} s^{-1/2}\) is decreasing we have:
so that the inequality holds when \(B_{1} \in \mathcal {C}_{0}^{t}(v)\) and \(B_{2} \in \mathcal {C}_{t+s}^{\infty }(v)\).
Finally recalling (50) the validity of the inequality can be extended to every \(B_{1} \in \mathcal {H}_{0}^{t}(v)\) and \(B_{2} \in \mathcal {H}_{s+t}^{\infty }(v)\). \(\square \)
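As a side check of the monotonicity of \(f(s)=e^{-\delta s} s^{-1/2}\) used in the proof above, and assuming only \(\delta \ge 0\) and \(s>0\) (the precise range of \(\delta \) is fixed by the hypotheses of the paper and is not restated here), a direct differentiation gives

\[
f'(s) = -\delta e^{-\delta s} s^{-1/2} - \frac{1}{2} e^{-\delta s} s^{-3/2} = -e^{-\delta s} s^{-1/2}\left( \delta + \frac{1}{2s}\right) < 0 ,
\]

so that \(f(r) \le f(s)\) whenever \(r \ge s > 0\).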
Proof of Lemma 4.6
Proof
As \(|\xi _1| \le 1\) a.s. we have:
where we have defined
Similarly, as \(|\xi _2| \le 1\) a.s., we have:
where we have defined
Then it follows:
Notice that
with
It follows:
Now by Lemma 4.5 as \(A \in \mathcal {H}_{s_{1}}^{t_{1}}(v), B \in \mathcal {H}_{s_{2}}^{t_{2}}(v)\) we have:
which proves the lemma. \(\square \)
Proof of Lemma 4.7
Proof
We proceed in a similar way to the proof of [7, Proposition 3.3].
Indeed fix \(R>0\) and set \(A_{1, R}=\left\{ \omega \in \Omega : \left| \xi _{1}\right| \le R\right\} \), \(A_{2, R}=\left\{ \omega \in \Omega :\left| \xi _{2}\right| \le R\right\} \).
Then we have:
Consider \(T_{1, R}\), then we have:
Now, as \(\left| \frac{\xi _{i}}{R} 1_{A_{i, R}}\right| \le 1\) a.s., by Lemma 4.6 we have:
For \(T_{2, R}\), by Hölder’s and Markov’s inequalities we have:
and then by (26) we have:
For the first term of \(T_{3,R}\) we have:
The other terms can be treated analogously, so that, as before, we have:
Now by inserting the inequalities for \(T_{i,R}\) into the first equation we have:
By minimizing the right-hand side of the previous inequality over \(R>0\) we have:
\(\square \)
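The final minimization over \(R\) can be illustrated on a model bound. This is only a sketch under assumptions: the constants \(a,b>0\) and the exponent \(p>0\) are hypothetical placeholders and are not the quantities appearing in the proof above. If the right-hand side has the form \(g(R)=aR^{2}+bR^{-p}\), then \(g\) is strictly convex on \((0,\infty )\) and

\[
g'(R) = 2aR - p\,b\,R^{-p-1} = 0 \quad \Longleftrightarrow \quad R_{*}^{\,p+2} = \frac{p\,b}{2a} .
\]

Since \(b R_{*}^{-p} = \frac{2a}{p} R_{*}^{2}\), the minimum value is

\[
\min _{R>0} g(R) = g(R_{*}) = a\left( 1+\frac{2}{p}\right) \left( \frac{p\,b}{2a}\right) ^{\frac{2}{p+2}} = C_{p}\, a^{\frac{p}{p+2}}\, b^{\frac{2}{p+2}}, \qquad C_{p} {:}{=} \left( 1+\frac{2}{p}\right) \left( \frac{p}{2}\right) ^{\frac{2}{p+2}} .
\]

In this model case the optimal truncation level balances the two terms, and the resulting bound is a weighted geometric mean of \(a\) and \(b\) up to the constant \(C_{p}\).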
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
de Feo, F. The Order of Convergence in the Averaging Principle for Slow-Fast Systems of Stochastic Evolution Equations in Hilbert Spaces. Appl Math Optim 88, 39 (2023). https://doi.org/10.1007/s00245-023-10018-0
DOI: https://doi.org/10.1007/s00245-023-10018-0