1 Introduction

For \(i=1,2\), let \((H_i, \Vert \cdot \Vert _{H_i})\) be a separable Hilbert space with inner product \(\langle \cdot ,\cdot \rangle _{H_i}\) and dual \(H^{*}_i\). Let \((V_i, \Vert \cdot \Vert _{V_i})\) be a reflexive Banach space such that \(V_i\subseteq H_i\) continuously and densely. Then for its dual space \(V^{*}_i\) it follows that \(H^{*}_i\subseteq V^{*}_i\) continuously and densely. Identifying \(H_i\) and \(H^{*}_i\) via the Riesz isomorphism, we have that

$$\begin{aligned} V_i\subseteq H_i\equiv H^{*}_i\subseteq V^{*}_i \end{aligned}$$

is a Gelfand triple. Let \(_{V^{*}_i}\langle ~, ~\rangle _{V_i}\) denote the dual pairing between \(V^{*}_i\) and \(V_i\). Then it follows that

$$\begin{aligned} _{V^{*}_i}\langle z_i, v_i\rangle _{V_i} =\langle z_i,v_i\rangle _{H_i},\quad \text {for all}~z_i\in H_i, v_i\in V_i. \end{aligned}$$

For \(i=1,2\), let \(\{W^{i}_t\}_{t\geqslant 0}\) be a cylindrical \(\mathscr {F}_t\)-Wiener process in a separable Hilbert space \((U_i, \Vert \cdot \Vert _{U_i})\) on a probability space \((\Omega ,\mathscr {F},\mathbb {P})\) with natural filtration \(\mathscr {F}_{t}\). Let \(L_{2}(U_i,H_i)\) be the space of Hilbert–Schmidt operators from \(U_i\) to \(H_i\). The norm on \(L_{2}(U_i,H_i)\) is defined by

$$\begin{aligned} \Vert S\Vert ^2_{L_{2}(U_i,H_i)}:=\sum _{k\in \mathbb {N}}\Vert S e_{i,k}\Vert ^2_{H_i},\quad S\in L_{2}(U_i,H_i), \end{aligned}$$

where \(\{e_{i,k}\}_{k\in \mathbb {N}}\) is an orthonormal basis of \(U_i\). We also assume the processes \(\{W^{1}_t\}_{t\geqslant 0}\) and \(\{W^{2}_t\}_{t\geqslant 0}\) are independent.
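In finite dimensions the Hilbert–Schmidt norm is just the Frobenius norm, and it does not depend on the chosen orthonormal basis. The following sketch (Python/NumPy; the matrix \(S\) is an arbitrary illustrative stand-in for an operator in \(L_2(U_i,H_i)\), not from the paper) checks both facts:

```python
import numpy as np

# Finite-dimensional stand-in for S in L_2(U_i, H_i): a 3x3 matrix.
S = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0],
              [3.0, 0.0, 1.0]])

def hs_norm_sq(S, basis):
    """||S||_HS^2 = sum_k ||S e_k||^2 over the orthonormal basis columns."""
    return sum(np.linalg.norm(S @ e) ** 2 for e in basis.T)

standard = np.eye(3)                                     # standard ONB of R^3
rotated, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(3, 3)))

# Equals the squared Frobenius norm, independently of the basis.
assert np.isclose(hs_norm_sq(S, standard), np.linalg.norm(S, 'fro') ** 2)
assert np.isclose(hs_norm_sq(S, standard), hs_norm_sq(S, rotated))
```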

In this paper, we consider the following system of abstract stochastic partial differential equations (SPDEs):

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle dX^{\varepsilon }_t=\left[ A(X^{\varepsilon }_t)+F(X^{\varepsilon }_t, Y^{\varepsilon }_t)\right] dt +G_1(X^{\varepsilon }_t)d W^{1}_{t},\\ \displaystyle dY^{\varepsilon }_t=\frac{1}{\varepsilon }B(X^{\varepsilon }_t, Y^{\varepsilon }_t)dt +\frac{1}{\sqrt{\varepsilon }}G_2(X^{\varepsilon }_t, Y^{\varepsilon }_t)d W^{2}_{t},\\ X^{\varepsilon }_0=x\in H_1, Y^{\varepsilon }_0=y\in H_2,\end{array}\right. \end{aligned}$$
(1.1)

where \(\varepsilon >0\) is a small parameter describing the ratio of time scales between the slow component \(X^{\varepsilon }_t\) and the fast component \(Y^{\varepsilon }_t\), and the coefficients

$$\begin{aligned} A: V_1\rightarrow V^{*}_1;\quad F:H_1\times H_2\rightarrow H_1; \quad G_1: V_1\rightarrow L_{2}(U_1; H_1); \end{aligned}$$

and

$$\begin{aligned} B: H_1\times V_2\rightarrow V^{*}_2; \quad G_2:H_1\times V_2\rightarrow L_{2}(U_2; H_2) \end{aligned}$$

are measurable.
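For orientation, here is a minimal one-dimensional caricature of system (1.1), simulated by Euler–Maruyama. All coefficients are hypothetical toy choices (\(A(x)=-x^3\) as a monotone drift with no linear part, \(F(x,y)=y\), \(B(x,y)=-(y-x)\), constant noise coefficients); this is not the paper's setting, it only illustrates the slow–fast scaling:

```python
import numpy as np

def simulate(eps, T=1.0, dt=1e-4, x0=1.0, y0=0.0, seed=0):
    """Euler-Maruyama for a 1-D toy version of the slow-fast system (1.1):
        dX = (-X^3 + Y) dt + 0.1 dW1
        dY = -(Y - X)/eps dt + (0.2/sqrt(eps)) dW2
    The time step dt must be small relative to eps for stability."""
    rng = np.random.default_rng(seed)
    x, y = x0, y0
    for _ in range(int(T / dt)):
        dW1, dW2 = rng.normal(0.0, np.sqrt(dt), size=2)
        x, y = (x + (-x**3 + y) * dt + 0.1 * dW1,
                y - (y - x) / eps * dt + 0.2 / np.sqrt(eps) * dW2)
    return x, y

x_eps, y_eps = simulate(eps=1e-2)
assert np.isfinite(x_eps) and np.isfinite(y_eps)
assert abs(x_eps) < 5.0   # the strongly dissipative drift keeps X of order one
```

As \(\varepsilon \rightarrow 0\), the slow path approaches the solution of an averaged equation; this is the convergence quantified in Theorem 2.4 below.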

The averaging principle has a long and rich history in multiscale models, which have wide applications in material science, chemistry, fluid dynamics, biology, ecology and climate dynamics; see, e.g., [1, 10, 17, 22] and the references therein. Usually, a multiscale model is described by coupled equations corresponding to the "slow" and "fast" components, respectively. The averaging principle is essential for describing the asymptotic behavior of the slow component, i.e., the slow component converges to the solution of the so-called averaged equation. Bogoliubov and Mitropolsky [2] first studied the averaging principle for deterministic systems, and it was later extended to stochastic differential equations by Khasminskii [18].

Since the averaging principle for a general class of stochastic reaction-diffusion systems with two time-scales was investigated by Cerrai and Freidlin in [6], the averaging principle for slow–fast SPDEs has attracted considerable attention in the past decade, covering other types of SPDEs, various modes of convergence and rates of convergence. For instance, Bréhier obtained the strong and weak orders in averaging for stochastic evolution equations of parabolic type with slow and fast time scales in [3]. Fu, Wan and Liu proved the strong averaging principle for stochastic hyperbolic-parabolic equations with slow and fast time-scales in [13]. Cerrai and Lunardi studied the averaging principle for nonautonomous slow–fast systems of stochastic reaction-diffusion equations in [7]. For further results on this topic, we refer to [4, 11, 12, 21, 24, 25] and the references therein.

However, the references mentioned above always assume that the coefficients satisfy Lipschitz conditions, and there are only a few results on the averaging principle for SPDEs with nonlinear terms: for example, stochastic reaction-diffusion equations with polynomial coefficients [5], the stochastic Burgers equation [9], the stochastic two dimensional Navier–Stokes equations [19], the stochastic Kuramoto-Sivashinsky equation [14], the stochastic Schrödinger equation [15] and the stochastic Klein-Gordon equation [16]. All of these papers consider semilinear SPDEs (i.e., operators \(A=A_1+A_2\) with \(A_1\) a linear operator and \(A_2\) a nonlinear perturbation) and use the mild solution approach to SPDEs, exploiting in an essential way the smoothing properties of the \(C_0\)-semigroup \(e^{A_1 t}\) generated by the linear operator \(A_1\). To the best of our knowledge, the case where the operator A has no linear part, such as the porous medium operator or the p-Laplace operator, has not been studied yet.

Hence, the main purpose of this paper is to prove the strong averaging principle for slow–fast SPDEs within the (generalized) variational framework, i.e., with locally monotone and strongly monotone coefficients for the slow and fast equations, respectively. Our result covers a large class of examples (see [20, Sects. 4 and 5]), especially the case where the slow equation is a quasilinear SPDE, such as the stochastic porous medium equation or the stochastic p-Laplace equation. Our result also applies to the stochastic Burgers type equation and the stochastic two dimensional Navier–Stokes equation, whose coefficients only satisfy local monotonicity conditions.

The main difficulty is to avoid the techniques that only work within the mild solution approach and to rely instead on techniques from the variational approach. More precisely, we use the variational approach to estimate the integral of the time increments of \(X_{t}^{\varepsilon }\) instead of studying its Hölder continuity in time, which is sufficient for our purpose. We also use the variational approach to obtain some a priori estimates of the solution, which are crucial for constructing a proper stopping time to handle the nonlinear terms.

The rest of the paper is organized as follows. In Sect. 2, under some suitable assumptions, we formulate our main result. Section 3 is devoted to proving the main result. In Sect. 4, we will give some examples to illustrate the wide applicability of our result. In the Appendix, we give the detailed proof of the existence and uniqueness of solutions to system (1.1).

Throughout the paper, C, \(C_T\) and \(C_{p,T}\) denote positive constants which may change from line to line, where the subscripts of \(C_T\) and \(C_{p,T}\) emphasize that these constants depend on T and on p and T, respectively.

2 Main Result

For the coefficients of the slow equation, we suppose that there exist constants \(\alpha \in (1,\infty )\), \(\beta \in [0, \infty )\), \(\theta \in (0,\infty )\) and \(C>0\) such that the following conditions hold for all \(u,v,w\in V_1\), \(u_1,v_1\in H_1\) and \(u_2,v_2\in H_2\):

A1:

(Hemicontinuity) The map  \(\lambda \rightarrow {_{V^{*}_1}}\langle A(u+\lambda v), w\rangle _{V_1}\) is continuous on \(\mathbb {R}\).

A2:

(Local monotonicity)

$$\begin{aligned} 2_{V^{*}_1}\langle A(u)-A(v), u-v\rangle _{V_1}+\Vert G_1(u)-G_1(v)\Vert ^2_{L_2(U_1,H_1)}\leqslant \rho (v)\Vert u-v\Vert ^2_{H_1}, \end{aligned}$$

where \(\rho : V_1\rightarrow [0,\infty )\) is a measurable hemicontinuous function satisfying

$$\begin{aligned} |\rho (v)|\leqslant C\left[ (1+\Vert v\Vert ^{\alpha }_{V_1})(1+\Vert v\Vert ^{\beta }_{H_1})\right] . \end{aligned}$$

Furthermore,

$$\begin{aligned} \Vert F(u_1,u_2)-F(v_1,v_2)\Vert _{H_1}\leqslant C(\Vert u_1-v_1\Vert _{H_1}+\Vert u_2-v_2\Vert _{H_2}). \end{aligned}$$
(2.1)
A3:

(Coercivity)

$$\begin{aligned} _{V^{*}_1}\langle A(v), v\rangle _{V_1}\leqslant C\Vert v\Vert ^2_{H_1}-\theta \Vert v\Vert ^{\alpha }_{V_1}+C. \end{aligned}$$
A4:

(Growth)

$$\begin{aligned} \Vert A(v)\Vert ^{\frac{\alpha }{\alpha -1}}_{V^{*}_1}\leqslant C(1+\Vert v\Vert ^{\alpha }_{V_1})(1+\Vert v\Vert ^{\beta }_{H_1}) \end{aligned}$$

and

$$\begin{aligned} \Vert G_1(v)\Vert _{L_2(U_1,H_1)}\leqslant C(1+\Vert v\Vert _{H_1}). \end{aligned}$$

For the coefficients of the fast equation, we suppose that there exist constants \(\kappa \in (1,\infty )\), \(\gamma ,\eta \in (0, \infty )\), \(\zeta \in (0,1)\) and \(C>0\) such that the following conditions hold for all \(u,v,w\in V_2\), \(u_1,v_1\in H_1\):

B1:

(Hemicontinuity) The map  \(\lambda \rightarrow _{V^{*}_2}\langle B(u_1+\lambda v_1,u+\lambda v), w\rangle _{V_2}\) is continuous on  \(\mathbb {R}\).

B2:

(Strong monotonicity)

$$\begin{aligned}{} & {} 2_{V^{*}_2}\langle B(u_1,u)-B(v_1,v), u-v\rangle _{V_2}+\Vert G_2(u_1,u)-G_2(v_1,v)\Vert ^2_{L_2(U_2,H_2)}\nonumber \\{} & {} \quad \leqslant -\gamma \Vert u-v\Vert ^2_{H_2}+C\Vert u_1-v_1\Vert ^2_{H_1}. \end{aligned}$$
(2.2)
B3:

(Coercivity)

$$\begin{aligned} _{V^{*}_2}\langle B(u_1,v), v\rangle _{V_2}\leqslant C\Vert v\Vert ^2_{H_2}-\eta \Vert v\Vert ^{\kappa }_{V_2}+C(1+\Vert u_1\Vert ^2_{H_1}). \end{aligned}$$
B4:

(Growth)

$$\begin{aligned} \Vert B(u_1,v)\Vert _{V^{*}_2}\leqslant C\left( 1+\Vert v\Vert ^{\kappa -1}_{V_2}+\Vert u_1\Vert ^{\frac{2(\kappa -1)}{\kappa }}_{H_1}\right) \end{aligned}$$

and

$$\begin{aligned} \Vert G_2(u_1,v)\Vert _{L_2(U_2,H_2)}\leqslant C(1+\Vert u_1\Vert _{H_1}+\Vert v\Vert ^{\zeta }_{H_2}). \end{aligned}$$
(2.3)

Remark 2.1

Here we give some comments on the assumptions above.

  • Condition (2.2) is also called the dissipativity condition, which guarantees that there exists a unique invariant measure for the frozen equation and the exponential ergodicity holds.

  • The exponent \(\zeta \in (0,1)\) in condition (2.3) is used to prove that the p-th moments of the solution \((X^{{\varepsilon }}_t, Y^{{\varepsilon }}_t)\) are finite when p is large enough; it could be removed if we assumed that the Lipschitz constant of \(G_2\) is sufficiently small.
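To see the mechanism behind the first remark in a toy case: under strong monotonicity with additive noise, two copies of the fast equation driven by the same Wiener process contract exponentially, since the noise cancels in the difference. A one-dimensional sketch (drift \(-\gamma y\), purely illustrative):

```python
import numpy as np

gamma, sigma, dt, n = 2.0, 0.5, 1e-3, 5000    # total time T = 5
rng = np.random.default_rng(1)
y1, y2 = 3.0, -1.0                            # two different initial values
for _ in range(n):
    dW = sigma * rng.normal(0.0, np.sqrt(dt))  # same noise for both copies
    y1 = y1 - gamma * y1 * dt + dW
    y2 = y2 - gamma * y2 * dt + dW
# The difference evolves deterministically: |y1 - y2| = 4 * (1 - gamma*dt)^n,
# which is approximately 4 * exp(-gamma * T).
assert abs(y1 - y2) < 4.0 * np.exp(-gamma * dt * n) * 1.01
```

This pathwise contraction is the finite-dimensional shadow of the uniqueness of the invariant measure and the exponential ergodicity of the frozen equation.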

Now, we recall the definition of a variational solution in [20].

Definition 2.2

For any given \({\varepsilon }>0\), a continuous \(H_1\times H_2\)-valued \(\mathscr {F}_{t}\)-adapted process \((X^{{\varepsilon }}_t, Y^{{\varepsilon }}_t)_{t\in [0, T]}\) is called a solution of system (1.1), if for its \(dt\otimes \mathbb {P}\)-equivalence class \((\check{X}^{{\varepsilon }}, \check{Y}^{{\varepsilon }})\) we have \(\check{X}^{{\varepsilon }}\in L^{\alpha }([0, T]\times \Omega , dt\otimes \mathbb {P}; V_1)\cap L^2([0, T]\times \Omega , dt\otimes \mathbb {P}; H_1)\) with \(\alpha \) as in A3, \(\check{Y}^{{\varepsilon }}\in L^{\kappa }([0, T]\times \Omega , dt\otimes \mathbb {P}; V_2)\cap L^2([0, T]\times \Omega , dt\otimes \mathbb {P}; H_2)\) with \(\kappa \) as in B3 and \(\mathbb {P}\)-a.s.

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle X^{\varepsilon }_t=X^{\varepsilon }_0+\int ^t_0 A(\tilde{X}^{\varepsilon }_s)ds+\int ^t_0 F(X^{\varepsilon }_s, Y^{\varepsilon }_s)ds+\int ^t_0 G_1(\tilde{X}^{\varepsilon }_s)dW^{1}_s,\\ Y^{\varepsilon }_t=Y^{\varepsilon }_0+\frac{1}{\varepsilon }\int ^t_0 B(X^{\varepsilon }_s, \tilde{Y}^{\varepsilon }_s)ds +\frac{1}{\sqrt{\varepsilon }}\int ^t_0 G_2(X^{\varepsilon }_s, \tilde{Y}^{\varepsilon }_s)dW^{2}_s, \end{array}\right. \end{aligned}$$
(2.4)

where \((\tilde{X}^{\varepsilon }, \tilde{Y}^{\varepsilon })\) is any \(V_1\times V_2\)-valued progressively measurable \(dt\otimes \mathbb {P}\)-version of \((\check{X}^{{\varepsilon }},\check{Y}^{{\varepsilon }})\).

Using the variational approach in infinite-dimensional spaces, we have the following well-posedness result, whose proof is presented in the Appendix.

Theorem 2.3

Assume that conditions A1–A4 and B1–B4 hold. Then for any \({\varepsilon }>0\) and initial values \((x, y)\in H_1\times H_2\), the system (1.1) has a unique solution \((X^{\varepsilon },Y^{\varepsilon })\).

The following is the main result of this work.

Theorem 2.4

Assume that conditions A1–A4 and B1–B4 hold. Then for any initial values \((x, y)\in H_1\times H_2\), \(p\geqslant 1\) and \(T>0\), we have

$$\begin{aligned} \lim _{{\varepsilon }\rightarrow 0}\mathbb {E} \left( \sup _{t\in [0,T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{2p}_{H_1} \right) =0, \end{aligned}$$
(2.5)

where \(\bar{X}_t\) is the solution of the corresponding averaged equation:

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle d\bar{X}_{t}=A(\bar{X}_{t})dt+\bar{F}(\bar{X}_{t})dt+G_1(\bar{X}_t)d W^{1}_{t},\\ \bar{X}_{0}=x,\end{array}\right. \end{aligned}$$
(2.6)

with the averaged coefficient \(\bar{F}(x)=\int _{H_2}F(x,y)\mu ^{x}(dy)\), where \(\mu ^{x}\) is the unique invariant measure of the transition semigroup of the frozen equation

$$\begin{aligned} \left\{ \begin{aligned}&dY_{t}=B(x,Y_{t})dt+G_2(x,Y_t)d\bar{W}_{t}^{2},\\&Y_{0}=y, \end{aligned} \right. \end{aligned}$$

where \(\{\bar{W}^{2}_t\}_{t\geqslant 0}\) is a cylindrical \(\tilde{\mathscr {F}}_t\)-Wiener process in the separable Hilbert space \(U_2\), defined on another probability space and with natural filtration \(\tilde{\mathscr {F}}_t\).
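For a concrete (hypothetical) example of the averaged coefficient: if the frozen equation is the scalar Ornstein–Uhlenbeck equation \(dY=-(Y-x)dt+\sigma d\bar W^2_t\), then \(\mu^x\) is Gaussian with mean \(x\), so \(F(x,y)=y\) averages to \(\bar F(x)=x\). The sketch below recovers this by an ergodic time average:

```python
import numpy as np

def bar_F(x, sigma=0.5, dt=1e-2, T=200.0, burn=20.0, seed=42):
    """Estimate bar_F(x) = ∫ F(x, y) mu^x(dy) for the toy frozen equation
    dY = -(Y - x) dt + sigma dW with F(x, y) = y: by ergodicity, a long-run
    time average approximates the invariant-measure average (here x)."""
    rng = np.random.default_rng(seed)
    y, acc, count = x, 0.0, 0
    n_burn = int(burn / dt)
    for k in range(int(T / dt)):
        y = y - (y - x) * dt + sigma * np.sqrt(dt) * rng.normal()
        if k >= n_burn:               # discard the transient before averaging
            acc, count = acc + y, count + 1
    return acc / count

assert abs(bar_F(1.5) - 1.5) < 0.2   # bar_F(x) = x for this toy model
```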

Remark 2.5

The advantage of the variational approach is that it covers some nonlinear SPDEs for the slow component, such as stochastic power law fluids, and some quasilinear SPDEs for the slow component, such as the stochastic porous medium equation and the stochastic p-Laplace equation, which cannot be handled by the mild solution approach and thus have not been studied yet. Furthermore, our result also generalizes some known results for the cases where the slow component is a semilinear stochastic partial differential equation, such as the stochastic Burgers equation (see [9]) and the stochastic two dimensional Navier–Stokes equation (see [19]). Besides these known results, our result can also be applied to many other as yet unstudied hydrodynamical models in [8], such as the stochastic magneto-hydrodynamic equations, the stochastic Boussinesq model for the Bénard convection, the stochastic 2D magnetic Bénard problem, the stochastic 3D Leray-\(\alpha \) model and some stochastic shell models of turbulence.

3 Proof of the Main Result

This section is devoted to proving Theorem 2.4. The proof consists of the following four subsections. In Sect. 3.1, we give some a priori estimates for the solution \((X^{\varepsilon }_t, Y^{\varepsilon }_t)\). Using these estimates, we obtain an estimate of the time increments of \(X_{t}^{\varepsilon }\), which plays an important role in the proof of the main result. In Sect. 3.2, we use the technique of time discretization to construct an auxiliary process \(\hat{Y}_{t}^{\varepsilon }\) and estimate the difference process \(Y^{\varepsilon }_t-\hat{Y}_{t}^{\varepsilon }\). In Sect. 3.3, by constructing a stopping time \(\tau _R\), we prove that \(X^{\varepsilon }_t\) converges strongly to \(\bar{X}_{t}\) for \(t<\tau _R\). Finally, the a priori estimates for the solution control the difference \(X^{{\varepsilon }}_t-\bar{X}_t\) after the stopping time. Throughout this section we assume that conditions A1–A4 and B1–B4 hold, and from now on we fix an initial value \((x,y)\in H_1\times H_2\).

3.1 Some A Priori Estimates of \((X^{\varepsilon }_t, Y^{\varepsilon }_t)\)

First, we prove bounds, uniform with respect to \({\varepsilon }\in (0,1)\), for the moments of the solution \((X_{t}^{\varepsilon }, Y_{t}^{{\varepsilon }})\) of the system (1.1).

Lemma 3.1

For any \(T>0\) and \(p\geqslant 1\), there exists a constant \(C_{p,T}>0\) such that

$$\begin{aligned}&\sup _{{\varepsilon }\in (0,1)}\mathbb {E}\left( \sup _{t\in [0,T]}\Vert X_{t}^{{\varepsilon }}\Vert ^{2p}_{H_1}\right) +\sup _{{\varepsilon }\in (0,1)}\mathbb {E}\left( \int ^T_0\Vert X_{t}^{{\varepsilon }}\Vert ^{2p-2}_{H_1}\Vert \tilde{X}_{t}^{{\varepsilon }}\Vert ^{\alpha }_{V_1}dt\right) \nonumber \\&\quad \leqslant C_{p,T}\left( 1+\Vert x\Vert ^{2p}_{H_1}+\Vert y\Vert ^{2p}_{H_2}\right) \end{aligned}$$
(3.1)

and

$$\begin{aligned} \sup _{{\varepsilon }\in (0,1)}\sup _{t\in [0, T]}\mathbb {E}\Vert Y_{t}^{\varepsilon }\Vert ^{2p}_{H_2}\leqslant C_{p,T}\left( 1+\Vert x\Vert ^{2p}_{H_1}+\Vert y\Vert ^{2p}_{H_2}\right) . \end{aligned}$$
(3.2)

Proof

Applying Itô’s formula (see e.g. [20, Theorem 6.1.1]), we have

$$\begin{aligned} \Vert Y_{t}^{{\varepsilon }}\Vert ^{2}_{H_2}= & {} \Vert y\Vert ^{2}_{H_2}+\frac{2}{\varepsilon }\int _{0}^{t} {_{V^{*}_2}}\langle B(X_{s}^{\varepsilon },\tilde{Y}_{s}^{\varepsilon }),\tilde{Y}_{s}^{\varepsilon }\rangle _{V_2} ds \\{} & {} + \frac{1}{\varepsilon }\int _{0} ^{t}\Vert G_2(X_{s}^{\varepsilon },\tilde{Y}_{s}^{\varepsilon })\Vert _{L_{2}(U_2,H_2)}^2ds +\frac{2}{\sqrt{\varepsilon }}\int _{0} ^{t}\langle G_2(X_{s}^{\varepsilon },\tilde{Y}_{s}^{\varepsilon })dW^{2}_s,Y_{s}^{\varepsilon }\rangle _{H_2}. \end{aligned}$$

Then applying Itô’s formula to \(f(z)=z^{p}\) with \(z_t=\Vert Y_{t}^{{\varepsilon }}\Vert ^{2}_{H_2}\), and taking expectations on both sides, we obtain

$$\begin{aligned} \mathbb {E}\Vert Y_{t}^{{\varepsilon }}\Vert ^{2p}_{H_2}= & {} \Vert y\Vert ^{2p}_{H_2}+\frac{2p}{\varepsilon }\mathbb {E}\left[ \int _{0} ^{t}\Vert Y_{s}^{\varepsilon }\Vert ^{2p-2}_{H_2}{_{V^{*}_2}}\langle B(X_{s}^{\varepsilon },\tilde{Y}_{s}^{\varepsilon }),\tilde{Y}_{s}^{\varepsilon }\rangle _{V_2} ds \right] \\{} & {} + \frac{p}{\varepsilon }\mathbb {E}\left[ \int _{0} ^{t}\Vert Y_{s}^{\varepsilon }\Vert ^{2p-2}_{H_2}\Vert G_2(X_{s}^{\varepsilon },\tilde{Y}_{s}^{\varepsilon })\Vert _{L_{2}(U_2,H_2)}^2ds\right] \\{} & {} +\frac{2p(p-1)}{\varepsilon }\mathbb {E}\left[ \int _{0} ^{t}\Vert Y_{s}^{\varepsilon }\Vert ^{2p-4}_{H_2}\Vert G_2(X_{s}^{\varepsilon }, \tilde{Y}_{s}^{\varepsilon })^*Y^\varepsilon _s\Vert _{U_2}^2ds\right] \end{aligned}$$

By conditions B2–B4 and an argument similar to that in the proof of [20, Lemma 4.3.8], there exists a constant \({\hat{\gamma }}\in (0,\gamma )\) such that for any \(u\in H_1, v\in V_2\),

$$\begin{aligned} 2{_{V^{*}_2}}\langle B(u,v),v\rangle _{V_2}\leqslant -{\hat{\gamma }}\Vert v\Vert ^2_{H_2}+C(1+\Vert u\Vert ^2_{H_1}). \end{aligned}$$
(3.3)

Then applying Young’s inequality and estimate (3.3), we get

$$\begin{aligned} \frac{d}{dt}\mathbb {E}\Vert Y_{t}^{{\varepsilon }}\Vert ^{2p}_{H_2}{} & {} =\frac{2p}{\varepsilon }\mathbb {E}\left[ \Vert Y_{t}^{\varepsilon }\Vert ^{2p-2}_{H_2}{_{V^{*}_2}}\langle B(X^{{\varepsilon }}_t, \tilde{Y}_{t}^{\varepsilon }),\tilde{Y}_{t}^{\varepsilon }\rangle _{V_2}\right] \\{} & {} \quad + \frac{p}{\varepsilon }\mathbb {E}\left[ \Vert Y_{t}^{\varepsilon }\Vert ^{2p-2}_{H_2} \Vert G_2(X_{t}^{\varepsilon },\tilde{Y}_{t}^{\varepsilon })\Vert _{L_{2}(U_2,H_2)}^2\right] \\{} & {} \quad +\frac{2p(p-1)}{\varepsilon }\mathbb {E}\left[ \Vert Y_{t}^{\varepsilon }\Vert ^{2p-4}_{H_2} \Vert G_2(X_{t}^{\varepsilon },\tilde{Y}_{t}^{\varepsilon })^*Y^\varepsilon _t\Vert _{U_2}^2\right] \\{} & {} \leqslant \frac{2p}{\varepsilon }\mathbb {E}\left[ \Vert Y_{t}^{\varepsilon }\Vert ^{2p-2}_{H_2}(-{\hat{\gamma }}\Vert Y_{t}^{\varepsilon }\Vert ^{2}_{H_2} +C\Vert X_{t}^{\varepsilon }\Vert _{H_1}^2+C)\right] \\{} & {} \quad + \frac{C_p}{\varepsilon }\mathbb {E}\left[ \Vert Y_{t}^{\varepsilon }\Vert ^{2p-2}_{H_2} (1+\Vert X_{t}^{\varepsilon }\Vert ^2_{H_1}+\Vert Y_{t}^{\varepsilon }\Vert ^{2\zeta }_{H_2})\right] \\{} & {} \leqslant -\frac{C_{p}}{\varepsilon }\mathbb {E}\Vert Y_{t}^{\varepsilon } \Vert ^{2p}_{H_2}+\frac{C_{p}}{\varepsilon }\mathbb {E}\Vert X_{t}^{\varepsilon }\Vert ^{2p}_{H_1} +\frac{C_{p}}{\varepsilon }. \end{aligned}$$

Multiplication by the integrating factor \(e^{\frac{C_p t}{\varepsilon }}\) yields that

$$\begin{aligned} \frac{d}{dt}\left( e^{\frac{C_p t}{\varepsilon }}\mathbb {E}\Vert Y_{t}^{{\varepsilon }}\Vert ^{2p}_{H_2}\right) \leqslant \frac{C_{p}}{\varepsilon }e^{\frac{C_p t}{\varepsilon }}\left( 1+\mathbb {E}\Vert X_{t}^{\varepsilon }\Vert ^{2p}_{H_1}\right) . \end{aligned}$$

Integrating this from 0 to t, we get

$$\begin{aligned} \mathbb {E}\Vert Y_{t}^{\varepsilon }\Vert ^{2p}_{H_2}\leqslant \Vert y\Vert ^{2p}_{H_2}e^{-\frac{C_{p}}{\varepsilon }t}+\frac{C_{p}}{\varepsilon }\int ^t_0 e^{-\frac{C_{p}}{\varepsilon }(t-s)}\left( 1+\mathbb {E}\Vert X_{s}^{\varepsilon }\Vert ^{2p}_{H_1}\right) ds. \end{aligned}$$
(3.4)
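The integrating-factor step leading to (3.4) is the scalar comparison: if \(u'(t)\leqslant -c\,u(t)+g(t)\), then \(u(t)\leqslant e^{-ct}u(0)+\int_0^t e^{-c(t-s)}g(s)ds\). A quick numeric sanity check of the equality case, with \(g\) an arbitrary stand-in for \(C_p\big(1+\mathbb{E}\Vert X^\varepsilon_s\Vert^{2p}_{H_1}\big)/\varepsilon\):

```python
import numpy as np

c, dt, n = 3.0, 1e-4, 20000                  # integrate up to T = 2
g = lambda t: 1.0 + np.sin(t)                # illustrative forcing term
u, t = 2.0, 0.0
for _ in range(n):                           # Euler for u' = -c*u + g(t)
    u, t = u + (-c * u + g(t)) * dt, t + dt

# Variation of constants: u(T) = e^{-cT} u(0) + ∫_0^T e^{-c(T-s)} g(s) ds.
s = np.linspace(0.0, t, 4001)
f = np.exp(-c * (t - s)) * g(s)
integral = float(np.sum((f[1:] + f[:-1]) / 2 * np.diff(s)))  # trapezoid rule
exact = np.exp(-c * t) * 2.0 + integral
assert abs(u - exact) < 1e-2
```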

On the other hand, applying Itô’s formula, we also have

$$\begin{aligned} \Vert X_{t}^{\varepsilon }\Vert ^{2p}_{H_1}= & {} \Vert x\Vert ^{2p}_{H_1}+2p\int _{0} ^{t}\Vert X_{s}^{\varepsilon }\Vert ^{2p-2}_{H_1}{_{V^{*}_1}}\langle A(\tilde{X}_{s}^{\varepsilon }),\tilde{X}_{s}^{\varepsilon }\rangle _{V_1} ds\\{} & {} +2p\int _{0} ^{t}\Vert X_{s}^{\varepsilon }\Vert ^{2p-2}_{H_1} \langle F(X_{s}^{\varepsilon },Y_{s}^{\varepsilon }),X_{s}^{\varepsilon }\rangle _{H_1} ds\\{} & {} +2p\int _{0} ^{t}\Vert X_{s}^{\varepsilon }\Vert ^{2p-2}_{H_1}\langle X_{s}^{\varepsilon }, G_1(\tilde{X}_{s}^{\varepsilon })dW^{1}_s\rangle _{H_1}\\{} & {} +p\int _{0} ^{t}\Vert X_{s}^{\varepsilon }\Vert ^{2p-2}_{H_1}\Vert G_1( \tilde{X}_{s}^{\varepsilon })\Vert _{L_{2}(U_1,H_1)}^2ds\\{} & {} +2p(p-1)\int _{0}^{t}\Vert X_{s}^{\varepsilon }\Vert ^{2p-4}_{H_1}\Vert G_1(\tilde{X}_{s}^{\varepsilon })^{*}X^\varepsilon _s\Vert _{U_1}^2ds. \end{aligned}$$

Note that

$$\begin{aligned} \left\{ \int _{0} ^{t}\Vert X_{s}^{\varepsilon }\Vert ^{2p-2}_{H_1}\langle X_{s}^{\varepsilon }, G_1(\tilde{X}_{s}^{\varepsilon })dW^{1}_s\rangle _{H_1}\right\} _{0\leqslant t\leqslant T} \end{aligned}$$

is a local martingale; then applying the Burkholder–Davis–Gundy inequality (see e.g. [20, Theorem 6.1.2]), conditions A2–A4 and (3.4), we get

$$\begin{aligned}{} & {} \mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}\Vert ^{2p}_{H_1}\right) +2p\theta \mathbb {E}\left( \int ^T_0\Vert X_{t}^{{\varepsilon }}\Vert ^{2p-2}_{H_1}\Vert \tilde{X}_{t}^{{\varepsilon }}\Vert ^{\alpha }_{V_1}dt\right) \\{} & {} \quad \leqslant C_p(\Vert x\Vert ^{2p}_{H_1}+1)+C_p\int ^T_0\mathbb {E}\Vert X_{t}^{\varepsilon }\Vert ^{2p}_{H_1}dt+C_p\int ^T_0\mathbb {E}\Vert Y_{t}^{\varepsilon }\Vert ^{2p}_{H_2}dt\\{} & {} \quad \leqslant C_p(\Vert x\Vert ^{2p}_{H_1}+\Vert y\Vert ^{2p}_{H_2}+1)\\{} & {} \qquad +C_p\int ^T_0\mathbb {E}\Vert X_{t}^{\varepsilon }\Vert ^{2p}_{H_1}dt+\frac{C_p}{{\varepsilon }}\int ^T_0\int ^t_0 e^{-\frac{C_{p}}{\varepsilon }(t-s)}\left( 1+\mathbb {E}\Vert X_{s}^{\varepsilon }\Vert ^{2p}_{H_1}\right) dsdt\\{} & {} \quad \leqslant C_p(\Vert x\Vert ^{2p}_{H_1}+\Vert y\Vert ^{2p}_{H_2}+1)+C_p\int ^T_0\mathbb {E}\Vert X_{t}^{\varepsilon }\Vert ^{2p}_{H_1}dt. \end{aligned}$$

Hence, applying Gronwall’s inequality (see e.g. [23, Exercise 5.17]), we obtain

$$\begin{aligned}{} & {} \mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}\Vert ^{2p}_{H_1}\right) +2p\theta \mathbb {E}\left( \int ^T_0\Vert X_{t}^{{\varepsilon }}\Vert ^{2p-2}_{H_1}\Vert \tilde{X}_{t}^{{\varepsilon }}\Vert ^{\alpha }_{V_1}dt\right) \\{} & {} \quad \leqslant C_{p,T}(\Vert x\Vert ^{2p}_{H_1}+\Vert y\Vert ^{2p}_{H_2}+1), \end{aligned}$$

which, together with (3.4), also gives

$$\begin{aligned} \sup _{t\in [0, T]}\mathbb {E}\Vert Y_{t}^{\varepsilon }\Vert ^{2p}_{H_2}\leqslant C_{p,T}\left( 1+\Vert x\Vert ^{2p}_{H_1}+\Vert y\Vert ^{2p}_{H_2}\right) . \end{aligned}$$

The proof is complete. \(\square \)
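The final Gronwall step can be sanity-checked numerically: the extremal case of \(\varphi(t)\leqslant a+b\int_0^t\varphi(s)ds\) is \(\varphi'=b\varphi\), whose discrete analogue nearly saturates the bound \(a e^{bt}\):

```python
import numpy as np

a, b, dt, n = 1.0, 2.0, 1e-4, 10000          # T = 1
phi, integral = a, 0.0
for _ in range(n):                            # phi(t) = a + b * ∫_0^t phi ds
    integral += phi * dt
    phi = a + b * integral
# Gronwall: phi(T) <= a * e^{bT}; the extremal case nearly attains it.
bound = a * np.exp(b * n * dt)
assert a < phi <= bound * 1.001
```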

Because the method of time discretization is used in this paper, the following estimate of the integral of the time increments plays an important role in the proof of our main result; it has been proved in the case of the stochastic 2D Navier–Stokes equation in [19].

Lemma 3.2

For any \(T>0\), \({\varepsilon }\in (0,1)\) and \(\delta >0\) small enough, there exist constants \(C_{T}, m>0\) such that

$$\begin{aligned} \mathbb {E}\left[ \int ^{T}_0\Vert X_{t}^{\varepsilon }-X_{t(\delta )}^{\varepsilon }\Vert ^2_{H_1} dt\right] \leqslant C_{T}(1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^m_{H_2})\delta ^{1/2}, \end{aligned}$$
(3.5)

where \(t(\delta ):=[\frac{t}{\delta }]\delta \) and [s] denotes the largest integer not exceeding s.
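In other words, \(t(\delta)\) is t rounded down to the grid \(\{0,\delta,2\delta,\dots\}\), so in particular \(0\leqslant t-t(\delta)<\delta\):

```python
import math

def t_delta(t, delta):
    """t(delta) = [t/delta] * delta: the last grid point not exceeding t."""
    return math.floor(t / delta) * delta

td = t_delta(0.37, 0.1)
assert 0.0 <= 0.37 - td < 0.1      # t(delta) lags t by less than delta
assert abs(td - 0.3) < 1e-9
```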

Proof

Note that

$$\begin{aligned}{} & {} \mathbb {E}\left[ \int ^{T}_0\Vert X_{t}^{\varepsilon }-X_{t(\delta )}^{\varepsilon }\Vert ^2_{H_1}dt\right] \nonumber \\{} & {} \quad = \mathbb {E}\left( \int ^{\delta }_0\Vert X_{t}^{\varepsilon }-x\Vert ^2_{H_1}dt\right) +\mathbb {E}\left[ \int ^{T}_{\delta }\Vert X_{t}^{\varepsilon }-X_{t(\delta )}^{\varepsilon }\Vert ^2_{H_1}dt\right] \nonumber \\{} & {} \quad \leqslant C(1+\Vert x\Vert ^2_{H_1}+\Vert y\Vert ^2_{H_2})\delta +2\mathbb {E}\left( \int ^{T}_{\delta }\Vert X_{t}^{\varepsilon }-X_{t-\delta }^{\varepsilon }\Vert ^2_{H_1}dt\right) \nonumber \\{} & {} \qquad +2\mathbb {E}\left( \int ^{T}_{\delta }\Vert X_{t(\delta )}^{\varepsilon }-X_{t-\delta }^{\varepsilon }\Vert ^2_{H_1}dt\right) . \end{aligned}$$
(3.6)

Then applying Itô’s formula we have

$$\begin{aligned} \Vert X_{t}^{\varepsilon }-X_{t-\delta }^{\varepsilon }\Vert ^{2}_{H_1}{} & {} =2\int _{t-\delta } ^{t}{_{V^{*}_1}}\langle A(\tilde{X}_{s}^{\varepsilon }), \tilde{X}_{s}^{\varepsilon }-\tilde{X}_{t-\delta }^{\varepsilon }\rangle _{V_1} ds\nonumber \\{} & {} \quad + 2\int _{t-\delta } ^{t}\langle F(X_{s}^{\varepsilon }, Y_{s}^{\varepsilon }), X_{s}^{\varepsilon }-X_{t-\delta }^{\varepsilon }\rangle _{H_1} ds\nonumber \\{} & {} \quad +\int _{t-\delta } ^{t}\Vert G_1(\tilde{X}_{s}^{\varepsilon })\Vert _{L_{2}(U_1,H_1)}^2ds\nonumber \\{} & {} \quad +2\int _{t-\delta } ^{t}\langle X_{s}^{\varepsilon }-X_{t-\delta }^{\varepsilon }, G_1(\tilde{X}_{s}^{\varepsilon })dW^{1}_s\rangle _{H_1} \nonumber \\{} & {} := I_{1}(t)+I_{2}(t)+I_{3}(t)+I_{4}(t). \end{aligned}$$
(3.7)

For the term \(I_1(t)\), by condition A4 and applying Hölder’s inequality, there exist constants \(m, C_T>0\) such that

$$\begin{aligned} \mathbb {E}\left( \int ^{T}_{\delta }|I_{1}(t)|dt\right)\leqslant & {} C\mathbb {E}\left( \int ^{T}_{\delta }\int _{t-\delta } ^{t}\Vert A(\tilde{X}_{s}^{\varepsilon })\Vert _{V^{*}_1} \Vert \tilde{X}_{s}^{\varepsilon }-\tilde{X}_{t-\delta }^{\varepsilon }\Vert _{V_1} ds dt\right) \nonumber \\\leqslant & {} C\left[ \mathbb {E}\int ^{T}_{\delta }\int _{t-\delta } ^{t}\Vert A(\tilde{X}_{s}^{\varepsilon })\Vert ^{\frac{\alpha }{\alpha -1}}_{V^{*}_1}dsdt\right] ^{\frac{\alpha -1}{\alpha }}\nonumber \\{} & {} \quad \left[ \mathbb {E}\int ^{T}_{\delta }\int _{t-\delta } ^{t}\Vert \tilde{X}_{s}^{\varepsilon }-\tilde{X}_{t-\delta }^{\varepsilon }\Vert ^{\alpha }_{V_1} dsdt\right] ^{\frac{1}{\alpha }}\nonumber \\\leqslant & {} C\left[ \delta \mathbb {E}\int ^{T}_0(1+\Vert \tilde{X}_{s}^{\varepsilon }\Vert ^{\alpha }_{V_1})(1+\Vert X_{s}^{\varepsilon }\Vert ^{\beta }_{H_1})ds\right] ^{\frac{\alpha -1}{\alpha }}\nonumber \\{} & {} \quad \cdot \left[ \delta \mathbb {E}\int ^{T}_0\Vert \tilde{X}_{s}^{\varepsilon }\Vert ^{\alpha }_{V_1}ds\right] ^{\frac{1}{\alpha }}\nonumber \\\leqslant & {} C_{T}(1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^m_{H_2})\delta , \end{aligned}$$
(3.8)

where we applied Fubini’s theorem and (3.1) in the third and fourth inequalities, respectively.

For the terms \(I_{2}(t)\) and \(I_3(t)\), by condition (2.1) and estimates (3.1) and (3.2), we get

$$\begin{aligned}{} & {} \mathbb {E}\left( \int ^{T}_{\delta }|I_{2}(t)|dt\right) \nonumber \\{} & {} \quad \leqslant C\mathbb {E}\left[ \int ^{T}_{\delta }\int _{t-\delta } ^{t}(1+\Vert X_{s}^{\varepsilon }\Vert _{H_1}+\Vert Y_{s}^{\varepsilon }\Vert _{H_2})(\Vert X_{s}^{\varepsilon }\Vert _{H_1}+\Vert X_{t-\delta }^{\varepsilon }\Vert _{H_1})ds dt\right] \nonumber \\{} & {} \quad \leqslant C\delta \mathbb {E}\left[ \sup _{s\in [0,T]}(1+\Vert X_{s}^{\varepsilon }\Vert ^2_{H_1})\right] \nonumber \\{} & {} \qquad +C\mathbb {E}\left[ \sup _{s\in [0,T]}\Vert X_{s}^{\varepsilon }\Vert _{H_1}\int ^T_{\delta }\int ^t_{t-\delta }\Vert Y^{{\varepsilon }}_s\Vert _{H_2}dsdt\right] \nonumber \\{} & {} \quad \leqslant C\delta \mathbb {E}\left[ \sup _{s\in [0,T]}(1+\Vert X_{s}^{\varepsilon }\Vert ^2_{H_1})\right] \nonumber \\{} & {} \qquad +C_T\delta ^{1/2}\mathbb {E}\left[ \sup _{s\in [0,T]}\Vert X_{s}^{\varepsilon }\Vert ^2_{H_1}\right] ^{1/2}\mathbb {E}\left( \int ^T_{\delta }\int ^t_{t-\delta }\Vert Y_{s}^{\varepsilon }\Vert ^2_{H_2}dsdt\right) ^{1/2}\nonumber \\{} & {} \quad \leqslant C_{T}\delta (1+\Vert x\Vert ^2_{H_1}+\Vert y\Vert ^2_{H_2}) \end{aligned}$$
(3.9)

and

$$\begin{aligned} \mathbb {E}\left( \int ^{T}_{\delta }|I_{3}(t)|dt\right)\leqslant & {} C\mathbb {E}\left( \int ^{T}_{\delta }\int _{t-\delta } ^{t}(1+\Vert X_{s}^{\varepsilon }\Vert ^2_{H_1})ds dt\right) \nonumber \\\leqslant & {} C_T\delta \mathbb {E}\left[ \sup _{s\in [0,T]}(1+\Vert X_{s}^{\varepsilon }\Vert ^2_{H_1})\right] \nonumber \\\leqslant & {} C_{T}\delta \left( 1+\Vert x\Vert ^2_{H_1}+\Vert y\Vert ^2_{H_2}\right) . \end{aligned}$$
(3.10)

For the term \(I_{4}(t)\), note that

$$\begin{aligned} \left\{ \int _{t-\delta } ^{u}\langle X_{s}^{\varepsilon }-X_{t-\delta }^{\varepsilon }, G_1(\tilde{X}_{s}^{\varepsilon })dW^{1}_s\rangle _{H_1}\right\} _{t-\delta \leqslant u\leqslant T} \end{aligned}$$

is a local martingale; then applying the Burkholder–Davis–Gundy inequality, it follows that

$$\begin{aligned} \mathbb {E}\left( \int ^{T}_{\delta }|I_{4}(t)|dt\right)\leqslant & {} C\mathbb {E}\int ^{T}_{\delta }\left[ \int _{t-\delta } ^{t}\Vert G_1(\tilde{X}_{s}^{\varepsilon })\Vert ^2_{L_{2}(U_1,H_1)}\Vert X_{s}^{\varepsilon }-X_{t-\delta }^{\varepsilon }\Vert ^2_{H_1} ds\right] ^{1/2}dt\nonumber \\\leqslant & {} C_T\left[ \mathbb {E}\int ^{T}_{\delta }\int _{t-\delta } ^{t}(1+\Vert X_{s}^{\varepsilon }\Vert ^2_{H_1})\Vert X_{s}^{\varepsilon }-X_{t-\delta }^{\varepsilon }\Vert ^2_{H_1}dsdt\right] ^{1/2}\nonumber \\\leqslant & {} C_{T}\delta ^{1/2}\left[ \mathbb {E}\sup _{s\in [0,T]}\left( 1+\Vert X_{s}^{\varepsilon }\Vert ^4_{H_1}\right) \right] ^{1/2}\nonumber \\\leqslant & {} C_{T}\delta ^{1/2}(1+\Vert x\Vert ^2_{H_1}+\Vert y\Vert ^2_{H_2}). \end{aligned}$$
(3.11)

Combining estimates (3.8)–(3.11), we obtain

$$\begin{aligned} \mathbb {E}\left( \int ^{T}_{\delta }\Vert X_{t}^{\varepsilon }-X_{t-\delta }^{\varepsilon }\Vert ^2_{H_1}dt\right) \leqslant C_{T}(1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^m_{H_2})\delta ^{1/2}. \end{aligned}$$
(3.12)

By the same argument as above, we also have

$$\begin{aligned} \mathbb {E}\left( \int ^{T}_{\delta }\Vert X_{t(\delta )}^{\varepsilon }-X_{t-\delta }^{\varepsilon }\Vert ^2_{H_1}dt\right) \leqslant C_{T}(1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^m_{H_2})\delta ^{1/2}. \end{aligned}$$
(3.13)

Hence, the result (3.5) holds by estimates (3.6), (3.12) and (3.13). The proof is complete. \(\square \)

3.2 Construction of the Auxiliary Process

Based on the method of time discretization, which is inspired by [18], we first construct an auxiliary process \(\hat{Y}_{t}^{\varepsilon }\in H_2\) satisfying the following equation:

$$\begin{aligned} d\hat{Y}_{t}^{{\varepsilon }}=\frac{1}{{\varepsilon }}B\left( X^{{\varepsilon }}_{t(\delta )},\hat{Y}_{t}^{{\varepsilon }}\right) dt+\frac{1}{\sqrt{{\varepsilon }}}G_2\left( X^{{\varepsilon }}_{t(\delta )},\hat{Y}_{t}^{{\varepsilon }}\right) dW^{2}_t,\quad \hat{Y}_{0}^{{\varepsilon }}=y\in H_2, \end{aligned}$$

where \(\delta \) is a fixed positive number depending on \({\varepsilon }\) and will be chosen later. Then for its \(dt\otimes \mathbb {P}\)-equivalence class \(\check{\hat{Y}}^{{\varepsilon }}\) we have \(\check{\hat{Y}}^{{\varepsilon }}\in L^{\kappa }([0, T]\times \Omega , dt\otimes \mathbb {P}; V_2)\cap L^2([0, T]\times \Omega , dt\otimes \mathbb {P}; H_2)\) with \(\kappa \) as in B3, and for any \(k\in \mathbb {N}\) and \(t\in [k\delta ,\min ((k+1)\delta ,T)]\), \(\mathbb {P}\)-a.s.

$$\begin{aligned} \hat{Y}_{t}^{\varepsilon }=\hat{Y}_{k\delta }^{\varepsilon }+\frac{1}{\varepsilon }\int _{k\delta }^{t} B(X_{k\delta }^{\varepsilon },\tilde{\hat{Y}}_{s}^{\varepsilon })ds+\frac{1}{\sqrt{\varepsilon }}\int _{k\delta }^{t}G_2(X_{k\delta }^{\varepsilon },\tilde{\hat{Y}}_{s}^{\varepsilon })dW^{2}_s, \end{aligned}$$
(3.14)

where \(\tilde{\hat{Y}}^{\varepsilon }\) is any \(V_2\)-valued progressively measurable \(dt\otimes \mathbb {P}\)-version of \(\check{\hat{Y}}^{{\varepsilon }}\).

By the construction of \(\hat{Y}_{t}^{\varepsilon }\), we obtain the following estimates, which will be used later.

Lemma 3.3

For any \(T>0\) and \({\varepsilon }\in (0,1)\), there exist a constant \(C_{T}>0\) and \(m\in \mathbb {N}\) such that

$$\begin{aligned} \sup _{t\in [0,T]}\mathbb {E}\Vert \hat{Y}_{t}^{{\varepsilon }}\Vert ^2_{H_2}\leqslant C_{T}(1+\Vert x\Vert ^2_{H_1}+\Vert y\Vert ^2_{H_2}) \end{aligned}$$
(3.15)

and

$$\begin{aligned} \mathbb {E}\left( \int _0^{T}\Vert Y_{t}^{\varepsilon }-\hat{Y}_{t}^{\varepsilon }\Vert ^2_{H_2}dt\right) \leqslant C_{T}\left( 1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^m_{H_2}\right) \delta ^{1/2}. \end{aligned}$$
(3.16)

Proof

The proof of estimate (3.15) follows almost the same steps as that of Lemma 3.1, so we omit it and only prove (3.16) here.

Note that

$$\begin{aligned} Y_{t}^{\varepsilon }-\hat{Y}_{t}^{\varepsilon }={} & {} \frac{1}{\varepsilon }\int ^t_0 \left[ B(X^{\varepsilon }_s, \tilde{Y}^{\varepsilon }_s)-B(X^{\varepsilon }_{s(\delta )}, \tilde{\hat{Y}}^{\varepsilon }_s)\right] ds\\{} & {} +\frac{1}{\sqrt{\varepsilon }}\int ^t_0 \left[ G_2(X^{\varepsilon }_s, \tilde{Y}^{\varepsilon }_s)-G_2(X^{\varepsilon }_{s(\delta )}, \tilde{\hat{Y}}^{\varepsilon }_s)\right] dW^{2}_s. \end{aligned}$$

Applying Itô’s formula, we obtain

$$\begin{aligned} \mathbb {E}\Vert Y_{t}^{\varepsilon }-\hat{Y}_{t}^{\varepsilon }\Vert ^{2}_{H_2}={} & {} \frac{2}{\varepsilon }\mathbb {E}\int _{0} ^{t}{_{V^{*}_2}}\langle B(X^{\varepsilon }_s, \tilde{Y}^{\varepsilon }_s)-B(X^{\varepsilon }_{s(\delta )}, \tilde{\hat{Y}}^{\varepsilon }_s),\tilde{Y}_{s}^{\varepsilon }-\tilde{\hat{Y}}_{s}^{\varepsilon }\rangle _{V_2} ds \\{} & {} + \frac{1}{\varepsilon }\mathbb {E}\int _{0} ^{t}\Vert G_2(X^{\varepsilon }_s, \tilde{Y}^{\varepsilon }_s)-G_2(X^{\varepsilon }_{s(\delta )}, \tilde{\hat{Y}}^{\varepsilon }_s)\Vert _{L_{2}(U_2,H_2)}^2ds. \end{aligned}$$

Then by condition B2, there exists \(\gamma >0\) such that

$$\begin{aligned} \frac{d}{dt}\mathbb {E}\Vert Y_{t}^{\varepsilon }-\hat{Y}_{t}^{\varepsilon }\Vert ^{2}_{H_2} \leqslant -\frac{\gamma }{\varepsilon }\mathbb {E}\Vert Y_{t}^{\varepsilon }-\hat{Y}_{t}^{\varepsilon }\Vert ^{2}_{H_2}+\frac{C}{\varepsilon }\mathbb {E}\Vert X_t^\varepsilon -X_{t(\delta )}^\varepsilon \Vert ^2_{H_1}. \end{aligned}$$

By the same argument as used in (3.4), we have

$$\begin{aligned} \mathbb {E}\Vert Y_{t}^{\varepsilon }-\hat{Y}_{t}^{\varepsilon }\Vert ^{2}_{H_2}\leqslant \frac{C}{\varepsilon }\int _0^te^{-\frac{\gamma (t-s)}{{\varepsilon }}}\mathbb {E}\Vert X_s^\varepsilon -X_{s(\delta )}^\varepsilon \Vert ^2_{H_1}ds. \end{aligned}$$
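For the reader's convenience, we sketch this step: multiplying the preceding differential inequality by the integrating factor \(e^{\gamma t/{\varepsilon }}\) gives

$$\begin{aligned} \frac{d}{dt}\left( e^{\frac{\gamma t}{{\varepsilon }}}\mathbb {E}\Vert Y_{t}^{\varepsilon }-\hat{Y}_{t}^{\varepsilon }\Vert ^{2}_{H_2}\right) \leqslant \frac{C}{\varepsilon }e^{\frac{\gamma t}{{\varepsilon }}}\mathbb {E}\Vert X_t^\varepsilon -X_{t(\delta )}^\varepsilon \Vert ^2_{H_1}, \end{aligned}$$

and integrating over \([0,t]\), using \(Y_{0}^{\varepsilon }=\hat{Y}_{0}^{\varepsilon }=y\) and multiplying through by \(e^{-\gamma t/{\varepsilon }}\) yields the stated bound.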

Then applying Fubini’s theorem, for any \(T>0\),

$$\begin{aligned} \mathbb {E}\left( \int _0^T\Vert Y_{t}^{\varepsilon }-\hat{Y}_{t}^{\varepsilon }\Vert ^{2}_{H_2}dt\right){} & {} \leqslant \frac{C}{\varepsilon }\int _0^T\int ^t_0e^{-\frac{\gamma (t-s)}{{\varepsilon }}}\mathbb {E}\Vert X_s^\varepsilon -X_{s(\delta )}^\varepsilon \Vert ^2_{H_1}dsdt\\{} & {} = \frac{C}{\varepsilon }\mathbb {E}\left[ \int _0^T\Vert X_s^\varepsilon -X_{s(\delta )}^\varepsilon \Vert ^2_{H_1}\left( \int ^T_s e^{-\frac{\gamma (t-s)}{{\varepsilon }}}dt\right) ds\right] \\{} & {} \leqslant C\mathbb {E}\left( \int _0^T\Vert X_s^\varepsilon -X_{s(\delta )}^\varepsilon \Vert ^2_{H_1} ds\right) . \end{aligned}$$
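Note that the exponential kernel absorbs the singular factor \(\frac{1}{{\varepsilon }}\) in the last step above: for any \(0\leqslant s\leqslant T\),

$$\begin{aligned} \frac{1}{{\varepsilon }}\int ^T_s e^{-\frac{\gamma (t-s)}{{\varepsilon }}}dt\leqslant \frac{1}{{\varepsilon }}\int ^{\infty }_s e^{-\frac{\gamma (t-s)}{{\varepsilon }}}dt=\frac{1}{\gamma }, \end{aligned}$$

which is why the final constant C is independent of \({\varepsilon }\).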

Therefore, by Lemma 3.2, we obtain

$$\begin{aligned} \mathbb {E}\left( \int _0^T\Vert Y_{t}^{\varepsilon }-\hat{Y}_{t}^{\varepsilon }\Vert ^{2}_{H_2}dt\right) \leqslant C_{T}(1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^m_{H_2})\delta ^{1/2}. \end{aligned}$$

The proof is complete. \(\square \)

3.3 The Ergodicity of the Frozen Equation

The frozen equation associated to the fast motion for fixed slow component \(x\in H_1\) is as follows,

$$\begin{aligned} \left\{ \begin{aligned}&dY_{t}=B(x,Y_{t})dt+G_2(x,Y_t)d\bar{W}_{t}^{2},\\&Y_{0}=y\in H_2, \end{aligned} \right. \end{aligned}$$
(3.17)

where \(\{\bar{W}^{2}_t\}_{t\geqslant 0}\) is a cylindrical \(\tilde{\mathscr {F}}_{t}\)-Wiener process in a separable Hilbert space \(U_2\) on another probability space \(({\tilde{\Omega }},\tilde{\mathscr {F}},\tilde{\mathbb {P}})\) with natural filtration \(\tilde{\mathscr {F}}_{t}\).

Under the assumptions B1–B4, for any fixed \(x\in H_1\) and initial datum \(y\in H_2\), equation (3.17) has a unique variational solution \(Y_{t}^{x,y}\) in the sense of Definition 2.2, i.e., for its \(dt\otimes {\tilde{\mathbb {P}}}\)-equivalence class \(\hat{Y}^{x,y}\) we have \(\hat{Y}^{x,y}\in L^{\kappa }([0, T]\times {\tilde{\Omega }}, dt\otimes {\tilde{\mathbb {P}}}; V_2)\cap L^2([0, T]\times {\tilde{\Omega }}, dt\otimes {\tilde{\mathbb {P}}}; H_2)\) with \(\kappa \) as in B3, and \({\tilde{\mathbb {P}}}\)-a.s.

$$\begin{aligned} Y^{x,y}_{t}=y+\int _{0}^{t} B(x,\tilde{Y}^{x,y}_{s})ds+\int _{0}^{t}G_2(x,\tilde{Y}^{x,y}_{s})d\bar{W}^{2}_s, \end{aligned}$$
(3.18)

where \(\tilde{Y}^{x,y}\) is any \(V_2\)-valued progressively measurable \(dt\otimes {\tilde{\mathbb {P}}}\)-version of \(\hat{Y}^{x,y}\). By the same arguments as in the proof of Lemma 3.1, it is easy to prove that

$$\begin{aligned} \sup _{t\geqslant 0}{\tilde{\mathbb {E}}}\Vert Y_{t}^{x,y}\Vert ^2_{H_2}\leqslant C(1+\Vert x\Vert ^2_{H_1}+\Vert y\Vert ^2_{H_2}). \end{aligned}$$

Let \(\{P^{x}_t\}_{t\geqslant 0}\) be the transition semigroup of the Markov process \(\{Y_{t}^{x,y}\}_{t\geqslant 0}\), that is, for any bounded measurable function \(\varphi \) on \(H_2\),

$$\begin{aligned} P^x_t \varphi (y)= \tilde{\mathbb {E}} \left[ \varphi \left( Y_{t}^{x,y}\right) \right] , \quad y \in H_2,\ \ t>0, \end{aligned}$$

where \({{\tilde{\mathbb {E}}}}\) is the expectation on \((\tilde{\Omega },\tilde{\mathscr {F}},\tilde{\mathbb {P}})\). Then we have the following asymptotic behavior of \(P^x_t\), whose proof can be found in [20, Theorem 4.3.9].

Proposition 3.4

The transition semigroup \(\{P^{x}_t\}_{t\geqslant 0}\) has a unique invariant measure \(\mu ^x\). Moreover, there exists a constant \(C>0\) such that for any Lipschitz function \(\varphi :H_2\rightarrow \mathbb {R}\),

$$\begin{aligned} \Big |P^x_t\varphi (y)-\int _{H_2}\varphi (z)\mu ^x(dz)\Big |\leqslant C(1+\Vert x\Vert _{H_1}+\Vert y\Vert _{H_2})e^{-\frac{\gamma t}{2}}\Vert \varphi \Vert _{Lip}, \end{aligned}$$
(3.19)

where \(\Vert \varphi \Vert _{Lip}=\sup _{y_1\ne y_2\in H_2}\frac{|\varphi (y_1)-\varphi (y_2)|}{\Vert y_1-y_2\Vert _{H_2}}\).

3.4 The Averaged Equation

We consider the corresponding averaged equation, i.e.

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle d\bar{X}_{t}=A(\bar{X}_{t})dt+\bar{F}(\bar{X}_{t})dt+G_1(\bar{X}_{t})dW^{1}_t,\\ \bar{X}_{0}=x\in H_1,\end{array}\right. \end{aligned}$$
(3.20)

with the averaged coefficient

$$\begin{aligned} \bar{F}(x):=\int _{H_2}F(x,y)\mu ^{x}(dy),\quad x\in {H_1}, \end{aligned}$$

where \(\mu ^{x}\) is the unique invariant measure for the transition semigroup \(\{P^{x}_t\}_{t\geqslant 0}\).

Since F is Lipschitz continuous, it is easy to check that \(\bar{F}\) is also Lipschitz continuous, i.e.

$$\begin{aligned} \Vert \bar{F}(u)-\bar{F}(v)\Vert _{H_1}\leqslant C\Vert u-v\Vert _{H_1},\quad u,v\in H_1. \end{aligned}$$
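Indeed, one way to verify this (we only sketch it; estimate (3.19) is applied here to the \(H_1\)-valued map \(F(u,\cdot )\), which requires the standard vector-valued extension) is to compare \(\bar{F}(u)\) and \(\bar{F}(v)\) through the frozen equation at a large time: for any \(t>0\),

$$\begin{aligned} \Vert \bar{F}(u)-\bar{F}(v)\Vert _{H_1}\leqslant & {} \Vert \bar{F}(u)-\tilde{\mathbb {E}}F(u,Y^{u,0}_t)\Vert _{H_1}+\Vert \tilde{\mathbb {E}}F(u,Y^{u,0}_t)-\tilde{\mathbb {E}}F(v,Y^{v,0}_t)\Vert _{H_1}\\{} & {} +\Vert \tilde{\mathbb {E}}F(v,Y^{v,0}_t)-\bar{F}(v)\Vert _{H_1}\\\leqslant & {} C(1+\Vert u\Vert _{H_1}+\Vert v\Vert _{H_1})e^{-\frac{\gamma t}{2}}+C\Vert u-v\Vert _{H_1}+C\tilde{\mathbb {E}}\Vert Y^{u,0}_t-Y^{v,0}_t\Vert _{H_2}, \end{aligned}$$

where the first and third terms come from (3.19) and the remaining ones from the Lipschitz continuity of F. By condition B2, \(\sup _{t\geqslant 0}\tilde{\mathbb {E}}\Vert Y^{u,0}_t-Y^{v,0}_t\Vert ^2_{H_2}\leqslant C\Vert u-v\Vert ^2_{H_1}\) (by the same computation as in the proof of Lemma 3.3), so letting \(t\rightarrow \infty \) gives the Lipschitz bound.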

Then equation (3.20) has a unique variational solution \({\bar{X}}\) in the sense of Definition 2.2, i.e., for its \(dt\otimes \mathbb {P}\)-equivalence class \(\check{\bar{X}}\) we have \(\check{\bar{X}}\in L^{\alpha }([0, T]\times \Omega , dt\otimes \mathbb {P}; V_1)\cap L^2([0, T]\times \Omega , dt\otimes \mathbb {P}; H_1)\) with \(\alpha \) as in A3, and \(\mathbb {P}\)-a.s.

$$\begin{aligned} \bar{X}_{t}=x+\int _{0}^{t}A(\tilde{\bar{X}}_{s})ds+\int _{0}^{t} \bar{F}(\bar{X}_{s})ds+\int _{0}^{t}G_1(\tilde{\bar{X}}_{s})dW^{1}_s, \end{aligned}$$
(3.21)

where \(\tilde{\bar{X}}\) is any \(V_1\)-valued progressively measurable \(dt\otimes \mathbb {P}\)-version of \(\check{\bar{X}}\). Moreover, we also have the following estimates. Because their proofs follow almost the same steps as those of Lemmas 3.1 and 3.2, we omit them here.

Lemma 3.5

For any \(T>0\), \(p\geqslant 1\), there exist constants \(C_{p,T},C_T>0\) and \(m>0\) such that for any \(x\in H_1\),

$$\begin{aligned} \mathbb {E}\left( \sup _{t\in [0,T]}\Vert \bar{X}_{t}\Vert ^{2p}_{H_1}\right) +\mathbb {E}\left( \int _0^T\Vert \bar{X}_{t}\Vert ^{2p-2}_{H_1}\Vert \tilde{\bar{X}}_{t}\Vert _{V_1}^{\alpha }dt\right) \leqslant C_{p,T}(1+\Vert x\Vert ^{2p}_{H_1}) \end{aligned}$$

and

$$\begin{aligned} \mathbb {E}\left[ \int ^{T}_0\Vert \bar{X}_{t}-\bar{X}_{t(\delta )}\Vert ^2_{H_1} dt\right] \leqslant C_{T}\delta ^{1/2}(1+\Vert x\Vert ^m_{H_1}). \end{aligned}$$
(3.22)

Next, we first prove that \(X_{t}^{{\varepsilon }}\) converges strongly to \(\bar{X}_t\) for \(t<\tau _{R}\); the proof of the main result then follows from the fact that the difference process \(X^{{\varepsilon }}_t-\bar{X}_t\) after the stopping time is sufficiently small when R is large enough, which is proved in Sect. 3.5.

Proposition 3.6

For any \((x,y)\in H_1\times H_2\), \(T,R>0\) and \({\varepsilon }\in (0,1)\), there exist constants \(C_{R,T}, m>0\) such that

$$\begin{aligned}&\mathbb {E}\left( \sup _{t\in [0, T\wedge \tau _R]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}\right) \nonumber \\&\quad \leqslant C_{R,T}(1+\Vert x\Vert _{H_1}^m+\Vert y\Vert ^{m}_{H_2})\left( \frac{{\varepsilon }}{\delta }+\frac{{\varepsilon }^{1/2}}{\delta ^{1/2}}+\delta ^{1/2}+\delta ^{1/4}\right) , \end{aligned}$$
(3.23)

where

$$\begin{aligned} \tau _R:=\inf \left\{ t\geqslant 0:\int _0^t(1+\Vert \tilde{\bar{X}}_s\Vert _{V_1}^{\alpha })(1+\Vert \bar{X}_s\Vert _{H_1}^{\beta })ds\geqslant R\right\} . \end{aligned}$$

Proof

We will divide the proof into three steps.

Step 1 We note that

$$\begin{aligned} X_{t}^{{\varepsilon }}-\bar{X}_{t}= & {} \int ^t_0 \left[ A(\tilde{X}_s^\varepsilon )-A(\tilde{\bar{X}}_s)\right] ds +\int ^t_0\left[ F(X_{s}^\varepsilon ,Y_s^\varepsilon )-\bar{F}(\bar{X}_s)\right] ds\\{} & {} +\int ^t_0\left[ G_1(\tilde{X}_s^\varepsilon )-G_1(\tilde{\bar{X}}_s)\right] dW_s^{1}. \end{aligned}$$

Applying Itô’s formula, we have

$$\begin{aligned} \Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}{} & {} =2\int _0^t{_{V^{*}_1}}\langle A(\tilde{X}_s^\varepsilon )-A(\tilde{\bar{X}}_s), \tilde{X}_{s}^{{\varepsilon }}-\tilde{\bar{X}}_{s}\rangle _{V_1} ds\\{} & {} \quad +\int _0^t\Vert G_1(\tilde{X}_s^\varepsilon )-G_1(\tilde{\bar{X}}_s) \Vert _{L_{2}(U_1, H_1)}^2ds\\{} & {} \quad +2\int _0^t\left\langle \left[ F(X_{s}^\varepsilon ,Y_s^\varepsilon )-\bar{F}(\bar{X}_s)\right] , X_{s}^{{\varepsilon }}-\bar{X}_{s}\right\rangle _{H_1} ds\\{} & {} \quad +2\int _0^t\langle X_{s}^{{\varepsilon }}-\bar{X}_{s}, [G_1(\tilde{X}_s^\varepsilon )-G_1(\tilde{\bar{X}}_s)]dW_s^{1}\rangle _{H_1}\\{} & {} = 2\int _0^t{_{V^{*}_1}}\langle A(\tilde{X}_s^\varepsilon )-A(\tilde{\bar{X}}_s), \tilde{X}_{s}^{{\varepsilon }}-\tilde{\bar{X}}_{s}\rangle _{V_1} ds\\{} & {} \quad +\int _0^t\Vert G_1(\tilde{X}_s^\varepsilon )-G_1(\tilde{\bar{X}}_s) \Vert _{L_{2}(U_1, H_1)}^2ds\\{} & {} \quad +2\int _0^t\left\langle \left[ \bar{F}(X_{s}^\varepsilon )-\bar{F}(\bar{X}_s)\right] , X_{s}^{{\varepsilon }}-\bar{X}_{s}\right\rangle _{H_1} ds\\{} & {} \quad +2\int _0^t\left\langle \left[ F(X_{s}^\varepsilon ,Y_s^\varepsilon )-\bar{F}(X^{{\varepsilon }}_s)-F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )+\bar{F}(X^{{\varepsilon }}_{s(\delta )})\right] , X_{s}^{{\varepsilon }}-\bar{X}_{s}\right\rangle _{H_1} ds\\{} & {} \quad +2\int _0^t\left\langle \left[ F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )-\bar{F}(X^{{\varepsilon }}_{s(\delta )})\right] , X_{s}^{{\varepsilon }}-X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s}+\bar{X}_{s(\delta )}\right\rangle _{H_1} ds\\{} & {} \quad +2\int _0^t\left\langle \left[ F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )-\bar{F}(X^{{\varepsilon }}_{s(\delta )})\right] , X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\right\rangle _{H_1} ds\\{} & {} \quad +2\int _0^t\langle X_{s}^{{\varepsilon }}-\bar{X}_{s}, [G_1(\tilde{X}_s^\varepsilon )-G_1(\tilde{\bar{X}}_s)]dW_s^{1}\rangle 
_{H_1}. \end{aligned}$$

Then conditions A2 and A3 imply \(\mathbb {P}\)-a.s.,

$$\begin{aligned} \Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}\leqslant & {} C\int _0^t\Vert X_{s}^{{\varepsilon }}-\bar{X}_{s}\Vert ^2_{H_1}(1+\Vert \tilde{\bar{X}}_s\Vert _{V_1}^{\alpha })(1+\Vert \bar{X}_s\Vert _{H_1}^{\beta })ds\\{} & {} +C\int _0^t\left( \Vert X_{s}^{{\varepsilon }}-X_{s(\delta )}^{{\varepsilon }}\Vert ^2_{H_1}+\Vert Y^{{\varepsilon }}_s-\hat{Y}^{{\varepsilon }}_{s}\Vert ^2_{H_2}+\Vert \bar{X}_s-{\bar{X}}_{s(\delta )}\Vert ^2_{H_1}\right) ds\\{} & {} +C\left[ \int _0^t\Vert F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )-\bar{F}(X^{\varepsilon }_{s(\delta )})\Vert ^2_{H_1}ds\right] ^{1/2}\\{} & {} \times \left[ \int ^t_0 \left( \Vert X^{\varepsilon }_{s}-X^{\varepsilon }_{s(\delta )}\Vert ^2_{H_1}+\Vert {\bar{X}}_{s}-{\bar{X}}_{s(\delta )}\Vert ^2_{H_1} \right) ds\right] ^{1/2}\\{} & {} +2\left| \int _0^t\left\langle \left[ F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )-\bar{F}(X^{{\varepsilon }}_{s(\delta )})\right] , X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\right\rangle _{H_1} ds\right| \\{} & {} +2\left| \int _0^t\langle X_{s}^{{\varepsilon }}-\bar{X}_{s}, [G_1(X_s^\varepsilon )-G_1(\tilde{\bar{X}}_s)]dW_s^{1}\rangle _{H_1}\right| . \end{aligned}$$

Applying Gronwall’s inequality and the definition of the stopping time \(\tau _{R}\), we deduce that

$$\begin{aligned}{} & {} \sup _{t\in [0, T\wedge \tau _R]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}\\{} & {} \quad \leqslant C_{R,T}\Bigg \{\int _0^{T}\left( \Vert X_{s}^{{\varepsilon }}-X_{s(\delta )}^{{\varepsilon }}\Vert ^2_{H_1}+\Vert Y^{{\varepsilon }}_s-\hat{Y}^{{\varepsilon }}_{s}\Vert ^2_{H_2}+\Vert \bar{X}_s-{\bar{X}}_{s(\delta )}\Vert ^2_{H_1}\right) ds\\{} & {} \qquad +\left[ \int _0^{T}\left( 1+\Vert X_{s(\delta )}^\varepsilon \Vert ^2_{H_1}+\Vert \hat{Y}_s^\varepsilon \Vert ^2_{H_2}\right) ds\right] ^{1/2}\\{} & {} \quad \times \left[ \int ^T_0 \left( \Vert X^{\varepsilon }_{s}-X^{\varepsilon }_{s(\delta )}\Vert ^2_{H_1}+\Vert {\bar{X}}_{s}-{\bar{X}}_{s(\delta )}\Vert ^2_{H_1}\right) ds\right] ^{1/2}\\{} & {} \qquad +\sup _{t\in [0, T]}\left| \int _0^t\left\langle \left[ F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )-\bar{F}(X^{{\varepsilon }}_{s(\delta )})\right] , X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\right\rangle _{H_1} ds\right| \\{} & {} \qquad +\sup _{t\in [0, T\wedge \tau _R]}\left| \int _0^t\langle X_{s}^{{\varepsilon }}-\bar{X}_{s}, [G_1(X_s^\varepsilon )-G_1(\tilde{\bar{X}}_s)]dW_s^{1}\rangle _{H_1}\right| \Bigg \}. \end{aligned}$$

Note that

$$\begin{aligned} \left\{ \int _0^t\langle X_{s}^{{\varepsilon }}-\bar{X}_{s}, [G_1(X_s^\varepsilon )-G_1(\tilde{\bar{X}}_s)]dW_s^{1}\rangle _{H_1}\right\} _{0\leqslant t\leqslant T} \end{aligned}$$

is a local martingale and \(T\wedge \tau _R\) is a stopping time; thus, applying the Burkholder–Davis–Gundy inequality (see e.g. [20, Proposition D.0.1]) and estimates (3.5), (3.16) and (3.22), there exists \(m>0\) such that

$$\begin{aligned}{} & {} \mathbb {E}\left( \sup _{t\in [0, T\wedge \tau _R]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}\right) \\{} & {} \quad \leqslant C_{R,T}\left( 1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^m_{H_2}\right) \delta ^{1/4}\\{} & {} \qquad +\frac{1}{2}\mathbb {E} \left( \sup _{t\in [0, T\wedge \tau _R]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}\right) \\{} & {} \qquad +C_{R,T}\mathbb {E}\left[ \sup _{t\in [0, T]}\left| \int _0^t\left\langle F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )-\bar{F}(X^{{\varepsilon }}_{s(\delta )}), X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\right\rangle _{H_1} ds\right| \right] \\{} & {} \qquad +C_{R,T}\mathbb {E}\left( \int _0^{T\wedge \tau _R}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}dt\right) , \end{aligned}$$

which implies

$$\begin{aligned}{} & {} \mathbb {E}\left( \sup _{t\in [0, T\wedge \tau _R]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}\right) \\{} & {} \quad \leqslant C_{R,T}\left( 1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^m_{H_2}\right) \delta ^{1/4}\\{} & {} \qquad +C_{R,T}\mathbb {E}\left[ \sup _{t\in [0, T]}\left| \int _0^t\left\langle F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )-\bar{F}(X^{{\varepsilon }}_{s(\delta )}), X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\right\rangle _{H_1} ds\right| \right] \\{} & {} \qquad +C_{R,T}\int _0^T\mathbb {E}\left( \sup _{s\in [0, t\wedge \tau _R]}\Vert X_{s}^{{\varepsilon }}-\bar{X}_{s}\Vert ^2_{H_1}\right) dt. \end{aligned}$$

Applying Gronwall’s inequality, we finally get

$$\begin{aligned}{} & {} \mathbb {E}\left( \sup _{t\in [0, T\wedge \tau _R]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}\right) \\{} & {} \quad \leqslant C_{R,T}\left( 1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^m_{H_2}\right) \delta ^{1/4}\\{} & {} \qquad +C_{R,T}\mathbb {E}\left[ \sup _{t\in [0, T]}\left| \int _0^t\left\langle F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )-\bar{F}(X^{{\varepsilon }}_{s(\delta )}), X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\right\rangle _{H_1} ds\right| \right] . \end{aligned}$$

Hence, the proof will be completed by the following estimate:

$$\begin{aligned}{} & {} \mathbb {E}\left[ \sup _{t\in [0, T]}\left| \int _0^t\left\langle F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )-\bar{F}(X^{{\varepsilon }}_{s(\delta )}), X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\right\rangle _{H_1}ds\right| \right] \nonumber \\{} & {} \quad \leqslant C_T\left( 1+\Vert x\Vert ^2_{H_1}+\Vert y\Vert ^2_{H_2}\right) \left( \frac{{\varepsilon }}{\delta }+\frac{{\varepsilon }^{1/2}}{\delta ^{1/2}}+\delta ^{1/2}\right) , \end{aligned}$$
(3.24)

whose proof will be given in Step 2.

Step 2 We note that

$$\begin{aligned}{} & {} \left| \int _{0}^{t}\langle F(X_{s(\delta )}^{{\varepsilon }},\hat{Y}_{s}^{{\varepsilon }})-\bar{F}(X^{{\varepsilon }}_{s(\delta )}), X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\rangle _{H_1} ds\right| \nonumber \\{} & {} \quad =\left| \sum _{k=0}^{[t/\delta ]-1} \int _{k\delta }^{(k+1)\delta }\langle F(X_{s(\delta )}^{{\varepsilon }},\hat{Y}_{s}^{{\varepsilon }})-\bar{F}(X^{{\varepsilon }}_{s(\delta )}), X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\rangle _{H_1} ds\right. \nonumber \\{} & {} \qquad \left. +\int _{t(\delta )}^{t}\langle F(X_{s(\delta )}^{{\varepsilon }},\hat{Y}_{s}^{{\varepsilon }})-\bar{F}(X^{{\varepsilon }}_{s(\delta )}), X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\rangle _{H_1} ds\right| \nonumber \\{} & {} \quad \leqslant \sum _{k=0}^{[t/\delta ]-1} \left| \int _{k\delta }^{(k+1)\delta }\langle F(X_{s(\delta )}^{{\varepsilon }},\hat{Y}_{s}^{{\varepsilon }})-\bar{F}(X^{{\varepsilon }}_{s(\delta )}), X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\rangle _{H_1} ds\right| \nonumber \\{} & {} \qquad +\left| \int _{t(\delta )}^{t}\langle F(X_{s(\delta )}^{{\varepsilon }},\hat{Y}_{s}^{{\varepsilon }})-\bar{F}(X^{{\varepsilon }}_{s(\delta )}), X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\rangle _{H_1} ds\right| \nonumber \\{} & {} \quad := J_1(t)+J_2(t). \end{aligned}$$
(3.25)

For the term \(J_2(t)\), it is easy to see

$$\begin{aligned} \mathbb {E}\left[ \sup _{t\in [0, T]}J_2(t)\right]\leqslant & {} C\left[ \mathbb {E}\sup _{t\in [0, T]}\Vert X^{{\varepsilon }}_t-\bar{X}_{t}\Vert ^2_{H_1}\right] ^{1/2}\nonumber \\{} & {} \quad \left[ \mathbb {E}\sup _{t\in [0,T]}\left| \int _{t(\delta )}^{t}(1+\Vert X^{{\varepsilon }}_{s(\delta )}\Vert _{H_1}+\Vert \hat{Y}_{s}^{{\varepsilon }}\Vert _{H_2})ds\right| ^2\right] ^{1/2}\nonumber \\\leqslant & {} C\left[ \mathbb {E}\sup _{t\in [0, T]}\Vert X^{{\varepsilon }}_t-\bar{X}_{t}\Vert ^2_{H_1}\right] ^{1/2}\nonumber \\{} & {} \quad \left[ \mathbb {E}\int _{0}^{T}(1+\Vert X^{{\varepsilon }}_{s(\delta )}\Vert ^2_{H_1}+\Vert \hat{Y}_{s}^{{\varepsilon }}\Vert ^2_{H_2})ds\right] ^{1/2}\delta ^{1/2}\nonumber \\\leqslant & {} C_{T}(\Vert x\Vert ^{2}_{H_1}+\Vert y\Vert ^{2}_{H_2}+1)\delta ^{1/2}. \end{aligned}$$
(3.26)

For the term \(J_1(t)\), we have

$$\begin{aligned}{} & {} \mathbb {E}\left[ \sup _{t\in [0, T]}J_1(t)\right] \leqslant \mathbb {E}\sum _{k=0}^{[T/\delta ]-1} \left| \int _{k\delta }^{(k+1)\delta }\langle F(X_{k\delta }^{{\varepsilon }},\hat{Y}_{s}^{{\varepsilon }})-\bar{F}(X_{k\delta }^{{\varepsilon }}), X_{k\delta }^{{\varepsilon }}-\bar{X}_{k\delta }\rangle _{H_1} ds\right| \\{} & {} \quad \leqslant \frac{C_{T}}{\delta }\max _{0\leqslant k\leqslant [T/\delta ]-1}\mathbb {E}\left| \int _{k\delta }^{(k+1)\delta } \langle F(X_{k\delta }^{{\varepsilon }},\hat{Y}_{s}^{{\varepsilon }})-\bar{F}(X_{k\delta }^{{\varepsilon }}), X_{k\delta }^{{\varepsilon }}-\bar{X}_{k\delta }\rangle _{H_1} ds\right| \\{} & {} \quad \leqslant \frac{C_{T}{\varepsilon }}{\delta }\max _{0\leqslant k\leqslant [T/\delta ]-1}\left[ \mathbb {E}\Vert X^{{\varepsilon }}_{k\delta }-{\bar{X}}_{k\delta }\Vert ^2_{H_1}\right] ^{1/2}\\{} & {} \qquad \left[ \mathbb {E}\left\| \int _{0}^{\frac{\delta }{{\varepsilon }}} F(X_{k\delta }^{{\varepsilon }},\hat{Y}_{s{\varepsilon }+k\delta }^{{\varepsilon }})-\bar{F}(X_{k\delta }^{{\varepsilon }})ds\right\| ^2_{H_1}\right] ^{1/2}\\{} & {} \quad \leqslant \frac{C_{T}(1+\Vert x\Vert _{H_1}+\Vert y\Vert _{H_2}){\varepsilon }}{\delta }\max _{0\leqslant k\leqslant [T/\delta ]-1}\left[ \int _{0}^{\frac{\delta }{{\varepsilon }}} \int _{r}^{\frac{\delta }{{\varepsilon }}}\Psi _{k}(s,r)dsdr\right] ^{1/2}, \end{aligned}$$

where for any \(0\leqslant r\leqslant s\leqslant \frac{\delta }{{\varepsilon }}\),

$$\begin{aligned} \Psi _{k}(s,r):={} & {} \mathbb {E}\left[ \langle F(X_{k\delta }^{{\varepsilon }},\hat{Y}_{s{\varepsilon }+k\delta }^{{\varepsilon }})-\bar{F}(X_{k\delta }^{{\varepsilon }}), F(X_{k\delta }^{{\varepsilon }},\hat{Y}_{r{\varepsilon }+k\delta }^{{\varepsilon }})-\bar{F}(X_{k\delta }^{{\varepsilon }})\rangle _{H_1}\right] , \end{aligned}$$

and

$$\begin{aligned} \Psi _{k}(s,r)\leqslant C_{T}(\Vert x\Vert ^{2}_{H_1}+\Vert y\Vert ^{2}_{H_2}+1)e^{-\frac{(s-r)\gamma }{2}}, \end{aligned}$$
(3.27)

whose proof will be presented in Step 3. Hence, we get

$$\begin{aligned}{} & {} \mathbb {E}\left[ \sup _{t\in [0, T]}\left| \int _0^t\left\langle F(X_{s(\delta )}^\varepsilon ,\hat{Y}_s^\varepsilon )-\bar{F}(X^{{\varepsilon }}_{s(\delta )}), X_{s(\delta )}^{{\varepsilon }}-\bar{X}_{s(\delta )}\right\rangle _{H_1} ds\right| \right] \\{} & {} \quad \leqslant C_{T}(\Vert x\Vert ^{2}_{H_1}+\Vert y\Vert ^{2}_{H_2}+1)\frac{{\varepsilon }}{\delta } \left[ \int _{0}^{\frac{\delta }{{\varepsilon }}}\int _{r}^{\frac{\delta }{{\varepsilon }}}e^{-\frac{(s-r)\gamma }{2}}dsdr\right] ^{1/2}\\{} & {} \qquad +C_{T}(\Vert x\Vert ^{2}_{H_1}+\Vert y\Vert ^{2}_{H_2}+1)\delta ^{1/2} \\{} & {} \quad = C_{T}(\Vert x\Vert ^{2}_{H_1}+\Vert y\Vert ^{2}_{H_2}+1)\frac{{\varepsilon }}{\delta }\Big (\frac{\delta }{\gamma {\varepsilon }}-\frac{1}{\gamma ^{2}}\\{} & {} \qquad +\frac{1}{\gamma ^2}e^{-\frac{\gamma \delta }{{\varepsilon }}}\Big )^{1/2} +C_{T}(\Vert x\Vert ^{2}_{H_1}+\Vert y\Vert ^{2}_{H_2}+1)\delta ^{1/2} \\{} & {} \quad \leqslant C_{T}(\Vert x\Vert ^{2}_{H_1}+\Vert y\Vert ^{2}_{H_2}+1)\left( \frac{{\varepsilon }}{\delta }+\frac{{\varepsilon }^{1/2}}{\delta ^{1/2}}+\delta ^{1/2}\right) , \end{aligned}$$

which completes the proof of estimate (3.24).

Step 3 For any \(s>0\), and any \(\mathscr {F}_s\)-measurable \(H_1\)-valued random variable X and \(H_2\)-valued random variable Y, we consider the following equation:

$$\begin{aligned} \left\{ \begin{aligned}&dY_{t}=\frac{1}{{\varepsilon }}B(X,Y_{t})dt+\frac{1}{\sqrt{{\varepsilon }}}G_2(X,Y_t)dW_{t}^{2},\quad t\geqslant s,\\&Y_{s}=Y, \end{aligned} \right. \end{aligned}$$

which has a unique solution \(\tilde{Y}^{{\varepsilon },s,X,Y}_t\). Then by the construction of \(\hat{Y}_{t}^{{\varepsilon }}\), for any \(k\in \mathbb {N}_{*}\) and \(t\in [k\delta ,(k+1)\delta ]\) we have \(\mathbb {P}\)-a.s.,

$$\begin{aligned} \hat{Y}_{t}^{{\varepsilon }}={\tilde{Y}}^{{\varepsilon },k\delta ,X_{k\delta }^{{\varepsilon }},\hat{Y}_{k\delta }^{{\varepsilon }}}_t, \end{aligned}$$

which implies

$$\begin{aligned} \Psi _{k}(s,r)={} & {} \mathbb {E}\left[ \langle F(X_{k\delta }^{{\varepsilon }},\tilde{Y}^{{\varepsilon }, k\delta ,X_{k\delta }^{{\varepsilon }}, {\hat{Y}}_{k\delta }^{{\varepsilon }}}_{s{\varepsilon }+k\delta })-\bar{F}(X_{k\delta }^{{\varepsilon }}), F(X_{k\delta }^{{\varepsilon }},\tilde{Y}^{{\varepsilon }, k\delta ,X_{k\delta }^{{\varepsilon }}, {\hat{Y}}_{k\delta }^{{\varepsilon }}}_{r{\varepsilon }+k\delta })-\bar{F}(X_{k\delta }^{{\varepsilon }})\rangle _{H_1}\right] \\ ={} & {} \int _{\Omega }\mathbb {E}\left[ \langle F(X_{k\delta }^{{\varepsilon }},\tilde{Y}^{{\varepsilon }, k\delta ,X_{k\delta }^{{\varepsilon }}, {\hat{Y}}_{k\delta }^{{\varepsilon }}}_{s{\varepsilon }+k\delta })-\bar{F}(X_{k\delta }^{{\varepsilon }}), \right. \\{} & {} \quad \quad \quad \left. F(X_{k\delta }^{{\varepsilon }},\tilde{Y}^{{\varepsilon }, k\delta ,X_{k\delta }^{{\varepsilon }}, {\hat{Y}}_{k\delta }^{{\varepsilon }}}_{r{\varepsilon }+k\delta })-\bar{F}(X_{k\delta }^{{\varepsilon }})\rangle _{H_1}| \mathscr {F}_{k\delta }\right] (\omega )\mathbb {P}(d \omega )\\ ={} & {} \int _{\Omega }\mathbb {E}\left[ \langle F(X_{k\delta }^{{\varepsilon }},\tilde{Y}^{{\varepsilon }, k\delta ,X_{k\delta }^{{\varepsilon }}(\omega ), {\hat{Y}}_{k\delta }^{{\varepsilon }}(\omega )}_{s{\varepsilon }+k\delta })-\bar{F}(X_{k\delta }^{{\varepsilon }}(\omega ))\right. , \\{} & {} \quad \quad \quad \left. F(X_{k\delta }^{{\varepsilon }}(\omega ),\tilde{Y}^{{\varepsilon }, k\delta ,X_{k\delta }^{{\varepsilon }}(\omega ), {\hat{Y}}_{k\delta }^{{\varepsilon }}(\omega )}_{r{\varepsilon }+k\delta })-\bar{F}(X_{k\delta }^{{\varepsilon }}(\omega ))\rangle _{H_1}\right] \mathbb {P}(d \omega ), \end{aligned}$$

where the last equality comes from the fact that \(X_{k\delta }^{{\varepsilon }}\) and \({\hat{Y}}_{k\delta }^{{\varepsilon }}\) are \(\mathscr {F}_{k\delta }\)-measurable, and for any fixed \((x,y)\in H_1\times H_2\), \(\{{\tilde{Y}}^{{\varepsilon }, k\delta ,x,y}_{s{\varepsilon }+k\delta }\}_{s\geqslant 0}\) is independent of \(\mathscr {F}_{k\delta }\).

By the definition of process \(\tilde{Y}^{{\varepsilon },k\delta ,x,y}_t\), for its \(dt\otimes \mathbb {P}\)-equivalence class \(\check{\tilde{Y}}^{{\varepsilon },k\delta ,x,y}\) we have \(\check{\tilde{Y}}^{{\varepsilon },k\delta ,x,y}\in L^{\kappa }([k\delta , T]\times \Omega , dt\otimes \mathbb {P}; V_2)\cap L^2([k\delta , T]\times \Omega , dt\otimes \mathbb {P}; H_2)\) with \(\kappa \) as in B3 and \(\mathbb {P}\)-a.s.

$$\begin{aligned} \tilde{Y}^{{\varepsilon },k\delta ,x,y}_{s{\varepsilon }+k\delta }= & {} y+\frac{1}{{\varepsilon }}\int ^{s{\varepsilon }+k\delta }_{k\delta } B(x,\tilde{\tilde{Y}}^{{\varepsilon },k\delta ,x,y}_r)dr+\frac{1}{\sqrt{{\varepsilon }}}\int ^{s{\varepsilon }+k\delta }_{k\delta } G_2(x,\tilde{\tilde{Y}}^{{\varepsilon },k\delta ,x,y}_r)dW^2_r\nonumber \\= & {} y+\frac{1}{{\varepsilon }}\int ^{s{\varepsilon }}_{0} B(x,\tilde{\tilde{Y}}^{{\varepsilon },k\delta ,x,y}_{r+k\delta })dr+\frac{1}{\sqrt{{\varepsilon }}}\int ^{s{\varepsilon }}_{0} G_2(x,\tilde{\tilde{Y}}^{{\varepsilon },k\delta ,x,y}_{r+k\delta })dW^{2,k\delta }_r\nonumber \\= & {} y+\int ^{s}_{0} B(x,\tilde{\tilde{Y}}^{{\varepsilon },k\delta ,x,y}_{r{\varepsilon }+k\delta })dr+\int ^{s}_{0} G_2(x,\tilde{\tilde{Y}}^{{\varepsilon },k\delta ,x,y}_{r{\varepsilon }+k\delta })d\hat{W}^{2,k\delta }_r, \end{aligned}$$
(3.28)

where \(\tilde{\tilde{Y}}^{{\varepsilon },k\delta ,x,y}\) is any \(V_2\)-valued progressively measurable \(dt\otimes \mathbb {P}\)-version of \(\check{\tilde{Y}}^{{\varepsilon },k\delta ,x,y}\), \(\{W^{2, k\delta }_r:=W^2_{r+k\delta }-W^2_{k\delta }\}_{r\geqslant 0}\) and \(\{{\hat{W}}^{2,k\delta }_t:=\frac{1}{\sqrt{{\varepsilon }}}W^{2,k\delta }_{t{\varepsilon }}\}_{t\geqslant 0}\).
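The last equality in (3.28) rests on the scaling invariance of Brownian motion: for each real-valued component \(\beta \) of \(W^{2,k\delta }\), the rescaled process \(\hat{\beta }_t:=\frac{1}{\sqrt{{\varepsilon }}}\beta _{t{\varepsilon }}\) satisfies

$$\begin{aligned} \mathbb {E}\big [\hat{\beta }_t\hat{\beta }_s\big ]=\frac{1}{{\varepsilon }}\mathbb {E}\big [\beta _{t{\varepsilon }}\beta _{s{\varepsilon }}\big ]=\frac{1}{{\varepsilon }}(t{\varepsilon }\wedge s{\varepsilon })=t\wedge s, \end{aligned}$$

so \(\{\hat{W}^{2,k\delta }_t\}_{t\geqslant 0}\) is again a cylindrical Wiener process in \(U_2\); moreover, it is independent of \(\mathscr {F}_{k\delta }\), since it is built from the increments of \(W^{2}\) after time \(k\delta \).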

The uniqueness of solutions to equations (3.28) and (3.18) implies that the distribution of \(({\tilde{Y}}^{{\varepsilon }, k\delta , x,y}_{s{\varepsilon }+k\delta })_{0\leqslant s\leqslant \delta /{\varepsilon }}\) coincides with that of \((Y_{s}^{x, y})_{0\leqslant s\leqslant \delta /{\varepsilon }}\). Then by Proposition 3.4 and estimates (3.1) and (3.15), we have

$$\begin{aligned} \Psi _{k}(s,r)= & {} \int _{\Omega }\Big [\tilde{\mathbb {E}} \big \langle F\left( X_{k\delta }^{{\varepsilon }}(\omega ),Y^{X_{k\delta }^{{\varepsilon }}(\omega ), {\hat{Y}}_{k\delta }^{{\varepsilon }}(\omega )}_{s}\right) -\bar{F}(X_{k\delta }^{{\varepsilon }}(\omega )),\\{} & {} \quad F\left( X_{k\delta }^{{\varepsilon }}(\omega ),Y^{X_{k\delta }^{{\varepsilon }}(\omega ), {\hat{Y}}_{k\delta }^{{\varepsilon }}(\omega )}_{r}\right) -\bar{F}(X_{k\delta }^{{\varepsilon }}(\omega ))\big \rangle _{H_1} \Big ]\mathbb {P}(d\omega )\\= & {} \int _{\Omega }\int _{\tilde{\Omega }}\big \langle \tilde{\mathbb {E}}\Big [ F\left( X_{k\delta }^{{\varepsilon }}(\omega ),Y^{X_{k\delta }^{{\varepsilon }}(\omega ),Y_{r}^{X_{k\delta }^{{\varepsilon }}(\omega ),{\hat{Y}}_{k\delta }^{{\varepsilon }}(\omega )}(\tilde{\omega })}_{s-r}\right) -\bar{F}( X_{k\delta }^{{\varepsilon }}(\omega ))\Big ],\\{} & {} \quad F\left( X_{k\delta }^{{\varepsilon }}(\omega ),Y^{X_{k\delta }^{{\varepsilon }}(\omega ), {\hat{Y}}_{k\delta }^{{\varepsilon }}(\omega )}_{r}(\tilde{\omega })\right) -\bar{F}( X_{k\delta }^{{\varepsilon }}(\omega ))\big \rangle _{H_1}\tilde{\mathbb {P}}(d\tilde{\omega })\mathbb {P}(d\omega )\\\leqslant & {} \int _{\Omega }\int _{\tilde{\Omega }}\left[ 1+\Vert X_{k\delta }^{{\varepsilon }}(\omega )\Vert _{H_1}+\Vert Y_{r}^{X_{k\delta }^{{\varepsilon }}(\omega ), {\hat{Y}}_{k\delta }^{{\varepsilon }}(\omega )}(\tilde{\omega })\Vert _{H_2}\right] e^{-\frac{(s-r)\gamma }{2}}\\{} & {} \quad \cdot \left[ 1+\Vert X_{k\delta }^{{\varepsilon }}(\omega )\Vert _{H_1}+\Vert Y_{r}^{X_{k\delta }^{{\varepsilon }}(\omega ), {\hat{Y}}_{k\delta }^{{\varepsilon }}(\omega )}(\tilde{\omega })\Vert _{H_2}\right] \tilde{\mathbb {P}}(d\tilde{\omega })\mathbb {P}(d\omega )\\\leqslant & {} C_T\int _{\Omega }\left( 1+\Vert X^{{\varepsilon }}_{k\delta }(\omega )\Vert ^{2}_{H_1}+\Vert {\hat{Y}}_{k\delta }^{{\varepsilon }}(\omega )\Vert ^{2}_{H_2}\right) \mathbb {P}(d\omega )e^{-\frac{(s-r)\gamma }{2}}\\\leqslant & {} C_{T}(\Vert x\Vert ^{2}_{H_1}+\Vert y\Vert ^{2}_{H_2}+1)e^{-\frac{(s-r)\gamma }{2}}, \end{aligned}$$

which gives estimate (3.27). The proof is complete. \(\square \)

3.5 Proof of Theorem 2.4

Applying Chebyshev’s inequality and Lemmas 3.1 and 3.5, we have

$$\begin{aligned}{} & {} \mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1} 1_{\{T>\tau _{R}\}}\right) \leqslant \left[ \mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{4}_{H_1}\right) \right] ^{1/2}\nonumber \\{} & {} \quad \cdot \left[ \mathbb {P}\left( T>\tau _{R}\right) \right] ^{1/2} \nonumber \\{} & {} \quad \leqslant \frac{C_{T}(1+\Vert x\Vert ^2_{H_1}+\Vert y\Vert ^{2}_{H_2})}{\sqrt{R}}\nonumber \\{} & {} \qquad \times \left[ \mathbb {E}\int _0^T(1+\Vert \tilde{\bar{X}}_s\Vert _{V_1}^{\alpha })(1+\Vert \bar{X}_s\Vert _{H_1}^{\beta })ds\right] ^{1/2}\nonumber \\{} & {} \quad \leqslant \frac{C_{T}(1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^{m}_{H_2})}{\sqrt{R}}, \end{aligned}$$
(3.29)

where m is a positive constant. Then taking \(\delta ={\varepsilon }^{\frac{2}{3}}\), estimates (3.23) and (3.29) give

$$\begin{aligned} \mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}\right)\leqslant & {} \mathbb {E}\left( \sup _{t\in [0, T]}\Vert X^{{\varepsilon }}_{t}-\bar{X}_{t}\Vert ^2_{H_1} 1_{\{T\leqslant \tau _{R}\}}\right) \\{} & {} +\mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1} 1_{\{T>\tau _{R}\}}\right) \\\leqslant & {} C_{R,T}(1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^{m}_{H_2}) {\varepsilon }^{\frac{1}{6}}\\{} & {} +\frac{C_{T}(1+\Vert x\Vert ^m_{H_1}+\Vert y\Vert ^{m}_{H_2})}{\sqrt{R}}. \end{aligned}$$
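The choice \(\delta ={\varepsilon }^{\frac{2}{3}}\) balances the competing error terms in (3.23): with this choice,

$$\begin{aligned} \frac{{\varepsilon }}{\delta }={\varepsilon }^{\frac{1}{3}},\quad \frac{{\varepsilon }^{1/2}}{\delta ^{1/2}}={\varepsilon }^{\frac{1}{6}},\quad \delta ^{1/2}={\varepsilon }^{\frac{1}{3}},\quad \delta ^{1/4}={\varepsilon }^{\frac{1}{6}}, \end{aligned}$$

so the bound in (3.23) is of order \({\varepsilon }^{\frac{1}{6}}\); the exponent \(\frac{2}{3}\) is exactly the one equating the two dominant terms \({\varepsilon }^{1/2}/\delta ^{1/2}\) and \(\delta ^{1/4}\).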

Now, letting \({\varepsilon }\rightarrow 0\) first, then \(R\rightarrow \infty \), we have

$$\begin{aligned} \lim _{{\varepsilon }\rightarrow 0}\mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^2_{H_1}\right) =0. \end{aligned}$$
(3.30)

Note that for any \(p>1\), by Lemmas 3.1 and 3.5, it is easy to see that

$$\begin{aligned} \mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{2p}_{H_1}\right) &= \mathbb {E}\left[ \sup _{t\in [0, T]}\left( \Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert _{H_1} \Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{2p-1}_{H_1}\right) \right] \\ &\leqslant C_p\left[ \mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{2}_{H_1}\right) \right] ^{1/2} \left[ \mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{4p-2}_{H_1}\right) \right] ^{1/2}\\ &\leqslant C_{p, T}(1+\Vert x\Vert ^{2p-1}_{H_1}+\Vert y\Vert ^{2p-1}_{H_2}) \left[ \mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{2}_{H_1}\right) \right] ^{1/2}. \end{aligned}$$

Hence, by (3.30), we finally get

$$\begin{aligned} \lim _{{\varepsilon }\rightarrow 0}\mathbb {E}\left( \sup _{t\in [0, T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{2p}_{H_1}\right) =0. \end{aligned}$$

The proof is complete. \(\square \)

4 Application to Examples

In this section we apply our main result to establish the averaging principle for stochastic porous medium equations, p-Laplace equations, Burgers equations and 2D Navier–Stokes equations with slow and fast time-scales. Since our main interest is the nonlinear operator A, for simplicity we take a stochastic porous medium, p-Laplace, Burgers or 2D Navier–Stokes equation as the slow component and a stochastic heat equation with Lipschitz drift as the fast component.

Let \(\Lambda \subset \mathbb {R}^d\) be an open bounded domain and \(\Delta \) the Laplace operator on \(\Lambda \) with Dirichlet boundary conditions. For \(p\in [1, \infty )\) we write \(L^p(\Lambda )\) for the space of p-integrable functions on \(\Lambda \) and \(H^{n,p}_0(\Lambda )\) for the Sobolev space of order n in \(L^p(\Lambda )\) with Dirichlet boundary conditions. Recall that \(X^{*}\) denotes the dual space of a Banach space X.

4.1 Stochastic Porous Medium Equations

Let \(\Psi :\mathbb {R}\rightarrow \mathbb {R}\) be a function with the following properties:

\((\Psi 1)\):

\(\Psi \) is continuous.

\((\Psi 2)\):

For all \(s, t\in \mathbb {R}\)

$$\begin{aligned} (t-s)(\Psi (t)-\Psi (s))\geqslant 0. \end{aligned}$$
\((\Psi 3)\):

There exist \(p\in [2,\infty ), c_1\in (0, \infty ), c_2\in [0, \infty )\) such that for all \(s\in \mathbb {R}\)

$$\begin{aligned} s\Psi (s)\geqslant c_1|s|^p-c_2. \end{aligned}$$
\((\Psi 4)\):

There exist \(c_3,c_4\in (0, \infty )\) such that for all \(s\in \mathbb {R}\)

$$\begin{aligned} |\Psi (s)|\leqslant c_3|s|^{p-1}+c_4, \end{aligned}$$

where p is as in \((\Psi 3)\).

Consider the Gelfand triple for the slow equation

$$\begin{aligned} V_1:=L^p(\Lambda )\subseteq H_1:=(H^{1,2}_0(\Lambda ))^{*}\subseteq V_1^{*}:=(L^p(\Lambda ))^{*} \end{aligned}$$

and the Gelfand triple for the fast equation

$$\begin{aligned} V_2:=H^{1,2}_0(\Lambda )\subseteq H_2:=L^2(\Lambda )\subseteq V_2^{*}:=(H^{1,2}_0(\Lambda ))^{*}. \end{aligned}$$

Then we introduce the porous medium operator \(A: V_1\rightarrow V_1^{*}\) (see [20, Sect. 4.1] for details) by

$$\begin{aligned} A(u)=\Delta \Psi (u),\quad u\in V_1. \end{aligned}$$
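To make sense of A under the above triple, one uses the standard weak formulation (a sketch following the construction in [20, Sect. 4.1]):

$$\begin{aligned} {}_{V_1^{*}}\langle A(u), v\rangle _{V_1}=-\int _{\Lambda }\Psi (u(\xi ))v(\xi )d\xi ,\quad u, v\in V_1=L^p(\Lambda ). \end{aligned}$$

In the model case \(\Psi (s)=|s|^{p-2}s\), which satisfies \((\Psi 1)\)–\((\Psi 4)\), the slow equation below becomes the classical stochastic porous medium equation.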

Now, we consider the slow–fast stochastic porous medium-heat equations

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle dX^{\varepsilon }_t=\left[ \Delta \Psi (X^{\varepsilon }_t)+F(X^{\varepsilon }_t, Y^{\varepsilon }_t)\right] dt +G_1(X^{\varepsilon }_t)d W^{1}_{t},\\ \displaystyle dY^{\varepsilon }_t=\frac{1}{\varepsilon }\left[ \Delta Y^{\varepsilon }_t+B_2(X^{\varepsilon }_t,Y^{\varepsilon }_t)\right] dt +\frac{1}{\sqrt{\varepsilon }}G_2(X^{\varepsilon }_t, Y^{\varepsilon }_t)d W^{2}_{t},\\ X^{\varepsilon }_0=x\in H_1, Y^{\varepsilon }_0=y\in H_2,\end{array}\right. \end{aligned}$$
(4.1)

where

$$\begin{aligned} F:H_1\times H_2\rightarrow H_1; \quad G_1: V_1\rightarrow L_{2}(U_1; H_1); \end{aligned}$$

are measurable mappings and

$$\begin{aligned} B_2: H_1\times V_2\rightarrow V^{*}_2; \quad G_2:H_1\times V_2\rightarrow L_{2}(U_2; H_2) \end{aligned}$$

are Lipschitz continuous. More precisely,

$$\begin{aligned} \Vert F(u_1,u_2)-F(v_1,v_2)\Vert _{H_1}\leqslant C(\Vert u_1-v_1\Vert _{H_1}+\Vert u_2-v_2\Vert _{H_2}); \end{aligned}$$
(4.2)
$$\begin{aligned} \Vert G_1(u)-G_1(v)\Vert ^2_{L_2(U_1,H_1)}\leqslant C\Vert u-v\Vert ^2_{H_1};\end{aligned}$$
(4.3)
$$\begin{aligned} \Vert B_2(u_1,u_2)-B_2(v_1,v_2)\Vert _{H_2}\leqslant C\Vert u_1-v_1\Vert _{H_1}+L_{B_2}\Vert u_2-v_2\Vert _{H_2};\end{aligned}$$
(4.4)
$$\begin{aligned} \Vert G_2(u_1,u)-G_2(v_1,v)\Vert _{L_2(U_2,H_2)}\leqslant C\Vert u_1-v_1\Vert _{H_1}+L_{G_2}\Vert u-v\Vert _{H_2}. \end{aligned}$$
(4.5)

Moreover, there exists \(\zeta \in (0,1)\) such that

$$\begin{aligned} \Vert G_2(u_1,v)\Vert _{L_2(U_2,H_2)}\leqslant C(1+\Vert u_1\Vert _{H_1}+\Vert v\Vert ^{\zeta }_{H_2}) \end{aligned}$$
(4.6)

and for the smallest eigenvalue \(\lambda _1\) of \(-\Delta \) in \(H_2\), the Lipschitz constants \(L_{B_2}\), \(L_{G_2}\) satisfy

$$\begin{aligned} 2\lambda _{1}-2L_{B_2}-L^2_{G_2}>0. \end{aligned}$$
(4.7)
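Condition (4.7) is a strong dissipativity (spectral gap) condition on the fast equation. As a heuristic sketch: freezing the slow variable at \(x\) and applying Itô’s formula, the Poincaré inequality \(\langle \Delta z, z\rangle \leqslant -\lambda _1\Vert z\Vert ^2_{H_2}\) and the Lipschitz bounds (4.4), (4.5) to the difference of two solutions of the frozen fast equation started from \(y, y'\), one obtains

$$\begin{aligned} \mathbb {E}\Vert Y^{x,y}_t-Y^{x,y'}_t\Vert ^2_{H_2}\leqslant e^{-(2\lambda _1-2L_{B_2}-L^2_{G_2})t}\Vert y-y'\Vert ^2_{H_2}, \end{aligned}$$

so (4.7) yields the exponential convergence underlying the ergodicity of the frozen fast equation.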

It is well known that the porous medium operator A satisfies the monotonicity and coercivity properties (see, e.g., [20, pp. 87–88]), so it is easy to check that conditions A1–A4 hold. Furthermore, condition B2 holds by condition (4.7), and conditions B1, B3 and B4 hold obviously. Hence, by Theorem 2.4, we have

$$\begin{aligned} \lim _{{\varepsilon }\rightarrow 0}\mathbb {E} \left( \sup _{t\in [0,T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{2p}_{H_1} \right) =0,\quad \forall p\geqslant 1, \end{aligned}$$

where \(\bar{X}_t\) is the solution of the corresponding averaged equation.

4.2 Stochastic p-Laplace Equation

Now we consider the stochastic p-Laplace equation \((p\geqslant 2)\). We choose the Gelfand triple for the slow equation

$$\begin{aligned} V_1:=H^{1,p}_0(\Lambda )\subseteq H_1:=L^2(\Lambda )\subseteq V_1^{*}:=(H^{1,p}_0(\Lambda ))^{*} \end{aligned}$$

and the Gelfand triple for the fast equation

$$\begin{aligned} V_2:=H^{1,2}_0(\Lambda )\subseteq H_2:=L^2(\Lambda )\subseteq V_2^{*}:=(H^{1,2}_0(\Lambda ))^{*} \end{aligned}$$

and let \(A: V_1\rightarrow V^{*}_1\) be

$$\begin{aligned} A(u):=\text {div} (|\nabla u|^{p-2}\nabla u),\quad u\in V_1. \end{aligned}$$

More precisely, given \(u\in V_1\), we define

$$\begin{aligned} {_{V_1^{*}}}\langle A(u), v\rangle _{V_1}:=-\int _{\Lambda }|\nabla u(\xi )|^{p-2}\langle \nabla u(\xi ),\nabla v(\xi )\rangle d\xi , \quad v\in V_1. \end{aligned}$$

Here A is called the p-Laplace operator, also denoted by \(\Delta _p\). Note that \(\Delta _2=\Delta \).
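Taking \(v=u\) in the above dualization makes the coercivity of A explicit (a sketch; here \(\Vert u\Vert _{V_1}:=\Vert \nabla u\Vert _{L^p}\), which is an equivalent norm on \(H^{1,p}_0(\Lambda )\) by the Poincaré inequality):

$$\begin{aligned} {}_{V_1^{*}}\langle A(u), u\rangle _{V_1}=-\int _{\Lambda }|\nabla u(\xi )|^{p}d\xi =-\Vert u\Vert ^p_{V_1},\quad u\in V_1, \end{aligned}$$

which gives the coercivity with exponent p.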

Now, we consider the slow–fast stochastic p-Laplace-heat equations

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle dX^{\varepsilon }_t=\left[ \text {div} (|\nabla X^{\varepsilon }_t|^{p-2}\nabla X^{\varepsilon }_t)+F(X^{\varepsilon }_t, Y^{\varepsilon }_t)\right] dt +G_1(X^{\varepsilon }_t)d W^{1}_{t},\\ \displaystyle dY^{\varepsilon }_t=\frac{1}{\varepsilon }\left[ \Delta Y^{\varepsilon }_t +B_2(X^{\varepsilon }_t,Y^{\varepsilon }_t)\right] dt +\frac{1}{\sqrt{\varepsilon }}G_2(X^{\varepsilon }_t, Y^{\varepsilon }_t)d W^{2}_{t},\\ X^{\varepsilon }_0=x\in H_1, Y^{\varepsilon }_0=y\in H_2,\end{array}\right. \end{aligned}$$
(4.8)

where the coefficients \(F, G_1, B_2\) and \(G_2\) satisfy conditions (4.2)–(4.7).

It is well known that the p-Laplace operator satisfies the monotonicity and coercivity properties (see, e.g., [20, Example 4.1.9]). So it is easy to check that all the conditions A1–A4 and B1–B4 hold. Hence, by Theorem 2.4, we have

$$\begin{aligned} \lim _{{\varepsilon }\rightarrow 0}\mathbb {E} \left( \sup _{t\in [0,T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{2p}_{H_1} \right) =0,\quad \forall p\geqslant 1, \end{aligned}$$

where \(\bar{X}_t\) is the solution of the corresponding averaged equation.

Note that in the above two examples both the porous medium operator and the p-Laplace operator are globally monotone, but our result also applies to many operators that are only locally monotone. For illustration, we consider the stochastic Burgers equation and the stochastic 2D Navier–Stokes equation below (see [20, Sects. 4.1 and 5.1] for a number of other examples).

4.3 Stochastic Burgers Equation

Now we consider the stochastic Burgers equation. Taking \(\Lambda =(0,1)\subset \mathbb {R}\), we choose the Gelfand triple for the slow equation

$$\begin{aligned} V_1:=H^{1,2}_0(\Lambda )\subseteq H_1:=L^2(\Lambda )\subseteq V_1^{*}:=(H^{1,2}_0(\Lambda ))^{*} \end{aligned}$$

and the Gelfand triple for the fast equation

$$\begin{aligned} V_2:=H^{1,2}_0(\Lambda )\subseteq H_2:=L^2(\Lambda )\subseteq V_2^{*}:=(H^{1,2}_0(\Lambda ))^{*} \end{aligned}$$

and let \(A: V_1\rightarrow V^{*}_1\) be

$$\begin{aligned} A(u):=\Delta u+u\nabla u,\quad u\in V_1. \end{aligned}$$
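This operator is only locally monotone. Setting \(w=u-v\) and using \(\int _{\Lambda }w^2 w'd\xi =0\), the one-dimensional interpolation \(\Vert w\Vert _{L^4}\leqslant C\Vert w\Vert ^{3/4}_{H_1}\Vert w\Vert ^{1/4}_{V_1}\) and Young’s inequality, a standard sketch (cf. [20, Lemma 5.1.6]; constants are not optimized) gives

$$\begin{aligned} {}_{V_1^{*}}\langle A(u)-A(v), u-v\rangle _{V_1}\leqslant -\frac{1}{2}\Vert u-v\Vert ^2_{V_1} +C(1+\Vert v\Vert ^4_{L^4})\Vert u-v\Vert ^2_{H_1}, \end{aligned}$$

so the monotonicity constant depends locally on v.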

Now, we consider the slow–fast stochastic Burgers-heat equations

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle dX^{\varepsilon }_t=\left[ \Delta X^{\varepsilon }_t+X^{\varepsilon }_t\nabla X^{\varepsilon }_t+F(X^{\varepsilon }_t, Y^{\varepsilon }_t)\right] dt +G_1(X^{\varepsilon }_t)d W^{1}_{t},\\ \displaystyle dY^{\varepsilon }_t=\frac{1}{\varepsilon }\left[ \Delta Y^{\varepsilon }_t +B_2(X^{\varepsilon }_t,Y^{\varepsilon }_t)\right] dt +\frac{1}{\sqrt{\varepsilon }}G_2(X^{\varepsilon }_t, Y^{\varepsilon }_t)d W^{2}_{t},\\ X^{\varepsilon }_0=x\in H_1, Y^{\varepsilon }_0=y\in H_2,\end{array}\right. \end{aligned}$$
(4.9)

where the coefficients \(F, G_1, B_2\) and \(G_2\) satisfy conditions (4.2)–(4.7).

It is well known that the operator A satisfies the monotonicity and coercivity properties (see, e.g., [20, Lemma 5.1.6]). So it is easy to check that all the conditions A1–A4 and B1–B4 hold. Hence, by Theorem 2.4, we have

$$\begin{aligned} \lim _{{\varepsilon }\rightarrow 0}\mathbb {E} \left( \sup _{t\in [0,T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{2p}_{H_1} \right) =0,\quad \forall p\geqslant 1, \end{aligned}$$

where \(\bar{X}_t\) is the solution of the corresponding averaged equation.

4.4 Stochastic 2D Navier–Stokes Equation

Now we consider the stochastic 2D Navier–Stokes equation. Let \(\Lambda \) be a bounded domain in \(\mathbb {R}^2\) with smooth boundary, and set

$$\begin{aligned} V_1=\{v\in H^{1,2}_0(\Lambda ;\mathbb {R}^2): \nabla \cdot v=0~\text {a.e. in}~\Lambda \},\quad \Vert v\Vert ^2_{V_1}:=\int _{\Lambda } |\nabla v(\xi )|^2d\xi , \end{aligned}$$

and let \(H_1\) be the closure of \(V_1\) in \(L^2(\Lambda ;\mathbb {R}^2)\). We choose the Gelfand triple for the slow equation

$$\begin{aligned} V_1\subseteq H_1\subseteq V_1^{*} \end{aligned}$$

and the Gelfand triple for the fast equation

$$\begin{aligned} V_2:=H^{1,2}_0(\Lambda )\subseteq H_2:=L^2(\Lambda )\subseteq V_2^{*}:=(H^{1,2}_0(\Lambda ))^{*} \end{aligned}$$

and let \(A: V_1\rightarrow V^{*}_1\) be

$$\begin{aligned} A(u):=P_H\Delta u-P_H[(u\cdot \nabla )u],\quad u\in V_1, \end{aligned}$$

where \(P_H\) is the Helmholtz–Leray projection and \(u\cdot \nabla =\sum ^2_{i=1}u^i\partial _i\) with \(u=(u^1,u^2)\).
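Equivalently, in weak form A acts on \(w\in V_1\) by (a standard formulation; cf. [20, Sect. 5.1])

$$\begin{aligned} {}_{V_1^{*}}\langle A(u), w\rangle _{V_1}=-\int _{\Lambda }\nabla u\cdot \nabla w~d\xi -\int _{\Lambda }\big ((u\cdot \nabla )u\big )\cdot w~d\xi ,\quad u\in V_1, \end{aligned}$$

and since \(\int _{\Lambda }((u\cdot \nabla )u)\cdot u~d\xi =0\) for divergence-free u vanishing on the boundary, taking \(w=u\) gives the coercivity \({}_{V_1^{*}}\langle A(u), u\rangle _{V_1}=-\Vert u\Vert ^2_{V_1}\).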

Now, we consider the slow–fast stochastic 2D Navier–Stokes-heat equation

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle dX^{\varepsilon }_t=\left[ A(X^{\varepsilon }_t)+F(X^{\varepsilon }_t, Y^{\varepsilon }_t)\right] dt +G_1(X^{\varepsilon }_t)d W^{1}_{t},\\ \displaystyle dY^{\varepsilon }_t=\frac{1}{\varepsilon }\left[ \Delta Y^{\varepsilon }_t +B_2(X^{\varepsilon }_t,Y^{\varepsilon }_t)\right] dt +\frac{1}{\sqrt{\varepsilon }}G_2(X^{\varepsilon }_t, Y^{\varepsilon }_t)d W^{2}_{t},\\ X^{\varepsilon }_0=x\in H_1, Y^{\varepsilon }_0=y\in H_2,\end{array}\right. \end{aligned}$$
(4.10)

where the coefficients \(F, G_1, B_2\) and \(G_2\) satisfy conditions (4.2)–(4.7).

It is well known that the operator A satisfies the monotonicity and coercivity properties (see, e.g., [20, Example 5.1.10]). So it is easy to check that all the conditions A1–A4 and B1–B4 hold. Hence, by Theorem 2.4, we have

$$\begin{aligned} \lim _{{\varepsilon }\rightarrow 0}\mathbb {E} \left( \sup _{t\in [0,T]}\Vert X_{t}^{{\varepsilon }}-\bar{X}_{t}\Vert ^{2p}_{H_1} \right) =0,\quad \forall p\geqslant 1, \end{aligned}$$

where \(\bar{X}_t\) is the solution of the corresponding averaged equation.