1 Introduction

In this paper we are interested in the solvability in \(L_p\) spaces of linear stochastic parabolic, possibly degenerate, PDEs and of systems of linear stochastic parabolic PDEs. The equations we consider are important in applications. They arise in nonlinear filtering of partially observable stochastic processes, in modelling the hydromagnetic dynamo evolving in fluids with random velocities, and in many other areas of physics and engineering.

Among several important results, an \(L_2\)-theory of degenerate linear elliptic and parabolic PDEs is presented in [25–27] and [28]. The solvability in \(L_2\) spaces of linear degenerate stochastic PDEs of parabolic type was first studied in [20] (see also [29]).

Solving equations in \(W^m_p\) spaces for sufficiently high exponent \(p\) allows one to prove, by Sobolev embedding, better smoothness properties of the solutions than in the case of solving them in \(W^m_2\) spaces. As mentioned above, the class of stochastic PDEs considered in this paper includes the equations of nonlinear filtering of partially observed diffusion processes. By our results one obtains the existence of the conditional density of the unobserved process, and its regularity properties, under minimal smoothness conditions on the coefficients.

The first existence and uniqueness theorem on the solvability of these equations in \(W^m_p\) spaces, allowing them to degenerate, is presented in [22]. This result is improved in [8].

In the present paper we fill in a gap in the proofs of the existence and uniqueness theorems in [22] and [8]. Moreover, we essentially improve these theorems. In [22] the existence and uniqueness theorem for \(W^m_p\)-valued solutions is not separated from an existence and uniqueness theorem for \(W^m_2\)-valued solutions; in particular, it also contains conditions ensuring the existence and uniqueness of a \(W^m_2\) solution. In [8] these conditions were removed, and for any \(q\in (0,p]\) an estimate for \(E\sup _{t\le T}|u|^q_{W^m_p}\) for the solution \(u\) is obtained. In the present paper we remove the extra conditions of the existence and uniqueness theorem in [22], remove the restriction \(q\le p\) on the exponent \(q\) in the corresponding theorem in [8], and prove the uniqueness of the solution under weaker assumptions than those in [22] and [8] (see Theorem 2.1 below). Note that having \(q\)-th moment estimates for arbitrarily high \(q\) is useful, for example, in proving almost sure rates of convergence of numerical approximations of stochastic PDEs; see, e.g., [5]. Moreover, we not only improve the existence and uniqueness theorems in [22] and [8], but our main result, Theorem 3.1, extends them to degenerate stochastic parabolic systems. We also present an existence and uniqueness theorem, Theorem 3.2, on solvability in \(W^m_2\) spaces for a larger class of stochastic parabolic systems, which, in particular, contains the first order symmetric hyperbolic systems. This result was indicated in [9].

We would like to emphasise that the equations we consider in this paper may degenerate and become first order equations. For nondegenerate stochastic PDEs, \(L_p\)- and \(L_q(L_p)\)-theories have been developed, see e.g. [13, 14, 17, 18] and [15], which give essentially stronger results on the smoothness of the solutions.

There are many publications on stochastic PDEs driven by martingale measures, pioneered by [30]. (See also [2] and the references therein.) In [3] two set-ups for stochastic PDEs, differing in the driving noise, are compared: one in which the driving noise is a martingale measure, and another in which the equations are driven by martingales with values in infinite dimensional spaces. It is shown, in particular, that stochastic integrals with respect to martingale measures can be rewritten as stochastic Itô integrals with respect to martingales taking values in Hilbert spaces. Earlier this was proved in [6] in order to treat SDEs and stochastic PDEs driven by martingale measures as stochastic equations driven by martingales. In [16] super-Brownian motions in any dimension are constructed as solutions of SPDEs driven by infinite dimensional martingales, more precisely, by an infinite sequence of independent Wiener processes. As is well known, in the one-dimensional case the stochastic equation for the super-Brownian motion can be written as a stochastic PDE driven by a martingale measure, more precisely, by a space-time white noise, but, as noted in [16], most likely this is not possible in higher dimensions.

Solvability of stochastic PDEs of parabolic type is often investigated in the sense of the mild solution concept, i.e., when solutions to stochastic PDEs are defined as solutions to a stochastic integral equation obtained via Duhamel’s principle, also called the variation of constants formula in the context of ODEs (see, e.g., [2] and [3]). For the theory of stochastic PDEs built on this approach, often called the semigroup approach, we refer the reader to the monograph [4]. In this framework there are many results on solvability in various Banach spaces \({\mathbb {B}}\), including \(W^m_p\) spaces, when the linear operator in the drift term of the equation is the infinitesimal generator of a strongly continuous semigroup of bounded linear operators acting on \({\mathbb {B}}\). The equations investigated in most papers, including [2] and [3], do not have a differential operator in their diffusion part, unlike the equations studied in this paper. In the case when the differential operator in the drift term is a time dependent random operator, serious problems arise in adapting the semigroup approach. Thus the semigroup approach is not used to investigate the filtering equations of general signal and observation models, which are included in the class of equations considered in the present paper.

Finally we would like to mention that for some special degenerate stochastic PDEs, for example for the stochastic Euler equations, there are many results on solvability in the literature. See, for example, [1] and the references therein. Concerning the equation in [1] we note that its main term is non random, and its solution can be given in a sense explicitly.

In conclusion we introduce some notation used throughout the paper. All random elements will be given on a fixed probability space \((\Omega ,{\mathcal {F}},P)\), equipped with a filtration \(({\mathcal {F}}_t)_{t\ge 0}\) of \(\sigma \)-fields \({\mathcal {F}}_{t}\subset {\mathcal {F}}\). We suppose that this probability space carries a sequence of independent Wiener processes \((w^r)_{r=1}^{\infty }\), adapted to the filtration \(({\mathcal {F}}_t)_{t\ge 0}\), such that \(w^r_t-w^r_s\) is independent of \({\mathcal {F}}_s\) for each \(r\) and any \(0\le s\le t\). It is assumed that \({\mathcal {F}}_0\) contains all \(P\)-null subsets of \(\Omega \), so that \((\Omega ,{\mathcal {F}},P)\) is a complete probability space and the \(\sigma \)-fields \({\mathcal {F}}_{t}\) are complete. By \({\mathcal {P}}\) we denote the predictable \(\sigma \)-field of subsets of \(\Omega \times (0,\infty )\) generated by \(({\mathcal {F}}_t)_{t\ge 0}\). For basic notions in stochastic analysis, like continuous local martingales and their quadratic variation process, we refer to [12].

For \(p\in [1,\infty )\), the space of measurable mappings \(f\) from \({\mathbb {R}}^d\) into a separable Hilbert space \({\mathcal {H}}\), such that

$$\begin{aligned} \Vert f\Vert _{L_{p}}= \left( \int _{{\mathbb {R}}^d}|f(x)|_{{\mathcal {H}}}^p\,dx\right) ^{1/p}<\infty , \end{aligned}$$

is denoted by \(L_p({\mathbb {R}}^{d},{\mathcal {H}})\).

Remark 1.1

We did not include the symbol \({\mathcal {H}}\) in the notation of the norm in \(L_{p}({\mathbb {R}}^{d},{\mathcal {H}})\). Which \({\mathcal {H}}\) is involved will be absolutely clear from the context. We do the same in other similar situations.

Often \({\mathcal {H}}\) will be \(l_2\), or the space of infinite matrices \(\{g^{ij}\in {\mathbb {R}}:i=1,\ldots ,M,\, j=1,2,\ldots \}\), or the space of finite \(M\times M\) matrices, equipped with the Hilbert–Schmidt norm. The space of functions from \(L_p({\mathbb {R}}^{d},{\mathcal {H}})\), whose generalized derivatives up to order \(m\) are also in \(L_p({\mathbb {R}}^{d},{\mathcal {H}})\), is denoted by \(W^m_p ({\mathbb {R}}^{d},{\mathcal {H}})\). By definition \(W^0_p({\mathbb {R}}^{d},{\mathcal {H}})=L_p({\mathbb {R}}^{d},{\mathcal {H}})\). The norm \(|u|_{W^m_p}\) of \(u\) in \(W^m_p({\mathbb {R}}^{d},{\mathcal {H}})\) is defined by

$$\begin{aligned} |u|^p_{W^m_p}=\sum \limits _{|\alpha |\le m}|D^{\alpha }u|^p_{L_p}, \end{aligned}$$
(1.1)

where \(D^{\alpha }:=D_1^{\alpha _1}\cdots D_d^{\alpha _d}\) for multi-indices \(\alpha :=(\alpha _1,\ldots ,\alpha _d)\in \{0,1,\ldots \}^d\) of length \(|\alpha |:=\alpha _1+\alpha _2+\cdots +\alpha _d\), and \(D_iu\) is the generalized derivative of \(u\) with respect to \(x^i\) for \(i=1,2,\ldots ,d\). We also use the notation \(D_{ij}=D_iD_j\) and \(Du=(D_1u,\ldots ,D_du)\). When we talk about “derivatives up to order \(m\)” of a function for some nonnegative integer \(m\), we always include the zeroth-order derivative, i.e. the function itself. Unless otherwise indicated, the summation convention with respect to repeated integer valued indices is used throughout the paper.
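Although no numerics appear in this paper, the norm (1.1) is straightforward to approximate on a grid. The following Python sketch (our illustration only; the function name and the finite-difference discretization are ours, not from the paper) computes a discrete \(W^1_p\) norm of a scalar function of one variable.

```python
import numpy as np

def sobolev_norm_1p(u, dx, p):
    # Discrete analogue of (1.1) with m = 1 on a uniform 1-d grid:
    # |u|_{W^1_p}^p = |u|_{L_p}^p + |Du|_{L_p}^p,
    # with the generalized derivative replaced by a finite difference.
    du = np.gradient(u, dx)                 # central differences for D_1 u
    lp_u = np.sum(np.abs(u) ** p) * dx      # approximates |u|_{L_p}^p
    lp_du = np.sum(np.abs(du) ** p) * dx    # approximates |Du|_{L_p}^p
    return (lp_u + lp_du) ** (1.0 / p)

# usage: a Gaussian on [-10, 10]; the exact value of |e^{-x^2}|_{W^1_4} is
# (sqrt(pi)/2 + 3 sqrt(pi)/8)^{1/4} ~ 1.116
x = np.linspace(-10.0, 10.0, 2001)
print(sobolev_norm_1p(np.exp(-x ** 2), x[1] - x[0], p=4.0))
```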

2 Formulation

In this section \({\mathcal {H}}={\mathbb {R}}\) and we use a shorter notation

$$\begin{aligned} L_{p}=L_{p}({\mathbb {R}}^{d},{\mathbb {R}}),\quad W^{m}_{p}=W^{m}_{p}({\mathbb {R}}^{d},{\mathbb {R}}), \quad W^{m+1}_p (l_2) =W^{m+1}_p ({\mathbb {R}}^{d},l_2). \end{aligned}$$

Fix a \(T\in (0,\infty )\) and consider the problem

$$\begin{aligned} du_t(x)=(L_tu_t(x)+f_t(x))\,dt+\left( M^r_tu_t(x)+g^r_t(x)\right) \,dw^r_t, \end{aligned}$$
(2.1)

\((t,x)\in H_T:=[0,T]\times {\mathbb {R}}^d\), with initial condition

$$\begin{aligned} u_0(x)=\psi (x),\quad x\in {\mathbb {R}}^d, \end{aligned}$$
(2.2)

where

$$\begin{aligned} L_t=a^{ij}_t(x)D_{ij}+b^{i}_t(x)D_i+c_t(x), \quad M^r_t=\sigma _t^{ir}(x)D_i+\nu ^r_t(x), \end{aligned}$$

and all functions, given on \(\Omega \times H_T\), are assumed to be real valued and satisfy the following assumptions in which \(m\ge 0\) is an integer and \(K\) is a constant.

Assumption 2.1

The derivatives in \(x\in {\mathbb {R}}^d\) of \(a^{ij}\) up to order \(\max (m,2)\) and of \(b^{i}\) and \(c\) up to order \(m\) are \({\mathcal {P}}\otimes {\mathcal {B}}({\mathbb {R}}^d)\)-measurable functions, bounded by \(K\) for all \(i,j\in \{1,2,\ldots ,d\}\). The functions \(\sigma ^{i} =(\sigma ^{ir})_{r=1}^{\infty }\) and \(\nu =(\nu ^{r})_{r=1}^{\infty }\) are \(l_2\)-valued and their derivatives in \(x\) up to order \(m+1\) are \({\mathcal {P}}\otimes {\mathcal {B}}({\mathbb {R}}^d)\)-measurable \(l_2\)-valued functions, bounded by \(K\).

Assumption 2.2

The free data \( f_t \) and \( g_t =(g^r_t)_{r=1}^{\infty }\) are predictable processes with values in \(W^m_p\) and \(W^{m+1}_p (l_2) \), respectively, such that almost surely

$$\begin{aligned} {\mathcal {K}}_{m,p}^p(T)=\int _0^T \left( |f_t|^p_{W^m_p}+|g_t|^p_{W^{m+1}_p }\right) \,dt<\infty . \end{aligned}$$
(2.3)

The initial value \(\psi \) is an \({\mathcal {F}}_0\)-measurable random variable with values in \(W^m_p\).

Assumption 2.3

For \(P\otimes dt\otimes dx\)-almost all \((\omega ,t,x)\in \Omega \times [0,T]\times {\mathbb {R}}^d\)

$$\begin{aligned} \alpha ^{ij}_t(x)z^iz^j\ge 0 \end{aligned}$$

for all \(z\in {\mathbb {R}}^d\), where

$$\begin{aligned} \alpha ^{ij}=2a^{ij}-\sigma ^{ir}\sigma ^{ jr}. \end{aligned}$$

This condition is a standard assumption in the theory of stochastic PDEs. If it is not satisfied then Eq. (2.1) may be solvable only for very special initial conditions and free terms. Notice that this assumption allows \(\alpha =0\), which can happen, for example, when \(\sigma ^{ik}=(\sqrt{2a})^{ik}\) for \(i,k=1,\ldots ,d\) and \(\sigma ^{ik}=0\) for \(k>d\).
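For concrete coefficients this condition can be checked numerically. The following Python sketch (ours, not from the paper; it keeps only finitely many Wiener processes, so \(\sigma\) is a \(d\times R\) matrix) forms \(\alpha =2a-\sigma \sigma ^{*}\) and tests positive semidefiniteness, and reproduces the degenerate example \(\sigma =\sqrt{2a}\), for which \(\alpha =0\).

```python
import numpy as np

def parabolicity_matrix(a, sigma):
    # alpha^{ij} = 2 a^{ij} - sigma^{ir} sigma^{jr}; a is d x d symmetric,
    # sigma is d x R (R = number of driving Wiener processes kept).
    return 2.0 * a - sigma @ sigma.T

def is_parabolic(a, sigma, tol=1e-10):
    # Assumption 2.3: alpha must be positive semidefinite.
    return bool(np.all(np.linalg.eigvalsh(parabolicity_matrix(a, sigma)) >= -tol))

# Degenerate example from the text: sigma = sqrt(2a) (matrix square root),
# with sigma^{ik} = 0 for k > d, gives alpha = 0.
rng = np.random.default_rng(0)
g = rng.standard_normal((3, 3))
a = g @ g.T                                   # a random nonnegative definite a
w, v = np.linalg.eigh(2.0 * a)
sigma = v @ np.diag(np.sqrt(w)) @ v.T         # symmetric square root of 2a
print(np.allclose(parabolicity_matrix(a, sigma), 0.0), is_parabolic(a, sigma))
```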

Let \(\tau \) be a stopping time bounded by \(T\).

Definition 2.1

A \(W_p^1\)-valued function \(u\), defined on the stochastic interval \([\![0,\tau ]\!]:=\{(\omega ,t)\in \Omega \times [0,T]:t\le \tau (\omega )\}\), is called a solution of (2.1)–(2.2) on \([0,\tau ]\) if \(u\) is predictable on \([\![0,\tau ]\!]\),

$$\begin{aligned} \int _0^{\tau }|u_t|_{W^1_p}^p\,dt<\infty \quad \text {(a.s.)}, \end{aligned}$$

and for each \(\varphi \in C_0^{\infty }({\mathbb {R}}^d)\) for almost all \(\omega \in \Omega \)

$$\begin{aligned} (u_t,\varphi )&=(\psi ,\varphi ) + \int _0^t\left\{ -(a^{ij}_{s}D_{i}u_{s},D_{j}\varphi ) + (\bar{b}^{i}_{s}D_{i}u_{s}+c_{s}u_s+f_{s},\varphi )\right\} \,ds\\&\quad +\int _0^{t}(\sigma ^{ir}_{s}D_{i}u_{s}+\nu ^{r}_{s}u_{s}+g^{r}_{s},\varphi )\, dw^r_{s} \end{aligned}$$

for all \(t\in [0,\tau (\omega )]\), where \(\bar{b}^i=b^{i}-D_ja^{ij}\), and \((\cdot \,,\cdot )\) denotes the inner product in the Hilbert space of square integrable real-valued functions on \({\mathbb {R}}^d\).

We want to prove the following existence and uniqueness theorem about the Cauchy problem (2.1)–(2.2).

Theorem 2.1

Let Assumption 2.3 hold, and let Assumptions 2.1 and 2.2 hold with \(m\ge 0\). Then there exists at most one solution on \([0,T]\). If, together with Assumption 2.3, Assumptions 2.1 and 2.2 hold with \(m\ge 1\), then there exists a unique solution \(u=(u_t)_{t\in [0,T]}\) on \([0,T]\). Moreover, \(u\) is a \(W^{m}_p\)-valued weakly continuous process, it is a strongly continuous process with values in \(W^{m-1}_p\), and for every \(q>0\) and \(n\in \{0,1,\ldots ,m\}\)

$$\begin{aligned} E\sup _{t\in [0,T]}|u_t|_{W^n_p}^q \le N(E|\psi |^q_{W^n_p}+E{\mathcal {K}}^q_{n,p}(T)), \end{aligned}$$
(2.4)

where \(N\) is a constant depending only on \(K\), \(T\), \(d\), \(m\), \(p\) and \(q\).

This result is proved in [22] in the case \(q=p\ge 2\) under the additional assumptions that \(E{\mathcal {K}}^{r}_{m,r}(T)<\infty \) and \(E|\psi |^{r}_{W^{m}_{r}}<\infty \) for \(r=p\) and \(r=2\) (see Theorem 3.1 therein). These additional assumptions are not imposed in [8], where a somewhat weaker version of the above theorem is obtained for \(q\in (0,p]\). The proof in [8] uses Theorem 3.1 from [22], whose proof is based on an estimate for the derivatives of the solution \(u\), formulated as Lemma 2.1 in [22]. The proof of this lemma, however, contains a gap. Our aim is to fill in this gap and also to improve the existence and uniqueness theorems from [22] and [8]. Since \(Du=(D_1u,\ldots ,D_du)\) satisfies a system of SPDEs, it is natural to present and prove our results in the context of systems of stochastic PDEs.

3 Systems of stochastic PDEs

Let \(M\ge 1\) be an integer, and let \(\langle \cdot \,,\cdot \rangle \) and \(\langle \cdot \rangle \) denote the scalar product and the norm in \({\mathbb {R}}^{M}\), respectively. By \({\mathbb {T}}^{M}\) we denote the set of \(M\times M\) matrices, which we consider as a Euclidean space \({\mathbb {R}}^{M^{2}}\). For an integer \(m\ge 1\) we define \(l_{2}({\mathbb {R}}^{m})\) as the space of sequences \(\nu =(\nu ^{1},\nu ^{2},\ldots )\) with \(\nu ^{k}\in {\mathbb {R}}^{m}\), \(k\ge 1\), and finite norm

$$\begin{aligned} \Vert \nu \Vert _{l_{2} }=\left( \sum \limits _{k=1}^{\infty }|\nu ^{k}|^{2}_{{\mathbb {R}}^{m}}\right) ^{1/2} \end{aligned}$$

(cf. Remark 1.1).

We look for \({\mathbb {R}}^{M}\)-valued functions \(u_t(x)=(u^1_t(x),\ldots ,u^M_t(x))\), of \(\omega \in \Omega \), \(t\in [0,T]\) and \(x\in {\mathbb {R}}^d\), which satisfy the system of equations

$$\begin{aligned} du_{t}&= \left[ a^{ij}_{t}D_{ij}u_{t}+b^{i}_{t}D_{i} u_{t} +cu_t+f_{t}\right] \,dt\nonumber \\&+ \left[ \sigma ^{ik}_{t}D_{i}u_{t}+\nu ^{k}_{t}u_{t} +g^{k}_{t}\right] \,dw^{k}_{t}, \end{aligned}$$
(3.1)

and the initial condition

$$\begin{aligned} u_0=\psi , \end{aligned}$$
(3.2)

where \(a_{t}=(a^{ij}_{t}(x))\) takes values in the set of \(d\times d\) symmetric matrices,

$$\begin{aligned} \sigma ^{i}_{t}&= \left( \sigma ^{ik}_{t}(x),k\ge 1\right) \in l_{2} , \quad b^{i}_{t}(x)\in {\mathbb {T}}^{M}, \quad c_t(x)\in {\mathbb {T}}^{M}, \nonumber \\&\nu _{t}(x)\in l_{2}({\mathbb {T}}^{M}), \quad f_t(x)\in {\mathbb {R}}^M, \quad g_t(x)\in l_{2}({\mathbb {R}}^{M}) \end{aligned}$$
(3.3)

for \(i=1,\ldots ,d\), for all \(\omega \in \Omega \), \(t\ge 0\), \(x\in {\mathbb {R}}^d\).

Note that with the exception of \(a^{ij}\) and \(\sigma ^{ik}\), all ‘coefficients’ in Eq. (3.1) mix the coordinates of the process \(u\).

Let \(m\) be a nonnegative integer, \(p\in [2,\infty )\) and make the following assumptions, which are straightforward adaptations of Assumptions 2.1 and 2.2.

Assumption 3.1

The derivatives in \(x\in {\mathbb {R}}^d\) of \(a^{ij}\) up to order \(\max (m,2)\) and of \(b^{i}\) and \(c\) up to order \(m\) are \({\mathcal {P}}\otimes {\mathcal {B}}({\mathbb {R}}^d)\)-measurable functions, in magnitude bounded by \(K\) for all \(i,j\in \{1,2,\ldots ,d\}\). The derivatives in \(x\) of the \(l_2 \)-valued functions \(\sigma ^{i}=(\sigma ^{ik})_{k=1}^{\infty }\) and the \(l_{2}({\mathbb {T}}^{M})\)-valued function \(\nu \) up to order \(m+1\) are \({\mathcal {P}}\otimes {\mathcal {B}}({\mathbb {R}}^d)\)-measurable \(l_2\)-valued and \(l_{2}({\mathbb {T}}^{M})\)-valued functions, respectively, in magnitude bounded by \(K\).

Assumption 3.2

The free data \((f_t)_{t\in [0,T]}\) and \((g_t)_{t\in [0,T]}\) are predictable processes with values in

$$\begin{aligned} W^m_p\left( {\mathbb {R}}^{d},{\mathbb {R}}^M\right) \quad \text {and}\quad W^{m+1}_p\left( {\mathbb {R}}^{d},l_{2}({\mathbb {R}}^{M}) \right) , \end{aligned}$$

respectively, such that almost surely

$$\begin{aligned} {\mathcal {K}}_{m,p}^p(T)=\int _0^T \left( |f_t|^p_{W^m_p}+|g_t|^p_{W^{m+1}_p}\right) \,dt<\infty . \end{aligned}$$
(3.4)

The initial value \(\psi \) is an \({\mathcal {F}}_0\)-measurable random variable with values in \( W^m_p({\mathbb {R}}^{d},{\mathbb {R}}^M)\).

Set

$$\begin{aligned} \beta ^i=b^i-\sigma ^{ir}\nu ^r, \quad i=1,2,\ldots ,d, \end{aligned}$$

and recall that \(\alpha ^{ij}=2a^{ij}-\sigma ^{ik}\sigma ^{jk}\) for \(i,j=1,\ldots ,d\). Instead of Assumption 2.3 we impose now the following condition, where \(\delta ^{kl}\) stands for the ‘Kronecker \(\delta \)’, i.e., \(\delta ^{kl}=1\) if \(k=l\) and it is zero otherwise.

Assumption 3.3

There exist a constant \(K_0>0\) and a \({\mathcal {P}}\times {\mathcal {B}}({\mathbb {R}}^d)\)-measurable \({\mathbb {R}}^d\)-valued bounded function \(h=(h^{i}_t(x))\), whose first order derivatives in \(x\) are bounded functions, such that for all \(\omega \in \Omega \), \(t\ge 0\) and \(x\in {\mathbb {R}}^d\)

$$\begin{aligned} |h|+|Dh|\le K, \end{aligned}$$
(3.5)

and for all \((\lambda _1,\ldots ,\lambda _d)\in {\mathbb {R}}^d\)

$$\begin{aligned} \left| \sum \limits _{i=1}^d(\beta ^{ikl}-\delta ^{kl}h^i)\lambda _i\right| ^2 \le K_0\sum \limits _{i,j=1}^d\alpha ^{ij}\lambda _i\lambda _j\quad \text {for}\, k,l=1,\ldots ,M. \end{aligned}$$
(3.6)

Remark 3.1

Let Assumption 3.1 hold with \(m=0\), and assume that the first order derivatives of \(b^i\) in \(x\) are bounded by \(K\) for each \(i=1,2,\ldots ,d\). Then condition (3.6) is a natural extension of Assumption 2.3 to systems of stochastic PDEs. Indeed, when \(M=1\), then taking \(h^i=\beta ^i\) for \(i=1,\ldots ,d\), we see that Assumption 3.3 is equivalent to \(\alpha \ge 0\). Let us now analyse Assumption 3.3 for arbitrary \(M\ge 1\). Notice that it holds when \(\alpha \) is uniformly elliptic, i.e., \(\alpha \ge \kappa I_d\) with a constant \(\kappa >0\) for all \(\omega \), \(t\ge 0\) and \(x\in {\mathbb {R}}^d\). Indeed, due to Assumption 3.1 there is a constant \(N=N(K,d)\) such that

$$\begin{aligned} \left| \sum \limits _{i=1}^d(\beta ^{ikl}-\delta ^{kl}h^i)\lambda _i\right| ^2 \le N\sum \limits _{i=1}^d|\lambda _i|^2\quad \text {for every}\, k,l=1,2,\ldots ,M, \end{aligned}$$

which together with the uniform ellipticity of \(\alpha \) clearly implies (3.6). Notice also that (3.6) holds in many situations where, instead of the strong ellipticity of \(\alpha \), we only have \(\alpha \ge 0\). Such examples arise when \(a^{ij}=(\sigma ^{ir}\sigma ^{jr})/2\) for all \(i,j=1,\ldots ,d\), and \(b\) and \(\nu \) are such that \(\beta ^i\) is a diagonal matrix for each \(i=1,\ldots ,d\), whose diagonal elements together with their first order derivatives in \(x\) are bounded by a constant \(K\). As a simple example, consider the system of equations

$$\begin{aligned} du_t(x)&= \left\{ \tfrac{1}{2}D^2u_t(x)+Dv_t(x)\right\} \,dt +\left\{ Du_t(x)+v_t(x)\right\} \,dw_t \nonumber \\ dv_t(x)&= \left\{ \tfrac{1}{2}D^2v_t(x) -Du_t(x)\right\} \,dt+\left\{ Dv_t(x)-u_t(x)\right\} \,dw_t \nonumber \end{aligned}$$

for \(t\in [0,T]\), \(x\in {\mathbb {R}}\), for a 2-dimensional process \((u_t(x),v_t(x))\), where \(w\) is a one-dimensional Wiener process. In this example \(\alpha =0\) and \(\beta =0\). Thus clearly, condition (3.6) is satisfied.
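For this example the claims \(\alpha =0\) and \(\beta =0\) can be verified mechanically; the following Python sketch (ours) simply evaluates the definitions with \(d=1\), \(M=2\) and one Wiener process.

```python
import numpy as np

a = 0.5                                    # a^{11}: coefficient of (1/2) D^2
sigma = 1.0                                # sigma^{11}: coefficient of D in the noise
b = np.array([[0.0, 1.0], [-1.0, 0.0]])    # b^1: first order drift, mixes u and v
nu = np.array([[0.0, 1.0], [-1.0, 0.0]])   # nu^1: zero order noise term, mixes u and v

alpha = 2.0 * a - sigma * sigma            # alpha^{11} = 2a - sigma^2
beta = b - sigma * nu                      # beta^1 = b^1 - sigma^{1r} nu^r
print(alpha, beta.tolist())                # 0.0 and the zero matrix, so (3.6) holds
```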

In Sect. 5 it will be convenient to use condition (3.6) in an equivalent form, which we discuss in the next remark.

Remark 3.2

Notice that condition (3.6) in Assumption 3.3 can be reformulated as follows: There exists a constant \(K_0\) such that for all values of the arguments and all continuously differentiable \({\mathbb {R}}^{M}\)-valued functions \(u=u(x)\) on \({\mathbb {R}}^{d}\) we have

$$\begin{aligned} \langle u, b^{i}D_{i}u\rangle -\sigma ^{ik}\langle u,\nu ^{k}D_{i}u\rangle \le K_{0}\left| \sum \limits _{i,j=1}^{d}\alpha ^{ij}\langle D_{i}u,D_{j}u\rangle \right| ^{1/2}\langle u\rangle +h^{i} \langle D_{i} u,u \rangle . \end{aligned}$$
(3.7)

Indeed, set \(\hat{\beta }^{i}=\beta ^{i}-h^{i}I_{M}\), where \(I_{M}\) is the \(M\times M\) unit matrix, and observe that (3.7) means

$$\begin{aligned} \langle u, \hat{\beta }^{i}D_{i}u\rangle \le K_{0}\left| \sum \limits _{i,j=1}^{d}\alpha ^{ij}\langle D_{i}u,D_{j}u\rangle \right| ^{1/2}\langle u\rangle . \end{aligned}$$

By considering this relation at a fixed point \(x\) and noting that then one can choose \(u\) and \(Du\) independently, we conclude that

$$\begin{aligned} \left\langle \sum \limits _{i}\hat{\beta }^{i}D_{i}u\right\rangle ^{2}\le K_{0}^{2} \alpha ^{ij}\langle D_{i}u,D_{j}u\rangle \end{aligned}$$
(3.8)

and (3.6) follows (with a different \(K_{0}\)) if we take \(D_{i}u^{k}=\lambda _{i}\delta ^{kl}\).

On the other hand, (3.6) means that for any \(l\) without summation on \(l\)

$$\begin{aligned} \big |\sum \limits _{i}\hat{\beta }^{ikl}D_{i}u ^{l} \big |^{2} \le K_{0}\alpha ^{ij} (D_{i}u ^{l} ) D_{j}u ^{l} . \end{aligned}$$

But then, by the Cauchy–Schwarz inequality, a similar estimate holds after the summation over \(l\) is performed and carried inside the square on the left-hand side. This yields (3.8) (with a different constant \(K_{0}\)) and then leads to (3.7).

The notion of solution to (3.1)–(3.2) is a straightforward adaptation of Definition 2.1 to systems of equations. Namely, \(u=(u^1,\ldots ,u^M)\) is a solution on \([0,\tau ]\), for a stopping time \(\tau \le T\), if it is a \(W^1_p ({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued predictable function on \([\![0,\tau ]\!]\),

$$\begin{aligned} \int _{0}^{\tau }|u_t|_{W^1_p}^p\,dt<\infty \quad \text {(a.s.)}, \end{aligned}$$

and for each \({\mathbb {R}}^M\)-valued \(\varphi =(\varphi ^1,\ldots ,\varphi ^M)\) from \(C_0^{\infty }({\mathbb {R}}^d)\) with probability one

$$\begin{aligned} (u_t,\varphi )&= (\psi ,\varphi ) +\int _0^t\Big \{-(a^{ij}_sD_iu_s,D_j\varphi )\nonumber \\&+\,(\bar{b}^{i}_sD_iu_s+c_su_s+f_s,\varphi )\Big \}\,ds \end{aligned}$$
(3.9)
$$\begin{aligned}&+\,\int _0^t\Big (\sigma ^{ir}_sD_iu_s+\nu ^{r}_su_s+g^{r}_{s},\varphi \Big )\,dw^r_{s} \end{aligned}$$
(3.10)

for all \(t\in [0, \tau ]\), where \(\bar{b}^{i}=b^{i}-D_ja^{ij}I_{M}\). Here, and later on \((\Psi ,\Phi )\) denotes the inner product in the \(L_2\)-space of \({\mathbb {R}}^M\)-valued functions \(\Psi \) and \(\Phi \) defined on \({\mathbb {R}}^d\).

The main result of the paper reads now just like Theorem 2.1 above.

Theorem 3.1

Let Assumption 3.3 hold. If Assumptions 3.1 and 3.2 also hold with \(m\ge 0\), then there is at most one solution to (3.1)–(3.2) on \([0,T]\). If together with Assumption 3.3, Assumptions 3.1 and 3.2 hold with \(m\ge 1\), then there is a unique solution \(u=(u^l)_{l=1}^M\) to (3.1)–(3.2) on \([0,T]\). Moreover, \(u\) is a weakly continuous \(W^m_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process, it is strongly continuous as a \(W^{m-1}_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process, and for every \(q>0\) and \(n\in \{0,1,\ldots ,m\}\)

$$\begin{aligned} E\sup \limits _{t\in [0,T]}|u_t|_{W^n_p}^q \le N\Big (E|\psi |^q_{W^n_p}+E{\mathcal {K}}^q_{n,p}(T)\Big ) \end{aligned}$$
(3.11)

with \(N=N(m,p,q,d, M,K,T)\).

Example 3.1

In hydromagnetic dynamo theory the system of equations

$$\begin{aligned} \frac{\partial }{\partial t}B^k_t(x) =\lambda _t(x)\Delta B^k_t(x) +D_jv^k_t(x)B^j_t(x)-v^j_t(x)D_jB^k_t(x),\quad k=1,2,3, \end{aligned}$$
(3.12)

for \(t\in [0,T]\) and \(x\in {\mathbb {R}}^3\), called the induction equation, describes the evolution of a magnetic field \(B=(B^1,B^2,B^3)\) in a fluid flowing with velocity \(v=(v^1,v^2,v^3)\), where \(\lambda \ge 0\) is the magnetic diffusivity (see, for example, [23]). Notice that one can apply Theorem 3.1 to (3.12) to obtain its solvability in \(W^m_p\) spaces. To study effects in turbulent flows, induction equations with random velocity fields \(v\) have been investigated in the literature (see, for example, [24]). In [7] convergence of (3.12) to a system of stochastic PDEs is shown when the velocity fields are random and converge to a random field which is white noise in time. We note that Theorem 3.1 can also be applied to the system of stochastic PDEs obtained in this way.

In the case \(p=2\) we present also a modification of Assumption 3.3, in order to cover an important class of stochastic PDE systems, the hyperbolic symmetric systems.

Observe that nothing changes in (3.6) if we replace \(\beta ^{ikl}\) with \(\beta ^{ilk}\). By the convexity of \(t^{2}\), condition (3.6) then also holds if we replace \(\beta ^{ikl}\) with the symmetrised coefficients \((1/2)[ \beta ^{ikl}+\beta ^{ilk}]\). Since

$$\begin{aligned} |a-b|^{2}\le |a+b|^{2}+2a^{2}+2b^{2} \end{aligned}$$

this implies that (3.6) also holds for

$$\begin{aligned} \bar{\beta }^{ikl}=(\beta ^{ikl}-\beta ^{ilk})/2 \end{aligned}$$

in place of \(\beta ^{ikl}\), which is the antisymmetric part of \(\beta ^i=b^i-\sigma ^{ir}\nu ^{r}\).
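The elementary inequality used in this step can be verified symbolically; the following sympy sketch (ours) expands the difference of the two sides and exhibits it as a square.

```python
import sympy as sp

a, b = sp.symbols('a b', real=True)
# gap between the right- and left-hand sides of |a-b|^2 <= |a+b|^2 + 2a^2 + 2b^2
gap = (a + b)**2 + 2*a**2 + 2*b**2 - (a - b)**2
print(sp.expand(gap))   # 2*a**2 + 4*a*b + 2*b**2, i.e. 2*(a + b)**2 >= 0
```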

Hence the following condition is weaker than Assumption 3.3.

Assumption 3.4

There exist a constant \(K_0>0\) and a \({\mathcal {P}}\times {\mathcal {B}}({\mathbb {R}}^d)\)-measurable \({\mathbb {R}}^d\)-valued function \(h=(h^{i}_t(x))\) such that (3.5) holds, and for all \(\omega \in \Omega \), \(t\ge 0\) and \(x\in {\mathbb {R}}^d\) and for all \((\lambda _1,\ldots ,\lambda _d)\in {\mathbb {R}}^d\)

$$\begin{aligned} \left| \sum \limits _{i=1}^d({\bar{\beta }}^{ikl}-\delta ^{kl}h^i)\lambda _i\right| ^2 \le K_0 \sum \limits _{i,j=1}^d\alpha ^{ij}\lambda _i\lambda _j\quad \text {for}\, k,l=1,\ldots ,M. \end{aligned}$$
(3.13)

The following result in the special case of deterministic PDE systems is indicated and a proof is sketched in [9].

Theorem 3.2

Take \(p=2\) and replace Assumption 3.3 with Assumption 3.4 in the conditions of Theorem 3.1. Then the conclusion of Theorem 3.1 holds with \(p=2\).

Remark 3.3

Notice that Assumption 3.4 obviously holds with \(h^{i}=0\) if the matrices \(\beta ^i\) are symmetric and \(\alpha \ge 0\). When, in addition, \(a=0\) and \(\sigma =0\), the system is called a first order symmetric hyperbolic system.

Remark 3.4

If Assumption 3.4 does not hold then even simple first order deterministic systems with smooth coefficients may be ill-posed. Consider, for example, the system

$$\begin{aligned} du_t(x)&= Dv_t(x)\,dt \nonumber \\ dv_t(x)&= -Du_t(x)\,dt \end{aligned}$$
(3.14)

for \((u_t(x), v_t(x))\), \(t\in [0,T]\), \(x\in {\mathbb {R}}\), with initial condition \(u_0=\psi \), \(v_0=\phi \), such that \(\psi ,\phi \in W^m_2\setminus W^{m+1}_2\) for an integer \(m\ge 1\). Clearly, this system does not satisfy Assumption 3.4, and one can show that it does not have a solution with the initial condition \(u_0=\psi \), \(v_0=\phi \). We note, however, that it is not difficult to show that for any constant \(\varepsilon \ne 0\) and Wiener process \(w\) the stochastic PDE system

$$\begin{aligned} du_t(x)&= Dv_t(x)\,dt+\varepsilon Dv_t(x)\,dw_t \nonumber \\ dv_t(x)&= -Du_t(x)\,dt-\varepsilon Du_t(x)\, dw_t \end{aligned}$$
(3.15)

with initial condition \((u_0,v_0)=(\psi ,\phi )\in W^m_2\) (for \(m\ge 1\)) has a unique solution \((u_t,v_t)_{t\in [0,T]}\), which is a \(W^m_2\)-valued continuous process. One can prove this statement and the statement about the nonexistence of a solution to (3.14) by using Fourier transform. We leave the details of the proof as exercises for those readers who find them interesting. Clearly, system (3.15) does not belong to the class of stochastic systems considered in this paper.
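The Fourier transform argument mentioned above is easy to illustrate. For the mode \(e^{i\xi x}\), system (3.14) reduces to \(\frac{d}{dt}(\hat{u},\hat{v})=A(\xi )(\hat{u},\hat{v})\) with the symbol \(A(\xi )\) below, whose eigenvalues are \(\pm |\xi |\); Fourier modes thus grow like \(e^{|\xi |t}\), at a rate unbounded in \(\xi \), which rules out solvability for general \(W^m_2\) data. The following Python sketch (ours) computes these eigenvalues.

```python
import numpy as np

def symbol_matrix(xi):
    # Fourier symbol of (3.14): u_t = Dv, v_t = -Du, with D replaced by i*xi
    return np.array([[0.0, 1j * xi], [-1j * xi, 0.0]])

# the eigenvalues are +/- xi (real!), so the mode e^{i xi x} grows like e^{|xi| t}
for xi in (1.0, 10.0, 100.0):
    lam = np.linalg.eigvals(symbol_matrix(xi))
    print(xi, sorted(lam.real))
```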

4 Preliminaries

First we discuss the solvability of (3.1)–(3.2) under the strong stochastic parabolicity condition.

Assumption 4.1

There is a constant \(\kappa >0\) such that

$$\begin{aligned} \alpha ^{ij}\lambda _i\lambda _j\ge \kappa \sum \limits _{i=1}^d|\lambda _i|^2 \end{aligned}$$

for all \(\omega \in \Omega \), \(t\ge 0\), \(x\in {\mathbb {R}}^d\) and \((\lambda _1,\ldots ,\lambda _d)\in {\mathbb {R}}^d\).

If the above non-degeneracy assumption holds then we need weaker regularity conditions on the coefficients and the data than in the degenerate case. Recall that \(m\ge 0\) and make the following assumptions.

Assumption 4.2

The derivatives in \(x\in {\mathbb {R}}^d\) of \(a^{ij}\) up to order \(\max (m,1)\) and of \(b^{i}\) and \(c\) up to order \(m\) are \({\mathcal {P}}\otimes {\mathcal {B}}({\mathbb {R}}^d)\)-measurable functions, bounded by \(K\) for all \(i,j\in \{1,2,\ldots ,d\}\). The derivatives in \(x\) of the \(l_2\)-valued functions \(\sigma ^{i}\) and \( l_2({\mathbb {T}}^{M})\)-valued function \(\nu \) up to order \(m\) are \({\mathcal {P}}\otimes {\mathcal {B}}({\mathbb {R}}^d)\)-measurable \(l_2\)-valued and \(l_2({\mathbb {T}}^{M})\)-valued functions, respectively, in magnitude bounded by \(K\).

Assumption 4.3

The free data \((f_t)_{t\in [0,T]}\) and \((g_t)_{t\in [0,T]}\) are predictable processes with values in \( W^{m-1}_2({\mathbb {R}}^{d},{\mathbb {R}}^M)\) and \( W^m_2({\mathbb {R}}^{d},l_2({\mathbb {R}}^{M}))\), respectively, such that almost surely

$$\begin{aligned} {\mathcal {K}}_{m-1,2}^2(T)=\int _0^T \big (|f_t|^2_{W^{m-1}_2}+|g_t|^2_{W^{m}_2}\big )\,dt<\infty . \end{aligned}$$

The initial value \(\psi \) is an \({\mathcal {F}}_0\)-measurable random variable with values in \( W^m_2({\mathbb {R}}^{d},{\mathbb {R}}^M)\).

The following is a standard result from the \(L_2\)-theory of stochastic PDEs. See, for example, [29]. Further results on solvability in \(W^1_2\) spaces for nondegenerate systems of stochastic PDEs in \({\mathbb {R}}^d\) and in domains of \({\mathbb {R}}^d\) can be found in [15].

Theorem 4.1

Let Assumptions 4.1, 4.2 and 4.3 hold with \(m\ge 0\). Then (3.1)–(3.2) has a unique solution \(u\). Moreover, \(u\) is a continuous \(W^m_2({\mathbb {R}}^{d},{\mathbb {R}}^M)\)-valued process such that \(u_t\in W^{m+1}_2({\mathbb {R}}^{d},{\mathbb {R}}^M)\) for \(P\otimes dt\)-almost every \((\omega ,t)\), and

$$\begin{aligned}&E\sup \limits _{t\in [0,T]}|u_t|^2_{W^m_2} +E\int _0^T|u_t|^2_{W^{m+1}_2}\,dt \nonumber \\&\quad \le N\left( E|\psi |^2_{W^m_2} +E\int _0^T\left( |f_t|^2_{W^{m-1}_2}+|g_t|^2_{W^m_2}\right) \,dt\right) \end{aligned}$$
(4.1)

with \(N=N(\kappa ,m,d, M,K,T)\).

The crucial step in the proof of Theorem 2.1 is to obtain an a priori estimate like (2.4). To discuss how such an estimate can be proved, take \(q=p\), \(M=1\), and for simplicity assume that \((a^{ij})\) is nonnegative definite, bounded, with bounded derivatives up to a sufficiently high order, and that all the other coefficients and free terms in Eq. (2.1) are equal to zero. Thus we consider now the PDE

$$\begin{aligned} du(t,x)=a^{ij}(t,x)D_{ij}u(t,x)\,dt, \quad t\in [0,T],\quad x\in {\mathbb {R}}^d, \end{aligned}$$
(4.2)

with initial condition (2.2), where we assume that \(\psi \) is a smooth function from \(W^1_p\). We want to obtain the estimate

$$\begin{aligned} |u(t)|_{W^1_p}^p\le N|\psi |^p_{W^1_p} \end{aligned}$$
(4.3)

for smooth solutions \(u\) to (4.2)–(2.2).

After applying \(D_k\) to both sides of Eq. (4.2) and writing \(v_k\) in place of \(D_kv\), by the chain rule we have

$$\begin{aligned} d\sum \limits _k|u_k|^{p}=\sum \limits _kp|u_k|^{p-2}u_k\left( a_k^{ij}u_{ij}+a^{ij}u_{ijk}\right) \,dt. \end{aligned}$$
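In more detail: for each fixed \(k\), differentiating (4.2) with respect to \(x^k\) gives \(du_k=(a^{ij}u_{ijk}+a^{ij}_ku_{ij})\,dt\), and the chain rule applied to \(r\mapsto |r|^p\) yields

$$\begin{aligned} d|u_k|^{p}=p|u_k|^{p-2}u_k\,du_k =p|u_k|^{p-2}u_k\left( a^{ij}u_{ijk}+a^{ij}_ku_{ij}\right) \,dt, \end{aligned}$$

which, summed over \(k\), is the identity above.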

Integrating over \({\mathbb {R}}^d\) we get

$$\begin{aligned} d\sum \limits _k|u_k|^{p}_{L_p}=\int _{{\mathbb {R}}^d}Q(u)\,dx\,dt, \end{aligned}$$

where

$$\begin{aligned} Q(u)=p|u_k|^{p-2}u_k\left( a^{ij}u_{ijk}+a_k^{ij}u_{ij}\right) . \end{aligned}$$

To obtain (4.3) we want to have the estimate

$$\begin{aligned} \int _{{\mathbb {R}}^d}Q(v)\,dx\le N|v|^p_{W^1_p} \end{aligned}$$
(4.4)

for any smooth \(v\) with compact support. To prove this we write \( \xi \sim \eta \) if \(\xi \) and \(\eta \) have identical integrals over \({\mathbb {R}}^d\), and we write \(\xi \preceq \eta \) if \(\xi \sim \eta +\zeta \) for some \(\zeta \) satisfying

$$\begin{aligned} \zeta \le N(|v|^p+|Dv|^p). \end{aligned}$$

Then by integration by parts we have

$$\begin{aligned} |v_k|^{p-2}v_ka^{ij}v_{ijk}&\sim -(p-1)|v_k|^{p-2}a^{ij}v_{ki}v_{kj}-|v_k|^{p-2}v_ka^{ij}_iv_{jk}\\&\sim -(p-1)|v_k|^{p-2}a^{ij}v_{ki}v_{kj}-p^{-1}D_j|v_k|^pa^{ij}_i\\&\preceq -(p-1)|v_k|^{p-2}a^{ij}v_{ki}v_{kj}. \end{aligned}$$
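To spell out the first relation: integrating by parts in \(x^i\) and using \(D_i\big(|v_k|^{p-2}v_k\big)=(p-1)|v_k|^{p-2}v_{ki}\) we have

$$\begin{aligned} \int _{{\mathbb {R}}^d}|v_k|^{p-2}v_ka^{ij}v_{ijk}\,dx&=-\int _{{\mathbb {R}}^d}D_i\big(|v_k|^{p-2}v_ka^{ij}\big)v_{jk}\,dx \\&=-\int _{{\mathbb {R}}^d}\big((p-1)|v_k|^{p-2}a^{ij}v_{ki}v_{kj} +|v_k|^{p-2}v_ka^{ij}_iv_{jk}\big)\,dx, \end{aligned}$$

and the second relation uses \(|v_k|^{p-2}v_kv_{jk}=p^{-1}D_j|v_k|^p\).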

By the simple inequality \(\alpha \beta \le \varepsilon ^{-1} \alpha ^2+ \varepsilon \beta ^2\) we have

$$\begin{aligned} |v_k|^{p-2}v_ka_k^{ij}v_{ij} \le \varepsilon ^{-1}|v_k|^p +\varepsilon |v_k|^{p-2} |a_k^{ij}v_{ij}|^2 \end{aligned}$$

for any \(\varepsilon >0\). To estimate the term \( |a_k^{ij}v_{ij}|^2\) we use the following lemma, which is well known from [28].

Lemma 4.2

Let \(a=(a^{ij}(x))\) be a function defined on \({\mathbb {R}}^d\), with values in the set of non-negative \(m\times m\) matrices, such that \(a\) and its derivatives in \(x\) up to second order are bounded in magnitude by a constant \(K\). Let \(V\) be a symmetric \(m\times m\) matrix. Then

$$\begin{aligned} |Da^{ij}V^{ij}|^2 \le Na^{ij}V^{ik}V^{jk} \end{aligned}$$

for every \(x\in {\mathbb {R}}^d\), where \(N\) is a constant depending only on \(K\) and \(d\).

By this lemma \( |a_k^{ij}v_{ij}|^2\le N a^{ij}v_{il}v_{jl}. \) Hence

$$\begin{aligned} |v_k|^{p-2}v_ka_k^{ij}v_{ij} \preceq N\varepsilon |v_k|^{p-2}a^{ij}v_{il}v_{jl}. \end{aligned}$$

Thus for each fixed \(k=1,2,\ldots ,d\) we have

$$\begin{aligned} Q(v)\preceq -p(p-1)|v_k|^{p-2}a^{ij}v_{ki}v_{kj} +\varepsilon |v_k|^{p-2}a^{ij}v_{il}v_{jl} \end{aligned}$$
(4.5)

for any \(\varepsilon >0\). Notice that for each fixed \(k\) there is a summation with respect to \(l\) over \(\{1,2,\ldots ,d\}\) in the expression \(\varepsilon |v_k|^{p-2}a^{ij}v_{il}v_{jl}\), and terms with \(l\ne k\) cannot be killed by the expression

$$\begin{aligned} -p(p-1)|v_k|^{p-2}a^{ij}v_{ki}v_{kj}. \end{aligned}$$
(4.6)

Hence we can get (4.4) when \(d=1\) or \(p=2\), but we do not get it for \(p>2\) and \(d>1\). To cancel every term in the sum \(\varepsilon |v_k|^{p-2}a^{ij}v_{il}v_{jl}\) we need an expression like

$$\begin{aligned} -\nu |v_k|^{p-2}a^{ij}v_{li}v_{lj}, \end{aligned}$$

with a constant \(\nu \), in place of (4.6), for each \(k\in \{1,\ldots ,d\}\) in the right-hand side of (4.5). This suggests obtaining (4.3) via an equation for \(|\,|Du|^2|_{L_{p/2}}^{p/2}\) instead of one for \(\sum _k |D_ku|_{L_p}^p\).

Let us test this idea. From

$$\begin{aligned} du_{k}=\left( a^{ij}u_{ijk}+a^{ij}_{k}u_{ij}\right) \,dt \end{aligned}$$

by the chain rule and Lemma 4.2 we have

$$\begin{aligned} d|Du|^{2}&= 2u_{k}a^{ij}u_{ijk}\,dt+2u_{k}a^{ij}_{k}u_{ij}\,dt \le a^{ij}\left[ |Du|^{2}\right] _{ij}\,dt-2a^{ij}u_{ik}u_{jk}\,dt \\&+\,N|Du|\left[ a^{ij}u_{ik}u_{jk}\right] ^{1/2}\,dt \le a^{ij}\left[ |Du|^{2}\right] _{ij}\,dt +N|Du|^{2}\,dt \end{aligned}$$

with a constant \(N\). Hence

$$\begin{aligned} d\left( |Du|^{2}\right) ^{p/2}\le (p/2)|Du|^{p-2}a^{ij}\left[ |Du|^{2}\right] _{ij}\,dt +N|Du|^{p}\,dt, \end{aligned}$$

where

$$\begin{aligned} |Du|^{p-2}a^{ij}[|Du|^{2}]_{ij}&\sim -|Du|^{p-2}a^{ij}_{j}\left[ |Du|^{2}\right] _{i} \nonumber \\&-\,((p-2)/2)|Du|^{p-4}a^{ij}\left[ |Du|^{2}\right] _{i}\left[ |Du|^{2}\right] _{j} \nonumber \\&\le -\,(2/p)a^{ij}_{j}\left[ |Du|^{p}\right] _{i}\preceq N|Du|^p, \end{aligned}$$
(4.7)

which implies

$$\begin{aligned} |\,|Du|^2|_{L_{p/2}}^{p/2}\le N|\,|D\psi |^2| _{L_{p/2}}^{p/2}, \end{aligned}$$

by Gronwall’s lemma. Consequently, estimate (4.3) follows, since it is not difficult to see that

$$\begin{aligned} |u(t)|^p_{L_p}\le N|\psi |^p_{L_p} \end{aligned}$$

holds. The careful reader may notice that though the computations in (4.7) are justified only for \(p\ge 4\), by approximating the function \(|t|^{p-2}\), \(t\in {\mathbb {R}}^d\) by smooth functions we can extend them to get the desired estimate for all \(p\ge 2\).
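To spell out the Gronwall step: setting \(y_t:=|\,|Du(t)|^2|_{L_{p/2}}^{p/2}=\int _{{\mathbb {R}}^d}|Du(t)|^p\,dx\), integrating the last two displayed estimates over \({\mathbb {R}}^d\) gives

$$\begin{aligned} dy_t\le Ny_t\,dt,\quad \text {hence}\quad y_t\le e^{Nt}y_0\le e^{NT}y_0\quad \text {for } t\in [0,T], \end{aligned}$$

which is the claimed bound with the constant \(e^{NT}\).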

The following lemma on Itô’s formula in the special case \(M=1\) is Theorem 2.1 from [19]. The proof of this multidimensional variant goes the same way, and therefore will be omitted. Note that for \(p\ge 2\) the second derivative \(D_{ij}\langle x\rangle ^p\) of the function \((x_1,x_2,\ldots ,x_M)\rightarrow \langle x\rangle ^p\) is

$$\begin{aligned} p(p-2)\langle x\rangle ^{p-4}x_ix_j+p\langle x\rangle ^{p-2}\delta _{ij}, \end{aligned}$$

which makes the last term in (4.8) below natural. Here and later on we use the convention \(0\cdot 0^{-1}:=0\) whenever such terms occur.
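To verify this formula: with \(\langle x\rangle \) denoting the Euclidean norm on \({\mathbb {R}}^M\), we have \(D_i\langle x\rangle ^p=p\langle x\rangle ^{p-2}x_i\) and \(D_j\langle x\rangle ^{p-2}=(p-2)\langle x\rangle ^{p-4}x_j\), so that

$$\begin{aligned} D_{ij}\langle x\rangle ^p=D_j\big(p\langle x\rangle ^{p-2}x_i\big) =p(p-2)\langle x\rangle ^{p-4}x_ix_j+p\langle x\rangle ^{p-2}\delta _{ij}. \end{aligned}$$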

Lemma 4.3

Let \(p\ge 2\) and let \(\psi =(\psi ^k)_{k=1}^M\) be an \(L_p({\mathbb {R}}^{d}, {\mathbb {R}}^M)\)-valued \({\mathcal {F}}_0\)-measurable random variable. For \( i=0,1,2,\ldots ,d\) and \(k=1,\ldots ,M\) let \(f^{ki}\) and \((g^{kr})_{r=1}^{\infty }\) be predictable functions on \(\Omega \times (0,T]\), with values in \(L_p\) and in \(L_p(l_2)\), respectively, such that

$$\begin{aligned} \int _0^T\left( \sum \limits _{i,k}|f^{ki}_t|_{L_{p}}^p+ \sum \limits _{k}|g^{k\cdot }_t|_{L_{p}}^p\right) \,dt<\infty \quad \text {(a.s.)}. \end{aligned}$$

Suppose that for each \(k=1,\ldots ,M\) we are given a \(W^1_p\)-valued predictable function \(u^k\) on \(\Omega \times (0,T]\) such that

$$\begin{aligned} \int _0^T|u^k_t|^p_{W^1_p}\,dt<\infty \,(a.s.), \end{aligned}$$

and for any \(\phi \in C_0^{\infty }\) with probability 1 for all \(t\in [0,T]\) we have

$$\begin{aligned} \big (u_t^k,\phi \big )=\big (\psi ^k,\phi \big )+ \int _0^t\big (g^{kr}_s,\phi \big )\,dw^r_s +\int _0^t\big ((f^{k0}_s,\phi )-(f^{ki}_s,D_i\phi )\big )\,ds. \end{aligned}$$

Then there exists a set \(\Omega '\subset \Omega \) of full probability such that

$$\begin{aligned} u={{\mathbf {1}}}_{\Omega ^{\prime }}\big (u^1,\ldots ,u^M\big )_{t\in [0,T]} \end{aligned}$$

is a continuous \(L_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process, and for all \(t\in [0,T]\)

$$\begin{aligned} \int _{{\mathbb {R}}^d}\langle u_t\rangle ^p\,dx&= \int _{{\mathbb {R}}^d}\langle \psi \rangle ^p\,dx +\int _0^t\int _{{\mathbb {R}}^d}p\langle u_s\rangle ^{p-2}\langle u_s,g_s^{ r}\rangle \,dx\,dw_s^{ r} \nonumber \\&+\,\int _0^t\int _{{\mathbb {R}}^d}\Big (p\langle u_s\rangle ^{p-2}\langle u_s, f_s^0\rangle -p\langle u_s\rangle ^{p-2}\langle D_i u_s,f_s^i\rangle \nonumber \\&-\,(1/2)p(p-2)\langle u_s\rangle ^{p-4} \langle u_s,f^i_s\rangle D_i\langle u_s\rangle ^2 \nonumber \\&+\,\sum \limits _r\big [(1/2)p(p-2)\langle u_s\rangle ^{p-4}\langle u_s,g_s^{ r}\rangle ^2 +(1/2)p\langle u_s\rangle ^{p-2}\langle g_s^{ r}\rangle ^2\big ]\Big )\,dx\,ds,\nonumber \\ \end{aligned}$$
(4.8)

where \(f^{i}:=(f^{ki})_{k=1}^{M}\) and \(g^{ r} :=(g^{kr})_{k=1}^M\) for all \(i=0,1,\ldots ,d\) and \(r=1,2,\ldots \).

5 The main estimate

Here we consider the problem (3.1)–(3.2) with \(a_{t}=(a^{ij}_{t}(x))\) taking values in the set of nonnegative symmetric \(d\times d\) matrices, with the other coefficients and the data as described in (3.3). The following lemma presents the crucial estimate for proving solvability in \(L_p\) spaces. It generalises the estimate for \(Du\) explained in Section 4 for a solution \(u\) to a simple PDE.

Lemma 5.1

Suppose that Assumptions 3.1, 3.2, and 3.3 hold with \(m\ge 0\). Assume that \(u=(u_t)_{t\in [0,T]}\) is a solution of (3.1)–(3.2) on \([0,T]\) (as defined before Theorem 3.1). Then (a.s.) \(u\) is a continuous \(L_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process, and there is a constant \(N=N(p,K,d,M,K_0)\) such that

$$\begin{aligned}&d\int _{{\mathbb {R}}^{d}}\langle u_{t}\rangle ^{p} \,dx +(p/4)\int _{{\mathbb {R}}^{d}}\langle u_{t}\rangle ^{p-2} \alpha ^{ij}_{t}\langle D_{i}u_{t},D_{j}u_{t}\rangle \,dx\,dt \nonumber \\&\quad \le p\int _{{\mathbb {R}}^{d}}\langle u_{t}\rangle ^{p-2} \left\langle u_{t},\sigma ^{ik}D_{i}u_{t}+ \nu ^{k}_{t}u_{t}+g^{k}_{t}\right\rangle \,dx\,dw^{k}_{t} \nonumber \\&\quad \quad +\,N\int _{{\mathbb {R}}^{d}}\left[ \langle u_{t}\rangle ^{p} +\langle f_{t}\rangle ^{p} +\left( \sum \limits _{k}\langle g^{k}_{t}\rangle ^{2 }\right) ^{p/2} +\left( \sum \limits _{k}\langle D g^{k}_{t}\rangle ^{2 } \right) ^{p/2} \right] \,dx\,dt.\quad \quad \quad \end{aligned}$$
(5.1)

Proof

By Lemma 4.3 (a.s.) \(u\) is a continuous \(L_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process and

$$\begin{aligned} d\int _{{\mathbb {R}}^d}\langle u_{t}\rangle ^{p}\,dx&= \int _{{\mathbb {R}}^d}p\langle u_t\rangle ^{p-2}\langle u_t,\sigma ^{ik}D_iu_t+ \nu ^k_tu_t+g^k_t\rangle \,dx\,dw^k_t \nonumber \\&+\,\int _{{\mathbb {R}}^d}\Big (p\langle u_t\rangle ^{p-2} \langle u_t,b^i_t D_iu_t+c_tu_t+f_t-D_ia^{ij}_tD_ju_t\rangle \nonumber \\&-\,p\langle u_t\rangle ^{p-2}\langle D_i u_t,a^{ij}_t D_ju_t\rangle \nonumber \\&-\,(1/2)p(p-2)\langle u_t\rangle ^{p-4} D_i\langle u_t\rangle ^2\langle u_t,a^{ij}_tD_ju_t\rangle \nonumber \\&+\,\sum \limits _k\left\{ (1/2)p(p-2)\langle u_t\rangle ^{p-4} \langle u_t,\sigma ^{ik}_tD_i u_t+\nu ^k_tu_t+g^k_t\rangle ^2\right. \nonumber \\&\left. +\,(1/2)p\langle u_t\rangle ^{p-2}\langle \sigma ^{ik}_tD_i u_t +\nu ^k_tu_t+g^k_t\rangle ^2\right\} \Big )\,dx\,dt. \end{aligned}$$
(5.2)

Observe that

$$\begin{aligned} \langle u_{t}\rangle ^{p-2} \langle u_{t},f_{t}\rangle&\le \langle u_{t}\rangle ^{p } +\langle f_{t}\rangle ^{p },\quad \langle u_{t}\rangle ^{p-2} \sum \limits _{k}\langle g^{k}_{t}\rangle ^{2 } \le \langle u_{t}\rangle ^{p }+ \Big (\sum \limits _{k}\langle g^{k}_{t}\rangle ^{2 }\Big )^{p/2}, \\ \langle u_{t}\rangle ^{p-2} \sum \limits _{k}\langle \nu ^{k}_{t}u_{t}, g^{k}_{t}\rangle&\le N\langle u_{t}\rangle ^{p-1} \left( \sum \limits _{k}\langle g^{k}_{t}\rangle ^{2 }\right) ^{1/2} \le N\langle u_{t}\rangle ^{p }+ N \left( \sum \limits _{k}\langle g^{k}_{t}\rangle ^{2 }\right) ^{p/2}, \\ \langle u_{t}\rangle ^{p-4}\sum \limits _{k}\langle u_{t}, g^{k}_{t}\rangle ^{ 2}&\le \langle u_{t}\rangle ^{p-2} \sum \limits _{k}\langle g^{k}_{t}\rangle ^{2 } \le \langle u_{t}\rangle ^{p }+ \left( \sum \limits _{k}\langle g^{k}_{t}\rangle ^{2 }\right) ^{p/2}, \\ \langle u_{t}\rangle ^{p-4}\sum \limits _{k}\langle u_{t}, \nu ^{k}_{t}u_{t}\rangle \langle u_{t}, g^{k}_{t}\rangle&\le N\langle u_{t}\rangle ^{p-1} \left( \sum \limits _{k}\langle g^{k}_{t}\rangle ^{2 }\right) ^{1/2} \le \langle u_{t}\rangle ^{p }+ \left( \sum \limits _{k}\langle g^{k}_{t}\rangle ^{2 }\right) ^{p/2}, \\ \langle u_t\rangle ^{p-2}\langle u_t,c_tu_t\rangle&\le \langle u_t\rangle ^{p-1}\langle c_tu_t\rangle \le |c_t|\langle u_t\rangle ^{p}, \end{aligned}$$

where \(|c|\) denotes the (Hilbert–Schmidt) norm of \(c\).
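The first of these bounds, for instance, combines the Cauchy–Schwarz inequality \(\langle u_t,f_t\rangle \le \langle u_t\rangle \langle f_t\rangle \) with Young's inequality:

$$\begin{aligned} \langle u_{t}\rangle ^{p-2}\langle u_{t},f_{t}\rangle \le \langle u_{t}\rangle ^{p-1}\langle f_{t}\rangle \le \frac{p-1}{p}\langle u_{t}\rangle ^{p}+\frac{1}{p}\langle f_{t}\rangle ^{p} \le \langle u_{t}\rangle ^{p}+\langle f_{t}\rangle ^{p}, \end{aligned}$$

and the remaining ones are obtained in the same way, together with \(\sum _{k}\langle u,g^{k}\rangle ^2\le \langle u\rangle ^2\sum _{k}\langle g^{k}\rangle ^2\).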

This shows how to estimate a few terms on the right in (5.2). We write \( \xi \sim \eta \) if \(\xi \) and \(\eta \) have identical integrals over \({\mathbb {R}}^d\) and we write \(\xi \preceq \eta \) if \(\xi \sim \eta +\zeta \) and the integral of \(\zeta \) over \({\mathbb {R}}^d\) can be estimated by the coefficient of \(dt\) in the right-hand side of (5.1). For instance, integrating by parts and using the smoothness of \(\sigma ^{ik}_{t}\) and \(g^{k}_{t}\) we get

$$\begin{aligned} p\langle u_{t}\rangle ^{p-2} \langle \sigma ^{ik}_{t}D_{i}u_{t},g^{k}_{t}\rangle&\preceq -p\sigma ^{ik}_{t}(D_{i}\langle u_{t}\rangle ^{p-2}) \langle u_{t},g^{k}_{t}\rangle \nonumber \\&= -p(p-2)\langle u_{t}\rangle ^{p-4} \langle u_{t},\sigma ^{ik}_{t}D_{i}u_{t} \rangle \langle u_{t},g^{k}_{t}\rangle , \end{aligned}$$
(5.3)

where the first expression comes from the last occurrence of \(g^{k}_{t}\) in (5.2), and the last one with an opposite sign appears in the evaluation of the first term behind the summation over \(k\) in (5.2). Notice, however, that these calculations are not justified when \(p\) is close to \(2\), since in this case \(\langle u_{t}\rangle ^{p-2}\) may not be absolutely continuous with respect to \(x^i\) and it is not clear either if \(0/0\) should be defined as \(0\) when it occurs in the second line. For \(p=2\) we clearly have \(\langle \sigma ^{ik}_{t}D_{i}u_{t},g^{k}_{t}\rangle \preceq 0\). For \(p>2\) we modify the above calculations by approximating the function \(\langle t\rangle ^{p-2}\), \(t\in {\mathbb {R}}^M\), by continuously differentiable functions \(\phi _n(t)=\varphi _n(\langle t\rangle ^2)\) such that

$$\begin{aligned} \lim _{n\rightarrow \infty }\varphi _n(r)=|r|^{(p-2)/2}, \quad \lim _{n\rightarrow \infty }\varphi ^{\prime }_n(r) =(p-2)\mathrm{sign}(r)|r|^{(p-4)/2}/2 \end{aligned}$$

for all \(r\in {\mathbb {R}}\), and

$$\begin{aligned} |\varphi _n(r)|\le N|r|^{(p-2)/2}, \quad |\varphi ^{\prime }_n(r)|\le N|r|^{(p-4)/2} \end{aligned}$$

for all \(r\in {\mathbb {R}}\) and integers \(n\ge 1\), where \(\varphi ^{\prime }_n:=d\varphi _n/dr\) and \(N\) is a constant independent of \(n\). Thus instead of (5.3) we have

$$\begin{aligned} p\varphi _n(\langle u_{t}\rangle ^2) \langle \sigma ^{ik}_{t}D_iu_{t},g^k_{t}\rangle \preceq -2 p\varphi ^{\prime }_n(\langle u_{t}\rangle ^2) \langle u_{t},\sigma ^{ik}_{t}D_{i}u_{t} \rangle \langle u_{t},g^{k}_{t}\rangle , \end{aligned}$$
(5.4)

where

$$\begin{aligned} |\varphi ^{\prime }_n(\langle u_{t}\rangle ^2) \langle u_{t},\sigma ^{ik}_{t}D_{i}u_{t} \rangle \langle u_{t},g^{k}_{t}\rangle | \le N\langle u_{t}\rangle ^{p-2} \langle D_{i}u_{t} \rangle \langle g^{k}_{t}\rangle \end{aligned}$$
(5.5)

with a constant \(N\) independent of \(n\). Letting \(n\rightarrow \infty \) in (5.4) we get

$$\begin{aligned} p\langle u_{t}\rangle ^{p-2} \langle \sigma ^{ik}_{t}D_{i}u_{t},g^{k}_{t}\rangle \preceq -p(p-2)\langle u_{t}\rangle ^{p-4} \langle u_{t},\sigma ^{ik}_{t}D_{i}u_{t} \rangle \langle u_{t},g^{k}_{t}\rangle , \end{aligned}$$

where, due to (5.5), \(0/0\) means \(0\) when it occurs.

These manipulations allow us to take care of the terms containing \(f\) and \(g\), and show that to prove the lemma it suffices to prove

$$\begin{aligned}&p( I_{0}+I_{1} +I_{2})+(p/2)I_{3}+[p(p-2)/2]( I_{4}+I_{5} ) \nonumber \\&\quad \preceq -(p/4)\langle u_{t}\rangle ^{p-2} \alpha ^{ij}_{t}\langle D_{i}u_{t},D_{j}u_{t}\rangle , \end{aligned}$$
(5.6)

where

$$\begin{aligned} I_{0}&= -\langle u_t\rangle ^{p-2}D_ia^{ij}_t\langle u_t,D_ju_t\rangle , \quad I_{1}=-\langle u_t\rangle ^{p-2}a^{ij}_t\langle D_i u_t, D_ju_t\rangle \\ I_{2}&= \langle u_{t}\rangle ^{p-2} \langle u_{t},b^{i}_{t}D_{i} u_{t} \rangle ,\quad I_{3}=\langle u_{t}\rangle ^{p-2} \sum \limits _{k}\langle \sigma ^{ik}_{t}D_{i}u_{t}+\nu ^{k}_{t}u_{t} \rangle ^{2}, \\ I_{4}&= \langle u_{t}\rangle ^{p-4}\sum \limits _{k} \left\langle u_{t},\sigma ^{ik}_{t}D_{i}u_{t}+\nu ^{k}_{t}u_{t} \right\rangle ^{2},\quad I_5=-\langle u_t\rangle ^{p-4}D_i\langle u_t\rangle ^2\langle u_t,a^{ij}_tD_ju_t\rangle . \end{aligned}$$

Observe that

$$\begin{aligned} I_{0}=-(1/2)\langle u_t\rangle ^{p-2}D_ia^{ij}_tD_j\langle u_t\rangle ^2=-(1/p)D_j\langle u_t\rangle ^pD_ia^{ij}_t\preceq 0, \end{aligned}$$

by the smoothness of \(a\). Also notice that

$$\begin{aligned} I_{3} \preceq \langle u_{t}\rangle ^{p-2}\sigma ^{ik}_{t} \sigma ^{jk}_{t}\langle D_{i}u_{t},D_{j}u_{t}\rangle +I_{6}, \end{aligned}$$

where

$$\begin{aligned} I_{6}=2\langle u_{t}\rangle ^{p-2} \sigma ^{ik}_{t}\langle D_{i}u_{t},\nu ^{k}u_{t}\rangle . \end{aligned}$$

It follows that

$$\begin{aligned} pI_{1}+(p/2)I_{3} \preceq -(p/2)\langle u_{t}\rangle ^{p-2} \alpha ^{ij}_{t}\langle D_{i}u_{t},D_{j}u_{t}\rangle +(p/2)I_{6}. \end{aligned}$$

Next,

$$\begin{aligned} I_{4}&\preceq \langle u_{t}\rangle ^{p-4} \sigma ^{ik}_{t}\sigma ^{jk}_{t}\langle u_{t},D_{i}u_{t} \rangle \langle u_{t},D_{j}u_{t} \rangle +2\langle u_{t}\rangle ^{p-4} \sigma ^{ik}_{t}\langle u_{t},D_{i}u_{t}\rangle \langle u_{t},\nu ^{k}_{t}u_{t}\rangle \\&= (1/4)\langle u_{t}\rangle ^{p-4} \sigma ^{ik}_{t}\sigma ^{jk}_{t} D_{i}\langle u_{t} \rangle ^{2} D_{j}\langle u_{t} \rangle ^{2} +[2/(p-2)](D_{i}\langle u_{t} \rangle ^{p-2}) \sigma ^{ik}_{t}\langle u_{t},\nu ^{k}_{t}u_{t}\rangle \\&\preceq (1/4)\langle u_{t}\rangle ^{p-4} \sigma ^{ik}_{t}\sigma ^{jk}_{t} D_{i}\langle u_{t} \rangle ^{2} D_{j}\langle u_{t} \rangle ^{2}- [1/(p-2)]I_{6}-[2/(p-2)]I_{7}, \end{aligned}$$

where

$$\begin{aligned} I_{7}= \langle u_{t} \rangle ^{p-2} \sigma ^{ik}_{t}\langle u_{t},\nu ^{k}_{t}D_{i}u_{t}\rangle . \end{aligned}$$

Hence

$$\begin{aligned} pI_{1}&+(p/2)I_{3}+[p(p-2)/2](I_{4} +I_{5} ) \preceq -(p/2)\langle u_{t}\rangle ^{p-2} \alpha ^{ij}_{t}\langle D_{i}u_{t},D_{j}u_{t}\rangle \\&-[p(p-2)/ 8 ] \langle u_{t}\rangle ^{p-4}\alpha ^{ij}_{t} D_{i}\langle u_{t} \rangle ^{2} D_{j}\langle u_{t} \rangle ^{2}-pI_{7}, \end{aligned}$$

and

$$\begin{aligned} I_2-I_7=\langle u_{t} \rangle ^{p-2}( \langle u_{t},b^{i}_{t} D_{i} u_{t}\rangle -\sigma ^{ik}_{t}\langle u_{t},\nu ^{k}_{t}D_{i}u_{t}\rangle )= \langle u_{t} \rangle ^{p-2}\langle u_{t},\beta ^{i}_{t}D_{i} u_{t}\rangle , \end{aligned}$$

with \(\beta ^i=b^i-\sigma ^{ik}\nu ^{k}\). It follows by Remark 3.2 that the left-hand side of (5.6) is estimated, in the sense of the order defined by \(\preceq \), by

$$\begin{aligned}&-\,(p/2)\langle u_{t}\rangle ^{p-2} \alpha ^{ij}_{t}\langle D_{i}u_{t},D_{j}u_{t}\rangle \nonumber \\&-\,[p(p-2)/ 8 ] \langle u_{t}\rangle ^{p-4}\alpha ^{ij}_{t} D_{i}\langle u_{t} \rangle ^{2} D_{j}\langle u_{t} \rangle ^{2} \nonumber \\&+\,K_{0}p \langle u_{t}\rangle ^{p-2}\left| \sum \limits _{i,j=1}^{d} \alpha ^{ij}_{t}\langle D_{i}u_{t},D_{j}u_{t}\rangle \right| ^{1/2}\langle u_{t}\rangle +h^{i}D_{i}\langle u_{t}\rangle ^{p} \nonumber \\&\preceq \,-(p/4)\langle u_{t}\rangle ^{p-2} \alpha ^{ij}_{t}\langle D_{i}u_{t},D_{j}u_{t}\rangle \nonumber \\&-\,[p(p-2)/ 8 ] \langle u_{t}\rangle ^{p-4} \alpha ^{ij}_{t} D_{i}\langle u_{t} \rangle ^{2} D_{j}\langle u_{t} \rangle ^{2} , \end{aligned}$$
(5.7)

where the last relation follows from the elementary inequality \(ab\le \varepsilon a^{2}+\varepsilon ^{-1}b^{2}\). The lemma is proved. \(\square \)

Remark 5.1

In the case that \(p=2\) one can replace condition (3.6) with the following:

There are constants \(K_{0},N\ge 0\) such that for all continuously differentiable \({\mathbb {R}}^{M}\)-valued functions \(u=u(x)\) with compact support in \({\mathbb {R}}^{d}\) and all values of the arguments we have

$$\begin{aligned} \int _{{\mathbb {R}}^d}\langle u, \beta ^{i}D_{i}u\rangle \,dx&\le N\int _{{\mathbb {R}}^d}\langle u\rangle ^2\,dx \nonumber \\&+\,K_{0}\int _{{\mathbb {R}}^d}\left( \left| \sum \limits _{i,j=1}^{d}\alpha ^{ij}\langle D_{i}u,D_{j}u\rangle \right| ^{1/2}\langle u\rangle +h^{i} \langle D_{i} u,u \rangle \right) \,dx.\nonumber \\ \end{aligned}$$
(5.8)

This condition is weaker than (3.6), as follows from Remark 3.2, and still, by inspecting the above proof, we get that \(u\) is a continuous \(L_2({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process, and there is a constant \(N=N(K,d,M,K_0)\) such that (5.1) holds with \(p=2\).

Remark 5.2

In the case that \(p=2\) and the magnitudes of the first derivatives of \(b^{i}\) are bounded by \(K\) one can further replace condition (5.8) with a more tractable one, which is Assumption 3.4.

Indeed, for \(\varepsilon >0\)

$$\begin{aligned} R&:= \langle u,(\beta ^i-h^iI_{M})D_{i} u\rangle =\tfrac{1}{2}\beta ^{ikl}D_i(u^ku^l)+\langle u,(\bar{\beta }^i-h^iI_M)D_{i} u\rangle \\&\le \tfrac{1}{2}\beta ^{ikl}D_i(u^ku^l)+ \varepsilon \langle (\bar{\beta }^i-h^iI_M)D_{i} u\rangle ^2/2+\varepsilon ^{-1}\langle u\rangle ^2/2. \end{aligned}$$

Using Assumption 3.4 we get

$$\begin{aligned} R\le \tfrac{1}{2}\beta ^{ikl}D_i(u^ku^l) +\varepsilon MK_0\alpha ^{ij}\langle D_iu, D_ju\rangle /2+\varepsilon ^{-1}\langle u\rangle ^2/2 \end{aligned}$$

for every \(\varepsilon >0\). Hence by integration by parts we have

$$\begin{aligned} \int _{{\mathbb {R}}^d}\langle u,\beta ^iD_{i} u\rangle \,dx&\le N\int _{{\mathbb {R}}^d}\langle u\rangle ^2\,dx +\int _{{\mathbb {R}}^d}\langle u,h^iI_{M}D_{i} u\rangle \,dx \\&+\,MK_0\int _{{\mathbb {R}}^d}\big( (\varepsilon /2) \alpha ^{ij}\langle D_iu, D_ju\rangle +(\varepsilon ^{-1}/2)\langle u\rangle ^2\big)\,dx. \end{aligned}$$

Minimising here over \(\varepsilon >0\) we get (5.8). In that case again \(u\) is a continuous \(L_2({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process, and there is a constant \(N=N(K,d,M,K_0)\) such that (5.1) holds with \(p=2\).

Remark 5.3

If \(M=1\), then condition (3.7) is obviously satisfied with \(K_{0}=0\) and \(h^{i}=b^{i}-\sigma ^{ik}\nu ^{k}\).

Also note that in the general case, if the coefficients are smoother, then by formally differentiating equation (3.1) with respect to \(x^{i}\) we obtain a new system of equations for the \(M\times d\) matrix-valued function

$$\begin{aligned} v_{t}=(v_{t}^{nm})=Du_{t}=(D_{m}u^{n}_{t}). \end{aligned}$$

We treat the space of \(M\times d\) matrices as a Euclidean \(Md\)-dimensional space, the coordinates in which are organized in a special way. The inner product in this space is then just \(\langle \langle A,B\rangle \rangle =\mathrm{tr}AB^{*}\). Naturally, linear operators in this space will be given by matrices like \((T^{(nm)(pj)})\), which transform an \(M\times d\) matrix \((A^{pj})\) into an \(M\times d\) matrix \((B^{nm})\) by the formula

$$\begin{aligned} B^{nm}=\sum \limits _{p=1}^{M}\sum \limits _{j=1}^{d}T^{(nm)(pj)}A^{pj}. \end{aligned}$$

We claim that the coefficients, the initial value and free terms of the system for \(v_{t}\) satisfy Assumptions 3.1, 3.2, and 3.3 with \(m-1\ge 0\) if Assumptions 3.1, 3.2, and 3.3 are satisfied with \(m\ge 1\) for the coefficients, the initial value and free terms of the original system for \(u_t\).

Indeed, as is easy to see, \(v_{t}\) satisfies (3.1) with the same \(\sigma \) and \(a\) and with \(\tilde{b}^{i}\), \(\tilde{c}\), \(\tilde{f}\), \(\tilde{\nu }^{k}\), \(\tilde{g}^{k}\) in place of \( b^{i}\), \( c\), \( f\), \( \nu ^{k}\), \( g^{k}\), respectively, where

$$\begin{aligned} \tilde{b}^{i(nm)(pj)}&= D_{m}a^{ij}\delta ^{pn}+b^{inp}\delta ^{jm},\quad \tilde{c}^{(nm)(pj)}=c^{np}\delta ^{mj}+D_mb^{jnp}, \end{aligned}$$
(5.9)
$$\begin{aligned} \tilde{f}^{nm}&= D_{m}f^{ n }+u^{r} D_{m} c^{nr},\quad \tilde{\nu }^{k(nm)(pj)}=D_{m}\sigma ^{jk}\delta ^{np} +\nu ^{knp}\delta ^{mj}, \nonumber \\ \tilde{g}^{knm}&= D_{m}g^{kn}+u^{r}D_{m}\nu ^{knr}. \end{aligned}$$
(5.10)

Then the left-hand side of the counterpart of (3.7) for \(v\) is

$$\begin{aligned} \sum \limits _{m=1}^{d}K_{m}+\sum \limits _{n=1}^{M}J_{n}, \end{aligned}$$

where (no summation with respect to \(m\))

$$\begin{aligned} K_{m}=v^{nm}b^{inr}D_{i}v^{rm}- \sigma ^{ik}v^{nm}\nu ^{knr}D_{i}v^{rm} \end{aligned}$$

and (no summation with respect to \(n\))

$$\begin{aligned} J_{n}=v^{nm}D_{m}a^{ij}D_{i}v^{nj} -\sigma ^{ik}v^{nm}D_{m}\sigma ^{jk}D_{i}v^{nj}. \end{aligned}$$

Observe that \(D_{i}v^{nj}=D_{ij}u^{n}\) implying that

$$\begin{aligned} \sigma ^{ik}D_{m}\sigma ^{jk}D_{i}v^{nj}&= (1/2)D_{m}(\sigma ^{ik}\sigma ^{jk})D_{ij}u^{n},\\ J_{n}&= (1/2)v^{nm}D_{m}\alpha ^{ij}D_{ij}u^{n}. \end{aligned}$$
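The first identity here uses the symmetry of \(D_{ij}u^{n}\) in \(i\) and \(j\): interchanging the summation indices \(i\) and \(j\),

$$\begin{aligned} \sigma ^{ik}D_{m}\sigma ^{jk}D_{ij}u^{n} =\tfrac{1}{2}\big(\sigma ^{ik}D_{m}\sigma ^{jk} +\sigma ^{jk}D_{m}\sigma ^{ik}\big)D_{ij}u^{n} =\tfrac{1}{2}D_{m}\big(\sigma ^{ik}\sigma ^{jk}\big)D_{ij}u^{n}, \end{aligned}$$

and the expression for \(J_{n}\) then follows, recalling that \(\alpha ^{ij}=2a^{ij}-\sigma ^{ik}\sigma ^{jk}\).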

By Lemma 4.2 for any \(\varepsilon >0\) and \(n\) (still no summation with respect to \(n\))

$$\begin{aligned} J_{n}\le N\varepsilon ^{-1} \langle \langle v\rangle \rangle ^{2} +\varepsilon \alpha ^{ij} D_{ik }u^{n}D_{jk }u^{n}, \end{aligned}$$

which along with the fact that \(D_{ik }u^{n}=D_{i}v^{nk}\) yields

$$\begin{aligned} \sum \limits _{n=1}^{M}J_{n}\le N\varepsilon ^{-1}\langle \langle v\rangle \rangle ^{2} +\varepsilon \alpha ^{ij} \langle \langle D_{i}v,D_{j}v\rangle \rangle . \end{aligned}$$

Upon minimizing with respect to \(\varepsilon \) we find

$$\begin{aligned} \sum \limits _{n=1}^{M}J_{n}\le N\left( \sum \limits _{i,j=1}^{d}\alpha ^{ij} \langle \langle D_{i}v,D_{j}v\rangle \rangle \right) ^{1/2} \langle \langle v\rangle \rangle . \end{aligned}$$
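The minimisation here is elementary: for \(A,B\ge 0\),

$$\begin{aligned} \min _{\varepsilon >0}\big( N\varepsilon ^{-1}A+\varepsilon B\big) =2(NAB)^{1/2}, \quad \text {attained at } \varepsilon =(NA/B)^{1/2}, \end{aligned}$$

applied with \(A=\langle \langle v\rangle \rangle ^{2}\) and \(B=\sum _{i,j}\alpha ^{ij}\langle \langle D_{i}v,D_{j}v\rangle \rangle \) (with the convention \(0\cdot 0^{-1}:=0\) when \(B=0\)).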

Next, by assumption for any \(\varepsilon >0\) and \(m\) (still no summation with respect to \(m\))

$$\begin{aligned} K_{m}\le N\varepsilon ^{-1} \langle \langle v\rangle \rangle ^{2} + \varepsilon \alpha ^{ij}D_{i}v^{rm}D_{j}v^{rm} +(1/2)h^{i}D_{i}\sum \limits _{r=1}^{M}(v^{rm})^{2}. \end{aligned}$$

We conclude as above that

$$\begin{aligned} \sum \limits _{m=1}^{d}K_{m}\le N\left( \sum \limits _{i,j=1}^{d}\alpha ^{ij} \langle \langle D_{i}v,D_{j}v\rangle \rangle \right) ^{1/2} \langle \langle v\rangle \rangle +h^{i}\langle \langle D_{i}v, v\rangle \rangle \end{aligned}$$

and this proves our claim.

The above calculations show also that the coefficients, the initial value and the free terms of the system for \(v_{t}\) satisfy Assumptions 3.1, 3.2, and 3.4 with \(m - 1\ge 0\) if Assumptions 3.1, 3.2, and 3.4 are satisfied with \(m\ge 1\) for the coefficients, the initial value and free terms of the original equation for \(u_t\). (Note that due to Assumption 3.1 with \(m\ge 1\), \(\tilde{b}\), given in (5.9), has first order derivatives in \(x\), which in magnitude are bounded by a constant.)

Now higher order derivatives of \(u\) are obviously estimated through lower order ones on the basis of this remark without any additional computations. However, we still need to be sure that we can differentiate equation (3.1).

With the help of the above remarks one can easily estimate the moments of the \(W^n_p\)-norms of \(u\) using the following version of Gronwall's lemma.

Lemma 5.2

Let \(y=(y_t)_{t\in [0,T]}\) and \(F=(F_t)_{t\in [0,T]}\) be adapted nonnegative stochastic processes and let \(m=(m_t)_{t\in [0,T]}\) be a continuous local martingale such that

$$\begin{aligned} dy_t&\le (Ny_t+F_t)\,dt+dm_t \quad \text {on}\, [0,T] \end{aligned}$$
(5.11)
$$\begin{aligned} d[m]_t&\le (N y^2_t+y^{2(1-\rho )}_tG^{2\rho }_t)\,dt \quad \text {on}\, [0,T], \end{aligned}$$
(5.12)

with some constants \(N\ge 0\) and \(\rho \in [0,1/2]\), and a nonnegative adapted stochastic process \(G=(G_t)_{t\in [0,T]}\), such that

$$\begin{aligned} \int _0^TG_t\,dt<\infty \,(a.s.), \end{aligned}$$

where \([m]\) is the quadratic variation process for \(m\). Then for any \(q>0\)

$$\begin{aligned} E\sup _{t\le T}y_t^q \le CEy_0^q +CE\left\{ \int _0^{T}(F_t+G_t)\,dt\right\} ^{q} \end{aligned}$$

with a constant \(C=C(N,q,\rho ,T)\).

Proof

This lemma improves Lemma 3.7 from [10]. Its proof goes in the same way as that in [10], and can be found in [11]. \(\square \)

Lemma 5.3

Let \(m\ge 0\). Suppose that Assumptions 3.1, 3.2, and 3.3 are satisfied and assume that \(u=(u_t)_{t\in [0,T]}\) is a solution of (3.1)-(3.2) on \([0,T]\) such that (a.s.)

$$\begin{aligned} \int _0^T|u_t|^p_{W_p^{m+1}}\,dt<\infty . \end{aligned}$$

Then (a.s.) \(u\) is a continuous \(W^m_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process and for any \(q>0\)

$$\begin{aligned} E\sup _{t\in [0,T]}|u_t|_{W^m_p}^q \le N(E|\psi |_{W^m_p}^q+E{\mathcal {K}}^q_{m,p}(T)) \end{aligned}$$
(5.13)

with a constant \(N=N(m,p,q,d,M,K,K_0,T)\). If \(p=2\) and instead of Assumption 3.3 Assumption 3.4 holds and (in case \(m=0\)) the magnitudes of the first derivatives of \(b^{i}\) are bounded by \(K\), then \(u\) is a continuous \(W^m_2({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process, and for any \(q>0\) estimate (5.13) holds (with \(p=2\)).

Proof

We are going to prove the lemma by induction on \(m\). First let \(m=0\) and denote \(y_t:=|u_t|^p_{L_p}\). Then by virtue of Remark 5.2 and Lemma 5.1, \(u\) is a continuous \(L_p\)-valued process, so \(y=(y_t)_{t\in [0,T]}\) is an adapted nonnegative continuous process, and (5.11) holds with

$$\begin{aligned} F_t:&= \int _{{\mathbb {R}}^{d}} \left[ \langle f_{t}\rangle ^{p} +\left( \sum \limits _{k}\langle g^{k}_{t}\rangle ^{2 }\right) ^{p/2} +\left( \sum \limits _{k}\langle D g^{k}_{t}\rangle ^{2 } \right) ^{p/2} \right] \,dx, \\ m_t:&= p\int _0^t\int _{{\mathbb {R}}^{d}}\langle u_{s}\rangle ^{p-2} \left\langle u_{s},\sigma ^{ik}_sD_{i}u_{s}+ \nu ^{k}_{s}u_{s}+g^{k}_{s}\right\rangle \,dx\,dw^{k}_{s}. \end{aligned}$$

Notice that

$$\begin{aligned} d[m]_t&= p^2\sum \limits _{r=1}^{\infty }\left( \int _{{\mathbb {R}}^{d}}\langle u_{t}\rangle ^{p-2} \langle u_{t},\sigma ^{ir}_tD_{i}u_{t}+ \nu ^{r}_{t}u_{t}+g^{r}_{t}\rangle \,dx\right) ^2\,dt \\&\le 3p^2(A_t+B_t+C_t)\,dt, \end{aligned}$$

with

$$\begin{aligned} A_t&= \sum \limits _{r=1}^{\infty }\left( p\int _{{\mathbb {R}}^{d}}\langle u_{t}\rangle ^{p-2} \sigma ^{ir}_t\langle u_{t},D_{i}u_{t} \rangle \,dx\right) ^2 =\sum \limits _{r=1}^{\infty }\left( \int _{{\mathbb {R}}^{d}}\sigma ^{ir}_tD_i\langle u_{t}\rangle ^{p} \,dx\right) ^2,\\ B_t&= \sum \limits _{r=1}^{\infty }\left( \int _{{\mathbb {R}}^{d}}\langle u_{t}\rangle ^{p-2} \langle u_{t}, \nu ^{r}_{t}u_{t}\rangle \,dx\right) ^2, \quad C_t=\sum \limits _{r=1}^{\infty }\left( \int _{{\mathbb {R}}^{d}}\langle u_{t}\rangle ^{p-2} \langle u_{t},g^{r}_{t}\rangle \,dx\right) ^2. \end{aligned}$$

Integrating by parts and then using Minkowski’s inequality, due to Assumption 3.1, we get \(A_t\le N y_t^2\) with a constant \(N=N(K,M,d)\). Using Minkowski’s inequality and taking into account that

$$\begin{aligned} \sum \limits _{r=1}^{\infty }\langle u,\nu ^{r}u\rangle ^2\le \langle u\rangle ^4\sum \limits _{r=1}^{\infty }|\nu ^r|^2 \le N\langle u\rangle ^4, \quad \sum \limits _{r=1}^{\infty }\langle u,g^{r}\rangle ^2 \le \langle u\rangle ^2|g|^2, \end{aligned}$$

we obtain

$$\begin{aligned} B_t\le Ny_t^2, \quad C_t\le \left( \int _{{\mathbb {R}}^d}\langle u_t\rangle ^{p-1}|g_t|\,dx\right) ^2 \le |y_t|^{2(p-1)/p}|g_t|_{L_p}^{2}. \end{aligned}$$

Consequently, condition (5.12) holds with \(G_t=|g_t|^p_{L_p}\), \(\rho =1/p\), and we get (5.13) with \(m=0\) by applying Lemma 5.2.
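To spell out the last step: applying Lemma 5.2 with \(q/p\) in place of \(q\), and using \(y_0=|\psi |^p_{L_p}\), we get

$$\begin{aligned} E\sup _{t\le T}|u_t|^{q}_{L_p}=E\sup _{t\le T}y_t^{q/p} \le CE|\psi |^{q}_{L_p} +CE\left\{ \int _0^{T}(F_t+G_t)\,dt\right\} ^{q/p}, \end{aligned}$$

and the last expectation is bounded by \(NE{\mathcal {K}}^q_{0,p}(T)\), since \(\int _0^T(F_t+G_t)\,dt\) is dominated by a constant times \({\mathcal {K}}^p_{0,p}(T)\).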

Let \(m\ge 1\) and assume that the assertions of the lemma are valid with \(m-1\) in place of \(m\), for any \(M\ge 1\), \(p\ge 2\) and \(q>0\), and for any \(u\), \(\psi \), \(f\) and \(g\) satisfying the assumptions with \(m-1\) in place of \(m\). Recall the notation \(v_t=(v^{nm}_t)=(D_mu^n_t)\) from Remark 5.3, and that \(v_{t}\) satisfies (3.1) with the same \(\sigma \) and \(a\) and with \(\tilde{b}^{i}\), \(\tilde{c}\), \(\tilde{f}\), \(\tilde{\nu }^{k}\), \(\tilde{g}^{k}\) in place of \( b^{i}\), \( c\), \( f\), \( \nu ^{k}\), \( g^{k}\), respectively. By virtue of Remarks 5.3 and 5.2 the system for \(v=(v_t)_{t\in [0,T]}\) satisfies Assumption 3.3, and it is easy to see that it satisfies also Assumptions 3.1 and 3.2 with \(m-1\) in place of \(m\). Hence by the induction hypothesis \(v\) is a continuous \(W^{m-1}_p({\mathbb {R}}^{d},{\mathbb {R}}^{M\times d})\)-valued adapted process, and we have

$$\begin{aligned} E\sup _{t\in [0,T]}|v_t|_{W^{m-1}_p}^q \le N\left( E|{\tilde{\psi }}|_{W^{m-1}_p}^q +E\tilde{{\mathcal {K}}}^q_{m-1,p}(T)\right) \end{aligned}$$
(5.14)

with a constant \(N=N(T,K,K_0,M,d,p,q)\), where \({\tilde{\psi }}^{nl}=D_{l}\psi ^n\),

$$\begin{aligned} \tilde{\mathcal {K}}^p_{m-1,p}(T) :=\int _0^T\left( |\tilde{f}_t|^p_{W^{m-1}_p}+|\tilde{g}_t|^p_{W^{m}_p}\right) \,dt. \end{aligned}$$

It follows that \((u_t)_{t\in [0,T]}\) is a \(W^m_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued continuous adapted process, and by using the induction hypothesis it is easy to see that

$$\begin{aligned} E\tilde{{\mathcal {K}}}^q_{m-1,p}(T)\le N\left( E|\psi |^q_{W^m_p}+E{\mathcal {K}}^q_{m,p}(T)\right) . \end{aligned}$$

Thus (5.13) follows.

If \(p=2\) and Assumption 3.3 is replaced with Assumption 3.4, then the proof of the conclusion of the lemma goes in the same way with obvious changes. The proof is complete.

6 Proof of Theorems 3.1 and 3.2

First we prove uniqueness. Let \(u^{(1)}\) and \(u^{(2)}\) be solutions to (3.1)-(3.2), and let Assumptions 3.1, 3.2 and 3.3 hold with \(m=0\). Then \(u:=u^{(1)}-u^{(2)}\) solves (3.1) with \(u_0=0\), \(g=0\) and \(f=0\), and Lemma 5.1 and Remark 5.2 are applicable to \(u\). Applying Itô’s formula to \(|u_t|^p_{L_p}\exp (-\lambda t)\) with a sufficiently large constant \(\lambda \), after simple calculations we get that almost surely

$$\begin{aligned} 0\le e^{-\lambda t}|u_t|^p_{L_p}\le m_t\quad \text {for all}\, t\in [0,T], \end{aligned}$$

where \(m:=(m_t)_{t\in [0,T]}\) is a continuous local martingale starting from \(0\). Hence almost surely \(m_t=0\) for all \(t\), and it follows that almost surely \(u_t^{(1)}(x)=u_t^{(2)}(x)\) for all \(t\in [0,T]\) and almost every \(x\in {\mathbb {R}}^d\). If \(p=2\), Assumptions 3.1, 3.2 and 3.4 hold, the magnitudes of the first derivatives of \(b^{i}\) are bounded by \(K\), and \(u^{(1)}\) and \(u^{(2)}\) are solutions, then we can repeat the above argument with \(p=2\) to get \(u^{(1)}=u^{(2)}\).
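The vanishing of \(m\) is the standard fact that a nonnegative continuous local martingale starting from zero vanishes identically: by Fatou’s lemma \(m\) is a supermartingale, hence

$$\begin{aligned} 0\le E\left( e^{-\lambda t}|u_t|^p_{L_p}\right) \le Em_t\le Em_0=0 \quad \text {for each}\ t\in [0,T], \end{aligned}$$

and since \(m\) is nonnegative and continuous, almost surely \(m_t=0\) for all \(t\) simultaneously.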

To show the existence of solutions we approximate the data of system (3.1) with smooth ones, satisfying also the strong stochastic parabolicity, Assumption 4.1. To this end we will use the approximation described in the following lemma.

Lemma 6.1

Let Assumptions 3.1 and 3.3 (3.4, respectively) hold with \(m\ge 1\). Then for every \(\varepsilon \in (0,1)\) there exist \({\mathcal {P}}\otimes {\mathcal {B}}({\mathbb {R}}^d)\)-measurable smooth (in \(x\)) functions \(a^{ \varepsilon ij}\), \(b^{(\varepsilon )i}\), \(c^{(\varepsilon )}\), \(\sigma ^{(\varepsilon )i}\), \(\nu ^{(\varepsilon )}\), \(D_ka^{ \varepsilon ij}\) and \(h^{(\varepsilon )i}\), satisfying the following conditions for every \(i,j,k=1,\ldots ,d\).

  1. (i)

    There is a constant \(N=N(K)\) such that

    $$\begin{aligned}&|a^{ \varepsilon ij}-a^{ij}|+|b^{(\varepsilon )i}-b^{i}|+|c^{(\varepsilon )}-c| +|D_ka^{ \varepsilon ij}-D_ka^{ij}|\le N\varepsilon ,\\&|\sigma ^{(\varepsilon )i }- \sigma ^{i }|+ |\nu ^{(\varepsilon ) }-\nu |\le N\varepsilon \end{aligned}$$

    for all \((\omega ,t,x)\) and \(i,j,k=1,\ldots ,d\).

  2. (ii)

    For every integer \(n\ge 0\) the partial derivatives in \(x\) of \(a^{ \varepsilon ij}\), \(b^{(\varepsilon )i}\), \(c^{(\varepsilon )}\), \(\sigma ^{(\varepsilon )i}\) and \(\nu ^{(\varepsilon )}\) up to order \(n\) are \({\mathcal {P}}\otimes {\mathcal {B}}({\mathbb {R}}^d)\)-measurable functions, bounded in magnitude by a constant. For \(n=m\) this constant is independent of \(\varepsilon \); it depends only on \(m\), \(M\), \(d\) and \(K\);

  3. (iii)

    For the matrix \(\alpha ^{ \varepsilon ij} :=2a^{ \varepsilon ij}-\sigma ^{(\varepsilon )ik}\sigma ^{(\varepsilon )jk}\) we have

    $$\begin{aligned} \alpha ^{ \varepsilon ij}\lambda ^{i}\lambda ^{j}\ge \varepsilon \sum \limits _{i=1}^d|\lambda ^i|^2 \quad \text {for all}\, \lambda =(\lambda ^1,\ldots ,\lambda ^d)\in {\mathbb {R}}^d; \end{aligned}$$
  4. (iv)

    Assumption 3.3 (3.4, respectively) holds for the functions \(\alpha ^{ \varepsilon ij}\), \(\beta ^{ \varepsilon i} :=b^{(\varepsilon )i}-\sigma ^{(\varepsilon )ik}\nu ^ {(\varepsilon )k}\) and \(h^{(\varepsilon )i}\) in place of \(\alpha ^{ij}\), \(\beta ^i\) and \(h^i\), respectively, with the same constant \(K_0\).

Proof

The proofs of the two statements containing Assumptions 3.3 and 3.4, respectively, go in essentially the same way, therefore we only detail the former. Let \(\zeta \) be a nonnegative smooth function on \({\mathbb {R}}^d\) with unit integral and support in the unit ball, and let \(\zeta _{\varepsilon }(x)=\varepsilon ^{-d}\zeta (x/\varepsilon )\). Define

$$\begin{aligned} b^{(\varepsilon )i}=b^i*\zeta _{\varepsilon },\, c^{(\varepsilon )}=c*\zeta _{\varepsilon },\, \sigma ^{(\varepsilon )i}=\sigma ^{i}*\zeta _{\varepsilon },\, \nu ^{(\varepsilon )}=\nu *\zeta _{\varepsilon },\, h^{(\varepsilon )i}=h^i*\zeta _{\varepsilon }, \end{aligned}$$

and \( a ^{ \varepsilon ij}=a^{ij}*\zeta _{\varepsilon }+k\varepsilon \delta _{ij}\) with a constant \(k>0\) to be determined later, where \(\delta _{ij}\) is the Kronecker symbol and ‘\(*\)’ denotes convolution in the variable \(x\in {\mathbb {R}}^d\). Since the functions being mollified are bounded and Lipschitz continuous, the mollified functions, together with \(a^{ \varepsilon ij}\) and \(D_ka^{ \varepsilon ij}\), satisfy conditions (i) and (ii). Furthermore,

$$\begin{aligned} |\sigma ^{(\varepsilon )ir}\nu ^{(\varepsilon )r}-\sigma ^{ir}\nu ^{r}| \le |\sigma ^{(\varepsilon )i}-\sigma ^i||\nu ^{(\varepsilon )}|+ |\sigma ^i||\nu ^{(\varepsilon )}-\nu |\le 2K^2\varepsilon , \end{aligned}$$

for every \(i=1,\ldots ,d\). Similarly,

$$\begin{aligned} |\sigma ^{(\varepsilon )ir}\sigma ^{(\varepsilon )jr} -\sigma ^{ir}\sigma ^{jr}|\le 2K^2\varepsilon , \quad |b^{(\varepsilon )i}-b^{i}|\le K\varepsilon , \quad |h^{(\varepsilon )i}-h^i|\le N\varepsilon \end{aligned}$$

for all \(i,j=1,2,\ldots ,d\). Hence setting

$$\begin{aligned} B^{ \varepsilon i}= b^{(\varepsilon )i}-\sigma ^{(\varepsilon )ik}\nu ^{(\varepsilon )k}-h^{(\varepsilon )i}I_M, \end{aligned}$$

and using the notation \(B^i\) for the same expression without the superscript ‘\( \varepsilon \)’, we have

$$\begin{aligned} |B^{ \varepsilon i}-B^{i}|&\le |b^{(\varepsilon )i}-b^{i}| +|\sigma ^{(\varepsilon )ir}\nu ^{(\varepsilon )r}-\sigma ^{ir}\nu ^{r}| +\sqrt{M}|h^{(\varepsilon )i}-h^i|\le R\varepsilon ,\\&|B^{ \varepsilon i}+B^{i}|\le R \end{aligned}$$

with a constant \(R=R(M,K)\). Thus for any vectors \(z_1,\ldots ,z_d\) from \({\mathbb {R}}^M\)

$$\begin{aligned} |\langle B^{ \varepsilon i} z_i\rangle ^2-\langle B^{i} z_i\rangle ^2|&= |\langle (B^{ \varepsilon i} -B^{i})z_i, (B^{ \varepsilon j}+B^{j})z_j\rangle | \\&\le |B^{ \varepsilon i} -B^{i}||B^{ \varepsilon j}+B^{j}| \langle z_i\rangle \langle z_j\rangle \le dR^2\varepsilon \sum \limits _{i=1}^d\langle z_i\rangle ^2. \end{aligned}$$
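The first equality above is the polarization identity \(\langle X\rangle ^2-\langle Y\rangle ^2=\langle X-Y,X+Y\rangle \), applied with the summation convention in force:

$$\begin{aligned} X=B^{ \varepsilon i}z_i=\sum \limits _{i=1}^dB^{ \varepsilon i}z_i, \quad Y=B^{i}z_i=\sum \limits _{i=1}^dB^{i}z_i. \end{aligned}$$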

Therefore

$$\begin{aligned} \langle B^{ \varepsilon i} z_i\rangle ^2 \le \langle B^i z_i\rangle ^2+C_1\varepsilon \sum \limits _{i=1}^d\langle z_i\rangle ^2 \end{aligned}$$

with a constant \(C_1=C_1(M,K,d)\). Similarly,

$$\begin{aligned}&\sum \limits _{i,j}\left( 2a^{ \varepsilon ij}-\sigma ^{(\varepsilon )ik} \sigma ^{(\varepsilon )jk}\right) \langle z_i,z_j\rangle \\&\ge \sum \limits _{i,j}(2a^{ij}-\sigma ^{ik}\sigma ^{jk})\langle z_i,z_j\rangle +(k-C_2)\varepsilon \sum \limits _i\langle z_i\rangle ^2 \end{aligned}$$

with a constant \(C_2=C_2(K,m,d)\). Consequently,

$$\begin{aligned} \langle (\beta ^{ \varepsilon i}-h^{(\varepsilon )i}I_M)z_i\rangle ^2&\le \langle B^i z_i\rangle ^2+C_1\varepsilon \sum \limits _{i=1}^d\langle z_i\rangle ^2 \\&\le K_0\sum \limits _{i,j=1}^d\alpha ^{ij}\langle z_i,z_j\rangle +C_1\varepsilon \sum \limits _{i=1}^d\langle z_i\rangle ^2 \\&\le K_0\sum \limits _{i,j=1}^d\alpha ^{ \varepsilon ij}\langle z_i,z_j\rangle +(K_0(C_2-k)+C_1)\varepsilon \sum \limits _{i=1}^d\langle z_i\rangle ^2. \end{aligned}$$

Choosing \(k\) such that \(K_0(C_2-k)+C_1=-K_0\) we get

$$\begin{aligned} \langle (\beta ^{ \varepsilon i}-h^{(\varepsilon )i}I_M)z_i\rangle ^2 +K_0\varepsilon \sum \limits _{i=1}^d\langle z_i\rangle ^2 \le K_0\sum \limits _{i,j=1}^d\alpha ^{ \varepsilon ij}\langle z_i,z_j\rangle . \end{aligned}$$
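For definiteness, the equation \(K_0(C_2-k)+C_1=-K_0\) determining \(k\) has the explicit positive solution

$$\begin{aligned} k=C_2+1+\frac{C_1}{K_0}, \end{aligned}$$

so \(k\) is determined by \(C_1\), \(C_2\) and \(K_0\), independently of \(\varepsilon \).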

Hence statements (iii) and (iv) follow immediately. \(\square \)

Now we start with the proof of the existence of solutions which are \(W^m_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued if Assumptions 3.1, 3.2 and 3.3 hold with \(m\ge 1\). First we make the additional assumptions that \(\psi \), \(f\) and \(g\) vanish for \(|x|\ge R\) for some \(R>0\), and that \(q\in [2,\infty )\) and

$$\begin{aligned} E|\psi |_{W^m_p}^{q}+E{\mathcal {K}}^{q}_{m,p}(T)<\infty . \end{aligned}$$
(6.1)

For each \(\varepsilon >0\) we consider the system

$$\begin{aligned} du^{\varepsilon }_{t}&= \left[ \sigma ^{(\varepsilon )ir}_{t}D_{i}u^{\varepsilon }_{t} +\nu ^{(\varepsilon )r}_{t}u^{\varepsilon }_{t} +g^{(\varepsilon )r}_{t}\right] \,dw^{r}_{t} \nonumber \\&+\,\big [ a^{ \varepsilon ij}_{t}D_{ij}u^{\varepsilon }_{t} +b^{(\varepsilon )i}_{t}D_{i} u^{\varepsilon }_{t} +c^{(\varepsilon )}_{t}u^{\varepsilon }_{t} +f^{(\varepsilon )}_{t}\big ]\,dt \end{aligned}$$
(6.2)

with initial condition

$$\begin{aligned} u_0^{(\varepsilon )}=\psi ^{(\varepsilon )}, \end{aligned}$$
(6.3)

where the coefficients are taken from Lemma 6.1, and \(\psi ^{(\varepsilon )}\), \(f^{(\varepsilon )}\) and \(g^{(\varepsilon )}\) are defined as the convolutions of \(\psi \), \(f\) and \(g\), respectively, with \(\zeta _{\varepsilon }(\cdot )=\varepsilon ^{-d}\zeta (\cdot /\varepsilon )\), where \(\zeta \in C^{\infty }_{0}({\mathbb {R}}^d)\) is the function from the proof of Lemma 6.1. By Theorem 4.1 the above equation has a unique solution \(u^{\varepsilon }\), which is a \(W^n_2({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued continuous process for all \(n\). Hence, by Sobolev embeddings, \(u^{\varepsilon }\) is a \(W^{m+1}_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued continuous process, and therefore we can use Lemma 5.3 to get

$$\begin{aligned} E\sup _{t\in [0,T]}|u^{\varepsilon }_t|_{W^n_{p' }}^{q} \le N\left( E|\psi ^{(\varepsilon )}|_{W^n_{p' }}^{q} +E({\mathcal {K}}^{ \varepsilon }_{n, p' })^{q}(T)\right) \end{aligned}$$
(6.4)

for \(p'\in \{p,2\}\) and \(n=0,1,2,\ldots ,m\), where \(\mathcal {K}^{ \varepsilon }_{n,{p'}}\) is defined by (3.4) with \(f^{(\varepsilon )}\) and \(g^{(\varepsilon )}\) in place of \(f\) and \(g\), respectively. Keeping in mind that \(T^{1/r}\le \max \{1,T\}\), and using basic properties of convolution, we can conclude that

$$\begin{aligned} E\left( \int _0^T|u^{\varepsilon }_t|_{W^n_{p'}}^r\,dt\right) ^{{q}/r} \le N(E|\psi |_{W^n_{p'}}^{q}+E{\mathcal {K}}^{q}_{n,{p'}}(T)) \end{aligned}$$
(6.5)

for any \(r>1\) and with \(N=N(m,p,q,d,M,K,T)\) not depending on \(r\).
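The ‘basic properties of convolution’ invoked here include that mollification is a contraction on Sobolev spaces: since derivatives commute with convolution and \(|\zeta _{\varepsilon }|_{L_1}=1\), Young’s inequality gives

$$\begin{aligned} |\psi ^{(\varepsilon )}|_{W^n_{p'}} =|\psi *\zeta _{\varepsilon }|_{W^n_{p'}} \le |\psi |_{W^n_{p'}}|\zeta _{\varepsilon }|_{L_1} =|\psi |_{W^n_{p'}}, \end{aligned}$$

and similarly for \(f^{(\varepsilon )}\) and \(g^{(\varepsilon )}\), so the right-hand side of (6.5) is indeed free of \(\varepsilon \).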

For integers \(n\ge 0\), and any \(r,q\in (1,\infty )\) let \({\mathbb {H}}^n_{p,r,q}\) be the space of \({\mathbb {R}}^M\)-valued functions \(v=v_t(x)=(v^i_t(x))_{i=1}^M\) on \(\Omega \times [0,T]\times {\mathbb {R}}^d\) such that \(v=(v_t(\cdot ))_{t\in [0,T]}\) is a \(W^n_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued predictable process and

$$\begin{aligned} |v|^q_{{\mathbb {H}}^n_{p,r,q}}=E\left( \int _0^T|v_t|^r_{W^n_p}\,dt\right) ^{q/r}<\infty . \end{aligned}$$

Then \({\mathbb {H}}^n_{p,r,q}\) with the norm defined above is a reflexive Banach space for each \(n\ge 0\) and \(p,r,q\in (1,\infty )\). We use the notation \({\mathbb {H}}^n_{p,q}\) for \({\mathbb {H}}^n_{p,q,q}\).

By Assumption 3.2 the right-hand side of (6.5) is finite for \(p'=p\), and also for \(p'=2\) since \(\psi \), \(f\) and \(g\) vanish for \(|x|\ge R\). Thus there exists a sequence \((\varepsilon _k)_{k\in \mathbb {N}}\) such that \(\varepsilon _k\rightarrow 0\) and for \(p'=p,2\) and integers \(r>1\) and \(n\in [0,m]\) the sequence \(v^k:=u^{ \varepsilon _k }\) converges weakly in \({\mathbb {H}}^n_{p',r,q}\) to some \(v\in {\mathbb {H}}^m_{p',r,q}\), which therefore also satisfies

$$\begin{aligned} E\left( \int _0^T|v_t|_{W^n_{p'}}^r\,dt\right) ^{q/r} \le N(E|\psi |_{W^n_{p'}}^{q}+E{\mathcal {K}}^{q}_{n,p'}(T)) \end{aligned}$$

for \(p'=p,2\) and integers \(r>1\). Using this with \(p'=p\) and letting \(r\rightarrow \infty \) by Fatou’s lemma we obtain

$$\begin{aligned} E \mathop {\hbox {ess sup}}\limits _{t\in [0,T]}|v_t|^q_{W^n_{p}} \le N(E|\psi |_{W^n_{p}}^{q} +E{\mathcal {K}}^{q}_{n,p}(T))\quad \text {for}\, n=0,1,\ldots ,m. \end{aligned}$$
(6.6)
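The last step combines Fatou’s lemma with the elementary fact that on the finite interval \([0,T]\)

$$\begin{aligned} \lim _{r\rightarrow \infty }\left( \int _0^T|v_t|_{W^n_{p}}^r\,dt\right) ^{1/r} =\mathop {\hbox {ess sup}}\limits _{t\in [0,T]}|v_t|_{W^n_{p}}, \end{aligned}$$

so that \(E\mathop {\hbox {ess sup}}_{t\in [0,T]}|v_t|^q_{W^n_p} \le \liminf _{r\rightarrow \infty } E\left( \int _0^T|v_t|^r_{W^n_p}\,dt\right) ^{q/r}\).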

Now we are going to show that a suitable stochastic modification of \(v\) is a solution of (3.1)-(3.2). To this end we fix an \({\mathbb {R}}^M\)-valued function \(\varphi \) in \(C_0^{\infty }({\mathbb {R}}^d)\) and a predictable real-valued process \((\eta _t)_{t\in [0,T]}\), which is bounded by some constant \(C\), and define the functionals \(\Phi \), \(\Phi _k\), \(\Psi \) and \(\Psi _k\) over \({\mathbb {H}}_{p,q}^{1}\) by

$$\begin{aligned} \Phi _k(u)&= E\int _0^T\eta _t \int _0^t\left\{ -(a^{ \varepsilon _k ij}_sD_iu_s,D_j\varphi ) +(\bar{b}^{ \varepsilon _k i}_sD_iu_s+c^{(\varepsilon _k)}_su_s,\varphi )\right\} \,ds\,dt,\\ \Phi (u)&= E\int _0^T\eta _t \int _0^t\left\{ -(a^{ij}_sD_iu_s,D_j\varphi )+(\bar{b}^{i}_sD_iu_s+c_su_s,\varphi )\right\} \,ds\,dt, \\ \Psi (u)&= E\int _0^T\eta _t\int _0^t(\sigma ^{ir}_{s}D_{i}u_{s}+\nu ^{r}_{s}u_{s}, \varphi )\,dw^{r}_{s}\,dt, \\ \Psi _k(u)&= E\int _0^T\eta _t\int _0^t\left( \sigma ^{(\varepsilon _k)ir}_{s}D_{i}u_{s} +\nu ^{(\varepsilon _k)r}_{s}u_{s}, \varphi \right) \,dw^{r}_{s}\,dt \end{aligned}$$

for \(u\in {\mathbb {H}}^1_{p,q}\) for each \(k\ge 1\), where \(\bar{b}^{ \varepsilon i}=b^{(\varepsilon )i}-D_ja^{ \varepsilon ij}I_{M}\). By the Bunyakovsky-Cauchy-Schwarz and the Burkholder-Davis-Gundy inequalities for all \(u\in {\mathbb {H}}^1_{p,q}\) we have

$$\begin{aligned} \Phi (u)&\le CNT^{2-1/q}|u|_{{\mathbb {H}}^1_{p,q}}|\varphi |_{W^1_{\bar{p}}},\\ \Psi (u)&\le CTE\sup _{t\le T}\left| \int _0^t (\sigma ^{ir}_{s}D_{i}u_{s}+\nu ^{r}_{s}u_{s}, \varphi )\,dw^{r}_{s}\right| \\&\le 3CT E\left\{ \int _0^T\sum \limits _{r=1}^{\infty } (\sigma ^{ir}_{t}D_{i}u_{t}+\nu ^{r}_{t}u_{t},\varphi )^2\,dt \right\} ^{1/2}\\&\le 3CTE\left\{ \int _0^T\left( \int _{{\mathbb {R}}^d} |\langle \sigma ^{ir}_{t}D_{i}u_{t}+\nu ^{r}_{t}u_{t},\varphi \rangle |_{l_2}\,dx\right) ^2\,dt \right\} ^{1/2}\\&\le CTNE\left\{ \int _0^T|u_t|^2_{W^1_{p}}|\varphi |^2_{W^1_{\bar{p}}}\,dt\right\} ^{1/2} \le CNT^{q/2}|u|_{{\mathbb {H}}^1_{p,q}}|\varphi |_{W^1_{\bar{p}}} \end{aligned}$$

with a constant \(N=N(K,d,M)\), where \(\bar{p}=p/(p-1)\). (In the last inequality we make use of the assumption \(q\ge 2\).) Consequently, \(\Phi \) and \(\Psi \) are continuous linear functionals over \({\mathbb {H}}^1_{p,q}\), and therefore

$$\begin{aligned} \lim _{k\rightarrow \infty }\Phi (v^{k})=\Phi (v), \quad \lim _{k\rightarrow \infty }\Psi (v^{k})=\Psi (v). \end{aligned}$$
(6.7)

Using statement (i) of Lemma 6.1, we get

$$\begin{aligned} |\Phi _k(u)-\Phi (u)|+ |\Psi _k(u)-\Psi (u)| \le N\varepsilon _k|u|_{{\mathbb {H}}^1_{p,q}}|\varphi |_{W^1_{\bar{p}}} \end{aligned}$$
(6.8)

for all \(u\in {\mathbb {H}}^1_{p,q}\) with a constant \(N=N(K,d,M)\). Since \(u^{\varepsilon _k}\) is the solution of (6.2)-(6.3), we have

$$\begin{aligned} E\int _0^T\eta _t(v_t^{k},\varphi )\,dt =E\int _0^T\eta _t(\psi ^{(\varepsilon _k)},\varphi )\,dt +\Phi _k(v^{k})+\Psi _k(v^{k}) +F(f^{(\varepsilon _k)})+G(g^{(\varepsilon _k)}) \end{aligned}$$
(6.9)

for each \(k\), where

$$\begin{aligned} F(f^{(\varepsilon _k)})&= E\int _0^T\eta _t\int _0^t\left( f_s^{(\varepsilon _k)},\varphi \right) \,ds\,dt, \\ G(g^{(\varepsilon _k)})&= E\int _0^T\eta _t\int _0^t\left( g^{(\varepsilon _k)r}_s,\varphi \right) \,dw_s^r\,dt. \end{aligned}$$

Taking into account that \(|v^{k}|_{{\mathbb {H}}^1_{p,q}}\) is a bounded sequence, from (6.7) and (6.8) we obtain

$$\begin{aligned} \lim _{k\rightarrow \infty }\Phi _{k}(v^{k})=\Phi (v), \quad \lim _{k\rightarrow \infty }\Psi _{k}(v^{k})=\Psi (v). \end{aligned}$$
(6.10)

One can see similarly (in fact more easily) that

$$\begin{aligned} \lim \limits _{k\rightarrow \infty }E\int _0^T\eta _t(v^{k}_t,\varphi )\,dt&= E\int _0^T\eta _t(v_t,\varphi )\,dt, \end{aligned}$$
(6.11)
$$\begin{aligned} \lim \limits _{k\rightarrow \infty } E\int _0^T\eta _t(\psi ^{(\varepsilon _k)},\varphi )\,dt&= E\int _0^T\eta _t(\psi ,\varphi )\,dt, \end{aligned}$$
(6.12)
$$\begin{aligned} \lim \limits _{k\rightarrow \infty }F(f^{(\varepsilon _k)})&= F(f),\quad \lim \limits _{k\rightarrow \infty }G(g^{(\varepsilon _k)})=G(g). \end{aligned}$$
(6.13)

Letting \(k\rightarrow \infty \) in (6.9), and using (6.10) through (6.13) we obtain

$$\begin{aligned}&E\int _0^T\eta _t(v_t,\varphi )\,dt \\&\quad =E\int _0^T\eta _t\Big \{(\psi ,\varphi ) +\int _0^t\big [-(a^{ij}_sD_iv_s,D_j\varphi )+ (\bar{b}^{i}_sD_iv_s+c_sv_s+f_s,\varphi )\big ]\,ds \\&\quad \quad +\int _0^t(\sigma ^{ir}_sD_iv_s+\nu ^r_sv_s+g^r_s,\varphi )\,dw_s^r\Big \}\,dt \end{aligned}$$

for every bounded predictable process \((\eta _t)_{t\in [0,T]}\) and \(\varphi \) from \(C_0^{\infty }\). Hence for each \(\varphi \in C_0^{\infty }\)

$$\begin{aligned} (v_t,\varphi )&= (\psi ,\varphi ) +\int _0^t\big [-(a^{ij}_sD_iv_s,D_j\varphi ) +(\bar{b}^{i}_sD_iv_s+c_sv_s+f_s,\varphi )\big ]\,ds \\&\quad +\int _0^t(\sigma ^{ir}D_iv_s+\nu ^rv_s+g^r_s,\varphi )\,dw_s^r \end{aligned}$$

holds for \(P\times dt\) almost every \((\omega ,t)\in \Omega \times [0,T]\). Substituting here \((-1)^{|\alpha |}D^{\alpha }\varphi \) in place of \(\varphi \) for a multi-index \(\alpha =(\alpha _1,\ldots ,\alpha _d)\) of length \(|\alpha |\le m-1\) and integrating by parts, we see that

$$\begin{aligned} (D^{\alpha }v_t,\varphi ) =(D^{\alpha }\psi ,\varphi ) +\int _0^t\big [-(F^j_s,D_j\varphi )+(F^0_s,\varphi )\big ]\,ds +\int _0^t(G^r_s,\varphi )\,dw_s^r \end{aligned}$$
(6.14)

for \(P\times dt\) almost every \((\omega ,t)\in \Omega \times [0,T]\), where, owing to the fact that (6.6) also holds with \(2\) in place of \(p\), \(F^i\) and \((G^{r})_{r=1}^{\infty }\) are predictable processes with values in \(L_2\)-spaces for \(i=0,1,\ldots ,d\), such that

$$\begin{aligned} \int _0^T\left( \sum \limits _{i=0}^d|F^i_s|^2_{L_2}+|G_s|^2_{L_2} \right) \,ds<\infty \quad \text {(a.s.)}. \end{aligned}$$
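For instance, for \(|\alpha |=0\) one can read off from the preceding identity

$$\begin{aligned} F^j_s=a^{ij}_sD_iv_s,\quad F^0_s=\bar{b}^{i}_sD_iv_s+c_sv_s+f_s,\quad G^r_s=\sigma ^{ir}_sD_iv_s+\nu ^r_sv_s+g^r_s, \end{aligned}$$

and for \(1\le |\alpha |\le m-1\) the analogous expressions arising after the integration by parts.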

Hence the theorem on Itô’s formula from [21] implies that in the equivalence class of \(v\) in \({\mathbb {H}}^m_{2,q}\) there is a \(W^{m-1}_2({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued continuous process \(u=(u_t)_{t\in [0,T]}\), and (6.14) with \(u\) in place of \(v\) holds for any \(\varphi \in C_0^{\infty }({\mathbb {R}}^d)\) almost surely for all \(t\in [0,T]\). After that an application of Lemma 4.3 to \(D^{\alpha }u\) for \(|\alpha |\le m-1\) yields that \(D^{\alpha }u\) is an \(L_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued strongly continuous process for every \(|\alpha |\le m-1\), i.e., \(u\) is a \(W^{m-1}_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued strongly continuous process. This, (6.6), and the denseness of \(C_{0}^{\infty }\) in \(W^{m }_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\) imply that (a.s.) \(u\) is a \(W^{m}_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued weakly continuous process and (3.11) holds.

To prove the theorem without the assumption that \(\psi \), \(f\) and \(g\) have compact support, we take a \(\zeta \in C^{\infty }_{0}({\mathbb {R}}^{d})\) such that \(\zeta (x)=1\) for \(|x|\le 1\) and \(\zeta (x)=0\) for \(|x|\ge 2\), and define \(\zeta _{n}(\cdot )=\zeta (\cdot /n)\) for \(n>0\). Let \(u(n)=(u_t(n))_{t\in [0,T]}\) denote the solution of (3.1)-(3.2) with \(\zeta _{n}\psi \), \(\zeta _{n}f\) and \(\zeta _{n}g\) in place of \(\psi \), \(f\) and \(g\), respectively. By virtue of what we have proved above, \(u(n)\) is a weakly continuous \(W^{m}_{p}({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process, and

$$\begin{aligned}&E\sup _{t\in [0,T]}|u_t(n)-\,u_t(l)|^{q}_{W^m_p} \le N E|(\zeta _{n}-\zeta _{l})\psi |^{q}_{W^m_p} \\&\quad +\, NE \left( \int _{0}^{T}\left\{ |(\zeta _n-\zeta _l)f_s|^p_{W^m_p} +|(\zeta _{n}-\zeta _{l})g_s|_{W^{m+1}_p }^{p }\right\} \,ds \right) ^{q/p}. \end{aligned}$$

Letting \(n,l\rightarrow \infty \) and applying Lebesgue’s theorem on dominated convergence, we see that the right-hand side of the inequality tends to zero. Thus for a subsequence \(n_k\rightarrow \infty \) we have that \(u_t(n_k)\) converges strongly in \(W^m_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\), uniformly in \(t\in [0,T]\), to a process \(u\). Hence \(u\) is a weakly continuous \(W^m_p({\mathbb {R}}^{d},{\mathbb {R}}^{M})\)-valued process. It is easy to show that it solves (3.1)–(3.2) and satisfies (3.11).

By using a standard stopping time argument we can dispense with condition (6.1). Finally, we can prove estimate (3.11) for \(q\in (0,2)\) by applying Lemma 3.2 from [8] in the same way as it is used there to prove the corresponding estimate in the case \(M=1\). The proof of Theorem 3.1 is complete. We have already shown the uniqueness statement of Theorem 3.2; the proof of the other assertions goes in the above way with obvious changes.