1 Introduction

The study of comparison results for stochastic differential equations started with Skorohod [20]. Since then, comparison theorems for solutions of two one-dimensional Itô stochastic ordinary differential equations with the same diffusion coefficients have been intensively studied, and many applications, including stochastic optimal control and tests for explosions, have been presented; see [9–11, 19, 22, 23] and the references therein. It is worth mentioning that Peng and Zhu [18] gave a necessary and sufficient condition for the comparison theorem in this setting. Yan [24] obtained comparison results for equations driven by a general continuous local martingale, a continuous increasing process and a general increasing process, still under the assumption of equal diffusion coefficients. The first comparison theorem for two multi-dimensional Itô stochastic ordinary differential equations with the same diffusion coefficients was proved by Geiß and Manthey [8]; an additional condition, called quasimonotonicity, had to be imposed. It is mainly on the basis of this comparison theorem that Chueshov [5] established the theory of stochastic monotone dynamical systems and investigated the structure of the random attractor and the long-term behavior of certain quasimonotone stochastic ordinary differential equations.

In contrast to the theory of stochastic ordinary differential equations, the theory of stochastic partial differential equations lacks an important tool: a widely applicable Itô formula. One technique to handle the difficulties arising from the missing Itô formula is the comparison technique. Pardoux and his collaborators [3, 7] initiated the study of comparison theorems for parabolic stochastic partial differential equations. Kotelenez [14] was the first to consider a comparison theorem for a wide class of parabolic stochastic partial differential equations with Lipschitz drift and diffusion coefficients. Manthey and Zausinger [15], using a different method, extended Kotelenez's result to drift coefficients of polynomial growth. Applying the method in [15], Assing [2] generalized Manthey and Zausinger's result to systems of parabolic stochastic partial differential equations by approximation.

As far as we know, there are very few comparison theorems for stochastic functional differential equations. The only one we have found was presented in [25] for scalar stochastic functional differential equations; the authors also claimed in their introduction that "so far, there is no result for comparison theorem on stochastic differential delay equation". If one wishes to investigate monotone random dynamical systems generated by stochastic functional differential equations, one has to prove a comparison theorem as in [8]. Besides, such a comparison theorem is of independent interest. Motivated by these considerations, we prove a comparison theorem for stochastic functional differential equations under a quasimonotonicity condition and other regularity conditions. To this end, we first prove a global existence and uniqueness theorem under weaker conditions. This comparison theorem lays the foundation for investigating the deep long-term dynamical behavior of quasimonotone stochastic functional differential equations, which has been thoroughly explored for deterministic functional differential equations in [21].

2 Preliminaries

Let \((\varOmega ,\mathfrak {F},\{\mathfrak {F}_{t}\}_{t\ge 0},\mathbb {P})\) be a complete filtered probability space satisfying the usual conditions. Fix an arbitrary \(\tau >0\) and two positive integers \(d\) and \(r\).

Consider the following two systems of stochastic functional differential equations (SFDEs) in the sense of Itô:

$$\begin{aligned} \left\{ \begin{array}{ll} dx_{i}(t)=f_{i}(t,x(t),x_{t})dt+\sum \limits _{j=1}^{r}\sigma _{ij}(t,x(t))dW_{t}^{j}, \quad i=1,2,\ldots ,d,\\ x(\theta )=\phi (\theta ),\ \theta \in J\triangleq [-\tau , 0],\ \phi \in C(J,\mathbb {R}^{d}) , \end{array}\right. \end{aligned}$$
(2.1)

and

$$\begin{aligned} \left\{ \begin{array}{ll} d\widehat{x}_{i}(t)=\widehat{f}_{i}(t,\widehat{x}(t), \widehat{x}_{t})dt+\sum \limits _{j=1}^{r}\sigma _{ij}(t,\widehat{x}(t))dW_{t}^{j}, \quad i=1,2,\ldots ,d,\\ \widehat{x}(\theta )=\psi (\theta ),\ \theta \in J,\ \psi \in C(J,\mathbb {R}^{d}) \end{array} \right. \end{aligned}$$
(2.2)

where \(x(t)=(x_{1}(t),\ldots ,x_{i}(t),\ldots ,x_{d}(t)), \widehat{x}(t) =(\widehat{x}_{1}(t),\ldots ,\widehat{x}_{i}(t),\ldots ,\widehat{x}_{d}(t)), x_{t}(\theta )=x(t+\theta )\), \( \widehat{x}_{t}(\theta )=\widehat{x}(t+\theta ), \theta \in J=[-\tau , 0], W_{t}(\omega )=(W_{t}^{1}(\omega ),\ldots ,W_{t}^{r}(\omega ))\) is an \(r\)-dimensional \(\{\mathfrak {F}_{t}\}_{t\ge 0}\)-adapted Wiener process with values in \(\mathbb {R}^{r}\), \(\mathcal {C}\triangleq C(J,\mathbb {R}^d)\) is the Banach space of all continuous functions \(\phi :J\rightarrow \mathbb {R}^d\) equipped with the sup-norm \(\Vert \phi \Vert =\sup \{|\phi (s)|:s\in J\}\), and \(|\cdot |\) denotes the Euclidean norm.

We make the following hypotheses:

(H1) The drift terms \(f=(f_{1},\ldots ,f_{d}), \widehat{f}=(\widehat{f}_{1},\ldots , \widehat{f}_{d}):\mathbb {R}_{+}\times \mathbb {R}^d\times \mathcal {C}\rightarrow \mathbb {R}^d\) are continuous, and the inequality \(f_{i}(t,\phi (0),\phi )< \widehat{f}_{i}(t,\psi (0),\psi )\) holds whenever \(t \ge 0\) and \(\phi ,\psi \in \mathcal {C}\) satisfy \(\phi \le _{\mathcal {C}}\psi \) and \(\phi _{i}(0)=\psi _{i}(0)\) for some \(i\); the notation \(\le _{\mathcal {C}}\) is defined in the next section.

(H2) The drift terms \(f\) and \(\widehat{f}\) satisfy a global Lipschitz condition (stated here for \(f\); the same bound is assumed for \(\widehat{f}\)), that is, there exists a constant \(L>0\) such that for each \(i=1,2,\ldots ,d,\)

$$\begin{aligned} |f_{i}(t,x,\phi )-f_{i}(t,x',\psi )|^{2}\le L\big (|x-x'|^{2}+\Vert \phi -\psi \Vert ^{2}\big ) \end{aligned}$$

for all \( t\ge 0, x,x'\in {\mathbb {R}}^{d}\) and \(\phi ,\psi \in \mathcal {C}.\)

(H3) The diffusion term \(\sigma (t,x)=(\sigma _{ij}(t,x)):\mathbb {R}_{+}\times \mathbb {R}^d\rightarrow \mathbb {R}^{d\times r},i=1,2,\ldots ,d,j=1,2,\ldots ,r \) is continuous and there exists a nondecreasing continuous concave function \(\rho :\mathbb {R}_{+}\rightarrow \mathbb {R}_{+}\) with \(\rho (0)=0, \rho (x)>0\) for \(x>0\), and \(\int _{0^{+}}\frac{dx}{\rho (x)}=\infty \) such that for each \(i=1,2,\ldots ,d,\)

$$\begin{aligned} \sum \limits _{j=1}^{r}|\sigma _{ij}(t,x)-\sigma _{ij}(t,x')|^{2}\le \rho \big (|x_{i}-x_{i}'|^{2}\big ), \end{aligned}$$

for all \(t\ge 0, x,x'\in \mathbb {R}^{d}\).

If \(\rho (|x_{i}-x_{i}'|^{2})\) is replaced by \(\rho (|x-x'|^{2})\), then we use (H3\(^*\)) to denote the corresponding hypothesis.

(H4) The drift and diffusion terms satisfy a linear growth condition, that is, there is a constant \(\gamma > 0\) such that \(f=(f_{1},\ldots ,f_{d}):\mathbb {R}_{+}\times \mathbb {R}^d\times \mathcal {C}\rightarrow \mathbb {R}^d\) and \(\sigma (t,x)=(\sigma _{ij}(t,x)):\mathbb {R}_{+}\times \mathbb {R}^d\rightarrow \mathbb {R}^{d\times r},i=1,2,\ldots ,d,j=1,2,\ldots ,r, \) satisfy

$$\begin{aligned}&|f(t,x,\phi )|^{2}\le \gamma \big (1+|x|^{2}+\Vert \phi \Vert ^{2}\big )\ \mathrm{and}\\&|\sigma (t,x)|^{2}\le \gamma \big (1+|x|^{2}\big ) \end{aligned}$$

for all \(t\ge 0, x\in \mathbb {R}^d,\phi \in \mathcal {C}.\)

Theorem 1

Assume that (H2), (H3\(^*\)) and (H4) hold for system (2.1). Then system (2.1) has a strong solution \(x(t,\phi )\) for all \(t>0\), and strong uniqueness holds. Furthermore, \(x_{t}(\phi )\) is a \(\mathcal {C}\)-valued process adapted to \(\{\mathfrak {F}_{t}\}_{t\ge 0}\) with continuous sample paths.

The proof is presented in the Appendix; it extends Mao's technique [16] for backward stochastic differential equations to stochastic functional differential equations.
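Theorem 1 guarantees existence and uniqueness but gives no explicit solution. As a purely numerical illustration of the structure of (2.1) (our own sketch, not part of the paper): strong solutions of SFDEs are commonly approximated by an Euler–Maruyama scheme. The coefficients, step size and initial segment below are illustrative choices satisfying (H2)–(H4), for a scalar equation whose functional dependence is a single point delay, \(f(t,x(t),x_t)=-x(t)+x(t-\tau )\).

```python
import numpy as np

def euler_maruyama_sfde(f, sigma, phi, tau, T, dt, seed=0):
    """Euler-Maruyama scheme for the scalar SFDE
    dx(t) = f(t, x(t), x(t - tau)) dt + sigma(t, x(t)) dW_t,
    with initial segment x(theta) = phi(theta) on [-tau, 0]."""
    rng = np.random.default_rng(seed)
    lag = int(round(tau / dt))           # grid points in one delay length
    n = int(round(T / dt))
    # grid covers [-tau, T]; the first lag + 1 entries hold the history
    x = np.empty(lag + n + 1)
    x[: lag + 1] = [phi(-tau + k * dt) for k in range(lag + 1)]
    for k in range(n):
        t = k * dt
        i = lag + k                      # index of x(t)
        dW = rng.normal(0.0, np.sqrt(dt))
        x[i + 1] = x[i] + f(t, x[i], x[i - lag]) * dt + sigma(t, x[i]) * dW
    return x[lag:]                       # sample path on [0, T]

# Illustrative coefficients: globally Lipschitz with linear growth, and
# sigma Lipschitz in x, so (H2), (H3*) and (H4) hold.
path = euler_maruyama_sfde(
    f=lambda t, x, x_lag: -x + x_lag,
    sigma=lambda t, x: 0.2 * x,
    phi=lambda theta: 1.0 + theta,       # continuous initial segment on [-tau, 0]
    tau=1.0, T=5.0, dt=0.01,
)
print(len(path), np.isfinite(path).all())
```

The scheme only illustrates the role of the segment \(x_t\); convergence analysis of such discretizations is outside the scope of this paper.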

Now we review the generalized Gronwall inequality of [1], which will be useful in the subsequent sections. Consider the following inequality

$$\begin{aligned} u(t)\le a(t)+\int _{0}^{t}\lambda (t,s)\eta (u(s))ds,\quad 0\le t\le t_{1}, \end{aligned}$$
(2.3)

which satisfies the following properties:

(S1)   \(\eta \) is a continuous, nondecreasing function on \([0,\infty )\) and is positive on \((0,\infty )\);

(S2)   \(a(t)\) is continuously differentiable in \(t\) and nonnegative on \([0,t_{1}]\), where \(t_{1}>0\) is a constant;

(S3)   \(\lambda (t,s)\) is a continuous, nonnegative function on \([0,t_{1}]\times [0,t_{1}]\).

Theorem 2

Suppose that (S1)–(S3) hold and \(u(t)\) is a continuous and nonnegative function on \([0,t_{1}]\) satisfying (2.3). Then

$$\begin{aligned} u(t)\le W^{-1}\Big [W(r(t))+\int _{0}^{t}\max \limits _{0\le \chi \le t}\lambda (\chi ,s)ds\Big ],\quad 0\le t\le t_{c}, \end{aligned}$$
(2.4)

where \(W(u)\triangleq \int _{\widehat{u}}^{u}\frac{dz}{\eta (z)}\), \(\widehat{u}>0\) is a constant, and \(r(t)\) is determined by

$$\begin{aligned} r(t)\triangleq a(0)+\int _{0}^{t}|a'(s)|ds, \end{aligned}$$

\(t_{c}\le t_{1}\) is the largest number such that

$$\begin{aligned} W(r(t_{c}))+\int _{0}^{t_{c}}\max \limits _{0\le \chi \le t_{c}}\lambda (\chi ,s)ds\le \int _{\widehat{u}}^{\infty }\frac{dz}{\eta (z)}. \end{aligned}$$

The proof can be found in [1].
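As a sanity check of Theorem 2 (our own, not taken from [1]): for \(\eta (z)=z\), \(\lambda (t,s)\equiv \lambda \) and \(a(t)\equiv a_{0}>0\), one has \(W(u)=\ln (u/\widehat{u})\), \(W^{-1}(v)=\widehat{u}e^{v}\) and \(r(t)=a_{0}\), so (2.4) reduces to the classical Gronwall bound \(u(t)\le a_{0}e^{\lambda t}\). The snippet below verifies this reduction numerically; the constants are arbitrary illustrative choices.

```python
import numpy as np

# Specialize Theorem 2 to eta(z) = z, lambda(t, s) = lam, a(t) = a0.
a0, lam, u_hat = 2.0, 0.5, 1.0

def W(u):
    # W(u) = integral from u_hat to u of dz / eta(z) with eta(z) = z
    return np.log(u / u_hat)

def W_inv(v):
    # inverse of W
    return u_hat * np.exp(v)

for t in np.linspace(0.0, 3.0, 7):
    bound = W_inv(W(a0) + lam * t)       # right-hand side of (2.4)
    classical = a0 * np.exp(lam * t)     # classical exponential Gronwall bound
    assert np.isclose(bound, classical)
print("ok")
```

Note also that here \(\int _{\widehat{u}}^{\infty }\frac{dz}{\eta (z)}=\infty \), so \(t_{c}=t_{1}\) and (2.4) holds on the whole interval.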

We apply Theorem 2 to the following inequality:

$$\begin{aligned} u(t)\le \int _{0}^{t}\lambda _{1}\rho (u(s))ds+ \int _{0}^{t} \lambda _{2} u(s)ds,\ t\ge 0. \end{aligned}$$
(2.5)

Corollary 3

If \(\rho \) satisfies the hypothesis (H3), \(\lambda _{1}\) and \(\lambda _{2}\) are two positive constants and \(u(t)\) is a continuous and nonnegative function on \([0,\infty )\) satisfying (2.5), then \(u(t)=0,t\ge 0.\)

Proof

Let \( t_{1}>0\) be arbitrary and consider the following inequality:

$$\begin{aligned} u(t)\le \int _{0}^{t}\lambda _{1}\rho (u(s))ds+\int _{0}^{t}\lambda _{2} u(s)ds,\quad 0\le t\le t_{1}. \end{aligned}$$

Let \(\lambda =\max \{\lambda _{1},\lambda _{2}\}\) and \(\varrho (u)=u+\rho (u)\). Then we have

$$\begin{aligned} u(t)\le \int _{0}^{t}\lambda \varrho (u(s))ds,\quad 0\le t\le t_{1}. \end{aligned}$$

Applying Theorem 2 with \(W(u)=\int _{\widehat{u}}^{u}\frac{dz}{\varrho (z)}\) and \(r(t)\equiv 0\), we obtain

$$\begin{aligned} u(t)\le W^{-1}\Big [W(r(t))+\int _{0}^{t}\lambda ds\Big ],\quad 0\le t\le t_{c}, \end{aligned}$$
(2.6)

where \(t_{c}\) is the largest number such that

$$\begin{aligned} W(r(t_{c}))+\int _{0}^{t_{c}}\lambda ds\le \int _{\widehat{u}}^{\infty }\frac{dz}{\varrho (z)}. \end{aligned}$$
(2.7)

Since \(\rho \) is a concave function with \(\rho (0)=0\), we have

$$\begin{aligned} \rho (z)\ge \rho (1)z,\quad \mathrm{{for}} \quad 0\le z \le 1. \end{aligned}$$

So

$$\begin{aligned} \int _{0^{+}}\frac{dz}{\varrho (z)}=\int _{0^{+}}\frac{dz}{z+\rho (z)}\ge \frac{\rho (1)}{\rho (1)+1}\int _{0^{+}}\frac{dz}{\rho (z)}=\infty , \end{aligned}$$
(2.8)

which implies that (2.7) is true for \(t_c = t_1\). Therefore, \( u(t) = 0\) for \(0\le t\le t_{1}\) by (2.6). Since \(t_{1}\) is arbitrarily chosen, we have \( u(t)= 0,\ t\ge 0.\) The proof is complete. \(\square \)
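The concavity step behind (2.8) is \(\rho (z)\ge \rho (1)z\) for \(0\le z\le 1\), which holds for any concave \(\rho \) with \(\rho (0)=0\) because \(\rho (z)=\rho (z\cdot 1+(1-z)\cdot 0)\ge z\rho (1)\). The following snippet (our own check, with the illustrative Osgood-type modulus \(\rho (z)=z(1-\ln z)\), which is not from the text) verifies this inequality and the divergence \(\int _{0^{+}}\frac{dz}{\rho (z)}=\infty \) required by (H3).

```python
import numpy as np

# Illustrative modulus: rho(z) = z * (1 - ln z) on (0, 1], rho(0) = 0.
# It is concave and increasing on (0, 1] (rho' = -ln z > 0, rho'' = -1/z < 0)
# and satisfies the Osgood condition int_{0+} dz / rho(z) = +infinity.
def rho(z):
    z = np.asarray(z, dtype=float)
    return np.where(z > 0, z * (1.0 - np.log(np.maximum(z, 1e-300))), 0.0)

z = np.linspace(1e-6, 1.0, 10_000)

# Concavity step used in (2.8): rho(z) >= rho(1) * z for 0 <= z <= 1.
assert np.all(rho(z) >= rho(1.0) * z - 1e-12)

# Divergence of int_{0+} dz/rho(z): substituting u = 1 - ln z gives
# int_[eps, 1] dz/rho(z) = ln(1 - ln eps), which blows up as eps -> 0+.
for eps in (1e-2, 1e-4, 1e-8):
    print(f"int over [{eps}, 1] of dz/rho(z) = {np.log(1.0 - np.log(eps)):.3f}")
```

Any other concave Osgood modulus (e.g. \(\rho (z)=z\)) passes the same check; \(\rho (z)=\sqrt{z}\), by contrast, is concave but fails the Osgood condition.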

Before finishing this section, we provide a criterion for stopping times.

Proposition 4

Suppose that \(\varsigma \) is an \(\mathfrak {F}_{t}\)-stopping time and \(Q\subset \mathbb {R}_+\times \varOmega \) is a progressively measurable set. Then

$$\begin{aligned} D_Q(\omega ) \triangleq \inf \{s\mid s > \varsigma (\omega ), (s,\omega )\in Q \} \end{aligned}$$

is an \(\mathfrak {F}_{t}-\)stopping time.

Proof

Let

$$\begin{aligned} \widetilde{Q}\triangleq ((\varsigma ,\infty ))\triangleq \{(s,\omega )\in \mathbb {R}_+\times \varOmega \mid \varsigma (\omega ) < s\}. \end{aligned}$$

We first prove that \(\widetilde{Q}\) is a predictable set. It suffices to show that \(I_{\widetilde{Q}}\) is an \(\mathfrak {F}_{t}\)-adapted left-continuous process. It is easy to check that, for fixed \(\omega \), the sample path \(t\mapsto I_{\widetilde{Q}}(t,\omega )\) is left-continuous. Since \(I_{\widetilde{Q}}(t,\omega ) = I_{\{\varsigma <t\}}(\omega )\) and the filtration \(\{\mathfrak {F}_{t}\}_{t\ge 0}\) is right-continuous, \(I_{\widetilde{Q}}\) is adapted to \(\{\mathfrak {F}_{t}\}_{t\ge 0}\).

By the definition of \(\widetilde{Q}\), it is easy to see that

$$\begin{aligned} D_Q(\omega ) = \inf \{s\mid (s,\omega )\in Q\cap \widetilde{Q} \}. \end{aligned}$$

A direct computation shows that, for a fixed \(t\),

$$\begin{aligned} \{\omega \in \varOmega \mid D_Q(\omega ) < t\} = \pi (Q\cap \widetilde{Q}\cap ((0,t)\times \varOmega )) \end{aligned}$$
(2.9)

where

$$\begin{aligned} \pi : \mathbb {R}_+\times \varOmega \rightarrow \varOmega \end{aligned}$$

is the projection mapping.

Since \(Q\) is progressively measurable and \(\widetilde{Q}\) is predictable, \(Q\cap \widetilde{Q}\) is progressively measurable, which implies that \(Q\cap \widetilde{Q}\cap ([0,t]\times \varOmega ) \in \mathcal {B}([0,t])\times \mathfrak {F}_{t}.\) Thus,

$$\begin{aligned} Q\cap \widetilde{Q}\cap ((0,t)\times \varOmega ) = (Q\cap \widetilde{Q}\cap ([0,t]\times \varOmega ))\bigcap ((0,t)\times \varOmega ) \in \mathcal {B}([0,t])\times \mathfrak {F}_{t}. \end{aligned}$$

It follows from (2.9) and the projection theorem that \(\{\omega \in \varOmega \mid D_Q(\omega ) < t\}\in \mathfrak {F}_{t}\). This completes the proof. \(\square \)

Corollary 5

Suppose that \(F(t,\omega ): \mathbb {R}_+\times \varOmega \rightarrow \mathbb {R}^d\) is a progressively measurable process and \(\varsigma \) is an \(\mathfrak {F}_{t}-\)stopping time. If \(E\in \mathcal {B}(\mathbb {R}^d)\), then

$$\begin{aligned} D_E(\omega ) \triangleq \inf \{s\mid s > \varsigma (\omega ), F(s,\omega )\in E \} \end{aligned}$$

is an \(\mathfrak {F}_{t}-\)stopping time.

Proof

Let \(Q = F^{-1}(E)\). Then \(Q\) is progressively measurable. The conclusion follows immediately from Proposition 4. \(\square \)

3 Comparison Theorems for SFDEs

To obtain the comparison results for SFDEs, we need partial orders on \(\mathbb {R}^{d}\) and \(\mathcal {C}\). The positive cone in \(\mathbb {R}^{d}\), denoted by \(\mathbb {R}_{+}^{d}\), is the set of all \(d\)-tuples with nonnegative coordinates. It gives rise to a partial order on \(\mathbb {R}^{d}\) in the following way:

$$\begin{aligned}&x\le y\Longleftrightarrow x_{i}\le y_{i},\quad \mathrm{for} \quad i=1,\ldots ,d,\\&x<y\Longleftrightarrow x\le y \qquad \mathrm{and} \ x_{i}< y_{i},\quad \mathrm{for}\ \mathrm{some} \ i\in \{1,\ldots ,d\},\\&x\ll y\Longleftrightarrow x_{i}<y_{i},\quad \mathrm{for} \ i=1,\ldots ,d. \end{aligned}$$

Let

$$\begin{aligned} \mathcal {C}_{+}=\{\phi \in \mathcal {C}:\phi (s)\ge 0,\quad s\in J\}. \end{aligned}$$

Then \(\mathcal {C}_{+}\) is a positive cone of the Banach space \(\mathcal {C}\). Hence the partial order on \(\mathcal {C}\) is given as follows:

$$\begin{aligned}&\phi \le _{\mathcal {C}} \psi \Longleftrightarrow \phi (s)\le \psi (s),\quad s\in J,\\&\phi <_{\mathcal {C}}\psi \Longleftrightarrow \phi \le _{\mathcal {C}}\psi \ \mathrm{and}\ \phi \ne \psi ,\\&\phi \ll _{\mathcal {C}}\psi \Longleftrightarrow \phi (s)\ll \psi (s),\quad s\in J. \end{aligned}$$

Now we present our first comparison result.

Theorem 6

Suppose that the drift terms \(f, \widehat{f}\) and the diffusion term \(\sigma (t,x)\) satisfy the hypotheses (H1)–(H4). If \(\phi ,\psi \in \mathcal {C}\) satisfy \(\phi \le _{\mathcal {C}}\psi \), then \(\mathbb {P}(\{x(t,\phi )\le \widehat{x}(t,\psi ),t\ge 0\})=1\) and hence \(\mathbb {P}(\{x_{t}(\phi )\le _{\mathcal {C}} \widehat{x}_{t}(\psi ),t\ge 0\})=1.\)

Proof

Let \(\widehat{X}(t)=\widehat{x}(t,\psi )\) and \(X(t)=x(t,\phi ).\) Then we have

$$\begin{aligned} d\widehat{X}_{i}(t)&= \widehat{f}_{i}\left( t,\widehat{X}(t), \widehat{X}_{t}\right) dt+\sum \limits _{j=1}^{r}\sigma _{ij}\left( t, \widehat{X}(t)\right) dW^{j}_{t},\quad i=1,2,\ldots ,d,\\ \widehat{X}(t)&= \psi (t),\quad t\in [-\tau ,0] \end{aligned}$$

and

$$\begin{aligned} dX_{i}(t)&= f_{i}(t,X(t),X_{t})dt+\sum \limits _{j=1}^{r} \sigma _{ij}(t,X(t))dW^{j}_{t},\quad i=1,2,\ldots ,d,\\ X(t)&= \phi (t),\quad t\in [-\tau ,0]. \end{aligned}$$

By Theorem 1, \(X(t)\) and \(\widehat{X}(t)\) are \(\mathfrak {F}_{t}-\)adapted continuous processes.

Set \(Y_{i}(t)=\widehat{X}_{i}(t)-X_{i}(t),\ i=1,2,\ldots ,d.\) Then we have

$$\begin{aligned} dY_{i}(t)&= \Big [\widehat{f}_{i}(t,\widehat{X}(t), \widehat{X}_{t})-f_{i}(t,X(t),X_{t})\Big ]dt\\&\quad + \sum \limits _{j=1}^{r}(\sigma _{ij}(t, \widehat{X}(t))-\sigma _{ij}(t,X(t)))dW^{j}_{t},\quad i=1,2,\ldots ,d. \end{aligned}$$

Now we introduce the following function which was first presented in [4]:

$$\begin{aligned} \varphi _{\epsilon }(y)=\left\{ \begin{array}{lll} y^{2}, &{} \quad \quad y\le 0,\\ y^{2}-\frac{y^{3}}{6\epsilon }, &{} \quad \quad 0<y\le 2\epsilon ,\\ 2\epsilon y-\frac{4}{3}\epsilon ^{2},&{} \quad \quad y>2\epsilon . \end{array} \right. \end{aligned}$$

It is easy to see that \(\varphi _{\epsilon }\in C^{2}(\mathbb {R})\) and that, as \(\epsilon \rightarrow 0\), \(\varphi _{\epsilon }'(y)\rightarrow 2y^{-}\) uniformly in \(y\), \(\varphi _{\epsilon }''(y)\rightarrow 2I_{(y\le 0)}\), and \(\varphi _{\epsilon }(y)\rightarrow |y^{-}|^{2}\), where \(y^{-}=y\wedge 0\).
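The stated properties of \(\varphi _{\epsilon }\) can be checked directly; the snippet below (our own verification, not part of the proof) confirms the \(C^{2}\)-matching of the three branches at the junction points \(y=0\) and \(y=2\epsilon \), and the uniform convergence \(\varphi _{\epsilon }(y)\rightarrow |y^{-}|^{2}\), \(\varphi _{\epsilon }'(y)\rightarrow 2y^{-}\).

```python
import numpy as np

def phi(y, eps):
    """The smoothing function phi_eps from [4], branch by branch."""
    return np.where(y <= 0, y**2,
           np.where(y <= 2*eps, y**2 - y**3/(6*eps),
                    2*eps*y - (4.0/3.0)*eps**2))

def dphi(y, eps):
    """First derivative of phi_eps."""
    return np.where(y <= 0, 2*y,
           np.where(y <= 2*eps, 2*y - y**2/(2*eps), 2*eps))

def d2phi(y, eps):
    """Second derivative of phi_eps."""
    return np.where(y <= 0, 2.0,
           np.where(y <= 2*eps, 2.0 - y/eps, 0.0))

eps = 0.1
# C^2 matching: phi, phi', phi'' agree across the junctions y = 0 and y = 2*eps.
for g in (phi, dphi, d2phi):
    for y0 in (0.0, 2*eps):
        left = g(np.array(y0 - 1e-9), eps)
        right = g(np.array(y0 + 1e-9), eps)
        assert abs(left - right) < 1e-6

# Uniform convergence on [-1, 1]: |phi_eps(y) - |y^-|^2| and
# |phi_eps'(y) - 2 y^-| are both O(eps).
y = np.linspace(-1.0, 1.0, 2001)
for e in (1e-2, 1e-4):
    assert np.max(np.abs(phi(y, e) - np.minimum(y, 0.0)**2)) <= 2*e + 1e-12
    assert np.max(np.abs(dphi(y, e) - 2*np.minimum(y, 0.0))) <= 2*e + 1e-12
print("ok")
```

The convergence \(\varphi _{\epsilon }''(y)\rightarrow 2I_{(y\le 0)}\) is only pointwise (it fails to be uniform near \(y=0\)), which is why it is not tested uniformly above.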

Define the stopping times

$$\begin{aligned} \varGamma _{N}\triangleq \inf&\Big \{t>0:|X(t)|>N,|\widehat{X}(t)|>N\Big \}\wedge N,\quad \text {for every}\quad N>0,\nonumber \\&\varLambda _{i}\triangleq \inf \Big \{t>0: X_{i}(t)>\widehat{X}_{i}(t)\Big \},\quad i=1,2,\ldots ,d, \end{aligned}$$
(3.1)

and

$$\begin{aligned} \varLambda =\varLambda _{1}\wedge \cdots \wedge \varLambda _{d}. \end{aligned}$$

Obviously \(\varGamma _{N}\uparrow \infty \) as \(N \uparrow \infty \), and either \(\varLambda _i = +\infty \) or

$$\begin{aligned} X_{i}(\varLambda _{i})=\widehat{X}_{i}(\varLambda _{i})\ \text {and}\ X_{i}(t)\le \widehat{X}_{i}(t),\quad \text {for}\quad 0 \le t\le \varLambda _{i},\quad i=1,\ldots ,d. \end{aligned}$$
(3.2)

In order to verify the conclusion, we have to prove that

$$\begin{aligned} \mathbb {P}(\{\varLambda < \infty \}) = 0. \end{aligned}$$

Since \(\varGamma _{N}\uparrow \infty \), it suffices to show that \(\mathbb {P}(\{\varLambda <\varGamma _{N}\})=0\) for every \(N >0\).

For \(i=1,2,\ldots ,d,\) let

$$\begin{aligned}&\vartheta _{i}\triangleq \inf \left\{ t>\varLambda _{i}:Z_i(t,\omega )\triangleq f_{i}(t,X(t),X_{t})-\widehat{f}_{i}(t,\widehat{X}_{1}(t), \ldots ,\widehat{X}_{i-1}(t),\widehat{X}_{i}(t)-Y^{-}_{i}(t),\nonumber \right. \\&\left. \widehat{X}_{i+1}(t),\ldots ,\widehat{X}_{d}(t),\widehat{X}_{1,t},\ldots , \widehat{X}_{i-1,t},\widehat{X}_{i,t}-\widetilde{Y}^{-}_{i,t}, \widehat{X}_{i+1,t},\ldots ,\widehat{X}_{d,t})>0\right\} , \end{aligned}$$
(3.3)

where \(Y^{-}_{i}(t)=Y_{i}(t)\wedge 0\) and \(\widetilde{Y}^{-}_{i,t}(\theta )=Y^{-}_{i}(t),\ -\tau \le \theta \le 0\), i.e. \(\widetilde{Y}^{-}_{i,t}\) is the constant segment with value \(Y^{-}_{i}(t)\). It is easy to see that, for fixed \(\omega \), the sample path of \(Z_i(t,\omega )\) is continuous. By the monotone class theorem, one can verify that \(Z_i(t,\omega )\) is \(\mathfrak {F}_{t}\)-adapted. Applying Corollary 5, we conclude that \(\vartheta _{i}\) is an \(\mathfrak {F}_{t}\)-stopping time.

We claim that

$$\begin{aligned} \varLambda _{i}<\vartheta _{i}\ \text {on}\ \{\varLambda _{i}=\varLambda <\infty \},\quad i=1,2,\ldots ,d. \end{aligned}$$

By the definition of \(\vartheta _{i}\), the inequality \(\varLambda _{i}\le \vartheta _{i}\) holds for every \(i=1,2,\ldots ,d\). Again by the definition of \(\vartheta _{i}\), together with the continuity of \(f_{i}\) and \(\widehat{f}_{i}\), \(i=1,2,\ldots ,d\), and the pathwise continuity of \(X\) and \(\widehat{X}\), we have

$$\begin{aligned}&f_{i}(\vartheta _{i},X(\vartheta _{i}),X_{\vartheta _{i}})- \widehat{f}_{i}(\vartheta _{i},\widehat{X}_{1}(\vartheta _{i}), \ldots ,\widehat{X}_{i-1}(\vartheta _{i}),\widehat{X}_{i}(\vartheta _{i})- Y^{-}_{i}(\vartheta _{i}),\widehat{X}_{i+1}(\vartheta _{i}),\nonumber \\&\ldots ,\widehat{X}_{d}(\vartheta _{i}),\widehat{X}_{1,\vartheta _{i}}, \ldots ,\widehat{X}_{i-1,\vartheta _{i}},\widehat{X}_{i,\vartheta _{i}}- \widetilde{Y}^{-}_{i,\vartheta _{i}},\widehat{X}_{i+1,\vartheta _{i}} ,\ldots ,\widehat{X}_{d,\vartheta _{i}})\ge 0. \end{aligned}$$
(3.4)

In order to prove this claim, let us assume the contrary. Then \(\vartheta _{i}=\varLambda _{i}\) on \(\{\varLambda _{i}=\varLambda <\infty \}.\) Since \(X_{i}(\varLambda _{i})=\widehat{X}_{i}(\varLambda _{i})\), we have \(Y^{-}_{i}(\vartheta _{i})=0,\widetilde{Y}^{-}_{i,\vartheta _{i}} (\theta )=0,-\tau \le \theta \le 0\). In view of (3.2) and the hypothesis (H1), it is easy to see that

$$\begin{aligned}&f_{i}(\vartheta _{i},X(\vartheta _{i}),X_{\vartheta _{i}})- \widehat{f}_{i}(\vartheta _{i},\widehat{X}_{1}(\vartheta _{i}), \ldots ,\widehat{X}_{i-1}(\vartheta _{i}),\widehat{X}_{i}(\vartheta _{i})- Y^{-}_{i}(\vartheta _{i}),\\&\widehat{X}_{i+1}(\vartheta _{i}),\ldots ,\widehat{X}_{d}(\vartheta _{i}), \widehat{X}_{1,\vartheta _{i}},\ldots ,\widehat{X}_{i-1,\vartheta _{i}}, \widehat{X}_{i,\vartheta _{i}}-\widetilde{Y}^{-}_{i,\vartheta _{i}}, \widehat{X}_{i+1,\vartheta _{i}},\ldots ,\widehat{X}_{d,\vartheta _{i}})<0 \end{aligned}$$

on \(\{\varLambda _{i}=\varLambda <\infty \}\), which contradicts (3.4). Hence this claim holds.

By (3.3) and the above claim, it can be seen that for all \(s\in [\varLambda _{i},\vartheta _{i}]\)

$$\begin{aligned}&f_{i}(s,X(s),X_{s})-\widehat{f}_{i}(s,\widehat{X}_{1}(s), \ldots ,\widehat{X}_{i-1}(s),\widehat{X}_{i}(s)-Y^{-}_{i}(s), \widehat{X}_{i+1}(s),\nonumber \\&\ldots ,\widehat{X}_{d}(s),\widehat{X}_{1,s},\ldots ,\widehat{X}_{i-1,s}, \widehat{X}_{i,s}-\widetilde{Y}^{-}_{i,s},\widehat{X}_{i+1,s} \ldots ,\widehat{X}_{d,s})\le 0 \end{aligned}$$
(3.5)

on \(\{\varLambda _{i}=\varLambda <\infty \}\).

Now our purpose is to prove that \(\mathbb {P}(\{\varLambda <\varGamma _{N}\})=0\) for every \(N >0\). To this end, we assume, to the contrary, that

$$\begin{aligned} \mathbb {P}(\{\varLambda <\varGamma _{N}\})>0 \end{aligned}$$

for some \(N\). It follows that there exists an \(i\in \{1,2,\ldots ,d\}\) such that

$$\begin{aligned} \mathbb {P}(A)>0 \end{aligned}$$

where \(A=\{\varLambda _{i}=\varLambda <\varGamma _{N}\}\).

Since \(Y_{i}(t),\ i=1,2,\ldots ,d,\) is a continuous semimartingale ([17]), applying the Itô formula, we have

$$\begin{aligned}&\varphi _{\epsilon }(Y_{i}((\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N}))\nonumber \\&\quad =\varphi _{\epsilon }(Y_{i}(\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}))\nonumber \\&\qquad +\int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}\varphi _{\epsilon }'(Y_{i}(s)) [\widehat{f}_{i}(s,\widehat{X}(s),\widehat{X}_{s})-f_{i}(s,X(s),X_{s})]ds\nonumber \\&\qquad +\int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}\varphi _{\epsilon }'(Y_{i}(s)) \sum \limits _{j=1}^{r}(\sigma _{ij}(s,\widehat{X}(s))- \sigma _{ij}(s,X(s)))dW^{j}_{s}\nonumber \\&\qquad +\int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}\frac{1}{2}\varphi _{\epsilon }''(Y_{i}(s)) \sum \limits _{j=1}^{r}(\sigma _{ij}(s,\widehat{X}(s))- \sigma _{ij}(s,X(s)))^{2}ds\nonumber \\&\quad \triangleq \varDelta _{1}+\varDelta _{2}+\varDelta _{3}+\varDelta _{4}. \end{aligned}$$
(3.6)

Note that \(I_{A}\) is \(\mathcal {F}_{\varLambda }\)-measurable (see Lemma 1.2.16 in [13]). Hence

$$\begin{aligned} E[\varDelta _{3}I_{A}]=E[E[\varDelta _{3}I_{A}|\mathcal {F}_{\varLambda }]] = E[I_{A}E[\varDelta _{3}|\mathcal {F}_{\varLambda }]] = 0. \end{aligned}$$
(3.7)

In fact, let

$$\begin{aligned} M_{\chi }(\omega )\triangleq \int _{0}^{\chi }\varphi _{\epsilon }'(Y_{i}(s)) \sum \limits _{j=1}^{r}(\sigma _{ij}(s,\widehat{X}(s))-\sigma _{ij}(s,X(s)))dW^{j}_{s}. \end{aligned}$$

Then \(M_{\chi }(\omega )\) is a continuous martingale. Thus

$$\begin{aligned} E[\varDelta _{3}|\mathcal {F}_{\varLambda }]&= E[M_{(\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N}} - M_{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}|\mathcal {F}_{\varLambda }]\\&= E[M_{(\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N}}|\mathcal {F}_{\varLambda }] - M_{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}\\&= E[M_{(\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N}}|\mathcal {F}_{{\varLambda } \wedge \vartheta _{i}\wedge \varGamma _{N}}] - M_{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}=0. \end{aligned}$$

We have used the optional sampling theorem (see [13]) in the last equality and (ii) of Problem 1.2.17 in [13] in the second equality. This proves (3.7).

Multiplying both sides of (3.6) by the indicator function \(I_{A}\), taking expectations, and letting \(\epsilon \rightarrow 0\), we obtain

$$\begin{aligned}&E\Big (|Y_{i}^{-}((\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N})|^{2}I_{A}\Big )\nonumber \\&\quad =EI_{A}\int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}2\Big (Y_{i}^{-}(s)\Big ) \Big [\widehat{f}_{i}(s,\widehat{X}(s),\widehat{X}_{s})-f_{i}(s,X(s),X_{s})\Big ]ds\nonumber \\&\qquad +EI_{A}\int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}I_{\{Y_{i}(s)\le 0\}}\sum \limits _{j=1}^{r}\Big (\sigma _{ij}(s,\widehat{X}(s))- \sigma _{ij}(s,X(s))\Big )^{2}ds\nonumber \\&\quad \triangleq EI_{A}\varSigma _1+ EI_{A}\varSigma _{2}. \end{aligned}$$
(3.8)

Relation (3.5) and \(y^- \le 0\) imply that

$$\begin{aligned}&I_{A}\varSigma _{1}=I_{A} \int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}2(Y_{i}^{-}(s))\left[ \widehat{f}_{i}(s, \widehat{X}(s),\widehat{X}_{s}) -\widehat{f}_{i}(s,\widehat{X}_{1}(s),\ldots ,\widehat{X}_{i-1}(s),\right. \\&\left. \widehat{X}_{i}(s)-Y^{-}_{i}(s),\widehat{X}_{i+1}(s), \ldots ,\widehat{X}_{d}(s),\widehat{X}_{1,s},\ldots , \widehat{X}_{i-1,s},\widehat{X}_{i,s}-\widetilde{Y}^{-}_{i,s}, \widehat{X}_{i+1,s},\ldots ,\widehat{X}_{d,s})\right] ds \\&\qquad +I_{A}\int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}2(Y_{i}^{-}(s))\left[ \widehat{f}_{i}(s,\widehat{X}_{1}(s),\ldots ,\widehat{X}_{i-1}(s),\widehat{X}_{i}(s)-Y^{-}_{i}(s),\widehat{X}_{i+1}(s), \right. \\&\left. \qquad \ldots ,\widehat{X}_{d}(s),\widehat{X}_{1,s}, \ldots ,\widehat{X}_{i-1,s},\widehat{X}_{i,s}-\widetilde{Y}^{-}_{i,s}, \widehat{X}_{i+1,s},\ldots ,\widehat{X}_{d,s})-f_{i}(s,X(s),X_{s})\right] ds \\&\quad \le I_{A} \int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t)\wedge \vartheta _{i} \wedge \varGamma _{N}}2(Y_{i}^{-}(s))\left[ \widehat{f}_{i}(s,\widehat{X}(s),\widehat{X}_{s}) -\widehat{f}_{i}(s,\widehat{X}_{1}(s),\ldots ,\widehat{X}_{i-1}(s),\widehat{X}_{i}(s) \right. \\&\left. \qquad -Y^{-}_{i}(s),\widehat{X}_{i+1}(s), \ldots ,\widehat{X}_{d}(s),\widehat{X}_{1,s},\ldots ,\widehat{X}_{i-1,s}, \widehat{X}_{i,s}-\widetilde{Y}^{-}_{i,s},\widehat{X}_{i+1,s},\ldots , \widehat{X}_{d,s})\right] ds. \end{aligned}$$

By the global Lipschitz condition for the drift \(\widehat{f}\), there exists a constant \(L^{*}>0\) such that

$$\begin{aligned} I_{A}\varSigma _{1}\le I_{A} \int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}2|{Y}_{i}^{-}(s)|\times L^{*}\Big (|Y^{-}_{i}(s)|+\Vert \widetilde{Y}^{-}_{i,s}\Vert \Big )ds. \end{aligned}$$
(3.9)

Since \(\widetilde{Y}^{-}_{i,s}(\theta )=Y^{-}_{i}(s),-\tau \le \theta \le 0\), by (3.9) we have

$$\begin{aligned} I_{A}\varSigma _{1}&\le I_{A} \int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}4L^{*}|{Y}_{i}^{-}(s)|^{2}ds. \end{aligned}$$
(3.10)

From (H3) it follows that

$$\begin{aligned} I_{A}\varSigma _{2}&= I_{A}\int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}I_{\{Y_{i}(s)\le 0\}}\sum \limits _{j=1}^{r}\Big (\sigma _{ij}(s,\widehat{X}(s))- \sigma _{ij}(s,X(s))\Big )^{2}ds \nonumber \\&\le I_{A}\int _{\varLambda \wedge \vartheta _{i} \wedge \varGamma _{N}}^{(\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N}}I_{\{Y_{i}(s)\le 0\}}\times \rho \Big (|Y_{i}(s)|^{2}\Big )ds \nonumber \\&\le I_{A}\int _{\varLambda \wedge \vartheta _{i} \wedge \varGamma _{N}}^{(\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N}} \rho \Big (|Y^{-}_{i}(s)|^{2}\Big )ds. \end{aligned}$$
(3.11)

By (3.8), (3.10) and (3.11), we have

$$\begin{aligned} E\big (|Y_{i}^{-}((\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N})|^{2}I_{A}\big )&\le EI_{A} \int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}}4L^{*}|Y_{i}^{-}(s)|^{2} ds \\&\quad +EI_{A}\int _{\varLambda \wedge \vartheta _{i}\wedge \varGamma _{N}}^{(\varLambda +t) \wedge \vartheta _{i}\wedge \varGamma _{N}} \rho \Big (|Y^{-}_{i}(s)|^{2}\Big )ds \\&\le E\int _{0}^{t} 4L^{*}|Y^{-}_{i}\Big ((s+\varLambda )\wedge \vartheta _{i} \wedge \varGamma _{N}\Big )|^{2}I_{A}ds \\&\quad +E\int _{0}^{t} \rho \Big (|Y^{-}_{i}((s+\varLambda )\wedge \vartheta _{i}\wedge \varGamma _{N})|^{2}\Big )I_{A} ds \\&\le \int _{0}^{t} 4L^{*}E\Big (|Y^{-}_{i}((s+\varLambda )\wedge \vartheta _{i} \wedge \varGamma _{N})|^{2}I_{A}\Big )ds \\&\quad +\int _{0}^{t} \rho \Big (E(|Y^{-}_{i}((s+\varLambda )\wedge \vartheta _{i}\wedge \varGamma _{N})|^{2}I_{A})\Big ) ds \end{aligned}$$

where the last inequality uses the Jensen inequality, thanks to the concavity of \(\rho \). Note that \(t\mapsto E|Y_{i}^{-}((\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N})|^{2}\) is continuous (see Remark 14). Using Corollary 3, we have

$$\begin{aligned} E\Big (|Y_{i}^{-}((\varLambda +t)\wedge \vartheta _{i} \wedge \varGamma _{N})|^{2}I_{A}\Big )=0, \end{aligned}$$

which implies that

$$\begin{aligned} X_{i}\Big ((\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N}\Big )\le \widehat{X}_{i}\Big ((\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N}\Big ) \ \mathrm{a.s.}\ \mathbb {P} \end{aligned}$$

for every \(t \ge 0\) on \(\{\varLambda _{i}=\varLambda <\varGamma _{N}\}\). It follows from the continuity of \(X_{i}(t),\ \widehat{X}_{i}(t)\) that

$$\begin{aligned} X_{i}\Big ((\varLambda +t)\wedge \vartheta _{i}\wedge \varGamma _{N}\Big )\le \widehat{X}_{i}\Big ((\varLambda +t)\wedge \vartheta _{i} \wedge \varGamma _{N}\Big ),\quad 0 \le t < \infty \ \mathrm{a.s.}\ \mathbb {P} \end{aligned}$$

on \(\{\varLambda _{i}=\varLambda <\varGamma _{N}\}\). This contradicts the definition of \(\varLambda _{i}\) in (3.1), and shows that \(\mathbb {P}(\{\varLambda <\varGamma _{N}\})=0\) for every \(N >0\). Hence we have \(\mathbb {P}(\{\varLambda =\infty \})=1\). Therefore

$$\begin{aligned} \mathbb {P}\Big (\{X(t)\le \widehat{X}(t),t\ge 0\}\Big )=1, \end{aligned}$$

i.e.,

$$\begin{aligned} \mathbb {P}\Big (\{x(t,\phi )\le \widehat{x}(t,\psi ),t\ge 0\}\Big )=1, \end{aligned}$$

which shows that

$$\begin{aligned} \mathbb {P}\Big (\{x_{t}(\phi )\le _{\mathcal {C}} \widehat{x}_{t}(\psi ),t\ge 0\}\Big )=1. \end{aligned}$$

The proof is complete. \(\square \)
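Theorem 6 can be illustrated numerically (an illustration only; the coefficients below are our own choices satisfying (H1)–(H4), not taken from the text). We discretize systems (2.1) and (2.2) in dimension \(d=1\) by the Euler–Maruyama scheme, with a common diffusion, drifts satisfying \(f<\widehat{f}\) as in (H1), initial segments with \(\phi \le _{\mathcal {C}}\psi \), and the same Brownian increments driving both paths.

```python
import numpy as np

def simulate(drift, sigma, phi, tau, T, dt, dW):
    """Euler-Maruyama path for the scalar delay equation
    dx(t) = drift(t, x(t), x(t - tau)) dt + sigma(x(t)) dW_t."""
    lag, n = int(round(tau / dt)), int(round(T / dt))
    x = np.empty(lag + n + 1)
    x[: lag + 1] = [phi(-tau + k * dt) for k in range(lag + 1)]
    for k in range(n):
        i = lag + k
        x[i + 1] = (x[i] + drift(k * dt, x[i], x[i - lag]) * dt
                    + sigma(x[i]) * dW[k])
    return x[lag:]                       # path on [0, T]

tau, T, dt = 0.5, 4.0, 0.005
rng = np.random.default_rng(1)
dW = rng.normal(0.0, np.sqrt(dt), int(round(T / dt)))  # same noise for both

sigma = lambda x: 0.3 * np.sin(x)             # common Lipschitz diffusion
f     = lambda t, x, xl: -x + 0.5 * xl        # quasimonotone: increasing in xl
f_hat = lambda t, x, xl: -x + 0.5 * xl + 1.0  # f < f_hat, as in (H1)

x     = simulate(f,     sigma, lambda th: 0.0, tau, T, dt, dW)
x_hat = simulate(f_hat, sigma, lambda th: 0.5, tau, T, dt, dW)  # phi <= psi

print(bool(np.all(x <= x_hat + 1e-12)))  # pathwise ordering, as Theorem 6 predicts
```

Reusing the increments `dW` mirrors the fact that both equations are driven by the same Wiener process; with independent noises the pathwise ordering would have no reason to hold.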

Now we give a condition under which \(<\) in (H1) can be replaced by \(\le \). For this purpose we recall the following definition from [21].

Definition 7

A mapping \(g:\mathbb {R}_+\times \mathbb {R}^{d}\times \mathcal {C}\rightarrow \mathbb {R}^{d}\) is called quasi-monotonously increasing, if for every \( i=1,2,\ldots , d \)

$$\begin{aligned} g_{i}\Big (t,\phi (0),\phi \Big )\le g_{i}\Big (t,\psi (0),\psi \Big ), \end{aligned}$$

whenever \(\phi \le _{\mathcal {C}} \psi \) with \(\phi _{i}(0)=\psi _{i}(0)\) and \(t>0\).

Theorem 8

If either the drift term \(f\) of system (2.1) or \(\widehat{f}\) of system (2.2) is quasi-monotonously increasing, then condition (H1) of Theorem 6 can be replaced by

\(\mathrm{(H1^{*})}\)   The functions \(f_{i}:\mathbb {R_{+}}\times \mathbb {R}^{d}\times \mathcal {C}\rightarrow \mathbb {R}\) and \(\widehat{f}_{i}:\mathbb {R_{+}}\times \mathbb {R}^{d}\times \mathcal {C}\rightarrow \mathbb {R}\), \(i=1,2,\ldots ,d \), are continuous and satisfy \(f_{i}(t,\phi (0),\phi )\le \widehat{f}_{i}(t,\phi (0),\phi )\) for every \(\phi \in \mathcal {C}\), \(t\ge 0 \).

Proof

Assume that the drift coefficient \(f\) of system (2.1) is quasi-monotonously increasing; a similar argument applies if the drift coefficient \(\widehat{f}\) of system (2.2) is quasi-monotonously increasing. Let \(\zeta >0\) be arbitrarily chosen and define

$$\begin{aligned} f_{i}^{\zeta }\triangleq f_{i}-\zeta ,\quad i=1,2,\ldots ,d. \end{aligned}$$

Consider the following auxiliary system

$$\begin{aligned} \left\{ \begin{array}{ll} dx_{i}(t)=f_{i}^{\zeta }(t,x(t),x_{t})dt+\sum \limits _{j=1}^{r} \sigma _{ij}(t,x(t))dW_{t}^{j},&{} \quad i=1,2,\ldots ,d,\\ x(\theta )=\phi (\theta ),\ \theta \in J,\ \phi \in C(J,\mathbb {R}^{d}). \end{array}\right. \end{aligned}$$
(3.12)

By the hypotheses (H2)–(H4) and Theorem 1, system (3.12) has a unique strong solution \(X^{\zeta }(t), t\ge 0.\) From \(\mathrm{(H1^{*})}\) and the quasi-monotonicity of \(f\) it follows that the pair \((f_{i}^{\zeta },\widehat{f}_{i}), i=1,2,\ldots ,d,\) satisfies (H1). Hence, applying Theorem 6, we get that

$$\begin{aligned} X^{\zeta }_{i}(t)\le \widehat{X}_{i}(t)\quad \mathrm{for\ all}\ t\ge 0,\quad \mathrm{a.s.}\ \mathbb {P} \end{aligned}$$

for \(i=1,2,\ldots ,d.\) Choose a strictly decreasing sequence \(\zeta _{n},n\ge 1\) with \(\lim \nolimits _{n\rightarrow \infty }\zeta _{n}=0.\) By the same arguments as above we have

$$\begin{aligned} X_{i}^{\zeta _{1}}(t)\le X_{i}^{\zeta _{2}}(t)\le \cdots \le \widehat{X}_{i}(t)\quad \mathrm{for\ all}\ t\ge 0,\ \mathrm{a.s.}\ \mathbb {P} \end{aligned}$$

as well as

$$\begin{aligned} X_{i}^{\zeta _{1}}(t)\le X_{i}^{\zeta _{2}}(t)\le \cdots \le X_{i}(t)\quad \mathrm{for\ all}\ t\ge 0,\ \mathrm{a.s.}\ \mathbb {P} \end{aligned}$$

for \(i=1,2,\ldots ,d.\) Define

$$\begin{aligned} \mathbb {X}_{i}(t)\triangleq \lim \limits _{n\rightarrow \infty }X_{i}^{\zeta _{n}}(t), \end{aligned}$$
(3.13)

for \(t\ge 0, i=1,2,\ldots ,d.\) Then

$$\begin{aligned} \mathbb {X}_{i}(t)\le \widehat{X}_{i}(t)\quad \mathrm{for\ all}\ t\ge 0,\ \mathrm{a.s.}\ \mathbb {P} \end{aligned}$$

for \(i=1,2,\ldots ,d.\) To complete the proof, we shall show that \(\mathbb {X}(t)\) is a modification of the solution \(X(t)\). By the uniqueness of strong solutions it suffices to check that \(\mathbb {X}(t)\) satisfies (2.1) a.s. \(\mathbb {P}\) for every \(t\ge 0\).

First we prove that \(X^{\zeta _{n}}(t)\) converges to \(\mathbb {X}(t)\) uniformly in \(t\in [0,T]\) almost surely as \(n\rightarrow \infty \). By the Hölder inequality and (3.12) we have

$$\begin{aligned} \sup \limits _{0\le \chi \le t}|X^{\zeta _{n+1}}(\chi )-X^{\zeta _{n}}(\chi )|^{2}&= \sup \limits _{0\le \chi \le t}\Big |\chi (\zeta _{n}-\zeta _{n+1}) \overrightarrow{e}\\&\quad +\int _{0}^{\chi }\big (f(s,X^{\zeta _{n+1}}(s), X^{\zeta _{n+1}}_{s})-f(s,X^{\zeta _{n}}(s),X^{\zeta _{n}}_{s})\big )ds \\&\quad +\int _{0}^{\chi }\big (\sigma (s,X^{\zeta _{n+1}}(s))- \sigma (s,X^{\zeta _{n}}(s))\big )dW_s\Big |^{2}\\&\le 3d(\zeta _{n}-\zeta _{n+1})^{2}T^{2}\\&\quad +3\Big (\int _{0}^{t}|f(s,X^{\zeta _{n+1}}(s),X^{\zeta _{n+1}}_{s})-f(s, X^{\zeta _{n}}(s),X^{\zeta _{n}}_{s})|ds\Big )^{2} \\&\quad + 3\sup \limits _{0\le \chi \le t}\Big |\int _{0}^{\chi }(\sigma (s,X^{\zeta _{n+1}}(s))-\sigma (s,X^{\zeta _{n}}(s)))dW_s\Big |^{2}\\&\le 3d(\zeta _{n}-\zeta _{n+1})^{2}T^{2}\\&\quad +3T\int _{0}^{t}\Big |f(s,X^{\zeta _{n+1}}(s),X^{\zeta _{n+1}}_{s})-f(s, X^{\zeta _{n}}(s),X^{\zeta _{n}}_{s})\Big |^{2}ds \\&\quad + 3\sup \limits _{0\le \chi \le t}\Big |\int _{0}^{\chi }(\sigma (s,X^{\zeta _{n+1}}(s))-\sigma (s, X^{\zeta _{n}}(s)))dW_s\Big |^{2}, \end{aligned}$$

where \(\overrightarrow{e}=(1,1,\ldots ,1)^{\tau }\) is the \(d\)-dimensional vector of ones. By (H2), (H3) and the Burkholder–Davis–Gundy inequality, for \(t \in [0, T]\), we have

$$\begin{aligned} E\sup \limits _{0\le \chi \le t}|X^{\zeta _{n+1}}(\chi )\!-\!X^{\zeta _{n}}(\chi )|^{2}&\le 3d(\zeta _{n}-\zeta _{n+1})^{2}T^{2}\\&\quad +3{ TE}\Big (\int _{0}^{t}|f(s,X^{\zeta _{n+1}}(s), X^{\zeta _{n+1}}_{s})\!-\!f(s,X^{\zeta _{n}}(s),X^{\zeta _{n}}_{s})|^{2}ds\Big )\\&\quad + 3E\sup \limits _{0\le \chi \le t}\Big |\int _{0}^{\chi }(\sigma (s,X^{\zeta _{n+1}}(s))-\sigma (s,X^{\zeta _{n}}(s)))dW_s\Big |^{2}\\&\le 3d(\zeta _{n}-\zeta _{n+1})^{2}T^{2}\\&\quad +3{ TdLE}\Big (\int _{0}^{t}(|X^{\zeta _{n+1}}(s)-X^{\zeta _{n}}(s)|^{2}+ \Vert X^{\zeta _{n+1}}_{s}-X^{\zeta _{n}}_{s}\Vert ^{2})ds\Big )\\&\quad + 12E\int _{0}^{t}\Big |\sigma (s,X^{\zeta _{n+1}}(s))-\sigma (s,X^{\zeta _{n}}(s))\Big |^{2}ds \\&\le 3d(\zeta _{n}-\zeta _{n+1})^{2}T^{2}\\&\quad +6{ TdL} \int _{0}^{t}\Big (E\sup \limits _{0\le \chi \le s}|X^{\zeta _{n+1}}(\chi )-X^{\zeta _{n}}(\chi )|^{2}\Big )ds \\&\quad + 12d\int _{0}^{t}\rho \Big (E\sup \limits _{0\le \chi \le s}|X^{\zeta _{n+1}}(\chi )-X^{\zeta _{n}}(\chi )|^{2}\Big )ds \\&\le 3d(\zeta _{n}-\zeta _{n+1})^{2}T^{2}\\&\quad +(6{ TdL}+12d) \int _{0}^{t}\varrho \Big (E\sup \limits _{0\le \chi \le s}|X^{\zeta _{n+1}}(\chi )-X^{\zeta _{n}}(\chi )|^{2}\Big )ds. \end{aligned}$$

Note that \(t\rightarrow E\sup \nolimits _{0\le \chi \le t}|X^{\zeta _{n+1}}(\chi )-X^{\zeta _{n}}(\chi )|^{2}\) is continuous (see Remark 14) and hence by Theorem 2, we have

$$\begin{aligned} E\sup \limits _{0\le \chi \le t}\Big |X^{\zeta _{n+1}}(\chi )-X^{\zeta _{n}}(\chi )\Big |^{2}&\le W^{-1}\Big [W(r(t))+(6TdL+12d)t\Big ],\ 0\le t\le t_{c}, \end{aligned}$$

where \(W(u)=\int _{\widehat{u}}^{u}\frac{dz}{\varrho (z)}\), \(\widehat{u}\) is any given positive constant, \(r(t)=3d(\zeta _{n}-\zeta _{n+1})^2T^{2}\), and \(t_{c}\le T\) is the largest number such that

$$\begin{aligned} W(r(t_{c}))+(6{ TdL}+12d)t_{c}\le \int _{\widehat{u}}^{\infty }\frac{dz}{\varrho (z)}. \end{aligned}$$
(3.14)

By (2.8) it is easy to see that (3.14) holds for all sufficiently large \(n\); that is, there exists \(N_{1}>0\) such that (3.14) holds whenever \(n\ge N_{1}\). Hence \(t_{c}=T\) for \(n\ge N_{1}\). Then

$$\begin{aligned} E\sup \limits _{0\le t\le T}\Big |X^{\zeta _{n+1}}(t)-X^{\zeta _{n}}(t)\Big |^{2}\le W^{-1}\Big [W(r(T))+6dL T^{2}+12dT\Big ]. \end{aligned}$$
(3.15)

Furthermore, since \(\int _{0^{+}}\frac{1}{\varrho (z)}dz=\infty \), there exist \(\mu _{n}\), \(n\ge 1\), such that for every \(n\), \(\mu _{n}< \frac{1}{8^{n}}\) and \(\int _{\mu _{n}}^{\frac{1}{8^{n}}}\frac{1}{\varrho (z)}dz=6dL T^{2}+12dT.\) Setting \(r(T)=\mu _{n}\) gives \(\zeta _{n}-\zeta _{n+1}=\sqrt{\frac{\mu _{n}}{3dT^{2}}}\), and hence \(\zeta _{n}=\sum \nolimits _{k=n}^{\infty }\sqrt{\frac{\mu _{k}}{3dT^{2}}},\ n\ge 1\). Moreover, by (3.15) we have that for \(n\ge N_{1}\),

$$\begin{aligned} E\sup \limits _{0\le t\le T}\Big |X^{\zeta _{n+1}}(t)-X^{\zeta _{n}}(t)\Big |^{2}\le \frac{1}{8^{n}}. \end{aligned}$$

By Chebyshev’s inequality we have

$$\begin{aligned} \sum \limits _{n=N_{1}}^{\infty }\mathbb {P}\Big (\sup \limits _{0\le t\le T}|X^{\zeta _{n+1}}(t)-X^{\zeta _{n}}(t)|>2^{-(n+1)}\Big )\le \sum \limits _{n=N_{1}}^{\infty }4\frac{1}{2^{n}}<\infty . \end{aligned}$$
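For the reader's convenience, the per-term bound is just Chebyshev's inequality applied to the preceding moment estimate \(E\sup \nolimits _{0\le t\le T}|X^{\zeta _{n+1}}(t)-X^{\zeta _{n}}(t)|^{2}\le 8^{-n}\):

$$\begin{aligned} \mathbb {P}\Big (\sup \limits _{0\le t\le T}\big |X^{\zeta _{n+1}}(t)-X^{\zeta _{n}}(t)\big |>2^{-(n+1)}\Big )\le \frac{E\sup \limits _{0\le t\le T}\big |X^{\zeta _{n+1}}(t)-X^{\zeta _{n}}(t)\big |^{2}}{\big (2^{-(n+1)}\big )^{2}}\le \frac{4^{n+1}}{8^{n}}=\frac{4}{2^{n}}. \end{aligned}$$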

Then the Borel–Cantelli lemma implies that there exists an event \(\varOmega ^{*}\in \mathfrak {F}\) with \(\mathbb {P}(\varOmega ^{*})=1\) such that for every \(\omega \in \varOmega ^{*}\) there is an integer \(N_{2}(\omega ) \ge N_1\) such that

$$\begin{aligned} \sup \limits _{0\le t\le T}\Big |X^{\zeta _{n+1}}(t)-X^{\zeta _{n}}(t)\Big |\le 2^{-(n+1)},\ n\ge N_{2}(\omega ). \end{aligned}$$

Consequently

$$\begin{aligned} \sup \limits _{0\le t\le T}\left| X^{\zeta _{n+p}}(t)-X^{\zeta _{n}}(t)\right| \le 2^{-n},\quad p\ge 1,\ n\ge N_{2}(\omega ), \end{aligned}$$

that is, \(X^{\zeta _{n}}(s)\) converges to \(\mathbb {X}(s)\) uniformly on \([0,T]\) almost surely, and the continuity of \(\mathbb {X}(s)\) follows.

Now define

$$\begin{aligned} T_{N}\triangleq \inf \big \{t>0:\Vert X_{t}\Vert >N\ \mathrm{or}\ \Vert X^{\zeta _{1}}_t\Vert >N\big \} \wedge N\quad \mathrm{for\ every}\ N>0. \end{aligned}$$

By (3.13) and Lebesgue's dominated convergence theorem, we have

$$\begin{aligned} \int \limits _{0}^{t\wedge T_{N}}f_{i}^{\zeta _{n}}\big (s,X^{\zeta _{n}}(s),X^{\zeta _{n}}_{s}\big )ds\rightarrow \int \limits _{0}^{t\wedge T_{N}}f_{i}\big (s,\mathbb {X}(s),\mathbb {X}_{s}\big )ds,\ \mathrm{a.s.}\ \mathbb {P}, \end{aligned}$$

and

$$\begin{aligned} E\int \limits _{0}^{t\wedge T_{N}}\big |\sigma _{ij}(s,X^{\zeta _{n}}(s))-\sigma _{ij}(s,\mathbb {X}(s))\big |^{2}ds\rightarrow 0, \end{aligned}$$

as \(n\rightarrow \infty .\)

Thus, by Proposition 3.2.10 in [13], we have

$$\begin{aligned} \int \limits _{0}^{t\wedge T_{N}}\sigma _{ij}(s,X^{\zeta _{n}}(s))dW_{s}^{j}\overset{L^2}{\longrightarrow }\int \limits _{0}^{t\wedge T_{N}}\sigma _{ij}(s,\mathbb {X}(s))dW_{s}^{j} \end{aligned}$$

as \(n\rightarrow \infty \). Then taking a subsequence if necessary, we obtain that

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }X_{i}^{\zeta _{n}}(t\wedge T_{N})&= \phi _{i}(0)+\int _{0}^{t\wedge T_{N}}f_{i}(s,\mathbb {X}(s),\mathbb {X}_{s})ds +\sum \limits _{j=1}^{r}\int _{0}^{t\wedge T_{N}}\sigma _{ij}(s,\mathbb {X}(s))dW_{s}^{j}, \end{aligned}$$

that is,

$$\begin{aligned} \mathbb {X}_{i}(t\wedge T_{N})&= \phi _{i}(0)+\int _{0}^{t\wedge T_{N}}f_{i}(s,\mathbb {X}(s),\mathbb {X}_{s})ds +\sum \limits _{j=1}^{r}\int _{0}^{t\wedge T_{N}}\sigma _{ij}(s,\mathbb {X}(s))dW_{s}^{j}. \end{aligned}$$

Note that \(T_{N}\uparrow \infty \) as \(N \uparrow \infty \). Letting \(N\uparrow \infty \), from the path continuity of \(\mathbb {X}(t)\) we conclude that

$$\begin{aligned} \mathbb {X}_{i}(t)&= \phi _{i}(0)+ \int _{0}^{t}f_{i}(s,\mathbb {X}(s),\mathbb {X}_{s})ds +\sum \limits _{j=1}^{r}\int _{0}^{t}\sigma _{ij}(s,\mathbb {X}(s))dW_{s}^{j}. \end{aligned}$$

The proof is complete. \(\square \)

4 Application

As an application of our comparison theorem, we will show that the solutions of a class of stochastic neural networks with delays possess monotonicity and sublinearity. Consider the following stochastic neural network described by a Stratonovich-type stochastic differential equation with delay:

$$\begin{aligned} \left\{ \begin{array}{ll} dx_{i}(t)=\big [-a_{i}x_{i}(t)+\sum \limits _{j=1}^{d}b_{ij}h_{j}(x_{j}(t-\tau _{ij}))\big ]dt+ \sum \limits _{k=1}^{r}\sigma _{ik}x_{i}(t)\circ dW_{t}^{k},&{} \\ x_{i}(\theta )=\phi _{i}(\theta ), \theta \in J \triangleq [-\tau , 0], \tau =\max \limits _{1\le i, j\le d}\tau _{ij},\phi \in \mathcal {C}\triangleq C(J,\mathbb {R}_+^d), \quad i=1,2,\ldots ,d, \end{array}\right. \end{aligned}$$
(4.1)

where the activation functions satisfy the following:

  1. (A1)

    \(h_{i}\in C^{1}(\mathbb {R}), \quad h_{i}(0)=0,i=1,2,\ldots ,d ,\)

  2. (A2)

    \(\lim \limits _{s\rightarrow \infty }\frac{h_i(s)}{s}=0, \quad i=1,2,\ldots ,d ,\)

  3. (A3)

    \(0<h'_{i}(s)\le 1,\quad i=1,2,\ldots ,d,\)

  4. (A4)

    \(b_{ij}>0\)    for all    \(i,j=1,2,\ldots ,d.\)
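For orientation (an illustration not taken from the source), a standard activation function such as \(h_{i}(s)=\tanh s\) satisfies (A1)–(A3); condition (A4) concerns only the connection weights \(b_{ij}\). Indeed,

$$\begin{aligned} \tanh 0=0,\qquad \lim \limits _{s\rightarrow \infty }\frac{\tanh s}{s}=0,\qquad 0<\frac{d}{ds}\tanh s=\mathrm{sech}^{2}\,s\le 1. \end{aligned}$$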

Theorem 9

Besides (A1)–(A4), we further assume that the system (4.1) satisfies

(A5) each \(h_{i}(s)\) is a sublinear function from \(\mathbb {R}_{+}\) into \(\mathbb {R}\) in the sense that

$$\begin{aligned} \lambda h_{i}(s)\le h_{i}(\lambda s)\quad \mathrm {for}\ \mathrm {all}\ 0<\lambda <1\ \mathrm{and} \ s>0. \end{aligned}$$

Then the system (4.1) has a unique strong solution \(\varPhi (t,\phi )\) for \(t\ge 0\), where \(\phi =(\phi _{1},\ldots ,\phi _{d})\). Moreover, it generates a monotone random dynamical system, which means that

$$\begin{aligned} \phi \le _{\mathcal {C}}\psi \Longrightarrow \varPhi (t,\phi )\le _{\mathcal {C}} \varPhi (t,\psi ),\ t\ge 0 \end{aligned}$$

and \(\varPhi (t,\phi )\) is sublinear in the sense that for every \(\phi \ge 0,\)

$$\begin{aligned} \lambda \varPhi (t,\phi )\le _{\mathcal {C}} \varPhi (t,\lambda \phi ),\quad \mathrm{for\ all }\ t\ge 0\ \mathrm{and}\ 0\le \lambda \le 1. \end{aligned}$$

Proof

First, we shall show that the strong solutions of (4.1) generate a random dynamical system, using the conjugacy technique (see Imkeller and Schmalfuss [12]).

Denote by \(z(\omega )\) the random variable in \(\mathbb {R}^r\) such that

$$\begin{aligned} z(t,\omega ):= z(\theta _t\omega )= (z_1(t,\omega ),z_2(t,\omega ),\ldots ,z_r(t,\omega )) \end{aligned}$$

is the stationary Ornstein–Uhlenbeck process in \(\mathbb {R}^r\), which solves the equations

$$\begin{aligned} dz_k = -\mu z_kdt + dW_t^k,\ \quad \mu > 0, k = 1,\ldots ,r. \end{aligned}$$
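For the reader's convenience (a standard fact, not from the original text), the stationary solution of these Langevin equations admits the explicit representation

$$\begin{aligned} z_k(\theta _t\omega )=\int _{-\infty }^{t}e^{-\mu (t-s)}\,dW_s^{k}(\omega ),\qquad k=1,\ldots ,r, \end{aligned}$$

a centered Gaussian process with stationary variance \(\frac{1}{2\mu }\).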

Let us introduce the conjugate transformation

$$\begin{aligned} T(\omega ,y)= \big (y_1 e_1^{\sigma }(\omega ),\ldots ,y_d e_d^{\sigma }(\omega )\big ) \end{aligned}$$

where

$$\begin{aligned} e_i^{\sigma }(\omega )= \exp \big \{z_i^{\sigma }(\omega )\big \},\ \ z_i^{\sigma }(\omega )= \sum _{j=1}^r\sigma _{ij}z_j(\omega ). \end{aligned}$$

Applying It\(\hat{o}\)'s formula to the function \(y_i(t,\omega )= x_i(t,\omega )\exp \{-z_i^{\sigma }(\theta _t\omega )\}\), we transform the system (4.1) into

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{dy_i(t)}{dt}= -a_iy_i(t)+\mu y_i(t)z_i^{\sigma }(\theta _t\omega )+\exp \{-z_i^{\sigma }(\theta _t\omega )\}\times \\ \sum \limits _{j=1}^{d}b_{ij}h_j(y_j(t-\tau _{ij})\exp \{z_j^{\sigma } (\theta _{t-\tau _{ij}}\omega )\})\\ y(\theta )=\phi (\theta ),\theta \in J \triangleq [-\tau ,0], \quad i=1,\ldots ,d. \end{array}\right. } \end{aligned}$$
(4.2)
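A brief reasoning step behind (4.2), included for convenience: Stratonovich calculus obeys the ordinary chain rule, so with \(dz_i^{\sigma }(\theta _t\omega )=-\mu z_i^{\sigma }(\theta _t\omega )dt+\sum _{j=1}^{r}\sigma _{ij}dW_t^{j}\),

$$\begin{aligned} dy_i = e^{-z_i^{\sigma }(\theta _t\omega )}\circ dx_i - x_i\,e^{-z_i^{\sigma }(\theta _t\omega )}\circ dz_i^{\sigma }(\theta _t\omega ), \end{aligned}$$

so the martingale part \(\sum _{k=1}^{r}\sigma _{ik}x_i e^{-z_i^{\sigma }}\circ dW_t^{k}\) coming from \(dx_i\) cancels against \(-x_i e^{-z_i^{\sigma }}\sum _{j=1}^{r}\sigma _{ij}\circ dW_t^{j}\), while \(-x_i e^{-z_i^{\sigma }}(-\mu z_i^{\sigma })dt\) produces the term \(\mu y_i z_i^{\sigma }(\theta _t\omega )dt\); only a random ordinary differential equation remains.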

For each \(i\), define

$$\begin{aligned} {\left\{ \begin{array}{ll} F_i(\omega ,\phi ):=-a_i\phi _i(0)+\mu z_i^{\sigma }(\omega )\phi _i(0)+\exp \{-z_i^{\sigma }(\omega )\}\times \\ \sum \limits _{j=1}^{d}b_{ij}h_j(\phi _j(-\tau _{ij})\exp \{z_j^{\sigma } (\theta _{-\tau _{ij}}\omega )\}). \end{array}\right. } \end{aligned}$$
(4.3)

Then it follows from (A1) and (A3) that

  1. (C1)

    \(F_i(\omega ,0)\equiv 0;\)

  2. (C2)

    \(F_i\) satisfies the global Lipschitz condition in the sense that for any \(\phi , \psi \in \mathcal {C}\),

$$\begin{aligned} |F_i(\omega ,\phi )-F_i(\omega ,\psi )| \le L_i(\omega )\Vert \phi -\psi \Vert \end{aligned}$$

where \(L_i(\omega )= a_i+\mu |z_i^{\sigma }(\omega )| + \exp \{-z_i^{\sigma }(\omega )\}\sum \limits _{j=1}^{d}b_{ij} \exp \{z_j^{\sigma }(\theta _{-\tau _{ij}}\omega )\}\). The system (4.2) can be written as a random functional differential equation:

$$\begin{aligned} \left\{ \begin{array}{ll} \frac{dy(t)}{dt}&{}= F(\theta _t\omega , y_t) \\ y_0&{}=\phi \in \mathcal {C}. \end{array}\right. \end{aligned}$$
(4.4)

One can show that, under (C1) and (C2), the solutions of (4.4) generate a random dynamical system by the fundamental theory of deterministic functional differential equations; we omit the details.

The Stratonovich stochastic differential equation (4.1) can be rewritten in It\(\hat{o}\) form as

$$\begin{aligned} \left\{ \begin{array}{ll} dx_{i}(t)&{}=\overline{f}_{i}(x_{i}(t),x_{1}(t-\tau _{i1}),\ldots ,x_{d} (t-\tau _{id}))dt+\sum \limits _{k=1}^{r}\sigma _{ik}x_{i}(t)dW_{t}^{k}, \\ x_{i}(\theta )&{}=\phi _{i}(\theta ),\ \theta \in J, \quad i=1,2,\ldots ,d, \end{array}\right. \end{aligned}$$
(4.5)

where \(\overline{f}_{i}=f_i(x_{i}(t),x_{1}(t-\tau _{i1}), \ldots ,x_{d}(t-\tau _{id}))+\frac{1}{2}\sum \limits _{k=1}^{r}\sigma _{ik}^{2}x_{i}(t)\) with

$$\begin{aligned} f_i(x_{i}(t),x_{1}(t-\tau _{i1}),\ldots ,x_{d}(t-\tau _{id}))=-a_{i}x_{i}(t) +\sum \limits _{j=1}^{d}b_{ij}h_{j}(x_{j}(t-\tau _{ij})). \end{aligned}$$

Then we only need to prove that the results of Theorem 9 hold for system (4.5). Under (A1)–(A4), the global existence and uniqueness of the strong solution \(\varPhi (t,\phi )\) of system (4.5) follow from Theorem 1. In addition, observe that the drift term of equation (4.5) is quasi-monotonously increasing. The monotonicity of the solutions of system (4.5) then follows by applying Theorem 8.

The idea to prove sublinearity of the solution of the equations (4.1) follows from [5]. Let \(x_{i}^{\lambda }(t)=\lambda \varPhi _{i}(t,\phi )\). Then

$$\begin{aligned} \left\{ \begin{array}{ll} dx_{i}^{\lambda }(t)&{}=f_{i}^{\lambda }\big (x_{i}^{\lambda }(t),x_{1}^{\lambda } (t-\tau _{i1}),\ldots ,x_{d}^{\lambda }(t-\tau _{id})\big )dt+ \sum \limits _{k=1}^{r}\sigma _{ik}x_{i}^{\lambda }(t)\circ dW_{t}^{k}, \\ x_{i}^{\lambda }(\theta )&{}=\lambda \phi _{i}(\theta ),\ \theta \in J, \quad i=1,2,\ldots ,d, \end{array}\right. \end{aligned}$$

where

$$\begin{aligned} f_{i}^{\lambda }\big (x_{i}^{\lambda }(t),x_{1}^{\lambda } (t-\tau _{i1}),\ldots ,x_{d}^{\lambda }(t-\tau _{id})\big ) =-a_{i}x_{i}^{\lambda }(t)+\sum \limits _{j=1}^{d}\lambda b_{ij}h_{j}\big (\lambda ^{-1}x_{j}^{\lambda }(t-\tau _{ij})\big ). \end{aligned}$$

Using (A5), it is easy to see that for every \(\phi =(\phi _{1},\ldots ,\phi _{d})\ge 0,\ i=1,\ldots ,d,\)

$$\begin{aligned} f_{i}^{\lambda }\big (\phi _{i}(0),\phi _{1}(-\tau _{i1}), \ldots ,\phi _{d}(-\tau _{id})\big )\le f_{i}\big (\phi _{i}(0),\phi _{1}(-\tau _{i1}), \ldots ,\phi _{d}(-\tau _{id})\big ), \end{aligned}$$

which implies that \(\mathrm{(H1^{*})}\) in Theorem 8 holds. The argument of the previous paragraph shows that the comparison theorem remains valid for Stratonovich-type stochastic functional differential equations. Again by the quasi-monotonicity of the drift terms of the system (4.1), we can apply Theorem 8 to deduce the sublinearity of \(\varPhi (t,\phi )\). The proof is complete. \(\square \)

Remark 10

Chueshov and Scheutzow [6] obtained a comparison principle under the assumptions that the drift terms decompose into a non-delay part and a delay part, the non-delay drift terms are at least continuously differentiable, and the diffusion terms are at least twice continuously differentiable. When the drift terms are replaced by their non-delay part, the SDEs generate a stochastic flow of diffeomorphisms in \(\mathbb {R}^d\), which allows them to represent the SFDEs as random FDEs. Our comparison result applies even when the diffusion terms fail to be Lipschitz (see Example 11); in that case the corresponding SDEs only generate a stochastic flow of homeomorphisms, rather than diffeomorphisms, in \(\mathbb {R}^d\). On the other hand, the comparison principle of Chueshov and Scheutzow [6] is valid for more general noise; precisely, they considered SFDEs driven by a Kunita-type martingale field. Their idea is to represent the SFDEs as random FDEs, while our technique combines the method of [4] and stopping times with the nonlinear Gronwall inequality.

Example 11

Let \(d=r=1\) and consider the following two stochastic delay differential equations:

$$\begin{aligned} \left\{ \begin{array}{ll} dx(t)&{}=\frac{1}{2}x(t-\tau )dt+\sigma (x(t))dW_{t}, \\ x(\theta )&{}=\phi (\theta ),\ \theta \in J\triangleq [-\tau , 0],\ \phi \in C(J,\mathbb {R}) , \end{array}\right. \end{aligned}$$
(4.6)

and

$$\begin{aligned} \left\{ \begin{array}{ll} d\widehat{x}(t)&{}=[\frac{1}{2}\widehat{x}(t-\tau )+1]dt+\sigma (\widehat{x}(t))dW_{t}, \\ \widehat{x}(\theta )&{}=\psi (\theta ),\ \theta \in J,\ \psi \in C(J,\mathbb {R}), \end{array} \right. \end{aligned}$$
(4.7)

where \(\sigma (x)=\left\{ \begin{array}{ll} \frac{1}{2}|x|\root 4 \of {\ln \frac{1}{|x|}}, &{} \quad \quad |x|\le \frac{1}{e},\\ \frac{1}{2e},&{} \quad \quad |x|>\frac{1}{e}. \end{array} \right. \)

First, it is obvious that assumptions (H1), (H2) and (H4) hold for systems (4.6) and (4.7). Moreover, it is easy to see that for every \(x,x'\in \mathbb {R}\),

$$\begin{aligned} |\sigma (x)-\sigma (x')|^{2}\le \rho \big (|x-x'|^{2}\big ), \end{aligned}$$

where \(\rho :\mathbb {R}_{+}\rightarrow \mathbb {R}_{+}\) and \(\rho (x)=\left\{ \begin{array}{ll} \frac{1}{2}x\sqrt{\ln \frac{1}{x}}, &{} \quad \quad x\le \frac{1}{e},\\ \frac{1}{2e},&{} \quad \quad x>\frac{1}{e}. \end{array} \right. \) Note that \(\rho (0)=0\), \(\int _{0^{+}}\frac{dx}{\rho (x)}=\infty \), and \(\rho (x)\) is a nondecreasing continuous concave function; hence (H3) holds. Therefore Theorem 6 can be applied to equations (4.6) and (4.7). However, since \(\sigma \) is not differentiable at the origin and the comparison principle in Chueshov and Scheutzow [6] requires the diffusion term \(\sigma \) to be at least twice continuously differentiable, their comparison principle cannot be applied to systems (4.6) and (4.7).
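As a purely illustrative numerical check (not part of the original argument), one may simulate (4.6) and (4.7) with the Euler–Maruyama scheme on a shared Brownian path. The delay \(\tau =1\), horizon \(T=5\), step size, and the initial segments \(\phi \equiv 0\le _{\mathcal {C}}\psi \equiv 0.1\) are assumed parameter choices, not taken from the source.

```python
import math
import random

def sigma(x):
    """Diffusion coefficient of (4.6)/(4.7): non-Lipschitz at the origin."""
    ax = abs(x)
    if ax == 0.0:
        return 0.0
    if ax <= 1.0 / math.e:
        return 0.5 * ax * (math.log(1.0 / ax)) ** 0.25
    return 0.5 / math.e

def simulate(tau=1.0, T=5.0, dt=0.001, seed=0):
    """Euler-Maruyama for (4.6) and (4.7) driven by the same Brownian path.

    Initial segments: phi = 0 <= psi = 0.1 on [-tau, 0] (illustrative choice).
    Returns both paths on [0, T]."""
    rng = random.Random(seed)
    lag = int(round(tau / dt))   # number of steps covering one delay
    n = int(round(T / dt))
    # history buffers; indices 0..lag correspond to times -tau..0
    x = [0.0] * (lag + 1 + n)    # solution of (4.6), drift x(t - tau)/2
    xh = [0.1] * (lag + 1 + n)   # solution of (4.7), drift x(t - tau)/2 + 1
    for k in range(lag, lag + n):
        dW = rng.gauss(0.0, math.sqrt(dt))  # shared Brownian increment
        x[k + 1] = x[k] + 0.5 * x[k - lag] * dt + sigma(x[k]) * dW
        xh[k + 1] = xh[k] + (0.5 * xh[k - lag] + 1.0) * dt + sigma(xh[k]) * dW
    return x[lag:], xh[lag:]

x_path, xh_path = simulate()
# Theorem 6 predicts x(t) <= xhat(t) pathwise for these two systems.
print(all(a <= b for a, b in zip(x_path, xh_path)))
```

With \(\phi \equiv 0\) the first path stays at the origin (since \(\sigma (0)=0\)), so the predicted ordering is visible immediately; other initial segments with \(\phi \le _{\mathcal {C}}\psi \) behave similarly in simulation.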