1 Introduction

One of the classical directions in the analysis of Markov processes centers around their ergodic properties. In this article, we focus on both qualitative and quantitative aspects of this problem. Let \({\mathbb {X}}\) be a locally compact Polish space, i.e. a locally compact separable completely metrizable topological space. Denote the corresponding metric by \({\mathsf {d}}\), and let \({{\mathbb {T}}}={{\mathbb {R}}}_+\) or \({\mathbb {Z}}_+\) be the time parameter set. We endow \(({\mathbb {X}},{\mathsf {d}})\) with its Borel \(\sigma \)-algebra \({\mathfrak {B}}({\mathbb {X}})\). Further, let \(\bigl (\Omega ,{\mathcal {F}}, \{{\mathcal {F}}_t\}_{t\in {{\mathbb {T}}}},\{\theta _t\}_{t\in {{\mathbb {T}}}}, \{X(t)\}_{t\in {{\mathbb {T}}}},\{{{\mathbb {P}}}_x\}_{x\in {\mathbb {X}}}\bigr )\), denoted by \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) in the sequel, be a time-homogeneous conservative strong Markov process with càdlàg sample paths (when \({{\mathbb {T}}}={{\mathbb {R}}}_+\)) and state space \(\bigl ({\mathbb {X}},{\mathfrak {B}}({\mathbb {X}})\bigr )\), in the sense of [10]. Here, \((\Omega ,{\mathcal {F}}, {{\mathbb {P}}}_x)_{x\in {\mathbb {X}}}\) is a family of probability spaces and \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) satisfies \({{\mathbb {P}}}_x(X(0)=x)=1\), \(\{{\mathcal {F}}_t\}_{t\in {{\mathbb {T}}}}\) is a filtration on \((\Omega ,{\mathcal {F}})\) (non-decreasing family of sub-\(\sigma \)-algebras of \({\mathcal {F}}\)) and \(\{\theta _t\}_{t\in {{\mathbb {T}}}}\) is a family of shift operators on \(\Omega \) satisfying \(X(t)\circ \theta _s = X(t+s)\) for all \(s, t\in {{\mathbb {T}}}\). Recall, \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) is said to be conservative if \({{\mathbb {P}}}_x(X(t)\in {\mathbb {X}})=1\) for all \(t\in {{\mathbb {T}}}\) and \(x\in {\mathbb {X}}\). 
In this article, we give (sharp) sufficient conditions under which \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) admits a unique invariant probability measure \(\uppi (\text {d}x)\), and which ensure that the marginals of \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) converge to \(\uppi (\text {d}x)\), as \(t\rightarrow \infty \), in the \(\text {L}^{p}\)-Wasserstein distance at exponential and subexponential rates.

1.1 Summary of the Results

Before stating the main results of this article, we introduce some notation we need in the sequel. Denote by \(p(t,x,\text {d}y):={{\mathbb {P}}}_x(X(t)\in \text {d}y)\) for \(t\in {{\mathbb {T}}}\) and \(x\in {\mathbb {X}}\), the transition kernel of \(\{X(t)\}_{t\in {{\mathbb {T}}}}\). We endow \({{\mathbb {T}}}\) with the standard (Euclidean Borel in the case when \({{\mathbb {T}}}={{\mathbb {R}}}_+\), and discrete when \({{\mathbb {T}}}={\mathbb {Z}}_+\)) \(\sigma \)-algebra. The process \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) is called

  1. (i)

    irreducible if there exists a \(\sigma \)-finite measure \(\upvarphi (\text {d}x)\) on \({\mathfrak {B}}({\mathbb {X}})\) such that whenever \(\upvarphi (B)>0\) we have \(\int _{{{\mathbb {T}}}}p(t,x,B)\,\uptau (\text {d}t)>0\) for all \(x\in {\mathbb {X}}\), where \(\uptau (\text {d}t)\) stands for the Lebesgue measure on \({{\mathbb {T}}}\) when \({{\mathbb {T}}}={{\mathbb {R}}}_+\), and the counting measure when \({{\mathbb {T}}}={\mathbb {Z}}_+\);

  2. (ii)

    transient if it is irreducible, and there exist \(\{b_k\}_{k\in {{\mathbb {N}}}}\subset [0,\infty )\) and a covering \(\{B_k\}_{k\in {{\mathbb {N}}}}\subseteq {\mathfrak {B}}({\mathbb {X}})\) of \({\mathbb {X}}\), such that \(\int _{{{\mathbb {T}}}}p(t,x,B_k)\,\uptau (\text {d}{t})\le b_k\) for all \(x\in {\mathbb {X}}\) and \(k\in {{\mathbb {N}}}\);

  3. (iii)

    recurrent if it is irreducible, and \(\upvarphi (B)>0\) implies that \(\int _{{{\mathbb {T}}}}p(t,x,B)\,\uptau (\text {d}{t})=\infty \) for all \(x\in {\mathbb {X}}\);

  4. (iv)

    aperiodic if, in the case when \({{\mathbb {T}}}={{\mathbb {R}}}_+\), there exists \(t_0>0\) such that \(\{X(kt_0)\}_{k\in {\mathbb {Z}}_+}\) is irreducible; and, in the case when \({{\mathbb {T}}}={\mathbb {Z}}_+\), there does not exist a partition \(\{B_1,\ldots ,B_k\}\subseteq {\mathfrak {B}}({\mathbb {X}})\) of \({\mathbb {X}}\) with \(k\ge 2\) such that \(p(1,x,B_{i+1}) = 1\) for all \(x\in B_i\) and all \(1 \le i \le k -1\), and \(p(1,x, B_1) = 1\) for all \(x \in B_k\).
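
As a simple illustration of these notions (an example of our own, not taken from the cited references), take \({{\mathbb {T}}}={\mathbb {Z}}_+\) and \({\mathbb {X}}=\{0,1\}\) with the discrete metric, and let the chain alternate deterministically between the two states:

$$\begin{aligned} p(1,0,\{1\}) = p(1,1,\{0\}) = 1, \qquad \text {so that}\qquad \sum _{t\in {\mathbb {Z}}_+} p(t,x,\{y\}) = \infty \quad \forall \, x,y\in \{0,1\}. \end{aligned}$$

With \(\upvarphi (\text {d}x)\) the counting measure, this chain is irreducible and recurrent. It is not aperiodic, since \(B_1=\{0\}\), \(B_2=\{1\}\) is a partition of the type excluded in (iv).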

Let us remark that if \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) is irreducible, then it is either transient or recurrent (see [79, Theorem 2.3]). A Borel measure \(\uppi (\text {d}x)\) on \({\mathbb {X}}\) is called invariant for \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) if \(\int _{\mathbb {X}}p(t,x,\text {d}y)\, \uppi (\text {d}x)=\uppi (\text {d}y)\) for all \(t\in {{\mathbb {T}}}\). It is well known that if \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) is recurrent, then it possesses a unique (up to constant multiples) invariant measure (see [79, Theorem 2.6]). If the invariant measure is finite, then it may be normalized to a probability measure. If \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) is recurrent with finite invariant measure, then it is called positive recurrent; otherwise it is called null recurrent. Note that a transient Markov process cannot have a finite invariant measure. A set \(C\in {\mathfrak {B}}({\mathbb {X}})\) is called petite for \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) if there exist a probability measure \(\upchi (\text {d}t)\) on \({{\mathbb {T}}}\) and a non-trivial Borel measure \(\upnu _\upchi (\text {d}x)\) on \({\mathbb {X}}\), such that

$$\begin{aligned} \int _{{{\mathbb {T}}}}p(t,x,B)\,\upchi (\text {d}t)\,\ge \,\upnu _\upchi (B) \end{aligned}$$

for all \(x\in C\) and \( B\in {\mathfrak {B}}({\mathbb {X}}).\) Recall that petite sets play the role of singletons for Markov processes on general state spaces (see [64, Chap. 5] for a detailed discussion). Denote by \({\mathcal {P}}({\mathbb {X}})\) the class of all Borel probability measures on \({\mathbb {X}}\), and for \(f\in {\mathcal {B}}({\mathbb {X}})\) (the space of real-valued Borel measurable functions on \({\mathbb {X}}\)) let \({\mathcal {P}}_f({\mathbb {X}})\) denote the class of all \(\upmu \in {\mathcal {P}}({\mathbb {X}})\) with the property that \(\int _{{\mathbb {X}}}|f(x)|\,\upmu (\text {d}{x})<\infty \). When \(f(x)=\bigl ({\mathsf {d}}(x_0,x)\bigr )^p\) for some \(p>0\) and \(x_0\in {\mathbb {X}}\), we denote this class by \({\mathcal {P}}_p({\mathbb {X}})\). We adopt the usual notation

$$\begin{aligned} \upmu P_t(\text {d}y)=\int _{{\mathbb {X}}}p(t,x,\text {d}y)\,\upmu (\text {d}{x}),\qquad \text {and}\qquad \upmu \bigl (f\bigr )=\int _{{\mathbb {X}}}f(x)\,\upmu (\text {d}{x}) \end{aligned}$$

for \(t\in {{\mathbb {T}}}\), \(x\in {\mathbb {X}}\), \(\upmu \in {\mathcal {P}}({\mathbb {X}})\) and \(f\in {\mathcal {B}}({\mathbb {X}})\). Therefore, with \(\updelta _x\) denoting the Dirac measure concentrated at \(x\in {\mathbb {X}}\), we have \(\updelta _x P_t(\text {d}y) = p(t,x,\text {d}y)\). Finally, recall that the \(\text {L}^p\)-Wasserstein distance on \({\mathcal {P}}_p({\mathbb {X}})\) with \(p\ge 1\) is defined by

$$\begin{aligned} {\mathcal {W}}_p(\upmu _1,\upmu _2)\,:=\,\inf _{\Pi \in {\mathcal {C}}(\upmu _1,\upmu _2)} \biggl ( \int _{{\mathbb {X}}\times {\mathbb {X}}}\bigl ({\mathsf {d}}(x,y)\bigr )^{p}\, \Pi (\text {d}{x},\text {d}{y})\biggr )^{\nicefrac {1}{p}}, \end{aligned}$$

where \({\mathcal {C}}(\upmu _1,\upmu _2)\) is the family of couplings of \(\upmu _1(\text {d}x)\) and \(\upmu _2(\text {d}x)\), i.e. \(\Pi \in {\mathcal {C}}(\upmu _1,\upmu _2)\) if, and only if, \(\Pi (\text {d}x, \text {d}y)\) is a probability measure on \({\mathbb {X}}\times {\mathbb {X}}\) having \(\upmu _1(\text {d}x)\) and \(\upmu _2(\text {d}x)\) as its marginals. It is well known that \({\mathcal {P}}_p({\mathbb {X}})\) is a complete separable metric space under the metric \({\mathcal {W}}_p\) [82, Theorem 6.18]. The topology generated by \({\mathcal {W}}_p\) on \({\mathcal {P}}_p({\mathbb {X}})\) is finer than the Prokhorov topology, i.e. the topology of weak convergence.
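
To make the definition concrete, consider the case where one of the two measures is a Dirac mass (a standard observation, included here only as an illustration). The only coupling of \(\updelta _x\) and \(\upmu \in {\mathcal {P}}_p({\mathbb {X}})\) is the product measure \(\updelta _x\otimes \upmu \), so the infimum is taken over a single element:

$$\begin{aligned} {\mathcal {W}}_p(\updelta _x,\upmu ) = \biggl (\int _{{\mathbb {X}}}\bigl ({\mathsf {d}}(x,y)\bigr )^{p}\,\upmu (\text {d}{y})\biggr )^{\nicefrac {1}{p}}, \qquad \text {and in particular}\qquad {\mathcal {W}}_p(\updelta _x,\updelta _y) = {\mathsf {d}}(x,y). \end{aligned}$$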

We now state the main results of this article.

Theorem 1.1

Suppose that \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) is irreducible and aperiodic, and there exist a continuous \({\mathcal {V}}:{\mathbb {X}}\rightarrow [1,\infty )\), a constant \(b>0\), a nondecreasing differentiable concave function \(\phi :[1,\infty )\rightarrow (0,\infty )\), and a (topologically) closed petite set \(C\subseteq {\mathbb {X}}\) such that

$$\begin{aligned} {{\mathbb {E}}}_x\bigl [{\mathcal {V}}(X(t))\bigr ] - {\mathcal {V}}(x)&\le b\int _{[0,t)}{{\mathbb {E}}}_x\bigl [\mathbb {1}_C(X(s))\bigr ] \uptau (\text {d}{s}) \nonumber \\&\quad - \int _{[0,t)} {{\mathbb {E}}}_x\bigl [\phi \circ {\mathcal {V}}(X(s))\bigr ]\uptau (\text {d}{s}) \end{aligned}$$
(1.1)

for all \((t,x)\in {{\mathbb {T}}}\times {\mathbb {X}}\). Assume further that \(\sup _{x\in C}{\mathcal {V}}(x)<\infty \), and

$$\begin{aligned} c\,:=\,\inf _{x\in {\mathbb {X}}}\,\dfrac{\phi \circ {\mathcal {V}}(x)}{\bigl (1+{\mathsf {d}}(x,x_0)\bigr )^{\eta }} \,>\, 0 \end{aligned}$$
(1.2)

for some \(\eta \ge 1\) and some (and therefore any) \(x_0\in {\mathbb {X}}\). Then \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) admits a unique invariant \(\uppi \in {\mathcal {P}}_{\phi \circ {\mathcal {V}}}({\mathbb {X}})\). In addition, with \(\Phi (t):=\int _1^t\frac{\text {d}{s}}{\phi (s)}\) and \(r(t) :=\phi \circ \Phi ^{-1}(t)\), the following hold.

  1. (i)

    If \(\displaystyle \lim _{t\rightarrow \infty }\phi '(t)=0\), then for some \(\bar{c}>0\) we have

    $$\begin{aligned}&\left( 1\vee \bigl (r(t)\bigr )^{\nicefrac {(\eta -1)}{\eta }}\right) \, {\mathcal {W}}_1\bigl (\updelta _x P_t,\uppi \bigr ) \,\le \, \bar{c}\, {\mathcal {V}}(x) \qquad \forall \,(t,x)\in {{\mathbb {T}}}\times {\mathbb {X}}, \end{aligned}$$
    (1.3)

    and

    $$\begin{aligned}&\int _{{\mathbb {T}}}\left( 1\vee \bigl (r(t)\bigr )^{\nicefrac {(\eta -1)}{\eta }}\right) \, {\mathcal {W}}_1\bigl (\updelta _x P_t,\updelta _yP_t\bigr )\,\uptau (\text {d}{t}) \,\le \, \bar{c}\,\bigl ({\mathcal {V}}(x)+{\mathcal {V}}(y)\bigr )\quad \forall \, x,y\in {\mathbb {X}}. \end{aligned}$$
    (1.4)
  2. (ii)

    If \(\displaystyle \lim _{t\rightarrow \infty }\phi '(t)=0\), then for any \(p\in [1,\eta ]\) there exists \(\tilde{c}>0\) such that

    $$\begin{aligned} \left( 1\vee \left( t^{\nicefrac {(\eta -p)}{p}}\wedge t^{\nicefrac {(1-p)}{p}}\right) \bigl (r(t)\bigr )^{\nicefrac {(\eta -1)}{p\eta }}\right) \, {\mathcal {W}}_p(\updelta _x P_t,\uppi ) \,\le \, \tilde{c}\, \bigl ({\mathcal {V}}(x) + {\overline{m}}_{\eta }\bigr )\nonumber \\ \end{aligned}$$
    (1.5)

    for all \((t,x)\in {{\mathbb {T}}}\times {\mathbb {X}}\), where \({{\overline{m}}}_\eta =\uppi \bigl (\bigl ({\mathsf {d}}(x_0,\cdot \,)\bigr )^\eta \bigr )\).

  3. (iii)

    If \(\phi (t)={\hat{c}}\,t\) for some \({\hat{c}}>0\), then there exist \({\check{c}}>0\) and \(\gamma >0\), such that

    $$\begin{aligned} \text {e}^{\gamma t}\,{\mathcal {W}}_1\bigl (\updelta _x P_t,\uppi \bigr )\,\le \, {\check{c}}\, {\mathcal {V}}(x)\qquad \forall \,(t,x)\in {{\mathbb {T}}}\times {\mathbb {X}}. \end{aligned}$$
    (1.6)

    In addition, for any \(p\in [1,\eta ]\) there exists \(\breve{c}>0\) such that

    $$\begin{aligned} \bigl (1\vee t^{\nicefrac {\eta }{p}-1}\bigr )\, {\mathcal {W}}_p(\updelta _x P_t,\uppi ) \,\le \, \breve{c}\,\bigl ( {\mathcal {V}}(x)+{\overline{m}}_{\eta }\bigr )^{\nicefrac {1}{p}} \qquad \forall \,(t,x)\in {{\mathbb {T}}}\times {\mathbb {X}}. \end{aligned}$$
    (1.7)
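
For orientation, the rate function \(r(t)\) can be computed explicitly in two model cases. The power-type choice of \(\phi \) below is an illustrative assumption of ours. For \(\phi (t)=t^{\alpha }\) with \(\alpha \in (0,1)\), so that \(\phi '(t)\rightarrow 0\), we have

$$\begin{aligned} \Phi (t) = \frac{t^{1-\alpha }-1}{1-\alpha },\qquad \Phi ^{-1}(t) = \bigl (1+(1-\alpha )t\bigr )^{\nicefrac {1}{(1-\alpha )}},\qquad r(t) = \bigl (1+(1-\alpha )t\bigr )^{\nicefrac {\alpha }{(1-\alpha )}}, \end{aligned}$$

so the weight \(\bigl (r(t)\bigr )^{\nicefrac {(\eta -1)}{\eta }}\) in (1.3) grows polynomially, of order \(t^{\nicefrac {\alpha (\eta -1)}{((1-\alpha )\eta )}}\). For \(\phi (t)={\hat{c}}\,t\) we have instead

$$\begin{aligned} \Phi (t) = \frac{\log t}{{\hat{c}}},\qquad \Phi ^{-1}(t) = \text {e}^{{\hat{c}}t},\qquad r(t) = {\hat{c}}\,\text {e}^{{\hat{c}}t}, \end{aligned}$$

which is consistent with the exponential rate appearing in (1.6).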

The results in Theorem 1.1 should be compared to equations (2.3) and (2.5) in [12, Theorems 2.1 and 2.4] (see also [28, Theorem 3(ii)] and [53, Chap. 4]). The underlying metric \({\mathsf {d}}\) is assumed to be bounded in [12]. The starting point is a Foster–Lyapunov condition of the form in (1.1), and the irreducibility and aperiodicity assumptions are replaced by a closely related structural property: the metric \({\mathsf {d}}\) is contracting, and the sublevel sets of \((x,y)\mapsto {\mathcal {V}}(x)+{\mathcal {V}}(y)\) are \({\mathsf {d}}\)-small (see (3) and (4) in [12, Theorems 2.1 and 2.4]). Then an analogous estimate to (1.3) holds for the corresponding \({\mathcal {W}}_1\)-distance. Observe that when \({\mathsf {d}}\) is bounded, the relation in (1.2) trivially holds for any \(\eta \ge 0\), since \(\phi \circ {\mathcal {V}}\ge \phi (1)>0\). Provided \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) is irreducible and aperiodic, this gives an analogous result to the one obtained in [12, Theorems 2.1 and 2.4] (in \({\mathcal {W}}_1\)-distance) without assuming either contraction properties of \({\mathsf {d}}\) or \({\mathsf {d}}\)-smallness of the sublevel sets of \((x,y)\mapsto {\mathcal {V}}(x)+{\mathcal {V}}(y)\). The proof of Theorem 1.1 relies on [25, Theorem 3.2] and [24, Theorem 2.8], where, under the assumptions of Theorem 1.1, the authors show ergodicity of \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) in the f-norm with rate \(\Psi _1\circ r(t)\) and \(f(x)=\Psi _2\circ \phi \circ {\mathcal {V}}(x)\vee 1\), for any pair \((\Psi _1^{-1},\Psi _2^{-1})\) of Young’s functions. Recall, for a signed Borel measure \(\upmu (\text {d}x)\) on \({\mathbb {X}}\) and a function \(f:{\mathbb {X}}\rightarrow [1,\infty ]\) the so-called f-norm of \(\upmu (\text {d}x)\) is defined as

$$\begin{aligned} \Vert \upmu \Vert _f\,:=\,\sup _{g\in {\mathcal {B}}({\mathbb {X}}),\, |g|\le f}\, \bigl |\upmu (g)\bigr |, \end{aligned}$$
(1.8)

generalizing the usual total variation norm \(\Vert \upmu \Vert _{\text {TV}}:=\sup _{g\in {\mathcal {B}}({\mathbb {X}}),\, |g|\le 1}\, \bigl |\upmu (g)\bigr |\). We remark here that convergence in the f-norm does not in general imply convergence in the \({\mathcal {W}}_p\)-distance, and vice versa (see Sect. 3 for examples of such Markov processes).
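
A two-point example (ours, for illustration only) shows how the f-norm ignores the geometry of \({\mathbb {X}}\). For \(x\ne y\) with \(f(x),f(y)<\infty \), testing (1.8) against \(g=f(x)\mathbb {1}_{\{x\}}-f(y)\mathbb {1}_{\{y\}}\) gives

$$\begin{aligned} \Vert \updelta _x-\updelta _y\Vert _f = f(x)+f(y),\qquad \text {and in particular}\qquad \Vert \updelta _x-\updelta _y\Vert _{\text {TV}} = 2, \end{aligned}$$

no matter how small \({\mathsf {d}}(x,y)\) is, whereas \({\mathcal {W}}_p(\updelta _x,\updelta _y)={\mathsf {d}}(x,y)\) because \(\updelta _{(x,y)}\) is the only coupling of the two Dirac masses.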

In the following theorem we establish a lower bound for \({\mathcal {W}}_p\)-convergence, which matches the upper bounds obtained in (1.3) and (1.5). For \(\gamma \in C([0,1];{\mathbb {X}})\) (the space of continuous mappings from [0, 1] to \({\mathbb {X}}\)) let

$$\begin{aligned}\Lambda (\gamma )\,:=\, \sup _{k\in {{\mathbb {N}}}}\,\sup _{0=u_0<u_1<\cdots<u_{k-1} <u_k=1}\Big (\textsf {d}\bigl (\gamma (u_0),\gamma (u_1)\bigr )+\cdots +\textsf {d} \bigl (\gamma (u_{k-1}),\gamma (u_k)\bigr )\Big ). \end{aligned}$$

The space \({\mathbb {X}}\) is called a length space if

$$\begin{aligned} \textsf {d}(x,y) = \inf _{\gamma \in C([0,1];{\mathbb {X}})} \left\{ \Lambda (\gamma ):\gamma (0)=x,\ \gamma (1)=y \right\} \qquad \forall \, x,y\in {\mathbb {X}}. \end{aligned}$$
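
For instance (a standard example, stated here for illustration), \({{\mathbb {R}}}^n\) with the Euclidean metric is a length space. The triangle inequality gives \(\Lambda (\gamma )\ge \textsf {d}\bigl (\gamma (0),\gamma (1)\bigr )\) for every path \(\gamma \), and the straight segment attains equality:

$$\begin{aligned} \gamma (u) = x+u(y-x),\quad u\in [0,1],\qquad \Lambda (\gamma ) = \sup _{k\in {{\mathbb {N}}}}\,\sup _{0=u_0<\cdots <u_k=1}\,\sum _{i=1}^{k}(u_i-u_{i-1})\,|y-x| = |x-y|. \end{aligned}$$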

Theorem 1.2

Assume that \({\mathbb {X}}\) is a length space, \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) satisfies (1.1), and there exist a Lipschitz continuous function \(L:{\mathbb {X}}\rightarrow [0,\infty )\) and constants \(\theta >\vartheta \ge 1\) and \(c>0\), such that

$$\begin{aligned} {\mathcal {V}}(x)\,\ge \, c\,\bigl (L(x)\bigr )^{\theta },\quad \text {and}\quad \phi \circ {\mathcal {V}}(x)\,\ge \, c\,\bigl (L(x)\bigr )^\vartheta \qquad \forall \,x\in {\mathbb {X}}. \end{aligned}$$

In addition, suppose that \(\{X(t)\}_{t\in {{\mathbb {T}}}}\) admits an invariant \(\uppi \in {\mathcal {P}}({\mathbb {X}})\) such that \(\int _{{\mathbb {X}}}\bigl (L(x)\bigr )^{\vartheta +\varepsilon }\,\uppi (\text {d}x)=\infty \) for some \(\varepsilon \in (0,\theta -\vartheta )\). Then, for each \(p\in [1,\vartheta ]\), \(\iota \in (0,\theta -\vartheta -\varepsilon )\) and \(x\in {\mathbb {X}}\), there exist a constant \(\bar{c}>0\) and a diverging increasing sequence \(\{t_n\}_{n\in {{\mathbb {N}}}}\subseteq {{\mathbb {T}}}\), depending on these parameters, such that

$$\begin{aligned} {\mathcal {W}}_p(\updelta _x P_{t_n},\uppi )\,\ge \,\bar{c}\, \bigl (t_n+{\mathcal {V}}(x)\bigr )^{-\frac{\vartheta -p+\varepsilon +\iota }{(\theta -\vartheta -\varepsilon -\iota )p}}\qquad \forall \, n\in {{\mathbb {N}}}. \end{aligned}$$
(1.9)

Note that the parameters \(\theta \), \(\vartheta \), \(\varepsilon \), p and \(\iota \) are such that the exponent in the above expression is always strictly negative. Lower bounds for the convergence in the total variation norm are discussed in [38, Theorem 5.1 and Corollary 5.2]. Applications of Theorem 1.2 are discussed in Sect. 3.
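
To see the sign of the exponent in (1.9), note that \(p\le \vartheta \) and \(\iota <\theta -\vartheta -\varepsilon \) give

$$\begin{aligned} \vartheta -p+\varepsilon +\iota \,\ge \,\varepsilon +\iota \,>\,0,\qquad \text {and}\qquad (\theta -\vartheta -\varepsilon -\iota )\,p\,>\,0. \end{aligned}$$

As a numerical illustration, with the values \(\theta =4\), \(\vartheta =2\), \(\varepsilon =\iota =\nicefrac {1}{2}\) and \(p=1\) (chosen by us purely as an example), the exponent equals

$$\begin{aligned} -\frac{2-1+\nicefrac {1}{2}+\nicefrac {1}{2}}{\bigl (4-2-\nicefrac {1}{2}-\nicefrac {1}{2}\bigr )\cdot 1} = -2, \end{aligned}$$

so that (1.9) reads \({\mathcal {W}}_1(\updelta _x P_{t_n},\uppi )\ge \bar{c}\,\bigl (t_n+{\mathcal {V}}(x)\bigr )^{-2}\).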

1.2 Ergodicity of a Class of Lévy-Type Processes

Here, we discuss ergodic properties of a class of Markov processes on the Euclidean space \({{\mathbb {R}}}^n\) (endowed with the standard Euclidean metric) generated by a (Lévy-type) operator \({\mathcal {L}}:{\mathcal {D}}({\mathcal {L}})\subseteq {\mathcal {B}}({{\mathbb {R}}}^n)\rightarrow {\mathcal {B}}({{\mathbb {R}}}^n)\) given by

$$\begin{aligned} {\mathcal {L}} f(x)= & {} \bigl \langle b(x),\nabla f(x)\bigr \rangle +\frac{1}{2}{{\,\mathrm{Tr}\,}}\bigl (a(x)\nabla ^2f(x)\bigr ) \nonumber \\&+\int _{{{\mathbb {R}}}^n}{\mathfrak {d}}_1 f(x;y)\upnu (x,\text {d}{y}),\qquad x\in {{\mathbb {R}}}^n.\nonumber \\ \end{aligned}$$
(1.10)

Here, \(b=(b_i)_{i=1,\ldots ,n}:{{\mathbb {R}}}^n\rightarrow {{\mathbb {R}}}^n\) is Borel measurable, \(a=(a_{ij})_{1\le i,j\le n}:{{\mathbb {R}}}^n\rightarrow {{\mathbb {R}}}^{n\times n}\) is a symmetric non-negative definite \(n \times n\) matrix-valued Borel measurable function, \(\upnu (x, \text {d}y)\) is a nonnegative Borel kernel on \({{\mathbb {R}}}^n\times {\mathfrak {B}}({{\mathbb {R}}}^n)\), called the Lévy kernel, satisfying

$$\begin{aligned} \upnu (x,\{0\}) = 0,\quad \text {and}\quad \int _{{{\mathbb {R}}}^n}\bigl (1\wedge |y|^2\bigr ) \,\upnu (x,\text {d}y)<\infty \qquad \forall \,x\in {{\mathbb {R}}}^n, \end{aligned}$$

and

$$\begin{aligned} {\mathfrak {d}}_1 f(x;y)\,:=\,f(x+y)-f(x)-\mathbb {1}_{{\mathcal {B}}}(y) \langle y,\nabla f(x)\rangle ,\qquad x,y\in {{\mathbb {R}}}^n,\quad f\in C^1({{\mathbb {R}}}^n). \end{aligned}$$

The symbol \({\mathcal {D}}({\mathcal {L}})\) stands for the domain of \({\mathcal {L}}\), i.e. the set of functions \(f\in {\mathcal {B}}({{\mathbb {R}}}^n)\) for which (1.10) is well defined, \(\langle \cdot ,\cdot \rangle \) and \(|\cdot |\) denote the standard inner product and the corresponding Euclidean norm on \({{\mathbb {R}}}^n\), \({{\,\mathrm{Tr}\,}}M\) stands for the trace of a square matrix M, and \(\nabla ^2f(x)\) denotes the Hessian of \(f\in C^2({{\mathbb {R}}}^n)\). An open (resp. closed) ball of radius \(r>0\) centered at x is denoted by \({\mathcal {B}}_r(x)\) (resp. \({\overline{{\mathcal {B}}}}_r(x)\)). If \(x=0\), we write \({\mathcal {B}}_r\) (resp. \({\overline{{\mathcal {B}}}}_r\)), and the unit open (resp. closed) ball centered at 0 is denoted by \({\mathcal {B}}\) (resp. \({\overline{{\mathcal {B}}}}\)). Observe that \(C_b^2({{\mathbb {R}}}^n)\subseteq {\mathcal {D}}({\mathcal {L}})\), where \(C_b^k({{\mathbb {R}}}^n)\), \(k\ge 0\), denotes the space of k times differentiable functions such that all derivatives up to order k are bounded. We also denote by \(\Vert M\Vert :=\bigl ({{\,\mathrm{Tr}\,}}MM'\bigr )^{\nicefrac {1}{2}}\) the Hilbert–Schmidt norm of a matrix M, where \(M'\) stands for the transpose of M.

We introduce the following assumption:

  • (MP) There exists a conservative strong Markov process \({\{X(t)\}_{t\ge 0}}\) with càdlàg sample paths such that

    $$\begin{aligned} M_f(t)\,:=\,f\bigl (X(t)\bigr )-f\bigl (X(0)\bigr ) -\int _0^t{\mathcal {L}}f\bigl (X(s)\bigr )\,\text {d}s, \qquad t\ge 0, \end{aligned}$$
    (1.11)

    is a \({{\mathbb {P}}}_{x}\)-martingale (with respect to \(\{{\mathcal {F}}_t\}_{t\ge 0}\)) for any \(f\in C_c^\infty ({{\mathbb {R}}}^n)\) (the space of smooth functions with compact support).

Define

$$\begin{aligned} q(x,\xi )\,&:=\,-i\langle \xi ,b(x)\rangle +\frac{1}{2}\langle \xi ,a(x)\xi \rangle \\&\quad +\int _{{{\mathbb {R}}}^n}\bigl (1-\text {e}^{i\langle \xi ,y\rangle }+i\langle \xi ,y\rangle \mathbb {1}_{{\mathcal {B}}}(y)\bigr )\upnu (x,\text {d}y),\qquad x,\xi \in {{\mathbb {R}}}^n, \end{aligned}$$

and observe that

$$\begin{aligned} {\mathcal {L}}f(x) = -\int _{{{\mathbb {R}}}^n}\text {e}^{i\langle \xi ,x\rangle }q(x,\xi )\hat{f}(\xi )\, \text {d}\xi \end{aligned}$$

for all \(x\in {{\mathbb {R}}}^n\) and \(f\in C_c^\infty ({{\mathbb {R}}}^n),\) where \(\hat{f}(\xi ):=(2\pi )^{-n}\int _{{{\mathbb {R}}}^n}\text {e}^{-i\langle \xi ,x\rangle }f(x)\,\text {d}x\) denotes the Fourier transform of f(x). In other words, \({\mathcal {L}}\) is a pseudo-differential operator with symbol \(q(x,\xi )\). According to [52, Theorem 1.1], (MP) is satisfied if

  • (LB) The functions b(x), a(x), and \(x\mapsto \int _{{{\mathbb {R}}}^n}\bigl (1\wedge |y|^2\bigr )\,\upnu (x,\text {d}y)\) are locally bounded.

  • (SG) \(x\mapsto q(x,\xi )\) is continuous for all \(\xi \in {{\mathbb {R}}}^n\), and \(q(x,\xi )\) is locally uniformly continuous at \(\xi =0\), i.e.

    $$\begin{aligned} \lim _{\rho \rightarrow \infty }\,\sup _{x\in {\mathcal {B}}_\rho }\,\sup _{\xi \in {\mathcal {B}}_{\nicefrac {1}{\rho }}}\, \bigl |q(x,\xi )\bigr | = 0. \end{aligned}$$

Observe that the second condition in (SG) essentially means that the coefficients b(x), a(x), and \(\upnu (x,\text {d}y)\) have sublinear growth. Namely, it is satisfied if

$$\begin{aligned} \lim _{\rho \rightarrow \infty }\,\Biggl (\frac{\sup _{x \in {\mathcal {B}}_\rho }|b(x)|}{\rho } +\frac{\sup _{x \in {\mathcal {B}}_\rho }|a(x)|}{\rho ^2}&+\frac{\sup _{x \in {\mathcal {B}}_\rho }\int _{{\mathcal {B}}}|y|^2\,\upnu (x,\text {d}y)}{\rho ^2}\\ {}&+\sup _{x\in {\mathcal {B}}_\rho }\,\sup _{\xi \in {\mathcal {B}}_{\nicefrac {1}{\rho }}}\, \int _{{\mathcal {B}}^c} \bigl (1-\text {e}^{i\langle \xi ,y\rangle }\bigr )\upnu (x,\text {d}y)\Biggr ) = 0. \end{aligned}$$

In order to allow linear growth of the coefficients, we replace (LB) and (SG) by

  • (LG) \({\mathcal {L}}\bigl (C_c^\infty ({{\mathbb {R}}}^n)\bigr )\subseteq C_\infty ({{\mathbb {R}}}^n)\), \(x\mapsto q(x,\xi )\) is continuous for all \(\xi \in {{\mathbb {R}}}^n\), and

    $$\begin{aligned} \limsup _{|x|\rightarrow \infty }\,\sup _{\xi \in {\mathcal {B}}_{\nicefrac {1}{|x|}}}\,\bigl |q(x,\xi )\bigr | <\infty \end{aligned}$$

    (see [50, Corollary 3.2]).

Here, \(C_\infty ({{\mathbb {R}}}^n)\) stands for the space of continuous functions vanishing at infinity. Clearly, the last condition in (LG) follows from

$$\begin{aligned} \limsup _{|x|\rightarrow \infty }\,\left( \frac{|b(x)|}{|x|}+\frac{\Vert a(x)\Vert }{|x|^2} +\frac{\int _{{\mathcal {B}}}|y|^2\upnu (x,\text {d}y)}{|x|^2}+\upnu (x,{\mathcal {B}}^c)\right) <\infty . \end{aligned}$$
(1.12)
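
Two standard examples (our own illustration, with \(\upnu \equiv 0\) and \(a(x)\) the identity matrix) show the difference between the growth conditions in (SG) and (LG). For Brownian motion, \(b(x)=0\), and for an Ornstein–Uhlenbeck-type drift, \(b(x)=-x\), the symbols are

$$\begin{aligned} q_{\text {BM}}(x,\xi ) = \frac{1}{2}|\xi |^2,\qquad \text {and}\qquad q_{\text {OU}}(x,\xi ) = i\langle \xi ,x\rangle +\frac{1}{2}|\xi |^2, \end{aligned}$$

where the subscripts are labels introduced only for this example. Consequently,

$$\begin{aligned} \sup _{x\in {\mathcal {B}}_\rho }\,\sup _{\xi \in {\mathcal {B}}_{\nicefrac {1}{\rho }}}\,\bigl |q_{\text {BM}}(x,\xi )\bigr | = \frac{1}{2\rho ^2}\xrightarrow [\rho \rightarrow \infty ]{}0,\qquad \sup _{x\in {\mathcal {B}}_\rho }\,\sup _{\xi \in {\mathcal {B}}_{\nicefrac {1}{\rho }}}\,\bigl |q_{\text {OU}}(x,\xi )\bigr | \,\ge \, 1\quad \forall \,\rho >0, \end{aligned}$$

so the second condition in (SG) holds in the first case and fails in the second. On the other hand, \(\sup _{\xi \in {\mathcal {B}}_{\nicefrac {1}{|x|}}}|q_{\text {OU}}(x,\xi )|\le 1+\nicefrac {1}{(2|x|^2)}\), so the last condition in (LG) is satisfied for the linear drift.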

Let us also remark that due to [51, Theorem A1] the map \(x\mapsto q(x,\xi )\) is continuous for all \(\xi \in {{\mathbb {R}}}^n\) if b(x) and a(x) are continuous, and for any \(r>0\), \(x\in {{\mathbb {R}}}^n\) and \(f\in C_c({{\mathbb {R}}}^n\setminus \{0\})\),

$$\begin{aligned}&\lim _{\rho \rightarrow \infty }\,\sup _{y\in {\mathcal {B}}_r}\, \upnu (y,{\mathcal {B}}_\rho ^c) = 0, \qquad \lim _{\rho \rightarrow 0}\,\sup _{y\in {\mathcal {B}}_r}\, \int _{{\mathcal {B}}_\rho }|z|^2\upnu (y,\text {d}z) = 0, \end{aligned}$$

and

$$\begin{aligned}&\lim _{y\rightarrow x}\, \int _{{{\mathbb {R}}}^n}f(z)\,\upnu (y,\text {d}z) = \int _{{{\mathbb {R}}}^n}f(z)\,\upnu (x,\text {d}z). \end{aligned}$$

Furthermore, under the continuity of \(x\mapsto q(x,\xi )\) (for all \(\xi \in {{\mathbb {R}}}^n\)) in the same reference it has been shown that \({\mathcal {L}}\bigl (C_c^\infty ({{\mathbb {R}}}^n)\bigr )\subseteq C_b({{\mathbb {R}}}^n)\). In addition, if

$$\begin{aligned} \lim _{|x|\rightarrow \infty }\, \upnu \bigl (x,{\mathcal {B}}_r(-x)\bigr ) = 0\qquad \forall \,r>0, \end{aligned}$$

we easily see that \({\mathcal {L}}\bigl (C_c^\infty ({{\mathbb {R}}}^n)\bigr )\subseteq C_\infty ({{\mathbb {R}}}^n)\).

Definition 1.3

Let \({\mathcal {M}}_+\) denote the class of positive definite matrices in \({{\mathbb {R}}}^{n\times n}\). For \(Q\in {\mathcal {M}}_+\), let \(|x|_Q:=\langle x,Qx\rangle ^{\nicefrac {1}{2}}\) for \(x\in {{\mathbb {R}}}^n\), and \(\chi _Q\in C^\infty ({{\mathbb {R}}}^n)\) be some nonnegative, symmetric convex function such that \(\chi _Q(x)=|x|_Q\) for \(x\in {\mathcal {B}}^c\). For \(Q\in {\mathcal {M}}_+\) and \(\zeta >0\), we define

$$\begin{aligned} {\mathcal {V}}_{Q,\zeta }(x) \,:=\, \bigl (\chi _Q(x)\bigr )^\zeta , \quad \text {and}\quad {\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x)\,:=\,\text {e}^{\zeta \chi _Q(x)},\qquad x\in {{\mathbb {R}}}^n. \end{aligned}$$

Further, let

$$\begin{aligned} \Theta _\upnu \,:=\,\left\{ \theta \ge 0\,:\sup _{x\in {{\mathbb {R}}}^n}\int _{{{\mathbb {R}}}^n} \bigl (|y|^{2}\,\mathbb {1}_{{\mathcal {B}}}(y)+|y|^{\theta }\,\mathbb {1}_{{\mathcal {B}}^c}(y)\bigr )\,\upnu (x,\text {d}{y}) <\infty \right\} , \end{aligned}$$

and when \(\Theta _\upnu \ne \emptyset \), let \(\theta _\upnu :=\sup \Theta _\upnu \).
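
For example (a standard kernel, used here only as an illustration), let \(\upnu (x,\text {d}y)=|y|^{-n-\alpha }\,\text {d}y\) with \(\alpha \in (0,2)\), which is, up to a multiplicative constant, the Lévy kernel of a rotationally symmetric \(\alpha \)-stable process. Passing to polar coordinates, with \(\omega _n\) denoting the surface area of the unit sphere in \({{\mathbb {R}}}^n\),

$$\begin{aligned} \int _{{\mathcal {B}}}|y|^{2}\,\frac{\text {d}y}{|y|^{n+\alpha }} = \omega _n\int _0^1 r^{1-\alpha }\,\text {d}r<\infty ,\qquad \int _{{\mathcal {B}}^c}|y|^{\theta }\,\frac{\text {d}y}{|y|^{n+\alpha }} = \omega _n\int _1^\infty r^{\theta -\alpha -1}\,\text {d}r<\infty \iff \theta <\alpha . \end{aligned}$$

Hence \(\Theta _\upnu =[0,\alpha )\) and \(\theta _\upnu =\alpha \notin \Theta _\upnu \), which explains why statements involving \(\theta _\upnu \) restrict \(\theta \) to \((0,\theta _\upnu ]\cap \Theta _\upnu \).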

We now discuss ergodic properties of the Lévy-type process \({\{X(t)\}_{t\ge 0}}\).

Theorem 1.4

Assume (LB) and (MP), and suppose that \({\{X(t)\}_{t\ge 0}}\) is irreducible and aperiodic, and that every compact set is petite for \({\{X(t)\}_{t\ge 0}}\). Then the following hold.

  1. (i)

    If \(\theta _\upnu >0\),

    $$\begin{aligned} \lim _{r\rightarrow \infty }\,\sup _{x\in {{\mathbb {R}}}^n}\, \int _{{\mathcal {B}}^c_r}|y|^\theta \,\upnu (x,\text {d}y) = 0 \end{aligned}$$
    (1.13)

    for some \(\theta \in (0,\theta _\upnu ]\cap \Theta _\upnu \), and there exist \(Q\in {\mathcal {M}}_+\) and \(\vartheta \in [0\vee (2-\theta ),2)\) such that

    $$\begin{aligned} \limsup _{|x|\rightarrow \infty }\frac{\Vert a(x)\Vert }{|x|^\vartheta } = 0, \quad \text {and}\quad \limsup _{|x|\rightarrow \infty }\,\frac{\bigl \langle b(x) +\mathbb {1}_{[1,\infty )}(\theta )\int _{{\mathcal {B}}^c} y\,\upnu (x,\text {d}{y}), Qx\bigr \rangle }{|x|^\vartheta } < 0, \end{aligned}$$

    then \({\{X(t)\}_{t\ge 0}}\) admits a unique invariant \(\uppi \in {\mathcal {P}}_{\theta -2+\vartheta }({{\mathbb {R}}}^n)\). In addition, if \(\theta -3+\vartheta \ge 0\), then Theorem 1.1(i) and (ii) hold with \({\mathcal {V}}(x)={\mathcal {V}}_{Q,\theta }(x)+1\), \(\phi (t)=t^{\nicefrac {(\theta -2+\vartheta )}{\theta }}\) and \(\eta =\theta -2+\vartheta \).

  2. (ii)

    If \(\theta _\upnu >0\), (1.13) holds for some \(\theta \in (0,\theta _\upnu ]\cap \Theta _\upnu \), and there exists \(Q\in {\mathcal {M}}_+\) such that

    $$\begin{aligned} \limsup _{|x|\rightarrow \infty }\,\frac{\Vert a(x)\Vert }{|x|^2} = 0, \quad \text {and}\quad \limsup _{|x|\rightarrow \infty }\,\frac{\bigl \langle b(x)+\mathbb {1}_{[1,\infty )}(\theta )\, \int _{{\mathcal {B}}^c}y\,\upnu (x,\text {d}{y}), Qx\bigr \rangle }{|x|^2} < 0, \end{aligned}$$

    then \({\{X(t)\}_{t\ge 0}}\) admits a unique invariant \(\uppi \in {\mathcal {P}}_{\theta }({{\mathbb {R}}}^n)\). In addition, if \(\theta \ge 1\), then the conclusion of Theorem 1.1(iii) holds with \({\mathcal {V}}(x)={\mathcal {V}}_{Q,\theta }(x)+1\) and \(\eta =\theta \).

  3. (iii)

    Suppose that a(x) is bounded, and there exist \(\theta >0\) and \(Q\in {\mathcal {M}}_+\), such that

    $$\begin{aligned} \sup _{x\in {{\mathbb {R}}}^n}\,\int _{{{\mathbb {R}}}^n} \bigl (|y|^2\mathbb {1}_{{\mathcal {B}}}(y)+\text {e}^{\theta |y|}\mathbb {1}_{{\mathcal {B}}^c}(y)\bigr )\, \upnu (x,\text {d}y)<\infty , \end{aligned}$$
    (1.14)

    and

    $$\begin{aligned} \limsup _{|x|\rightarrow \infty }\,\frac{\bigl \langle b(x)+\int _{{\mathcal {B}}^c}y\,\upnu (x,\text {d}{y}), Qx\bigr \rangle }{|x|} < 0. \end{aligned}$$

    Then the conclusion of Theorem 1.1(iii) holds with \({\mathcal {V}}(x)={\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x)\) for any \(\zeta >0\) sufficiently small and any \(\eta \ge 1\).
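
To read off the rate in part (i) explicitly, we sketch the computation of \(r(t)=\phi \circ \Phi ^{-1}(t)\) for \(\phi (t)=t^{\nicefrac {(\theta -2+\vartheta )}{\theta }}\), assuming \(\theta -3+\vartheta \ge 0\). With \(\Phi (t)=\int _1^t\frac{\text {d}{s}}{\phi (s)}\) as in Theorem 1.1, and writing \(\alpha :=\nicefrac {(\theta -2+\vartheta )}{\theta }\in (0,1)\) (a shorthand used only here), so that \(1-\alpha =\nicefrac {(2-\vartheta )}{\theta }\), we get

$$\begin{aligned} \Phi (t) = \frac{t^{1-\alpha }-1}{1-\alpha },\qquad r(t) = \Bigl (1+\tfrac{2-\vartheta }{\theta }\,t\Bigr )^{\nicefrac {(\theta -2+\vartheta )}{(2-\vartheta )}},\qquad \bigl (r(t)\bigr )^{\nicefrac {(\eta -1)}{\eta }} = \Bigl (1+\tfrac{2-\vartheta }{\theta }\,t\Bigr )^{\nicefrac {(\theta -3+\vartheta )}{(2-\vartheta )}} \end{aligned}$$

for \(\eta =\theta -2+\vartheta \). Thus, in this setting, (1.3) yields polynomial decay of \({\mathcal {W}}_1\bigl (\updelta _x P_t,\uppi \bigr )\) of order \(t^{-\nicefrac {(\theta -3+\vartheta )}{(2-\vartheta )}}\).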

Irreducibility and aperiodicity are crucial structural properties of the underlying process in Theorems 1.1 and 1.4. Roughly speaking, they ensure that the process does not show singular behavior in its motion, and together with the Foster–Lyapunov condition in (1.1) (which ensures controllability of the \(\phi \circ \Phi ^{-1}\)-modulated moment of return-times to the petite set C, see [25, Theorem 4.1]) they lead to the ergodic properties stated.

Under an asymptotic flatness (uniform dissipativity) property (see (1.16)), we use a completely different approach to this problem, the so-called synchronous coupling method (see [14, Example 2.16] for details), to obtain ergodic properties for a class of Itô processes which are not necessarily irreducible and aperiodic. Recall that an Itô process is a solution to a stochastic differential equation (SDE) of the following form

$$\begin{aligned} \begin{aligned} X(t)&= x+\int _0^tb\bigl (X(s)\bigr )\,\text {d}s+\int _0^t\sigma \bigl (X(s)\bigr )\,\text {d}B_s\\&\quad +\int _0^t\int _{\{w:|k(X(s-),w)|<1\}} k(X(s-),v)\, \bigl (\nu _p(\text {d}v,\text {d}s)-\nu (\text {d}v)\, \text {d}s\bigr )\\&\quad +\int _0^t\int _{\{w:|k(X(s-),w)|\ge 1\}} k(X(s-),v)\, \nu _p(\text {d}v,\text {d}s), \quad (t,x)\in [0,\infty )\times {{\mathbb {R}}}^n, \end{aligned} \end{aligned}$$
(1.15)

where \(b:{{\mathbb {R}}}^n\rightarrow {{\mathbb {R}}}^n\), \(\sigma :{{\mathbb {R}}}^n\rightarrow {{\mathbb {R}}}^{n\times n}\) and \(k:{{\mathbb {R}}}^n\times {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}^n\) are Borel measurable, \({\{B(t)\}_{t\ge 0}}\) is a standard n-dimensional Brownian motion, and \(\nu _p(\text {d}v,\text {d}s)\) is a Poisson random measure on \({\mathfrak {B}}({{\mathbb {R}}})\otimes {\mathfrak {B}}\bigl ([0,\infty )\bigr )\), with intensity measure \(\nu (\text {d}v)\,\text {d}s\) (a \(\sigma \)-finite measure on \({\mathfrak {B}}({{\mathbb {R}}})\otimes {\mathfrak {B}}\bigl ([0,\infty )\bigr )\)). According to [13, Theorem 3.33], every Itô process is a semimartingale Hunt process. In particular, it is a conservative strong Markov process with càdlàg sample paths. Conversely, again by [13, Theorem 3.33], for every n-dimensional semimartingale Hunt process \({\{X(t)\}_{t\ge 0}}\), and every \(\sigma \)-finite nonfinite and nonatomic measure \(\nu (\text {d}v)\) on \({\mathfrak {B}}({{\mathbb {R}}})\), there exist b(x), \(\sigma (x)\), \(k(x,v)\), \({\{B(t)\}_{t\ge 0}}\), and \(\nu _p(\text {d}v, \text {d}s)\) as above (possibly defined on an enlargement of the initial stochastic basis), such that \({\{X(t)\}_{t\ge 0}}\) satisfies the relation in (1.15). By setting

$$\begin{aligned} \upnu _p(\text {d}y,\text {d}s) = \nu _p\bigl (\{(v,u)\in {{\mathbb {R}}}\times [0,\infty ) :(k(X(u-),v),u)\in (\text {d}y,\text {d}s)\}\bigr ), \end{aligned}$$

and

$$\begin{aligned} \upnu (x,\text {d}y) = \nu \bigl (\{u\in {{\mathbb {R}}}:k(x,u)\in \text {d}y\}\bigr ), \end{aligned}$$

the relation in (1.15) reads as

$$\begin{aligned} \begin{aligned} X(t)&= x+\int _0^tb\bigl (X(s)\bigr )\,\text {d}s+\int _0^t\sigma \bigl (X(s)\bigr )\,\text {d}B_s\\&\quad +\int _0^t\int _{{\mathcal {B}}} y\,\bigl (\upnu _p(\text {d}y,\text {d}s)-\upnu (X(s-),\text {d}y)\,\text {d}s\bigr )\\&\quad +\int _0^t\int _{{\mathcal {B}}^c} y\, \upnu _p(\text {d}y,\text {d}s), \qquad (t,x)\in [0,\infty )\times {{\mathbb {R}}}^n. \end{aligned} \end{aligned}$$

Set \(a(x):=\sigma (x)\sigma (x)'\), and let \({\mathcal {L}}\) be as in (1.10). According to [40, Theorem II.2.42] (with \(h(x)=x\mathbb {1}_{{\mathcal {B}}}(x)\)), for any \(f\in C_b^2({{\mathbb {R}}}^n)\), the process \({\{M_f(t)\}_{t\ge 0}}\), defined as in (1.11), is a \({{\mathbb {P}}}_x\)-local martingale for every \(x\in {{\mathbb {R}}}^n\). In addition, if (LB) holds true, then \({\{M_f(t)\}_{t\ge 0}}\) is a \({{\mathbb {P}}}_x\)-martingale for every \(f\in C_c^\infty ({{\mathbb {R}}}^n)\) and every \(x\in {{\mathbb {R}}}^n\), i.e. (MP) is satisfied.

For \(x,z\in {{\mathbb {R}}}^{n}\) define

$$\begin{aligned} \varDelta _{z}b(x):= & {} b(x+z)-b(x),\quad \varDelta _{z}\sigma (x)\,:=\,\sigma (x+z)-\sigma (x),\\ \varDelta _{z}\upnu (x,\text {d}y):= & {} \upnu (x+z,\text {d}y)-\upnu (x,\text {d}y),\\ \varDelta _{z}\tilde{b}(x):= & {} \varDelta _{z}b(x)+\int _{{\mathcal {B}}^c}y\, \varDelta _{z}\upnu (x,\text {d}y),\quad \text {and} \quad \tilde{a}(x;z):=\varDelta _{z}\sigma (x) \varDelta _{z}\sigma (x)'. \end{aligned}$$

If \(b(x)\equiv b\) (resp. \(\sigma (x)\equiv \sigma \), or \(\upnu (x,\text {d}y)\equiv \upnu (\text {d}y)\)), then of course \(\varDelta _{z}b(x)\) (resp. \(\varDelta _{z}\sigma (x)\), or \(\varDelta _{z}\upnu (x,\text {d}y)\)) is equal to zero.

Theorem 1.5

Assume that b(x) and a(x) are locally bounded and satisfy the linear growth condition in (1.12), and that \(\upnu (x,\text {d}y)\) is such that \(2\in \Theta _\upnu \). If for some \(p\in [2,\theta _\upnu ]\cap \Theta _\upnu \) there exist \(Q\in {\mathcal {M}}_+\), and a \(\sigma \)-finite nonfinite and nonatomic measure \(\nu (\text {d}v)\) on \({\mathfrak {B}}({{\mathbb {R}}})\) such that (1.15) admits a unique strong solution \({\{X(t)\}_{t\ge 0}}\), and

$$\begin{aligned} 2\,\bigl \langle \varDelta _{z}\tilde{b}(x),Qz\bigr \rangle&+{{\,\mathrm{Tr}\,}}\, \bigl (\tilde{a}(x;z) Q\bigr )+(p-2)\,\bigl \Vert \sqrt{Q}\,\varDelta _{z}\sigma (x)\bigr \Vert ^{2}\nonumber \\&+2^{p-3}\bigl (1+(p-2)\Vert Q^{-1}\Vert \bigr ) \nonumber \\&\times \int _{{{\mathbb {R}}}}\bigl |k(x+z,v)-k(x,v)\bigr |_Q^2\, \nu (\text {d}v)\nonumber \\&+\frac{2^{p-2}}{p(p-1)}\bigl (1+(p-2)\Vert Q^{-1}\Vert \bigr )|z|_Q^{2-p} \nonumber \\&\times \int _{{{\mathbb {R}}}}\bigl |k(x+z,v)-k(x,v)\bigr |_Q^{p}\nu (\text {d}v) \nonumber \\&\le \, -\frac{2\,c(p)}{p}|z|_Q^2 \end{aligned}$$
(1.16)

for some \(c(p)>0\) and all \(x,z\in {{\mathbb {R}}}^n\), where \(k:{{\mathbb {R}}}^n\times {{\mathbb {R}}}\rightarrow {{\mathbb {R}}}^n\) is given in (1.15), then

$$\begin{aligned} {\mathcal {W}}_p(\updelta _xP_t,\updelta _yP_t)\,\le \, \left( \frac{{\overline{\lambda }}_Q}{{\underline{\lambda }}_Q}\right) ^{\nicefrac {1}{2}}|x-y|\, \text {e}^{-\frac{c(p)t}{p}} \end{aligned}$$
(1.17)

for all \(t\ge 0\) and \(x,y\in {{\mathbb {R}}}^n\), where \({\overline{\lambda }}_Q\) (\({\underline{\lambda }}_Q\)) stands for the largest (smallest) eigenvalue of Q. Furthermore, \({\{X(t)\}_{t\ge 0}}\) admits a unique invariant \(\uppi \in {\mathcal {P}}_{p}({{\mathbb {R}}}^n)\), and

$$\begin{aligned} {\mathcal {W}}_p(\upmu P_t,\uppi )\,\le \, \left( \frac{{\overline{\lambda }}_Q}{{\underline{\lambda }}_Q}\right) ^{\nicefrac {1}{2}} {\mathcal {W}}_p(\upmu ,\uppi )\,\text {e}^{-\frac{c(p)t}{p}} \end{aligned}$$
(1.18)

for all \(t\ge 0\) and \(\upmu \in {\mathcal {P}}_p({{\mathbb {R}}}^n)\).

In addition, if \(\sigma (x)\equiv \sigma \), a constant, \(\upnu (x,\text {d}y)\equiv \upnu (\text {d}y)\), \(1\in \Theta _\upnu \), and (1.16) holds for some \(p\in [1,\theta _\upnu ]\cap \Theta _\upnu \), then (1.17) and (1.18) remain valid.
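To illustrate condition (1.16), consider the following sketch. The specific choices are assumptions made only for this example and are not taken from the text: a linear drift \(b(x)=-\alpha x\) with \(\alpha >0\), a constant \(\sigma \), a jump coefficient \(k(x,v)=k(v)\) that does not depend on x, \(Q\) equal to the identity matrix, and the reading \(|z|_Q=\langle z,Qz\rangle ^{\nicefrac {1}{2}}\). The moment and well-posedness hypotheses of Theorem 1.5 are taken for granted. Under these assumptions \(\varDelta _{z}\sigma (x)\), \(\varDelta _{z}\upnu (x,\text {d}y)\) and \(k(x+z,v)-k(x,v)\) all vanish, so the only nonzero term on the left-hand side of (1.16) is

$$\begin{aligned} 2\,\bigl \langle \varDelta _{z}\tilde{b}(x),Qz\bigr \rangle \,=\, 2\,\langle -\alpha z,z\rangle \,=\, -2\alpha \,|z|^2 \,=\, -\frac{2\,c(p)}{p}\,|z|_Q^2 \qquad \text {with } c(p)=p\,\alpha . \end{aligned}$$

In this illustrative case \({\overline{\lambda }}_Q={\underline{\lambda }}_Q=1\), and (1.17) would read \({\mathcal {W}}_p(\updelta _xP_t,\updelta _yP_t)\le |x-y|\,\text {e}^{-\alpha t}\), that is, the contraction rate is the mean-reversion rate \(\alpha \) for every admissible p.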

We remark that ergodic properties of a Markov process with respect to the \({\mathcal {W}}_p\)-distance are invariant under Bochner’s random time-change method. Recall that a subordinator \({\{S(t)\}_{t\ge 0}}\) is a nondecreasing Lévy process on \(\left[ 0,\infty \right) \) with Laplace transform \({\mathbb {E}}\bigl [\text {e}^{-uS(t)}\bigr ] = \text {e}^{-t\psi (u)}\), \(u,t\ge 0\). The characteristic (Laplace) exponent \(\psi :(0,\infty )\rightarrow (0,\infty )\) is a Bernstein function, i.e. it is of class \(C^\infty \) and \((-1)^n\psi ^{(n)}(u)\le 0\) for all \(n\in {{\mathbb {N}}}\). It is well known that every Bernstein function admits a unique (Lévy–Khintchine) representation

$$\begin{aligned} \psi (u) = b_Su+\int _{(0,\infty )}(1-\text {e}^{-uy})\,\upnu _S(\text {d}y) \qquad \forall \,u\ge 0, \end{aligned}$$

where \(b_S\ge 0\) is the drift parameter and \(\upnu _S(\text {d}y)\) is a Lévy measure, i.e. a Borel measure on \({\mathfrak {B}}\bigl ((0,\infty )\bigr )\) satisfying \(\int _{(0,\infty )}(1\wedge y)\,\upnu _S(\text {d}y)<\infty \). For additional reading on subordinators and Bernstein functions we refer the reader to the monograph [75]. Suppose \({\{X(t)\}_{t\ge 0}}\) is a Markov process on \(\bigl ({\mathbb {X}},{\mathfrak {B}}({\mathbb {X}})\bigr )\) with transition kernel \(p(t,x,\text {d}y)\), and let \({\{S(t)\}_{t\ge 0}}\) be a subordinator with characteristic exponent \(\psi (u)\), independent of \({\{X(t)\}_{t\ge 0}}\). The process \(X^{\psi }(t):=X\bigl (S(t)\bigr )\), \(t\ge 0\), obtained from \({\{X(t)\}_{t\ge 0}}\) by a random time change through \({\{S(t)\}_{t\ge 0}}\), is referred to as the subordinate process \({\{X(t)\}_{t\ge 0}}\) with subordinator \({\{S(t)\}_{t\ge 0}}\) in the sense of Bochner. It is easy to see that \({\{X^\psi (t)\}_{t\ge 0}}\) is again a Markov process with transition kernel

$$\begin{aligned} p^\psi (t,x,\text {d}y) = \int _{\left[ 0,\infty \right) } p(s,x,\text {d}y)\,\upmu _t(\text {d}s), \qquad t\ge 0,\quad x\in {{\mathbb {R}}}^n, \end{aligned}$$

where \(\upmu _t(\cdot )={\mathbb {P}}(S(t)\in \cdot )\). It is also elementary to check that if \(\uppi (\text {d}x)\) is an invariant measure for \({\{X(t)\}_{t\ge 0}}\), then it is also invariant for the subordinate process \({\{X^\psi (t)\}_{t\ge 0}}\).
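A standard example may help fix ideas about subordinators. The choice of the \(\alpha \)-stable subordinator below is ours, made purely for illustration. For \(\alpha \in (0,1)\), the function \(\psi (u)=u^\alpha \) is a Bernstein function with \(b_S=0\) and Lévy measure \(\upnu _S(\text {d}y)=\frac{\alpha }{\Gamma (1-\alpha )}\,y^{-1-\alpha }\,\text {d}y\):

$$\begin{aligned} \int _{(0,\infty )}(1-\text {e}^{-uy})\,\frac{\alpha }{\Gamma (1-\alpha )}\,y^{-1-\alpha }\,\text {d}y \,=\, u^\alpha ,\qquad \psi '(u) = \alpha \,u^{\alpha -1}>0,\qquad \psi ''(u) = \alpha (\alpha -1)\,u^{\alpha -2}<0 . \end{aligned}$$

The derivatives alternate in sign, starting with a positive first derivative, and \(\int _{(0,\infty )}(1\wedge y)\,\upnu _S(\text {d}y)<\infty \) because \(y^{-\alpha }\) is integrable near zero and \(y^{-1-\alpha }\) is integrable near infinity.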

Proposition 1.6

Assume that \({\{X(t)\}_{t\ge 0}}\) admits an invariant \(\uppi \in {\mathcal {P}}({\mathbb {X}})\) such that \({\mathcal {W}}_{p}(\updelta _x P_t,\uppi )\le c(x)\,r(t)\) for some \(p\ge 1\), and all \(t\ge 0\) and \(x\in {\mathbb {X}}\), where \(r:[0,\infty )\rightarrow [0,\infty )\) is Borel measurable, and \(c:{\mathbb {X}}\rightarrow [0,\infty )\). Then,

$$\begin{aligned} {\mathcal {W}}_{p}(\updelta _x P^\psi _t,\uppi ) \,\le \, c(x)\,r_\psi (t)\qquad \forall \, (t,x)\in [0,\infty )\times {\mathbb {X}}, \end{aligned}$$

where \(r_\psi (t):=\Bigl ({\mathbb {E}}\Bigl [\bigl (r(S(t))\bigr )^p\Bigr ]\Bigr )^{\nicefrac {1}{p}}\).
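The quantity \(r_\psi (t)\) is explicit when r is exponential. As a sketch, suppose the hypothesis holds with \(r(t)=\text {e}^{-\gamma t}\) for some \(\gamma >0\); this decaying rate, with r read as a nonnegative function, is an assumption made here only to show the computation. The Laplace transform of \(S(t)\) then gives

$$\begin{aligned} r_\psi (t) \,=\, \Bigl ({\mathbb {E}}\bigl [\text {e}^{-p\gamma S(t)}\bigr ]\Bigr )^{\nicefrac {1}{p}} \,=\, \text {e}^{-\frac{t\,\psi (p\gamma )}{p}},\qquad t\ge 0 . \end{aligned}$$

For instance, for the \(\alpha \)-stable subordinator with \(\psi (u)=u^\alpha \), \(\alpha \in (0,1)\) (again an illustrative choice), this becomes \(r_\psi (t)=\exp \bigl (-p^{\alpha -1}\gamma ^\alpha \,t\bigr )\), so the subordinate process inherits an exponential rate with a modified exponent.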

Ergodic properties of Markov processes under subordination in the f-norm are discussed in [20,21,22].

1.3 Literature Review

Our work contributes to the understanding of the ergodic properties of Markov processes. Most of the existing literature focuses on characterizing the exponential or subexponential ergodicity under the f-norm, and in particular the total variation norm, see [3, 19, 24,25,26, 32, 33, 35, 58, 64,65,66, 78] and the references therein. However, there have been some recent developments in understanding ergodic properties of Markov processes (both continuous and discrete time) under the Wasserstein distances; see [12, 28,29,30, 34, 53, 56, 57, 59, 60, 85]. As already mentioned, exponential and subexponential convergence rates in the \({\mathcal {W}}_1\)-distance for general Markov processes that are (possibly) not irreducible or aperiodic are established in [12, 28, 53], under the Foster–Lyapunov condition in (1.1), contractivity of the underlying metric, and smallness of sublevel sets of the corresponding Lyapunov function. Using the coupling approach, the authors in [29, 30, 59] studied exponential ergodicity with respect to a class of Wasserstein distances for SDEs driven by an additive Brownian noise term and a drift term satisfying an asymptotic flatness property at infinity. Under the same assumption on the drift term, these results have been extended in [60, 85] to allow for more general additive Lévy noises. Subexponential ergodicity with respect to the \({\mathcal {W}}_p\)-distance for stochastic differential equations driven by an additive Lévy noise term, with a drift term satisfying an asymptotic flatness property at zero, has been studied in [56]. By combining the Foster–Lyapunov method with the coupling approach, exponential ergodicity with respect to a class of f-norms and Wasserstein distances (given in terms of the underlying Lyapunov function) is established in [57] for a class of McKean–Vlasov SDEs with Lévy noise. Lastly, exponential ergodicity with respect to the \({\mathcal {W}}_1\)-distance for one-dimensional positive-valued stochastic differential equations with jumps and a drift term satisfying an asymptotic flatness property has been studied in [34].

Our results on both exponential and subexponential ergodicity under the \({\mathcal {W}}_p\)-distance contribute to this active research topic. Of particular interest is the result obtained in Theorem 1.2 which seems to be completely new in the literature, and which, in some cases, allows one to conclude that the obtained upper bound on the rate of convergence is sharp.

As we have already remarked, irreducibility and aperiodicity are crucial structural properties of the underlying process used in Theorems 1.1 to 1.4. There is a vast literature on these, and on related questions such as the strong Feller property and heat kernel estimates of Markov processes. In particular, we refer the readers to [8, 15,16,17,18, 36, 42, 43, 45,46,47,48, 55, 67, 71, 77] for the case of a class of Markov Lévy-type processes with bounded coefficients, and to [6, 9, 39, 44, 56, 62, 63, 68, 69, 72, 76, 86] for the case of a class of Itô processes.

Recall that the Foster–Lyapunov condition in (1.1) implies that for any \(\varepsilon >0\) the \(\phi \circ \Phi ^{-1}\)-modulated moment of the \(\varepsilon \)-shifted hitting time \(\tau _C^\varepsilon :=\inf \{t\ge \varepsilon :X(t)\in C\}\) of C by \({\{X(t)\}_{t\ge 0}}\) (with respect to \({\mathbb {P}}_x\)) is finite and controlled by \({\mathcal {V}}(x)\) (see [25, Theorem 4.1]). However, this property in general does not immediately imply ergodicity of \({\{X(t)\}_{t\ge 0}}\). Namely, we also need to ensure that a similar property holds for any other “reasonable” set. If \({\{X(t)\}_{t\ge 0}}\) is irreducible with irreducibility measure \(\upvarphi (\text {d}x)\), then indeed for any \(\varepsilon >0\) the \(\phi \circ \Phi ^{-1}\)-modulated moment of \(\tau _B^\varepsilon \), for any \(B\in {\mathfrak {B}}({\mathbb {X}})\) with \(\upvarphi (B)>0\), is again finite and controlled by \({\mathcal {V}}(x)\) (see [25, the discussion after Theorem 4.1]). However, \({\{X(t)\}_{t\ge 0}}\) can also show certain cyclic behavior which destroys ergodicity (see [65, Sect. 5] and [64, Chap. 5]). By assuming aperiodicity, which excludes this type of behavior, (sub)exponential ergodicity in the \({\mathcal {W}}_p\)-distance of \({\{X(t)\}_{t\ge 0}}\) follows as discussed in Theorem 1.1, and in the f-norm as discussed in [33, Theorem 1].

1.4 Organization of the Article

In Sect. 2, we give the proofs of Theorems 1.1 to 1.6 together with some auxiliary lemmas. Applications of the main results to several classes of Markov processes, including Langevin tempered diffusion processes, Ornstein–Uhlenbeck processes with jumps, piecewise Ornstein–Uhlenbeck processes with jumps under constant and stationary Markov controls, state-space models, and backward recurrence time chains, are contained in Sect. 3.

2 Proofs of the Main Results

We start with the proof of Theorem 1.1.

Proof of Theorem 1.1

We consider the case when \({{\mathbb {T}}}={{\mathbb {R}}}_+\) only. The case when \({{\mathbb {T}}}={\mathbb {Z}}_+\) proceeds in an analogous way, by employing the results from [24, Theorem 2.8] and [64, Theorem 15.0.2].

First, under the assumptions of the theorem, it has been shown in [25, Proposition 3.1] and [66, Theorem 4.2] that \({\{X(t)\}_{t\ge 0}}\) admits a unique invariant \(\uppi \in {\mathcal {P}}_{\phi \circ {\mathcal {V}}}({\mathbb {X}})\). This, together with (1.2), implies that \(\uppi \in {\mathcal {P}}_{\eta }({\mathbb {X}})\). We continue now with the proof of part (i). By the Kantorovich–Rubinstein theorem, we have

$$\begin{aligned} {\mathcal {W}}_1(\upmu _1,\upmu _2) = \sup _{\{f:{{\,\mathrm{Lip}\,}}(f)\le 1\}}\, \biggl |\int _{{\mathbb {X}}} f(x)\bigl (\upmu _1(\text {d}{x})-\upmu _2(\text {d}{x})\bigr )\biggr | \qquad \forall \,\upmu _1,\upmu _2\in {\mathcal {P}}_1({\mathbb {X}}), \end{aligned}$$

where the supremum is taken over all Lipschitz continuous functions \(f:{\mathbb {X}}\rightarrow {{\mathbb {R}}}\) with Lipschitz constant \({{\,\mathrm{Lip}\,}}(f)\le 1\). We apply [25, Theorem 3.2] with

$$\begin{aligned}&r_*(t) = \phi \circ \Phi ^{-1}(t),\quad f_*(x) = \phi \circ {\mathcal {V}}(x),\quad \Psi _1(z) = z^{\nicefrac {(\eta -1)}{\eta }},\quad \text {and}\\&\Psi _2(z) = c^{-\nicefrac {1}{\eta }}z^{\nicefrac {1}{\eta }}. \end{aligned}$$

Note that if \(f:{\mathbb {X}}\rightarrow {{\mathbb {R}}}\) is such that \({{\,\mathrm{Lip}\,}}(f)\le 1\) and \(f(x_0)=0\), then \(|f(x)|\le {\mathsf {d}}(x,x_0)\le \Psi _2\circ f_*(x)\). Thus

$$\begin{aligned} \sup _{f:{{\,\mathrm{Lip}\,}}(f)\le 1} \biggl |\int _{{\mathbb {X}}} f(x)\bigl (\upmu _1(\text {d}{x})-\upmu _2(\text {d}{x})\bigr )\biggr |\le & {} \sup _{|f| \le \Psi _2\circ f_*\vee 1}\, \biggl |\int _{{\mathbb {X}}} f(x)\bigl (\upmu _1(\text {d}{x})-\upmu _2(\text {d}{x})\bigr )\biggr | \\= & {} \Vert \upmu _1-\upmu _2\Vert _{\Psi _2\circ f_*\vee 1} \end{aligned}$$

(recall the definition of the f-norm in (1.8)). Now, from [25, (3.5) and (3.6)] we have

$$\begin{aligned} \bigl (\Psi _1\circ r_*(t)\vee 1\bigr ){\mathcal {W}}_1(\updelta _x P_t,\uppi )&\le \bigl (\Psi _1\circ r_*(t)\vee 1\bigr ) \Vert \updelta _x P_t-\uppi \,\Vert _{\Psi _2\circ f_*\vee 1}\,\le \, \bar{c}\, {\mathcal {V}}(x), \end{aligned}$$

and

$$\begin{aligned}&\int _0^\infty \bigl (\Psi _1\circ r_*(s)\vee 1\bigr ) {\mathcal {W}}_1(\updelta _x P_s,\updelta _y P_s)\,\text {d}s\\&\quad \le \int _0^\infty \bigl (\Psi _1\circ r_*(s)\vee 1\bigr ) \Vert \updelta _x P_s-\updelta _y P_s\Vert _{\Psi _2\circ f_*\vee 1}\,\text {d}s\\&\quad \le \bar{c}\, \bigl ({\mathcal {V}}(x)+{\mathcal {V}}(y)\bigr ), \end{aligned}$$

for some \(\bar{c}>0\), and all \(t\ge 0\) and \(x,y\in {\mathbb {X}}\), which proves (1.3) and (1.4), respectively.

We next prove part (ii). Applying (1.3) and [25, (3.5)] with \(\Psi _1(z)=1\), and \(\Psi _2(z)=z\), we obtain \({{\mathbb {E}}}_{x} \left[ {\mathsf {d}}(X(t),x_0)^{\eta }\right] \le {\overline{m}}_{\eta }+\breve{c}\,{\mathcal {V}}(x)\), for some \(\breve{c}>0\), and all \(t\ge 0\) and \(x\in {\mathbb {X}}\). Hence

$$\begin{aligned} {{\mathbb {E}}}_{x} \bigl [{\mathsf {d}}(X(t),x_0)^p\,\mathbb {1}_{{\mathcal {B}}_{t}^c(x_0)}\bigl (X(t)\bigr )\bigr ] \,\le \, t^{p-\eta }\,\bigl ({\overline{m}}_{\eta }+\breve{c}\,{\mathcal {V}}(x)\bigr ) \qquad \forall \,(t,x)\in [0,\infty )\times {\mathbb {X}}.\nonumber \\ \end{aligned}$$
(2.1)
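The step behind (2.1) can be spelled out as follows, assuming (as the notation suggests) that \({\mathcal {B}}_{t}(x_0)\) denotes the ball of radius t around \(x_0\), and that \(p\le \eta \) and \(t>0\). On \({\mathcal {B}}_{t}^c(x_0)\) we have \({\mathsf {d}}(y,x_0)\ge t\), and since \(p-\eta \le 0\),

$$\begin{aligned} {\mathsf {d}}(y,x_0)^p\,\mathbb {1}_{{\mathcal {B}}_{t}^c(x_0)}(y) \,=\, {\mathsf {d}}(y,x_0)^{\eta }\,{\mathsf {d}}(y,x_0)^{p-\eta }\,\mathbb {1}_{{\mathcal {B}}_{t}^c(x_0)}(y) \,\le \, t^{p-\eta }\,{\mathsf {d}}(y,x_0)^{\eta }\qquad \forall \,y\in {\mathbb {X}}. \end{aligned}$$

Setting \(y=X(t)\), taking expectations with respect to \({{\mathbb {P}}}_x\), and using the \(\eta \)-moment bound stated just before (2.1) yields the right-hand side of (2.1).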

Further, for \(t\ge 0\), \(z\in {\mathbb {X}}\), and \(\Pi \in {\mathcal {C}}(\updelta _{z} P_t,\uppi )\), we have

$$\begin{aligned} \int _{{\mathbb {X}}\times {\mathbb {X}}}\bigl ({\mathsf {d}}(x,y)\bigr )^p\,\Pi (\text {d}x,\text {d}y)= & {} \int _{{\mathcal {B}}_{t}(x_0)\times {\mathcal {B}}_{t}(x_0)}\bigl ({\mathsf {d}}(x,y)\bigr )^p\,\Pi (\text {d}x,\text {d}y)\nonumber \\&\quad +\int _{\bigl ({\mathcal {B}}_{t}(x_0)\times {\mathcal {B}}_{t}(x_0)\bigr )^c} \bigl ({\mathsf {d}}(x,y)\bigr )^p\,\Pi (\text {d}x,\text {d}y)\nonumber \\\le & {} (2t)^{p-1}\int _{{\mathbb {X}}\times {\mathbb {X}}}{\mathsf {d}}(x,y)\,\Pi (\text {d}x,\text {d}y)\nonumber \\&\quad +2^{p-1}\int _{{\mathcal {B}}_{t}^c(x_0)}\bigl ({\mathsf {d}}(x,x_0)\bigr )^p \bigl [\updelta _{z}P_t(\text {d}x)+\uppi (\text {d}x)\bigr ].\nonumber \\ \end{aligned}$$
(2.2)

Using (2.1) and (2.2), and the bound \(\int _{{\mathcal {B}}_{t}^c(x_0)}\bigl ({\mathsf {d}}(x,x_0)\bigr )^p\, \uppi (\text {d}x)\le t^{p-\eta }\,{\overline{m}}_{\eta }\), we have

$$\begin{aligned} {\mathcal {W}}_p^p(\updelta _x P_t,\uppi )\le & {} (2t)^{p-1}\,{\mathcal {W}}_1(\updelta _x P_t,\uppi ) + 2^{p-1} t^{p-\eta }\,\bigl (2{\overline{m}}_{\eta } +\breve{c}\,{\mathcal {V}}(x)\bigr ) \\&\quad \text {for all } t\ge 0 \text { and } x\in {\mathbb {X}}, \end{aligned}$$

and combining this with (1.3) we obtain

$$\begin{aligned}&\left( 1\vee \left( t^{\eta -p}\wedge t^{1-p} \bigl (r_*(t)\bigr )^{\nicefrac {(\eta -1)}{\eta }}\right) \right) \, {\mathcal {W}}_p^p(\updelta _x P_t,\uppi )\\&\quad \le 2^{p-1} \bar{c}\, {\mathcal {V}}(x) + 2^{p-1}\,\bigl (2{\overline{m}}_{\eta } +\breve{c}\,{\mathcal {V}}(x)\bigr ) \end{aligned}$$

for all \(t\ge 0\) and \(x\in {\mathbb {X}}\), from which (1.5) follows with \(\tilde{c} = 2\max \{1,\bar{c},\breve{c}\}^{\nicefrac {1}{p}}\).
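For the factor \(w(t):=t^{\eta -p}\wedge t^{1-p}\bigl (r_*(t)\bigr )^{\nicefrac {(\eta -1)}{\eta }}\), \(t>0\) (the symbol w is introduced here only as shorthand), the combination of the two preceding displays can be made explicit. Bounding w(t) by each of its two arguments in turn and using the \({\mathcal {W}}_1\)-estimate obtained in part (i), one gets

$$\begin{aligned} w(t)\,{\mathcal {W}}_p^p(\updelta _x P_t,\uppi )&\le t^{1-p}\bigl (r_*(t)\bigr )^{\nicefrac {(\eta -1)}{\eta }}\,(2t)^{p-1}\,{\mathcal {W}}_1(\updelta _x P_t,\uppi ) + t^{\eta -p}\,2^{p-1}\,t^{p-\eta }\,\bigl (2{\overline{m}}_{\eta }+\breve{c}\,{\mathcal {V}}(x)\bigr )\\&= 2^{p-1}\bigl (r_*(t)\bigr )^{\nicefrac {(\eta -1)}{\eta }}\,{\mathcal {W}}_1(\updelta _x P_t,\uppi ) + 2^{p-1}\bigl (2{\overline{m}}_{\eta }+\breve{c}\,{\mathcal {V}}(x)\bigr )\\&\le 2^{p-1}\,\bar{c}\,{\mathcal {V}}(x) + 2^{p-1}\bigl (2{\overline{m}}_{\eta }+\breve{c}\,{\mathcal {V}}(x)\bigr ), \end{aligned}$$

where the last step uses \(\bigl (r_*(t)\bigr )^{\nicefrac {(\eta -1)}{\eta }}\le \Psi _1\circ r_*(t)\vee 1\).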

Moving on to the proof of part (iii), note that according to [65, Proposition 6.1], [66, Theorem 4.2], and [26, Theorem 5.2], there exist constants \(\mathring{c}>0\) and \(\gamma >0\), such that

$$\begin{aligned} \Vert \updelta _xP_t-\uppi \,\Vert _{{\mathcal {V}}} \,\le \, \mathring{c}\, {\mathcal {V}}(x) \,\text {e}^{-\gamma t}\qquad \forall \,(t,x)\in [0,\infty )\times {\mathbb {X}}. \end{aligned}$$
(2.3)

Equation (1.6) now follows from the Kantorovich–Rubinstein theorem and (1.2). Let \(p\in [1,\eta ]\). First, from (2.3) we obtain \({{\mathbb {E}}}_{x} \left[ {\mathsf {d}}(X(t),x_0)^{\eta }\right] \le {\overline{m}}_{\eta }+{\dot{c}}\,{\mathcal {V}}(x)\), for some \({\dot{c}}>0\), and all \(t\ge 0\) and \(x\in {\mathbb {X}}\), which again implies (2.1). By (2.1) and (2.2), we have

$$\begin{aligned} {\mathcal {W}}_p^p(\updelta _x P_t,\uppi )\le & {} (2t)^{p-1}\,{\mathcal {W}}_1(\updelta _x P_t,\uppi ) + 2^{p-1} t^{p-\eta }\,\bigl (2{\overline{m}}_{\eta } +{\dot{c}}\,{\mathcal {V}}(x)\bigr ) \\&\quad \text {for all } t\ge 0 \text { and } x\in {\mathbb {X}}\end{aligned}$$

and combining this with (1.6) we obtain

$$\begin{aligned} (1\vee t^{\eta -p})\, {\mathcal {W}}_p^p(\updelta _x P_t,\uppi )\le & {} 2^{p-1} {\check{c}}\,{\mathcal {V}}(x) + 2^{p-1} \,\bigl (2{\overline{m}}_{\eta } +{\dot{c}}\,{\mathcal {V}}(x)\bigr ) \\&\quad \text {for all } t\ge 0 \text { and } x\in {\mathbb {X}}\end{aligned}$$

from which (1.7) follows again with \(\breve{c}= 2\max \{1,{\check{c}},{\dot{c}}\}^{\nicefrac {1}{p}}\). This completes the proof. \(\square \)

We proceed with the proof of Theorem 1.2.

Proof of Theorem 1.2

We again consider the case when \({{\mathbb {T}}}={{\mathbb {R}}}_+\) only. The case when \({{\mathbb {T}}}={\mathbb {Z}}_+\) proceeds in a similar manner.

Fix some \(x_0\in {\mathbb {X}}\), \(p\in [1,\vartheta ]\) and \(\iota \in (0,\theta -\vartheta -\varepsilon )\). For \(s>0\), define \(f_s:{\mathbb {X}}\rightarrow [0,\infty )\) by

$$\begin{aligned} f_s(x)\,:=\, {\left\{ \begin{array}{ll} 0, &{} \text {if } L(x)\le \frac{s}{2},\\[3pt] L(x)-\frac{s}{2}, &{} \text {if } L(x)> \frac{s}{2}. \end{array}\right. } \end{aligned}$$

We have

$$\begin{aligned} \int _{\mathbb {X}}\bigl (f_s(x)\bigr )^p\, \uppi (\text {d}x) \,\ge \, \Bigl (\frac{s}{2}\Bigr )^p\, \uppi \bigl (\{x:L(x)> s\}\bigr )\qquad \forall \,s>0. \end{aligned}$$
(2.4)

Since, by assumption, \(\int _{\mathbb {X}}\bigl (L(x)\bigr )^{\vartheta +\varepsilon }\, \uppi (\text {d}{x})=\infty \), there exists an increasing diverging sequence \(\{s_n\}_{n\in {{\mathbb {N}}}}\subset [0,\infty )\) such that

$$\begin{aligned} \Bigl (\frac{s_n}{2}\Bigr )^p\, \uppi \bigl (\{x:L(x) > s_n\}\bigr ) \,\ge \,2^ps_n^{p-\vartheta -\varepsilon -\iota }. \end{aligned}$$
(2.5)

Note also that \(\bigl (f_s(x)\bigr )^p \le 2^{\theta -p} s^{p-\theta }\,\bigl (L(x)\bigr )^{\theta }\le \frac{2^{\theta -p}}{c}\, s^{p-\theta }\,{\mathcal {V}}(x)\) for all \(s>0\) and \(x\in {\mathbb {X}}\). This follows from the facts that \(f_s(x)=0\) for \(s>0\) and \(x\in {\mathbb {X}}\) such that \(L(x)\le s/2\),

$$\begin{aligned} 0\,\le \,\frac{f_s(x)}{\frac{s}{2}}\,\le \,\frac{L(x)}{\frac{s}{2}}\qquad \forall \,(s,x) \in (0,\infty )\times {\mathbb {X}}, \end{aligned}$$

and \(\theta >p\ge 1\). Thus, by the Foster–Lyapunov equation (1.1) (see [66, Theorem 1.1]), we obtain

$$\begin{aligned} \int _{\mathbb {X}}\bigl (f_s(x)\bigr )^p\, \updelta _{x_0} P_t (\text {d}x) \,\le \,\frac{2^{\theta -p}}{c}\, s^{p-\theta } \bigl (b\, t + {\mathcal {V}}(x_0)\bigr )\qquad \forall \,s,t>0. \end{aligned}$$
(2.6)
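The pointwise bound on \(\bigl (f_s(x)\bigr )^p\) used to obtain (2.6) can be verified directly. If \(L(x)\le \frac{s}{2}\) there is nothing to prove, since \(f_s(x)=0\). If \(L(x)>\frac{s}{2}\), then \(0\le f_s(x)\le L(x)\) and \(\frac{2L(x)}{s}>1\), so that, because \(\theta >p\),

$$\begin{aligned} \bigl (f_s(x)\bigr )^p \,\le \, \Bigl (\frac{s}{2}\Bigr )^p\Bigl (\frac{2L(x)}{s}\Bigr )^p \,\le \, \Bigl (\frac{s}{2}\Bigr )^p\Bigl (\frac{2L(x)}{s}\Bigr )^\theta \,=\, 2^{\theta -p}\,s^{p-\theta }\,\bigl (L(x)\bigr )^\theta . \end{aligned}$$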

Select a sequence \(\{t_n\}_{n\in {{\mathbb {N}}}}\subset [0,\infty )\) such that

$$\begin{aligned} s_n^{\theta -\vartheta -\varepsilon -\iota } = \frac{2^{\theta -p}}{c}\, \bigl (b\, t_n+ {\mathcal {V}}(x_0)\bigr ). \end{aligned}$$
(2.7)

Combining (2.4)–(2.7) above we have

$$\begin{aligned} \begin{aligned}&\left( \int _{\mathbb {X}}\bigl (f_{s_n}(x)\bigr )^p\, \uppi (\text {d}x)\right) ^{\nicefrac {1}{p}} - \left( \int _{\mathbb {X}}\bigl (f_{s_n}(x)\bigr )^p\, \updelta _{x_0} P_{t_n} (\text {d}x)\right) ^{\nicefrac {1}{p}} \,\ge \, (s_n)^{\frac{p-\vartheta -\varepsilon -\iota }{p}}\\&\,\ge \, \Bigl (\tfrac{2^{\theta -p}}{c}\,\bigl (b\, t_n + {\mathcal {V}}(x_0)\bigr )\Bigr )^{-\frac{\vartheta -p+\varepsilon +\iota }{(\theta -\vartheta -\varepsilon -\iota )p}}\quad \forall \,n\in {{\mathbb {N}}}. \end{aligned} \end{aligned}$$
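In more detail, the first inequality in the last display is obtained by bounding the two integrals separately. By (2.4) and (2.5),

$$\begin{aligned} \left( \int _{\mathbb {X}}\bigl (f_{s_n}(x)\bigr )^p\, \uppi (\text {d}x)\right) ^{\nicefrac {1}{p}} \,\ge \, \bigl (2^p s_n^{p-\vartheta -\varepsilon -\iota }\bigr )^{\nicefrac {1}{p}} \,=\, 2\,(s_n)^{\frac{p-\vartheta -\varepsilon -\iota }{p}}, \end{aligned}$$

while (2.6) with \(s=s_n\), \(t=t_n\), and the choice of \(t_n\) in (2.7) give

$$\begin{aligned} \left( \int _{\mathbb {X}}\bigl (f_{s_n}(x)\bigr )^p\, \updelta _{x_0} P_{t_n} (\text {d}x)\right) ^{\nicefrac {1}{p}} \,\le \, \bigl (s_n^{p-\theta }\,s_n^{\theta -\vartheta -\varepsilon -\iota }\bigr )^{\nicefrac {1}{p}} \,=\, (s_n)^{\frac{p-\vartheta -\varepsilon -\iota }{p}}. \end{aligned}$$

Subtracting the second bound from the first leaves \((s_n)^{\frac{p-\vartheta -\varepsilon -\iota }{p}}\), and expressing \(s_n\) through (2.7) gives the final expression in terms of \(t_n\).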

The result then follows by [82, Proposition 7.29], which asserts that

$$\begin{aligned} \Biggl |\left( \int _{{\mathbb {X}}} \bigl (f(x)\bigr )^p\, \upmu _1 (\text {d}x)\right) ^{\nicefrac {1}{p}} -\left( \int _{{\mathbb {X}}} \bigl (f(x)\bigr )^p\, \upmu _2 (\text {d}x)\right) ^{\nicefrac {1}{p}}\Biggr | \,\le \, \text {Lip}\bigl (f\bigr ) \,{\mathcal {W}}_p(\upmu _1,\upmu _2) \end{aligned}$$

for all \(\upmu _1,\upmu _2\in {\mathcal {P}}_p({\mathbb {X}})\) and Lipschitz \(f:{\mathbb {X}}\rightarrow {{\mathbb {R}}}\) with Lipschitz constant \(\text {Lip}\bigl (f\bigr ).\)

\(\square \)

For the proof of Theorem 1.4 we need two auxiliary results given in Lemmas 2.1 and 2.2 below. First, recall that \(\{X(t)\}_{t\ge 0}\) is said to be conservative if \({{\mathbb {P}}}_x(X(t)\in {{\mathbb {R}}}^n)=1\) for all \(t\ge 0\) and \(x\in {{\mathbb {R}}}^n\), and note that this is equivalent to

$$\begin{aligned} {{\mathbb {P}}}_x\Bigl (\lim _{k\rightarrow \infty }\tau _k=\infty \Bigr ) = 1\qquad \forall \,x\in {{\mathbb {R}}}^n, \end{aligned}$$

where \(\tau _k:=\inf \{t\ge 0:X(t)\in {\mathcal {B}}^c_k\}\) for \(k\in {{\mathbb {N}}}\) (here it is also essential that \(\{X(t)\}_{t\ge 0}\) has càdlàg sample paths). Namely, for \(t\ge 0\) and \(x\in {{\mathbb {R}}}^n\) it holds that

$$\begin{aligned}&{{\mathbb {P}}}_x(X(t)\in {{\mathbb {R}}}^n) = {{\mathbb {P}}}_x\Bigl (\lim _{k\rightarrow \infty }\tau _k>t\Bigr )\,\ge \, {{\mathbb {P}}}_x\Bigl (\lim _{k\rightarrow \infty }\tau _k=\infty \Bigr )\\&\quad =\,\lim _{t\rightarrow \infty }{{\mathbb {P}}}_x\Bigl (\lim _{k\rightarrow \infty }\tau _k>t\Bigr ) = \lim _{t\rightarrow \infty }{{\mathbb {P}}}_x(X(t)\in {{\mathbb {R}}}^n). \end{aligned}$$

Lemma 2.1

Assume (LB) and (MP). Then for any \(x\in {{\mathbb {R}}}^n\) and any nonnegative \(f\in C^{\infty }({{\mathbb {R}}}^n)\) such that the map \(y\mapsto \int _{{\mathcal {B}}^c}f(y+z)\,\upnu (y,\text {d}z)\) is locally bounded, \({\{M_f(t)\}_{t\ge 0}}\) is a \({{\mathbb {P}}}_x\)-local martingale (with respect to \(\{{\mathcal {F}}_t\}_{t\ge 0}\)).

Proof

For \(k\in {{\mathbb {N}}}\), let \(\chi _k\in C_c^\infty ({{\mathbb {R}}}^n)\) be such that \(\mathbb {1}_{{\mathcal {B}}_k}(x)\le \chi _k(x)\le \mathbb {1}_{{\mathcal {B}}_{k+1}}(x)\) and \(\chi _k(x)\le \chi _{k+1}(x)\) for \(x\in {{\mathbb {R}}}^n\). Then, for any \(x\in {{\mathbb {R}}}^n\), \(k,j\in {{\mathbb {N}}}\) and \(s,t\ge 0\), \(s\le t\), [31, Theorem 2.2.13] implies that

$$\begin{aligned} {{\mathbb {E}}}_x\bigl [M_{f\chi _k}(t\wedge \tau _j) \,|\,{\mathcal {F}}_s\bigr ] = M_{f\chi _k}(s\wedge \tau _j). \end{aligned}$$

Next, by employing the monotone and dominated convergence theorems, we easily see that

$$\begin{aligned} {{\mathbb {E}}}_x\left[ \biggl |\int _0^t{\mathcal {L}}f\bigl (X(s\wedge \tau _j)\bigr )\,\text {d}s\biggr |\right] <\infty \qquad \forall \,(x,j)\in {{\mathbb {R}}}^n\times {{\mathbb {N}}}, \end{aligned}$$

and

$$\begin{aligned} {{\mathbb {E}}}_x\bigl [f\bigl (X(t\wedge \tau _j)\bigr )\bigr ]&= \lim _{k\rightarrow \infty }\, {{\mathbb {E}}}_x\bigl [f\bigl (X(t\wedge \tau _j)\bigr )\chi _k\bigl (X(t\wedge \tau _j)\bigr )\bigr ]\\&= f(x)+\lim _{k\rightarrow \infty }{{\mathbb {E}}}_x\biggl [\int _0^t{\mathcal {L}} (f\chi _k)\bigl (X(s\wedge \tau _j)\bigr )\,\text {d}s\biggr ]\\&= f(x)+{{\mathbb {E}}}_x\biggl [\int _0^t{\mathcal {L}}f\bigl (X(s\wedge \tau _j)\bigr )\,\text {d}s\biggr ] \qquad \forall \,(x,j)\in {{\mathbb {R}}}^n\times {{\mathbb {N}}}. \end{aligned}$$

Hence, for each \(x\in {{\mathbb {R}}}^n\), \(t\ge 0\) and \(j\in {{\mathbb {N}}}\), \(M_f(t\wedge \tau _j)\) is integrable. Also,

$$\begin{aligned} \lim _{k\rightarrow \infty }{{\mathbb {E}}}_x\bigl [M_{f\chi _k}(t\wedge \tau _j)\,|\,{\mathcal {F}}_s\bigr ]= & {} {{\mathbb {E}}}_x\bigl [M_{f}(t\wedge \tau _j)\,|\,{\mathcal {F}}_s\bigr ],\quad \text {and} \\ \lim _{k\rightarrow \infty }M_{f\chi _k}(s\wedge \tau _j)= & {} M_{f}(s\wedge \tau _j), \end{aligned}$$

for all \(x\in {{\mathbb {R}}}^n\), \(t\ge s\ge 0\), and \(j\in {{\mathbb {N}}}\). The assertion now follows from the conservativeness of \({\{X(t)\}_{t\ge 0}}\). \(\square \)

For \(f\in C^1({{\mathbb {R}}}^n)\) we let

$$\begin{aligned} {\mathfrak {d}} f(x;y):= & {} f(x+y)-f(x)-\langle y,\nabla f(x)\rangle , \qquad x,y\in {{\mathbb {R}}}^n,\\ {\mathfrak {J}}_{1,\upnu }[f](x):= & {} \int _{{{\mathbb {R}}}^n}{\mathfrak {d}}_1 f(x;y)\, \upnu (x,\text {d}{y}), \quad \text {and } {\mathfrak {J}}_\upnu [f](x)\,:=\,\int _{{{\mathbb {R}}}^n}{\mathfrak {d}} f(x;y)\, \upnu (x,\text {d}{y}), \\&\quad \text {for } x\in {\mathbb {X}}\end{aligned}$$

whenever the integrals are well defined.

Lemma 2.2

Suppose that \(\theta _\upnu >0\), and that (1.13) holds for some \(\theta \in (0,\theta _\upnu ]\cap \Theta _\upnu \). Then, we have the following:

  1. (i)

    If \(\theta \in (0,1)\), and \(f\in C^2({{\mathbb {R}}}^n)\) satisfies

    $$\begin{aligned} \sup _{x\in {\mathcal {B}}^c}\,|x|^{-\theta }\max \bigl (|f(x)|,|x|\,|\nabla f(x)|, |x|^2\,\Vert \nabla ^2f(x)\Vert \bigr )<\infty , \end{aligned}$$

    then \({\mathfrak {J}}_{1,\upnu }[f](x)\) vanishes at infinity.

  2. (ii)

    If \(\theta \ge 1\), and \(f\in C^2({{\mathbb {R}}}^n)\) satisfies

    $$\begin{aligned} \sup _{x\in {\mathcal {B}}^c}\,|x|^{1-\theta }\max \bigl (|\nabla f(x)|, |x|\,\Vert \nabla ^2f(x)\Vert \bigr )<\infty , \end{aligned}$$

    then \({\mathfrak {J}}_\upnu [f](x)\) vanishes at infinity when \(\theta \in [1,2)\), and the map \(x\mapsto (1+|x|)^{2-\theta }\,{\mathfrak {J}}_\upnu [f](x)\) is bounded when \(\theta \ge 2\).

  3. (iii)

    If (1.14) holds for some \(\theta >0\), then there exists \(c>0\) such that for any \(\zeta \in \bigl (0,\frac{1}{2}\theta \Vert Q\Vert ^{-\nicefrac {1}{2}}\bigr )\) there is \(r=r(\zeta )>0\) for which

    $$\begin{aligned} {\mathfrak {J}}_\upnu \bigl [{\widetilde{{\mathcal {V}}}}_{Q,\zeta }\bigr ](x)\,\le \, c\,\zeta ^{\nicefrac {3}{2}}\,{\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x)\qquad \forall \,x\in {\mathcal {B}}_r^c. \end{aligned}$$
    (2.8)

Proof

The proof of parts (i) and (ii) follows as a straightforward adaptation of [6, Lemma 5.1] by setting

$$\begin{aligned}&C_0(\theta ) \,:=\, \sup _{x\in {{\mathbb {R}}}^n}\int _{{{\mathbb {R}}}^n} \bigl (|y|^{2}\wedge |y|^{\theta }\bigr )\,\upnu (x,\text {d}{y}),\qquad {\widehat{C}}_0(\theta ) \,:=\, \sup _{x\in {{\mathbb {R}}}^n}\int _{{\mathcal {B}}}|y|^{2}\,\upnu (x,\text {d}{y}), \end{aligned}$$

and

$$\begin{aligned}&\breve{C}_0(r;\theta ) \,:=\, \sup _{x\in {{\mathbb {R}}}^n}\int _{{\mathcal {B}}_r^c}|y|^{\theta } \upnu (x,\text {d}{y}),\qquad r>0. \end{aligned}$$

To prove part (iii), we use the identity

$$\begin{aligned}&\int _{{{\mathbb {R}}}^n}{\mathfrak {d}}{\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x;y)\, \upnu (x,\text {d}{y}) \nonumber \\&\quad = \int _{{{\mathbb {R}}}^n}\int _0^1(1-t) \bigl \langle y,\nabla ^2{\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x+ty)y\bigr \rangle \, \text {d}{t}\,\upnu (x,\text {d}{y}) \end{aligned}$$
(2.9)

Consider the set

$$\begin{aligned} A_x\,:=\,\bigl \{(t,y)\in [0,1]\times {{\mathbb {R}}}^n:|x+ty|_Q \le \tfrac{1}{2}|x|_Q\bigr \},\qquad x\in {{\mathbb {R}}}^n. \end{aligned}$$

On this set we have the bound

$$\begin{aligned} \bigl |\bigl \langle y,\nabla ^2{\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x+ty)y\bigr \rangle \bigr | \,\le \, {\bar{c}} (\zeta + \zeta ^2)|y|^2\, \text {e}^{\zeta \Vert Q\Vert ^{\nicefrac {1}{2}}|ty|}\, {\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x) \end{aligned}$$
(2.10)

for some \({\bar{c}}\ge 1\). Since \(\zeta \Vert Q\Vert ^{\nicefrac {1}{2}}<\theta \), and \(|y|_Q\ge |ty|_Q\ge \frac{1}{2}|x|_Q\) on the set \(A_x\), there exists \(\rho =\rho (\zeta )\ge 1\) such that

$$\begin{aligned} \zeta ^{-\nicefrac {1}{2}}(1 + \zeta )|y|^2\, \text {e}^{\zeta \Vert Q\Vert ^{\nicefrac {1}{2}}|ty|} \,\le \, \text {e}^{\theta |y|} \end{aligned}$$
(2.11)

for all \(x\in {\mathcal {B}}^c_{2\rho }\) and \((t,y)\in A_x\). Hence, using (2.10) and (2.11) and Fubini’s theorem, we have

$$\begin{aligned}&\iint _{A_x}(1-t) \bigl \langle y,\nabla ^2{\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x+ty)y\bigr \rangle \, \text {d}{t}\,\upnu (x,\text {d}{y}) \nonumber \\&\,\le \,2\,{\bar{c}}\, \zeta ^{\nicefrac {3}{2}} \biggl (\int _{ {\mathcal {B}}_{\rho }^c}\text {e}^{\theta |y|}\,\upnu (x,\text {d}y)\biggr ) {\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x) \end{aligned}$$
(2.12)

for all \(x\in {\mathcal {B}}^c_{2\rho }\). Next, since \(|x+ty|_Q > \frac{1}{2}|x|_Q\) on the set \(A^c_x\), we have a bound of the form

$$\begin{aligned} \bigl \langle y,\nabla ^2{\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x+ty)y\bigr \rangle \,\le \, {\bar{c}} \biggl (\zeta ^2 + \frac{\zeta }{|x|}\biggr )|y|^2\, \text {e}^{\zeta \Vert Q\Vert ^{\nicefrac {1}{2}}|y|}\, {\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x) \end{aligned}$$
(2.13)

for all \(x\in {\mathcal {B}}^c\) and \((t,y)\in A_x^c\), where, without loss of generality, we use the same constant \({\bar{c}}\) as in (2.10). Since \(\theta > 2\zeta \sqrt{\Vert Q\Vert }\), it is clear that there exists \({\tilde{c}}>0\), independent of \(\zeta \), such that

$$\begin{aligned}&\biggl (\zeta ^2 + \frac{\zeta }{|x|}\biggr )|y|^2\, \text {e}^{\zeta \Vert Q\Vert ^{\nicefrac {1}{2}}|y|} \nonumber \\&\,\le \, {\tilde{c}}\,\zeta ^{\nicefrac {3}{2}}\,\bigl ( |y|^2\mathbb {1}_{{\mathcal {B}}}(y)+\text {e}^{\theta |y|}\mathbb {1}_{{\mathcal {B}}^c}(y)\bigr ) \qquad \forall (x,y)\in {\mathcal {B}}_{1/\zeta }^c\times {{\mathbb {R}}}^n. \end{aligned}$$
(2.14)

Thus, by (1.14), (2.13) and (2.14), there exists \({\hat{c}}>0\) such that

$$\begin{aligned}&\iint _{A_x^c}(1-t) \bigl \langle y,\nabla ^2{\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x+ty)y\bigr \rangle \, \text {d}{t}\,\upnu (x,\text {d}{y})\nonumber \\&\,\le \,{\hat{c}}\, \zeta ^{\nicefrac {3}{2}}\,{\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x) \qquad \forall \, x\in {\mathcal {B}}^c_{1/\zeta }. \end{aligned}$$
(2.15)

The estimate in (2.8) follows from (1.14), (2.9), (2.12) and (2.15). This completes the proof. \(\square \)

We next prove Theorem 1.4.

Proof of Theorem 1.4

In cases (i) and (ii), we take \({\mathcal {V}}(x)={\mathcal {V}}_{Q,\theta }(x)+1\), while in case (iii) we use \({\mathcal {V}}(x)={\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x)\) with \(\zeta >0\) sufficiently small. Then, in view of Lemma 2.2 it is straightforward to see that there exist constants \(\bar{c}>0\), \(\tilde{c}>0\), and \(r>0\), such that

$$\begin{aligned} {\mathcal {L}}{\mathcal {V}}(x)\,\le \,\bar{c}\,\mathbb {1}_{\bar{{\mathcal {B}}}_r}(x)-\tilde{c}\, \bigl ({\mathcal {V}}(x)\bigr )^{\nicefrac {(\theta -2+\vartheta )}{\theta }} \qquad \forall \,x\in {{\mathbb {R}}}^n, \end{aligned}$$

in case (i), and

$$\begin{aligned} {\mathcal {L}}{\mathcal {V}}(x)\,\le \,\bar{c}\,\mathbb {1}_{\bar{{\mathcal {B}}}_r}(x)-\tilde{c}\,{\mathcal {V}}(x) \qquad \forall \,x\in {{\mathbb {R}}}^n, \end{aligned}$$

in cases (ii) and (iii). Observe that the above relations, together with [66, Theorem 2.1] and Lemma 2.1, imply that \({\{X(t)\}_{t\ge 0}}\) is conservative. Finally, according to Lemma 2.1 and [25, Theorem 3.4] the process \({\{X(t)\}_{t\ge 0}}\) satisfies (1.1) with \(\phi (t)=t^{\nicefrac {(\theta -2+\vartheta )}{\theta }}\) in case (i), and \(\phi (t)=t\) in cases (ii) and (iii) (for some \(b>0\) and closed petite set C). \(\square \)

The proof of Theorem 1.5 is based on the following lemma.

Lemma 2.3

Let \({\{X(t)\}_{t\ge 0}}\) be an Itô process with locally bounded coefficients b(x) and a(x) and satisfying the linear growth condition in (1.12), and \(\upnu (x,\text {d}y)\) such that \(\theta _\upnu >0\). Then, for any \(\theta \in [0,\theta _\upnu ]\cap \Theta _\upnu \), there exists a constant \(c>0\) such that

$$\begin{aligned} {{\mathbb {E}}}_{x}\bigl [|X(t)|^\theta \bigr ] \,\le \, \bigl (1+|x|^\theta \bigr )\,\text {e}^{ct} \qquad \forall \,(t,x) \in [0,\infty )\times {{\mathbb {R}}}^n. \end{aligned}$$

Proof

Let \(\varphi \in C^2({{\mathbb {R}}}^n)\) be such that \(\varphi (x)\ge 0\) and \(\varphi (x)\le |x|^\theta \) for \(x\in {{\mathbb {R}}}^n\), and \(\varphi (x)=|x|^\theta \) for \(x\in {\mathcal {B}}^c\). Further, for \(k\in {{\mathbb {N}}}\), let \(\varphi _k\in C^2_b({{\mathbb {R}}}^n)\) be such that \(\varphi _k(x)\ge 0\), \(\varphi _k(x)=\varphi (x)\) for \(x\in {\mathcal {B}}_{k+1}\), and \(\varphi _k(x)\nearrow \varphi (x)\), as \(k\rightarrow \infty \), for every \(x\in {{\mathbb {R}}}^n\). Then, by Itô’s formula and the conservativeness of \({\{X(t)\}_{t\ge 0}}\), we have

$$\begin{aligned} {{\mathbb {E}}}_x\bigl [\varphi _k\bigl (X(t\wedge \tau _k)\bigr )\bigr ]&= \varphi _k(x) +{{\mathbb {E}}}_x\left[ \int _0^{t\wedge \tau _k}{\mathcal {L}}\varphi _k\bigl (X(s)\bigr )\,\text {d}s\right] \\&\,\le \, \varphi _k(x)+c_k\,{{\mathbb {E}}}_x\bigl [t\wedge \tau _k\bigr ] +c_k\,{{\mathbb {E}}}_x\left[ \int _0^{t\wedge \tau _k}\varphi _k\bigl (X(s)\bigr )\,\text {d}s\right] \\&\,\le \, \varphi _k(x)+c_kt+c_k\int _0^{t} {{\mathbb {E}}}_x\left[ \varphi _k\bigl (X(s\wedge \tau _k)\bigr )\right] \,\text {d}s \end{aligned}$$

for all \(k\in {{\mathbb {N}}}\), \(t\ge 0\), and \(x\in {{\mathbb {R}}}^n\), where the constants \(c_k>0\) depend on \(\theta \), b(x), a(x), and the quantities

$$\begin{aligned}&\sup _{x\in {{\mathbb {R}}}^n}\,\int _{{{\mathbb {R}}}^n} \left( |y|^2\mathbb {1}_{{\mathcal {B}}}(y)+|y|^\theta \mathbb {1}_{{\mathcal {B}}^c}(y)\right) \upnu (x,\text {d}y) \quad \text {and }\\&\sup _{x\in {\mathcal {B}}_r}\,\Bigl (\bigl |\varphi _k(x)\bigr |+\bigl |\nabla \varphi _k(x)\bigr | +\bigl |\nabla ^2\varphi _k(x)\bigr |\Bigr ), \end{aligned}$$

for \(r>0\) large enough. Clearly, the functions \(\varphi _k(x)\) can be chosen such that \(c:=\sup _{k\in {{\mathbb {N}}}}c_k<\infty \). Now, since the function \(t\mapsto {{\mathbb {E}}}_x\bigl [\varphi _k\bigl (X(t\wedge \tau _k)\bigr )\bigr ]\) is bounded and càdlàg, Gronwall’s lemma implies that

$$\begin{aligned} {{\mathbb {E}}}_x\bigl [\varphi _k\bigl (X(t\wedge \tau _k)\bigr )\bigr ]\,\le \, \bigl (1+\varphi _k(x)\bigr )\, \text {e}^{c t} -1 \end{aligned}$$

for all \(k\in {{\mathbb {N}}}\), \(t\ge 0\), and \(x\in {{\mathbb {R}}}^n\). By letting \(k\rightarrow \infty \), Fatou’s lemma and the conservativeness of \({\{X(t)\}_{t\ge 0}}\) imply that

$$\begin{aligned} {{\mathbb {E}}}_x\bigl [\varphi \bigl (X(t)\bigr )\bigr ]\,\le \, \bigl (1+\varphi (x)\bigr )\, \text {e}^{c t} -1\qquad \forall \,(t,x) \in [0,\infty )\times {{\mathbb {R}}}^n. \end{aligned}$$

Finally, we have that

$$\begin{aligned} {{\mathbb {E}}}_x\bigl [|X(t)|^\theta \bigr ]\le & {} {{\mathbb {E}}}_x\bigl [\varphi \bigl (X(t)\bigr )\bigr ]+1 \\\le & {} \bigl (1+\varphi (x)\bigr )\,\text {e}^{c t}\le (1+|x|^\theta )\,\text {e}^{c t}\quad \forall \,(t,x) \in [0,\infty )\times {{\mathbb {R}}}^n. \end{aligned}$$

This completes the proof. \(\square \)
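
The Gronwall step in the proof above can be spelled out as follows. This is a sketch of the standard argument; the auxiliary function \(u_k(t):={{\mathbb {E}}}_x\bigl [\varphi _k\bigl (X(t\wedge \tau _k)\bigr )\bigr ]+1\) is notation introduced here only for illustration. Adding 1 to both sides of the integral inequality and absorbing the term \(ct\) into the integral gives

$$\begin{aligned} u_k(t)\,\le \,\bigl (1+\varphi _k(x)\bigr )+c\int _0^t u_k(s)\,\text {d}s \quad \Longrightarrow \quad u_k(t)\,\le \,\bigl (1+\varphi _k(x)\bigr )\,\text {e}^{ct}, \end{aligned}$$

and subtracting 1 yields the bound used in the proof.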

We next prove Theorem 1.5.

Proof of Theorem 1.5

For \(p\in [2,\theta _\upnu ]\cap \Theta _\upnu \), define \({\mathcal {V}}_{p}(x) :=|x|_Q^{p}\), \(x\in {{\mathbb {R}}}^n\), and

$$\begin{aligned} \tilde{{\mathcal {L}}} f(x;z) \,:=\,&\sum ^{n}_{i=1} \varDelta _{z}\tilde{b}_{i}(x) \frac{\partial f}{\partial x_{i}}(z) + \frac{1}{2} \sum ^{n}_{i,j=1}\tilde{a}_{ij}(x;z) \frac{\partial ^{2} f}{\partial x_{i}\partial x_{j}}(z)\\&+\int _{{{\mathbb {R}}}^n}\left( f(z+y)-f(z) -\sum _{i=1}^n y_i\frac{\partial f}{\partial x_{i}}(z)\right) \varDelta _{z}\upnu (x,\text {d}y),\quad x,z\in {{\mathbb {R}}}^n. \end{aligned}$$

Calculating \(\tilde{{\mathcal {L}}}{\mathcal {V}}_{p}(x;z)\), using (1.16), we obtain

$$\begin{aligned} \tilde{{\mathcal {L}}}{\mathcal {V}}_{p}(x;z)&= \frac{p}{2}\,|z|_Q^{p-2}\left( 2\,\bigl \langle \varDelta _{z} \tilde{b}(x),Qz\bigr \rangle +{{\,\mathrm{Tr}\,}}\,\bigl (\tilde{a}(x;z) Q\bigr )\right) \\
&\quad +\frac{p(p-2)}{2}\,|z|_Q^{p-4}|\varDelta _{z}\sigma '(x) Qz|^{2}\\
&\quad +\int _{{{\mathbb {R}}}^n}\int _0^1(1-t) \bigl \langle y,\nabla ^2{\mathcal {V}}_{p}(z+ty)y\bigr \rangle \,\text {d}t\, \varDelta _{z}\upnu (x,\text {d}y)\\
&= \frac{p}{2}\,|z|_Q^{p-2}\left( 2\,\bigl \langle \varDelta _{z} \tilde{b}(x),Qz\bigr \rangle +{{\,\mathrm{Tr}\,}}\,\bigl (\tilde{a}(x;z) Q\bigr )\right) +\frac{p(p-2)}{2}\,|z|_Q^{p-4}|\varDelta _{z}\sigma '(x) Qz|^{2}\\
&\quad +\int _{{{\mathbb {R}}}^n}\int _0^1(1-t) \left( p\, |z+ty|_Q^{p-2} |y|^2_Q+p(p-2)|z+ty|_Q^{p-4}|y'Q(z+ty)|^2\right) \text {d}t\, \varDelta _{z}\upnu (x,\text {d}y)\\
&\,\le \, \frac{p}{2}\,|z|_Q^{p-2}\left( 2\,\bigl \langle \varDelta _{z} \tilde{b}(x),Qz\bigr \rangle +{{\,\mathrm{Tr}\,}}\,\bigl (\tilde{a}(x;z) Q\bigr )+(p-2)\, |\sqrt{Q}\,\varDelta _{z}\sigma (x)|^{2}\right) \\
&\quad +p\bigl (1+(p-2)\Vert Q^{-1}\Vert \bigr )\int _{{{\mathbb {R}}}^n} |y|_Q^2\int _0^1(1-t )|z+ty|_Q^{p-2}\,\text {d}t\, \varDelta _{z}\upnu (x,\text {d}y)\\
&\,\le \, \frac{p}{2}\,|z|_Q^{p-2}\left( 2\,\bigl \langle \varDelta _{z} \tilde{b}(x),Qz\bigr \rangle +{{\,\mathrm{Tr}\,}}\,\bigl (\tilde{a}(x;z) Q\bigr )+(p-2)\, |\sqrt{Q}\,\varDelta _{z}\sigma (x)|^{2}\right) \\
&\quad +p2^{p-4}\bigl (1+(p-2)\Vert Q^{-1}\Vert \bigr )|z|_Q^{p-2} \int _{{{\mathbb {R}}}^n}|y|_Q^2\, \varDelta _{z}\upnu (x,\text {d}y)\\
&\quad +\frac{2^{p-3}}{p-1}\bigl (1+(p-2)\Vert Q^{-1}\Vert \bigr ) \int _{{{\mathbb {R}}}^n}|y|_Q^{p}\, \varDelta _{z}\upnu (x,\text {d}y)\\
&= \frac{p}{2}\,|z|_Q^{p-2}\Biggl (2\,\bigl \langle \varDelta _{z} \tilde{b}(x),Qz\bigr \rangle +{{\,\mathrm{Tr}\,}}\,\bigl (\tilde{a}(x;z) Q\bigr )+(p-2)\, |\sqrt{Q}\,\varDelta _{z}\sigma (x)|^{2}\\
&\quad +2^{p-3}\bigl (1+(p-2)\Vert Q^{-1}\Vert \bigr ) \int _{{{\mathbb {R}}}^n}|y|_Q^2\, \varDelta _{z}\upnu (x,\text {d}y)\\
&\quad +\frac{2^{p-2}}{p(p-1)}\bigl (1+(p-2)\Vert Q^{-1}\Vert \bigr ) |z|_Q^{2-p}\int _{{{\mathbb {R}}}^n}|y|_Q^{p}\, \varDelta _{z}\upnu (x,\text {d}y)\Biggr )\\
&= \frac{p}{2}\,|z|_Q^{p-2}\Biggl (2\,\bigl \langle \varDelta _{z} \tilde{b}(x),Qz\bigr \rangle +{{\,\mathrm{Tr}\,}}\,\bigl (\tilde{a}(x;z) Q\bigr )+(p-2)\, |\sqrt{Q}\,\varDelta _{z}\sigma (x)|^{2}\\
&\quad +2^{p-3}\bigl (1+(p-2)\Vert Q^{-1}\Vert \bigr )\int _{{{\mathbb {R}}}} |k(x+z,v)-k(x,v)|_Q^2\, \nu (\text {d}v)\\
&\quad +\frac{2^{p-2}}{p(p-1)}\bigl (1+(p-2)\Vert Q^{-1}\Vert \bigr ) |z|_Q^{2-p} \int _{{{\mathbb {R}}}}|k(x+z,v)-k(x,v)|_Q^{p}\, \nu (\text {d}v)\Biggr )\\
&\,\le \, -c(p)\,{\mathcal {V}}_{p}(z) \end{aligned}$$

for all \(x,z\in {{\mathbb {R}}}^n\). Next, for \(x,z\in {{\mathbb {R}}}^n\), let \(\tau :=\inf \{t\ge 0:\, X^{x+z}(t) = X^{x}(t)\}\) (possibly \(+\infty \)), where \({\{X^x(t)\}_{t\ge 0}}\) denotes the solution to (1.15) with \(X^x(0)=x\) for \(x\in {{\mathbb {R}}}^n\). By Itô’s formula and the conservativeness of \({\{X(t)\}_{t\ge 0}}\) we obtain

$$\begin{aligned}&{{\mathbb {E}}}\bigl [{\mathcal {V}}_{p}\bigl (X^{x+z}(t\wedge \tau \wedge \tau _k) -X^{x}(t\wedge \tau \wedge \tau _k)\bigr )\bigr ] - {\mathcal {V}}_{p}(z)\\&\quad = {{\mathbb {E}}}\biggl [ \int _{0}^{t\wedge \tau \wedge \tau _k} \tilde{{\mathcal {L}}}{\mathcal {V}}_{p}\bigl (X^{x}(s);X^{x+z}(s) - X^{x}(s)\bigr ) \,\text {d}{s}\biggr ]\\&\quad = {{\mathbb {E}}}\biggl [ \int _{0}^{t} \tilde{{\mathcal {L}}}{\mathcal {V}}_{p}\bigl (X^{x}(s\wedge \tau \wedge \tau _k);X^{x+z}(s\wedge \tau \wedge \tau _k) - X^{x}(s\wedge \tau \wedge \tau _k)\bigr ) \,\text {d}{s}\biggr ]\\&\quad = \int _{0}^{t} {{\mathbb {E}}}\Bigl [ \tilde{{\mathcal {L}}}{\mathcal {V}}_{p}\bigl (X^{x}(s\wedge \tau \wedge \tau _k);X^{x+z}(s\wedge \tau \wedge \tau _k) - X^{x}(s\wedge \tau \wedge \tau _k)\bigr )\Bigr ] \,\text {d}{s} \end{aligned}$$

for all \(t\ge 0\) and \(k\in {{\mathbb {N}}}\), since, for \(t \ge \tau \), \(X^{x+z}(t) = X^{x}(t)\) a.s. by the pathwise uniqueness of the solution to (1.15). From this and Lemma 2.3 we conclude that the function \(t\mapsto {{\mathbb {E}}}\bigl [{\mathcal {V}}_{p}\bigl (X^{x+z}(t\wedge \tau \wedge \tau _k) -X^{x}(t\wedge \tau \wedge \tau _k)\bigr )\bigr ]\) is differentiable a.e. on \((0,\infty )\). Note that \(|\tilde{{\mathcal {L}}}{\mathcal {V}}_{p}\bigl (x;z\bigr )|\le c|z|^p\) for some \(c>0\) and all \(x,z\in {{\mathbb {R}}}^n\). We conclude now that

$$\begin{aligned}&\frac{\text {d}}{\text {d}{t}}{{\mathbb {E}}}\bigl [{\mathcal {V}}_{p} \bigl (X^{x+z}(t\wedge \tau \wedge \tau _k) -X^{x}(t\wedge \tau \wedge \tau _k)\bigr )\bigr ]\\&\quad = {{\mathbb {E}}}\bigl [ \tilde{{\mathcal {L}}}{\mathcal {V}}_{p}\bigl (X^{x}(t\wedge \tau \wedge \tau _k);X^{x+z}(t\wedge \tau \wedge \tau _k) - X^{x}(t\wedge \tau \wedge \tau _k)\bigr )\bigr ]\\&\quad \le -c(p)\,{{\mathbb {E}}}\bigl [{\mathcal {V}}_{p} \bigl (X^{x+z}(t\wedge \tau \wedge \tau _k)-X^{x}(t\wedge \tau \wedge \tau _k)\bigr )\bigr ] \qquad \text {a.e. on}\ (0,\infty ) \end{aligned}$$

for all \(k\in {{\mathbb {N}}}\). Thus by Gronwall’s lemma, it follows that

$$\begin{aligned} {{\mathbb {E}}}\bigl [{\mathcal {V}}_{p}\bigl (X^{x+z}(t\wedge \tau _k) - X^{x}(t\wedge \tau _k)\bigr )\bigr ]= & {} {{\mathbb {E}}}\bigl [{\mathcal {V}}_{p} \bigl (X^{x+z}(t\wedge \tau \wedge \tau _k)-X^{x}(t\wedge \tau \wedge \tau _k)\bigr )\bigr ]\\\le & {} {\mathcal {V}}_{p}(z)\,\text {e}^{-c(p) t}, \end{aligned}$$

and Fatou’s lemma implies that

$$\begin{aligned} {{\mathbb {E}}}\bigl [{\mathcal {V}}_{p}\bigl (X^{x+z}(t) - X^{x}(t)\bigr )\bigr ] \,\le \, {\mathcal {V}}_{p}(z)\,\text {e}^{-c(p) t} \end{aligned}$$

for all \(t\ge 0\) and \(x,z\in {{\mathbb {R}}}^n\). Next, from the bound \({\underline{\lambda }}_Q|z|^{2}\le |z|^2_Q\le \bar{\lambda }_Q|z|^{2}\) we obtain

$$\begin{aligned} {{\mathbb {E}}}\bigl [|X^{x+z}(t) - X^{x}(t)|^p\bigr ]&\,\le \, \bigl ({\underline{\lambda }}_Q\bigr )^{-\nicefrac {p}{2}}\, {{\mathbb {E}}}\bigl [|X^{x+z}(t) - X^{x}(t)|^p_Q\bigr ]\\&\,\le \, \bigl ({\underline{\lambda }}_Q\bigr )^{-\nicefrac {p}{2}}\, \bigl ({\overline{\lambda }}_Q\bigr )^{\nicefrac {p}{2}}\, |z|^{p}\,\text {e}^{{-c(p)t}} \end{aligned}$$

for all \(t\ge 0\) and \(x,z\in {{\mathbb {R}}}^n\), thus establishing (1.17).

Finally, in order to establish (1.18), we follow the idea from [59, Proof of Corollary 1.8] or [49, Proof of Theorem 2.1]. Observe first that, according to Lemma 2.3, for any \(\upmu \in {\mathcal {P}}_p({{\mathbb {R}}}^{n})\), \(\upmu P_t\in {\mathcal {P}}_p({{\mathbb {R}}}^{n})\) for all \(t\ge 0\). Next, let \(\upmu _1,\upmu _2\in {\mathcal {P}}_p({{\mathbb {R}}}^n)\) be arbitrary. According to (1.17), we have

$$\begin{aligned} {\mathcal {W}}_p(\upmu _1 P_t,\upmu _2 P_t) \,\le \, \left( \frac{\overline{\lambda }_Q}{\underline{\lambda }_Q}\right) ^{\nicefrac {1}{2}} {\mathcal {W}}_p(\upmu _1,\upmu _2)\text {e}^{-\frac{c(p)t}{p}}\qquad \forall \,t\ge 0. \end{aligned}$$

Fix \(t_0>0\) such that

$$\begin{aligned} \left( \frac{\overline{\lambda }_Q}{\underline{\lambda }_Q}\right) ^{\nicefrac {1}{2}} \text {e}^{-\frac{c(p)t_0}{p}}<1. \end{aligned}$$

Then, the mapping \(\upmu \mapsto \upmu P_{t_0}\) is a contraction on \({\mathcal {P}}_p({{\mathbb {R}}}^{n})\). Thus, since \(({\mathcal {P}}_p({{\mathbb {R}}}^{n}),{\mathcal {W}}_p)\) is a complete metric space, the Banach fixed point theorem entails that there exists a unique \(\uppi _{0}\in {\mathcal {P}}_p({{\mathbb {R}}}^{n})\) such that \(\uppi _{0}P_{t_0}(\text {d}x)=\uppi _{0}(\text {d}x)\). By defining \(\uppi (\text {d}x):=t_0^{-1}\int _0^{t_0}\uppi _{0}P_s(\text {d}x)\,\text {d}s\), we can easily see that \(\uppi P_{t}(\text {d}x)=\uppi (\text {d}x)\) for all \(t\ge 0\), i.e. \(\uppi (\text {d}x)\) is an invariant probability measure for \({\{X(t)\}_{t\ge 0}}\). By employing Lemma 2.3 again, we also see that \(\uppi \in {\mathcal {P}}_p({{\mathbb {R}}}^{n})\). Finally, for any \(\upmu \in {\mathcal {P}}_p({{\mathbb {R}}}^{n})\) we have

$$\begin{aligned} {\mathcal {W}}_p(\upmu P_t,\uppi ) = {\mathcal {W}}_p(\upmu P_t,\uppi P_t) \,\le \, \left( \frac{\overline{\lambda }_Q}{\underline{\lambda }_Q}\right) ^{\nicefrac {1}{2}} {\mathcal {W}}_p(\upmu ,\uppi )\text {e}^{-\frac{c(p)t}{p}}\qquad \forall \,t\ge 0, \end{aligned}$$

which also proves uniqueness of \(\uppi (\text {d}x)\).
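
The invariance of \(\uppi (\text {d}x)\) asserted above can be verified directly from the semigroup property. The following is a sketch for \(t\in [0,t_0]\); the general case follows by writing \(t=kt_0+r\) with \(k\in {\mathbb {Z}}_+\) and \(r\in [0,t_0)\) and iterating:

$$\begin{aligned} \uppi P_t&= \frac{1}{t_0}\int _0^{t_0}\uppi _0P_{s+t}\,\text {d}s = \frac{1}{t_0}\int _t^{t_0}\uppi _0P_{u}\,\text {d}u +\frac{1}{t_0}\int _{t_0}^{t_0+t}\uppi _0P_{u}\,\text {d}u\\&= \frac{1}{t_0}\int _t^{t_0}\uppi _0P_{u}\,\text {d}u +\frac{1}{t_0}\int _{0}^{t}\uppi _0P_{t_0}P_{u}\,\text {d}u = \frac{1}{t_0}\int _0^{t_0}\uppi _0P_{u}\,\text {d}u = \uppi , \end{aligned}$$

where the second line uses \(\uppi _0P_{t_0}=\uppi _0\).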

To prove the second assertion, we adapt the proof of [4, Lemma 7.3.4], where an analogous result is shown for \(p=1\). Define

$$\begin{aligned} {\mathcal {V}}_{\varepsilon ,p}(x) \,:=\, \frac{|x|_Q^{p+1}}{\bigl (\varepsilon +|x|^2_Q\bigr )^{\nicefrac {1}{2}}}, \qquad \varepsilon >0,\quad x\in {{\mathbb {R}}}^n, \end{aligned}$$

and observe that in this case \(\tilde{{\mathcal {L}}}\) reduces to

$$\begin{aligned} \tilde{{\mathcal {L}}} f(x;z) = \sum ^{n}_{i=1} \varDelta _{z}b^{i}(x) \frac{\partial f}{\partial x_{i}}(z)\qquad \forall \, x,z\in {{\mathbb {R}}}^n. \end{aligned}$$

Calculating \(\tilde{{\mathcal {L}}}{\mathcal {V}}_{\varepsilon ,p}(x;z)\), using (1.16), we obtain

$$\begin{aligned} \tilde{{\mathcal {L}}}{\mathcal {V}}_{\varepsilon ,p}(x;z)&= \frac{\bigl (\varepsilon (p+1) + p|z|^2_Q\bigr )\,|z|_Q^{p-1}}{\bigl (\varepsilon + |z|^2_Q\bigr )^{\nicefrac {3}{2}}} \, \bigl \langle Qz,\varDelta _{z}b(x)\bigr \rangle \\&\,\le \, -c(p)\, \frac{\varepsilon \frac{p+1}{p} + |z|^2_Q}{\varepsilon + |z|^2_Q}\,{\mathcal {V}}_{\varepsilon ,p}(z)\\&\,\le \, -c(p)\, {\mathcal {V}}_{\varepsilon ,p}(z)\qquad \forall \,x,z\in {{\mathbb {R}}}^n. \end{aligned}$$
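
The gradient identity behind the first equality can be sanity-checked numerically. The following is a minimal Python sketch in dimension one with \(Q=1\); it is not part of the original argument, and the function names as well as the sample values of \(p\) and \(\varepsilon \) are illustrative assumptions. It compares the closed-form derivative of the regularised function with a central finite difference, and evaluates the prefactor \(\bigl (\varepsilon \tfrac{p+1}{p}+|z|^2\bigr )/\bigl (\varepsilon +|z|^2\bigr )\), which is at least 1 and hence justifies the last inequality.

```python
import math


def v_eps(z: float, p: float, eps: float) -> float:
    """Regularised function |z|^(p+1) / sqrt(eps + z^2) (dimension one, Q = 1)."""
    return abs(z) ** (p + 1) / math.sqrt(eps + z * z)


def v_eps_prime(z: float, p: float, eps: float) -> float:
    """Closed-form derivative (eps (p+1) + p z^2) |z|^(p-1) z / (eps + z^2)^(3/2)."""
    return (eps * (p + 1) + p * z * z) * abs(z) ** (p - 1) * z / (eps + z * z) ** 1.5


def central_difference(f, z: float, h: float = 1e-6) -> float:
    """Second-order finite-difference approximation of f'(z)."""
    return (f(z + h) - f(z - h)) / (2 * h)


def prefactor(z: float, p: float, eps: float) -> float:
    """Ratio (eps (p+1)/p + z^2) / (eps + z^2) appearing in the middle bound."""
    return (eps * (p + 1) / p + z * z) / (eps + z * z)


if __name__ == "__main__":
    p, eps = 1.5, 0.1  # illustrative values only
    for z in (-2.0, -0.5, 0.7, 3.0):
        fd = central_difference(lambda s: v_eps(s, p, eps), z)
        print(z, fd, v_eps_prime(z, p, eps), prefactor(z, p, eps))
```

Since the prefactor never drops below 1, the decay rate \(c(p)\) is uniform in \(\varepsilon \), which is what allows passing to the limit \(\varepsilon \rightarrow 0\).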

As before, by Itô’s formula and the conservativeness of \({\{X(t)\}_{t\ge 0}}\), combined with the fact that the Lévy noise does not depend on the state, we obtain

$$\begin{aligned}&{{\mathbb {E}}}\bigl [{\mathcal {V}}_{\varepsilon ,p}\bigl (X^{x+z}(t\wedge \tau \wedge \tau _k) -X^{x}(t\wedge \tau \wedge \tau _k)\bigr )\bigr ] - {\mathcal {V}}_{\varepsilon ,p}(z)\\&\quad = \int _{0}^{t} {{\mathbb {E}}}\Bigl [ \tilde{{\mathcal {L}}}{\mathcal {V}}_{\varepsilon ,p} \bigl (X^{x}(s\wedge \tau \wedge \tau _k);X^{x+z}(s\wedge \tau \wedge \tau _k) - X^{x}(s\wedge \tau \wedge \tau _k)\bigr )\Bigr ] \,\text {d}{s} \end{aligned}$$

for all \(t\ge 0\) and \(k\in {{\mathbb {N}}}\), and

$$\begin{aligned}&\frac{\text {d}}{\text {d}{t}}{{\mathbb {E}}}\bigl [{\mathcal {V}}_{\varepsilon ,p} \bigl (X^{x+z}(t\wedge \tau \wedge \tau _k)-X^{x}(t\wedge \tau \wedge \tau _k)\bigr )\bigr ]\\&= {{\mathbb {E}}}\bigl [ \tilde{{\mathcal {L}}}{\mathcal {V}}_{\varepsilon ,p} \bigl (X^{x}(t\wedge \tau \wedge \tau _k);X^{x+z}(t\wedge \tau \wedge \tau _k) - X^{x}(t\wedge \tau \wedge \tau _k)\bigr )\bigr ]\\&\,\le \,-c(p)\,{{\mathbb {E}}}\bigl [{\mathcal {V}}_{\varepsilon ,p} \bigl (X^{x+z}(t\wedge \tau \wedge \tau _k)-X^{x}(t\wedge \tau \wedge \tau _k)\bigr )\bigr ] \qquad \text {a.e. on} \ (0,\infty ) \end{aligned}$$

for all \(k\in {{\mathbb {N}}}\). Thus by Gronwall’s and Fatou’s lemmas it follows that

$$\begin{aligned} {{\mathbb {E}}}\bigl [{\mathcal {V}}_{\varepsilon ,p}\bigl (X^{x+z}(t) - X^{x}(t)\bigr )\bigr ] \,\le \, {\mathcal {V}}_{\varepsilon ,p}(z)\,\text {e}^{-c(p) t} \end{aligned}$$

for all \(t\ge 0\) and \(x,z\in {{\mathbb {R}}}^n\). Taking limits as \(\varepsilon \rightarrow 0\), and using monotone convergence, the assertion follows. \(\square \)

In what follows we give an alternative proof of Theorem 1.5 in the case when \(\sigma (x)\equiv \sigma \) and \(\upnu (x,\text {d}y)\equiv \upnu (\text {d}y)\). Let \(\bar{X}(t):=Q^{\nicefrac {1}{2}}X(t)\) for \(t\ge 0\). Clearly, \({\{\bar{X}(t)\}_{t\ge 0}}\) is again an Itô process which satisfies

$$\begin{aligned} \bar{X}(t) = x+Q^{\nicefrac {1}{2}}\int _0^tb\bigl (Q^{-\nicefrac {1}{2}}\bar{X}(s)\bigr )\,\text {d}s +Q^{\nicefrac {1}{2}}\sigma \,B(t) +Q^{\nicefrac {1}{2}} L(t)\qquad \forall \, t\ge 0, \end{aligned}$$

where \({\{L(t)\}_{t\ge 0}}\) is an n-dimensional pure-jump and zero-drift Lévy process determined by \(\upnu (\text {d}y)\). The corresponding transition probability satisfies

$$\begin{aligned} \bar{p}(t,x,\text {d}y)&= \bar{{\mathbb {P}}}^x(\bar{X}(t)\in \text {d}y)\\&= {{\mathbb {P}}}^{Q^{-\nicefrac {1}{2}}x}(X(t)\in Q^{-\nicefrac {1}{2}}\text {d}y)\\&= p(t,Q^{-\nicefrac {1}{2}}x,Q^{-\nicefrac {1}{2}}\text {d}y) \quad \forall \, (t,x)\in [0,\infty )\times {{\mathbb {R}}}^n. \end{aligned}$$

Thus, we have

$$\begin{aligned} \begin{aligned} \langle \varDelta _z\bar{b}(x),z\rangle&= \langle Q^{\nicefrac {1}{2}}\varDelta _{Q^{-\nicefrac {1}{2}}z} b(Q^{-\nicefrac {1}{2}}x),z\rangle \\&= \langle \varDelta _{Q^{-\nicefrac {1}{2}}z} b(Q^{-\nicefrac {1}{2}}x),Q^{\nicefrac {1}{2}}z\rangle \\&\,\le \, -\frac{c(p)}{p}|Q^{-\nicefrac {1}{2}}z|^2_Q = -\frac{c(p)}{p}|z|^2\qquad \forall \,x,z\in {{\mathbb {R}}}^n. \end{aligned} \end{aligned}$$
(2.16)

Now, in [11] it has been shown that (2.16) implies that

$$\begin{aligned} {\mathcal {W}}_p(\updelta _x\bar{P}_t,\updelta _y \bar{P}_t) \,\le \, |x-y|\,\text {e}^{-\frac{c(p)t}{p}} \end{aligned}$$

for all \(t\ge 0\) and \(x,y\in {{\mathbb {R}}}^n\). Finally we get

$$\begin{aligned} \begin{aligned} {\mathcal {W}}_p(\updelta _x P_t,\updelta _y P_t)&= {\mathcal {W}}_p\bigl (\bar{p}(t,Q^{\nicefrac {1}{2}}x,Q^{\nicefrac {1}{2}}\text {d}z), \bar{p}(t,Q^{\nicefrac {1}{2}}y,Q^{\nicefrac {1}{2}}\text {d}z)\bigr )\\&\,\le \, \bigl (\underline{\lambda }_Q\bigr )^{-\nicefrac {1}{2}}| Q^{\nicefrac {1}{2}}(x-y)|\,\text {e}^{-\frac{c(p)t}{p}}\\&\,\le \, \left( \frac{\overline{\lambda }_Q}{\underline{\lambda }_Q}\right) ^{\nicefrac {1}{2}}|x-y|\, \text {e}^{-\frac{c(p)t}{p}} \end{aligned} \end{aligned}$$

for all \(t\ge 0\) and \(x,y\in {{\mathbb {R}}}^n\), which is (1.17).

Lastly, we prove Proposition 1.6.

Proof of Proposition 1.6

According to [82, Theorem 4.1], for each \(s\in [0,\infty )\) there exists \(\Pi _s\in {\mathcal {C}}(\updelta _x P_s,\uppi )\) such that \({\mathcal {W}}^p_{p}(\updelta _x P_s,\uppi )= \int _{{\mathbb {X}}\times {\mathbb {X}}}\bigl ({\mathsf {d}}(y,z)\bigr )^p\,\Pi _s(\text {d}y,\text {d}z)\). Now, we have that

$$\begin{aligned} {\mathcal {W}}^p_{p}(\updelta _xP_t^\psi ,\uppi )&= \inf _{\Pi \in {\mathcal {C}}(\updelta _xP_t^\psi ,\uppi )}\, \int _{{\mathbb {X}}\times {\mathbb {X}}}\bigl ({\mathsf {d}}(y,z)\bigr )^p\,\Pi (\text {d}y,\text {d}z)\\&\le \int _{{\mathbb {X}}\times {\mathbb {X}}}\bigl ({\mathsf {d}}(y,z)\bigr )^p \int _{[0,\infty )}\Pi _s(\text {d}y,\text {d}z)\,\upmu _t(\text {d}s)\\&\le \int _{[0,\infty )} {\mathcal {W}}^p_{p}(\updelta _xP_s,\uppi )\,\upmu _t(\text {d}s)\\&\le \bigl (c(x)\bigr )^p\int _{[0,\infty )} \bigl (r(s)\bigr )^p\,\upmu _t(\text {d}s) = \bigl (c(x)\bigr )^p\,{\mathbb {E}}\bigl [\bigl (r(S(t))\bigr )^p\bigr ] \\&\qquad \text {for all } t\ge 0 \text { and } x\in {\mathbb {X}}\end{aligned}$$

which completes the proof. \(\square \)

3 Examples

In this section, we consider applications of the main results to several classes of Markov processes, including Langevin tempered diffusion processes, Ornstein–Uhlenbeck processes with jumps, piecewise Ornstein–Uhlenbeck processes with jumps under constant and stationary Markov controls, state-space models and backward recurrence time chains. Further examples can be found in [24, 25, 32, 33, 78].

3.1 Langevin Tempered Diffusion Processes

We first consider a class of Langevin tempered diffusion processes. Let \(\alpha \in (0,1/n)\), and let \(\pi \in C^2({{\mathbb {R}}}^n)\) be strictly positive, \(\pi (x)=c\,|x|^{-\nicefrac {1}{\alpha }}\) for some \(c>0\) and all \(x\in {\mathcal {B}}^c\), and \(\int _{{{\mathbb {R}}}^n}\pi (x)\,\text {d}x=1\). Further, for \(\beta \in [0,(1+\alpha (2-n))/2]\) and \(x\in {{\mathbb {R}}}^n\), let

$$\begin{aligned} \sigma (x)\,:=\,\bigl (\pi (x)\bigr )^{-\beta }\,{\mathbb {I}}_n,\qquad a(x)\,:=\, \sigma (x)\sigma (x)'=\bigl (\pi (x)\bigr )^{-2\beta }\,{\mathbb {I}}_n, \end{aligned}$$

and

$$\begin{aligned} b(x)\,:=\,\frac{1}{2}\,\bigl (a(x)\,\nabla \log (\pi (x))+\nabla \cdot a(x)'\bigr ) = \frac{1-2\beta }{2}\,\bigl (\pi (x)\bigr )^{-2\beta }\nabla \log (\pi (x)). \end{aligned}$$
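
For orientation, the second equality in the definition of \(b(x)\) and the explicit form of the drift on \({\mathcal {B}}^c\) can be checked as follows. This is a short computation from the definitions above, using that \(a(x)\) is a scalar multiple of \({\mathbb {I}}_n\), so that \(\nabla \cdot a(x)'\) is the gradient of the scalar factor, and that \(\nabla \log (\pi (x))=-\alpha ^{-1}|x|^{-2}x\) on \({\mathcal {B}}^c\):

$$\begin{aligned} \nabla \cdot a(x)'&= \nabla \bigl (\pi (x)\bigr )^{-2\beta } = -2\beta \,\bigl (\pi (x)\bigr )^{-2\beta }\,\nabla \log (\pi (x)),\\ b(x)&= -\frac{1-2\beta }{2\alpha }\,c^{-2\beta }\,|x|^{\frac{2\beta }{\alpha }-2}\,x \qquad \forall \,x\in {\mathcal {B}}^c. \end{aligned}$$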

Then, in [33, Proposition 15] it has been shown that the SDE

$$\begin{aligned} X(t) = x+\int _0^tb\bigl (X(s)\bigr )\,\text {d}s+\int _0^t\sigma \bigl (X(s)\bigr )\, \text {d}B(s), \qquad t\ge 0,\quad x\in {{\mathbb {R}}}^n, \end{aligned}$$

admits a weak solution \((\Omega ,{\mathcal {F}},\{{\mathcal {F}}_t\}_{t\ge 0},{\{B(t)\}_{t\ge 0}},{\{X(t)\}_{t\ge 0}},{{\mathbb {P}}})\), which is a conservative strong Markov process with continuous sample paths. Moreover, it is irreducible, aperiodic, every compact set is petite, and \(\uppi (\text {d}x):=\pi (x)\text {d}x\) is its unique invariant probability measure. Here, \((\Omega ,{\mathcal {F}}, \{{\mathcal {F}}_t\}_{t\ge 0},{\{B(t)\}_{t\ge 0}},{{\mathbb {P}}})\) is a standard n-dimensional Brownian motion. Note also that according to Itô’s formula \({\{X(t)\}_{t\ge 0}}\) satisfies (MP) with

$$\begin{aligned}{\mathcal {L}}f(x) = \bigl \langle b(x),\nabla f(x)\bigr \rangle +\frac{1}{2}{{\,\mathrm{Tr}\,}}\bigl (a(x)\nabla ^2f(x)\bigr ),\qquad x\in {{\mathbb {R}}}^n. \end{aligned}$$

Proposition 3.1

  1. (i)

    If \(\beta \in \bigl [\alpha ,\frac{1}{2}(1+\alpha (1-n))\bigr )\), then the assertions of Theorem 1.1(iii) hold with \({\mathcal {V}}(x)= 1+{\mathcal {V}}_{{\mathbb {I}}_n,\frac{\gamma }{\alpha }}(x)\) and \(\eta = \frac{\gamma }{\alpha }\) for any \(\gamma \in [\alpha ,1+\alpha (2-n)-2\beta )\).

  2. (ii)

    If \(\alpha \in (0,1/(n+1))\) and \(\beta \in [0,\alpha )\), then the assertions of Theorem 1.1(i) and (ii) hold with

    $$\begin{aligned} {\mathcal {V}}(x)= 1+{\mathcal {V}}_{{\mathbb {I}}_n,\frac{\gamma }{\alpha }}(x),\quad \phi (t)= t^{1-\frac{2(\alpha -\beta )}{\gamma }},\quad \text {and } \eta = \alpha ^{-1}\bigl (\gamma -2(\alpha -\beta )\bigr ) \end{aligned}$$

    for any \(\gamma \in [3\alpha -2\beta ,1+\alpha (2-n)-2\beta )\).

  3. (iii)

    Under the assumptions of (ii), \(\uppi \in {\mathcal {P}}_{\alpha ^{-1}(1-\alpha n)-\iota }({{\mathbb {R}}}^n)\) for \(\iota \in \bigl (0,\alpha ^{-1}(1-\alpha (n+1))\bigr )\). Let \(\rho \in (0,(1-(n+1)\alpha )\wedge 2(\alpha -\beta ))\) and \(\varepsilon \in \bigl [\alpha ^{-1}\rho ,2\alpha ^{-1}(\alpha -\beta )\bigr )\) be fixed. Then, for every \(p\in \bigl [1,\alpha ^{-1}(1-n\alpha -\rho )\bigr ]\) and \(\iota \in (0,2\alpha ^{-1}(\alpha -\beta )-\varepsilon )\) there exist a positive constant \(\bar{c}\) and a diverging increasing sequence \(\{t_n\}_{n\in {{\mathbb {N}}}}\subset [0,\infty )\), depending on the above parameters, such that (1.9) in Theorem 1.2 holds with \({\mathcal {V}}(x)\) as above, \(\theta = \alpha ^{-1}(1+\alpha (2-n)-2\beta -\rho )\), and \(\vartheta = \alpha ^{-1}(1-n\alpha -\rho )\).

Proof

  1. (i)

    In [33, Theorem 16(i)] it has been shown that for \(\beta \in \bigl [\alpha ,\frac{1}{2}(1+\alpha (1-n))\bigr )\) and \(\gamma \in (0,1+\alpha (2-n)-2\beta )\) the Foster–Lyapunov condition in (1.1) holds with \({\mathcal {V}}(x)\) as above, \(\phi (t)=t\) and \(C=\bar{{\mathcal {B}}}_r\) for some \(r>0\) large enough. Also, the relation in (1.2) easily follows from the form of \({\mathcal {V}}(x)\) and \(\phi (t)\), and the choice of \(\eta \).

  2. (ii)

    In [33, Theorem 16(ii)] it has been shown that for \(\alpha \in (0,1/n)\), \(\beta \in [0,\alpha )\) and \(\gamma \in (2(\alpha -\beta ),1+\alpha (2-n)-2\beta )\), the Foster–Lyapunov condition in (1.1) holds with \({\mathcal {V}}(x)\) and \(\phi (t)\) as above and \(C=\bar{{\mathcal {B}}}_r\) for some \(r>0\) large enough. The relation in (1.2) can again be easily verified due to the form of \({\mathcal {V}}(x)\) and \(\phi (t)\), and the choice of \(\eta \).

  3. (iii)

    Since \(\vartheta +\varepsilon -\nicefrac {1}{\alpha }+n-1\ge -1\), we have \(\int _{ {{\mathbb {R}}}^n}|x|^{\vartheta +\varepsilon }\,\uppi (\text {d}x) = \infty \). The assertion now follows from Theorem 1.2 by taking \(L(x)=|x|\).

This completes the proof. \(\square \)
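
The divergence used in part (iii) of the proof is a polar-coordinates computation. As a sketch, write \(\omega _n\) for the surface area of the unit sphere in \({{\mathbb {R}}}^n\) (notation introduced here) and assume that \({\mathcal {B}}\) denotes the open unit ball. Then, for \(q\ge 0\),

$$\begin{aligned} \int _{{\mathcal {B}}^c}|x|^{q}\,\uppi (\text {d}x) = c\,\omega _n\int _1^\infty r^{\,q-\nicefrac {1}{\alpha }+n-1}\,\text {d}r, \end{aligned}$$

which is finite if, and only if, \(q-\nicefrac {1}{\alpha }+n-1<-1\), that is, \(q<\alpha ^{-1}(1-\alpha n)\). Taking \(q=\vartheta +\varepsilon \) gives the divergence stated in the proof.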

Remark 3.2

Observe that the rates obtained in Proposition 3.1(ii) and (iii) match. Also, in Proposition 3.1(ii) we assume that \(\alpha \in \bigl (0,(n+1)^{-1}\bigr )\). Indeed, for \(\alpha \in \bigl [(n+1)^{-1},n^{-1}\bigr )\) it holds that \(\int _{{{\mathbb {R}}}^n}|x|\,\uppi (\text {d}x)=\infty \), and hence convergence in the \({\mathcal {W}}_{p}\)-distance cannot hold. On the other hand, in this case, [33, Theorem 16(ii)] shows subexponential convergence in the f-norm. In the following subsections we give examples of Markov processes which are ergodic in the \({\mathcal {W}}_p\)-distance but not in the f-norm. For additional results on ergodic properties of Langevin tempered diffusion processes with respect to the f-norm see [25, 33].

3.2 Ornstein–Uhlenbeck Processes with Jumps

We next consider a class of Itô processes with linear drift. Let H be an \(n\times n\) matrix, and let \({\{L(t)\}_{t\ge 0}}\) be an n-dimensional Lévy process determined by Lévy triplet \(\bigl (b_L,a_L,\upnu _L(\text {d}y)\bigr )\). It is well known that the SDE

$$\begin{aligned} X(t) = x+H\int _0^tX(s)\,\text {d}s+ L(t),\qquad t\ge 0,\quad x\in {{\mathbb {R}}}^n, \end{aligned}$$

admits a unique conservative strong solution \({\{X(t)\}_{t\ge 0}}\) which is a strong Markov process with càdlàg sample paths (see e.g. [2, Theorem 3.1 and Proposition 4.2]). In particular, \({\{X(t)\}_{t\ge 0}}\) is an Itô process satisfying (MP) with \(b(x)=b_L+Hx\), \(a(x)=a_L\), and \(\upnu (x,\text {d}y)=\upnu _L(\text {d}y)\). This process is known as an Ornstein–Uhlenbeck process with jumps. In the case when \({\{L(t)\}_{t\ge 0}}\) is a standard Brownian motion, \({\{X(t)\}_{t\ge 0}}\) is the classical Ornstein–Uhlenbeck process. If H is a Hurwitz matrix (a square matrix all of whose eigenvalues have strictly negative real parts), it has been shown in [73, Theorems 4.1 and 4.2] that \({\{X(t)\}_{t\ge 0}}\) admits a unique invariant \(\uppi \in {\mathcal {P}}({{\mathbb {R}}}^n)\) if, and only if,

$$\begin{aligned} \int _{{\mathcal {B}}^c}\log (|y|)\,\upnu _L(\text {d}y)<\infty . \end{aligned}$$

Moreover, if this is the case, then \(\lim _{t\rightarrow \infty }\updelta _xP_t\bigl (f\bigr ) = \uppi \bigl (f\bigr )\) for all \(x\in {{\mathbb {R}}}^n\) and \(f\in C_b({{\mathbb {R}}}^n)\), i.e. for any \(x\in {{\mathbb {R}}}^n\), the transition kernel \(p(t,x,\text {d}y)\) converges weakly, as \(t\rightarrow \infty \), to \(\uppi (\text {d}y)\). However, this is not enough for \({\mathcal {W}}_{p}\)-convergence of \(p(t,x,\text {d}y)\) to \(\uppi (\text {d}y)\) (see [82, Theorem 6.9]). Assume additionally that \(1\in \Theta _\upnu \), and let \(p\in [1,\theta _\upnu ]\cap \Theta _\upnu \). Since H is Hurwitz, there exists \(Q\in {\mathcal {M}}_+\) such that \(-(QH+H'Q)\in {\mathcal {M}}_+\) (see [27, Lemma 2.2]). The left-hand side of (1.16) then reads

$$\begin{aligned} 2\,\bigl \langle \varDelta _z\tilde{b}(x),Qz\bigr \rangle = 2\,\bigl \langle Hz,Qz\bigr \rangle = \bigl \langle (QH+H'Q)z,z\bigr \rangle \qquad \forall \,x,z\in {{\mathbb {R}}}^n. \end{aligned}$$

Now, by setting

$$\begin{aligned} c(p):=\inf _{z\in {{\mathbb {R}}}^n\setminus \{0\}}\frac{-p\,\bigl \langle (QH+H'Q)z,z\bigr \rangle }{2\,|z|^2_Q}, \end{aligned}$$

the assertions of Theorem 1.5 follow. We remark here that this result does not necessarily imply ergodicity of \({\{X(t)\}_{t\ge 0}}\) in the f-norm. Indeed, let \(n=1\), and take \(L(t)\equiv 0\) for \(t\ge 0\). Then it is easy to see that \(X(t)=x\,\text {e}^{Ht}\) for \(t\ge 0\). Thus, \(\uppi (\text {d}x)=\updelta _0(\text {d}x)\), and \(\updelta _x P_t\) converges to \(\uppi (\text {d}x)\), as \(t\rightarrow \infty \), in \({\mathcal {W}}_p\)-distance for any \(p\ge 1\). However, this convergence cannot hold in the f-norm: for \(x\ne 0\), the measures \(\updelta _x P_t=\updelta _{x\text {e}^{Ht}}\) and \(\updelta _0\) are mutually singular for every \(t\ge 0\), so their total variation distance does not decay.
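
To make the constant \(c(p)\) concrete, the following Python sketch evaluates it for \(2\times 2\) matrices. It is only an illustration under stated assumptions: the matrices `H` and `Q` in the usage lines are hypothetical examples (not taken from the paper), the function names are introduced here, and the code relies on the fact that, for symmetric \(A=-(QH+H'Q)\) and positive definite \(Q\), the infimum of \(\langle Az,z\rangle /|z|_Q^2\) over \(z\ne 0\) is the smaller root of \(\det (A-\lambda Q)=0\).

```python
import math


def lyapunov_form(Q, H):
    """Return A = -(Q H + H' Q) for 2x2 matrices given as nested lists (Q symmetric)."""
    n = 2
    QH = [[sum(Q[i][k] * H[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    # For symmetric Q, H'Q is the transpose of QH.
    return [[-(QH[i][j] + QH[j][i]) for j in range(n)] for i in range(n)]


def contraction_rate(Q, H, p):
    """c(p) = (p/2) * min_{z != 0} <A z, z> / <Q z, z> with A = -(QH + H'Q)."""
    A = lyapunov_form(Q, H)
    # det(A - lam Q) = qa lam^2 + qb lam + qc for symmetric 2x2 matrices A and Q.
    qa = Q[0][0] * Q[1][1] - Q[0][1] ** 2
    qb = -(A[0][0] * Q[1][1] + A[1][1] * Q[0][0] - 2 * A[0][1] * Q[0][1])
    qc = A[0][0] * A[1][1] - A[0][1] ** 2
    disc = max(qb * qb - 4 * qa * qc, 0.0)  # real roots; guard against round-off
    lam_min = (-qb - math.sqrt(disc)) / (2 * qa)
    return 0.5 * p * lam_min


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    H = [[-1.0, 1.0], [0.0, -2.0]]  # hypothetical Hurwitz matrix
    print(contraction_rate(identity, H, 2.0))
```

For the hypothetical choice above, \(Q={\mathbb {I}}_2\) is admissible because \(-(H+H')\) is positive definite; for a general Hurwitz matrix a suitable \(Q\) has to be found first, for instance by solving a Lyapunov equation.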

If \(\theta _\upnu >1\), and \({\{X(t)\}_{t\ge 0}}\) satisfies the assumptions in [6, Theorem 3.1] (which ensure that \({\{X(t)\}_{t\ge 0}}\) is irreducible and aperiodic, and that the support of the corresponding irreducibility measure has nonempty interior), then according to [2, Proposition 4.3] and [79, Theorems 5.1 and 7.1] (which imply that every compact set is petite for \({\{X(t)\}_{t\ge 0}}\)) the conclusions of Theorem 1.4(ii) hold true for any \(\theta \in (1,\theta _\upnu ]\cap \Theta _\upnu \). If \(\theta _\upnu >0\), then under the same assumptions as above, [26, Theorem 5.2], [65, Proposition 6.1], and [66, Theorem 4.2] (and [2, Proposition 4.3], and [79, Theorems 5.1 and 7.1]) imply that for any \(\theta \in (0,\theta _\upnu ]\cap \Theta _\upnu \) the process \({\{X(t)\}_{t\ge 0}}\) is exponentially ergodic in the f-norm with \(f(x)={\mathcal {V}}_{Q,\theta }(x)+1\). However, this does not necessarily imply ergodicity of \({\{X(t)\}_{t\ge 0}}\) in the \({\mathcal {W}}_{p}\)-distance. To see this take again \(n=1\), and let \({\{L(t)\}_{t\ge 0}}\) be a one-dimensional symmetric \(\alpha \)-stable Lévy process with \(\alpha \in (0,1)\) and symbol (characteristic exponent) \(q(\xi )=|\xi |^\alpha \). Thus, \(1\notin \Theta _\upnu \), and \(\theta _\upnu =\alpha \). We claim that \(\uppi \notin {\mathcal {P}}_1({{\mathbb {R}}})\). Assume this is not the case. Then,

$$\begin{aligned} \int _{ {{\mathbb {R}}}}\int _{ {{\mathbb {R}}}}|y|\,p(t,x,\text {d}y)\,\uppi (\text {d}x) = \int _{ {{\mathbb {R}}}}|x|\,\uppi (\text {d}x)<\infty \qquad \forall \,t\ge 0. \end{aligned}$$

In particular, for every \(t>0\) it holds that \(\int _{ {{\mathbb {R}}}} |y|\,p(t,x,\text {d}y)<\infty \), \(\uppi \)-a.e. On the other hand, according to [73, Theorem 3.1], we have

$$\begin{aligned} P_tf(x) = \int _{{{\mathbb {R}}}}f(\text {e}^{Ht}x+y)\,\upmu _t(\text {d}y) \end{aligned}$$

for all \(t\ge 0\), \(x\in {{\mathbb {R}}}\) and \(f\in {\mathcal {B}}_b({{\mathbb {R}}}),\) where \(\upmu _t(\text {d}y)\) is a probability measure on \({{\mathbb {R}}}\) with characteristic function

$$\begin{aligned} \hat{\upmu }_t(\xi ) = \text {e}^{-\int _0^tq(\text {e}^{Hs}\xi )\,\text {d}s} = \text {e}^{\frac{1-\text {e}^{\alpha Ht}}{\alpha H}\,|\xi |^\alpha }, \qquad t\ge 0,\quad \xi \in {{\mathbb {R}}}, \end{aligned}$$

and \({\mathcal {B}}_b({{\mathbb {R}}})\) denotes the space of bounded functions in \({\mathcal {B}}({{\mathbb {R}}})\). Hence, \(\upmu _t(\text {d}y)\) is the law of a symmetric \(\alpha \)-stable random variable. Now, the monotone convergence theorem implies that

$$\begin{aligned} \int _{{{\mathbb {R}}}}|y|\,p(t,x,\text {d}y) = \int _{ {{\mathbb {R}}}}|\text {e}^{Ht}x+y|\,\upmu _t(\text {d}y)<\infty \qquad \forall \,(t,x)\in [0,\infty )\times {{\mathbb {R}}}, \end{aligned}$$

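which is impossible for \(t>0\), since a symmetric \(\alpha \)-stable law with \(\alpha \in (0,1)\) has infinite first moment.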
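
The closed form of the exponent of \(\hat{\upmu }_t(\xi )\) used above can be checked numerically. The following Python sketch is illustrative only: the function names, the step count, and the sample parameter values are assumptions introduced here. It compares \(-\int _0^tq(\text {e}^{Hs}\xi )\,\text {d}s\) with \(q(\xi )=|\xi |^\alpha \), computed by the midpoint rule, against \(\frac{1-\text {e}^{\alpha Ht}}{\alpha H}|\xi |^\alpha \).

```python
import math


def exponent_closed_form(t: float, alpha: float, H: float, xi: float) -> float:
    """(1 - exp(alpha H t)) / (alpha H) * |xi|^alpha, for H != 0."""
    return (1.0 - math.exp(alpha * H * t)) / (alpha * H) * abs(xi) ** alpha


def exponent_by_quadrature(t: float, alpha: float, H: float, xi: float,
                           steps: int = 20000) -> float:
    """-int_0^t |exp(H s) xi|^alpha ds, approximated by the midpoint rule."""
    h = t / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += abs(math.exp(H * s) * xi) ** alpha
    return -total * h


if __name__ == "__main__":
    # Illustrative parameters: H < 0 (Hurwitz in dimension one), alpha in (0, 1).
    t, alpha, H, xi = 2.0, 0.5, -1.0, 3.0
    print(exponent_closed_form(t, alpha, H, xi), exponent_by_quadrature(t, alpha, H, xi))
```

Both values are negative for \(H<0\) and \(t>0\), so \(\hat{\upmu }_t\) has the form \(\text {e}^{-\kappa _t|\xi |^\alpha }\) with \(\kappa _t>0\), consistent with \(\upmu _t(\text {d}y)\) being a symmetric \(\alpha \)-stable law.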

Let us mention that ergodic properties of Ornstein–Uhlenbeck processes with jumps in the f-norm, and in particular in the total variation norm, have been considered in [41, 61, 74, 83, 84].

3.3 Piecewise Ornstein–Uhlenbeck Processes with Jumps

We extend the results from the previous subsection to a class of Itô processes with a piecewise linear drift. Consider an n-dimensional SDE of the form

$$\begin{aligned} X(t) = x+\int _0^t\bar{b}\bigl (X(s)\bigr )\,\text {d}{s} +\int _0^t\sigma \bigl (X(s)\bigr )\, \text {d}B(s)+ L(t),\qquad t\ge 0,\quad x\in {{\mathbb {R}}}^n,\nonumber \\ \end{aligned}$$
(3.1)

where

  1. (i)

    the function \(\bar{b}:{{\mathbb {R}}}^n\rightarrow {{\mathbb {R}}}^n\) is given by

    $$\begin{aligned} \bar{b}(x) = l-M(x-\langle e,x\rangle ^+v)-\langle e,x\rangle ^+\varGamma v, \end{aligned}$$

    where \(l \in {{\mathbb {R}}}^n\), \(v\in {{\mathbb {R}}}^n\) has nonnegative components and satisfies \(\langle e,v\rangle =1\) with \(e = (1,\dotsc ,1)'\in {{\mathbb {R}}}^n\), \(M\in {{\mathbb {R}}}^{n\times n}\) is a nonsingular M-matrix such that the vector \(e'M\) has nonnegative components, and \(\varGamma ={{\,\mathrm{diag}\,}}(\gamma _1,\dotsc ,\gamma _n)\) with \(\gamma _i\ge 0\) for \(i=1,\dotsc ,n\) ;

  2. (ii)

    \({\{B(t)\}_{t\ge 0}}\) is a standard m-dimensional Brownian motion, and the covariance function \(\sigma :{{\mathbb {R}}}^n \rightarrow {{\mathbb {R}}}^{n\times m}\) is locally Lipschitz continuous and satisfies, for some \(c>0\),

    $$\begin{aligned} \Vert \sigma (x)\Vert ^2\,\le \,c\,(1+|x|^2)\qquad \forall \,x\in {{\mathbb {R}}}^n\,; \end{aligned}$$
  3. (iii)

    \({\{L(t)\}_{t\ge 0}}\) is an n-dimensional pure-jump Lévy process specified by a drift \(b_L\in {{\mathbb {R}}}^n\) and Lévy measure \(\upnu _L(\text {d}y)\).

Recall that an \(n\times n\) matrix M is called an M-matrix if it can be expressed as \(M=\mu {\mathbb {I}}_n-N\) for some \(\mu >0\) and some nonnegative \(n\times n\) matrix N with the property that \(\rho (N)\le \mu \), where \({\mathbb {I}}_n\) and \(\rho (N)\) denote the \(n\times n\) identity matrix and the spectral radius of N, respectively. Clearly, the matrix M is nonsingular if \(\rho (N)<\mu \). It is well known that the SDE in (3.1) admits a unique conservative strong solution \({\{X(t)\}_{t\ge 0}}\) which is a strong Markov process with càdlàg sample paths (see e.g. [2, Theorem 3.1 and Proposition 4.2]). In particular, \({\{X(t)\}_{t\ge 0}}\) is an Itô process satisfying (MP) with \(b(x)=b_L+\bar{b}(x)\), \(a(x)=\sigma (x)\sigma (x)'\), and \(\upnu (x,\text {d}y)=\upnu _L(\text {d}y)\). This process is often called a piecewise Ornstein–Uhlenbeck process with jumps. It arises as a limit of the suitably scaled queueing processes of multiclass many-server queueing networks with heavy-tailed (bursty) arrivals and/or asymptotically negligible service interruptions. In these models, if the scheduling policy is based on a static priority assignment on the queues, then the vector v in the limiting diffusion (3.1) corresponds to a constant control. The process \({\{X(t)\}_{t\ge 0}}\) also arises in many-server queues with phase-type service times, where the constant vector v corresponds to the probability distribution of the phases. For a multiclass queueing network with independent heavy-tailed arrivals, the process \({\{L(t)\}_{t\ge 0}}\) is an anisotropic Lévy process consisting of independent one-dimensional symmetric \(\alpha \)-stable components. Under service interruptions, \({\{L(t)\}_{t\ge 0}}\) is either a compound Poisson process, or an anisotropic Lévy process described above together with a compound Poisson component. More details on these queueing models can be found in [6, Sect. 4].
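To make the piecewise linear structure of the drift in item (i) concrete, the following minimal Python sketch evaluates \(\bar{b}(x)\) for given data. The function name `piecewise_drift`, the plain-list representation of vectors and matrices, and the numerical values are assumptions made purely for illustration; they are not part of the model.

```python
def piecewise_drift(x, l, M, v, gamma):
    """Evaluate b(x) = l - M (x - <e,x>^+ v) - <e,x>^+ Gamma v.

    x, l, v are lists of length n, M is an n-by-n list of rows, and
    gamma holds the diagonal entries of Gamma. All names are illustrative.
    """
    n = len(x)
    s = max(sum(x), 0.0)                      # <e, x>^+ with e = (1, ..., 1)'
    z = [x[i] - s * v[i] for i in range(n)]   # x - <e,x>^+ v
    Mz = [sum(M[i][j] * z[j] for j in range(n)) for i in range(n)]
    return [l[i] - Mz[i] - s * gamma[i] * v[i] for i in range(n)]


# Two-dimensional example with diagonal M (hypothetical numbers).
l = [0.0, 0.0]
M = [[1.0, 0.0], [0.0, 2.0]]
v = [0.5, 0.5]
gamma = [0.5, 1.0]

# On the half-space <e,x> >= 0 the control v and the matrix Gamma enter the drift.
print(piecewise_drift([1.0, 1.0], l, M, v, gamma))
# On the half-space <e,x> <= 0 the drift reduces to the linear map l - M x.
print(piecewise_drift([-1.0, -1.0], l, M, v, gamma))
```

The two calls illustrate that \(\bar{b}(x)\) is linear on each of the half-spaces \(\{\langle e,x\rangle \ge 0\}\) and \(\{\langle e,x\rangle \le 0\}\), which is the origin of the name piecewise Ornstein–Uhlenbeck process.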

We first discuss the case when \(\varGamma v=0\). This corresponds to the case when the control gives lowest priority to queues whose abandonment rate is zero. When \(1\in \Theta _\upnu \), we define

$$\begin{aligned} \tilde{l}\,:=\, l+b_L+\int _{{\mathcal {B}}^c}y\,\upnu _L(\text {d}y). \end{aligned}$$
(3.2)

Proposition 3.3

In addition to the assumptions of [6, Theorem 3.1] (which ensure that \({\{X(t)\}_{t\ge 0}}\) is irreducible and aperiodic with irreducibility measure having support with nonempty interior), suppose that \(\varGamma v=0\), \(2\in \Theta _\upnu \), and \(\bigl \langle e,M^{-1}\tilde{l}\bigr \rangle <0\).

  1. (i)

    If

    $$\begin{aligned} \limsup _{|x|\rightarrow \infty }\,\frac{\Vert a(x)\Vert }{|x|} = 0, \end{aligned}$$
    (3.3)

    then there exists \(Q\in {\mathcal {M}}_+\) such that the assertions of Theorem 1.4(i) hold true with \(\vartheta =1\).

  2. (ii)

    If a(x) is bounded, and \(\int _{{\mathcal {B}}^c}\text {e}^{\theta |y|}\,\upnu _L(\text {d}y) < \infty \) for some \(\theta >0\), then there exists \(Q\in {\mathcal {M}}_+\) such that the assertions of Theorem 1.4(iii) hold.

Proof

  1. (i)

    In [6, Theorem 3.2(i)] it has been shown that there exist \(Q\in {\mathcal {M}}_+\), \(\bar{c}=\bar{c}(\theta )>0\), and \(\tilde{c}=\tilde{c}(\theta )>0\), such that for any \(\theta \in [1,\theta _\upnu ]\cap \Theta _\upnu \), we have

    $$\begin{aligned} {\mathcal {L}} {\mathcal {V}}_{Q,\theta }(x)\,\le \, \bar{c} -\tilde{c}\, {\mathcal {V}}_{Q,\theta -1}(x)\qquad \forall \,x\in {{\mathbb {R}}}^n. \end{aligned}$$

    It is easy to see that the above relation implies that there exist \(r>0\), \(\hat{c}>0\), and \(\breve{c}>0\), such that

    $$\begin{aligned} {\mathcal {L}} {\mathcal {V}}(x)\,\le \, \hat{c}\,\mathbb {1}_{\bar{{\mathcal {B}}}_r}(x) -\breve{c}\, \bigl ({\mathcal {V}}(x)\bigr )^{\nicefrac {(\theta -1)}{\theta }} \qquad \forall \,x\in {{\mathbb {R}}}^n, \end{aligned}$$

    with \({\mathcal {V}}(x)={\mathcal {V}}_{Q,\theta }(x)+1\). The assertion now follows from Theorem 1.4(i), together with [2, Proposition 4.3], [25, Theorem 3.4], and [79, Theorems 5.1 and 7.1].

  2. (ii)

    Let \(\tilde{b}(x)\,:=\, \bar{b}(x) + \tilde{l} -l\). As shown in the proof of [6, Theorem 3.2(ii)], there exist \(Q\in {\mathcal {M}}_+\), \(\bar{c}=\bar{c}(\zeta )>0\) and \(\tilde{c}=\tilde{c}(\zeta )>0\), such that for any \(\zeta \in \bigl (0,\theta \Vert Q\Vert ^{-\nicefrac {1}{2}}\bigr )\),

    $$\begin{aligned} \bigl \langle \tilde{b}(x), \nabla {\widetilde{V}}_{Q,\zeta }(x)\bigr \rangle \,\le \, {\bar{c}}- {\tilde{c}} \, {\widetilde{V}}_{Q,\zeta }(x) \qquad \forall \,x\in {{\mathbb {R}}}^n. \end{aligned}$$

    This together with Lemma 2.2 (iii) implies that, for any \(\zeta >0\) sufficiently small, there exist \({\hat{c}}={\hat{c}}(\zeta )>0\) and \({\check{c}}={\check{c}}(\zeta )>0\), such that

    $$\begin{aligned} {\mathcal {L}} {\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x)\,\le \, {\hat{c}}-{\check{c}} \,{\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x) \qquad \forall \,x\in {{\mathbb {R}}}^n. \end{aligned}$$

    Again, it is straightforward to see that the above relation implies that there exist \(r>0\), \(\breve{c}>0\) and \(\mathring{c}>0\), such that

    $$\begin{aligned} {\mathcal {L}} {\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x)\,\le \, \breve{c}\,\mathbb {1}_{\bar{{\mathcal {B}}}_r}(x) -\mathring{c}\, {\widetilde{{\mathcal {V}}}}_{Q,\zeta }(x)\qquad \forall \, x\in {{\mathbb {R}}}^n. \end{aligned}$$

    The assertion now follows from Theorem 1.4(iii), and the results from [2, 25, 79] cited in part (i).

\(\square \)

Remark 3.4

It has been shown in [6, Theorem 3.3 (b) and Lemma 5.7] that the assumptions \(1\in \Theta _\upnu \) and \(\langle e,M^{-1}\tilde{l}\rangle <0\) are both necessary for the existence of an invariant probability measure of \({\{X(t)\}_{t\ge 0}}\). Using this, we can exhibit an example where we have ergodicity with respect to the f-norm but not with respect to \({\mathcal {W}}_p\)-distance. Suppose that \(\varGamma v=0\), \(\langle e,M^{-1}\tilde{l}\rangle <0\), a(x) satisfies (3.3), and \({\{L(t)\}_{t\ge 0}}\) is a rotationally invariant \(\alpha \)-stable process with \(\alpha \in (1,2)\). Then [6, Theorem 3.1(i)] shows that \({\{X(t)\}_{t\ge 0}}\) admits a unique invariant \(\uppi \in {\mathcal {P}}_{\alpha -1-\iota }({{\mathbb {R}}}^n)\) for \(\iota \in (0,\alpha -1)\), and

$$\begin{aligned} \lim _{t\rightarrow \infty }t^{\alpha -1-\iota }\,\Vert \updelta _x P_t(\cdot ) -\uppi (\cdot )\Vert _{\text {TV}} = 0 \end{aligned}$$

for all \(x\in {{\mathbb {R}}}^n\) and \(\iota \in (0,\alpha -1)\). Here, \(\Vert \cdot \Vert _\text {TV}\) stands for the total variation norm, i.e. the f-norm with \(f(x)\equiv 1\). However, \(\int _{{{\mathbb {R}}}^n} |x|\, \uppi (\text {d}x)=\infty \) by [6, Theorem 3.4 (b)], so we cannot have convergence in \({\mathcal {W}}_1\)-distance.

We next exhibit a lower bound on the polynomial rate of convergence in Proposition 3.3(i), which is analogous to [6, Theorem 3.4]. We let

$$\begin{aligned} \tilde{\theta }_\upnu \,:=\, \sup \biggl \{ \theta \ge 0:\int _{{{\mathbb {R}}}^n}\bigl (\langle e,M^{-1}y\rangle ^+\bigr )^\theta \, \upnu _L(\text {d}y)<\infty \biggr \}. \end{aligned}$$
(3.4)

Note that, in general, \(\tilde{\theta }_\upnu \ge \theta _\upnu \). In [6] it is assumed that \({\{L(t)\}_{t\ge 0}}\) is a compound Poisson process with drift \(b_L\), and Lévy measure \(\upnu _L(\text {d}y)\) which is supported on a half-line of the form \(\{\zeta w:\zeta \in [0,\infty )\}\) with \(\langle e,M^{-1}w\rangle >0\), and a(x) satisfies (3.3). This implies that \(\tilde{\theta }_\upnu =\theta _\upnu \), and subsequently, this equality is used in the proof of [6, Lemma 5.7 (b)] to establish that, provided \(\varGamma v=0\), \(\int _{{{\mathbb {R}}}^n}\bigl (\langle e,M^{-1}x\rangle ^+\bigr )^{p-1}\, \uppi (\text {d}x)<\infty \) implies \(p\in \Theta _\upnu \) for \(p>1\). In the proof of the following proposition we use this fact, namely that the conclusions of [6, Lemma 5.7 (b)] hold under the weaker assumption that \(\tilde{\theta }_\upnu =\theta _\upnu \).

Proposition 3.5

In addition to the assumptions of [6, Theorem 3.1], assume that \(\varGamma v=0\), \(\bigl \langle e,M^{-1}\tilde{l}\bigr \rangle <0\), and \(\tilde{\theta }_\upnu =\theta _\upnu \in (2,\infty )\). Then, due to Proposition 3.3(i), \({\{X(t)\}_{t\ge 0}}\) admits a unique invariant \(\uppi \in {\mathcal {P}}_{\theta _\upnu -1-\iota }({{\mathbb {R}}}^n)\), \(\iota \in (0,\theta _\upnu -1)\). Next, fix \(\rho \in (0,(\theta _\upnu -2)\wedge 1)\) and \(\varepsilon \in (\rho ,1)\). Then, for any \(p\in [1,\theta _\upnu -\rho -1]\) and \(\iota \in (0,1-\varepsilon )\) there exist \(\bar{c}>0\) and a diverging increasing sequence \(\{t_n\}_{n\in {{\mathbb {N}}}}\subset [0,\infty )\), depending on these parameters, such that (1.9) holds with \(\theta =\theta _\upnu -\rho \), \(\vartheta =\theta -1\), and \({\mathcal {V}}(x)={\mathcal {V}}_{Q,\theta }(x)+1\), where \(Q\in {\mathcal {M}}_+\) is given in Proposition 3.3(i).

Proof

Observe first that \(\vartheta +\varepsilon >\theta _\upnu -1\). Thus, according to [6, Lemma 5.7 (b)], we have

$$\begin{aligned} \int _{{{\mathbb {R}}}^n}\bigl (\langle e,M^{-1}x\rangle ^+\bigr )^{\vartheta +\varepsilon }\, \uppi (\text {d}x) = \infty . \end{aligned}$$

The assertion now follows from the proof of Proposition 3.3(i) (together with [2, Proposition 4.3], [25, Theorem 3.4], and [79, Theorems 5.1 and 7.1]), and Theorem 1.2 by setting \(L(x)=\langle e,M^{-1}x\rangle ^+\) and \(\phi (t)= t^{\nicefrac {(\theta -1)}{\theta }}\). \(\square \)

We now discuss the case when \(\varGamma v\ne 0\). For \(x\in {{\mathbb {R}}}^n\), we write \(x\ge 0\) (\(x\gneqq 0\)) to indicate that all components of x are nonnegative (nonnegative and at least one is strictly positive). Also, for \(x,y\in {{\mathbb {R}}}^n\) we write \(x\ge y\) if, and only if, \(x-y\ge 0\).

Proposition 3.6

In addition to the assumptions of [6, Theorem 3.1], suppose that \(\theta _\upnu >0\),

$$\begin{aligned} \limsup _{|x|\rightarrow \infty }\,\frac{\Vert a(x)\Vert }{|x|^2} = 0, \end{aligned}$$
(3.5)

and that one of the following holds:

  1. (i)

    \(Mv\ge \varGamma v\gneqq 0\);

  2. (ii)

    \(M={{\,\mathrm{diag}\,}}(m_1,\dotsc ,m_n)\) with \(m_i>0\), \(i=1,\dotsc ,n\), and \(\varGamma v\ne 0\).

Then there exists \(Q\in {\mathcal {M}}_+\) such that the assertions of Theorem 1.4(ii) hold true.

Proof

In [6, Theorem 3.5] it has been shown that there exist \(Q\in {\mathcal {M}}_+\), \(\bar{c}=\bar{c}(\theta )>0\), and \(\tilde{c}=\tilde{c}(\theta )>0\), such that for any \(\theta \in (0,\theta _\upnu ]\cap \Theta _\upnu \), we have

$$\begin{aligned} {\mathcal {L}} {\mathcal {V}}_{Q,\theta }(x)\,\le \, \bar{c}-\tilde{c} \,{\mathcal {V}}_{Q,\theta }(x) \qquad \forall \, x\in {{\mathbb {R}}}^n. \end{aligned}$$

As in Proposition 3.3, it is easy to see that the above relation implies that there exist \(r>0\), \(\hat{c}>0\) and \(\breve{c}>0\), such that

$$\begin{aligned} {\mathcal {L}} {\mathcal {V}}(x)\,\le \, \hat{c}\,\mathbb {1}_{\bar{{\mathcal {B}}}_r}(x)-\breve{c}\, {\mathcal {V}}(x) \qquad \forall \,x\in {{\mathbb {R}}}^n, \end{aligned}$$

with \({\mathcal {V}}(x)={\mathcal {V}}_{Q,\theta }(x)+1\). The assertion now follows from Theorem 1.4(ii), together with the results from [2, 25, 79] cited in the proof of Proposition 3.5. \(\square \)

In the case when \(\varGamma v\ne 0\) (under (i) or (ii) in Proposition 3.6) the dynamics are contractive in the \({\mathcal {W}}_p\)-distance. This is shown by establishing an asymptotic flatness (uniform dissipativity) property for \({\{X(t)\}_{t\ge 0}}\). As a consequence, we assert exponential ergodicity of \({\{X(t)\}_{t\ge 0}}\) with respect to \({\mathcal {W}}_p\), without assuming irreducibility and aperiodicity, i.e. we allow the SDE in (3.1) to be degenerate.

Proposition 3.7

Suppose that \(2\in \Theta _\upnu \), \(\sigma (x)\) is Lipschitz continuous, and either (i) or (ii) in Proposition 3.6 holds. Then there exists \(Q\in {\mathcal {M}}_+\) such that the matrices

$$\begin{aligned} MQ+QM,\quad \text {and}\quad \bigl (M- ev'(M-\varGamma )\bigr )Q + Q\bigl (M- (M-\varGamma )v e'\bigr ) \end{aligned}$$
(3.6)

are in \({\mathcal {M}}_+\). Let \({\underline{\kappa }}\) denote the smallest eigenvalue of the positive definite matrices in (3.6), and let \({\overline{\lambda }}_Q\) and \({\underline{\lambda }}_Q\) denote the largest and smallest eigenvalues of Q, respectively. For \(p\ge 1\), let

$$\begin{aligned} c(p) \,:=\, \frac{p}{2} \left( \frac{{\underline{\kappa }}}{{\overline{\lambda }}_Q} -\frac{(p-1){{\,\mathrm{Lip}\,}}^2\bigl (\sqrt{Q}\,\sigma \bigr )}{\underline{\lambda }_Q}\right) , \end{aligned}$$

where \({{\,\mathrm{Lip}\,}}(\sqrt{Q}\,\sigma )\) is the Lipschitz constant of \(\sqrt{Q}\,\sigma (x)\) with respect to the Hilbert-Schmidt norm, and suppose that \(c(p)>0\) for some \(p\in [2,\theta _\upnu ]\cap \Theta _\upnu \). Then the assertions of Theorem 1.5 hold true. If \(\sigma (x)\equiv \sigma \) and \(1\in \Theta _\upnu \), the assertions of Theorem 1.5 hold true for any \(p\in [1,\theta _\upnu ]\cap \Theta _\upnu \).

Proof

Existence of the matrix Q has been proven in [6, Theorem 3.5]. We prove that (1.16) holds with c(p) defined above. First, clearly,

$$\begin{aligned} {{\,\mathrm{Tr}\,}}\,\bigl (\tilde{a}(x;y-x) Q\bigr )+(p-2)\, \bigl \Vert \sqrt{Q}\,\varDelta _{y-x}\sigma (x)\bigr \Vert ^{2}\,\le & {} (p-1){{\,\mathrm{Lip}\,}}^2\bigl (\sqrt{Q}\,\sigma \bigr )\,|y-x|^2\,\nonumber \\\le & {} \frac{(p-1){{\,\mathrm{Lip}\,}}^2\bigl (\sqrt{Q}\,\sigma \bigr )}{{\underline{\lambda }}_Q}\,|y-x|_Q^2\nonumber \\ \end{aligned}$$
(3.7)

for all \(x,y\in {{\mathbb {R}}}^n\). We next discuss the term \(\bigl \langle \varDelta _{y-x}\tilde{b}(x),Q(y-x)\bigr \rangle \) for \(x,y\in {{\mathbb {R}}}^n\). Clearly, \(\varDelta _{y-x}\tilde{b}(x)=\varDelta _{y-x} \bar{b}(x)\) for \(x,y\in {{\mathbb {R}}}^n\). With \(\hat{v}=-M^{-1}(M v - \varGamma v)\), we have \(\bar{b}(x) = l -M (x+\langle e,x\rangle ^+\,\hat{v})\). If both x and y are on the same half-space, i.e. \(\langle e,x\rangle \ge 0\) and \(\langle e,y\rangle \ge 0\), or the opposite, then

$$\begin{aligned} \bigl \langle \varDelta _{y-x}\tilde{b}(x),Q(y-x)\bigr \rangle \,\le \, - \frac{{\underline{\kappa }}}{2}\, |y-x|^2. \end{aligned}$$

So suppose, without loss of generality, that \(\langle e,x\rangle \ge 0\) and \(\langle e,y\rangle \le 0\). Then we have

$$\begin{aligned} \bigl \langle y-x, Q \,\bar{b}(x) \bigr \rangle&= \bigl \langle y-x,Q\,l\bigr \rangle -\bigl \langle y-x, Q M x\bigr \rangle -\bigl \langle y-x, Q M\hat{v}e' x\bigr \rangle \end{aligned}$$
(3.8a)
$$\begin{aligned} \bigl \langle y-x, Q\, \bar{b}(y) \bigr \rangle&= \bigl \langle y-x,Q\,l\bigr \rangle -\bigl \langle y-x, Q M y\bigr \rangle . \end{aligned}$$
(3.8b)

We distinguish two cases.

  1. (i)

    \(\bigl \langle y-x, Q M\hat{v}e' x\bigr \rangle \le 0\). Then of course subtracting Eq. (3.8a) from (3.8b), we obtain

    $$\begin{aligned} \bigl \langle \varDelta _{y-x}\tilde{b}(x),Q(y-x)\bigr \rangle&= - \bigl \langle y-x, QM (y-x)\bigr \rangle + \bigl \langle y-x, Q M\hat{v}e' x\bigr \rangle \\&\,\le \,- \bigl \langle y-x, QM (y-x)\bigr \rangle \\&\,\le \, - \frac{{\underline{\kappa }}}{2}\,|y-x|^2. \end{aligned}$$
  2. (ii)

    \(\bigl \langle y-x, Q M\hat{v}e' x\bigr \rangle >0\). Since \(\langle e,x\rangle \ge 0\), we must have \(\bigl \langle y-x, Q M\hat{v}\bigr \rangle >0\). This in turn implies, since \(\langle e,y\rangle \le 0\), that

    $$\begin{aligned} \bigl \langle y-x, Q M\hat{v}e' y\bigr \rangle \,\le \,0. \end{aligned}$$
    (3.9)

Adding Eq. (3.8a) and (3.9) and subtracting Eq. (3.8b) from the sum, we obtain

$$\begin{aligned} \bigl \langle \varDelta _{y-x}\tilde{b}(x),Q(y-x)\bigr \rangle\le & {} - \bigl \langle y-x, QM (y-x)\bigr \rangle - \bigl \langle y-x, Q M\hat{v}e' (y-x)\bigr \rangle \nonumber \\\le & {} - \bigl \langle y-x, QM({\mathbb {I}}_n+\hat{v}e') (y-x)\bigr \rangle \le - \frac{{\underline{\kappa }}}{2}\,|y-x|^2.\nonumber \\ \end{aligned}$$
(3.10)

Finally, combining (3.7) and (3.10), we obtain

$$\begin{aligned}&2\,\bigl \langle \varDelta _{y-x}\tilde{b}(x),Q(y-x)\bigr \rangle +{{\,\mathrm{Tr}\,}}\,\bigl (\tilde{a}(x;y-x) Q\bigr )+(p-2)\, \bigl \Vert \sqrt{Q}\,\varDelta _{y-x}\sigma (x)\bigr \Vert ^{2}\\&\quad \,\le \, \left( - \frac{{\underline{\kappa }}}{{\overline{\lambda }}_Q}+ \frac{(p-1){{\,\mathrm{Lip}\,}}^2\bigl (\sqrt{Q}\,\sigma \bigr )}{{\underline{\lambda }}_Q}\right) |y-x|_Q^2\\&\quad =\, -\frac{2\, c(p)}{p}\, |y-x|_Q^2\qquad \forall \, x,y\in {{\mathbb {R}}}^n, \end{aligned}$$

thus completing the proof. \(\square \)

The hypothesis in Proposition 3.7 that \(c(p)>0\) is, of course, always true if \(\sigma (x)\equiv \sigma \), in which case we have \(c(p)=p\frac{\underline{\kappa }}{2\overline{\lambda }_Q}\). This is the scenario for multiclass queueing models with service interruptions described in [6, Sect. 4.2].
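The dependence of \(c(p)\) on p can be made explicit: since \(c(p)>0\) if, and only if, \(\frac{{\underline{\kappa }}}{{\overline{\lambda }}_Q} > \frac{(p-1){{\,\mathrm{Lip}\,}}^2(\sqrt{Q}\,\sigma )}{{\underline{\lambda }}_Q}\), the condition holds precisely for \(p<1+\frac{{\underline{\kappa }}\,{\underline{\lambda }}_Q}{{\overline{\lambda }}_Q{{\,\mathrm{Lip}\,}}^2(\sqrt{Q}\,\sigma )}\) when the Lipschitz constant is positive. The following Python sketch evaluates both quantities; the function names and the numerical inputs are hypothetical and serve only as an illustration, and the constants \({\underline{\kappa }}\), \({\overline{\lambda }}_Q\), \({\underline{\lambda }}_Q\) and the Lipschitz constant are assumed to be supplied by the user.

```python
def contraction_rate(p, kappa, lam_max, lam_min, lip):
    """c(p) = (p/2) * (kappa / lam_max - (p - 1) * lip**2 / lam_min).

    kappa   : smallest eigenvalue of the matrices in (3.6)
    lam_max : largest eigenvalue of Q
    lam_min : smallest eigenvalue of Q
    lip     : Lipschitz constant of sqrt(Q) * sigma (Hilbert-Schmidt norm)
    """
    return 0.5 * p * (kappa / lam_max - (p - 1.0) * lip ** 2 / lam_min)


def largest_admissible_p(kappa, lam_max, lam_min, lip):
    """Supremum of the p >= 1 for which c(p) > 0 (infinite if lip == 0)."""
    if lip == 0.0:
        return float("inf")
    return 1.0 + kappa * lam_min / (lam_max * lip ** 2)


# Hypothetical constants, chosen only to illustrate the formulas.
kappa, lam_max, lam_min = 2.0, 4.0, 1.0
print(contraction_rate(3.0, kappa, lam_max, lam_min, lip=0.0))   # constant sigma
print(largest_admissible_p(kappa, lam_max, lam_min, lip=0.5))    # state-dependent sigma
```

Note that the threshold computed by `largest_admissible_p` concerns only the sign of \(c(p)\); Proposition 3.7 additionally requires \(p\in [2,\theta _\upnu ]\cap \Theta _\upnu \).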

Some examples of degenerate SDEs of the form (3.1) for which Proposition 3.7 is applicable are the following.

  1. (i)

    \({\{L(t)\}_{t\ge 0}}\) is given by \(L(t) = R \tilde{L}(t)\) for \(t\ge 0\), where \(R \in {\mathbb {R}}^{n\times r}\) has rank smaller than \(\min \{n,r\}\), and \(\{\tilde{L}(t)\}_{t\ge 0}\) is an r-dimensional Lévy process. As a special case \(\{\tilde{L}(t)\}_{t\ge 0}\) may be composed of mutually independent \(\alpha \)-stable processes. This is the case in the queueing example described below.

  2. (ii)

    \({\{L(t)\}_{t\ge 0}}\) is a degenerate subordinate Brownian motion, as studied in [87].

The following is an example of a degenerate SDE that arises in applications for which Proposition 3.6 is applicable. Consider a two-class \(GI/M/k+M\) queue with class-1 jobs having a Poisson arrival process, and class-2 jobs having a heavy-tailed renewal arrival process. Service and patience times are exponentially distributed with rates \(m_i\) and \(\gamma _i\) for \(i=1,2\), respectively. Assume that the arrival, service and abandonment processes are mutually independent, and that the number of servers is k. Consider a sequence of such models indexed by k, operating in the critically loaded asymptotic modified Halfin-Whitt regime as \(k\rightarrow \infty \). Let \({\{A^k_i(t)\}_{t\ge 0}}\) denote the arrival process for class \(i=1,2\), with arrival rates \(\lambda ^k_i\). Assume that \(m_i\) and \(\gamma _i\) for \(i=1,2\) are independent of k, and that \(\frac{\lambda ^k_i}{k}\rightarrow \lambda _i>0\) as \(k\rightarrow \infty \), for \(i=1,2\). The arrival process \({\{A^k_1(t)\}_{t\ge 0}}\) satisfies a functional central limit theorem (FCLT) with a Brownian motion limit \({\{\hat{A}_1(t)\}_{t\ge 0}}={\{\sqrt{\lambda _1} B_1(t)\}_{t\ge 0}}\), where \({\{B_1(t)\}_{t\ge 0}}\) is a standard Brownian motion, and the convergence takes place in the space \(D=D([0,\infty ), {{\mathbb {R}}})\) of càdlàg functions endowed with the Skorokhod \(\text {J}_1\) topology.

We assume that the arrival process \({\{A^k_2(t)\}_{t\ge 0}}\) satisfies an FCLT with a symmetric \(\alpha \)-stable Lévy process \({\{\hat{A}_2(t)\}_{t\ge 0}}\), \(\alpha \in (1,2)\), in the limit, where the convergence takes place in the space D endowed with the \(\text {M}_1\) topology.

Let \(\rho ^k_i=\frac{\lambda ^k_i}{km_i}\) and \(\rho _i = \frac{\lambda _i}{m_i}\) for \(i=1,2\). The modified Halfin-Whitt regime requires that the parameters satisfy

$$\begin{aligned} \lim _{k\rightarrow \infty }k^{1-\nicefrac {1}{\alpha }} \left( 1- \sum _{i=1}^2 \rho _i^k\right) = \hat{\rho }\in {{\mathbb {R}}},\qquad \text {and}\qquad \sum _{i=1}^2 \rho _i = 1. \end{aligned}$$

In addition, we assume that \(k^{-\nicefrac {1}{\alpha }} (\lambda ^k_i - k \lambda _i)\rightarrow l_i\) as \(k\rightarrow \infty \) for \(i=1,2\). Next, let \({\{X^k_i(t)\}_{t\ge 0}}\) denote the number of class-i jobs in the system. Define the scaled processes \(\hat{X}^k_i(t)=k^{-\nicefrac {1}{\alpha }} (X^k_i(t)-k \rho _it)\) for \(t\ge 0\). Let \({\{U^k_i(t)\}_{t\ge 0}}\) be the scheduling control process, representing allocations of service capacity to class i. Let \(\hat{X}^k(t) = \bigl (\hat{X}^k_1(t), \hat{X}^k_2(t)\bigr )'\) and \(U^k(t)= \bigl (U^k_1(t),U^k_2(t)\bigr )'\) for \(t\ge 0\). We consider work conserving and preemptive scheduling policies resulting in constant controls at the limit, i.e. \(U^k\) converges, as \(k\rightarrow \infty \), to a limit V, where \(V(t)=v\) for \(t\ge 0\) with \(v\in {{\mathbb {R}}}^2\) being a probability vector. Then, as in [6, Theorem 4.1], it can be shown that \(\hat{X}^k\) converges, as \(k\rightarrow \infty \), to a limit process \({\{X(t)\}_{t\ge 0}}\) which is a solution to the following two-dimensional degenerate \(\alpha \)-stable driven SDE:

$$\begin{aligned} \text {d}X_1(t)&= \Bigl ( l_1-m_1 (X_1(t) - \langle e, X(t)\rangle ^+ v_1) - \gamma _1 \langle e, X(t)\rangle ^+ v_1 \Bigr ) \text {d}t, \\ \text {d}X_2(t)&= \Bigl (l_2 -m_2 (X_2(t) - \langle e, X(t)\rangle ^+ v_2) -\gamma _2 \langle e, X(t)\rangle ^+ v_2 \Bigr ) \text {d}t + \text {d}\hat{A}_2(t), \end{aligned}$$

which is (3.1) with \(l=(l_1,l_2)'\), \(M={{\,\mathrm{diag}\,}}(m_1,m_2)\), \(\varGamma ={{\,\mathrm{diag}\,}}(\gamma _1,\gamma _2)\), \(\sigma (x)=(0,0)'\), and \(L(t)=(0,\hat{A}_2(t))'\) for \(t\ge 0\). Observe that the process \({\{X(t)\}_{t\ge 0}}\) does not fall into any of the four categories in [6, Theorem 3.1]. In fact, one can consider multiple classes of jobs with all heavy-tailed arrival processes that have different scaling parameters \(\alpha _i\)’s for \(i=1,\ldots ,\bar{k}\), in their corresponding FCLTs. The centered queueing process should be scaled as \(k^{-\nicefrac {1}{\alpha }}\), where \(\alpha :=\min _{i=1,\ldots ,\bar{k}}\{\alpha _i\}\), and the limit process has the components \({\{X_i(t)\}_{t\ge 0}}\) driven by independent \(\alpha \)-stable processes if the arrival process of class i has the parameter \(\alpha _i\) equal to the minimum \(\alpha \), and the other components are degenerate without stochastic driving terms.

We remark here that without assuming irreducibility and aperiodicity, establishing subgeometric ergodicity in the case \(\varGamma v=0\) is difficult. Consider the following example. Let \(n=1\), \(\sigma (x)\equiv 0\), \(L(t)\equiv 0\) for \(t\ge 0\), and

$$\begin{aligned} \bar{b}(x) = {\left\{ \begin{array}{ll} -1 , &{} x\ge 0,\\[2pt] -1 - x, &{}x\le 0. \end{array}\right. } \end{aligned}$$

Clearly, \(\bar{b}(x)\) satisfies all the assumptions in [6], and

$$\begin{aligned} X^x(t) = x+\int _0^t\bar{b}\bigl (X^x(s)\bigr )\,\text {d}s,\qquad t\ge 0,\quad x\in {{\mathbb {R}}}. \end{aligned}$$

A straightforward calculation shows that

$$\begin{aligned} X^x(t) = {\left\{ \begin{array}{ll} {\left\{ \begin{array}{ll} x-t , &{} 0\le t\le x, \\[2pt] \text {e}^{x-t}-1, &{}t\ge x, \end{array}\right. } &{} x\ge 0,\\[15pt] -1 +\text {e}^{-t}+x\,\text {e}^{-t}, &{}x\le 0. \end{array}\right. } \end{aligned}$$

Let

$$\begin{aligned} {\mathsf {d}}(x,y)\,:=\,\frac{|x-y|}{1+|x-y|},\qquad x,y\in {{\mathbb {R}}}. \end{aligned}$$

Then it is easy to see that the conditions (1)–(3) in [12, Theorem 2.4] hold. However, condition (4) does not hold. Namely, for arbitrary \(t_0>0\) let \(x,y>t_0\). Then, \({\mathsf {d}}\bigl (X^x(t),X^y(t)\bigr )={\mathsf {d}}(x,y)\) for all \(t_0\le t\le x\wedge y\).
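The failure of contractivity in this example is easy to check numerically. The following Python sketch implements the drift \(\bar{b}(x)\), the closed-form solution \(X^x(t)\) and the bounded metric \({\mathsf {d}}\) given above, and compares the closed form with an explicit Euler scheme. The function names `drift`, `flow`, `d` and `euler`, as well as the step count, are our own illustrative choices.

```python
import math


def drift(x):
    """Drift of the one-dimensional example: -1 for x >= 0, -1 - x for x < 0."""
    return -1.0 if x >= 0 else -1.0 - x


def flow(x, t):
    """Closed-form solution X^x(t) of dX = drift(X) dt with X(0) = x."""
    if x >= 0:
        return x - t if t <= x else math.exp(x - t) - 1.0
    return -1.0 + (1.0 + x) * math.exp(-t)


def d(x, y):
    """Bounded metric d(x, y) = |x - y| / (1 + |x - y|)."""
    r = abs(x - y)
    return r / (1.0 + r)


def euler(x, t, steps=100000):
    """Explicit Euler approximation of X^x(t), used only as a sanity check."""
    h = t / steps
    for _ in range(steps):
        x += h * drift(x)
    return x


# Two initial points to the right of t0 = 2: the distance does not shrink up to time x ^ y.
x, y, t0 = 5.0, 4.0, 2.0
print(d(flow(x, t0), flow(y, t0)), d(x, y))
# The closed form agrees with the Euler scheme up to discretization error.
print(abs(euler(2.0, 3.0) - flow(2.0, 3.0)))
```

For \(x,y>t_0\) both trajectories move to the left with unit speed until time \(x\wedge y\), so their distance is preserved on \([t_0,x\wedge y]\), which is exactly the obstruction described above.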

Let us mention that ergodic properties of piecewise Ornstein–Uhlenbeck processes with jumps in the total variation norm have been considered in [6, 23, 70].

3.4 Piecewise Ornstein–Uhlenbeck Processes with Jumps Under Stationary Markov Controls

In Sect. 3.3 we considered a model with a constant control, i.e. with the vector \(v\in \Delta :=\{u\in {{\mathbb {R}}}^n:u\ge 0,\ \langle e,u\rangle =1\}\) being constant and fixed. If the scheduling policy (control) is a function of the state of the system, then v(x) in the limiting SDE (3.1) is, in general, a Borel measurable map from \({{\mathbb {R}}}^n\) to \(\Delta \). We call such a v(x) a stationary Markov control and denote the set of such controls by \(\mathfrak {U}_{\text {SM}}\). If \(L(t)\equiv 0\) for \(t\ge 0\), or \({\{L(t)\}_{t\ge 0}}\) is a compound Poisson process, it follows from the results in [37] that, under any \(v\in \mathfrak {U}_{\text {SM}}\), (3.1) admits a unique conservative strong solution which is a strong Markov process with càdlàg sample paths. In the general case, we consider the subclass of stationary Markov controls for which

$$\begin{aligned} \bar{b}_v(x) = l-M\bigl (x-\langle e,x\rangle ^+v(x)\bigr ) -\langle e,x\rangle ^+\varGamma v(x), \end{aligned}$$

is locally Lipschitz continuous. We let \(\widetilde{{\mathfrak {U}}}_{\text {sm}}\) denote the class of such controls. Clearly, for any \(v\in \widetilde{{\mathfrak {U}}}_{\text {sm}}\), the drift \(\bar{b}_v(x)\) has at most linear growth. Other parameters are as in Sect. 3.3. Again, the SDE of the form (3.1), with \(\bar{b}(x)\) replaced by \(\bar{b}_v(x)\), admits a unique conservative strong solution \({\{X(t)\}_{t\ge 0}}\) which is a strong Markov process with càdlàg sample paths. Also, it is an Itô process satisfying (MP) with \(b(x)=b_L+\bar{b}_v(x)\), \(a(x)=\sigma (x)\sigma (x)'\), and \(\upnu (x,\text {d}y)=\upnu _L(\text {d}y)\).

Recently, in [5] the authors have studied ergodic properties with respect to the total variation norm of this model with \({\{L(t)\}_{t\ge 0}}\) being either (or a combination of) a rotationally invariant \(\alpha \)-stable Lévy process, an anisotropic Lévy process consisting of independent one-dimensional symmetric \(\alpha \)-stable components, or a compound Poisson process. Observe that in this situation we cannot follow the procedure from the constant control case. Namely, the matrices \(Q\in {\mathcal {M}}_+\) used in constructing the appropriate Lyapunov functions \({\mathcal {V}}(x)\) depend on v.

Proposition 3.8

Grant the assumptions of [6, Theorem 3.1], and suppose that \(M ={{\,\mathrm{diag}\,}}(m_1,\dotsc ,m_n)\), with \(m_i>0\) for \(i=1,\ldots ,n.\)

  1. (i)

    Assume that the diagonal components of \(\varGamma \) are strictly positive, a(x) satisfies (3.5), and \({\{L(t)\}_{t\ge 0}}\) is either a rotationally invariant \(\alpha \)-stable Lévy process, an anisotropic Lévy process consisting of independent one-dimensional symmetric \(\alpha \)-stable components (in both cases we assume that \(\alpha \in (1,2)\)), or a compound Poisson process satisfying \(1\in \Theta _\upnu \). We allow \({\{L(t)\}_{t\ge 0}}\) to have a drift. Then, for any \(v\in \widetilde{{\mathfrak {U}}}_{\text {sm}}\) and \(\theta \in [1,\theta _\upnu ]\cap \Theta _\upnu \), the assertions of Theorem 1.1(iii) hold true with \(\eta =\theta \), and \({\mathcal {V}}(x)=\bigl (\bar{\mathcal {V}}(x)\bigr )^\theta +1\), where \(\bar{\mathcal {V}}\in C^2({{\mathbb {R}}}^n)\) (given explicitly in [5, Definition 1]) is bounded from below away from zero, is Lipschitz continuous, and satisfies

    $$\begin{aligned} 0<\liminf _{|x|\rightarrow \infty }\frac{\bar{\mathcal {V}}(x)}{|x|}\,\le \, \limsup _{|x|\rightarrow \infty }\frac{\bar{\mathcal {V}}(x)}{|x|}<\infty . \end{aligned}$$
  2. (ii)

    Assume \(\bigl \langle e,M^{-1}\tilde{l}\bigr \rangle <0\), where \(\tilde{l}\) is given in (3.2), a(x) satisfies (3.3), and \({\{L(t)\}_{t\ge 0}}\) is a pure-jump Lévy process (possibly with drift) satisfying \(2\in \Theta _\upnu \). Then, for any \(v\in \widetilde{{\mathfrak {U}}}_{\text {sm}}\) and \(\theta \in [2,\theta _\upnu ]\cap \Theta _\upnu \), the assertions of Theorem 1.1(i) and (ii) hold true with \(\phi (t)= t^{\nicefrac {(\theta -1)}{\theta }}\), \(\eta =\theta -1\), and \({\mathcal {V}}(x)\) as in (i).

  3. (iii)

    In addition to the assumptions in (ii), assume that \(\tilde{\theta }_\upnu =\theta _\upnu \in (2,\infty )\), where \(\tilde{\theta }_\upnu \) is given in (3.4). Then, due to (ii), for any \(v\in \widetilde{{\mathfrak {U}}}_{\text {sm}}\), \({\{X(t)\}_{t\ge 0}}\) admits a unique invariant \(\uppi _v\in {\mathcal {P}}_{\theta _\upnu -1-\iota }({{\mathbb {R}}}^n)\) for \(\iota \in (0,\theta _\upnu -1)\). Next, fix \(\rho \in (0,(\theta _\upnu -2)\wedge 1)\) and \(\varepsilon \in (\rho ,1)\). Then, for any \(v\in \widetilde{{\mathfrak {U}}}_{\text {sm}}\) such that \(\varGamma v(x)=0\) a.e., \(p\in [1,\theta _\upnu -\rho -1]\) and \(\iota \in (0,1-\varepsilon )\), there exist \(\bar{c}>0\) and a diverging increasing sequence \(\{t_n\}_{n\in {{\mathbb {N}}}}\subset [0,\infty )\), depending on these parameters, such that (1.9) holds for the corresponding \(\uppi _v(\text {d}x)\) with \(\theta =\theta _\upnu -\rho \), \(\vartheta =\theta -1\), and \({\mathcal {V}}(x)\) as above.

Proof

  1. (i)

    Observe first that in the case when \({\{L(t)\}_{t\ge 0}}\) is a rotationally invariant \(\alpha \)-stable Lévy process or an anisotropic Lévy process consisting of independent one-dimensional symmetric \(\alpha \)-stable components, \(\Theta _\upnu =[0,\alpha )\). In [5, Theorem 3 and the discussion after Theorem 5] it has been shown that for any \(v\in \widetilde{{\mathfrak {U}}}_{\text {sm}}\) and \(\theta \in [1,\theta _\upnu ]\cap \Theta _\upnu \) there exist \(\bar{c}=\bar{c}(\theta ,v)>0\) and \(\tilde{c}=\tilde{c}(\theta ,v)>0\), such that

    $$\begin{aligned} {\mathcal {L}} \bigl (\bar{\mathcal {V}}^\theta \bigr )(x)\,\le \, \bar{c}-\tilde{c}\, \bigl (\bar{\mathcal {V}}(x)\bigr )^{\theta }\qquad \forall \,x\in {{\mathbb {R}}}^n. \end{aligned}$$

    It is easy to see that the above relation implies that there exist \(r>0\), \(\hat{c}>0\), and \(\breve{c}>0\), such that

    $$\begin{aligned} {\mathcal {L}} {\mathcal {V}}(x)\,\le \, \hat{c}\,\mathbb {1}_{\bar{{\mathcal {B}}}_r}(x)-\breve{c}\, {\mathcal {V}}(x) \qquad \forall \,x\in {{\mathbb {R}}}^n. \end{aligned}$$

    The assertion then follows from Theorem 1.1(iii), together with [2, Proposition 4.3], [25, Theorem 3.4], and [79, Theorems 5.1 and 7.1].

  2. (ii)

    In Theorem 5 and the discussion following the proof of this theorem in [5] it has been shown that for any \(v\in \widetilde{{\mathfrak {U}}}_{\text {sm}}\) and \(\theta \in (1,\theta _\upnu ]\cap \Theta _\upnu \) there exist \(r=r(\theta ,v)>0\), \(\bar{c}=\bar{c}(\theta ,v)>0\), and \(\tilde{c}=\tilde{c}(\theta ,v)>0\), such that

    $$\begin{aligned} {\mathcal {L}} \bigl (\bar{\mathcal {V}}^\theta \bigr )(x)\,\le \, \bar{c}\,\mathbb {1}_{{\overline{{\mathcal {B}}}}_r}(x)-\tilde{c}\, \bigl (\bar{\mathcal {V}}(x)\bigr )^{\theta -1} \qquad \forall \,x\in {{\mathbb {R}}}^n. \end{aligned}$$

    It is easy to see that the above relation implies that there exist \(\hat{r}>0\), \({\check{c}}>0\), and \(\breve{c}>0\), such that

    $$\begin{aligned} {\mathcal {L}} {\mathcal {V}}(x)\,\le \, {\check{c}}\,\mathbb {1}_{\bar{{\mathcal {B}}}_{\hat{r}}}(x) -\breve{c}\, \bigl ({\mathcal {V}}(x)\bigr )^{\nicefrac {(\theta -1)}{\theta }} \qquad \forall \,x\in {{\mathbb {R}}}^n, \end{aligned}$$

    with \({\mathcal {V}}(x)\) given as above. The assertion now follows from Theorem 1.1 (i) and (ii), together with the results from [2, 25, 79] cited in part (i).

  3. (iii)

    Clearly, \(\vartheta +\varepsilon >\theta _\upnu -1\). Thus, according to [6, Lemma 5.7 (b)],

    $$\begin{aligned} \int _{{{\mathbb {R}}}^n}\bigl (\langle e,M^{-1}x\rangle ^+\bigr )^{\vartheta +\varepsilon }\, \uppi _v(\text {d}x) = \infty . \end{aligned}$$

    The assertion now follows from (the proof of) (ii) (together with the results from [2, 25, 79] cited in part (i)), and Theorem 1.2 by setting \(L(x)=\bar{\mathcal {V}}(x)\) and \(\phi (t)= t^{\nicefrac {(\theta -1)}{\theta }}\).

\(\square \)

As discussed in Sect. 3.3, the hypothesis that \(\tilde{\theta }_\upnu =\theta _\upnu \) is true if \({\{L(t)\}_{t\ge 0}}\) is a compound Poisson process (possibly with drift) with Lévy measure \(\upnu _L(\text {d}y)\) supported on a half-line of the form \(\{tw:t\in [0,\infty )\}\) with \(\langle e,M^{-1}w\rangle >0\).

Ergodic properties in the f-norm of piecewise Ornstein–Uhlenbeck processes with jumps under stationary Markov controls have been considered in [5, 7].

3.5 State-Space Models

Let \(F:{{\mathbb {R}}}^n\rightarrow {{\mathbb {R}}}^n\) be continuous, and such that \(|F(x)|\le c|x|\) for some \(c>0\) and all \(x\in {{\mathbb {R}}}^n.\) Further, let X(0) be an \({{\mathbb {R}}}^n\)-valued random variable, and let \(\{W(k)\}_{k\in {{\mathbb {N}}}}\) be a sequence of i.i.d. \({{\mathbb {R}}}^n\)-valued random variables independent of X(0). Assume that the common distribution of \(\{W(k)\}_{k\in {{\mathbb {N}}}}\) has a nontrivial absolutely continuous component which is bounded away from zero in a neighborhood of the origin. Then the Markov process defined by

$$\begin{aligned} X(k+1) = F\bigl (X(k)\bigr )+W(k+1), \qquad k\ge 0, \end{aligned}$$

is irreducible, aperiodic, and all compact sets are petite (see [78, Proposition 5.2]). Further, assume that there exist constants \(l\in {{\mathbb {N}}}\), \(l\ge 2\), \(\varepsilon \in (0,1)\), and \(\bar{c},r>0\), such that

$$\begin{aligned} {{\mathbb {E}}}\bigl [|W(1)|^l\bigr ]<\infty , \qquad \text {and} \qquad |F(x)| \,\le \,c|x|-\bar{c}|x|^{1-\varepsilon }\quad \forall \,x\in {\mathcal {B}}_r^c. \end{aligned}$$

Proposition 3.9

Under the above assumptions, the assertions of Theorem 1.1 (i) and (ii) hold with \({\mathcal {V}}(x)= |x|^l\), \(\phi (t)= t^{\nicefrac {(l-1)}{l}}\), and \(\eta = l-1\).

Proof

In [78, Proposition 5.2] it has been proved that the Foster–Lyapunov condition in (1.1) holds with \({\mathcal {V}}(x)\) and \(\phi (t)\) as above, and \(C={\mathcal {B}}_{\bar{r}}\) for some \(\bar{r}>0\). The result now follows from Theorem 1.1 (i) and (ii). \(\square \)

Ergodic properties of state-space models in the f-norm have been studied in [32, 78].
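The drift mechanism behind Proposition 3.9 can be illustrated numerically. The following is only a minimal sketch under assumptions that are not part of the text: the dimension is \(n=1\), the map `F` is a hypothetical choice satisfying \(|F(x)|\le |x|-\bar{c}|x|^{1-\varepsilon }\) outside the unit ball (that is, \(c=1\), \(\bar{c}=1\), \(\varepsilon =\nicefrac {1}{2}\)), the noise is standard Gaussian, and \(l=2\). The helper `drift` gives a Monte Carlo estimate of \({{\mathbb {E}}}\bigl [{\mathcal {V}}(F(x)+W(1))\bigr ]-{\mathcal {V}}(x)\) for \({\mathcal {V}}(x)=|x|^l\), which can then be compared with \(-\phi ({\mathcal {V}}(x))\) for \(\phi (t)=t^{\nicefrac {(l-1)}{l}}\).

```python
import math
import random


def F(x, c_bar=1.0, eps=0.5):
    """Hypothetical continuous map with |F(x)| <= |x| - c_bar*|x|**(1-eps) for |x| >= 1."""
    shrunk = max(abs(x) - c_bar * abs(x) ** (1.0 - eps), 0.0)
    return math.copysign(shrunk, x)


def simulate(x0, steps, rng):
    """Sample path of X(k+1) = F(X(k)) + W(k+1) with i.i.d. N(0,1) noise (an assumption)."""
    x = x0
    path = [x]
    for _ in range(steps):
        x = F(x) + rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def drift(x, l=2, samples=20000, seed=0):
    """Monte Carlo estimate of E[V(F(x) + W)] - V(x) for V(x) = |x|**l."""
    rng = random.Random(seed)
    fx = F(x)
    mean = sum(abs(fx + rng.gauss(0.0, 1.0)) ** l for _ in range(samples)) / samples
    return mean - abs(x) ** l


# Compare the estimated drift with -phi(V(x)), phi(t) = t**((l-1)/l), at a few points.
for x in (10.0, 100.0, 1000.0):
    phi_v = (abs(x) ** 2) ** 0.5
    print(x, drift(x), -phi_v)
```

In this toy setting the estimated drift at large \(|x|\) is of order \(-|x|^{l-\varepsilon }\), which is more negative than \(-\phi ({\mathcal {V}}(x))=-|x|^{l-1}\); this is the type of inequality required in (1.1), although the sketch is of course not a proof.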

3.6 Backward Recurrence Time Chain

Let \(\{p_i\}_{i\ge 0}\subset (0,\infty )\) be such that \(p_0=1\), \(p_i<1\) for \(i\in {{\mathbb {N}}}\), and \(\prod _{j=0}^ip_j\rightarrow 0\), as \(i\rightarrow \infty \). Let \(\{X(k)\}_{k\ge 0}\) be a Markov process on \(\{0,1,\dotsc \}\) defined by the transition kernel \(p(i,i+1)=1-p(i,0):=p_i\) for \(i\ge 0\). The process \(\{X(k)\}_{k\ge 0}\) is irreducible and aperiodic, and it admits a unique invariant \(\uppi \in {\mathcal {P}}(\{0,1,\dotsc \})\) if, and only if,

$$\begin{aligned} c\,:=\,\sum _{i=1}^\infty \prod _{j=1}^ip_j<\infty . \end{aligned}$$

In this case, \(\uppi (0)=\uppi (1)=(2+c)^{-1}\), and \(\uppi (i)=(2+c)^{-1}\prod _{j=0}^{i-1}p_j\) for \(i\ge 2\).
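The formula for \(\uppi \) can be checked numerically on a truncated state space. The sketch below is illustrative only: the sequence returned by `transition_probs` is a hypothetical choice (\(p_0=1\), \(p_i=\nicefrac {1}{2}\) for \(1\le i<i_0\), and \(p_i=1-\frac{1+\alpha }{i}\) for \(i\ge i_0\), with \(\alpha =2\) and \(i_0=4\)), picked so that \(\uppi (i)\) decays like \(i^{-(1+\alpha )}\), and the truncation level is arbitrary.

```python
def transition_probs(n, alpha=2.0, i0=4):
    """Hypothetical p_0, ..., p_{n-1}: p_0 = 1, p_i = 1/2 for 1 <= i < i0,
    and p_i = 1 - (1 + alpha)/i for i >= i0 (requires i0 > 1 + alpha)."""
    probs = []
    for i in range(n):
        if i == 0:
            probs.append(1.0)
        elif i < i0:
            probs.append(0.5)
        else:
            probs.append(1.0 - (1.0 + alpha) / i)
    return probs


def invariant_measure(p):
    """Truncated invariant measure on {0, ..., len(p)} and the truncated constant c."""
    # w[i] = prod_{j=0}^{i-1} p_j, with w[0] = 1 (empty product).
    w = [1.0]
    for pj in p:
        w.append(w[-1] * pj)
    # Since p_0 = 1, prod_{j=1}^{i} p_j = w[i+1], so c is the sum of w[2], w[3], ...
    c = sum(w[2:])
    pi = [wi / (2.0 + c) for wi in w]
    return pi, c


p = transition_probs(2000)
pi, c = invariant_measure(p)
print("c (truncated):", c)
print("pi(0), pi(1), 1/(2+c):", pi[0], pi[1], 1.0 / (2.0 + c))
# Balance at state 0: the mass returning to 0 should match pi(0) up to truncation error.
print("sum_i pi(i)(1 - p_i):", sum(pi[i] * (1.0 - p[i]) for i in range(len(p))))
```

The truncation discards the mass beyond the last state, so the balance equation at \(0\) holds only up to an error of the size of the last retained weight.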

Proposition 3.10

  1. (i)

    If there exist \(i_0\in {{\mathbb {N}}}\) and \(\alpha >1\), such that \(p_i=1-\frac{1+\alpha }{i}\) for \(i\ge i_0\), then the assertions of Theorem 1.1 (i) and (ii) hold with

    $$\begin{aligned}&{\mathcal {V}}(i)= i^{\beta (1+\alpha )}+1,\quad \phi (t)= t^{1-\frac{1}{\beta (1+\alpha )}},\quad \text {and}\quad \eta = \beta (1+\alpha )-1\quad \text {for }\\&\beta \in [\nicefrac {2}{(1+\alpha )},1). \end{aligned}$$
  2. (ii)

    Under the assumptions in (i), \(\uppi \in {\mathcal {P}}_{\alpha -\iota }(\{0,1,\dotsc \})\) for \(\iota \in (0,\alpha )\). Next, fix \(\rho \in (0,(\alpha -1)\wedge 1)\) and \(\varepsilon \in [\rho ,1)\). Then, for every \(p\in [1,\alpha -\rho ]\) and \(\iota \in (0,1-\varepsilon )\) there exist a positive constant c and a diverging increasing sequence \(\{t_n\}_{n\in {{\mathbb {N}}}}\subset [0,\infty )\), depending on these parameters, such that (1.9) holds with \({\mathcal {V}}(i)\) as above, \(\theta = 1+\alpha -\rho \), and \(\vartheta = \alpha -\rho \).

Proof

  1. (i)

    In [24, Sect. 3] it has been shown that the Foster–Lyapunov condition in (1.1) holds with a Lyapunov function \(\bar{\mathcal {V}}(i)\) which asymptotically behaves like \({\mathcal {V}}(i)\), \(\phi (t)\) as above, and C being a finite set for any \(\alpha >0\) and \(\beta \in (0,1)\). Taking into account (1.2), the assertion follows.

  2. (ii)

    From the assumptions on the sequence \(\{p_i\}_{i\ge 0}\) we see that \(\lim _{i\rightarrow \infty }i^{1+\alpha }\,\uppi (i)>0\). Now, since \(\vartheta +\varepsilon -1-\alpha \ge -1\), we have \(\sum _{i=0}^\infty i^{\vartheta +\varepsilon }\,\uppi (i) = \infty \). The assertion now follows from Theorem 1.2 by taking \(L(i)= i\).

\(\square \)
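The divergence of \(\sum _{i}i^{q}\,\uppi (i)\) for \(q\ge \alpha \), which drives the lower bound in part (ii), can be observed numerically. The following sketch rests on illustrative assumptions only: it uses a hypothetical sequence with \(p_i=1-\frac{1+\alpha }{i}\) for \(i\ge i_0\) (\(\alpha =2\), \(i_0=4\), and \(p_i=\nicefrac {1}{2}\) for \(1\le i<i_0\)), and it works with the unnormalised weights \(\prod _{j=0}^{i-1}p_j\), which differ from \(\uppi (i)\) by the constant factor \((2+c)^{-1}\) only.

```python
def weights(n_max, alpha=2.0, i0=4):
    """Unnormalised invariant weights w[i] = prod_{j=0}^{i-1} p_j for i = 0, ..., n_max,
    for the hypothetical sequence described above."""
    w = [1.0]
    for i in range(n_max):
        if i == 0:
            p = 1.0
        elif i < i0:
            p = 0.5
        else:
            p = 1.0 - (1.0 + alpha) / i
        w.append(w[-1] * p)
    return w


def partial_moment(w, q, n):
    """Partial sum of i**q * w[i] over 1 <= i <= n."""
    return sum(i ** q * w[i] for i in range(1, n + 1))


w = weights(10 ** 5)
for n in (10 ** 3, 10 ** 4, 10 ** 5):
    # q = alpha: partial sums keep growing; q = alpha - 1/2: partial sums stabilise.
    print(n, partial_moment(w, 2.0, n), partial_moment(w, 1.5, n))
```

For this choice the weights behave like a constant multiple of \(i^{-(1+\alpha )}\), so the partial sums with \(q=\alpha \) grow logarithmically in the truncation level, while those with \(q<\alpha \) converge, in line with \(\uppi \in {\mathcal {P}}_{\alpha -\iota }(\{0,1,\dotsc \})\).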

4 Concluding Remarks

We remark on some other approaches in the study of exponential or subexponential ergodicity of Markov processes. By analyzing polynomial moments of hitting times of compact sets directly, polynomial ergodicity results are established in [80, Theorem 6] for a class of irreducible (with respect to the Lebesgue measure) and aperiodic diffusion processes. In a follow-up work, by using analogous techniques, the same author established polynomial ergodicity of a class of diffusion processes without directly assuming irreducibility and aperiodicity of the process, but employing instead a so-called (local) Dobrushin condition (also known as Markov-Dobrushin condition) [81, Theorem 6]. This approach is based on a Foster–Lyapunov condition of the form (1.1), and instead of assuming irreducibility and aperiodicity of \({\{X(t)\}_{t\ge 0}}\), it is assumed that (i) \({\mathcal {V}}(x)\) has precompact sub-level sets, and (ii) for every \(\delta >0\) there exists \(t_\delta \in {{\mathbb {T}}}\setminus \{0\}\) such that

$$\begin{aligned} \sup _{(x,y)\in \{(u,v):\,{\mathcal {V}}(u)+{\mathcal {V}}(v)\le \delta \}}\, \bigl \Vert p(t_\delta ,x,\text {d}z)-p(t_\delta ,y,\text {d}z)\bigr \Vert _{\text {TV}} < 1, \end{aligned}$$

(see [53, Chap. 3]). Observe that this condition means, in particular, that for each pair \((x,y)\) satisfying \({\mathcal {V}}(x)+{\mathcal {V}}(y)\le \delta \) the probability measures \(p(t_\delta ,x,\text {d}z)\) and \(p(t_\delta ,y,\text {d}z)\) are not mutually singular. Intuitively, the Dobrushin condition encodes irreducibility and aperiodicity of \({\{X(t)\}_{t\ge 0}}\), and petiteness of sub-level sets of \({\mathcal {V}}(x)\). Based on these assumptions, and using an appropriate Markov coupling of \({\{X(t)\}_{t\ge 0}}\), it follows that the \(\Phi ^{-1}\)-modulated moment of the corresponding coupling time is finite and controlled by \({\mathcal {V}}(x)+{\mathcal {V}}(y)\). This then implies (sub)geometric ergodicity of \({\{X(t)\}_{t\ge 0}}\) in the total variation norm (see [38, Theorem 4.1] or [53, Chap. 3]).

We remark that irreducibility and aperiodicity (together with (1.1)) imply that the Dobrushin condition holds on the Cartesian product of any petite set with itself. Namely, according to [65, Proposition 6.1], for any petite set C there exists \(t_C\in {{\mathbb {T}}}\setminus \{0\}\) such that the measure \(\upchi (\text {d}t)\) in the definition of petiteness can be taken to be the Dirac measure at \(t_C\) (together with some non-trivial measure \(\upnu _\upchi (\text {d}x)\)). Thus, \(p(t_C,x,B)\ge \upnu _\upchi (B)\) for any \(x\in C\) and \(B\in {\mathfrak {B}}({\mathbb {X}})\), which implies that

$$\begin{aligned} \sup _{(x,y)\in C\times C}\, \bigl \Vert p(t_C,x,\text {d}z)-p(t_C,y,\text {d}z)\bigr \Vert _{{\text {TV}}} < 1. \end{aligned}$$
(4.1)
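The step from the minorization \(p(t_C,x,B)\ge \upnu _\upchi (B)\) to (4.1) can be made concrete on a finite state space, where any two transition rows share at least the mass \(\upnu _\upchi ({\mathbb {X}})\), so their total variation distance is at most \(1-\upnu _\upchi ({\mathbb {X}})\). The sketch below is a toy illustration under assumptions not taken from the text: the \(3\times 3\) matrix `P` is hypothetical, the whole state space plays the role of C, and the total variation distance is normalised as \(\sup _B|\mu (B)-\nu (B)|\), so that it equals 1 exactly for mutually singular measures.

```python
from itertools import combinations

# Hypothetical one-step transition matrix; row x plays the role of p(t_C, x, dz).
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.5, 0.3],
    [0.1, 0.4, 0.5],
]


def minorizing_measure(rows):
    """Largest measure nu with p(x, z) >= nu(z) for all rows x: the columnwise minimum."""
    return [min(col) for col in zip(*rows)]


def tv_distance(mu, nu):
    """Total variation distance sup_B |mu(B) - nu(B)|, i.e. half the l1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def dobrushin_coefficient(rows):
    """Supremum of the total variation distance over all pairs of rows."""
    return max(tv_distance(a, b) for a, b in combinations(rows, 2))


nu = minorizing_measure(P)
print("minorizing measure:", nu, "total mass:", sum(nu))
print("sup TV:", dobrushin_coefficient(P), "bound 1 - nu(X):", 1.0 - sum(nu))
```

Any non-trivial minorizing measure therefore forces the supremum in (4.1) to be strictly less than 1, uniformly over the pairs in \(C\times C\).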

If, in addition, \({\{X(t)\}_{t\ge 0}}\) is \(C_b\)-Feller (i.e. \(x\mapsto \int _{ {\mathbb {X}}}f(y)\,p(t,x,\text {d}y)\) is continuous and bounded for any \(t\in {{\mathbb {T}}}\) and any continuous and bounded function f(x)), and the support of the corresponding irreducibility measure has nonempty interior, then every compact set is petite (see [79, Theorems 5.1 and 7.1]) and thus (4.1) holds for any bounded set C. This shows that, at least in this particular situation, the approach based on the Dobrushin condition is more general than the approach based on irreducibility and aperiodicity. Situations where it has a clear advantage are discussed in [1, 54]. In [54], the author considers a Markov process obtained as a solution to a Lévy-driven SDE with highly irregular coefficients and noise term; while in [1], a diffusion process with highly irregular (discontinuous) drift function and uniformly elliptic diffusion coefficient has been considered. In these concrete situations it is not clear whether one can obtain irreducibility and aperiodicity of the processes, whereas the authors obtain (4.1) for any compact set C (see [54, Theorem 1.3] and [1, Lemma 3]). For additional results on ergodic properties of Markov processes based on the Dobrushin condition we refer the readers to [38, 53].