
9.1 Introduction

Let (W, d) be a separable Fréchet space. For two probability measures P and Q on \((W,\mathcal{B}(W))\), the Wasserstein distance (cf. [11]) between P and Q, denoted by d W (P, Q), is defined as

$${d}_{W}^{2}(P,Q) =\inf \left \{\int\limits_{W\times W}d{(x,y)}^{2}\theta (\mathit{dx},\mathit{dy}) :\, \theta \in \Sigma (P,Q)\right \},$$

where Σ(P, Q) denotes the set of probability measures on W × W whose first marginal is P and whose second marginal is Q; this set is compact under the weak topology, hence the infimum is always attained for any d (even a lower semi-continuous one). It is quite useful to find an upper bound for this distance, if possible a dimension-independent one. There is a substantial literature on this subject (cf. [11]), beginning with the contributions of M. Talagrand, cf. [10], where it is shown that the relative entropy is a fully satisfactory upper bound. In [5, 6], it is shown, using the Girsanov theorem, that the relative entropy is again an upper bound when P is the Wiener measure and d is the singular Cameron–Martin distance (cf. also [4]). The same method has been employed in [12] and more recently in [8] to obtain a transportation cost inequality w.r.t. the Banach norm for diffusion processes. The former assumes quite strong conditions on the coefficients governing the diffusion, which are superfluous and hamper the applicability of the inequality, while the latter treats essentially the one-dimensional case, with an extension to the case where the diffusion coefficients are independent, and slight perturbations thereof.

Inspired by these works, we attack the general case: namely, diffusion-type processes with fully dependent coefficients and their extensions, as well as infinite dimensional diffusion processes governed by a cylindrical Brownian motion. Besides, there is a special class of diffusion processes with singular (dissipative) drifts which are constructed as weak limits of the Lipschitzian case, where the approximating diffusions have Lipschitz continuous drifts but the Lipschitz constant explodes at the limit; this last class is particularly interesting because of its applications to physics.

To achieve this program, we need the following result about the stability of the transportation cost inequality under weak limits of probability measures, which was proved by Djellout et al. in [4]. Since we make important use of it, we state it here with a (slightly different and more general) proof.

Lemma 9.1.

Assume that \(({P}_{k},k \geq 1)\) is a sequence of probability measures on a separable Fréchet space (W,d), converging weakly to a probability P. If

$${d}_{W}^{2}(Q,{P}_{ k}) \leq {c}_{k}\int\limits_{W} \frac{dQ} {d{P}_{k}}\log \frac{dQ} {d{P}_{k}}d{P}_{k} = {c}_{k}H(Q\vert {P}_{k})$$

for any k ≥ 1 and any probability Q, where the constants c k > 0 are bounded, then the transportation inequality holds for P, namely

$${d}_{W}^{2}(Q,P) \leq cH(Q\vert P)\,,$$
(9.1)

where c = sup k c k .

Proof.

If \(f=\,dQ/dP\) is a bounded, continuous function, then the inequality (9.1) follows from the lower semi-continuity of the transportation cost w.r.t. weak convergence and from the hypothesis, since f logf is then continuous and bounded. Due to the dominated convergence theorem, to prove the general case it suffices to treat the case where f is P-essentially bounded and measurable. In this case, there exists a sequence of bounded, upper semi-continuous functions, say (f n , n ≥ 1), increasing to f, P-almost surely. By the dominated convergence theorem, the measures \((\tilde{{f}}_{n}dP,\,n \geq 1)\) converge weakly to the measure fdP, where \(\tilde{{f}}_{n} = {f}_{n}/P({f}_{n})\). On the other hand, \(H(\tilde{{f}}_{n}dP\vert P) \rightarrow H(fdP\vert P)\), again by the dominated convergence theorem. Hence, to prove the general case, it is sufficient to prove the inequality for f upper semi-continuous and bounded. Since we are on a Fréchet space, there exists a sequence of (positive) continuous functions decreasing to f, which may be chosen uniformly bounded by taking the minimum of each with the upper bound of f, and the inequality (9.1) follows once more from the dominated convergence theorem.
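The lower semi-continuity invoked twice in this proof is a standard fact (cf. [11]); it can be seen from the Kantorovich dual formulation of the transportation cost, which we recall here for convenience:

$${d}_{W}^{2}(P,Q) =\sup \left \{\int\limits_{W}f\,dP +\int\limits_{W}g\,dQ :\, f,g \in {C}_{b}(W),\ f(x) + g(y) \leq d{(x,y)}^{2}\right \},$$

so that \({d}_{W}^{2}\) is a supremum of functionals which are continuous w.r.t. the weak convergence of (P,Q), hence jointly lower semi-continuous.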

9.2 Diffusion Type Processes with Lipschitz Coefficients

Let (W, H, μ) be the classical Wiener space, i.e., \(W\,=\,{C}_{0}([0,1],{\mathrm{I\!R}}^{d}),\,H\,={H}^{1}([0,1],{\mathrm{I\!R}}^{d})\), and μ is the Wiener measure, under which the coordinate (evaluation) process \((t\mapsto {W}_{t})\) is a Brownian motion. Suppose that \(X = ({X}_{t},t \in [0,1])\) is the solution of the following stochastic differential equation (SDE):

$$\begin{array}{rcl}{ \mathit{dX}}_{t}& =& \sigma (t,{X}_{t}){\mathit{dW }}_{\!t} + b(t,X)\mathit{dt} \\ {X}_{0}& =& z \in {\mathrm{I\!R}}^{d} \\ \end{array}$$

where \(\sigma : [0,1] \times {\mathrm{I\!R}}^{d} \rightarrow {\mathrm{I\!R}}^{d} \otimes {\mathrm{I\!R}}^{d}\) is uniformly Lipschitz w.r.t. x with Lipschitz constant K, and \(b : [0,1] \times W \rightarrow {\mathrm{I\!R}}^{d}\) is adapted and such that

$$\vert b(t,\xi ) - b(t,\eta )\vert \leq K\sup_{s\leq t}\vert \xi (s) - \eta (s)\vert =\| \xi - {\eta \|}_{t}$$

for any \(\xi ,\eta \in W\). We denote by d W the Wasserstein distance on the probability measures on W defined by the uniform norm:

$${d}_{W}^{2}(\rho ,\nu ) =\inf \left \{\int\limits_{W\times W}\|x - {y\|}^{2}d\gamma (x,y) :\, \gamma \in \Sigma (\rho ,\nu )\right \}$$

where \(\Sigma (\rho ,\nu )\) denotes the set of probability measures on W × W whose first marginal is ρ and whose second marginal is ν. We have the following bound for d W :

Theorem 9.1.

Let P be the law of the solution of the SDE described above; then for any probability Q on \((W,\mathcal{B}(W))\) , we have

$${d}_{W}^{2}(P,Q) \leq 6\,{e}^{15{K}^{2} }H(Q\vert P)\,$$
(9.2)

where H(Q|P) is the relative entropy of Q w.r.t. P.

Proof.

Due to the rotation invariance of the Wiener measure, we can suppose without loss of generality that σ takes its values in the set of positive matrices. Suppose first that σ is strictly elliptic. From the general results about SDEs (cf. [7, 9]), the coordinate process x under the probability P can be written as

$${\mathit{dx}}_{t} = \sigma (t,{x}_{t})d{\beta }_{t} + b(t,x)\mathit{dt}$$

with x 0 = z, P-a.s., where β is an \({\mathrm{I\!R}}^{d}\)-valued P-Brownian motion. At this point of the proof we need the following result, which is probably well known (cf. [9] and the references therein), though we include its proof for the sake of completeness:

Lemma 9.2.

Any bounded P-martingale can be written as a stochastic integral w.r.t. β of an adapted process \(({\alpha }_{s},s \in [0,1])\), with \({E}_{P}\int\limits_{0}^{1}\vert {\alpha }_{s}{\vert }^{2}\mathit{ds} < \infty \).

Proof.

Let us denote by P 0 the law of the solution of

$${\mathit{dX}}_{t} = \sigma (t,{X}_{t}){\mathit{dW }}_{\!t},$$

then under P 0, the coordinate process x can be written as

$${\mathit{dx}}_{t} = \sigma (t,{x}_{t})d{\beta }_{t}^{0},$$

where β0 is a P 0-Brownian motion. Let Z be a bounded P-martingale with Z 0 = 0; assume that it is orthogonal to the Hilbert space of P-square-integrable martingales that can be written as stochastic integrals w.r.t. β of adapted processes. Let M be the exponential martingale defined as

$${M}_{t} =\exp \left (-\int\limits_{0}^{t}({\sigma }^{-1}(s,{x}_{ s})b(s,x),d{\beta }_{s}) -\frac{1} {2}\int\limits_{0}^{t}\vert {\sigma }^{-1}(s,{x}_{ s})b(s,x){\vert }^{2}\mathit{ds}\right ).$$

Then we know, from uniqueness and the Girsanov theorem, that MdP = dP 0. Since M can be written as a stochastic integral w.r.t. β, our hypothesis implies that ZM is again a P-martingale; hence Z is a P 0-martingale, and therefore, by the classical Markov case, it can be written as

$$\begin{array}{rcl}{ Z}_{t}& =& \int\limits_{0}^{t}{H}_{ s}.d{\beta }_{s}^{0} \\ & =& \int\limits_{0}^{t}{H}_{ s}.(d{\beta }_{s} - {\sigma }^{-1}(s,{x}_{ s})b(s,x)\mathit{ds}).\end{array}$$

This last expression implies that

$$\langle Z,{Z\rangle }_{t} =\langle Z,\int\limits_{0}^{\cdot }{H}_{ s}.d{\beta {}_{s}\rangle }_{t}\,$$

but Z is orthogonal to the stochastic integrals of the form ∫α s ⋅ dβ s ; hence \({Z}_{t} = {E}_{P}[{Z}_{t}] = 0\), which proves the claim.

Let us now complete the proof of the theorem. If Q is singular w.r.t. P, then there is nothing to prove, by the definition of the entropy. Let L be the Radon–Nikodym derivative dQ ∕ dP; we shall first suppose that L > 0, P-a.s. In this case we can write

$$L = \rho (-\delta v),$$

where \(v(t,x) =\int\limits_{0}^{t}\dot{{v}}_{s}(x)\mathit{ds}\), \(\dot{{v}}_{s}(x)\) is adapted with \(\int\limits_{0}^{1}\vert \dot{{v}}_{s}(x){\vert }^{2}\mathit{ds} < \infty \) a.s., and \(\delta v\,=\,\int\limits_{0}^{1}\dot{{v}}_{s}d{\beta }_{s}\). From the Girsanov theorem, \({z}_{t}\,=\,{\beta }_{t} +\int\limits_{0}^{t}\dot{{v}}_{s}\mathit{ds}\) is a Q-Brownian motion; hence, by the uniqueness of the solution of the SDE, if we denote by x v the solution of the SDE given as

$${\mathit{dx}}_{t}^{v} = \sigma (t,{x}_{ t}^{v})d{z}_{ t} + {b}_{t}({x}^{v})\mathit{dt}\,$$

the image of Q under the solution map x v is equal to P; consequently \(({x}^{v} \times {I}_{W})(Q) \in \Sigma (P,Q)\), and hence we have the following domination:

$${d}_{W}^{2}(P,Q) \leq {E}_{ Q}[\|{x}^{v} - {x\|}^{2}]$$

where \(\|\cdot \|\) denotes the uniform norm on W. Using Doob and Hölder inequalities, we get

$$\begin{array}{rcl}{ E}_{Q}[\sup_{r\leq t}\vert {x}_{r}^{v} - {x}_{ r}{\vert }^{2}]& \leq & (12 + 3t){K}^{2}{E}_{ Q}\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\mathit{ds} \\ & & +3t{E}_{Q}\int\limits_{0}^{t}\vert \dot{{v}}_{ s}{\vert }^{2}\mathit{ds}.\end{array}$$

It follows from the Gronwall lemma that

$${E}_{Q}[\sup_{r\leq t}\vert {x}_{r}^{v} - {x}_{ r}{\vert }^{2}] \leq 3t\,{E}_{ Q}\int\limits_{0}^{t}\vert \dot{{v}}_{ s}{\vert }^{2}\mathit{ds}\,{e}^{3{K}^{2}(4+t) }\,$$

since

$${E}_{Q}\int\limits_{0}^{1}\vert \dot{{v}}_{ s}{\vert }^{2}\mathit{ds} = 2H(Q\vert P)$$

the claim follows in the case P ∼ Q. For the case where Q ≪ P, let

$${L}_{\epsilon } = \frac{L + \epsilon } {1 + \epsilon } \,,$$

then it is easy to see that \(({L}_{\epsilon }\log {L}_{\epsilon },\,\epsilon \leq {\epsilon }_{0})\) is P-uniformly integrable provided \({E}_{P}[L\log L] < \infty \). Hence the proof, in the strictly elliptic case, follows by the lower semi-continuity of Q → d W (P, Q). The general case follows by replacing σ with \(\epsilon {I}_{{\mathrm{I\!R}}^{d}} + \sigma \) and remarking that the corresponding probabilities \(({P}_{\epsilon },\epsilon \leq {\epsilon }_{0})\) converge weakly to P and that

$${d}_{W}^{2}({P}_{ \epsilon },Q) \leq 6\,{e}^{15{(\epsilon +K)}^{2} }H(Q\vert {P}_{\epsilon })\,$$

and hence it follows from Lemma 9.1 that

$${d}_{W}^{2}(P,Q) \leq 6\,{e}^{15{K}^{2} }H(Q\vert P)\,.$$
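For the reader's convenience, let us also verify the entropy identity \({E}_{Q}\int_{0}^{1}\vert \dot{{v}}_{s}{\vert }^{2}\mathit{ds} = 2H(Q\vert P)\) used above; modulo a standard localization argument, it follows from the explicit form of L:

$$H(Q\vert P) = {E}_{Q}[\log L] = {E}_{Q}\left [-\int\limits_{0}^{1}\dot{{v}}_{s}\,d{\beta }_{s} -\frac{1} {2}\int\limits_{0}^{1}\vert \dot{{v}}_{s}{\vert }^{2}\mathit{ds}\right ] = \frac{1} {2}\,{E}_{Q}\int\limits_{0}^{1}\vert \dot{{v}}_{s}{\vert }^{2}\mathit{ds}\,,$$

since \(d{\beta }_{s} = d{z}_{s} -\dot{ {v}}_{s}\mathit{ds}\) and z is a Q-Brownian motion, so that the stochastic integral w.r.t. z has zero Q-expectation.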

Since the inequality (9.2) is dimension independent, we can extend it easily to the infinite dimensional case:

Corollary 9.1.

Let M be a separable Hilbert space and suppose that B is an M-cylindrical Wiener process. Assume that \(\sigma : [0,1] \times M \rightarrow {L}_{2}(M,M) = M {\otimes }_{2}M\) (the space of Hilbert–Schmidt operators on M) and \(b : [0,1] \times M \rightarrow M\) are uniformly Lipschitz with Lipschitz constant K. Let P be the law of the following SDE:

$${\mathit{dX}}_{t} = \sigma (t,{X}_{t})d{B}_{t} + b(t,{X}_{t})\mathit{dt}\,,{X}_{0} = x \in M\,.$$

Then P satisfies the transportation cost inequality (9.2).

Proof.

Let (π n , n ≥ 1) be a sequence of orthogonal projections of M increasing to the identity, and define \({\sigma }^{n} = {\pi }_{n}\sigma \circ {\pi }_{n}\), \({b}^{n} = {\pi }_{n}b \circ {\pi }_{n}\), \({B}^{n} = {\pi }_{n}B\), and \({x}^{n} = {\pi }_{n}x\). Let then P n be the law of the SDE

$${\mathit{dX}}_{t}^{n} = {\sigma }^{n}(t,{X}_{ t}^{n})d{B}_{ t}^{n} + {b}^{n}(t,{X}_{ t}^{n})\mathit{dt}\,,{X}_{ 0}^{n} = {x}^{n}\,.$$

From Theorem 9.1, P n satisfies the inequality (9.2) with a constant independent of n; since \(({P}^{n},n \geq 1)\) converges weakly to P, the proof follows from Lemma 9.1.

9.2.1 Transport Inequality with a Singular Cost Function

In the case of the Wiener space, we can define a stronger Wasserstein metric using the Cameron–Martin norm, as we have already done in [5, 6], as follows:

$${d}_{H}^{2}(P,Q) =\inf \left \{\int\limits_{W\times W}\vert x - y{\vert }_{H}^{2}\theta (\mathit{dx},\mathit{dy}) :\, \theta \in \Sigma (P,Q)\right \}.$$

Note that this distance is strictly stronger than d W , and it is still lower semi-continuous with respect to the weak topology of measures on W. In the above-cited references, we proved the following inequality:

$${d}_{H}^{2}(\rho ,\mu ) \leq 2H(\rho \vert \mu )$$

for any measure ρ, where μ denotes the Wiener measure. This inequality can be extended to the class of diffusions whose diffusion coefficient is constant (it suffices to consider the case where it is equal to the identity matrix):

Theorem 9.2.

Assume that \(b : [0,1] \times {\mathrm{I\!R}}^{d} \rightarrow {\mathrm{I\!R}}^{d}\) is a K-Lipschitz map w.r.t. x uniformly in t ∈ [0,1]. Let P be the law of the solution of the following SDE:

$${\mathit{dX}}_{t} = b(t,{X}_{t})\mathit{dt} +{ \mathit{dW }}_{\!t},\,{X}_{0} = x\,.$$

Then the following transport inequality holds:

$${d}_{H}^{2}(P,Q) \leq 2(1 + 2{K}^{2}{e}^{2{K}^{2} })H(Q\vert P)\,.$$

Proof.

Using the same reasoning as in the proof of Theorem 9.1 and supposing first that dQ ∕ dP is strictly positive a.s., we reduce the problem to calculating (in the canonical space) the expectation of

$$\vert x - {x}^{v}{\vert }_{ H([0,t])}^{2} =\int\limits_{0}^{t}\vert b(s,{x}_{ s}) - b(s,{x}_{s}^{v}) -\dot{ {v}}_{ s}{\vert }^{2}\mathit{ds}$$

under the probability Q; the rest of the proof is the same, and we remove the strict positivity hypothesis again by using the lower semi-continuity of the cost function on the space of probability measures.
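To indicate where the constant comes from (with the bookkeeping of constants left loose), note that here \({x}_{t}^{v} - {x}_{t} =\int_{0}^{t}(b(s,{x}_{s}^{v}) - b(s,{x}_{s}))\mathit{ds} +\int_{0}^{t}\dot{{v}}_{s}\mathit{ds}\), hence, by the Cauchy–Schwarz inequality and the Lipschitz property,

$$\vert {x}_{t}^{v} - {x}_{t}{\vert }^{2} \leq 2{K}^{2}t\int\limits_{0}^{t}\vert {x}_{s}^{v} - {x}_{s}{\vert }^{2}\mathit{ds} + 2t\int\limits_{0}^{t}\vert \dot{{v}}_{s}{\vert }^{2}\mathit{ds}\,,$$

and the Gronwall lemma gives \(\sup_{t\leq 1}\vert {x}_{t}^{v} - {x}_{t}{\vert }^{2} \leq 2{e}^{2{K}^{2}}\int_{0}^{1}\vert \dot{{v}}_{s}{\vert }^{2}\mathit{ds}\). Inserting this estimate into the displayed expression for \(\vert x - {x}^{v}{\vert }_{H([0,t])}^{2}\) and using \({E}_{Q}\int_{0}^{1}\vert \dot{{v}}_{s}{\vert }^{2}\mathit{ds} = 2H(Q\vert P)\) yields a bound of the stated form.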

9.3 Transport Inequality for the Monotone Case

Assume that the Lipschitz property of the adapted drift coefficient is replaced by the following dissipativity hypothesis

$$(b(t,x) - b(t,y),{x}_{t} - {y}_{t}) \leq 0$$

for any t ∈ [0, 1] and x, y ∈ W, where, as before, \((\cdot ,\cdot )\) denotes the scalar product of \({\mathrm{I\!R}}^{d}\). The derivative of a proper concave function on \({\mathrm{I\!R}}^{d}\) is a typical example of such a drift. We shall suppose first that

$$\int\limits_{0}^{1}\vert b(t,x){\vert }^{2}\mathit{dt} < \infty $$

for any x ∈ W.
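As a concrete illustration, in dimension one the bounded drift \(b(t,x) = -\mathrm{sign}({x}_{t})\) is dissipative although it is not even continuous:

$$(b(t,x) - b(t,y))({x}_{t} - {y}_{t}) = -(\mathrm{sign}({x}_{t}) -\mathrm{sign}({y}_{t}))({x}_{t} - {y}_{t}) \leq 0\,,$$

since \(u\mapsto \mathrm{sign}(u)\) is non-decreasing; this drift reappears below as \(-A\) with \(A = \partial \vert \cdot \vert \) in the setting of Theorem 9.3.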

Proposition 9.1.

Assume that b is of linear growth, i.e., \(\vert b(t,x)\vert \leq N(1 +\| x\|)\) and let P be the law of the solution of the following SDE

$${ \mathit{dX}}_{t} = \sigma (t,{X}_{t}){\mathit{dW }}_{\!t} + b(t,X)\mathit{dt} + m(t,{X}_{t})\mathit{dt}$$
(9.3)

with \({X}_{0} = x \in {\mathrm{I\!R}}^{d}\), where σ and \(m : [0,1] \times {\mathrm{I\!R}}^{d} \rightarrow {\mathrm{I\!R}}^{d}\) are uniformly K-Lipschitz w.r.t. the space variable. Then for any Q ≪ P, we have

$$\begin{array}{rcl}{ d}_{W}^{2}(P,Q)& \leq & c\,{2}^{3/2}\|{\sigma \|}_{ \infty }^{3/2}{e}^{\frac{1} {2} ({K}^{2}+2K+1) }\sqrt{ H(Q\vert P)} \\ & & +2\|{\sigma \|}_{\infty }{e}^{\frac{1} {2} ({K}^{2}+2K+1) }\left (1 + K(K + 2){e}^{\frac{1} {2} ({K}^{2}+2K+1) }\right )H(Q\vert P)\,, \\ & & \end{array}$$
(9.4)

where \(\|{\sigma \|}_{\infty }\) is a uniform bound for σ, K is the Lipschitz constant, and c is the universal constant of Davis’ inequality for p = 1.

Proof.

Recall that under P the coordinate process satisfies \(\mathit{dx}\,=\,\sigma (t,{x}_{t})d\beta + (b(t,x) + m(t,{x}_{t}))\mathit{dt}\), where β is a P-Brownian motion. Assume that Q is another probability on W such that Q ≪ P, and let L be dQ ∕ dP. Suppose first that L > 0, P-almost surely. As explained in the first section, we can write L as an exponential martingale \(L\,=\,\rho (-\delta v)\); then x v(Q) = P, where x v is defined as before: \({\mathit{dx}}^{v} = \sigma (t,{x}_{t}^{v})(d{\beta }_{t} +\dot{ {v}}_{t}\mathit{dt}) + b(t,{x}^{v})\mathit{dt} + m(t,{x}_{t}^{v})\mathit{dt}\). Again by the uniqueness of solutions, we have \(({x}^{v} \times {I}_{W})(Q) \in \Sigma (P,Q)\), hence

$${d}_{W}^{2}(P,Q) \leq {E}_{ Q}[\|{x}^{v} - {x\|}^{2}]\,.$$

It follows from the Itô formula, letting \(dz = d\beta +\dot{ v}\mathit{dt}\), that

$$\begin{array}{rcl} \vert {x}_{t}^{v} - {x}_{ t}{\vert }^{2}& =& 2\int\limits_{0}^{t}({x}_{ s}^{v} - {x}_{ s},{\mathit{dx}}_{s}^{v} -{\mathit{dx}}_{ s}) +\int\limits_{0}^{t}\vert \sigma (s,{x}_{ s}^{v}) - \sigma (s,{x}_{ s}){\vert }^{2}\mathit{ds} \\ & =& 2\int\limits_{0}^{t}({x}_{ s}^{v} - {x}_{ s},b(s,{x}^{v}) - b(s,x))\mathit{ds} \\ & & +\,2\int\limits_{0}^{t}({x}_{ s}^{v} - {x}_{ s},(\sigma (s,{x}_{s}^{v}) - \sigma (s,{x}_{ s}))d{z}_{s} + (m(s,{x}_{s}^{v}) - m(s,{x}_{ s}))\mathit{ds}) \\ & & +\int\limits_{0}^{t}\vert \sigma (s,{x}_{ s}^{v}) - \sigma (s,{x}_{ s}){\vert }^{2}\mathit{ds} - 2\int\limits_{0}^{t}({x}_{ s}^{v} - {x}_{ s},\sigma (s,{x}_{s}^{v})\dot{{v}}_{ s})\mathit{ds}.\end{array}$$

By the dissipative character of b, we get

$$\begin{array}{rcl} \vert {x}_{t}^{v} - {x}_{ t}{\vert }^{2}& \leq & 2\int\limits_{0}^{t}({x}_{ s}^{v} - {x}_{ s},(\sigma (s,{x}_{s}^{v}) - \sigma (s,{x}_{ s}))d{z}_{s} + (m(s,{x}_{s}^{v}) - m(s,{x}_{ s}))\mathit{ds}) \\ & & +\int\limits_{0}^{t}\vert \sigma (s,{x}_{ s}^{v}) - \sigma (s,{x}_{ s}){\vert }^{2}\mathit{ds} - 2\int\limits_{0}^{t}({x}_{ s}^{v} - {x}_{ s},\sigma (s,{x}_{s}^{v})\dot{{v}}_{ s})\mathit{ds}.\end{array}$$

Using the usual stopping techniques, we can suppose that the stochastic integral has zero expectation; taking the Q-expectation of both sides, we obtain

$$\begin{array}{rcl}{ E}_{Q}[\vert {x}_{t}^{v} - {x}_{ t}{\vert }^{2}]& \leq & (2K + {K}^{2})E\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\mathit{ds} \\ & & +2\|{\sigma \|}_{\infty }E\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}\vert \vert \dot{{v}}_{s}\vert \mathit{ds} \\ \end{array}$$

using the inequality \(xy \leq \delta ({x}^{2}/2) + ({y}^{2}/2\delta )\), we get

$${E}_{Q}[\vert {x}_{t}^{v} - {x}_{ t}{\vert }^{2}] \leq (2K + {K}^{2} + \delta \|{\sigma \|}_{ \infty }^{2})E\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\mathit{ds} + \frac{2} {\delta }{H}_{t}(Q\vert P)\,,$$

where δ > 0 is arbitrary and \({H}_{t}(Q\vert P) = \int \log \frac{dQ} {dP}{\vert }_{{\mathcal{F}}_{t}}dQ\) is the entropy for the horizon [0, t], which is an increasing function of t. It follows from the Gronwall lemma that

$${E}_{Q}[\vert {x}_{t}^{v} - {x}_{ t}{\vert }^{2}] \leq \frac{2} {\delta }{H}_{t}(Q\vert P)\exp \left [t(2K + {K}^{2} + \delta \|{\sigma \|}_{ \infty }^{2})\right ]\,.$$
(9.5)

Using now the Davis’ inequality, the Lipschitz property, and the boundedness of σ, we get

$$\begin{array}{rcl} E [\sup_{r\leq t}\vert {x}_{r}^{v} - {x}_{ r}{\vert }^{2}]& \leq & (2c\|{\sigma \|}_{ \infty } + \sqrt{2}{H}_{t}{(Q\vert P)}^{1/2})E{\left [\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\mathit{ds}\right ]}^{1/2} \\ & & +K(K + 2)E\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\mathit{ds}\,, \\ \end{array}$$

where c is the universal constant of Davis' inequality. Note that the right-hand side of the inequality (9.5) is monotone increasing in t; we insert it into the above inequality and minimize w.r.t. δ at t = 1, which completes the proof.
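The δ-minimization used here is elementary: a bound of the form \((C/\delta ){e}^{\delta D}\) with C, D > 0 satisfies

$$\min_{\delta >0}\frac{C} {\delta }\,{e}^{\delta D} = CDe\,,\quad \mbox{ attained at }\delta = 1/D\,;$$

applied with \(D =\| {\sigma \|}_{\infty }^{2}\) in (9.5), and taking into account the square root coming from the Davis-inequality term, this produces the powers of \(\|{\sigma \|}_{\infty }\) appearing in (9.4).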

In fact we have another version of this transportation cost inequality in the case where σ is not bounded but still K-Lipschitz:

Proposition 9.2.

Assume that all the hypotheses of Proposition 9.1 are satisfied except the boundedness of σ appearing in the SDE (9.3); then we have the following transportation cost inequality:

$${d}_{W}^{2}(P,Q) \leq H(Q\vert P) \frac{2} {{(1 - acK)}^{2}}\exp \left ( \frac{1} {1 - acK}\left (\frac{cK} {a} + 1 - acK + 2K + {K}^{2}\right )\right )$$
(9.6)

where P is the law of the SDE (9.3), Q is any other probability, and a > 0 is arbitrary provided that acK < 1.

Proof.

The proof is somewhat similar to that of Proposition 9.1: in fact, we control uniformly the stochastic integral term in the Itô development of \(\vert {x}_{t}^{v} - {x}_{t}{\vert }^{2}\), as follows:

$$\begin{array}{rcl} & & E\left [\sup_{r\leq t}\left \vert \int\limits_{0}^{r}({x}_{ s}^{v} - {x}_{ s},(\sigma (s,{x}_{s}^{v}) - \sigma (s,{x}_{ s}))d{z}_{s})\right \vert \right ] \\ & & \leq cE\left [{\left (\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\vert \sigma (s,{x}_{ s}^{v}) - \sigma (s,{x}_{ s}){\vert }^{2}\mathit{ds}\right )}^{1/2}\right ] \\ & & \leq cKE\left [{\left (\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{4}\mathit{ds}\right )}^{1/2}\right ] \\ & & \leq cKE\left [{\left (\sup_{s\leq t}\vert {x}_{s}^{v} - {x}_{ s}{\vert }^{2}\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\mathit{ds}\right )}^{1/2}\right ] \\ & & \leq \frac{caK} {2} E\left [\sup_{s\leq t}\vert {x}_{s}^{v} - {x}_{ s}{\vert }^{2}\right ] + \frac{cK} {2a} E\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\mathit{ds}.\end{array}$$

Hence we get

$$\begin{array}{rcl} E\left [\sup_{s\leq t}\vert {x}_{s}^{v} - {x}_{ s}{\vert }^{2}\right ]& \leq & acKE\left [\sup_{ s\leq t}\vert {x}_{s}^{v} - {x}_{ s}{\vert }^{2}\right ] + \frac{cK} {a} E\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\mathit{ds} \\ & & +(2K + {K}^{2} + \delta )E\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\mathit{ds} + \frac{1} {\delta }E\int\limits_{0}^{t}\vert \dot{{v}}_{ s}{\vert }^{2}\mathit{ds}\,, \\ \end{array}$$

where a, δ > 0 are arbitrary and c is the constant of Davis' inequality. From the above, we obtain

$$\begin{array}{rcl} (1 - acK)E\left [\sup_{s\leq t}\vert {x}_{s}^{v} - {x}_{ s}{\vert }^{2}\right ]& \leq & \left (\frac{cK} {a} + 2K + {K}^{2} + \delta \right )E\int\limits_{0}^{t}\vert {x}_{ s}^{v} - {x}_{ s}{\vert }^{2}\mathit{ds} \\ & & +\frac{2} {\delta }{H}_{t}(Q\vert P) \\ \end{array}$$

and the Gronwall lemma implies that

$$\begin{array}{rcl} E\left [\sup_{s\leq t}\vert {x}_{s}^{v} - {x}_{ s}{\vert }^{2}\right ]& \leq & \frac{2} {\delta (1 - acK)}{H}_{t}(Q\vert P) \\ & & \cdot \exp \left [ \frac{t} {1 - acK}\left (\frac{cK} {a} + \delta + 2K + {K}^{2}\right )\right ].\end{array}$$

Taking t = 1 and minimizing the right-hand side of the last inequality w.r.t. δ completes the proof.
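Carrying out this minimization explicitly: at t = 1 the bound is of the form \(\frac{2} {\delta (1-acK)}{H}_{1}(Q\vert P)\exp [ \frac{1} {1-acK}(\frac{cK} {a} + \delta + 2K + {K}^{2})]\), and minimizing \(\delta \mapsto \frac{1} {\delta }{e}^{\delta /(1-acK)}\) gives \({\delta }^{{_\ast}} = 1 - acK\), so that

$$\frac{2} {{\delta }^{{_\ast}}(1 - acK)}\,{e}^{{\delta }^{{_\ast}}/(1-acK) } = \frac{2e} {{(1 - acK)}^{2}}\,,$$

which accounts for both the factor \({(1 - acK)}^{-2}\) and the term 1 − acK inside the exponential of (9.6).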

It is important to notice that we did not use any regularity property of b, except the integrability of t → b(t,x) for almost all x, which is used in an intermediate step. This observation means that we can deal with very singular drifts provided that they are dissipative. Let us give an application of Proposition 9.1 to multi-valued SDEs (cf. [1]) from this point of view.

Theorem 9.3.

Let P be the law of the process which is the solution of the following multi-valued stochastic differential equation:

$$m({X}_{t})\mathit{dt} + \sigma (t,{X}_{t}){\mathit{dW }}_{\!t} \in {\mathit{dX}}_{t} + A({X}_{t})\mathit{dt}\,,\,{X}_{0} = x \in D(A)\,,$$

where A is a maximal monotone set-valued map (hence − A is dissipative) such that \(\mathit{Int}(D(A))\neq \varnothing \). Assume that σ and m are uniformly K-Lipschitz and that σ is bounded. Then P satisfies the transportation cost inequality (9.4). If σ is only Lipschitz, but not necessarily bounded, then P satisfies the inequality (9.6).

Proof.

Let b n be the Yosida approximation of A, i.e., \({J}_{n} = {({I}_{{\mathrm{I\!R}}^{d}} + \frac{1} {n}A)}^{-1}\) and \(-{b}_{n} = n(I - {J}_{n})\); then b n is dissipative and Lipschitz, hence the law of the solution of the SDE

$${\mathit{dX}}_{t}^{n} = \sigma (t,{X}_{ t}){\mathit{dW }}_{\!t} + {b}_{n}({X}_{t}^{n})\mathit{dt} + m({X}_{ t}^{n})\mathit{dt}$$

satisfies the inequality (9.4) with constants independent of n; moreover, the laws of \(({X}^{n},n \in \mathrm{I\!N})\) converge weakly to P (cf. [1]), hence P also satisfies the inequality (9.4), by Lemma 9.1.
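As a one-dimensional illustration of this approximation, take \(A = \partial \vert \cdot \vert \), i.e., \(A(x) =\mathrm{ sign}(x)\) for x ≠ 0 and A(0) = [ − 1,1]; a direct computation gives

$${J}_{n}(x) = \left \{\begin{array}{ll} x - \frac{1} {n},&\quad x > \frac{1} {n} \\ 0, &\quad \vert x\vert \leq \frac{1} {n} \\ x + \frac{1} {n},&\quad x < -\frac{1} {n} \end{array} \right .\qquad {b}_{n}(x) = -n(x - {J}_{n}(x)) = \left \{\begin{array}{ll} -\mathrm{sign}(x),&\quad \vert x\vert > \frac{1} {n} \\ -nx, &\quad \vert x\vert \leq \frac{1} {n} \end{array} \right .$$

so each b n is dissipative and n-Lipschitz, while the Lipschitz constant explodes as n →∞, which is exactly the phenomenon described in the introduction.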

As an example of application of this theorem, let us give

Theorem 9.4.

Let P be the law of the solution of the following SDE:

$${\mathit{dX}}_{t}^{i} = m({X}_{ t}^{i})\mathit{dt} + \sigma ({X}_{ t}^{i}){\mathit{dW }}_{\! t}^{i} + \gamma \sum\limits_{1\leq j\neq i\leq d} \frac{1} {{X}_{t}^{i} - {X}_{t}^{j}}\mathit{dt}\,,\,i = 1,\ldots ,d\,,$$

with σ bounded and Lipschitz, γ > 0. Then P satisfies the transportation cost inequality (9.4) and if σ is not bounded but only Lipschitz, then P satisfies the inequality (9.6).

Proof.

It suffices to remark that the singular drift term (the sum multiplied by γ) is the subdifferential of the concave function defined by

$$F(x) = \gamma \sum\limits_{i<j}\log ({x}^{j} - {x}^{i})$$

if \({x}^{1} < {x}^{2} < \ldots < {x}^{d}\), and it is equal to \(-\infty \) otherwise.
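Indeed, differentiating F at a point with \({x}^{1} < \cdots < {x}^{d}\) gives

$$\frac{\partial F} {\partial {x}^{i}} = \gamma \left (\sum\limits_{k<i} \frac{1} {{x}^{i} - {x}^{k}} -\sum\limits_{j>i} \frac{1} {{x}^{j} - {x}^{i}}\right ) = \gamma \sum\limits_{j\neq i} \frac{1} {{x}^{i} - {x}^{j}}\,,$$

which is exactly the singular drift of the equation; moreover, F is concave on the set \(\{{x}^{1} < \cdots < {x}^{d}\}\), since each \(\log ({x}^{j} - {x}^{i})\) is a concave function of a linear map.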

Remark 9.1.

For details about the equation of Theorem 9.4, cf. [2]. Moreover, Theorem 9.3 is applicable to all the models given in [3].