1 Introduction

The construction of a stochastic process from given marginal distributions is called a marginal problem.

Schrödinger’s problem is the construction of a Markov diffusion process on [0, 1] from two endpoint marginal distributions at \(t=0,1\) by solving a variational problem on the relative entropy. We describe it briefly (see \(V_S\) in (1.19), (4.19), and also [28, 30, 44]). Let \(\sigma\) and \(\xi\) be, respectively, a \(d\times d\) nondegenerate matrix-valued and an \(\mathbb {R}^d\)-valued function on \([0,1]\times \mathbb {R}^d\). Suppose that the following stochastic differential equation has a weak solution \(\{X(t)\}_{0\le t\le 1}\) with a positive transition probability density \(p(s,x;t,y), 0\le s<t\le 1, x,y\in \mathbb {R}^d\):

$$\begin{aligned} dX(t)=\xi (t,X(t))dt+\sigma (t,X(t))dW(t),\quad 0<t<1, \end{aligned}$$
(1.1)

where W(t) denotes a d-dimensional Brownian motion defined on a probability space (see Theorem 6 in Sect. 4). Let \(\mathcal{P} (\mathbb {R}^d )\) denote the set of all Borel probability measures on \(\mathbb {R}^d\). For any \(P_0, P_1\in \mathcal{P} (\mathbb {R}^d )\), there exists a unique product measure \(\nu _0(dx)\nu _1(dy)\) that satisfies the following:

$$\begin{aligned} P_1(dy) & = {} \nu _1(dy)\int _{\mathbb {R}^d}p(0,x;1,y)\nu _0(dx), \end{aligned}$$
(1.2)
$$\begin{aligned} P_0(dx) & = {} \nu _0(dx)\int _{\mathbb {R}^d}p(0,x;1,y)\nu _1(dy). \end{aligned}$$
(1.3)

This is Euler’s equation of Schrödinger’s problem and is called Schrödinger’s functional equation or the Schrödinger system (see [55, 56] and also [27] and Proposition 2.1 in [42]). Under some assumptions on \(\sigma\) and \(\xi\) (see, e.g. (A.5)–(A.6) in Sect. 4), if \(P_1(dy)\ll dy\), then there exists a unique weak solution \(\{Y(t)\}_{0\le t\le 1}\) to the following (see [28]):

$$\begin{aligned} dY(t) & = {} \{a(t,Y(t))D_y\log h(t,Y(t))+\xi (t,Y(t))\}dt+\sigma (t,Y(t))dW(t),\quad 0<t<1, \end{aligned}$$
(1.4)
$$\begin{aligned} P^{Y(0)} & = {} P_0, \end{aligned}$$
(1.5)

where \(a(t,x):=\sigma (t,x)\sigma (t,x)^*\), \(\sigma (t,x)^*\) denotes the transpose of \(\sigma (t,x)\), \(D_y:=\left( \partial /\partial y_i\right) _{i=1}^d\),

$$\begin{aligned} h(t,x):=\int _{\mathbb {R}^d}p(t,x;1,y)\nu _1(dy),\quad 0\le t<1, x\in \mathbb {R}^d, \end{aligned}$$

and \(P^{Y(0)}\) denotes the probability law of Y(0). Besides, the following holds:

$$\begin{aligned} P^{(Y(0),Y(1))}(dxdy)=\nu _0(dx)p(0,x;1,y)\nu _1(dy), \end{aligned}$$
(1.6)

which implies that \(P^{Y(1)}=P_1\) from (1.2). \(\{Y(t)\}_{0\le t\le 1}\) is called the h-path process for \(\{X(t)\}_{0\le t\le 1}\) from two endpoint marginals \(P_0, P_1\) at \(t=0,1\), respectively.
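For a concrete instance (a standard example, recalled here purely for illustration), take \(\sigma =I_d\), \(\xi =0\), so that \(p(t,x;1,y)=(2\pi (1-t))^{-d/2}e^{-|y-x|^2/(2(1-t))}\), and take \(P_0=\delta _{x_0}\), \(P_1=\delta _{x_1}\) (here \(P_1\) is not absolutely continuous, but the h-path to a point is classical). Then \(h(t,x)=p(t,x;1,x_1)\) up to a multiplicative constant, \(D_y\log h(t,y)=(x_1-y)/(1-t)\), and (1.4)–(1.5) reduce to the Brownian bridge equation

$$\begin{aligned} dY(t)=\frac{x_1-Y(t)}{1-t}dt+dW(t),\quad 0<t<1,\qquad Y(0)=x_0. \end{aligned}$$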

Remark 1

Schrödinger’s functional equations (1.2)–(1.3) are equivalent to the following:

$$\begin{aligned} \overline{h}(1,y) & = {} \int _{\mathbb {R}^d}p(0,x;1,y) \left\{ \int _{\mathbb {R}^d}p(0,x;1,z)\overline{h}(1,z)^{-1}P_1(dz)\right\} ^{-1} P_0(dx), \end{aligned}$$
(1.7)
$$\begin{aligned} \nu _1(dy): & = {} \overline{h}(1,y)^{-1}P_1(dy), \end{aligned}$$
(1.8)
$$\begin{aligned} \nu _0(dx): & = {} \left\{ \int _{\mathbb {R}^d}p(0,x;1,z)\nu _1(dz)\right\} ^{-1}P_0(dx). \end{aligned}$$
(1.9)

In particular, one only has to find a solution \(\overline{h}(1,\cdot )\) in (1.7).
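For orientation, here is a minimal numerical sketch (our own illustration, on a finite state space): the kernel \(p(0,x;1,y)\) becomes a positive matrix K, and the Schrödinger system (1.2)–(1.3) is solved by alternately enforcing the two equations through diagonal scaling (the iterative proportional fitting, or Sinkhorn, scheme), whose convergence for positive kernels is classical.

```python
import numpy as np

def schroedinger_system(K, P0, P1, n_iter=1000, tol=1e-12):
    """Alternately enforce (1.3) and (1.2) by diagonal scaling of K."""
    nu1 = np.ones_like(P1)
    for _ in range(n_iter):
        nu0 = P0 / (K @ nu1)        # (1.3): P0 = nu0 * (K nu1)
        nu1_new = P1 / (K.T @ nu0)  # (1.2): P1 = nu1 * (K^T nu0)
        if np.max(np.abs(nu1_new - nu1)) < tol:
            return nu0, nu1_new
        nu1 = nu1_new
    return nu0, nu1

x = np.linspace(-2.0, 2.0, 50)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)  # stand-in for p(0,x;1,y)
P0 = np.exp(-x ** 2); P0 /= P0.sum()
P1 = np.exp(-(x - 1.0) ** 2); P1 /= P1.sum()
nu0, nu1 = schroedinger_system(K, P0, P1)
mu = nu0[:, None] * K * nu1[None, :]               # discrete analogue of (1.6)
print(np.abs(mu.sum(axis=1) - P0).max(), np.abs(mu.sum(axis=0) - P1).max())
```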

Motivated by Schrödinger’s quantum mechanics, Nelson proposed the problem of the construction of a Markov diffusion process from the Fokker–Planck equation. We describe it. Let a and b be, respectively, a \(d\times d\) symmetric nonnegative definite matrix-valued and an \(\mathbb {R}^d\)-valued function on \([0,1]\times \mathbb {R}^d\), and let \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\).

By \((a,b)\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\), we mean that \(a,b\in L^1([0,1]\times \mathbb {R}^d, dtP_t(dx))\) and the following Fokker-Planck equation holds: for any \(f \in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\) and \(t\in [0,1]\),

$$\begin{aligned}&\int _{\mathbb {R}^d}f (t,x)P_t(dx)-\int _{\mathbb {R}^d}f(0,x)P_0(dx)\nonumber \\&\quad =\int _0^t ds\int _{\mathbb {R}^d}\biggl (\partial _sf(s,x)+\frac{1}{2} \langle a(s,x), D_x^2f(s,x)\rangle \nonumber \\&\qquad + \langle b(s,x),D_x f (s,x)\rangle \biggr ) P_s(dx). \end{aligned}$$
(1.10)

Here \(\partial _s:=\partial /\partial s\), \(D_x^2:=\left( \partial ^2/\partial x_i\partial x_j\right) _{i,j=1}^d\), \(\langle x,y\rangle\) denotes the inner product of \(x, y\in \mathbb {R}^d\) and

$$\begin{aligned} \langle A,B\rangle :=\sum _{i,j=1}^d A_{ij}B_{ij},\quad A=(A_{ij})_{i,j=1}^d, B=(B_{ij})_{i,j=1}^d \in M(d,\mathbb {R}). \end{aligned}$$

We also write \((a,b)\in \mathbf{A}_0 (\{P_t\}_{0\le t\le 1})\) if \(a,b\in L^1_{loc}([0,1]\times \mathbb {R}^d, dtP_t(dx))\) and (1.10) holds for all \(f \in C^{1,2}_0 ([0,1]\times \mathbb {R}^d)\).
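As a minimal check of (1.10) (an elementary example of ours), take \(a=I_d\), \(b=0\), and \(P_t(dx)=p(t,x)dx\) with \(p(t,x):=(2\pi (1+t))^{-d/2}e^{-|x|^2/(2(1+t))}\), which solves the heat equation \(\partial _t p=\frac{1}{2}\Delta _x p\). Then, for \(f\in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\), differentiating under the integral and integrating by parts twice,

$$\begin{aligned} \int _{\mathbb {R}^d}f (t,x)P_t(dx)-\int _{\mathbb {R}^d}f(0,x)P_0(dx) & = \int _0^t ds\int _{\mathbb {R}^d}\left( \partial _sf(s,x)p(s,x)+f(s,x)\frac{1}{2}\Delta _x p(s,x)\right) dx\\ & = \int _0^t ds\int _{\mathbb {R}^d}\left( \partial _sf(s,x)+\frac{1}{2}\Delta _xf(s,x)\right) P_s(dx), \end{aligned}$$

which is (1.10) with \(a=I_d\), \(b=0\), since \(\langle I_d,D_x^2f(s,x)\rangle =\Delta _xf(s,x)\).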

Remark 2

For \(\{ P_t\}_{0\le t\le 1}\) in (1.10), \(\mathbf{A} (\{P_t\}_{0\le t\le 1})\) is not necessarily a singleton (see [11,12,13,14, 33]).

The following is a generalized version of Nelson’s problem (see [47, 49, 50]).

Definition 1

(Nelson’s problem) For any \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\) such that \(\mathbf{A}_0 (\{P_t\}_{0\le t\le 1})\) is not empty and for any \((a,b)\in \mathbf{A}_0 (\{P_t\}_{0\le t\le 1})\), construct a \(d\times d\) matrix-valued function \(\sigma (t,x)\) on \([0,1]\times \mathbb {R}^d\) and a semimartingale \(\{X(t)\}_{0\le t\le 1}\) such that the following holds: for \((t,x)\in [0,1]\times \mathbb {R}^d\),

$$\begin{aligned} a(t,x) & = {} \sigma (t,x)\sigma (t,x)^*,\quad dtP_t(dx)\hbox {-a.e.}, \end{aligned}$$
(1.11)
$$\begin{aligned} X(t) & = {} X(0)+\int _0^t b(s,X(s))ds+\int _0^t \sigma (s, X(s))dW_X(s), \end{aligned}$$
(1.12)
$$\begin{aligned} P^{X(t)} & = {} P_t. \end{aligned}$$
(1.13)

Here \(W_X\) denotes a d-dimensional Brownian motion.

When \(\sigma (t,x)\) is nondegenerate, \(W_X\) can be taken to be an \((\mathcal{F}_t^X)\)-Brownian motion, where \(\mathcal{F}_t^X\) denotes \(\sigma [X(s):0\le s\le t]\). Otherwise (1.12) means that \(X(t)-X(0)-\int _0^t b(s,X(s))ds\) is a local martingale with a quadratic variational process \(\int _0^t a (s, X(s))ds\) (see, e.g. [25]).

The first result on Nelson’s problem was given by E. Carlen when a is the identity matrix (see [8, 9], and also [10, 46, 63] for different approaches). We generalized it to the case of a variable diffusion matrix (see [33]). P. Cattiaux and C. Léonard extensively generalized it to a setting that also covers jump-type Markov processes (see [11,12,13,14]). In these papers, the following condition was assumed.

Definition 2

(Finite energy condition (FEC))

There exists \((a,b) \in \mathbf{A} (\{P_t\}_{0\le t\le 1})\) such that the following holds:

$$\begin{aligned} \int _0^1 dt\int _{\mathbb {R}^d}\langle a(t,x)^{-1}b(t,x),b(t,x)\rangle P_t(dx)<\infty . \end{aligned}$$
(1.14)

We describe a class of stochastic optimal transportation problems (SOTPs for short) and approaches to the h-path process and to Nelson’s problem via the SOTPs.

Fix a Borel measurable \(d\times d\) matrix-valued function \(\sigma (t,x)\). Let \(\mathcal{A}\) denote the set of all \(\mathbb {R}^d\)-valued, continuous semimartingales \(\{ X(t)\}_{0\le t\le 1}\) on a (possibly different) complete filtered probability space such that there exists a Borel measurable \(\beta _X :[0,1]\times C([0,1])\longrightarrow \mathbb {R}^d\) for which the following holds:

  1. (i)

    \(\omega \mapsto \beta _X (t,\omega )\) is \(\mathbf{B}(C([0,t]))_+\)-measurable for all \(t\in [0,1]\),

  2. (ii)

    \(X(t)=X(0)+\int _0^t \beta _X (s,X)ds+\int _0^t \sigma (s,X(s))dW_X(s)\), \(0\le t\le 1\),

  3. (iii)
    $$\begin{aligned} E\left[ \int _0^1\left\{ |\beta _X (t,X)|+|\sigma (t,X(t))|^2\right\} dt\right] <\infty . \end{aligned}$$

    Here \(\mathbf{B}(C([0,t]))\) and \(\mathbf{B}(C([0,t]))_+\) denote the Borel \(\sigma\)-field of C([0, t]) and \(\cap _{s> t}{} \mathbf{B}(C([0,s]))\), respectively (see, e.g. [31]). \(|\cdot |:=\langle \cdot , \cdot \rangle ^{1/2}\).
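For illustration, here is a minimal simulation sketch of an element of \(\mathcal{A}\) (our own, with \(d=1\) and hypothetical choices of the drift functional and \(\sigma\)): condition (ii) is discretized by the Euler–Maruyama scheme, and the drift may depend on the whole past of the path, as (i) permits.

```python
import numpy as np

def euler_maruyama(x0, beta, sigma, n_steps=200, seed=0):
    """Simulate X(t) = X(0) + int_0^t beta(s,X) ds + int_0^t sigma(s,X(s)) dW(s)
    on [0,1]; beta(t, past) may use the whole path up to time t, cf. (i)-(ii)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n_steps
    path = np.empty(n_steps + 1)
    path[0] = x0
    for k in range(n_steps):
        t = k * dt
        drift = beta(t, path[:k + 1])          # path-dependent drift
        path[k + 1] = (path[k] + drift * dt
                       + sigma(t, path[k]) * np.sqrt(dt) * rng.normal())
    return path

# hypothetical example: drift = minus the running average of the path, sigma = 1
path = euler_maruyama(0.5, lambda t, past: -past.mean(), lambda t, x: 1.0)
print(path[-1])
```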

Let \(L:[0,1]\times \mathbb {R}^d \times \mathbb {R}^d\longrightarrow [0,\infty )\) be continuous. The following is a class of the SOTPs (see [41, 45], and also [33, 37, 44]).

Definition 3

(Stochastic optimal transportation problems)

  1. (1)

    For \(P_0\), \(P_1\in \mathcal{P} (\mathbb {R}^d )\),

    $$\begin{aligned} V (P_0,P_{1}):=\inf _{\begin{array}{c} {\scriptstyle X\in \mathcal{A},} \\ {{\scriptstyle P^{X (t)}=P_t, t=0,1}} \end{array}}\ E\biggl [\int _0^1 L(t,X (t);\beta _X (t,X))dt \biggr ]. \end{aligned}$$
    (1.15)
  2. (2)

    For \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\),

    $$\begin{aligned} \mathbf{V} (\{ P_t\}_{0\le t\le 1} ):=\inf _{\begin{array}{c} {\scriptstyle X\in \mathcal{A},} \\ {{\scriptstyle P^{X (t)}=P_t, 0\le t\le 1}} \end{array}} E\biggl [\int _0^1 L(t,X (t);\beta _X (t,X))dt \biggr ]. \end{aligned}$$
    (1.16)

If the set over which the infimum is taken is empty, then the infimum is set to be \(+\infty\).

Suppose that one knows the marginal probability distributions of a stochastic system at the times \(t=0, 1\) or at every \(t\in [0,1]\). To study the stochastic system on [0, 1] from the viewpoint of the principle of least action, one is led to these kinds of problems.

Remark 3

(i) The sets of stochastic processes over which the infima are taken in (1.15)–(1.16) can be empty. If \(P_1(dx)\ll dx\), then sufficient conditions for them to be nonempty are known: for (1.15) in [28] and for (1.16) in [5, 8,9,10,11,12,13,14, 33, 35,36,37, 40, 41, 46, 60, 63]. (ii) For \(\{X(t)\}_{0\le t\le 1}\in \mathcal{A}\),

$$\begin{aligned}&(\sigma (t,x)\sigma (t,x)^*, b_X(t,x):=E[\beta _X(t,X)|(t, X(t)=x)])_{(t,x)\in [0,1]\times \mathbb {R}^d}\nonumber \\&\quad \in \mathbf{A} (\{P^{X(t)}\}_{0\le t\le 1}). \end{aligned}$$
(1.17)

Indeed, by Itô’s formula, (1.10) with \(a=\sigma \sigma ^*, b=b_X\) holds and by Jensen’s inequality,

$$\begin{aligned} E[|b_X(t,X(t))|]=E[|E[\beta _X(t,X)|(t, X(t))]|]\le E[|\beta _X(t,X)|]. \end{aligned}$$
(1.18)

Schrödinger’s problem, which is a typical example of the SOTP, is \(V_S:=V\) in (1.15) when the following holds:

$$\begin{aligned} L(t,x;u)=\frac{1}{2}|\sigma (t,x)^{-1} (u-\xi (t,x))|^2 \end{aligned}$$
(1.19)

(see, e.g. [30, 44, 53]). If \(V_S (P_0, P_1)\) is finite for \(P_0\), \(P_1\in \mathcal{P} (\mathbb {R}^d )\) and if \(\sigma\) and \(\xi\) satisfy suitable regularity conditions, then the minimizer exists uniquely and is the h-path process with the two endpoint marginals \(P_0, P_1\) in (1.4)–(1.5) (see [16, 21, 44, 45, 51, 62]).

Via the continuum limit of \(V (\cdot ,\cdot )\), we considered Nelson’s problem in a more general setting, including the following case (see [33, 40]).

Definition 4

(Generalized finite energy condition (GFEC))

There exist \(\gamma >1\) and \((a,b) \in \mathbf{A} (\{P_t\}_{0\le t\le 1})\) such that the following holds:

$$\begin{aligned} \int _0^1 dt\int _{\mathbb {R}^d}\langle a(t,x)^{-1}b(t,x),b(t,x)\rangle ^{\frac{\gamma }{2}} P_t(dx)<\infty . \end{aligned}$$
(1.20)
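For example (a standard computation, included for illustration): for \(d=1\), the Brownian bridge from 0 at \(t=0\) to 0 at \(t=1\) has \(a=1\), \(b(t,x)=-x/(1-t)\), and \(P_t=N(0,t(1-t))\), so that, with \(c_\gamma :=E[|Z|^\gamma ]\) for a standard normal Z,

$$\begin{aligned} \int _0^1 dt\int _{\mathbb {R}}\left( \frac{x^2}{(1-t)^2}\right) ^{\frac{\gamma }{2}}P_t(dx) =c_\gamma \int _0^1 t^{\frac{\gamma }{2}}(1-t)^{-\frac{\gamma }{2}}dt, \end{aligned}$$

which is finite if and only if \(\gamma <2\): the bridge satisfies the GFEC for every \(\gamma \in (1,2)\) but not the FEC (1.14).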

As an application of the Duality Theorem for \(\mathbf{V}\), we also gave an approach to Nelson’s problem under a condition that includes the GFEC (see [41]).

When (1.11)–(1.13) hold, one also says that the superposition principle holds. When \(a\equiv 0\), the superposition principle was studied in [1, 2, 36, 37]. Trevisan’s result [60] almost completely solved Nelson’s problem (see also [5, 19]). The case of linear operators that combine a second-order differential operator with a Lévy measure was studied in [14, 52].

Theorem 1

(See [60]) Suppose that \((a,b)\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\) for some \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\).

Then Nelson’s problem (1.11)–(1.13) has a solution.

In his problem, Nelson considered the case where a is the identity and \(b=D_x \psi (t,x)\) for some function \(\psi\). It turned out that the Nelson process is the minimizer of \(\mathbf{V}_N:=\mathbf{V}\) when (1.19) with \(\sigma\) the identity and \(\xi =0\) and the FEC hold (see Proposition 3.1 in [33] and also Theorem 4 in Sect. 2). Indeed, if \((a, D_x \psi _i)\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\), \(i=1,2\), then \(D_x \psi _1=D_x \psi _2\), \(dtP_t(dx)\)-a.e. In this sense, we regard Nelson’s problem as the study of the superposition principle and of the minimizer of \(\mathbf{V}\). In particular, if the superposition principle holds, then the set over which the infimum is taken in \(\mathbf{V}\) is not empty, and one can then consider a minimizer of \(\mathbf{V}\), provided it is finite. A different approach proceeded by proving Proposition 1 in Sect. 2 via the Duality Theorems in Theorems 3 and 4 in Sect. 2 (see [41] and also [33, 40]). This is also generalized by the superposition principle, so our previous approach to the first part of Nelson’s problem is no longer needed.

In Sect. 2, we improve our previous results on the SOTPs with a convex cost function by the superposition principle in Theorem 1.

More precisely, we prove that the SOTPs are equivalent to variational problems for probability measures given by the Fokker–Planck equation and to those given by a relaxed version of the Fokker–Planck equation (see Proposition 1 in Sect. 2). In particular, we can prove the convexity and the lower semicontinuity of the SOTPs in the marginal distributions by a finite-dimensional approach, even though the SOTPs are variational problems for semimartingales. This gives a new insight into the SOTPs and lets us revisit them.

In Sect. 3, in the case where \(d=1\) and where a is not fixed, we consider slightly relaxed versions of the SOTPs whose cost functions are not assumed to be convex. In this case, we need a generalization of Trevisan’s result which was recently obtained by Bogachev, Röckner, and Shaposhnikov.

Theorem 2

(See [5]) Suppose that \((a,b)\in \mathbf{A}_0 (\{P_t\}_{0\le t\le 1})\) for some \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\) and that the following holds:

$$\begin{aligned} \int _0^1 dt\int _{\mathbb {R}^d}\frac{|a(t,x)|+|\langle x,b(t,x)\rangle |}{1+|x|^2}P_t(dx)<\infty . \end{aligned}$$
(1.21)

Then Nelson’s problem (1.11)–(1.13) has a solution.

The test of the Markov property of a minimizer is known as a fundamental problem in stochastic optimal control theory. We also discuss this problem for a finite-time horizon stochastic optimal control problem.

In Sect. 4, we study the semiconcavity and the Lipschitz continuity of Schrödinger’s problem \(V_S\).

2 SOTPs with a convex cost

In this section, we discuss applications of D. Trevisan’s result to the Duality Theorems for the SOTPs in the case where \(u\mapsto L(t,x;u)\) is convex and where \(\sigma\) and \(a=\sigma \sigma ^*\) in (1.11) are fixed. We write \(b\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\) if \((a,b)\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\) for the sake of simplicity (see (1.10) for notation).

As a preparation, we introduce two classes of marginal problems which play crucial roles in the proof of the Duality Theorems for the SOTPs (see [40, 41]) and which will be proved to be equivalent to the SOTPs by D. Trevisan’s result.

The following can be considered as versions of the SOTPs for a flow of marginals which satisfy (1.10).

Definition 5

(SOTPs for marginal flows)

  1. (1)

    For \(P_0\), \(P_{1}\in \mathcal{P}(\mathbb {R}^d)\),

    $$\begin{aligned} \mathrm{v}(P_0,P_{1}) :=\inf _{\begin{array}{c} {\scriptstyle b\in \mathbf{A} (\{Q_t\}_{0\le t\le 1}),} \\ {{\scriptstyle Q_t=P_t, t=0,1}} \end{array}} \int _0^1 dt \int _{\mathbb {R}^d}L(t,x;b(t,x))Q_t(dx). \end{aligned}$$
    (2.1)
  2. (2)

    For \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R}^d)\),

    $$\begin{aligned} \mathbf{v} (\{ P_t\}_{0\le t\le 1} ) :=\inf _{b\in \mathbf{A} (\{ P_t\}_{0\le t\le 1})} \int _0^1 dt\int _{\mathbb {R}^d} L(t,x;b(t,x))P_t(dx). \end{aligned}$$
    (2.2)

For \(\mu (dxdu)\in \mathcal{P} ( \mathbb {R}^d\times \mathbb {R}^d )\),

$$\begin{aligned} \mu _{1}(dx):=\mu (dx\times \mathbb {R}^d),\quad \mu _{2}(du):=\mu (\mathbb {R}^d\times du). \end{aligned}$$
(2.3)

We write \(\nu (dtdxdu)\in \tilde{\mathcal{A}}\) if the following holds. (i) \(\nu \in \mathcal{P} ([0,1]\times \mathbb {R}^d \times \mathbb {R}^d )\) and

$$\begin{aligned} \int _{[0,1]\times \mathbb {R}^d\times \mathbb {R}^d} (|a(t,x)|+|u|)\nu (dtdxdu)<\infty . \end{aligned}$$
(2.4)

(ii) \(\nu (dtdxdu)=dt\nu (t,dxdu)\), \(\nu (t,dxdu)\in \mathcal{P} ( \mathbb {R}^d\times \mathbb {R}^d )\), \(\nu _{1}(t, dx), \nu _{2}(t, du)\in \mathcal{P} ( \mathbb {R}^d)\), \(dt-\)a.e. and \(t\mapsto \nu _{1}(t, dx)\) has a weakly continuous version \(\nu _{1,t}(dx)\in \mathcal{P} ( \mathbb {R}^d)\) for which the following holds: for any \(t\in [0,1]\) and \(f\in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\),

$$\begin{aligned}&\int _{\mathbb {R}^d}f (t,x)\nu _{1,t}(dx)-\int _{\mathbb {R}^d}f(0,x)\nu _{1,0}(dx)\nonumber \\&\quad =\int _{[0,t]\times \mathbb {R}^d\times \mathbb {R}^d} \mathcal{L}_{s,x,u} f(s,x)\nu (dsdxdu). \end{aligned}$$
(2.5)

Here

$$\begin{aligned} \mathcal{L}_{s,x,u} f(s,x):=\partial _s f(s,x)+\frac{1}{2}\langle a(s,x),D_x^2 f(s,x)\rangle +\langle u,D_x f (s,x)\rangle . \end{aligned}$$
(2.6)

We introduce a relaxed version of the problem above (see [23] and references therein for related topics).

Definition 6

(SOTPs for marginal measures)

  1. (1)

    For \(P_0\), \(P_{1}\in \mathcal{P}(\mathbb {R}^d)\),

    $$\begin{aligned} \tilde{\mathrm{v}}(P_0,P_{1}) :=\inf _{\begin{array}{c} {\scriptstyle \nu \in \tilde{\mathcal{A}},} \\ {{\scriptstyle \nu _{1,t}=P_t, t=0,1}} \end{array}} \int _{[0,1]\times \mathbb {R}^d\times \mathbb {R}^d}L(t,x;u)\nu (dtdxdu). \end{aligned}$$
    (2.7)
  2. (2)

    For \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R}^d)\),

    $$\begin{aligned} \tilde{\mathbf{v}} (\{ P_t\}_{0\le t\le 1} ) :=\inf _{\begin{array}{c} {\scriptstyle \nu \in \tilde{\mathcal{A}},} \\ {{\scriptstyle \nu _{1,t}=P_t, 0\le t\le 1}} \end{array}} \int _{[0,1]\times \mathbb {R}^d\times \mathbb {R}^d}L(t,x;u)\nu (dtdxdu). \end{aligned}$$
    (2.8)

Remark 4

If \(b\in \mathbf{A}(\{ P_t\}_{0\le t\le 1})\) and \(X\in \mathcal{A}\), then \(dtP_t(dx)\delta _{b(t,x)}(du)\in \tilde{\mathcal{A}}\) and \(dtP^{(X(t),\beta _X(t,X))}(dxdu)\in \tilde{\mathcal{A}}\), respectively. Here \(\delta _x\) denotes the delta measure on \(\{x\}\). In particular, \(dtP^{(X(t),\beta _X(t,X))}(dxdu)\) is the distribution of a \([0,1]\times \mathbb {R}^d\times \mathbb {R}^d\)-valued random variable \((t,X(t),\beta _X(t,X))\). This is why we call (2.7)–(2.8) SOTPs for marginal measures (see also Lemma 1 given later). One can also identify \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R}^d)\) with \(dtP_t(dx)\in \mathcal{P}([0,1]\times \mathbb {R}^d)\) when \(\mathbf{V}, \mathbf{v}\) and \(\tilde{\mathbf{v}}\) are considered (see Theorem 4 and also [41, 44]).

We introduce assumptions.

(A.0.0). (i) \(\sigma _{ij}\in C_b([0,1]\times \mathbb {R}^d)\), \(i,j=1,\ldots ,d\). (ii) \(\sigma (\cdot )=(\sigma _{ij}(\cdot ))_{i,j=1}^d\) is a nondegenerate \(d\times d\)-matrix function on \([0,1]\times \mathbb {R}^d\).

(A.1). (i) \(L\in C([0,1]\times \mathbb {R}^d \times \mathbb {R}^d;[0,\infty ))\). (ii) \(\mathbb {R}^d\ni u\mapsto L(t,x;u)\) is convex for \((t,x)\in [0,1]\times \mathbb {R}^d\).

(A.2).

$$\begin{aligned} \lim _{|u|\rightarrow \infty } \frac{\inf \{L(t,x;u)|(t,x)\in [0,1]\times \mathbb {R}^d\}}{|u|}=\infty . \end{aligned}$$

The following proposition gives the relations among, and the properties of, the three classes of the SOTPs stated in Definitions 3, 5, and 6 above. In particular, it implies that they are equivalent in our setting, which explains why they are all called the SOTPs. It also implies the convexity and the lower semicontinuity of \(V(P_0,P_{1})\) and \(\mathbf{V} (\{ P_t\}_{0\le t\le 1})\).

Proposition 1

  1. (i)

    Suppose that (A.1) holds. Then the following holds:

    $$\begin{aligned} V(P_0,P_{1})=\mathrm{v}(P_0,P_{1})=\tilde{\mathrm{v}}(P_0,P_{1}),\quad P_0, P_{1}\in \mathcal{P}(\mathbb {R}^d), \end{aligned}$$
    (2.9)
    $$\begin{aligned} \mathbf{V} (\{ P_t\}_{0\le t\le 1} )= \mathbf{v} (\{ P_t\}_{0\le t\le 1} )=\tilde{\mathbf{v}} (\{ P_t\}_{0\le t\le 1} ), \{ P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R}^d). \end{aligned}$$
    (2.10)
  2. (ii)

    Suppose, in addition, that (A.0.0,i) and (A.2) hold. Then there exist minimizers X of \(V (P_0,P_{1})\) and Y of \(\mathbf{V} (\{ P_t\}_{0\le t\le 1} )\) for which

    $$\begin{aligned} \beta _X (t,X)=b_X(t,X(t)),\quad \beta _Y (t,Y)=b_Y(t,Y(t)), \end{aligned}$$
    (2.11)

    provided \(V (P_0,P_{1})\) and \(\mathbf{V} (\{ P_t\}_{0\le t\le 1} )\) are finite, respectively (see (1.17) for notation).

  3. (iii)

    Suppose, in addition, that (A.0.0,ii) holds and that \(\mathbb {R}^d\ni u\mapsto L(t,x;u)\) is strictly convex for \((t,x)\in [0,1]\times \mathbb {R}^d\). Then for any minimizers X of \(V (P_0,P_{1})\) and Y of \(\mathbf{V} (\{ P_t\}_{0\le t\le 1} )\), (2.11) holds and \(b_X\) and \(b_Y\) in (2.11) are unique on the support of \(dtP^{X(t)}(dx)\) and \(dtP^{Y(t)}(dx)\), respectively.

Remark 5

Let \(c\in C(\mathbb {R}^d\times \mathbb {R}^d;[0,\infty ))\). For \(P_0, P_{1}\in \mathcal{P}(\mathbb {R}^d)\),

$$\begin{aligned}&T_M(P_0, P_{1}):=\inf \left\{ \int _{\mathbb {R}^d}c(x,\varphi (x))P_0(dx)\biggl |P_0\varphi ^{-1}=P_1\right\} \nonumber \\&\quad \ge T(P_0, P_{1}):= \inf \left\{ \int _{\mathbb {R}^d\times \mathbb {R}^d}c(x,y)\mu (dxdy)\biggl | \mu _i=P_{i-1}, i=1,2\right\} \end{aligned}$$
(2.12)

(see (2.3) for notation). \(T_M(P_0, P_{1})\) and \(T(P_0, P_{1})\) are called Monge’s and Monge–Kantorovich’s problems, respectively. The second equalities in (2.9)–(2.10) are similar to the relation between Monge’s and Monge–Kantorovich’s problems, since \(\tilde{\mathrm{v}}\) and \(\tilde{\mathbf{v}}\) are infima of linear functionals of measures (see, e.g. [51, 61]).
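A small numerical illustration (ours, not from the original references): for empirical measures with n distinct atoms of equal weight 1/n, the Kantorovich problem T is a linear program over the Birkhoff polytope whose optimum is attained at a permutation matrix; hence a Monge map exists and \(T_M(P_0,P_1)=T(P_0,P_1)\) in this case.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Empirical measures with n = 8 atoms of weight 1/n each: the optimal
# Kantorovich coupling is a permutation (Birkhoff-von Neumann), so T = T_M.
rng = np.random.default_rng(1)
x = rng.normal(size=(8, 2))                    # atoms of P0
y = rng.normal(loc=1.0, size=(8, 2))           # atoms of P1
cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)  # c(x,y) = |x-y|^2
rows, cols = linear_sum_assignment(cost)       # Monge map x_i -> y_{cols[i]}
print(cost[rows, cols].mean())                 # T_M(P0,P1) = T(P0,P1)
```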

Before we prove Proposition 1, we state its application to the SOTPs.

For any \(s\ge 0\) and \(P\in \mathcal{P} (\mathbb {R}^d )\),

$$\begin{aligned} \mathbf{\Psi }_{P}(s):= \biggl \{ \nu \in \tilde{\mathcal{A}}\biggl |\nu _{1,0}=P, \int _{[0,1]\times \mathbb {R}^d\times \mathbb {R}^d}L(t,x;u)\nu (dtdxdu)\le s \biggl \}. \end{aligned}$$
(2.13)

Let \(\mathcal{P} (\mathbb {R}^d )\) be endowed with the weak topology. Then the following is known.

Lemma 1

(See [41]) Suppose that (A.0.0,i) and (A.1)–(A.2) hold. Then for any \(s\ge 0\) and compact set \(K\subset \mathcal{P} (\mathbb {R}^d )\), the set \(\cup _{P\in K}{} \mathbf{\Psi }_{P}(s)\) is compact in \(\mathcal{P} ([0,1]\times \mathbb {R}^d \times \mathbb {R}^d )\).

Lemma 1 was given in [41] to prove the Duality Theorems for \(\mathrm{v}(P_0,P_{1})\) and \(\mathbf{v}(\{ P_t\}_{0\le t\le 1})\). By Proposition 1, it can also be used in the proofs of the lower semicontinuity of \({V}(P_0,P_{1})\) and \(\mathbf{V}(\{ P_t\}_{0\le t\le 1})\). Besides, we no longer need the following assumption.

(A).

$$\begin{aligned} \varDelta L(\varepsilon _1,\varepsilon _2):=\sup \frac{L(t,x;u)-L(s,y;u)}{1+L(s,y;u)} \rightarrow 0\quad \hbox {as } \varepsilon _1, \varepsilon _2\downarrow 0, \end{aligned}$$
(2.14)

where the supremum is taken over all \((t,x)\) and \((s,y) \in [0,1]\times \mathbb {R}^d\) for which \(|t-s|<\varepsilon _1\), \(|x-y|<\varepsilon _2\), and over all \(u\in \mathbb {R}^d\).

This assumption can be used to prove the lower semicontinuity of the following (see [26], Chapter 9.1):

$$\begin{aligned} AC([0,1];\mathbb {R}^d)\ni f\mapsto \int _0^1 L\left( t,f(t);\frac{df(t)}{dt}\right) dt. \end{aligned}$$
(2.15)

We state additional assumptions and the improved versions of the Duality Theorems for \({V}(P_0,P_{1})\) and \(\mathbf{V}(\{ P_t\}_{0\le t\le 1})\).

(A.0). \(\sigma _{ij}\in C^{1}_b ([0,1]\times \mathbb {R}^d)\), \(i,j=1,\dots ,d\).

(A.3). (i) \(\partial _t L(t,x;u)\) and \(D_x L(t,x;u)\) are bounded on \([0,1]\times \mathbb {R}^d \times B_R\) for all \(R>0\), where \(B_R:=\{ x\in \mathbb {R}^d \,|\, |x|\le R\}\). (ii) \(C_L\) is finite, where

$$\begin{aligned} C_L:=\sup \left\{ \frac{L(t,x;u)}{1+L(t,y;u)}\biggr |0\le t\le 1, x, y, u \in \mathbb {R}^d\right\} . \end{aligned}$$
(2.16)
The Hamiltonian H is defined by

$$\begin{aligned} H(t,x;z):=\sup \{\langle z,u\rangle -L(t,x;u)|u\in \mathbb {R}^d\}. \end{aligned}$$
(2.17)

The following is a generalization of [41], in that we need neither the nondegeneracy of a nor assumption (A); it can be proved almost in the same way as in [41], by Proposition 1 and Lemma 1. Indeed, in our previous papers, using the nondegeneracy of a, we made use of the Cameron–Martin–Maruyama–Girsanov formula to prove the convexity of \(P\mapsto V(P_0,P)\), which we can now avoid by Proposition 1. The lower semicontinuity of \(P\mapsto V(P_0,P)\) can be proved by Proposition 1 and Lemma 1. In [59], a similar problem was considered via a general property of convex combinations of probability measures on an enlarged space, which allows one not to assume the nondegeneracy of a, though a condition similar to (A) was assumed.

One can also find details in [44] (see [24] for related topics). We refer readers to [15, 20, 29] for viscosity solutions.

Theorem 3

(Duality Theorem for V) Suppose that (A.0)–(A.3) hold. Then, for any \(P_0\), \(P_{1}\in \mathcal{P} (\mathbb {R}^d )\),

$$\begin{aligned} V(P_0,P_1) & = {} \mathrm{v}(P_0,P_{1})=\tilde{\mathrm{v}}(P_0,P_{1})\nonumber \\ & = {} \sup _{f\in C_b^\infty (\mathbb {R}^d)} \biggl \{\int _{\mathbb {R}^d}f(x)P_1 (dx)-\int _{\mathbb {R}^d}\varphi (0,x;f)P_0(dx)\biggr \}, \end{aligned}$$
(2.18)

where \(\varphi (t,x;f)\) denotes the minimal bounded continuous viscosity solution to the following HJB equation on \([0,1)\times \mathbb {R}^d\):

$$\begin{aligned} \partial _t\varphi (t,x)+ \frac{1}{2}\langle a(t,x), D_x^2 \varphi (t,x)\rangle +H(t,x;D_x \varphi (t,x)) & = {} 0,\nonumber \\ \varphi (1,\cdot ) & = {} f. \end{aligned}$$
(2.19)

We introduce the following condition in order to replace \(\varphi\) in (2.18) by classical solutions to the HJB equation (2.19).

(A.4). (i) “\(\sigma\) is the identity”, or “\(\sigma (\cdot )=(\sigma _{ij}(\cdot ))_{i,j=1}^d\) is uniformly nondegenerate, \(\sigma _{ij}\in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\), \(i,j=1,\ldots ,d\), and there exist functions \(L_1\) and \(L_2\) such that \(L=L_1 (t,x)+L_2 (t,u)\)”. (ii) \(L(t,x;u)\in C^1([0,1]\times \mathbb {R}^d \times \mathbb {R}^d;[0,\infty ))\) and is strictly convex in u. (iii) \(L\in C^{1,2,0}_b ([0,1]\times \mathbb {R}^d\times B_R )\) for any \(R>0\).

Since (A.4,i), (A.4,ii), and (A.4,iii) imply (A.0), (A.1), and (A.3,i), respectively, the following holds from Theorem 3, in the same way as in [41] (see also [44]).

Corollary 1

Suppose that (A.2), (A.3,ii), and (A.4) hold. Then (2.18) holds even if the supremum is taken over all classical solutions \(\varphi \in C_b^{1,2} ([0,1]\times \mathbb {R}^d)\) to the HJB equation (2.19). Besides, for any \(P_0, P_1\in \mathcal{P}(\mathbb {R}^d)\) for which \(V (P_0, P_1)\) is finite, a minimizer \(\{X(t)\}_{0\le t\le 1}\) of \(V (P_0, P_1)\) exists and the following holds: for any maximizing sequence \(\{\varphi _n\}_{n\ge 1}\) of (2.18),

$$\begin{aligned} 0 & = {} \lim _{n\rightarrow \infty } E\biggl [\int _0^1 |L(t,X (t);\beta _X(t,X)) \nonumber \\&-\{\langle \beta _X(t,X), D_x \varphi _n (t,X(t))\rangle -H(t,X (t);D_x\varphi _n(t,X(t)))\}|dt\biggr ]. \end{aligned}$$
(2.20)

In particular, there exists a subsequence \(\{n_k \}_{k\ge 1}\) for which

$$\begin{aligned} \beta _X(t,X) = \lim _{k\rightarrow \infty } D_z H(t,X(t);D_x \varphi _{n_k}(t,X(t))),\quad dtdP\hbox {-a.e.} \end{aligned}$$
(2.21)

The following is also a generalization of [41] and can be proved almost in the same way as in [41] by Proposition 1 and Lemma 1.

Theorem 4

(Duality Theorem for \(\mathbf{V}\)) Suppose that (A.0)-(A.3) hold. Then for any \(\mathbf{P}:=\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\),

$$\begin{aligned}&\mathbf{V} (\mathbf{P})=\mathbf{v} (\mathbf{P})=\tilde{\mathbf{v}} (\mathbf{P} )\nonumber \\&\quad =\sup _{f\in C_b^\infty ([0,1]\times \mathbb {R}^d)} \biggl \{\int _0^1\int _{\mathbb {R}^d}f(t,x)dtP_t (dx) -\int _{\mathbb {R}^d}\phi (0,x;f)P_0(dx)\biggr \}, \end{aligned}$$
(2.22)

where \(\phi (t,x;f)\) denotes the minimal bounded continuous viscosity solution of the following HJB equation on \([0,1)\times \mathbb {R}^d\):

$$\begin{aligned} \partial _t\phi (t,x)+\frac{1}{2}\langle a(t,x), D_x^2 \phi (t,x)\rangle +H(t,x;D_x \phi (t,x))+f(t,x) & = {} 0,\nonumber \\ \phi (1,x) & = {} 0. \end{aligned}$$
(2.23)

Suppose that (A.4) holds instead of (A.0), (A.1), and (A.3,i). Then (2.22) holds even if the supremum is taken over all classical solutions \(\phi \in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\) to the HJB equation (2.23). Besides, if \(\mathbf{V} (\mathbf{P})\) is finite, then a minimizer \(\{X(t)\}_{0\le t\le 1}\) of \(\mathbf{V} (\mathbf{P})\) exists and the following holds: for any maximizing sequence \(\{\phi _n\}_{n\ge 1}\) of (2.22),

$$\begin{aligned} 0 & = {} \lim _{n\rightarrow \infty } E\biggl [\int _0^1 |L(t,X (t);\beta _X(t,X)) \nonumber \\&-\{\langle \beta _X(t,X), D_x \phi _n (t,X(t))\rangle -H(t,X (t);D_x\phi _n(t,X(t)))\}|dt\biggr ]. \end{aligned}$$
(2.24)

In particular, there exists a subsequence \(\{n_k \}_{k\ge 1}\) for which

$$\begin{aligned} \beta _X(t,X) = \lim _{k\rightarrow \infty } D_z H(t,X(t);D_x \phi _{n_k}(t,X(t))),\quad dtdP\hbox {-a.e.} \end{aligned}$$
(2.25)

Remark 6

(See [41, 44]) (i) Suppose that (A.0)–(A.3) hold. Then for any \(f\in UC_b (\mathbb {R}^d)\), the following is the minimal bounded continuous viscosity solution of the HJB equation (2.19):

$$\begin{aligned} \varphi (t,x;f)= \sup _{X\in \mathcal{A}_t, X (t)=x}E\biggl [f(X(1))-\int _t^1 L(s, X(s);\beta _X(s,X))ds\biggl ], \end{aligned}$$
(2.26)

where \(\mathcal{A}_t\) denotes \(\mathcal{A}\) with the time interval [0, 1] replaced by [t, 1]. (ii) Suppose that (A.0)–(A.3) with L replaced by \(L(t,x;u)-f(t,x)\) hold. Then the following is the minimal bounded continuous viscosity solution of the HJB equation (2.23):

$$\begin{aligned} \phi (t,x;f)= \sup _{X\in \mathcal{A}_t,X (t)=x} E\biggl [\int _t^1 \left\{ f(s,X(s))-L(s, X(s);\beta _X(s,X))\right\} ds\biggl ]. \end{aligned}$$
(2.27)

We consider Schrödinger’s and Nelson’s problems, i.e., \(V_S\) and \(\mathbf{V}_N\). We introduce a new assumption.

(A.4)’. (1.19) holds, \(\sigma (\cdot )=(\sigma _{ij}(\cdot ))_{i,j=1}^d\) is uniformly nondegenerate, and \(a\in C^{1,2}_b ([0, 1] \times \mathbb {R}^d;M(d,\mathbb {R})), \xi \in C^{1,2}_b ([0, 1] \times \mathbb {R}^d;\mathbb {R}^d)\).

(A.4)’ implies (A.0)–(A.3). Besides, for \(f\in C^3_b (\mathbb {R}^d)\) and for \(f\in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\), respectively, the HJB equations (2.19) and (2.23) have unique classical solutions in \(C^{1,2}_b ([0,1]\times \mathbb {R}^d)\). They are also the minimal bounded continuous viscosity solutions of (2.19) and (2.23), respectively, since they have the same representation formulas as those given in Remark 6 (see, e.g. [20, 22] on classical solutions and Lemma 4.5 in [41] on viscosity solutions). In particular, the following holds even though (A.4)’ does not imply (A.4).

Corollary 2

Suppose that (A.4)’ holds. Then the assertions in Corollary 1 and Theorem 4 hold.

Remark 7

If (1.19) holds, then

$$\begin{aligned} L(t,x;u)- \{\langle u, z\rangle -H(t,x;z)\} =\frac{1}{2}|\sigma (t,x)^{-1}(a(t,x)z-u+\xi (t,x))|^2. \end{aligned}$$
(2.28)
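Indeed, for L as in (1.19), the Hamiltonian (2.17) can be computed explicitly: substituting \(u=\xi (t,x)+\sigma (t,x)v\),

$$\begin{aligned} H(t,x;z)=\sup _{v\in \mathbb {R}^d}\left\{ \langle z,\xi (t,x)\rangle +\langle \sigma (t,x)^*z,v\rangle -\frac{1}{2}|v|^2\right\} =\langle z,\xi (t,x)\rangle +\frac{1}{2}\langle a(t,x)z,z\rangle , \end{aligned}$$

with the supremum attained at \(v=\sigma (t,x)^*z\), i.e., at \(u=\xi (t,x)+a(t,x)z\); expanding the square on the right-hand side of (2.28) then verifies the identity directly.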

In the rest of this section, we prove Proposition 1.

Proof of Proposition 1

We prove (i). For \(\{X(t)\}_{0\le t\le 1}\in \mathcal{A}\), by Jensen’s inequality,

$$\begin{aligned} E\biggl [\int _0^1 L(t,X (t);\beta _X (t,X))dt \biggr ] \ge E\biggl [\int _0^1 L(t,X (t);b_X(t,X(t)))dt \biggr ]. \end{aligned}$$
(2.29)

Theorem 1 implies the first equalities of (2.9)–(2.10) (see Remark 3, (ii)).

For \(\nu \in \tilde{\mathcal{A}}\),

$$\begin{aligned} b_\nu (t,x):=\int _{\mathbb {R}^d}u\nu (t,x,du), \end{aligned}$$
(2.30)

where \(\nu (t,x,du)\) denotes a regular conditional probability of \(\nu\) given \((t,x)\). Then by Jensen’s inequality,

$$\begin{aligned} \int _{[0,1]\times \mathbb {R}^d\times \mathbb {R}^d}L(t,x;u)\nu (dtdxdu) \ge \int _0^1 dt\int _{\mathbb {R}^d}L(t,x;b_\nu (t,x))\nu _{1,t} (dx). \end{aligned}$$
(2.31)

\(b_\nu \in \mathbf{A} (\{\nu _{1,t}\}_{0\le t\le 1})\) from (2.5), since by Jensen’s inequality,

$$\begin{aligned} \int _{[0,1]\times \mathbb {R}^d}|b_\nu (t,x)|dt\nu _{1,t}(dx) \le \int _{[0,1]\times \mathbb {R}^d\times \mathbb {R}^d}|u|\nu (dtdxdu)<\infty , \end{aligned}$$

and for any \(t\in [0,1]\) and \(f\in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\),

$$\begin{aligned}&\int _{[0,t]\times \mathbb {R}^d\times \mathbb {R}^d} \langle u, D_x f(s,x)\rangle \nu (dsdxdu)\nonumber \\&\quad =\int _0^t ds \int _{\mathbb {R}^d} \langle b_\nu (s,x), D_x f(s,x)\rangle \nu _{1,s} (dx). \end{aligned}$$
(2.32)

This implies the second equalities of (2.9)–(2.10) (see Remark 4).

The proof of (ii) follows from Lemma 1, (2.32), and Theorem 1.

We prove (iii). From (2.29) and the strict convexity of \(u\mapsto L(t,x;u)\), (2.11) holds. For \(b\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\), \(P_t(dx)\ll dx\), dt-a.e. from (A.0.0,ii), since \(a, b\in L^1([0,1]\times \mathbb {R}^d,dtP_t(dx))\) (see [4], p. 1042, Corollary 2.2.2). For \(\{p_{i}(t,x)dx\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R}^d)\), \(b_i\in \mathbf{A} (\{p_{i}(t,x)dx\}_{0\le t\le 1})\), \(i=0,1\), and \(\lambda \in [0,1]\), set

$$\begin{aligned} p_\lambda :=(1-\lambda ) p_0+\lambda p_1 ,\quad b_\lambda :=1_{(0,\infty )}(p_\lambda )\frac{(1-\lambda )p_0b_0+\lambda p_{1}b_1}{p_\lambda }, \end{aligned}$$
(2.33)

where \(1_A(x)\) denotes the indicator function of \(A\subset \mathbb {R}\). Then \(b_\lambda \in \mathbf{A}(\{p_\lambda (t,x)dx\}_{0\le t\le 1})\) and, by Jensen’s inequality applied with the weights \((1-\lambda )p_0/p_\lambda\) and \(\lambda p_1/p_\lambda\),

$$\begin{aligned}&\int _0^1 dt \int _{\mathbf{R }^d}L(t,x;b_\lambda (t,x))p_\lambda (t,x)dx\nonumber \\&\quad \le (1-\lambda )\int _0^1 dt \int _{\mathbf{R }^d}L(t,x;b_0 (t,x))p_0(t,x)dx\nonumber \\&\qquad +\lambda \int _0^1 dt \int _{\mathbf{R }^d}L(t,x;b_1 (t,x))p_1 (t,x)dx. \end{aligned}$$
(2.34)

Here the equality holds if and only if \(b_0=b_1\) dtdx-a.e. on the set \(\{(t,x)\in [0,1]\times \mathbb {R}^d| p_0 (t,x)p_1 (t,x)>0\}\). \(\square\)

3 Stochastic optimal transport with a nonconvex cost

In this section, in the case where \(d=1\) and where a is not fixed, we consider slightly relaxed versions of the SOTPs whose cost functions are not assumed to be convex. The test of the Markov property of a minimizer of a stochastic optimal control problem is known as a fundamental problem in stochastic optimal control theory. We also consider the Markov property of the minimizer of a finite-time horizon stochastic optimal control problem. Our previous result [35] proved this in a one-dimensional case via the optimal transportation problem with a concave cost. We generalize it by Theorem 2 in Sect. 1.

Since a is not fixed in this section, we consider a new class of semimartingales.

Let \(u=\{u(t)\}_{0\le t\le 1}\) and \(\{W(t)\}_{0\le t\le 1}\) be a progressively measurable real-valued process and a one-dimensional Brownian motion on the same complete filtered probability space, respectively. The probability space under consideration is not fixed in this section. Let \(\sigma :[0,1]\times \mathbb {R}\longrightarrow \mathbb {R}\) be a Borel measurable function. Let \(Y^{u,\sigma }=\{Y^{u,\sigma }(t)\}_{0\le t\le 1}\) be a continuous semimartingale such that the following holds weakly:

$$\begin{aligned} Y^{u,\sigma }(t)=Y^{u,\sigma }(0)+\int _0^t u (s)ds +\int _0^t \sigma (s,Y^{u,\sigma }(s))dW(s), \quad 0\le t\le 1, \end{aligned}$$
(3.1)

provided it exists.

For \(r> 0\),

$$\begin{aligned} \mathcal{U}_{r}: & = {} \left\{ (u,\sigma ) \biggl | E\left[ \int _0^1 \left( \frac{\sigma (t,Y^{u,\sigma }(t))^2}{1+|Y^{u,\sigma }(t)|^2}+|u(t)|\right) dt\right] <\infty ,|\sigma |\ge r\right\} , \end{aligned}$$
(3.2)
$$\begin{aligned} \mathcal{U}_{r,Mar}: & = {} \left\{ (u,\sigma ) \in \mathcal{U}_{r}| u(\cdot )=b_{Y^{u,\sigma }}(\cdot ,Y^{u,\sigma }(\cdot ))\right\} , \end{aligned}$$
(3.3)

where \(b_{Y^{u,\sigma }}(t,Y^{u,\sigma }(t)):=E[u(t)|(t,Y^{u,\sigma }(t))]\). For \((u,\sigma )\in \mathcal{U}_{r}\),

$$\begin{aligned} F^{Y^{u, \sigma }}_t (x): & = {} P(Y^{u, \sigma }(t)\le x), \end{aligned}$$
(3.4)
$$\begin{aligned} G^u_t (x): & = {} P(u(t)\le x), \end{aligned}$$
(3.5)
$$\begin{aligned} \tilde{b}_{u, Y^{u, \sigma }}(t,x): & = {} (G^u_t)^{-1} (1-F^{Y^{u, \sigma }}_t (x)),\quad (t,x)\in [0,1]\times \mathbb {R}. \end{aligned}$$
(3.6)

Here for a distribution function F on \(\mathbb {R}\),

$$\begin{aligned} F^{-1}(v):=\inf \{x\in \mathbb {R}| F(x)\ge v\},\quad 0<v<1. \end{aligned}$$

\(F^{-1}\) is called the quasi-inverse of F (see, e.g. [48, 51, 57]).
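For illustration, a sketch of (3.6) from samples (our own, in an assumed setting): the quasi-inverse turns (3.6) into an explicit recipe that assigns the largest values of the drift to the smallest values of the state (an anticomonotone rearrangement).

```python
import numpy as np

def quasi_inverse(samples):
    """Empirical quasi-inverse F^{-1}(v) = inf{x : F(x) >= v}, 0 < v <= 1."""
    s = np.sort(samples)
    n = len(s)
    def F_inv(v):
        idx = np.clip(np.ceil(np.asarray(v) * n).astype(int) - 1, 0, n - 1)
        return s[idx]
    return F_inv

rng = np.random.default_rng(2)
y = rng.normal(size=1000)           # samples of Y^{u,sigma}(t)
u = rng.exponential(size=1000)      # samples of u(t)
F_Y = lambda x: (y[:, None] <= np.asarray(x)[None, :]).mean(axis=0)
b_tilde = lambda x: quasi_inverse(u)(1.0 - F_Y(x))   # (3.6)
print(b_tilde(np.array([-2.0, 0.0, 2.0])))           # nonincreasing in x
```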

The density

$$\begin{aligned} p^{Y^{u, \sigma }}(t,x):=\frac{P^{Y^{u,\sigma }(t)}(dx)}{dx} \end{aligned}$$
(3.7)

exists dt-a.e. since r is positive and \((\sigma ^2, b_{Y^{u,\sigma }})\in \mathbf{A}_0(\{P^{Y^{u,\sigma }(t)}\}_{0\le t\le 1})\) (see [4], p. 1042, Corollary 2.2.2). Indeed, by Jensen’s inequality,

$$\begin{aligned} \int _{\mathbb {R}} |b_{Y^{u, \sigma }}(t,y)|p^{Y^{u, \sigma }}(t,y)dy =E[|E[u(t)|(t,Y^{u, \sigma }(t))]|]\le E[|u(t)|]. \end{aligned}$$
(3.8)

From the idea of covariance kernels (see [6, 7, 34, 39]), define

$$\begin{aligned}&\tilde{a}_{u,Y^{u, \sigma }}(t,x)\nonumber \\&\quad :=1_{(0,\infty )} (p^{Y^{u, \sigma }}(t,x)) \frac{2\int _{-\infty }^x (\tilde{b}_{u,Y^{u, \sigma }}(t,y) -b_{Y^{u, \sigma }}(t,y))p^{Y^{u, \sigma }}(t,y)dy}{p^{Y^{u, \sigma }}(t,x)}. \end{aligned}$$
(3.9)

The following holds and will be proved later.

Theorem 5

Let \(r> 0\). For \((u,\sigma )\in \mathcal{U}_{r}\), there exists \(\tilde{u}\) such that \((\tilde{u},\tilde{\sigma }:=(\sigma ^2+\tilde{a}_{u,Y^{u,\sigma }})^{\frac{1}{2}})\in \mathcal{U}_{r,Mar}\) and that the following holds:

$$\begin{aligned} P^{Y^{\tilde{u},\tilde{\sigma }}(t)} & = {} P^{Y^{u,\sigma }(t)}, \quad t\in [0,1], \end{aligned}$$
(3.10)
$$\begin{aligned} b_{Y^{\tilde{u},\tilde{\sigma }}} & = {} \tilde{b}_{u,Y^{u,\sigma }}, \end{aligned}$$
(3.11)
$$\begin{aligned} P^{b_{Y^{\tilde{u},\tilde{\sigma }}}(t,Y^{\tilde{u},\tilde{\sigma }}(t))} & = {} P^{u(t)},\quad dt\mathrm{-a.e.} \end{aligned}$$
(3.12)

For \(r> 0\) and \(\{P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R})\),

$$\begin{aligned} \mathbf{A}_{0,r}(\{P_t\}_{0\le t\le 1}): & = {} \biggl \{ (a,b) \in \mathbf{A}_{0}(\{P_t\}_{0\le t\le 1})\biggl | a\ge r^2,\nonumber \\&\int _0^1 dt\int _{\mathbb {R}}\left( \frac{a(t,x)}{1+|x|^2}+|b(t,x)|\biggl )P_t(dx)<\infty \right\} . \end{aligned}$$
(3.13)

Let \(L_1, L_2:[0,1]\times \mathbb {R}\longrightarrow [0,\infty )\) be Borel measurable.

For \((u,\sigma )\in \mathcal{U}_{r}\),

$$\begin{aligned} J(u,\sigma ):=E\biggl [\int _0^1 (L_1 (t,Y^{u,\sigma } (t))+L_2 (t,u(t)))dt\biggl ]. \end{aligned}$$
(3.14)

For \((a,b)\in \mathbf{A}_{0}(\{P_t\}_{0\le t\le 1})\),

$$\begin{aligned} I(\{P_t\}_{0\le t\le 1},a,b):=\int _0^1 dt\int _{\mathbb {R}}(L_1 (t,x)+L_2(t,b(t,x)))P_t(dx). \end{aligned}$$
(3.15)

One easily obtains the following from Theorems 2 and 5.

Corollary 3

Suppose that \(L_1, L_2:[0,1]\times \mathbb {R}\longrightarrow [0,\infty )\) are Borel measurable.

Then for any \(r> 0\), the following holds. (i) For any \(P_0,P_1\in \mathcal{P}(\mathbb {R})\),

$$\begin{aligned}&\inf \{J(u,\sigma )| (u,\sigma )\in \mathcal{U}_{r}, P^{Y^{u,\sigma }(t)}=P_t, t=0,1\}\nonumber \\&\quad =\inf \{J(u,\sigma )|(u,\sigma )\in \mathcal{U}_{r, Mar}, P^{Y^{u,\sigma }(t)}=P_t, t=0,1\}\nonumber \\&\quad =\inf \{I(\{Q_t\}_{0\le t\le 1},a,b)| (a,b)\in \mathbf{A}_{0,r}(\{Q_t\}_{0\le t\le 1}), Q_t=P_t, t=0,1\}. \end{aligned}$$
(3.16)

In particular, if there exists a minimizer in (3.16), then there exists a minimizer \((u,\sigma )\in \mathcal{U}_{r, Mar}\). (ii) For any \(\{P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R})\),

$$\begin{aligned}&\inf \{J(u,\sigma )| (u,\sigma )\in \mathcal{U}_{r}, P^{Y^{u,\sigma }(t)}=P_t, 0\le t\le 1\}\nonumber \\&\quad =\inf \{J(u,\sigma )|(u,\sigma )\in \mathcal{U}_{r, Mar}, P^{Y^{u,\sigma }(t)}=P_t, 0\le t\le 1\} \nonumber \\&\quad =\inf \{I(\{P_t\}_{0\le t\le 1},a,b)| (a,b)\in \mathbf{A}_{0,r}(\{P_t\}_{0\le t\le 1})\}. \end{aligned}$$
(3.17)

In particular, if there exists a minimizer in (3.17), then there exists a minimizer \((u,\sigma )\in \mathcal{U}_{r,Mar}\).

Suppose that \(L:[0,1]\times \mathbb {R}\times \mathbb {R}\longrightarrow [0,\infty ), \Psi :\mathbb {R}\longrightarrow [0,\infty )\) are Borel measurable. Then for any \(P_0\in \mathcal{P}(\mathbb {R})\),

$$\begin{aligned}&\inf _{\begin{array}{c} {\scriptstyle (u,\sigma )\in \mathcal{U}_{r},} \\ {{\scriptstyle P^{Y^{u,\sigma }(0)}=P_0}} \end{array}} E\left[ \int _0^1 L(t,Y^{u, \sigma }(t);u(t))dt+\Psi (Y^{u, \sigma } (1))\right] \nonumber \\&\quad =\inf _{P\in \mathcal{P}(\mathbb {R})}\left\{ V_r(P_0,P)+\int _{\mathbb {R}} \Psi (x) P(dx)\right\} , \end{aligned}$$
(3.18)

where \(V_r\) denotes V with \(\mathcal{A}\) replaced by \(\{Y^{u, \sigma }|(u, \sigma )\in \mathcal{U}_{r}\}\).

In particular, we easily obtain the following from Corollary 3.

Corollary 4

In addition to the assumption of Corollary 3, suppose that \(\Psi :\mathbb {R}\longrightarrow [0,\infty )\) is Borel measurable. Then for any \(r> 0\) and \(P_0\in \mathcal{P}(\mathbb {R})\),

$$\begin{aligned}&\inf \{J(u,\sigma )+E[\Psi (Y^{u,\sigma }(1))]| (u,\sigma )\in \mathcal{U}_{r}, P^{Y^{u,\sigma }(0)}=P_0\}\nonumber \\&\quad =\inf \{J(u,\sigma )+E[\Psi (Y^{u,\sigma }(1))]|(u,\sigma )\in \mathcal{U}_{r, Mar}, P^{Y^{u,\sigma }(0)}=P_0\}. \end{aligned}$$
(3.19)

In particular, if there exists a minimizer in (3.19), then there exists a minimizer \((u,\sigma )\in \mathcal{U}_{r, Mar}\).

We prove Theorem 5 using Theorem 2.

Proof of Theorem 5

For \((u,\sigma )\in \mathcal{U}_{r}\), the following holds (see [35]):

$$\begin{aligned} \tilde{a}_{u,Y^{u, \sigma }}(t,\cdot )\ge 0, \quad P^{\tilde{b}_{u,Y^{u, \sigma }}(t,Y^{u, \sigma } (t))}=P^{u(t)},\quad dt\mathrm{-a.e.} \end{aligned}$$
(3.20)

Indeed,

$$\begin{aligned} \int _{-\infty }^x \tilde{b}_{u,Y^{u, \sigma }}(t,y)p^{Y^{u, \sigma }}(t,y)dy & = {} E[(G^u_t)^{-1} (1-F^{Y^{u, \sigma }}_t (Y^{u, \sigma } (t)));Y^{u, \sigma } (t)\le x],\\ \int _{-\infty }^{x} b_{Y^{u, \sigma }}(t,y)p^{Y^{u, \sigma }}(t,y)dy & = {} E[E[u(t)|(t,Y^{u, \sigma }(t))];Y^{u, \sigma } (t)\le x]\\ & = {} E[u(t);Y^{u, \sigma }(t)\le x]. \end{aligned}$$

For an \(\mathbb {R}^2\)-valued random variable \(Z=(X,Y)\) on a probability space,

$$\begin{aligned} E[Y;X\le x] & = {} \int _0^\infty \{F_X(x)-F_Z(x,y)\}dy-\int _{-\infty }^0 F_Z(x,y)dy,\\ F_Z(x,y)\ge & {} \max (F_X(x)+F_Y(y)-1,0)\\ & = {} P(F_X^{-1} (U)\le x,F_Y^{-1} (1-U)\le y), \end{aligned}$$

where \(F_X\) denotes the distribution function of X and U is a uniformly distributed random variable on [0, 1]. The distribution functions of \(F_X^{-1} (U)\) and \(F_Y^{-1} (1-U)\) are \(F_X\) and \(F_Y\), respectively. From (3.7), \(F^{Y^{u, \sigma }}_t (Y^{u, \sigma }(t))\) is uniformly distributed on [0, 1] and \((F^{Y^{u, \sigma }}_t)^{-1} (F^{Y^{u, \sigma }}_t (Y^{u, \sigma } (t)))=Y^{u, \sigma }(t)\), P-a.s., dt-a.e. (see [17] or, e.g. [48, 51, 57]). Applying the above with \(X=Y^{u,\sigma }(t)\), \(Y=u(t)\), and \(U=F^{Y^{u, \sigma }}_t (Y^{u, \sigma }(t))\), we obtain \(E[u(t);Y^{u,\sigma }(t)\le x]\le E[\tilde{b}_{u, Y^{u,\sigma }}(t,Y^{u,\sigma }(t));Y^{u,\sigma }(t)\le x]\), so that \(\int _{-\infty }^x (\tilde{b}_{u,Y^{u, \sigma }}(t,y)-b_{Y^{u, \sigma }}(t,y))p^{Y^{u, \sigma }}(t,y)dy\ge 0\), which proves the first assertion in (3.20). The second follows since \(\tilde{b}_{u, Y^{u,\sigma }}(t,Y^{u,\sigma }(t))=(G^u_t)^{-1} (1-U)\) has the distribution function \(G^u_t\).

It is easy to see that the following holds from (3.8) and (3.20):

$$\begin{aligned} (\sigma ^2+\tilde{a}_{u,Y^{u,\sigma }}, \tilde{b}_{u,Y^{u,\sigma }})\in \mathbf{A}_0(\{P^{Y^{u,\sigma }(t)}\}_{0\le t\le 1}). \end{aligned}$$

Indeed, from (3.20), the following holds:

$$\begin{aligned} E\left[ \int _0^1|\tilde{b}_{u,Y^{u,\sigma }}(t,Y^{u,\sigma } (t))|dt\right] =E\left[ \int _0^1|u(t)|dt\right] <\infty . \end{aligned}$$
(3.21)

The following will be proved below:

$$\begin{aligned} E\left[ \int _0^1 \frac{\tilde{a}_{u,Y^{u,\sigma }}(t,Y^{u,\sigma }(t))}{1+|Y^{u,\sigma }(t)|^2}dt\right] <\infty . \end{aligned}$$
(3.22)

(3.21)–(3.22) and Theorem 2 complete the proof. We prove (3.22).

$$\begin{aligned}&E\left[ \int _0^1 \frac{\tilde{a}_{u,Y^{u,\sigma }}(t,Y^{u,\sigma }(t))}{2(1+|Y^{u,\sigma }(t)|^2)}dt\right] \nonumber \\&\quad =\int _0^1dt\int _{\mathbb {R}}\frac{1}{1+x^2}dx\int _{-\infty }^x (\tilde{b}_{u,Y^{u,\sigma }}(t,y) -b_{Y^{u,\sigma }}(t,y))p^{Y^{u,\sigma }}(t,y)dy. \end{aligned}$$
(3.23)

From (3.20),

$$\begin{aligned} \int _{-\infty }^\infty \tilde{b}_{u,Y^{u,\sigma }}(t,y)p^{Y^{u,\sigma }}(t,y)dy & = {} E[u(t)]=E[E[u(t)|(t,Y^{u,\sigma }(t))]]\nonumber \\ & = {} \int _{-\infty }^\infty b_{Y^{u,\sigma }}(t,y)p^{Y^{u,\sigma }}(t,y)dy,\quad dt\hbox {-a.e.} \end{aligned}$$
(3.24)

In particular, the following holds dt-a.e.:

$$\begin{aligned}&\int _{\mathbb {R}}\frac{1}{1+x^2}dx\int _{-\infty }^x (\tilde{b}_{u,Y^{u,\sigma }}(t,y) -b_{Y^{u,\sigma }}(t,y))p^{Y^{u,\sigma }}(t,y)dy\nonumber \\&\quad =\int _{-\infty }^0\frac{1}{1+x^2}dx\int _{-\infty }^x (\tilde{b}_{u,Y^{u,\sigma }}(t,y) -b_{Y^{u,\sigma }}(t,y))p^{Y^{u,\sigma }}(t,y)dy\nonumber \\&\qquad -\int _0^{\infty }\frac{1}{1+x^2}dx\int _x^{\infty } (\tilde{b}_{u,Y^{u,\sigma }}(t,y) -b_{Y^{u,\sigma }}(t,y))p^{Y^{u,\sigma }}(t,y)dy\nonumber \\&\quad =\int _{-\infty }^0(\tilde{b}_{u,Y^{u,\sigma }}(t,y) -b_{Y^{u,\sigma }}(t,y))p^{Y^{u,\sigma }}(t,y)dy\int _{y}^0\frac{1}{1+x^2}dx\nonumber \\&\qquad -\int _0^{\infty }(\tilde{b}_{u,Y^{u,\sigma }}(t,y) -b_{Y^{u,\sigma }}(t,y))p^{Y^{u,\sigma }}(t,y)dy\int _0^y\frac{1}{1+x^2}dx\nonumber \\&\quad \le \int _{\mathbb {R}}|\arctan y|( |\tilde{b}_{u,Y^{u,\sigma }}(t,y)|+|b_{Y^{u,\sigma }}(t,y)|)p^{Y^{u,\sigma }}(t,y)dy. \end{aligned}$$
(3.25)

Since \(|\arctan y|\) is bounded, (3.8) and (3.21) complete the proof of (3.22).

\(\square\)

4 Semiconcavity and continuity of Schrödinger’s Problem

Proposition 1 and Lemma 1 imply that \(\mathcal{P} (\mathbb {R}^d \times \mathbb {R}^d )\ni P\times Q\mapsto V(P,Q)\) is convex and lower semicontinuous.

In this section, we give a sufficient condition under which, for a fixed \(Q\in \mathcal{P} (\mathbb {R}^d )\), \(L^2 (\varOmega , P;\mathbb {R}^d)\ni X\mapsto V_S(P^X,Q)\) is semiconcave and continuous (see (1.19) for notation). More precisely, we show that there exists \(C>0\) such that for a fixed \(Q\in \mathcal{P} (\mathbb {R}^d )\),

$$\begin{aligned} L^2 (\varOmega , P;\mathbb {R}^d)\ni X\mapsto V_S(P^X,Q)-CE[|X|^2] \end{aligned}$$

is concave and continuous. Here \(L^2 (\varOmega , P;\mathbb {R}^d)\) denotes the space of all square-integrable functions from a probability space \((\varOmega , \mathcal{F}, P)\) to \((\mathbb {R}^d,\mathbf{B}(\mathbb {R}^d))\). Let \(W_2\) denote the Wasserstein distance of order 2, i.e. \(T^{1/2}\) with \(c=|y-x|^2\) in Remark 5. We also show the Lipschitz continuity of \(\mathcal{P}_{2} (\mathbb {R}^d )\ni P\mapsto V_S(P,Q)\) in \(W_2\) (see (4.4) for notation).

We first describe the assumptions in this section.

(A.5) \(\sigma (t,x)=(\sigma ^{ij}(t,x))_{i,j=1}^d\), \((t,x)\in [0,1]\times \mathbb {R}^d\), is a \(d\times d\)-matrix. \(a(t,x):=\sigma (t,x)\sigma (t,x)^*\), \((t,x)\in [0,1]\times \mathbb {R}^d\), is uniformly nondegenerate, bounded, once continuously differentiable, and uniformly Hölder continuous. \(D_x a(t,x)\) is bounded and the first derivatives of \(a(t,x)\) are uniformly Hölder continuous in x uniformly in \(t\in [0,1]\).

(A.6) \(\xi (t,x):[0,1]\times \mathbb {R}^d\longrightarrow \mathbb {R}^d\) is bounded, continuous, and uniformly Hölder continuous in x uniformly in \(t\in [0,1]\).

Remark 8

(A.5)–(A.6) imply (A.0.0), (A.1), and (A.2) for (1.19). (A.4)’ implies (A.0)–(A.3) and (A.5)–(A.6).

We state the following fact.

Theorem 6

Suppose that (A.5)–(A.6) hold. Then for any \(P_0\in \mathcal{P}(\mathbb {R}^d)\), the following SDE has a unique weak solution with a positive continuous transition probability density \(p(t,x;s,y)\), \(0\le t<s\le 1\), \(x,y\in \mathbb {R}^d\):

$$\begin{aligned} d\mathbf{X}(t) & = {} \xi (t,\mathbf{X}(t))dt+\sigma (t,\mathbf{X}(t))dW_\mathbf{X}(t),\quad 0< t<1,\nonumber \\ P^{\mathbf{X} (0)} & = {} P_0 \end{aligned}$$
(4.1)

(see [28]). Besides, there exist constants \(C_1, C_2>0\) such that

$$\begin{aligned} -C_1+C_2^{-1}|x-y|^2\le -\log p(0,x;1,y)\le C_1+C_2|x-y|^2,\quad x,y\in \mathbb {R}^d \end{aligned}$$
(4.2)

(see [3, 22]).
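For instance (an elementary example), if \(\sigma =I_d\) and \(\xi =0\), then \(p(0,x;1,y)=(2\pi )^{-d/2}e^{-|x-y|^2/2}\), so that

$$\begin{aligned} -\log p(0,x;1,y)=\frac{d}{2}\log (2\pi )+\frac{1}{2}|x-y|^2, \end{aligned}$$

and (4.2) holds with \(C_1=(d/2)\log (2\pi )\) and \(C_2=2\).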

Remark 9

If \(V_S(P,Q)\) is finite, then the distribution of the minimizer X of \(V_S(P,Q)\) is absolutely continuous with respect to \(P^\mathbf{X}\). In particular, \(Q(dx)\ll dx\) under (A.5)–(A.6). Indeed, \(V_S(P,Q)\) is the relative entropy of \(P^{X}\) with respect to \(P^\mathbf{X}\) and \(P^{\mathbf{X}(1)}\) has a density (see the discussion below Remark 3).

We recall the definition of displacement convexity.

Definition 7

(Displacement convexity (see [32])) Let \(G:\mathcal{P}(\mathbb {R}^d )\longrightarrow \mathbb {R}\cup \{\infty \}\). G is displacement convex if, for any \(\rho _0, \rho _1\in \mathcal{P}(\mathbb {R}^d )\) and any convex function \(\varphi :\mathbb {R}^d\longrightarrow \mathbb {R}\cup \{\infty \}\), the following is convex:

$$\begin{aligned}{}[0,1]\ni t\mapsto G(\rho _t), \end{aligned}$$
(4.3)

where \(\rho _t:= \rho _0(id+t(D\varphi -id))^{-1}\), \(0< t<1\), provided \(\rho _1= \rho _0(D\varphi )^{-1}\) and \(\rho _t\) can be defined. Here id denotes the identity mapping.
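A minimal example (ours): for \(\varphi (x)=\frac{1}{2}|x|^2+\langle c,x\rangle\) with \(c\in \mathbb {R}^d\), one has \(D\varphi (x)=x+c\), so that

$$\begin{aligned} \rho _t(A)=\rho _0(A-tc),\quad A\in \mathbf{B}(\mathbb {R}^d),\ 0\le t\le 1, \end{aligned}$$

i.e., \(\rho _t\) is \(\rho _0\) translated by tc, and displacement convexity of G along this curve means convexity of \(t\mapsto G(\rho _0(\cdot -tc))\).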

Recall that a convex function is differentiable dx-a.e. in the interior of its domain (see, e.g. [61]) and that \(\rho _t\) in (4.3) is well defined if \(\rho _0\in \mathcal{P}_{2,ac} (\mathbb {R}^d )\) and \(\rho _1\in \mathcal{P}_{2} (\mathbb {R}^d )\) (see, e.g. [61]). Here

$$\begin{aligned} \mathcal{P}_2 (\mathbb {R}^d ): & = {} \left\{ P\in \mathcal{P} (\mathbb {R}^d ) \biggl |\int _{\mathbb {R}^d}|x|^2P(dx)<\infty \right\} , \end{aligned}$$
(4.4)
$$\begin{aligned} \mathcal{P}_{ac} (\mathbb {R}^d ): & = {} \{p (x)dx\in \mathcal{P} (\mathbb {R}^d )\}, \end{aligned}$$
(4.5)
$$\begin{aligned} \mathcal{P}_{2,ac} (\mathbb {R}^d ): & = {} \mathcal{P}_{2} (\mathbb {R}^d )\cap \mathcal{P}_{ac} (\mathbb {R}^d ). \end{aligned}$$
(4.6)

The following implies that \(L^2 (\varOmega , P;\mathbb {R}^d)\ni X\mapsto V_S(P^X,Q)\) is semiconcave for a fixed \(Q\in \mathcal{P}_{ac} (\mathbb {R}^d )\) and will be proved later.

Theorem 7

Suppose that (A.4)’ holds and that there exists a constant \(C>0\) such that \(x\mapsto \log p(0,x;1,y) +C|x|^2\) is convex for any \(y\in \mathbb {R}^d\). Then for any \(Q\in \mathcal{P}_{ac} (\mathbb {R}^d )\), \(X_i\in L^2 (\varOmega , P; \mathbb {R}^d), i=1,2\), and \(\lambda _1\in (0,1)\),

$$\begin{aligned} \sum _{i=1}^2\lambda _i V_S(P^{X_i},Q)-\lambda _1\lambda _2 CE[|X_1-X_2|^2] \le V_S(P^{\sum _{i=1}^2\lambda _i X_i},Q), \end{aligned}$$
(4.7)

where \(\lambda _2:=1-\lambda _1\). Equivalently, the following is convex:

$$\begin{aligned} L^2 (\varOmega , P;\mathbb {R}^d)\ni X\mapsto -V_S(P^X,Q)+CE[|X|^2]. \end{aligned}$$
(4.8)

In particular, the following is displacement convex:

$$\begin{aligned} \mathcal{P}_{2,ac} (\mathbb {R}^d )\ni P\mapsto -V_S(P,Q)+C\int _{\mathbb {R}^d}|x|^2P(dx). \end{aligned}$$
(4.9)

Remark 10

Suppose that \(a_{ij}=a_{ij} (x), \xi _i=\xi _i(x)\in C_b^\infty (\mathbb {R}^d)\) and that a(x) is uniformly nondegenerate. Then \(D_x^2\log p(0,x;1,y)\) is bounded (see [58], Theorem B). In particular, there exists a constant \(C>0\) such that for any \(y\in \mathbb {R}^d\), \(x\mapsto \log p(0,x;1,y) +C|x|^2\) is convex.

For \(P\in \mathcal{P}(\mathbb {R}^d )\),

$$\begin{aligned} \mathcal{S}(P):= {\left\{ \begin{array}{ll} \displaystyle \int _{\mathbb {R}^d}p(x)\log p(x) dx,&{}P(dx)=p(x)dx,\\ \displaystyle \infty ,&{}otherwise. \end{array}\right. } \end{aligned}$$
(4.10)
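For example (a direct computation), if \(Q=N(0,\sigma ^2I_d)\), the centered Gaussian with covariance \(\sigma ^2I_d\), \(\sigma >0\), then

$$\begin{aligned} \mathcal{S}(Q)=-\frac{d}{2}\left( \log (2\pi \sigma ^2)+1\right) , \end{aligned}$$

so \(\mathcal{S}(Q)\) is finite for every nondegenerate Gaussian Q.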

Let \(\mu (P,Q)\) denote the joint distribution at \(t=0, 1\) of the minimizer of \(V_S(P,Q)\), provided \(V_S(P,Q)\) is finite. The following implies that \(L^2 (\varOmega , P;\mathbb {R}^d)\ni X\mapsto V_S(P^X,Q)\) is continuous for a fixed \(Q\in \mathcal{P}_{ac} (\mathbb {R}^d )\) such that \(\mathcal{S}(Q)\) is finite.

The lower semicontinuity of \(\mathcal{P}(\mathbb {R}^d )\ni P\mapsto V_S(P,Q)\) is known and can be proved, e.g., from Proposition 1 and Lemma 1. That of (4.12) can be proved in the same way as Lemma 3.4 in [43]. We give the proof for the sake of completeness.

Theorem 8

Suppose that (A.5)–(A.6) hold. For \(P, Q\in \mathcal{P}_{2} (\mathbb {R}^d )\), if \(\mathcal{S}(Q)\) is finite, then \(V_S (P,Q)\) is finite and the following holds:

$$\begin{aligned} -V_S (P,Q) & = {} H(P\times Q| \mu (P,Q))-\mathcal{S}(Q)\nonumber \\&+\int _{\mathbb {R}^d\times \mathbb {R}^d} \log p(0,x;1,y) P(dx)Q(dy). \end{aligned}$$
(4.11)

In particular, the following is weakly lower semicontinuous:

$$\begin{aligned} \mathcal{P}_{2} (\mathbb {R}^d )\ni P\mapsto -V_S(P,Q)+C_2\int _{\mathbb {R}^d\times \mathbb {R}^d}|x-y|^2P(dx)Q(dy) \end{aligned}$$
(4.12)

(see (4.2) for notation). The following is also continuous in the topology induced by \(W_2\):

$$\begin{aligned} \mathcal{P}_{2} (\mathbb {R}^d )\ni P\mapsto V_S(P,Q). \end{aligned}$$
(4.13)

If \(\mathcal{S}(Q)\) is infinite, then so is \(V_S (P,Q)\).

Remark 11

For \(C>0\) and \(P, Q\in \mathcal{P} (\mathbb {R}^d )\),

$$\begin{aligned} \Psi _{Q,C} (P):=\mathcal{S}(P) -V_S(P,Q)+ C\int _{\mathbb {R}^d\times \mathbb {R}^d}|x-y|^2P(dx)Q(dy). \end{aligned}$$
(4.14)

\(\Psi _{Q,C} (P)\) plays a crucial role in the construction of moment measures by the SOTP (see [43, 44] and also [54] for the approach by the OTP). Since \(\mathcal{P}_{ac} (\mathbb {R}^d )\ni P\mapsto \mathcal{S}(P)\) is strictly displacement convex from Theorem 2.2 in [32], so is \(\mathcal{P}_{2, ac} (\mathbb {R}^d )\ni P\mapsto \Psi _{Q,C} (P)\) under the assumption of Theorem 7.

From Theorem 7, under stronger assumptions than in Theorem 8, we prove, for a fixed \(Q\in \mathcal{P}_{2,ac} (\mathbb {R}^d )\) such that \(\mathcal{S}(Q)\) is finite, that \(\mathcal{P}_2 (\mathbb {R}^d )\ni P\mapsto V_S(P,Q)\) is Lipschitz continuous in \(W_2\).

Corollary 5

Suppose that (A.4)’ holds and that there exists a constant \(C>0\) such that \(\log p(0,x;1,y) +C|x|^2\) is convex in x for any \(y\in \mathbb {R}^d\). Then for any \(Q\in \mathcal{P}_{2,ac} (\mathbb {R}^d )\) such that \(\mathcal{S}(Q)\) is finite, the following holds:

$$\begin{aligned}&|V_S(P_0,Q)-V_S(P_1,Q)|\nonumber \\&\quad \le f (\max (||x||_{L^2(P_0)}, ||x||_{L^2(P_1)}), ||x||_{L^2(Q)} )W_2(P_0,P_1), \quad P_0,P_1\in \mathcal{P}_2 (\mathbb {R}^d ), \end{aligned}$$
(4.15)

where \(||x||_{L^2(P)}:=(\int _{\mathbb {R}^d}|x|^2P(dx))^{1/2}\), \(P\in \mathcal{P}_2 (\mathbb {R}^d )\) and

$$\begin{aligned} f(x,y):=2C_2x^2+2(C_2y^2+C_1)+C \end{aligned}$$

(see (4.2) for notation). In particular, if \(p(0,x;1,y)=(2\pi a)^{-d/2}\exp (-|y-x|^2/(2a)), a>0\), then

$$\begin{aligned}&|V_S(P_0,Q)-V_S(P_1,Q)|\nonumber \\&\quad \le \frac{1}{2a}\{||x||_{L^2(P_0)}+||x||_{L^2(P_1)}+2 (1+\max (\sigma _0,\sigma _1))||x||_{L^2(Q)} \}W_2(P_0,P_1), \end{aligned}$$
(4.16)

where

$$\begin{aligned} \sigma _i:=\biggl (\int _{\mathbb {R}^d} \biggl (x-\int _{\mathbb {R}^d}yP_i(dy)\biggr )^2P_i(dx)\biggr )^{1/2},\quad i=0,1. \end{aligned}$$

We prove Theorems 7 and 8, and Corollary 5.

Proof of Theorem 7

For any \(f_{i}\in C^\infty _b (\mathbb {R}^d)\), set \(u_{i} (x):=\varphi (0,x;f_{i})\) (see (2.18) for notation). Then

$$\begin{aligned}&\sum _{i=1}^2\lambda _i \left\{ \int _{\mathbb {R}^d} f_{i} (x)Q(dx)-\int _{\mathbb {R}^d} u_i (x)P^{X_i}(dx)\right\} \nonumber \\&\quad \le V_S (P^{\sum _{i=1}^2\lambda _i X_i},Q)+\lambda _1\lambda _2CE[|X_1-X_2|^2]. \end{aligned}$$
(4.17)

Indeed,

$$\begin{aligned}&\sum _{i=1}^2\lambda _i \left\{ \int _{\mathbb {R}^d} f_{i} (x)Q(dx)-\int _{\mathbb {R}^d} u_i (x)P^{X_i}(dx)\right\} \\&\quad =\int _{\mathbb {R}^d} \sum _{i=1}^2\lambda _i f_{i} (x)Q(dx)- E\biggl [\sum _{i=1}^2\lambda _i\{u_{i}(X_i)+C|X_i|^2\}\biggl ] +CE\biggl [\sum _{i=1}^2\lambda _i |X_i|^2\biggl ],\\&\qquad \int _{\mathbb {R}^d} \sum _{i=1}^2\lambda _i f_{i} (x)Q(dx)\\&\quad \le V_S (P^{\sum _{i=1}^2\lambda _i X_i},Q)+\int _{\mathbb {R}^d} \varphi \biggl (0,x;\sum _{i=1}^2\lambda _i f_{i}\biggl )P^{\sum _{i=1}^2\lambda _i X_i}(dx) \end{aligned}$$

by the Duality Theorem for \(V_S\) (see Corollary 2).

$$\begin{aligned}&\int _{\mathbb {R}^d} \varphi \biggl (0,x;\sum _{i=1}^2\lambda _i f_{i}\biggl )P^{\sum _{i=1}^2\lambda _i X_i}(dx) =E\biggl [\varphi \biggl (0,\sum _{i=1}^2\lambda _i X_i;\sum _{i=1}^2\lambda _i f_{i}\biggl )\biggl ]\\&\quad \le E\biggl [\sum _{i=1}^2\lambda _i\biggl \{u_{i}(X_i)+C|X_i|^2\biggl \}\biggl ] -CE\biggl [\biggl |\sum _{i=1}^2\lambda _i X_i\biggl |^2\biggl ]. \end{aligned}$$

In the inequality above, we used the following:

$$\begin{aligned} \varphi (t,x;f) & = {} \log \left( \int _{\mathbb {R}^d}p(t,x;1,y)\exp (f(y)) dy\right) , (t,x)\in [0,1)\times \mathbb {R}^d,\nonumber \\&\quad \int _{\mathbb {R}^d} \exp \biggl (\log p\biggl (0,\sum _{i=1}^2\lambda _iX_i;1,y\biggl )+C\biggl |\sum _{i=1}^2\lambda _i X_i\biggl |^2+\sum _{i=1}^2\lambda _if_{i} (y)\biggl )dy\nonumber \\\le & {} \int _{\mathbb {R}^d} \exp \biggl (\sum _{i=1}^2\lambda _i\{\log p(0,X_i;1,y)+C|X_i|^2+f_{i} (y)\}\biggl )dy\nonumber \\\le & {} \prod _{i=1}^2\left( \int _{\mathbb {R}^d} \exp (\log p(0,X_i;1,y)+C|X_i|^2+f_{i} (y))dy\right) ^{\lambda _i} \end{aligned}$$
(4.18)

by Hölder’s inequality. Taking the supremum in \(f_i\) over \(C^\infty _b (\mathbb {R}^d)\) on the left-hand side of (4.17), the Duality Theorem for \(V_S\) completes the proof (see Corollary 2). \(\square\)

Proof of Theorem 8

We prove the first part. We first prove that \(V_S(P,Q)\) is finite. Indeed, from [53],

$$\begin{aligned} V_S(P,Q) & = {} \inf \{H(\mu (dxdy)|P(dx)p(0,x;1,y)dy):\mu _1=P,\mu _2=Q\}\nonumber \\\le & {} H(P(dx)Q(dy)|P(dx)p(0,x;1,y)dy)\nonumber \\ =& {} \mathcal{S}(Q)- \int _{\mathbb {R}^d\times \mathbb {R}^d}\{ \log p(0,x;1,y)\}P(dx)Q(dy)<\infty \end{aligned}$$
(4.19)

from (4.2) (see (2.3) for notation). Here for \(\mu , \nu \in \mathcal{P}(\mathbb {R}^d\times \mathbb {R}^d )\),

$$\begin{aligned} H(\mu |\nu ):= {\left\{ \begin{array}{ll} \displaystyle \int _{\mathbb {R}^d\times \mathbb {R}^d}\left\{ \log \frac{\mu (dxdy)}{\nu (dxdy)}\right\} \mu (dxdy),&{}\mu \ll \nu ,\\ \displaystyle \infty ,&{}otherwise. \end{array}\right. } \end{aligned}$$

There exists a Borel measurable \(f:\mathbb {R}^d\longrightarrow \mathbb {R}\) such that the following holds (see, e.g. [28]):

$$\begin{aligned} \mu (P,Q)(dxdy)=P(dx)p(0,x;1,y)\exp (f(y)-\varphi (0,x;f))dy \end{aligned}$$
(4.20)

(see (4.18) for notation).

Since \(V_S(P,Q)\) is finite, \(f\in L^1 (\mathbb {R}^d,Q)\) and \(\varphi (0,x;f)\in L^1 (\mathbb {R}^d,P)\) (see, e.g. [53]). In particular,

$$\begin{aligned} -V_S(P,Q) & = {} -\int _{\mathbb {R}^d\times \mathbb {R}^d}\left\{ \log \frac{\mu (P,Q)(dxdy)}{P(dx)p(0,x;1,y)dy}\right\} \mu (P,Q)(dxdy)\nonumber \\ & = {} \int _{\mathbb {R}^d\times \mathbb {R}^d}(-f(y)+\varphi (0,x;f))P(dx)Q(dy)\nonumber \\ & = {} \int _{\mathbb {R}^d\times \mathbb {R}^d}P(dx)Q(dy) \biggl \{\log \left( \frac{P(dx)Q(dy)}{\mu (P,Q)(dxdy)}\right) \nonumber \\&-\log q(y)+\log p(0,x;1,y)\biggr \}, \end{aligned}$$
(4.21)

where \(q\) denotes the density of Q, which exists since \(\mathcal{S}(Q)\) is finite; this completes the proof of (4.11). \(P\times Q\mapsto H(P\times Q| \mu (P,Q))\) is weakly lower semicontinuous since \(P\times Q\mapsto \mu (P,Q)\) is weakly continuous (see [43]) and since \((\mu ,\nu )\mapsto H(\mu | \nu )\) is weakly lower semicontinuous (see, e.g. [18], Lemma 1.4.3). In particular, (4.12) is weakly lower semicontinuous from (4.11). The weak lower semicontinuity of (4.12) implies the upper semicontinuity of (4.13), since for \(P_n, P\in \mathcal{P}(\mathbb {R}^d), n\ge 1\), \(W_2 (P_n,P)\rightarrow 0\) as \(n\rightarrow \infty\) if and only if \(P_n\rightarrow P\) weakly and \(\int _{\mathbb {R}^d} |x|^2P_n(dx)\rightarrow \int _{\mathbb {R}^d} |x|^2P(dx)\) (see, e.g. [61]).

(4.13) is also weakly lower semicontinuous by Proposition 1 and Lemma 1.

We prove the last part. Set

$$\begin{aligned} q(0,x;1,y):=p(0,x;1,y)\exp (f(y)-\varphi (0,x;f)). \end{aligned}$$
(4.22)

Then by Jensen’s inequality,

$$\begin{aligned}&V_S(P,Q)\nonumber \\&\quad =\int _{\mathbb {R}^d\times \mathbb {R}^d}\left\{ \log q(0,x;1,y)-\log p(0,x;1,y)\right\} P(dx)q(0,x;1,y)dy\nonumber \\&\quad \ge \mathcal{S}(Q)- \int _{\mathbb {R}^d\times \mathbb {R}^d}\left\{ \log p(0,x;1,y)\right\} P(dx)q(0,x;1,y)dy, \end{aligned}$$
(4.23)

since

$$\begin{aligned} Q(dy)=\left( \int _{\mathbb {R}^d}P(dx)q(0,x;1,y)\right) dy. \end{aligned}$$

(4.2) completes the proof.

\(\square\)

Remark 12

Under (A.5)-(A.6), from Theorem 6, (4.21), and (4.23), for \(P, Q\in \mathcal{P}_2 (\mathbb {R}^d)\), if \(\mathcal{S}(Q)\) is finite, then

$$\begin{aligned} -\infty< & {} -C_1+C_2^{-1}\int _{\mathbb {R}^d\times \mathbb {R}^d}|x-y|^2\mu (P,Q)(dxdy)\nonumber \\\le & {} -\int _{\mathbb {R}^d\times \mathbb {R}^d}\{\log p(0,x;1,y)\}\mu (P,Q)(dxdy)\nonumber \\\le & {} V_S(P,Q)-\mathcal{S}(Q)\nonumber \\\le & {} -\int _{\mathbb {R}^d\times \mathbb {R}^d}\{\log p(0,x;1,y)\}P(dx)Q(dy)\nonumber \\\le & {} C_1+C_2\int _{\mathbb {R}^d\times \mathbb {R}^d}|x-y|^2P(dx)Q(dy)<\infty . \end{aligned}$$
(4.24)

Remark 12 plays a crucial role in the proof of Corollary 5.

Proof of Corollary 5

Let \(X,Y\in L^2 (\varOmega , P; \mathbb {R}^d)\) and \(\lambda :=\min (1, ||X-Y||_2)\), where \(||X||_2:=\{E[|X|^2]\}^{1/2}\).

We prove the following when \(\lambda >0\).

$$\begin{aligned} V_S(P^X,Q)-V_S(P^Y,Q)\le \lambda \{2C_2(||X||_2^2+||x||_{L^2(Q)}^2)+2C_1+C\} \end{aligned}$$
(4.25)

(see (4.2) for notation). From Theorem 7,

$$\begin{aligned}&(1-\lambda )V_S(P^{X},Q)+\lambda V_S(P^{\lambda ^{-1}(Y-X)+X},Q)\\&\quad \le \lambda (1-\lambda )C||\lambda ^{-1}(Y-X)||_2^2+V_S(P^{Y},Q). \end{aligned}$$

since \(Y=(1-\lambda )X+\lambda (\lambda ^{-1}(Y-X)+X)\). From this,

$$\begin{aligned}&V_S(P^X,Q)-V_S(P^Y,Q)\nonumber \\&\quad \le \lambda \{V_S(P^X,Q)-V_S(P^{\lambda ^{-1}(Y-X)+X},Q) +C(1-\lambda )||\lambda ^{-1}(Y-X)||_2^2\}. \end{aligned}$$
(4.26)

Since (A.4)’ implies (A.5)–(A.6),

$$\begin{aligned}&V_S(P^X,Q)-V_S(P^{\lambda ^{-1}(Y-X)+X},Q)\\&\quad \le \mathcal{S}(Q)+C_1+2C_2(||X||_2^2+||x||_{L^2(Q)}^2)- \mathcal{S}(Q)+C_1 \end{aligned}$$

from Remark 12. The following completes the proof of the first part:

$$\begin{aligned} (1-\lambda )||\lambda ^{-1}(Y-X)||_2^2 ={\left\{ \begin{array}{ll} 1-\lambda , &{}\lambda =||X-Y||_2<1,\\ 0=1-\lambda ,&{}\lambda =1. \end{array}\right. } \end{aligned}$$

We prove the second part. One can set \(C= (2a)^{-1}\).

From (4.26), the following holds:

$$\begin{aligned}&V_S(P^X,Q)-V_S(P^Y,Q)\nonumber \\&\quad \le \lambda \biggl \{ V_S(P^X,Q)-V_S(P^{\lambda ^{-1}(Y-X)+X},Q)\nonumber \\&\qquad -\frac{1}{2a}||X||_2^2+ \frac{1}{2a}|| \lambda ^{-1}(Y-X)+X||_2^2 \biggr \}+ \frac{1}{2a}(||X||_2^2-||Y||_2^2), \end{aligned}$$
(4.27)

since

$$\begin{aligned} \lambda (1-\lambda )||\lambda ^{-1}(Y-X)||_2^2 =\lambda (-||X||_2^2+||\lambda ^{-1}(Y-X)+X||_2^2)+||X||_2^2-||Y||_2^2. \end{aligned}$$

The following completes the proof: from Remark 12,

$$\begin{aligned}&V_S(P^X,Q)-V_S(P^{\lambda ^{-1}(Y-X)+X},Q)-\frac{1}{2a}||X||_2^2+ \frac{1}{2a}|| \lambda ^{-1}(Y-X)+X||_2^2\\&\quad \le \frac{1}{a}\int _{\mathbb {R}^d\times \mathbb {R}^d}\langle x, y\rangle \left\{ \mu (P^{\lambda ^{-1}(Y-X)+X},Q)(dxdy)-P^X(dx)Q(dy)\right\} \\&\quad =\frac{1}{a}\int _{\mathbb {R}^d\times \mathbb {R}^d}\langle x-E[X], y\rangle \mu (P^{\lambda ^{-1}(Y-X)+X},Q)(dxdy)\\&\quad \le \frac{1}{a\lambda }(||X-Y||_2 +\lambda V(X)^{1/2})||x||_{L^2(Q)}. \end{aligned}$$


\(\square\)