Abstract
We prove the Duality Theorems for the stochastic optimal transportation problems with a convex cost function without the regularity assumption that is often imposed in proving the lower semicontinuity of an action integral. In our new approach, we prove that the stochastic optimal transportation problems with a convex cost function are equivalent to a class of variational problems for the Fokker–Planck equation, which lets us revisit them. This is done by the so-called superposition principle and by an idea from the Mather theory. The superposition principle is the construction of a semimartingale from the Fokker–Planck equation and can be considered a class of the so-called marginal problems, which construct stochastic processes from given marginal distributions. It was first considered in stochastic mechanics by Nelson, where it is called Nelson’s problem, and was first proved by Carlen. The semimartingale is called the Nelson process, provided it is Markovian. We also consider the Markov property of a minimizer of the stochastic optimal transportation problem with a nonconvex cost in the one-dimensional case. In the proof, the superposition principle and the minimizer of an optimal transportation problem with a concave cost function play crucial roles. Lastly, we prove the semiconcavity and the Lipschitz continuity of Schrödinger’s problem, which is a typical example of the stochastic optimal transportation problem.
1 Introduction
The construction of a stochastic process from given marginal distributions is called a marginal problem.
Schrödinger’s problem is the construction of a Markov diffusion process on [0, 1] from two endpoint marginal distributions at \(t=0,1\) by solving a variational problem on the relative entropy. We describe it briefly (see \(V_S\) in (1.19), (4.19), and also [28, 30, 44]). Let \(\sigma\) and \(\xi\) be, respectively, a \(d\times d\) nondegenerate matrix-valued and an \(\mathbb {R}^d\)-valued function on \([0,1]\times \mathbb {R}^d\). Suppose that the following stochastic differential equation has a weak solution \(\{X(t)\}_{0\le t\le 1}\) with a positive transition probability density \(p(s,x;t,y), 0\le s<t\le 1, x,y\in \mathbb {R}^d\):
where W(t) denotes a d-dimensional Brownian motion defined on a probability space (see Theorem 6 in Sect. 4). Let \(\mathcal{P} (\mathbb {R}^d )\) denote the set of all Borel probability measures on \(\mathbb {R}^d\). For any \(P_0, P_1\in \mathcal{P} (\mathbb {R}^d )\), there exists a unique product measure \(\nu _0(dx)\nu _1(dy)\) that satisfies the following:
This is Euler’s equation of Schrödinger’s problem and is called Schrödinger’s functional equation or the Schrödinger system (see [55, 56] and also [27] and Proposition 2.1 in [42]). Under some assumptions on \(\sigma\) and \(\xi\) (see, e.g. (A.5)–(A.6) in Sect. 4), if \(P_1(dy)\ll dy\), then there exists a unique weak solution \(\{Y(t)\}_{0\le t\le 1}\) to the following (see [28]):
where \(a(t,x):=\sigma (t,x)\sigma (t,x)^*\), \(\sigma (t,x)^*\) denotes the transpose of \(\sigma (t,x)\), \(D_y:=\left( \partial /\partial y_i\right) _{i=1}^d\),
and \(P^{Y(0)}\) denotes the probability law of Y(0). Besides, the following holds:
which implies that \(P^{Y(1)}=P_1\) from (1.2). \(\{Y(t)\}_{0\le t\le 1}\) is called the h-path process for \(\{X(t)\}_{0\le t\le 1}\) from two endpoint marginals \(P_0, P_1\) at \(t=0,1\), respectively.
Remark 1
Schrödinger’s functional Eqs. (1.2)–(1.3) are equivalent to the following:
In particular, one only has to find a solution \(\overline{h}(1,\cdot )\) in (1.7).
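A finite-state sketch may help make the Schrödinger system concrete. It assumes the standard discrete form of the system, \(P_0(x)=\nu _0(x)\sum _y p(x,y)\nu _1(y)\) and \(P_1(y)=\nu _1(y)\sum _x \nu _0(x)p(x,y)\), with a strictly positive kernel, and solves it by alternating rescaling (iterative proportional fitting). The function name `solve_schrodinger_system`, the kernel `p`, and the marginals are hypothetical choices for illustration and do not come from the paper.

```python
def solve_schrodinger_system(p, P0, P1, n_iter=500):
    """Alternately rescale nu0 and nu1 so that the coupling
    nu0[i] * p[i][j] * nu1[j] has marginals P0 and P1 (p must be positive)."""
    m, n = len(P0), len(P1)
    nu0 = [1.0] * m
    nu1 = [1.0] * n
    for _ in range(n_iter):
        # Enforce the first marginal: nu0[i] * sum_j p[i][j] * nu1[j] = P0[i].
        for i in range(m):
            nu0[i] = P0[i] / sum(p[i][j] * nu1[j] for j in range(n))
        # Enforce the second marginal: nu1[j] * sum_i nu0[i] * p[i][j] = P1[j].
        for j in range(n):
            nu1[j] = P1[j] / sum(nu0[i] * p[i][j] for i in range(m))
    return nu0, nu1


# Hypothetical 3-state transition kernel and endpoint marginals.
p = [[0.7, 0.2, 0.1], [0.3, 0.4, 0.3], [0.1, 0.2, 0.7]]
P0 = [0.5, 0.3, 0.2]
P1 = [0.2, 0.3, 0.5]

nu0, nu1 = solve_schrodinger_system(p, P0, P1)
coupling = [[nu0[i] * p[i][j] * nu1[j] for j in range(3)] for i in range(3)]
row_sums = [sum(row) for row in coupling]
col_sums = [sum(coupling[i][j] for i in range(3)) for j in range(3)]
```

In this sketch only the product \(\nu _0(x)\nu _1(y)\) is determined; the individual factors are fixed only up to a multiplicative constant, which matches the statement that the product measure is unique.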
Motivated by Schrödinger’s quantum mechanics, Nelson proposed the problem of the construction of a Markov diffusion process from the Fokker–Planck equation. We describe it. Let a and b be, respectively, a \(d\times d\) symmetric nonnegative definite matrix-valued and an \(\mathbb {R}^d\)-valued function on \([0,1]\times \mathbb {R}^d\), and let \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\).
By \((a,b)\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\), we mean that \(a,b\in L^1([0,1]\times \mathbb {R}^d, dtP_t(dx))\) and the following Fokker–Planck equation holds: for any \(f \in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\) and \(t\in [0,1]\),
Here \(\partial _s:=\partial /\partial s\), \(D_x^2:=\left( \partial ^2/\partial x_i\partial x_j\right) _{i,j=1}^d\), \(\langle x,y\rangle\) denotes the inner product of \(x, y\in \mathbb {R}^d\) and
We also write \((a,b)\in \mathbf{A}_0 (\{P_t\}_{0\le t\le 1})\) if \(a,b\in L^1_{loc}([0,1]\times \mathbb {R}^d, dtP_t(dx))\) and (1.10) holds for all \(f \in C^{1,2}_0 ([0,1]\times \mathbb {R}^d)\).
Remark 2
For \(\{ P_t\}_{0\le t\le 1}\) in (1.10), \(\mathbf{A} (\{P_t\}_{0\le t\le 1})\) is not necessarily a singleton (see [11,12,13,14, 33]).
The following is a generalized version of Nelson’s problem (see [47, 49, 50]).
Definition 1
(Nelson’s problem) For any \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\) such that \(\mathbf{A}_0 (\{P_t\}_{0\le t\le 1})\) is not empty and for any \((a,b)\in \mathbf{A}_0 (\{P_t\}_{0\le t\le 1})\), construct a \(d\times d\) matrix-valued function \(\sigma (t,x)\) on \([0,1]\times \mathbb {R}^d\) and a semimartingale \(\{X(t)\}_{0\le t\le 1}\) such that the following holds: for \((t,x)\in [0,1]\times \mathbb {R}^d\),
Here \(W_X\) denotes a d-dimensional Brownian motion.
When \(\sigma (t,x)\) is nondegenerate, \(W_X\) can be taken to be an \((\mathcal{F}_t^X)\)-Brownian motion, where \(\mathcal{F}_t^X\) denotes \(\sigma [X(s):0\le s\le t]\). Otherwise (1.12) means that \(X(t)-X(0)-\int _0^t b(s,X(s))ds\) is a local martingale with quadratic variation process \(\int _0^t a (s, X(s))ds\) (see, e.g. [25]).
The first result on Nelson’s problem was given by E. Carlen when a is an identity matrix (see [8, 9], and also [10, 46, 63] for different approaches). We generalized it to the case with a variable diffusion matrix (see [33]). P. Cattiaux and C. Léonard extensively generalized it to the case where jump-type Markov processes are also considered (see [11,12,13,14]). In these papers, they assumed the following condition.
Definition 2
(Finite energy condition (FEC))
There exists \((a,b) \in \mathbf{A} (\{P_t\}_{0\le t\le 1})\) such that the following holds:
We describe a class of stochastic optimal transportation problems (SOTPs for short) and approaches to the h-path process and Nelson’s problem by the SOTPs.
Fix a Borel measurable \(d\times d\)-matrix function \(\sigma (t,x)\). Let \(\mathcal{A}\) denote the set of all \(\mathbb {R}^d\)-valued, continuous semimartingales \(\{ X(t)\}_{0\le t\le 1}\) on a (possibly different) complete filtered probability space such that there exists a Borel measurable \(\beta _X :[0,1]\times C([0,1])\longrightarrow \mathbb {R}^d\) for which the following holds:
(i) \(\omega \mapsto \beta _X (t,\omega )\) is \(\mathbf{B}(C([0,t]))_+\)-measurable for all \(t\in [0,1]\),
(ii) \(X(t)=X(0)+\int _0^t \beta _X (s,X)ds+\int _0^t \sigma (s,X(s))dW_X(s)\), \(0\le t\le 1\),
(iii) $$\begin{aligned} E\left[ \int _0^1\left\{ |\beta _X (t,X)|+|\sigma (t,X(t))|^2\right\} dt\right] <\infty . \end{aligned}$$
Here \(\mathbf{B}(C([0,t]))\) and \(\mathbf{B}(C([0,t]))_+\) denote the Borel \(\sigma\)-field of C([0, t]) and \(\cap _{s> t}{} \mathbf{B}(C([0,s]))\), respectively (see, e.g. [31]). \(|\cdot |:=\langle \cdot , \cdot \rangle ^{1/2}\).
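As a rough illustration of condition (ii), the following sketch simulates a one-dimensional path by the Euler–Maruyama scheme, with a drift that may depend on the whole past path and a diffusion coefficient that depends only on the current state. The function name `euler_maruyama`, the step count, and the example coefficients are assumptions made for illustration; the scheme only approximates a member of \(\mathcal{A}\) and does not check the integrability condition (iii).

```python
import math
import random


def euler_maruyama(x0, beta, sigma, n_steps=1000, rng=None):
    """Approximate X(t) = X(0) + int_0^t beta ds + int_0^t sigma dW on [0, 1] (d = 1).

    beta(t, path) may use the whole path up to time t (non-Markovian drift);
    sigma(t, x) depends only on the current state, as in condition (ii).
    """
    if rng is None:
        rng = random.Random(0)
    dt = 1.0 / n_steps
    path = [x0]
    for k in range(n_steps):
        t = k * dt
        x = path[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over [t, t + dt]
        path.append(x + beta(t, path) * dt + sigma(t, x) * dw)
    return path


# Degenerate check: sigma = 0 and beta = 1 give X(1) = X(0) + 1.
det_path = euler_maruyama(0.5, lambda t, path: 1.0, lambda t, x: 0.0)

# A mean-reverting drift with unit diffusion (hypothetical coefficients).
noisy_path = euler_maruyama(0.0, lambda t, path: -path[-1], lambda t, x: 1.0,
                            rng=random.Random(1))
```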
Let \(L:[0,1]\times \mathbb {R}^d \times \mathbb {R}^d\longrightarrow [0,\infty )\) be continuous. The following is a class of the SOTPs (see [41, 45], and also [33, 37, 44]).
Definition 3
(Stochastic optimal transportation problems)
(1) For \(P_0\), \(P_1\in \mathcal{P} (\mathbb {R}^d )\),
$$\begin{aligned} V (P_0,P_{1}):=\inf _{\begin{array}{c} {\scriptstyle X\in \mathcal{A},} \\ {{\scriptstyle P^{X (t)}=P_t, t=0,1}} \end{array}}\ E\biggl [\int _0^1 L(t,X (t);\beta _X (t,X))dt \biggr ]. \end{aligned} \tag{1.15}$$
(2) For \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\),
$$\begin{aligned} \mathbf{V} (\{ P_t\}_{0\le t\le 1} ):=\inf _{\begin{array}{c} {\scriptstyle X\in \mathcal{A},} \\ {{\scriptstyle P^{X (t)}=P_t, 0\le t\le 1}} \end{array}} E\biggl [\int _0^1 L(t,X (t);\beta _X (t,X))dt \biggr ]. \end{aligned} \tag{1.16}$$
If the set over which the infimum is taken is empty, then we set the infimum to be infinity.
Suppose that one knows the marginal probability distributions of a stochastic system at times \(t=0, 1\) or \(t\in [0,1]\). To study the stochastic system on [0, 1] from the viewpoint of the principle of least action, one has to consider these kinds of problems.
Remark 3
(i) The sets of stochastic processes over which the infima are taken in (1.15)–(1.16) can be empty. If \(P_1(dx)\ll dx\), then cases in which they are not empty are known for (1.15) from [28] and for (1.16) from [5, 8,9,10,11,12,13,14, 33, 35,36,37, 40, 41, 46, 60, 63]. (ii) For \(\{X(t)\}_{0\le t\le 1}\in \mathcal{A}\),
Indeed, by Itô’s formula, (1.10) with \(a=\sigma \sigma ^*, b=b_X\) holds and by Jensen’s inequality,
Schrödinger’s problem, which is a typical example of the SOTP, is \(V_S:=V\) in (1.15) when the following holds:
(see, e.g. [30, 44, 53]). If \(V_S (P_0, P_1)\) is finite for \(P_0\), \(P_1\in \mathcal{P} (\mathbb {R}^d )\) and if \(\sigma\) and \(\xi\) satisfy suitable conditions, then the minimizer exists uniquely and is the h-path process with two endpoint marginals \(P_0, P_1\) in (1.4)–(1.5) (see [16, 21, 44, 45, 51, 62]).
By the continuum limit of \(V (\cdot ,\cdot )\), we considered Nelson’s problem in a more general setting, including the following case (see [33, 40]).
Definition 4
(Generalized finite energy condition (GFEC))
There exists \(\gamma >1\) and \((a,b) \in \mathbf{A} (\{P_t\}_{0\le t\le 1})\) such that the following holds:
As an application of the Duality Theorem for \(\mathbf{V}\), we also gave an approach to Nelson’s problem under a condition that includes the GFEC (see [41]).
If (1.11)–(1.13) hold, then one also says that the superposition principle holds. When \(a\equiv 0\), the superposition principle was studied in [1, 2, 36, 37]. Trevisan’s result [60] almost completely solved Nelson’s problem (see also [5, 19]). The case where the linear operator consists of a second-order differential operator together with a Lévy measure was studied in [14, 52].
Theorem 1
(See [60]) Suppose that there exists \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\) such that \((a,b)\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\) exists.
Then Nelson’s problem (1.11)–(1.13) has a solution.
In his problem, Nelson considered the case where \(a=Identity\) and \(b=D_x \psi (t,x)\) for some function \(\psi\). It turned out that the Nelson process is the minimizer of \(\mathbf{V}_N:=\mathbf{V}\) when (1.19) with \(\sigma =Identity\) and \(\xi =0\) and the FEC hold (see Proposition 3.1 in [33] and also Theorem 4 in Sect. 2). Indeed, if \((a, D_x \psi _i)\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\), \(i=1,2\), then \(D_x \psi _1=D_x \psi _2\), \(dtP_t(dx)\)-a.e. In this sense, we regard Nelson’s problem as the study of the superposition principle and of the minimizer of \(\mathbf{V}\). In particular, if the superposition principle holds, then the set over which the infimum is taken in \(\mathbf{V}\) is not empty, and one can consider a minimizer of \(\mathbf{V}\), provided \(\mathbf{V}\) is finite. There was a different approach, which shows Proposition 1 in Sect. 2 via the Duality Theorems in Theorems 3 and 4 in Sect. 2 (see [41] and also [33, 40]). It is also generalized by the superposition principle, so our previous approach to the first part of Nelson’s problem is no longer useful.
In Sect. 2, we improve our previous results on the SOTPs with a convex cost function by the superposition principle in Theorem 1.
More precisely, we prove that the SOTPs are equivalent to variational problems for probability measures given by the Fokker–Planck equation and to those by a relaxed version of the Fokker–Planck equation (see Proposition 1 in Sect. 2). In particular, we can prove the convexity and the lower semicontinuity of the SOTPs in marginal distributions by a finite-dimensional approach, though the SOTPs are variational problems for semimartingales. It gives a new insight into the SOTPs and lets us revisit them.
In Sect. 3, in the case where \(d=1\) and where a is not fixed, we consider slightly relaxed versions of the SOTPs whose cost functions are not assumed to be convex. In this case, we need a generalization of Trevisan’s result, which was recently obtained by Bogachev, Röckner, and Shaposhnikov.
Theorem 2
(See [5]) Suppose that there exists \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\) such that \((a,b)\in \mathbf{A}_0 (\{P_t\}_{0\le t\le 1})\) exists and that the following holds:
Then Nelson’s problem (1.11)–(1.13) has a solution.
Testing the Markov property of a minimizer is a fundamental problem in stochastic optimal control theory. We also discuss this problem for a finite-time horizon stochastic optimal control problem.
In Sect. 4, we study the semiconcavity and the Lipschitz continuity of Schrödinger’s problem \(V_S\).
2 SOTPs with a convex cost
In this section, we discuss applications of D. Trevisan’s result to the Duality Theorems for the SOTPs in the case where \(u\mapsto L(t,x;u)\) is convex and where \(\sigma\) and \(a=\sigma \sigma ^*\) in (1.11) are fixed. We write \(b\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\) if \((a,b)\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\) for the sake of simplicity (see (1.10) for notation).
As a preparation, we introduce two classes of marginal problems which play crucial roles in the proof of the Duality Theorems for the SOTPs (see [40, 41]) and which will be proved to be equivalent to the SOTPs by D. Trevisan’s result.
The following can be considered as versions of the SOTPs for a flow of marginals which satisfy (1.10).
Definition 5
(SOTPs for marginal flows)
(1) For \(P_0\), \(P_{1}\in \mathcal{P}(\mathbb {R}^d)\),
$$\begin{aligned} \mathrm{v}(P_0,P_{1}) :=\inf _{\begin{array}{c} {\scriptstyle b\in \mathbf{A} (\{Q_t\}_{0\le t\le 1}),} \\ {{\scriptstyle Q_t=P_t, t=0,1}} \end{array}} \int _0^1 dt \int _{\mathbb {R}^d}L(t,x;b(t,x))Q_t(dx). \end{aligned} \tag{2.1}$$
(2) For \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R}^d)\),
$$\begin{aligned} \mathbf{v} (\{ P_t\}_{0\le t\le 1} ) :=\inf _{b\in \mathbf{A} (\{ P_t\}_{0\le t\le 1})} \int _0^1 dt\int _{\mathbb {R}^d} L(t,x;b(t,x))P_t(dx). \end{aligned} \tag{2.2}$$
For \(\mu (dxdu)\in \mathcal{P} ( \mathbb {R}^d\times \mathbb {R}^d )\),
We write \(\nu (dtdxdu)\in \tilde{\mathcal{A}}\) if the following holds. (i) \(\nu \in \mathcal{P} ([0,1]\times \mathbb {R}^d \times \mathbb {R}^d )\) and
(ii) \(\nu (dtdxdu)=dt\nu (t,dxdu)\), \(\nu (t,dxdu)\in \mathcal{P} ( \mathbb {R}^d\times \mathbb {R}^d )\), \(\nu _{1}(t, dx), \nu _{2}(t, du)\in \mathcal{P} ( \mathbb {R}^d)\), \(dt-\)a.e. and \(t\mapsto \nu _{1}(t, dx)\) has a weakly continuous version \(\nu _{1,t}(dx)\in \mathcal{P} ( \mathbb {R}^d)\) for which the following holds: for any \(t\in [0,1]\) and \(f\in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\),
Here
We introduce a relaxed version of the problem above (see [23] and references therein for related topics).
Definition 6
(SOTPs for marginal measures)
(1) For \(P_0\), \(P_{1}\in \mathcal{P}(\mathbb {R}^d)\),
$$\begin{aligned} \tilde{\mathrm{v}}(P_0,P_{1}) :=\inf _{\begin{array}{c} {\scriptstyle \nu \in \tilde{\mathcal{A}},} \\ {{\scriptstyle \nu _{1,t}=P_t, t=0,1}} \end{array}} \int _{[0,1]\times \mathbb {R}^d\times \mathbb {R}^d}L(t,x;u)\nu (dtdxdu). \end{aligned} \tag{2.7}$$
(2) For \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R}^d)\),
$$\begin{aligned} \tilde{\mathbf{v}} (\{ P_t\}_{0\le t\le 1} ) :=\inf _{\begin{array}{c} {\scriptstyle \nu \in \tilde{\mathcal{A}},} \\ {{\scriptstyle \nu _{1,t}=P_t, 0\le t\le 1}} \end{array}} \int _{[0,1]\times \mathbb {R}^d\times \mathbb {R}^d}L(t,x;u)\nu (dtdxdu). \end{aligned} \tag{2.8}$$
Remark 4
If \(b\in \mathbf{A}(\{ P_t\}_{0\le t\le 1})\) and \(X\in \mathcal{A}\), then \(dtP_t(dx)\delta _{b(t,x)}(du)\in \tilde{\mathcal{A}}\) and \(dtP^{(X(t),\beta _X(t,X))}(dxdu)\in \tilde{\mathcal{A}}\), respectively. Here \(\delta _x\) denotes the delta measure on \(\{x\}\). In particular, \(dtP^{(X(t),\beta _X(t,X))}(dxdu)\) is the distribution of a \([0,1]\times \mathbb {R}^d\times \mathbb {R}^d\)-valued random variable \((t,X(t),\beta _X(t,X))\). This is why we call (2.7)–(2.8) SOTPs for marginal measures (see also Lemma 1 given later). One can also identify \(\{ P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R}^d)\) with \(dtP_t(dx)\in \mathcal{P}([0,1]\times \mathbb {R}^d)\) when \(\mathbf{V}, \mathbf{v}\) and \(\tilde{\mathbf{v}}\) are considered (see Theorem 4 and also [41, 44]).
We introduce assumptions.
(A.0.0). (i) \(\sigma _{ij}\in C_b([0,1]\times \mathbb {R}^d)\), \(i,j=1,\ldots ,d\). (ii) \(\sigma (\cdot )=(\sigma _{ij}(\cdot ))_{i,j=1}^d\) is a nondegenerate \(d\times d\)-matrix function on \([0,1]\times \mathbb {R}^d\).
(A.1). (i) \(L\in C([0,1]\times \mathbb {R}^d \times \mathbb {R}^d;[0,\infty ))\). (ii) \(\mathbb {R}^d\ni u\mapsto L(t,x;u)\) is convex for \((t,x)\in [0,1]\times \mathbb {R}^d\).
(A.2).
The following proposition gives the relations among, and the properties of, the three classes of SOTPs stated in Definitions 3, 5, and 6 above. In particular, it implies that they are equivalent in our setting, which is why they are all called SOTPs. It also implies the convexity and the lower semicontinuity of \(V(P_0,P_{1})\) and \(\mathbf{V} (\{ P_t\}_{0\le t\le 1})\).
Proposition 1
(i) Suppose that (A.1) holds. Then the following holds:
$$\begin{aligned} V(P_0,P_{1})=\mathrm{v}(P_0,P_{1})=\tilde{\mathrm{v}}(P_0,P_{1}),\quad P_0, P_{1}\in \mathcal{P}(\mathbb {R}^d), \end{aligned} \tag{2.9}$$
$$\begin{aligned} \mathbf{V} (\{ P_t\}_{0\le t\le 1} )= \mathbf{v} (\{ P_t\}_{0\le t\le 1} )=\tilde{\mathbf{v}} (\{ P_t\}_{0\le t\le 1} ),\quad \{ P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R}^d). \end{aligned} \tag{2.10}$$
(ii) Suppose, in addition, that (A.0.0,i) and (A.2) hold. Then there exist minimizers X of \(V (P_0,P_{1})\) and Y of \(\mathbf{V} (\{ P_t\}_{0\le t\le 1} )\) for which
$$\begin{aligned} \beta _X (t,X)=b_X(t,X(t)),\quad \beta _Y (t,Y)=b_Y(t,Y(t)), \end{aligned} \tag{2.11}$$
provided \(V (P_0,P_{1})\) and \(\mathbf{V} (\{ P_t\}_{0\le t\le 1} )\) are finite, respectively (see (1.17) for notation).
(iii) Suppose, in addition, that (A.0.0,ii) holds and that \(\mathbb {R}^d\ni u\mapsto L(t,x;u)\) is strictly convex for \((t,x)\in [0,1]\times \mathbb {R}^d\). Then for any minimizers X of \(V (P_0,P_{1})\) and Y of \(\mathbf{V} (\{ P_t\}_{0\le t\le 1} )\), (2.11) holds and \(b_X\) and \(b_Y\) in (2.11) are unique on the support of \(dtP^{X(t)}(dx)\) and \(dtP^{Y(t)}(dx)\), respectively.
Remark 5
Let \(c\in C(\mathbb {R}^d\times \mathbb {R}^d;[0,\infty ))\). For \(P_0, P_{1}\in \mathcal{P}(\mathbb {R}^d)\),
(see (2.3) for notation). \(T_M(P_0, P_{1})\) and \(T(P_0, P_{1})\) are called Monge’s and Monge–Kantorovich’s problems, respectively. The second equalities in (2.9)–(2.10) are similar to the relation between Monge’s and Monge–Kantorovich’s problems since \(\tilde{\mathrm{v}}\) and \(\tilde{\mathbf{v}}\) are the infima of linear functionals of measures (see, e.g. [51, 61]).
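For intuition about Monge’s problem, the following sketch treats the simplest discrete case: two empirical measures on \(\mathbb{R}\) with the same number of equally weighted atoms, where transport maps are bijections between the atoms. It compares a brute-force minimum over all bijections with the monotone (sorted) pairing, which is known to be optimal in one dimension for a convex cost of \(y-x\). The function names and the sample points are hypothetical and serve only as an illustration.

```python
from itertools import permutations


def monge_cost(xs, ys, cost):
    """Brute-force discrete Monge problem: minimize the average cost
    over all bijections between equally weighted atoms xs and ys."""
    n = len(xs)
    return min(
        sum(cost(xs[i], ys[s[i]]) for i in range(n)) / n
        for s in permutations(range(n))
    )


def sorted_coupling_cost(xs, ys, cost):
    """Average cost of the monotone pairing (i-th smallest x to i-th smallest y)."""
    return sum(cost(x, y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


def quadratic(x, y):
    return (y - x) ** 2


xs = [0.3, -1.2, 2.0, 0.9, -0.4]
ys = [1.5, 0.1, -0.7, 2.2, 0.6]

brute = monge_cost(xs, ys, quadratic)
mono = sorted_coupling_cost(xs, ys, quadratic)
```

With the quadratic cost used here, the square root of `mono` is the Wasserstein distance of order 2 between the two empirical measures.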
Before we prove Proposition 1, we state its application to the SOTPs.
For any \(s\ge 0\) and \(P\in \mathcal{P} (\mathbb {R}^d )\),
Let \(\mathcal{P} (\mathbb {R}^d )\) be endowed with a weak topology. Then the following is known.
Lemma 1
(See [41]) Suppose that (A.0.0,i) and (A.1)–(A.2) hold. Then for any \(s\ge 0\) and compact set \(K\subset \mathcal{P} (\mathbb {R}^d )\), the set \(\cup _{P\in K}{} \mathbf{\Psi }_{P}(s)\) is compact in \(\mathcal{P} ([0,1]\times \mathbb {R}^d \times \mathbb {R}^d )\).
Lemma 1 was given in [41] to prove the Duality Theorems for \(\mathrm{v}(P_0,P_{1})\) and \(\mathbf{v}(\{ P_t\}_{0\le t\le 1})\). By Proposition 1, it can also be used in the proof of the lower semicontinuities of \({V}(P_0,P_{1})\) and \(\mathbf{V}(\{ P_t\}_{0\le t\le 1})\). Besides, we do not need the following assumption anymore.
(A).
where the supremum is taken over all (t, x) and \((s,y) \in [0,1]\times \mathbb {R}^d\) for which \(|t-s|<\varepsilon _1\), \(|x-y|<\varepsilon _2\) and over all \(u\in \mathbb {R}^d\).
This assumption can be used to prove the lower semicontinuity of the following (see [26], Chapter 9.1):
We state additional assumptions and the improved versions of the Duality Theorems for \({V}(P_0,P_{1})\) and \(\mathbf{V}(\{ P_t\}_{0\le t\le 1})\).
(A.0). \(\sigma _{ij}\in C^{1}_b ([0,1]\times \mathbb {R}^d)\), \(i,j=1,\dots ,d\).
(A.3). (i) \(\partial _t L(t,x;u)\) and \(D_x L(t,x;u)\) are bounded on \([0,1]\times \mathbb {R}^d \times B_R\) for all \(R>0\), where \(B_R:=\{ x\in \mathbb {R}^d \mid |x|\le R\}\). (ii) \(C_L\) is finite, where
The following is a generalization of [41] in that we need neither the nondegeneracy of a nor the assumption (A); it can be proved in almost the same way as in [41] by Proposition 1 and by Lemma 1. Indeed, in our previous papers, we used the nondegeneracy of a to apply the Cameron–Martin–Maruyama–Girsanov formula in proving the convexity of \(P\mapsto V(P_0,P)\), which Proposition 1 lets us avoid. The lower semicontinuity of \(P\mapsto V(P_0,P)\) can be proved by Proposition 1 and by Lemma 1. The authors of [59] considered a similar problem and used a general property of convex combinations of probability measures on an enlarged space, which allowed them not to assume the nondegeneracy of a, though they assumed a condition similar to (A).
One can also find details in [44] (see [24] for related topics). We refer readers to [15, 20, 29] for viscosity solutions.
Theorem 3
(Duality Theorem for V) Suppose that (A.0)–(A.3) hold. Then, for any \(P_0\), \(P_{1}\in \mathcal{P} (\mathbb {R}^d )\),
where \(\varphi (t,x;f)\) denotes the minimal bounded continuous viscosity solution to the following HJB equation: on \([0,1)\times \mathbb {R}^d\),
We introduce the following condition to replace \(\varphi\) in (2.18) by classical solutions to the HJB equation (2.19).
(A.4). (i) “\(\sigma\) is an identity”, or “ \(\sigma (\cdot )=(\sigma _{ij}(\cdot ))_{i,j=1}^d\) is uniformly nondegenerate, \(\sigma _{ij}\in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\), \(i,j=1,\ldots ,d\), and there exist functions \(L_1\) and \(L_2\) so that \(L=L_1 (t,x)+L_2 (t,u)\)”. (ii) \(L(t,x;u)\in C^1([0,1]\times \mathbb {R}^d \times \mathbb {R}^d;[0,\infty ))\) and is strictly convex in u. (iii) \(L\in C^{1,2,0}_b ([0,1]\times \mathbb {R}^d\times B_R )\) for any \(R>0\).
Since (A.4,i), (A.4,ii), and (A.4,iii) imply (A.0), (A.1), and (A.3,i), respectively, the following holds from Theorem 3, in the same way as in [41] (see also [44]).
Corollary 1
Suppose that (A.2), (A.3,ii), and (A.4) hold. Then (2.18) holds even if the supremum is taken over all classical solutions \(\varphi \in C_b^{1,2} ([0,1]\times \mathbb {R}^d)\) to the HJB Eqn (2.19). Besides, for any \(P_0, P_1\in \mathcal{P}(\mathbb {R}^d)\) for which \(V (P_0, P_1)\) is finite, a minimizer \(\{X(t)\}_{0\le t\le 1}\) of \(V (P_0, P_1)\) exists and the following holds: for any maximizing sequence \(\{\varphi _n\}_{n\ge 1}\) of (2.18),
In particular, there exists a subsequence \(\{n_k \}_{k\ge 1}\) for which
The following is also a generalization of [41] and can be proved almost in the same way as in [41] by Proposition 1 and Lemma 1.
Theorem 4
(Duality Theorem for \(\mathbf{V}\)) Suppose that (A.0)–(A.3) hold. Then for any \(\mathbf{P}:=\{ P_t\}_{0\le t\le 1}\subset \mathcal{P} (\mathbb {R}^d )\),
where \(\phi (t,x;f)\) denotes the minimal bounded continuous viscosity solution of the following HJB equation: on \([0,1)\times \mathbb {R}^d\),
Suppose that (A.4) holds instead of (A.0), (A.1), and (A.3,i). Then (2.22) holds even if the supremum is taken over all classical solutions \(\phi \in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\) to the HJB Eqn (2.23). Besides, if \(\mathbf{V} (\mathbf{P})\) is finite, then a minimizer \(\{X(t)\}_{0\le t\le 1}\) of \(\mathbf{V} (\mathbf{P})\) exists and the following holds: for any maximizing sequence \(\{\phi _n\}_{n\ge 1}\) of (2.22),
In particular, there exists a subsequence \(\{n_k \}_{k\ge 1}\) for which
Remark 6
(See [41, 44]) (i) Suppose that (A.0)–(A.3) hold. Then for any \(f\in UC_b (\mathbb {R}^d)\), the following is the minimal bounded continuous viscosity solution of the HJB equation (2.19):
where \(\mathcal{A}_t\) denotes \(\mathcal{A}\) with a time interval [0, 1] replaced by [t, 1]. (ii) Suppose that (A.0)–(A.3) with L replaced by \(L(t,x;u)-f(t,x)\) hold. Then the following is the minimal bounded continuous viscosity solution of the HJB Eq. (2.23):
We consider Schrödinger’s and Nelson’s problems, i.e., \(V_S\) and \(\mathbf{V}_N\). We introduce a new assumption.
(A.4)’. (1.19) holds, \(\sigma (\cdot )=(\sigma _{ij}(\cdot ))_{i,j=1}^d\) is uniformly nondegenerate, and \(a\in C^{1,2}_b ([0, 1] \times \mathbb {R}^d;M(d,\mathbb {R})), \xi \in C^{1,2}_b ([0, 1] \times \mathbb {R}^d;\mathbb {R}^d)\).
(A.4)’ implies (A.0)–(A.3). Besides, for \(f\in C^3_b (\mathbb {R}^d)\) and \(f\in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\), the HJB equations (2.19) and (2.23) have unique classical solutions in \(C^{1,2}_b ([0,1]\times \mathbb {R}^d)\), respectively. They are also the minimal bounded continuous viscosity solutions of (2.19) and (2.23), respectively, since they have the same representation formulas given in Remark 6 (see, e.g. [20, 22] on classical solutions and Lemma 4.5 in [41] on viscosity solutions). In particular, the following holds, though (A.4)’ does not imply (A.4).
Corollary 2
Suppose that (A.4)’ holds. Then the assertions in Corollary 1 and Theorem 4 hold.
Remark 7
If (1.19) holds, then
In the rest of this section, we prove Proposition 1.
Proof of Proposition 1
We prove (i). For \(\{X(t)\}_{0\le t\le 1}\in \mathcal{A}\), by Jensen’s inequality,
Theorem 1 implies the first equalities of (2.9)–(2.10) (see Remark 3, (ii)).
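The Jensen step can be checked numerically at a single time instant. In the sketch below, a hypothetical finite joint law of \((X(t),\beta _X(t,X))\) is replaced by the Markovian drift \(b(x)=E[\beta _X|X(t)=x]\), and the expected cost does not increase when \(u\mapsto L(x;u)\) is convex. The helper name `markovian_projection_cost`, the joint law, and the cost function are illustrative assumptions, not objects from the paper.

```python
from collections import defaultdict


def markovian_projection_cost(joint, L):
    """joint: list of (prob, x, u) describing a finite law of (X, beta).

    Returns (E[L(X, beta)], E[L(X, b(X))]) with b(x) = E[beta | X = x].
    """
    mass = defaultdict(float)
    first_moment = defaultdict(float)
    for prob, x, u in joint:
        mass[x] += prob
        first_moment[x] += prob * u
    # Conditional mean of the drift given the state.
    b = {x: first_moment[x] / mass[x] for x in mass}
    original = sum(prob * L(x, u) for prob, x, u in joint)
    projected = sum(mass[x] * L(x, b[x]) for x in mass)
    return original, projected


def cost(x, u):
    # Convex in u for each fixed x.
    return 0.5 * u * u + abs(x)


joint = [(0.25, 0, -1.0), (0.25, 0, 3.0), (0.3, 1, 0.5), (0.2, 1, 2.0)]
original, projected = markovian_projection_cost(joint, cost)
```

For this hypothetical law the original cost is 2.1875 and the projected cost is 1.0525, so passing to the conditional mean of the drift strictly lowers the cost while leaving the law of the state unchanged.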
For \(\nu \in \tilde{\mathcal{A}}\),
where \(\nu (t,x,du)\) denotes a regular conditional probability of \(\nu\) given (t, x). Then by Jensen’s inequality,
\(b_\nu \in \mathbf{A} (\{\nu _{1,t}\}_{0\le t\le 1})\) from (2.5), since by Jensen’s inequality,
and for any \(t\in [0,1]\) and \(f\in C^{1,2}_b ([0,1]\times \mathbb {R}^d)\),
This implies the second equalities of (2.9)–(2.10) (see Remark 4).
The proof of (ii) is done by Lemma 1, (2.32), and Theorem 1.
We prove (iii). From (2.29) and the strict convexity of \(u\mapsto L(t,x;u)\), (2.11) holds. For \(b\in \mathbf{A} (\{P_t\}_{0\le t\le 1})\), \(P_t(dx)\ll dx\), dt-a.e. from (A.0.0,ii), since \(a, b\in L^1([0,1]\times \mathbb {R}^d,dtP_t(dx))\) (see [4], p. 1042, Corollary 2.2.2). For \(\{p_{i}(t,x)dx\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R}^d)\), \(b_i\in \mathbf{A} (\{p_{i}(t,x)dx\}_{0\le t\le 1})\), \(i=0,1\), and \(\lambda \in [0,1]\),
where \(1_A(x)\) denotes an indicator function of \(A\subset \mathbb {R}\). Then \(b_\lambda \in \mathbf{A}(\{p_\lambda (t,x)dx\}_{0\le t\le 1})\) and
Here the equality holds if and only if \(b_0=b_1\) dtdx-a.e. on the set \(\{(t,x)\in [0,1]\times \mathbb {R}^d| p_0 (t,x)p_1 (t,x)>0\}\). \(\square\)
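The convexity argument in (iii) can be illustrated pointwise. The sketch assumes the usual interpolation, which the surrounding text suggests but which is an assumption here: \(p_\lambda :=(1-\lambda )p_0+\lambda p_1\) and \(b_\lambda :=1_{(0,\infty )}(p_\lambda )\{(1-\lambda )p_0b_0+\lambda p_1b_1\}/p_\lambda\). Under this assumption, \(p_\lambda L(b_\lambda )\le (1-\lambda )p_0L(b_0)+\lambda p_1L(b_1)\) for convex L, and the function `gap` below returns the difference of the two sides. All names and numerical values are hypothetical.

```python
import random


def mix(p0, b0, p1, b1, lam):
    """Interpolated density and drift at one point (t, x)."""
    p = (1 - lam) * p0 + lam * p1
    b = ((1 - lam) * p0 * b0 + lam * p1 * b1) / p if p > 0 else 0.0
    return p, b


def gap(p0, b0, p1, b1, lam, L):
    """Right-hand side minus left-hand side of the convexity inequality."""
    p, b = mix(p0, b0, p1, b1, lam)
    return (1 - lam) * p0 * L(b0) + lam * p1 * L(b1) - p * L(b)


def square(u):
    # A strictly convex cost.
    return u * u


rng = random.Random(0)
gaps = [
    gap(rng.random(), rng.uniform(-2, 2), rng.random(), rng.uniform(-2, 2),
        rng.random(), square)
    for _ in range(1000)
]

equal_drift_gap = gap(1.0, 0.7, 2.0, 0.7, 0.3, square)    # b0 = b1
distinct_drift_gap = gap(1.0, 0.0, 1.0, 2.0, 0.5, square)  # b0 != b1
```

In this sketch the gap vanishes when the two drifts agree and is strictly positive when they differ on a set where both densities are positive, in line with the equality condition stated at the end of the proof.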
3 Stochastic optimal transport with a nonconvex cost
In this section, in the case where \(d=1\) and where a is not fixed, we consider slightly relaxed versions of the SOTPs whose cost functions are not assumed to be convex. Testing the Markov property of a minimizer of a stochastic optimal control problem is a fundamental problem in stochastic optimal control theory. We also consider the Markov property of the minimizer of a finite-time horizon stochastic control problem. Our previous result [35] proved it in the one-dimensional case by means of the optimal transportation problem with a concave cost. We generalize it by Theorem 2 in Sect. 1.
Since a is not fixed in this section, we consider a new class of semimartingales.
Let \(u=\{u(t)\}_{0\le t\le 1}\) and \(\{W(t)\}_{0\le t\le 1}\) be a progressively measurable real valued process and a one-dimensional Brownian motion on the same complete filtered probability space, respectively. The probability space under consideration is not fixed in this section. Let \(\sigma :[0,1]\times \mathbb {R}\longrightarrow \mathbb {R}\) be a Borel measurable function. Let \(Y^{u,\sigma }=\{Y^{u,\sigma }(t)\}_{0\le t\le 1}\) be a continuous semimartingale such that the following holds weakly:
provided it exists.
For \(r> 0\),
where \(b_{Y^{u,\sigma }}(t,Y^{u,\sigma }(t)):=E[u(t)|(t,Y^{u,\sigma }(t))]\). For \((u,\sigma )\in \mathcal{U}_{r}\),
Here for a distribution function F on \(\mathbb {R}\),
\(F^{-1}\) is called the quasi-inverse of F (see, e.g. [48, 51, 57]).
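For a distribution function with finitely many atoms, the quasi-inverse can be computed by a binary search. The sketch assumes the common convention \(F^{-1}(u)=\inf \{x\in \mathbb {R}\,|\,F(x)\ge u\}\) for \(0<u\le 1\); the function name `quasi_inverse` and the example distribution are hypothetical.

```python
import bisect


def quasi_inverse(sorted_atoms, cum_probs, u):
    """Return inf{x : F(x) >= u} for a discrete distribution function F.

    sorted_atoms: atoms in increasing order.
    cum_probs: F evaluated at each atom (nondecreasing, last entry 1.0).
    u: level in (0, 1].
    """
    # bisect_left finds the first index i with cum_probs[i] >= u.
    idx = bisect.bisect_left(cum_probs, u)
    return sorted_atoms[min(idx, len(sorted_atoms) - 1)]


atoms = [-1.0, 0.0, 2.0]
cum_probs = [0.2, 0.7, 1.0]  # masses 0.2, 0.5, 0.3

values = [quasi_inverse(atoms, cum_probs, u) for u in (0.1, 0.2, 0.5, 0.9, 1.0)]
```

Applying `quasi_inverse` to a uniform random level on (0, 1] produces a sample from F, which is the property used later for \(F_X^{-1}(U)\).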
exists dt-a.e. since r is positive and \((\sigma ^2, b_{Y^{u,\sigma }})\in \mathbf{A}_0(\{P^{Y^{u,\sigma }(t)}\}_{0\le t\le 1})\) (see [4], p. 1042, Corollary 2.2.2). Indeed, by Jensen’s inequality,
From the idea of covariance kernels (see [6, 7, 34, 39]),
The following holds and will be proved later.
Theorem 5
Let \(r> 0\). For \((u,\sigma )\in \mathcal{U}_{r}\), there exists \(\tilde{u}\) such that \((\tilde{u},\tilde{\sigma }:=(\sigma ^2+\tilde{a}_{u,Y^{u,\sigma }})^{\frac{1}{2}})\in \mathcal{U}_{r,Mar}\) and that the following holds:
For \(r> 0\) and \(\{P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R})\),
Let \(L_1, L_2:[0,1]\times \mathbb {R}\longrightarrow [0,\infty )\) be Borel measurable.
For \((u,\sigma )\),
For \((a,b)\in \mathbf{A}_{0}(\{P_t\}_{0\le t\le 1})\),
One easily obtains the following from Theorems 2 and 5.
Corollary 3
Suppose that \(L_1, L_2:[0,1]\times \mathbb {R}\longrightarrow [0,\infty )\) are Borel measurable.
Then for any \(r> 0\), the following holds. (i) For any \(P_0,P_1\in \mathcal{P}(\mathbb {R})\),
In particular, if there exists a minimizer in (3.16), then there exists a minimizer \((u,\sigma )\in \mathcal{U}_{r, Mar}\). (ii) For any \(\{P_t\}_{0\le t\le 1}\subset \mathcal{P}(\mathbb {R})\),
In particular, if there exists a minimizer in (3.17), then there exists a minimizer \((u,\sigma )\in \mathcal{U}_{r,Mar}\).
Suppose that \(L:[0,1]\times \mathbb {R}\times \mathbb {R}\longrightarrow [0,\infty ), \Psi :\mathbb {R}\longrightarrow [0,\infty )\) are Borel measurable. Then for any \(P_0\in \mathcal{P}(\mathbb {R})\),
where \(V_r\) denotes V with \(\mathcal{A}\) replaced by \(\{Y^{u, \sigma }|(u, \sigma )\in \mathcal{U}_{r}\}\).
In particular, we easily obtain the following from Corollary 3.
Corollary 4
In addition to the assumption of Corollary 3, suppose that \(\Psi :\mathbb {R}\longrightarrow [0,\infty )\) is Borel measurable. Then for any \(r> 0\) and \(P_0\in \mathcal{P}(\mathbb {R})\),
In particular, if there exists a minimizer in (3.19), then there exists a minimizer \((u,\sigma )\in \mathcal{U}_{r, Mar}\).
We prove Theorem 5 by Theorem 2.
Proof of Theorem 5
For \((u,\sigma )\in \mathcal{U}_{r}\), the following holds (see [35]):
Indeed,
For an \(\mathbb {R}^2\)-valued random variable \(Z=(X,Y)\) on a probability space,
where \(F_X\) denotes the distribution function of X and U is a uniformly distributed random variable on [0, 1]. The distribution functions of \(F_X^{-1} (U)\) and \(F_Y^{-1} (1-U)\) are \(F_X\) and \(F_Y\), respectively. From (3.7), \(F^{Y^{u, \sigma }}_t (Y^{u, \sigma }(t))\) is uniformly distributed on [0, 1] and \((F^{Y^{u, \sigma }}_t)^{-1} (F^{Y^{u, \sigma }}_t (Y^{u, \sigma } (t)))=Y^{u, \sigma }(t)\), P-a.s., dt-a.e. (see [17] or, e.g. [48, 51, 57]).
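The coupling \((F_X^{-1} (U), F_Y^{-1} (1-U))\) pairs large values of one variable with small values of the other. For empirical distributions with equally weighted atoms it reduces to pairing the i-th smallest x with the i-th largest y, and by the classical rearrangement inequality this pairing minimizes the mean product among all pairings with the same marginals. The sketch checks this by brute force; the function names and sample values are hypothetical.

```python
from itertools import permutations


def mean_product(xs, ys):
    """Average of x * y over the given pairing (atoms equally weighted)."""
    return sum(x * y for x, y in zip(xs, ys)) / len(xs)


def countermonotone_pairing(xs, ys):
    """Pair the i-th smallest x with the i-th largest y."""
    return sorted(xs), sorted(ys, reverse=True)


xs = [0.3, -1.2, 2.0, 0.9]
ys = [1.5, 0.1, -0.7, 2.2]

# Minimum of E[XY] over all pairings that keep both empirical marginals fixed.
brute_min = min(
    mean_product(xs, [ys[j] for j in s]) for s in permutations(range(len(ys)))
)
ax, ay = countermonotone_pairing(xs, ys)
anti = mean_product(ax, ay)
```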
It is easy to see that the following holds from (3.8) and (3.20):
Indeed, from (3.20), the following holds:
The following will be proved below:
(3.21)–(3.22) and Theorem 2 complete the proof. We prove (3.22).
From (3.20),
In particular, the following holds dt-a.e.:
Since \(|\arctan y|\) is bounded, (3.8) and (3.21) complete the proof of (3.22).
\(\square\)
4 Semiconcavity and continuity of Schrödinger’s Problem
Proposition 1 and Lemma 1 imply that \(\mathcal{P} (\mathbb {R}^d \times \mathbb {R}^d )\ni P\times Q\mapsto V(P,Q)\) is convex and lower semicontinuous.
In this section, we give a sufficient condition under which for a fixed \(Q\in \mathcal{P} (\mathbb {R}^d )\), \(L^2 (\varOmega , P;\mathbb {R}^d)\ni X\mapsto V_S(P^X,Q)\) is semiconcave and is continuous (see (1.19) for notation). More precisely, we show that there exists \(C>0\) such that for a fixed \(Q\in \mathcal{P} (\mathbb {R}^d )\),
is concave and is continuous. Here \(L^2 (\varOmega , P;\mathbb {R}^d)\) denotes the space of all square integrable functions from a probability space \((\varOmega , \mathcal{F}, P)\) to \((\mathbb {R}^d,\mathbf{B}(\mathbb {R}^d))\). Let \(W_2\) denote the Wasserstein distance of order 2, i.e. \(T^{1/2}\) with \(c=|y-x|^2\) in Remark 5. We also show the Lipschitz continuity of \(\mathcal{P}_{2} (\mathbb {R}^d )\ni P\mapsto V_S(P,Q)\) in \(W_2\) (see (4.4) for notation).
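For intuition about \(W_2\), the one-dimensional case can be computed by hand: for two empirical measures on \(\mathbb {R}\) with the same number of equally weighted atoms, the monotone (sorted) pairing is optimal for the quadratic cost. The sketch below relies on that one-dimensional fact only and says nothing about \(d>1\); the function name `empirical_w2_1d` is hypothetical.

```python
import math


def empirical_w2_1d(xs, ys):
    """W_2 between two equally weighted empirical measures on the real line."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # In one dimension the sorted pairing is optimal for the cost |y - x|^2.
    mean_sq = sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys))) / len(xs)
    return math.sqrt(mean_sq)


shifted = empirical_w2_1d([0.0, 1.0], [2.0, 1.0])      # atoms shifted by 1
same = empirical_w2_1d([3.0, 1.0, 2.0], [1.0, 2.0, 3.0])  # same measure
```

A pure translation by a constant gives a distance equal to the size of the shift, and reordering the atoms of one sample does not change the result.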
We first describe the assumptions in this section.
(A.5) \(\sigma (t,x)=(\sigma ^{ij}(t,x))_{i,j=1}^d\), \((t,x)\in [0,1]\times \mathbb {R}^d\), is a \(d\times d\)-matrix. \(a(t,x):=\sigma (t,x)\sigma (t,x)^*\), \((t,x)\in [0,1]\times \mathbb {R}^d\), is uniformly nondegenerate, bounded, once continuously differentiable, and uniformly Hölder continuous. \(D_x a(t,x)\) is bounded and the first derivatives of a(t, x) are uniformly Hölder continuous in x uniformly in \(t\in [0,1]\).
(A.6) \(\xi (t,x):[0,1]\times \mathbb {R}^d\longrightarrow \mathbb {R}^d\) is bounded, continuous, and uniformly Hölder continuous in x uniformly in \(t\in [0,1]\).
Remark 8
(A.5)–(A.6) imply (A.0.0), (A.1), and (A.2) for (1.19). (A.4)’ implies (A.0)–(A.3) and (A.5)–(A.6).
We describe the following fact.
Theorem 6
Suppose that (A.5)–(A.6) hold. Then for any \(P_0\in \mathcal{P}(\mathbb {R}^d)\), the following SDE has a unique weak solution with a positive continuous transition probability density p(t, x; s, y), \(0\le t<s\le 1\), \(x,y\in \mathbb {R}^d\):
(see [28]). Besides, there exist constants \(C_1, C_2>0\) such that
Remark 9
If \(V_S(P,Q)\) is finite, then the distribution of the minimizer X of \(V_S(P,Q)\) is absolutely continuous with respect to \(P^\mathbf{X}\). In particular, \(Q(dx)\ll dx\) under (A.5)–(A.6). Indeed, \(V_S(P,Q)\) is the relative entropy of \(P^{X}\) with respect to \(P^\mathbf{X}\) and \(P^{\mathbf{X}(1)}\) has a density (see the discussion below Remark 3).
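The mechanism behind Remark 9 is that a finite relative entropy forces absolute continuity. A minimal sketch on a finite state space, where measures are plain lists of probabilities, illustrates this; the helper `relative_entropy` is hypothetical and only mirrors the discrete definition \(H(\mu |\nu )=\sum _i \mu _i\log (\mu _i/\nu _i)\).

```python
import math


def relative_entropy(mu, nu):
    """H(mu | nu) on a finite state space; math.inf if mu is not << nu."""
    if len(mu) != len(nu):
        raise ValueError("mu and nu must live on the same state space")
    total = 0.0
    for m, n in zip(mu, nu):
        if m == 0.0:
            continue  # convention: 0 * log(0 / n) = 0
        if n == 0.0:
            return math.inf  # mu charges a nu-null point
        total += m * math.log(m / n)
    return total


h_equal = relative_entropy([0.5, 0.5], [0.5, 0.5])
h_point = relative_entropy([1.0, 0.0], [0.5, 0.5])
h_singular = relative_entropy([0.5, 0.5], [1.0, 0.0])
```

Here `h_singular` is infinite because the first measure charges a point that the second one does not, which is the discrete counterpart of the failure of absolute continuity.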
We recall the definition of displacement convexity.
Definition 7
(Displacement convexity (see [32])) Let \(G:\mathcal{P}(\mathbb {R}^d )\longrightarrow \mathbb {R}\cup \{\infty \}\). G is displacement convex if the following is convex: for any \(\rho _0, \rho _1\in \mathcal{P}(\mathbb {R}^d )\) and convex function \(\varphi :\mathbb {R}^d\longrightarrow \mathbb {R}\cup \{\infty \}\),
where \(\rho _t:= \rho _0(id+t(D\varphi -id))^{-1}\), \(0< t<1\), provided \(\rho _1= \rho _0(D\varphi )^{-1}\) and \(\rho _t\) can be defined. Here id denotes an identity mapping.
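The interpolation \(\rho _t\) in Definition 7 can be illustrated with samples: if \(x\) is distributed according to \(\rho _0\), then \((1-t)x+tD\varphi (x)\) is distributed according to \(\rho _t\). The sketch below is one-dimensional and assumes the convex function \(\varphi (x)=x^2+x\), so that \(D\varphi (x)=2x+1\); both this choice and the function name `displacement_interpolation` are illustrative assumptions.

```python
def displacement_interpolation(samples, transport_map, t):
    """Push samples of rho_0 forward by id + t * (D phi - id)."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [(1.0 - t) * x + t * transport_map(x) for x in samples]


def d_phi(x):
    """Derivative of the convex function phi(x) = x**2 + x (an assumption)."""
    return 2.0 * x + 1.0


rho0_samples = [0.0, 1.0, 2.0]
midpoint = displacement_interpolation(rho0_samples, d_phi, 0.5)
endpoint = displacement_interpolation(rho0_samples, d_phi, 1.0)
```

At \(t=0\) the samples are unchanged, and at \(t=1\) they are the images under \(D\varphi\), matching \(\rho _1=\rho _0(D\varphi )^{-1}\).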
Recall that a convex function is differentiable dx-a.e. in the interior of its domain (see, e.g. [61]) and \(\rho _t\) in (4.3) is well defined if \(\rho _0\in \mathcal{P}_{2,ac} (\mathbb {R}^d )\) and if \(\rho _1\in \mathcal{P}_{2} (\mathbb {R}^d )\) (see, e.g. [61]). Here
The following implies that \(L^2 (\varOmega , P;\mathbb {R}^d)\ni X\mapsto V_S(P^X,Q)\) is semiconcave for a fixed \(Q\in \mathcal{P}_{ac} (\mathbb {R}^d )\) and will be proved later.
Theorem 7
Suppose that (A.4)’ holds and that there exists a constant \(C>0\) such that \(x\mapsto \log p(0,x;1,y) +C|x|^2\) is convex for any \(y\in \mathbb {R}^d\). Then for any \(Q\in \mathcal{P}_{ac} (\mathbb {R}^d )\), \(X_i\in L^2 (\varOmega , P; \mathbb {R}^d), i=1,2\), and \(\lambda _1\in (0,1)\),
where \(\lambda _2:=1-\lambda _1\). Equivalently, the following is convex:
In particular, the following is displacement convex:
Remark 10
Suppose that \(a_{ij}=a_{ij} (x), \xi _i=\xi _i(x)\in C_b^\infty (\mathbb {R}^d)\) and that a(x) is uniformly nondegenerate. Then \(D_x^2\log p(0,x;1,y)\) is bounded (see [58], Theorem B). In particular, there exists a constant \(C>0\) such that for any \(y\in \mathbb {R}^d\), \(x\mapsto \log p(0,x;1,y) +C|x|^2\) is convex.
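As a quick check of this convexity condition in the simplest case, consider the Gaussian kernel \(p(0,x;1,y)=(2\pi a)^{-d/2}\exp (-|y-x|^2/(2a))\), \(a>0\), which appears in Corollary 5. The computation below is a sketch added for illustration; \(I\) denotes the \(d\times d\) identity matrix.

```latex
\begin{aligned}
\log p(0,x;1,y) + C|x|^2
  &= -\frac{d}{2}\log (2\pi a) - \frac{|y-x|^2}{2a} + C|x|^2, \\
D_x^2\bigl(\log p(0,x;1,y) + C|x|^2\bigr)
  &= \Bigl(2C - \frac{1}{a}\Bigr) I .
\end{aligned}
```

Hence \(x\mapsto \log p(0,x;1,y)+C|x|^2\) is convex for every \(y\) if and only if \(C\ge (2a)^{-1}\), which is consistent with the choice \(C=(2a)^{-1}\) made in the proof of Corollary 5.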
For \(P\in \mathcal{P}(\mathbb {R}^d )\),
Let \(\mu (P,Q)\) denote the joint distribution at \(t=0, 1\) of the minimizer of \(V_S(P,Q)\), provided \(V_S(P,Q)\) is finite. The following implies that \(L^2 (\varOmega , P;\mathbb {R}^d)\ni X\mapsto V_S(P^X,Q)\) is continuous for a fixed \(Q\in \mathcal{P}_{ac} (\mathbb {R}^d )\) such that \(\mathcal{S}(Q)\) is finite.
The lower semicontinuity of \(\mathcal{P}(\mathbb {R}^d )\ni P\mapsto V_S(P,Q)\) is known and can be proved, e.g. from Proposition 1 and Lemma 1. That of (4.12) can be proved in the same way as Lemma 3.4 in [43]. We give the proof for the sake of completeness.
Theorem 8
Suppose that (A.5)–(A.6) hold. For \(P, Q\in \mathcal{P}_{2} (\mathbb {R}^d )\), if \(\mathcal{S}(Q)\) is finite, then \(V_S (P,Q)\) is finite and the following holds:
In particular, the following is weakly lower semicontinuous:
(see (4.2) for notation). The following is also continuous in the topology induced by \(W_2\):
If \(\mathcal{S}(Q)\) is infinite, then so is \(V_S (P,Q)\).
Remark 11
For \(C>0\) and \(P, Q\in \mathcal{P} (\mathbb {R}^d )\),
\(\Psi _{Q,C} (P)\) plays a crucial role in the construction of moment measures by the SOTP (see [43, 44] and also [54] for the approach by the OTP). Since \(\mathcal{P}_{ac} (\mathbb {R}^d )\ni P\mapsto \mathcal{S}(P)\) is strictly displacement convex from Theorem 2.2 in [32], so is \(\mathcal{P}_{2, ac} (\mathbb {R}^d )\ni P\mapsto \Psi _{Q,C} (P)\) under the assumption of Theorem 7.
From Theorem 7, under stronger assumptions than Theorem 8, for a fixed \(Q\in \mathcal{P}_{2,ac} (\mathbb {R}^d )\) such that \(\mathcal{S}(Q)\) is finite, we prove that \(\mathcal{P}_2 (\mathbb {R}^d )\ni P\mapsto V_S(P,Q)\) is Lipschitz continuous in \(W_2\).
Corollary 5
Suppose that (A.4)’ holds and that there exists a constant \(C>0\) such that \(\log p(0,x;1,y) +C|x|^2\) is convex in x for any \(y\in \mathbb {R}^d\). Then for any \(Q\in \mathcal{P}_{2,ac} (\mathbb {R}^d )\) such that \(\mathcal{S}(Q)\) is finite, the following holds:
where \(||x||_{L^2(P)}:=(\int _{\mathbb {R}^d}|x|^2P(dx))^{1/2}\), \(P\in \mathcal{P}_2 (\mathbb {R}^d )\) and
(see (4.2) for notation). In particular, if \(p(0,x;1,y)=(2\pi a)^{-d/2}\exp (-|y-x|^2/(2a)), a>0\), then
where
We prove Theorems 7 and 8, and Corollary 5.
Proof of Theorem 7
For any \(f_{i}\in C^\infty _b (\mathbb {R}^d)\), let \(u_{i} (x):=\varphi (0,x;f_{i})\) (see (2.18) for notation). Then
Indeed,
by the Duality Theorem for \(V_S\) (see Corollary 2).
In the inequality above, we argued as follows:
by Hölder’s inequality. Taking the supremum in \(f_i\) over \(C^\infty _b (\mathbb {R}^d)\) on the left hand side of (4.17), the Duality Theorem for \(V_S\) completes the proof (see Corollary 2). \(\square\)
Proof of Theorem 8
We prove the first part. We first prove that \(V_S(P,Q)\) is finite. Indeed, from [53],
from (4.2) (see (2.3) for notation). Here for \(\mu , \nu \in \mathcal{P}(\mathbb {R}^d\times \mathbb {R}^d )\),
There exists a Borel measurable \(f:\mathbb {R}^d\longrightarrow \mathbb {R}\) such that the following holds (see, e.g. [28]):
(see (4.18) for notation).
Since \(V_S(P,Q)\) is finite, \(f\in L^1 (\mathbb {R}^d,P_1)\) and \(\varphi (0,x;f)\in L^1 (\mathbb {R}^d,P_0)\) (see, e.g. [53]). In particular,
which completes the proof of (4.11). \(P\times Q\mapsto H(P\times Q| \mu (P,Q))\) is weakly lower semicontinuous since \(P\times Q\mapsto \mu (P,Q)\) is weakly continuous (see [43]) and since \((\mu ,\nu )\mapsto H(\mu | \nu )\) is weakly lower semicontinuous (see, e.g. [18], Lemma 1.4.3). In particular, (4.12) is weakly lower semicontinuous from (4.11). The weak lower semicontinuity of (4.12) implies the upper semicontinuity of (4.13) since for \(P_n, P\in \mathcal{P}(\mathbb {R}^d), n\ge 1\), \(W_2 (P_n,P)\rightarrow 0\) as \(n\rightarrow \infty\) if and only if \(P_n\rightarrow P\) weakly and \(\int _{\mathbb {R}^d} |x|^2P_n(dx)\rightarrow \int _{\mathbb {R}^d} |x|^2P(dx)\) (see, e.g. [61]).
(4.13) is also weakly lower semicontinuous by Proposition 1 and Lemma 1.
We prove the last part.
Then by Jensen’s inequality,
since
(4.2) completes the proof.
\(\square\)
Remark 12
Under (A.5)–(A.6), from Theorem 6, (4.21), and (4.23), for \(P, Q\in \mathcal{P}_2 (\mathbb {R}^d)\), if \(\mathcal{S}(Q)\) is finite, then
Remark 12 plays a crucial role in the proof of Corollary 5.
Proof of Corollary 5
Let \(X,Y\in L^2 (\varOmega , P; \mathbb {R}^d)\) and \(\lambda :=\min (1, ||X-Y||_2)\), where \(||X||_2:=\{E[|X|^2]\}^{1/2}\).
We prove the following when \(\lambda >0\).
(see (4.2) for notation). From Theorem 7,
since \(Y=(1-\lambda )X+\lambda (\lambda ^{-1}(Y-X)+X)\). From this,
Since (A.4)’ implies (A.5)–(A.6),
from Remark 12. The following completes the proof of the first part:
We prove the second part. One can set \(C= (2a)^{-1}\).
From (4.26), the following holds:
since
The following completes the proof: from Remark 12,
\(\square\)
References
Ambrosio, L.: Transport equation and Cauchy problem for BV vector fields. Invent. Math. 158, 227–260 (2004)
Ambrosio, L., Trevisan, D.: Well-posedness of Lagrangian flows and continuity equations in metric measure spaces. Anal. PDE 7(5), 1179–1234 (2014)
Aronson, D.G.: Bounds on the fundamental solution of a parabolic equation. Bull. Am. Math. Soc. 73, 890–896 (1967)
Bogachev, V.I., Krylov, N.V., Röckner, M.: Elliptic and parabolic equations for measures. Russ. Math. Surv. 64(6), 973–1078 (2009)
Bogachev, V.I., Röckner, M., Shaposhnikov, S.V.: On the Ambrosio–Figalli–Trevisan superposition principle for probability solutions to Fokker–Planck–Kolmogorov equations. J. Dyn. Differ. Equ. (2020)
Cacoullos, T., Papathanasiou, V., Utev, S.A.: Another characterization of the normal law and a proof of the central limit theorem connected with it. Theory Probab. Appl. 37, 581–588 (1992)
Cacoullos, T., Papathanasiou, V., Utev, S.A.: Variational inequalities with examples and an application to the central limit theorem. Ann. Probab. 22, 1607–1618 (1994)
Carlen, E.A.: Conservative diffusions. Commun. Math. Phys. 94, 293–315 (1984)
Carlen, E.A.: Existence and sample path properties of the diffusions in Nelson’s stochastic mechanics. In: Albeverio, S., Blanchard, Ph., Streit, L. (eds.) Stochastic processes-Mathematics and Physics, Bielefeld 1984, Lecture Notes in Math., Vol. 1158, pp. 25–51. Springer, Heidelberg (1986)
Carmona, R.: Probabilistic construction of Nelson processes. In: Itô, K., Ikeda, N. (eds.) Proc. Probabilistic Methods in Mathematical Physics, Katata 1985, pp. 55–81. Kinokuniya, Tokyo (1987)
Cattiaux, P., Léonard, C.: Minimization of the Kullback information of diffusion processes. Ann. Inst. H. Poincaré Probab. Stat. 30, 83–132 (1994)
Cattiaux, P., Léonard, C.: Correction to: Minimization of the Kullback information of diffusion processes [Ann. Inst. H. Poincaré Probab. Statist. 30 (1994), no. 1, 83–132]. Ann. Inst. H. Poincaré Probab. Stat. 31, 705–707 (1995)
Cattiaux, P., Léonard, C.: Large deviations and Nelson processes. Forum Math. 7, 95–115 (1995)
Cattiaux, P., Léonard, C.: Minimization of the Kullback information for some Markov processes. In: Azéma, J. et al. (eds.) Séminaire de Probabilités, XXX, Lecture Notes in Math., Vol. 1626, pp. 288–311. Springer, Heidelberg (1996)
Crandall, M.G., Ishii, H., Lions, P.L.: User’s guide to viscosity solutions of second order partial differential equations. Bull. Am. Math. Soc. 27, 1–67 (1992)
Dai Pra, P.: A stochastic control approach to reciprocal diffusion processes. Appl. Math. Optim. 23, 313–329 (1991)
Dall’Aglio, G.: Sugli estremi dei momenti delle funzioni di ripartizione doppie. Ann. Scuola Normale Superiore Di Pisa, Cl. Sci. 3(1), 33–74 (1956)
Dupuis, P., Ellis, R.S.: A Weak Convergence Approach to the Theory of Large Deviations. John Wiley & Sons, New York (1997)
Figalli, A.: Existence and uniqueness of martingale solutions for SDEs with rough or degenerate coefficients. J. Funct. Anal. 254, 109–153 (2008)
Fleming, W.H., Soner, H.M.: Controlled Markov Processes and Viscosity Solutions. Springer, New York (1993)
Föllmer, H.: Random fields and diffusion processes. In: Hennequin, P.L. (ed.) École d’Été de Probabilités de Saint-Flour XV–XVII, 1985–87, Lecture Notes in Math., Vol. 1362, pp. 101–203. Springer, Heidelberg (1988)
Friedman, A.: Partial Differential Equations of Parabolic Type. Dover Publications, New York (2013)
Gomes, D.A.: A stochastic analogue of Aubry-Mather theory. Nonlinearity 15, 581–603 (2002)
Gomes, D.A., Mitake, H., Tran, H.V.: The large time profile for Hamilton–Jacobi–Bellman equations. arXiv:2006.04785
Ikeda, N., Watanabe, S.: Stochastic Differential Equations and Diffusion Processes. North-Holland/Kodansha, Tokyo (1981)
Ioffe, A.D., Tihomirov, V.M.: Theory of Extremal Problems. North-Holland, Amsterdam (1979)
Jamison, B.: Reciprocal processes. Z. Wahrsch. Verw. Gebiete 30, 65–86 (1974)
Jamison, B.: The Markov process of Schrödinger. Z. Wahrsch. Verw. Gebiete 32, 323–331 (1975)
Koike, S.: A beginner’s guide to the theory of viscosity solutions. MSJ Memoirs, Vol. 13. Math. Soc. Japan., Tokyo (2004)
Léonard, C.: A survey of the Schrödinger problem and some of its connections with optimal transport. Special Issue on Optimal Transport and Applications. Discr. Contin. Dyn. Syst. 34, 1533–1574 (2014)
Liptser, R.S., Shiryaev, A.N.: Statistics of Random Processes I. Springer, Heidelberg (1977)
McCann, R.J.: A convexity principle for interacting gases. Adv. Math. 128, 153–179 (1997)
Mikami, T.: Variational processes from the weak forward equation. Commun. Math. Phys. 135, 19–40 (1990)
Mikami, T.: Equivalent conditions on the central limit theorem for a sequence of probability measures on \(\mathbb{R}\). Stat. Probab. Lett. 37, 237–242 (1998)
Mikami, T.: Markov marginal problems and their applications to Markov optimal control. In: McEneaney, W.M. et al. (eds.) Stochastic Analysis, Control, Optimization and Applications, A Volume in Honor of W. H. Fleming, pp. 457–476. Birkhäuser, Boston (1999)
Mikami, T.: Dynamical systems in the variational formulation of the Fokker–Planck equation by the Wasserstein metric. Appl. Math. Optim. 42, 203–227 (2000)
Mikami, T.: Optimal control for absolutely continuous stochastic processes and the mass transportation problem. Elect. Commun. Probab. 7, 199–213 (2002)
Mikami, T.: Monge’s problem with a quadratic cost by the zero-noise limit of \(h\)-path processes. Probab. Theory Related Fields 129, 245–260 (2004)
Mikami, T.: Covariance kernel and the central limit theorem in the total variation distance. J. Multivar. Anal. 90, 257–268 (2004)
Mikami, T.: Semimartingales from the Fokker–Planck equation. Appl. Math. Optim. 53, 209–219 (2006)
Mikami, T.: Marginal problem for semimartingales via duality. In: Giga, Y., Ishii, K., Koike, S. et al. (eds) International Conference for the 25th Anniversary of Viscosity Solutions, Gakuto International Series. Mathematical Sciences and Applications 30, pp. 133–152. Gakkotosho, Tokyo (2008)
Mikami, T.: Regularity of Schrödinger’s functional equation and mean field PDEs for h-path processes. Osaka J. Math. 56, 831–842 (2019)
Mikami, T.: Regularity of Schrödinger’s functional equation in the weak topology and moment measures. J. Math. Soc. Jpn. 73, 99–123 (2021)
Mikami, T.: Stochastic optimal transportation. A book in preparation
Mikami, T., Thieullen, M.: Duality theorem for stochastic optimal control problem. Stoc. Proc. Appl. 116, 1815–1835 (2006)
Nagasawa, M.: Transformations of diffusion and Schrödinger process. Probab. Theory Related Fields 82, 109–136 (1989)
Nagasawa, M.: Stochastic Processes in Quantum Physics (Monographs in Mathematics 94). Birkhäuser, Basel (2000)
Nelsen, R.B.: An Introduction to Copulas, 2nd edn. Springer, Heidelberg (2006)
Nelson, E.: Dynamical Theories of Brownian Motion. Princeton University Press, Princeton (1967)
Nelson, E.: Quantum Fluctuations. Princeton University Press, Princeton (1984)
Rachev, S.T., Rüschendorf, L.: Mass transportation problems, Vol. I: Theory, Vol. II: Application. Springer, Heidelberg (1998)
Röckner, M., Xie, L., Zhang, X.: Superposition principle for non-local Fokker-Planck operators. Probab. Theory Related Fields 178, 699–733 (2020)
Rüschendorf, L., Thomsen, W.: Note on the Schrödinger equation and \(I\)-projections. Statist. Probab. Lett. 17, 369–375 (1993)
Santambrogio, F.: Dealing with moment measures via entropy and optimal transport. J. Funct. Anal. 271, 418–436 (2016)
Schrödinger, E.: Ueber die Umkehrung der Naturgesetze. Sitz. Ber. der Preuss. Akad. Wissen., Berlin, Phys. Math. pp. 144–153 (1931)
Schrödinger, E.: Théorie relativiste de l’electron et l’interprétation de la mécanique quantique. Ann. Inst. H. Poincaré 2, 269–310 (1932)
Schweizer, B., Sklar, A.: Probabilistic Metric Spaces. Dover Publications, New York (2005)
Sheu, S.J.: Some estimates of the transition density of a nondegenerate diffusion Markov process. Ann. Probab. 19, 538–561 (1991)
Tan, X., Touzi, N.: Optimal transportation under controlled stochastic dynamics. Ann. Probab. 41, 3201–3240 (2013)
Trevisan, D.: Well-posedness of multidimensional diffusion processes with weakly differentiable coefficients. Electron J. Probab. 21, 1–41 (2016)
Villani, C.: Topics in Optimal Transportation. American Mathematical Society, Providence, RI (2003)
Zambrini, J.C.: Variational processes. In: Albeverio, S. et al. (eds.) Stochastic processes in classical and quantum systems, Ascona 1985, Lecture Notes in Phys., Vol. 262, pp. 517–529. Springer, Heidelberg (1986)
Zheng, W.A.: Tightness results for laws of diffusion processes application to stochastic mechanics. Ann. Inst. Henri Poincaré 21, 103–124 (1985)
Additional information
This article is part of the topical collection Viscosity solutions, Dedicated to Hitoshi Ishii on the award of the 1st Kodaira Kunihiko Prize edited by Kazuhiro Ishige, Shigeaki Koike, Tohru Ozawa, and Senjo Shimizu.
Partially supported by JSPS KAKENHI Grant Numbers JP26400136 and 19K03548.
Cite this article
Mikami, T. Stochastic optimal transport revisited. SN Partial Differ. Equ. Appl. 2, 5 (2021). https://doi.org/10.1007/s42985-020-00059-3