1 Introduction

Since the seminal paper of Hobson [24], a substantial literature has developed on the topic of robust, or model-free, superhedging of a path-dependent derivative security with payoff \(\xi\), given the price process of some underlying financial asset together with a class of derivatives written on it; see [7, 9, 10, 11, 12, 13, 14, 16, 17, 26, 28, 33] and the survey papers of Obłój [34] and Hobson [25]. In continuous-time models, these papers mainly focus on derivatives whose payoff \(\xi\) is stable under time change. The key observation is that in the idealized context where all \(T\)-maturity European calls and puts, with all possible strikes, are available for trading, the model-free superhedging cost of \(\xi\) is closely related to the Skorokhod embedding problem. Indeed, the market prices of all \(T\)-maturity European calls and puts with all possible strikes allow one to recover the marginal distribution of the underlying asset price at time \(T\).

Recently, this problem has been addressed via a new connection to the theory of optimal transportation; see [3, 21, 23, 1, 2, 18, 19]. Our interest in this paper is in the formulation of a Brenier theorem in the present martingale context. We recall that the Brenier theorem in the standard optimal transportation theory states that the optimal coupling measure is induced by the gradient of some convex function, which is identified in the one-dimensional case with the so-called Fréchet–Hoeffding coupling [6]. A remarkable feature is that this coupling is optimal for the whole class of coupling cost functions satisfying the so-called Spence–Mirrlees condition.

We first consider the one-period model. In the rest of the paper, we assume that one can borrow or lend at a zero rate of interest. Denote by \(X\), \(Y\) the prices of some underlying asset at the maturities 0 and 1, respectively. Then the possibility of dynamic trading implies that the no-arbitrage condition is equivalent to the non-emptiness of the set \(\mathcal{M}_{2}\) of all joint measures ℙ on \(\mathbb{R}_{+}\times\mathbb{R}_{+}\) satisfying the martingale condition \(\mathbb{E}^{\mathbb{P}}[Y|X]=X\). The model-free subhedging and superhedging costs of some derivative security with payoff \(c(X,Y)\), given the marginal distributions \(X\sim\mu\) and \(Y\sim\nu\), are essentially reduced to the martingale transportation problems

$$\begin{aligned} \inf_{\mathbb{P}\in\mathbf{M}_{2}(\mu,\nu)}\mathbb{E}^{\mathbb{P}}[c(X,Y)] \qquad \mbox{and} \qquad & \sup_{\mathbb{P}\in\mathbf{M}_{2}(\mu,\nu)}\mathbb{E}^{ \mathbb{P}}[c(X,Y)], \end{aligned}$$

where \(\mathbf{M}_{2}(\mu,\nu)\) is the collection of all probability measures \(\mathbb{P}\in\mathcal{M}_{2}\) such that \(X\sim_{\mathbb{P}} \mu\), \(Y\sim_{\mathbb{P}}\nu\). Our main objective is to characterize the optimal coupling measures which solve the above problems. This provides some remarkable extremal points of the convex (and weakly compact) set \(\mathbf{M}_{2}(\mu, \nu)\). In the absence of marginal restrictions, Jacod and Yor [30] (see also Jacod and Shiryaev [29], Dubins and Schwarz [20] for the discrete-time setting) proved that a martingale measure \(\mathbb{P} \in\mathcal{M}_{2}\) is extremal if and only if all ℙ-local martingales admit a predictable representation. In the present one-period model, such extremal points of \({\mathcal {M}}_{2}\) consist of binomial models. For a specific class of coupling functions \(c\), the extremal points of the corresponding martingale transportation problem turn out to be of the same nature, and our main contribution in this paper is to provide an explicit characterization.

Our starting point is a paper by Hobson and Neuberger [27] who considered the specific case of the coupling function \(c(x,y):=|x-y|\), and provided a completely explicit solution of the optimal coupling measure and the corresponding optimal semi-static strategy. In a recent paper, Beiglböck and Juillet [4] address the problem from the viewpoint of optimal transportation. By a convenient extension of the notion of cyclic monotonicity, the authors of [4] introduce the notion of a left-monotone transference plan. They also introduce the notion of left curtain as a left-monotone transference plan concentrated on the graph of a binomial map. The remarkable result of [4] is the existence and uniqueness of the left-monotone transference plan which is indeed a left curtain, together with the optimality of this joint probability measure for some specific class \(\mathcal{C}_{\mathrm{BJ}}\) of coupling payoffs \(c(x,y)\). Notice that the coupling measure of [27] is not a left curtain, and \(\mathcal{C}_{\mathrm{BJ}}\) does not contain the coupling payoff \(|x-y|\).

As the first main contribution, we provide an explicit description of the left curtain \(\mathbb{P}_{*}\) of [4]. Then, by using the weak duality inequality,

  • we provide a larger class \(\mathcal{C}\supset\mathcal{C}_{\mathrm{BJ}}\) of payoff functions for which \(\mathbb{P}_{*}\) is optimal;

  • we identify explicitly the solution of the dual problem which consists of the optimal semi-static superhedging strategy;

  • as a by-product, the strong duality holds true.

Our class \(\mathcal{C}\) is the collection of all smooth functions \(c:\mathbb{R}\times\mathbb{R}\to\mathbb{R}\), with linear growth, such that \(c_{xyy}>0\). We argue that this is essentially the natural class for our martingale version of the Brenier theorem.

We next explore the multiple marginals extension of our result. In the context of a model in finite discrete time, we provide a direct extension of our result which applies to the context of the discretely monitored variance swap. This answers the open question of optimal model-free upper and lower bounds for this derivative security.

The paper is organized as follows. Section 2 provides a quick review of the Brenier theorem in the standard one-dimensional optimal transportation problem. The martingale version of the Brenier theorem is reported in Sect. 3. The explicit construction of the left-monotone martingale transport plan is described in Sect. 4, and the characterization of the optimal dual superhedging is given in Sect. 5. We report our extensions to the multiple marginals case in Sect. 6. Finally, Sect. 7 contains the proofs of our main results.

2 The Brenier theorem in one-dimensional optimal transportation

2.1 The two-marginals optimal transportation problem

Let \(X\), \(Y\) be two scalar random variables denoting the prices of two financial assets at some future maturity \(T\). The pair \((X,Y)\) takes values in \(\mathbb{R}^{2}\), and its distribution is defined by some \(\mathbb{P}\in{\mathcal {P}}_{\mathbb{R}^{2}}\), the set of all probability measures on \(\mathbb{R}^{2}\).

We assume that \(T\)-maturity European call options, on each asset and with all possible strikes, are available for trading at exogenously given market prices. Then it follows from Breeden and Litzenberger [5] that the marginal distributions of \(X\) and \(Y\) are completely determined by the second derivative of the corresponding (convex) call price functions with respect to the strike. We denote by \(\mu\) and \(\nu\) the implied marginal distributions of \(X\) and \(Y\), respectively, by \(\ell^{\mu}\), \(r^{\mu}\), \(\ell^{\nu}\), \(r^{\nu}\) the left and right endpoints of their supports, and by \(F_{\mu}\), \(F_{\nu}\) the corresponding cumulative distribution functions.

By definition of the problem, the probability measures \(\mu\) and \(\nu\) have finite first moments, i.e.,

$$\begin{aligned} \int|x|\mu(dx)+\int|y|\nu(dy) < & \infty, \end{aligned}$$
(2.1)

and although the supports of \(\mu\) and \(\nu\) could be restricted to the nonnegative real line for the financial application, we consider the more general case where \(\mu\) and \(\nu\) lie in \({\mathcal {P}}_{ \mathbb{R}}\), the collection of all probability measures on ℝ.

We consider a derivative security defined by the payoff \(c(X,Y)\) at maturity \(T\), for some upper semicontinuous function \(c:\mathbb{R} ^{2}\to\mathbb{R}\) satisfying the growth condition

$$ c(x,y) \leq\varphi(x)+\psi(y) \quad\mbox{for some } \varphi, \psi:\mathbb{R}\to\mathbb{R}, \varphi^{+} \in\mathbb{L}^{1}(\mu), \psi^{+}\in\mathbb{L}^{1}(\nu). $$
(2.2)

The model-independent upper bound for this payoff consistent with vanilla option prices of maturity \(T\) can then be framed as a Monge–Kantorovich (in short MK) optimal transport problem, namely

$$\begin{aligned} P^{0}_{2}(\mu,\nu) := \sup_{\mathbb{P}\in{\mathcal {P}}_{2}(\mu,\nu)} \mathbb{E}^{\mathbb{P}}[c(X,Y)], \end{aligned}$$

where

$$\begin{aligned} {\mathcal {P}}_{2}(\mu,\nu) := \{\mathbb{P}\in{\mathcal {P}}_{\mathbb{R}^{2}}:X \sim_{\mathbb{P}}\mu~\mbox{and}~ Y\sim_{\mathbb{P}}\nu\}. \end{aligned}$$

Here, for the sake of simplicity, we have assumed a zero interest rate. This can easily be relaxed by considering the forwards of \(X\) and \(Y\). Notice that \(c(X,Y)\) is measurable by the upper semicontinuity condition on \(c\), and \(\mathbb{E}^{\mathbb{P}}[c(X,Y)]\) is a well-defined scalar in \(\mathbb{R}\cup\{-\infty\}\) by conditions (2.1) and (2.2).

In the original optimal transportation problem as formulated by Monge, the above maximization problem was restricted to the following subclass of measures.

Definition 2.1

A probability measure \(\mathbb{P}\in\mathcal{P}_{2}(\mu,\nu)\) is called a transference map if \(\mathbb{P}(dx,dy)=\mu(dx) \delta_{\{T(x)\}}(dy)\) for some measurable map \(T:\mathbb{R}\to \mathbb{R}\). We say that \(T\) pushes forward \(\mu\) to \(\nu\), \(T_{\#} \mu=\nu\), if \(\mu(T^{-1}(A))=\nu(A)\) for all measurable sets \(A\). In other words, \(\nu\) is the image measure of \(\mu\) by \(T\).

The dual problem associated to the MK optimal transportation problem is defined by

$$\begin{aligned} D^{0}_{2}(\mu,\nu) := \inf_{(\varphi,\psi)\in{\mathcal {D}}^{0}_{2}} \big( \mu(\varphi)+\nu(\psi) \big), \end{aligned}$$

where \(\mu(\varphi):=\int\varphi d\mu\), \(\nu(\psi):=\int\psi d \nu\) and, denoting \(\varphi\oplus\psi(x,y):=\varphi(x)+\psi(y)\),

$$\begin{aligned} {\mathcal {D}}^{0}_{2} :=& \{(\varphi,\psi): \varphi^{+}\in\mathbb{L} ^{1}(\mu),\psi^{+}\in\mathbb{L}^{1}(\nu) ~\mbox{and}~ \varphi \oplus\psi\ge c \}. \end{aligned}$$

The dual problem \(D^{0}_{2}(\mu,\nu)\) is to find the cheapest superhedging strategy of the derivative security \(c(X,Y)\) using the market instruments consisting of \(T\)-maturity European calls and puts with all possible strikes. The weak duality inequality

$$\begin{aligned} P^{0}_{2}(\mu,\nu) \le& D^{0}_{2}(\mu,\nu) \end{aligned}$$

is immediate. For an upper semicontinuous payoff function \(c\), equality holds and an optimal probability measure \(\mathbb{P}^{*}\) for the MK problem \(P^{0}_{2}\) exists; see e.g. Villani [38, Theorem 1.3].

In this paper, our main interest is in the following results of Rachev and Rüschendorf [36]; see for instance [38, Theorem 2.18]. These results correspond to the one-dimensional version of the Brenier theorem [6], which provides an interesting characterization of \(\mathbb{P}^{*}\) in terms of the so-called Fréchet–Hoeffding push-forward of \(\mu\) to \(\nu\), defined by the map

$$\begin{aligned} T_{*} := F_{\nu}^{-1}\circ F_{\mu}, \end{aligned}$$
(2.3)

where \(F_{\nu}^{-1}\) is the right-continuous inverse of \(F_{\nu}\), i.e.,

$$\begin{aligned} F_{\nu}^{-1}(x) := \inf\{y:F_{\nu}(y)>x\}. \end{aligned}$$
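As a simple numerical illustration (our own sketch, not part of the original development; the two Gaussian marginals below are arbitrary choices), \(T_{*}\) is just the composition of a quantile function with a distribution function, and the push-forward property \(T_{*\#}\mu=\nu\) can be checked by Monte Carlo:

```python
# Frechet-Hoeffding map T_* = F_nu^{-1} o F_mu for two illustrative marginals,
# with a Monte Carlo check that T_* pushes mu forward to nu.
import numpy as np
from scipy import stats

mu = stats.norm(loc=0.0, scale=1.0)          # law of X (illustrative choice)
nu = stats.norm(loc=0.0, scale=2.0)          # law of Y (illustrative choice)

T_star = lambda x: nu.ppf(mu.cdf(x))         # eq. (2.3); ppf is the quantile function

x = mu.rvs(size=100_000, random_state=0)     # sample X ~ mu
y = T_star(x)                                # then Y = T_*(X) should have law nu
for p in (0.1, 0.5, 0.9):
    print(p, np.quantile(y, p), nu.ppf(p))   # empirical vs. exact quantiles of nu
```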

In particular, the following result relates the MK optimal transportation problem \(P^{0}_{2}\) to the original Monge mass transportation problem for a remarkable class of coupling functions \(c\). We observe that the result holds in wider generality; in particular, the set of measures \(\mathbb{P}_{T}\) induced by a map \(T\) pushing forward \(\mu\) to \(\nu\) is dense in \({\mathcal {P}}_{2}(\mu,\nu)\) whenever \(\mu\) is atomless and the supports of \(\mu\) and \(\nu\) are contained in compact subsets. For the purpose of our financial interpretation, this result characterizes the structure of the worst case financial market that the derivative security hedger may face, and characterizes the optimal hedging strategies by the functions \(\varphi_{*}\) and \(\psi_{*}\), defined up to an irrelevant constant by

$$ \varphi_{*}(x):=c\big(x,T_{*}(x)\big)-\psi_{*}\circ T_{*}(x), \qquad \psi_{*}'(y):=c_{y}\big(T_{*}^{-1}(y),y\big), \quad x,y\in\mathbb{R}. $$
(2.4)

Theorem 2.2

(See e.g. [38, Theorem 2.44]) Let \(c\) be upper semicontinuous with linear growth. Assume that the partial derivative \(c_{xy}\) exists and satisfies the Spence–Mirrlees condition \(c_{xy} > 0\). Define \(T_{*}\) by (2.3) and \(\varphi_{*}\), \(\psi _{*}\) by (2.4). Assume further that \(\mu\) has no atoms, \(\varphi^{+}_{*}\in\mathbb{L}^{1}(\mu)\) and \(\psi^{+}_{*} \in\mathbb{L}^{1}(\nu)\). Then:

  1. (i)

    \(P^{0}_{2}(\mu,\nu)=D^{0}_{2}(\mu,\nu)=\int c(x,T _{*}(x)) \mu(dx)\).

  2. (ii)

    \((\varphi_{*},\psi_{*})\) is in \({\mathcal {D}}^{0}_{2}\), and is a solution of the dual problem \(D^{0}_{2}\).

  3. (iii)

    \(\mathbb{P}_{*}(dx,dy):=\mu(dx)\delta_{T_{*}(x)}(dy)\) is a solution of the MK optimal transportation problem \(P^{0}_{2}\), and is the unique optimal transference map.

Proof

We provide the proof for completeness, as our main result in this paper will be an adaptation of the subsequent argument. First, it is clear that \(\mathbb{P}_{*}\in\mathcal{P}_{2}(\mu,\nu)\). Then \(\mathbb{E}^{\mathbb{P}_{*}}[c(X,Y)]\le P^{0}_{2}(\mu,\nu)\). We now prove that

$$\begin{aligned} (\varphi_{*},\psi_{*})\in\mathcal{D}^{0}_{2} \quad \mbox{and} \quad & \mu(\varphi_{*})+\nu(\psi_{*})=\mathbb{E}^{\mathbb{P}_{*}}[c(X,Y)]. \end{aligned}$$
(2.5)

In view of the weak duality \(P^{0}_{2}(\mu,\nu)\le D^{0}_{2}(\mu, \nu)\), this gives \(P^{0}_{2}(\mu,\nu)=D^{0}_{2}(\mu, \nu)\) and that \(\mathbb{P}_{*}\) and \((\varphi_{*},\psi_{*})\) are solutions of \(P^{0}_{2}(\mu,\nu)\) and \(D^{0}_{2}(\mu,\nu)\), respectively.

Under our assumption that \(\varphi_{*}^{+}\in\mathbb{L}^{1}(\mu)\), \(\psi_{*}^{+}\in\mathbb{L}^{1}(\nu)\), notice that (2.5) is equivalent to

$$\begin{aligned} 0 = H^{0}\big(x,T_{*}(x)\big) = \min_{y\in\mathbb{R}} H^{0}(x,y), \quad \mbox{where } H^{0}:=\varphi_{*}\oplus\psi_{*}-c. \end{aligned}$$

By using the expression of \(\psi'_{*}\) in (2.4) and the expression of \(\varphi_{*}\), we obtain that

$$\begin{aligned} H^{0}_{y}(x,y) =& c_{y}\big(T_{*}^{-1}(y),y\big)-c_{y}(x,y) = \int _{x}^{T_{*}^{-1}(y)}c_{xy}(\xi,y)\,d\xi. \end{aligned}$$

It follows from the Spence–Mirrlees condition that \(T_{*}(x)\) is the unique solution of the first order condition \(H^{0}_{y}(x,y)=0\). Finally, we compute that

$$ H^{0}_{yy}\big(x,T_{*}(x)\big)T_{*}'(x)=c_{xy}\big(x,T_{*}(x)\big)>0 $$

by the Spence–Mirrlees condition, where the derivatives are in the sense of distributions. Hence \(T_{*}(x)\) is the unique global minimizer of \(H^{0}(x,\cdot)\) and \(\min_{y} H^{0}(x,y) =0\). □

We observe that we may also formulate sufficient conditions on the coupling function \(c\) so as to guarantee that the integrability conditions \(\varphi^{+}_{*}\in\mathbb{L}^{1}(\mu)\), \(\psi^{+}_{*}\in\mathbb{L}^{1}(\nu)\) hold true; see [38, Theorem 2.44].

Remark 2.3

(Symmetry: anti-monotone rearrangement map)

(i) Suppose that the coupling function \(c\) satisfies \(c_{xy}<0\). Then the upper bound \(P^{0}_{2}(\mu,\nu)\) is attained by the anti-monotone rearrangement map

$$\begin{aligned} \overline{\mathbb{P}}_{*}(dx,dy) := \mu(dx) \delta_{\{\overline{T}_{*}(x)\}}(dy), \quad \textit{where } \overline{T}_{*}(x) := F_{\nu}^{-1}\big( 1- F_{\mu}(x)\big). \end{aligned}$$

To see this, it suffices to rewrite the optimal transportation problem equivalently with modified inputs

$$\begin{aligned} \overline{c}(x,y):=c(-x,y), \qquad \overline{\mu}(x):=\mu\big((-x,\infty)\big),& \qquad \overline{\nu}:=\nu, \end{aligned}$$

so that \(\overline{c}\) satisfies the Spence–Mirrlees condition \(\overline{c}_{xy}>0\).

(ii) Under the Spence–Mirrlees condition \(c_{xy}>0\), the lower bound problem is explicitly solved by the anti-monotone rearrangement. Indeed, it follows from the first part (i) of the present remark that

$$\begin{aligned} \inf_{\mathbb{P}\in{\mathcal {P}}_{2}(\mu,\nu)} \!\mathbb{E}^{\mathbb{P}}[c(X,Y)] =& - \sup_{\mathbb{P}\in{\mathcal {P}}_{2}(\mu,\nu)} \!\mathbb{E}^{\mathbb{P}}[-c(X,Y)] \\ =& -\mathbb{E}^{\overline{\mathbb{P}}_{*}}[-c(X,Y)] = \!\int \!\! c\big(x,\overline{T}_{*}(x)\big)\mu(dx). \end{aligned}$$

Remark 2.4

The Spence–Mirrlees condition is a natural requirement in the optimal transportation setting in the following sense. The optimization problem is not affected (up to the additive constant \(\mu(a)+\nu(b)\)) by a modification of the coupling function from \(c\) to \(\bar{c}:=c+a\oplus b\) for any \(a\in\mathbb{L}^{1}(\mu)\) and \(b\in\mathbb{L}^{1}(\nu)\). Since \(c_{xy}=\bar{c}_{xy}\), it follows that the Spence–Mirrlees condition is stable under the above transformation of the coupling function.

Example 2.5

(Basket option) Let \(c(x,y)=(x+y-k)^{+}\) for some \(k\in\mathbb{R}\) (see [15, 32] for multi-asset basket options). The result of Theorem 2.2 applies to this example as well, as it is shown in [38, Chap. 2] that the regularity condition \(c \in C^{1,1}\) is not needed. The upper bound is attained by the Fréchet–Hoeffding transference map \(T_{*}:=F_{\nu}^{-1}\circ F _{\mu}\), and the optimal hedging strategy is

$$\begin{aligned} \psi_{*}(y) = (y-\bar{y})^{+}, \qquad \varphi_{*}(x)=\big(T_{*}(x)+x-k\big)^{+} - \big(T_{*}(x)-\bar{y} \big)^{+} , \end{aligned}$$

where \(\bar{y}\) is defined by \(T_{*}(k-\bar{y})=\bar{y}\).
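As a numerical illustration of this example (our own sketch; the Gaussian marginals and the strike below are arbitrary choices for which \(T_{*}(x)=2x\), so that \(\bar{y}=2k/3\) can be checked by hand), one can solve the fixed-point equation for \(\bar{y}\) and verify the superhedging property \(\varphi_{*}\oplus\psi_{*}\ge c\) on a grid:

```python
# Worked check of Example 2.5 with mu = N(0,1), nu Gaussian with standard deviation 2,
# and k = 1 (illustrative choices): then T_*(x) = 2x and ybar = 2k/3.
import numpy as np
from scipy import stats
from scipy.optimize import brentq

mu, nu, k = stats.norm(0.0, 1.0), stats.norm(0.0, 2.0), 1.0
T = lambda x: nu.ppf(mu.cdf(x))                        # Frechet-Hoeffding map
ybar = brentq(lambda y: T(k - y) - y, -3.0, 3.0)       # fixed point T(k - ybar) = ybar

psi = lambda y: np.maximum(y - ybar, 0.0)
phi = lambda x: np.maximum(T(x) + x - k, 0.0) - np.maximum(T(x) - ybar, 0.0)

xg, yg = np.meshgrid(np.linspace(-4, 4, 201), np.linspace(-8, 8, 201))
gap = phi(xg) + psi(yg) - np.maximum(xg + yg - k, 0.0)
print(ybar, gap.min())   # ybar ~ 0.667; gap nonnegative (up to rounding), ~0 along y = T_*(x)
```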

2.2 The multi-marginals optimal transportation problem

The previous results have been extended to the \(n\)-marginals optimal transportation problem by Gangbo and Świȩch [22], Carlier [8] and Pass [35]. Let \(X=(X_{1}, \ldots,X_{n})\) be a random variable with values in \(\mathbb{R}^{n}\), representing the prices at some fixed time horizon of \(n\) financial assets, and consider some upper semicontinuous payoff function \(c:\mathbb{R}^{n}\to\mathbb{R}\) with linear growth. Let \(\mu_{1}, \ldots,\mu_{n}\in\mathcal{P}_{\mathbb{R}}\) be the corresponding marginal distributions, and \(\mu:=(\mu_{1},\ldots,\mu_{n})\). The upper bound for the market price of a derivative security with payoff function \(c\) is defined by the optimal transportation problem

$$ P^{0}_{n}(\mu) := \sup_{\mathbb{P}\in{\mathcal {P}}_{n}(\mu)} \mathbb{E} ^{\mathbb{P}}[c(X)], $$
(2.6)

where

$$ {\mathcal {P}}_{n}(\mu) :=\{\mathbb{P}\in{\mathcal {P}}_{\mathbb{R}^{n}}: X_{i}\sim_{\mathbb {P}}\mu _{i}, 1\le i\le n \}. $$

Then, under convenient conditions on the coupling function \(c\) (see Pass [35] for the most general ones), there exists a solution \(\mathbb{P}_{*}\) to the MK optimal transportation problem \(P^{0}_{n}( \mu)\) which is the unique optimal transference map defined by \(T_{*}^{i}\), \(i=2,\ldots,n\), namely

$$ \mathbb{P}^{*}(dx_{1}, \ldots,dx_{n}) = \mu_{1}(dx_{1})\prod_{i=2} ^{n} \delta_{T^{i}_{*}(x_{1})}(dx_{i}), $$

where \(T^{i}_{*}=F_{\mu_{i}}^{-1}\circ F_{\mu_{1}}\), \(i=2,\ldots, n\). The optimal upper bound is then given by

$$ P^{0}_{n}(\mu) = \int c\big(\xi,T^{2}_{*}(\xi),\ldots, T^{n}_{*}( \xi)\big) \mu_{1}(d\xi). $$
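The \(n\)-marginal monotone coupling is equally easy to simulate. The following small sketch (our own illustration; the log-normal marginals and the payoff are arbitrary choices) builds \(X_{i}=T^{i}_{*}(X_{1})\) by Monte Carlo; whether the resulting expectation attains the upper bound \(P^{0}_{n}(\mu)\) for a given \(c\) depends on the conditions of [22, 8, 35]:

```python
# Monotone n-marginal coupling X_i = F_{mu_i}^{-1}(F_{mu_1}(X_1)) for illustrative marginals.
import numpy as np
from scipy import stats

mus = [stats.lognorm(s, scale=1.0) for s in (0.2, 0.3, 0.4)]   # mu_1, mu_2, mu_3 (illustrative)
x1 = mus[0].rvs(size=100_000, random_state=0)                  # X_1 ~ mu_1
X = [x1] + [m.ppf(mus[0].cdf(x1)) for m in mus[1:]]            # X_i = T^i_*(X_1)

c = lambda x1, x2, x3: x1 * x2 * x3          # an illustrative coupling payoff
print(np.mean(c(*X)))                        # Monte Carlo value of E^{P_*}[c(X)]
for m, xi in zip(mus, X):
    print(m.mean(), xi.mean())               # each coordinate has the prescribed marginal mean
```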

3 Martingale transport problem: formulation and first intuitions

The main objective of this paper is to obtain a version of the Brenier theorem for the martingale transportation problem introduced by Beiglböck et al. [3] and Galichon et al. [21]. A first result in this direction was obtained by Hobson and Neuberger [27] in the context of the coupling function \(c(x,y)=|x-y|\). The general case was considered by Beiglböck and Juillet [4] who introduced the martingale version of the cyclic monotonicity condition in standard optimal transport, namely the martingale monotonicity condition, and showed existence and uniqueness of such a monotone martingale measure, and its optimality for a class of coupling functions. Our result complements the last reference by an explicit extension of the Fréchet–Hoeffding optimal coupling. We outline in Remarks 5.4 and 5.6 the main differences with [4, 27].

3.1 Probability measures in convex order

In the context of the financial motivation of Sect. 2.1, we interpret the pair of random variables \(X\), \(Y\) as the prices of the same financial asset at dates \(t_{1}\) and \(t_{2}\), respectively, with \(t_{1}< t_{2}\). Then the no-arbitrage condition states that the price process of the tradable asset is a martingale under the pricing and hedging probability measure. We therefore restrict the set of probability measures to

$$\begin{aligned} \mathcal{M}_{2}(\mu,\nu) :=& \{\mathbb{P}\in\mathcal{P}_{2}( \mu,\nu): \mathbb{E}^{\mathbb{P}}[Y|X]=X \}, \end{aligned}$$

where \(\mu\), \(\nu\) have finite first moment as in (2.1). This set of probability measures is clearly convex, and the martingale condition implies that \(\ell^{\nu}\le\ell^{\mu}\le r^{\mu}\le r ^{\nu}\). Throughout this paper, we denote

$$\begin{aligned} \delta F := F_{\nu}-F_{\mu}. \end{aligned}$$

By a classical result of Strassen [37], \(\mathcal{M}_{2}( \mu,\nu)\) is non-empty if and only if \(\mu\preceq\nu\) in the sense of convex ordering, i.e.,

  1. (i)

    \(\mu\), \(\nu\) have the same mean, \(\int\xi\,d\delta F(\xi)=0\), and

  2. (ii)

    \(\delta c(k):=\int(\xi-k)^{+}(\nu-\mu)(d\xi) \ge0\) for all \(k \in\mathbb{R}\).

By direct integration by parts, we see that

$$\begin{aligned} \delta c(k) = -\int_{[k,\infty)}\delta F(\xi)\,d\xi \quad \mbox{for all}& k\in\mathbb{R}. \end{aligned}$$

Consequently, we may express the last condition (ii) as

$$ \int_{[k,\infty)}\delta F(\xi)\,d\xi\le0 \quad \mbox{or, equivalently},\quad \int_{[-\infty,k)}\delta F(\xi)\,d\xi\ge0, \quad \mbox{for all } k\in\mathbb{R}, $$
(3.1)

where the last equivalence follows from the first property (i). A crucial ingredient for the present paper is the decomposition of the pair \((\mu,\nu)\) into irreducible components, as introduced by Beiglböck and Juillet [4].
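Before turning to this decomposition, we note that conditions (i) and (ii) are straightforward to check numerically. The following small sketch (our own illustration, with two log-normal measures of equal unit mean) evaluates \(\delta c\) on a grid of strikes:

```python
# Numerical check of the convex-order criterion: equal means, and delta c(k) >= 0 on a grid,
# for two illustrative log-normal measures (not taken from the paper).
import numpy as np
from scipy import stats
from scipy.integrate import quad

mu = stats.lognorm(0.2, scale=np.exp(-0.02))     # mean exp(-0.02 + 0.04/2) = 1
nu = stats.lognorm(0.4, scale=np.exp(-0.08))     # mean exp(-0.08 + 0.16/2) = 1

delta_c = lambda k: quad(lambda x: (x - k) * (nu.pdf(x) - mu.pdf(x)), k, np.inf)[0]

print(mu.mean(), nu.mean())                                   # condition (i): equal means
print(min(delta_c(k) for k in np.linspace(0.2, 2.5, 25)))     # condition (ii): >= 0 on the grid
```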

Definition 3.1

Let \(\mu\preceq\nu\). We call the pair \((\mu,\nu)\) irreducible if the set \(I := \{\delta c>0\}\) is connected and \(\mu(I) = \mu(\mathbb{R})\). We denote by \(J\) the union of \(I\) and any endpoints of \(I\) that are atoms of \(\nu\), and we refer to the pair \((I,J)\) as the domain of \((\mu,\nu)\).

The following decomposition result is restated from Beiglböck and Juillet [4, Theorem 8.4].

Proposition 3.2

Let \(\mu\preceq\nu\) and let \((I_{k})_{1 \leq k \leq N}\) be the (open) components of \(\{\delta c > 0\}\), where \(N \in\{0,1,\dots,\infty\}\). Set \(I_{0} := \mathbb{R}\setminus\bigcup_{k\ge1}I_{k}\) and \(\mu_{k} = \mu|_{I_{k}}\) for \(k \ge0\), so that \(\mu= \sum_{k\ge0} \mu_{k}\). Then there exists a unique decomposition \(\nu= \sum_{k \ge0} \nu_{k}\) such that

\(\mu_{0} = \nu_{0}\) and \(\mu_{k} \preceq\nu_{k}\) for all \(k \ge1\), and

\(I_{k} = \{\delta c_{k}>0\}\) for all \(k \ge1\), where \(\delta c_{k}(x):=\int(\xi-x)^{+}(\nu_{k}-\mu_{k})(d\xi)\).

Moreover, any \(\mathbb{P}\in\mathcal{M}(\mu, \nu)\) admits a unique decomposition \(\mathbb{P}= \sum_{k\ge0}\mathbb{P}_{k}\) such that \(\mathbb{P}_{k}\in\mathcal{M}(\mu_{k},\nu_{k})\) for all \(k\ge0\).

Observe that the measure \(\mathbb{P}_{0}\) in the last statement is the trivial constant martingale transport from \(\mu_{0}\) to itself. In particular, \(\mathbb{P}_{0}\) does not depend on the choice of \(\mathbb{P}\in\mathcal{M}(\mu,\nu)\).

3.2 Problem formulation

Let \(c:\mathbb{R}^{2}\to\mathbb{R}\) be an upper semicontinuous function satisfying the growth condition (2.2), representing the payoff of a derivative security. In the present context, the model-independent upper bound for the price of the claim can be formulated by the martingale optimal transportation problem

$$ P_{2}(\mu,\nu) := \sup_{\mathbb{P}\in\mathcal{M}_{2}(\mu,\nu)} \mathbb{E}^{\mathbb{P}}[c(X,Y)]. $$

Remark 3.3

When \(\mu\) and \(\nu\) have finite second moments, notice that, since \(\mathbb{E}^{\mathbb{P}}[XY]=\mathbb{E}^{\mathbb{P}}\big[X\,\mathbb{E}^{\mathbb{P}}[Y|X]\big]=\mathbb{E}^{\mathbb{P}}[X^{2}]\),

$$ \mathbb{E}^{\mathbb{P}}[(X-Y)^{2}]=-\mathbb{E}^{\mathbb{P}}[X^{2}]+ \mathbb{E}^{\mathbb{P}}[Y^{2}]=\int\xi^{2}\,d\delta F(\xi) \quad\textit{for all }\mathbb{P}\in\mathcal{M}_{2}(\mu,\nu). $$

Thus the quadratic case, which is the typical example of coupling in the optimal transportation theory, is irrelevant in the present martingale version.

The Kantorovich dual in the present martingale transport problem is formulated as follows. Because of the possibility of dynamically trading the financial asset between times \(t_{1}\) and \(t_{2}\), the set of dual variables is defined by

$$ \mathcal{D}_{2}:=\{(\varphi,\psi,h):\varphi^{+}\in\mathbb{L} ^{1}(\mu),\psi^{+}\in\mathbb{L}^{1}(\nu), h\in\mathbb{L}^{0}, ~\mbox{and}~\varphi\oplus\psi+h^{\otimes}\ge c\}, $$
(3.2)

where \(\varphi\oplus\psi(x,y):=\varphi(x)+\psi(y)\) and \(h^{\otimes}(x,y):=h(x)(y-x)\). The dual problem is

$$\begin{aligned} D_{2}(\mu,\nu) :=& \inf_{(\varphi,\psi,h)\in\mathcal{D}_{2}} \big(\mu(\varphi)+\nu(\psi)\big), \end{aligned}$$

and can be interpreted as finding the cheapest superhedging strategy of the derivative \(c(X,Y)\) by dynamic trading in the underlying asset and static trading in the European options with maturities \(t_{1}\) and \(t_{2}\). Since \(\mu\), \(\nu\) have finite first moments and \(c\) satisfies the growth condition (2.2), the weak duality inequality

$$\begin{aligned} P_{2}(\mu,\nu) \le& D_{2}(\mu,\nu) \end{aligned}$$
(3.3)

follows immediately from the definition of both problems. The strong duality result (i.e., equality holds), together with the existence of a maximizer \(\mathbb{P}_{*}\in\mathcal{M}_{2}(\mu,\nu)\) for the martingale transportation problem \(P_{2}(\mu,\nu)\), is proved in [3]. However, existence does not hold in general for the dual problem \(D_{2}(\mu,\nu)\). An example of non-existence is provided in [3]. In the present paper, we obtain existence under a martingale version of the Spence–Mirrlees condition.

3.3 Monotone martingale transport plans

Our objective in this paper is to provide explicitly the left-monotone martingale transport plan introduced by Beiglböck and Juillet [4].

Definition 3.4

We say that \(\mathbb{P}\in\mathcal{M}_{2}(\mu,\nu)\) is left-monotone (resp. right-monotone) if there exists a Borel set \(\varGamma\subset\mathbb{R}\times\mathbb{R}\) such that \(\mathbb{P}[(X,Y)\in\varGamma]=1\) and for all \((x,y_{1}), (x,y_{2}), (x',y') \in\varGamma\) with \(x< x'\) (resp. \(x>x'\)), it must hold that \(y'\notin(y_{1},y_{2})\).

Similarly to [4], we consider probability measures \(\mu\), \(\nu\) satisfying the following restriction.

Assumption 3.5

The probability measures \(\mu\) and \(\nu\) have finite first moments, \(\mu\preceq\nu\) in convex order, and \(\mu\) has no atoms.

Under Assumption 3.5, Theorem 1.5 and Corollary 1.6 of [4] state that there exists a unique left-monotone martingale transport plan \(\mathbb{P}_{*}\in\mathcal{M}_{2}(\mu,\nu)\), and that \(\mathbb{P}_{*}\) is concentrated on the graphs of two maps \(T_{d},T_{u}:\mathbb{R}\to\mathbb{R}\) with \(T_{d}(x)\le x\le T_{u}(x)\) for all \(x\in\mathbb{R}\), i.e.,

$$\begin{aligned} &\mathbb{P}_{*}(dx,dy) = \mu(dx)\Big(q(x)\delta_{T_{u}(x)}+\big(1-q(x) \big)\delta_{T_{d}(x)}\Big)(dy), \\ &\quad \mbox{with } q(x)=\frac{x-T_{d}(x)}{(T_{u}-T_{d})(x)}\mathbf{1} _{\{(T_{u}-T_{d})(x)>0\}}. \end{aligned}$$
(3.4)
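Note in passing that the weight \(q(x)\) in (3.4) is precisely the one giving the two-point transition kernel the barycenter \(x\): on \(\{(T_{u}-T_{d})(x)>0\}\),

$$ q(x)T_{u}(x)+\big(1-q(x)\big)T_{d}(x) = \frac{\big(x-T_{d}(x)\big)T_{u}(x)+\big(T_{u}(x)-x\big)T_{d}(x)}{(T_{u}-T_{d})(x)} = x, $$

so that any kernel of the form (3.4) automatically satisfies the martingale condition \(\mathbb{E}^{\mathbb{P}_{*}}[Y|X]=X\).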

Remark 3.6

By the convex ordering condition (3.1), it follows that \(\delta F\) increases from and to zero at the left and right boundaries of its support, respectively. Moreover, \(\delta F\) is upper semicontinuous by the continuity of \(F_{\mu}\) in Assumption 3.5. Then the local suprema of \(\delta F\) are attained by local maximizers in \((\ell_{\mu},r_{\mu})\).

Let \(\mathbf{M}(\delta F)\) be the collection of all local maximizers of the function \(\delta F\). Moreover, for each local maximizer \(m\in\mathbf{M}(\delta F)\), we define

$$ \begin{aligned} m_{-} &:= \sup\{x< m:~\delta F(x)< \delta F(m)\}, \\ m_{+} &:= \inf\{x>m:~\delta F(x)< \delta F(m)\}. \end{aligned} $$
(3.5)

The set

$$ \mathbf{M}_{0}(\delta F) := \{ m\in\mathbf{M}(\delta F):~m=m_{+}~ \mbox{and}~\delta F=\delta F(m)~\mbox{on}~[m_{-},m] \} $$

will play a crucial role in our characterization. Our construction will be performed under the following additional assumption on the pair of measures \((\mu,\nu)\).

Assumption 3.7

\(\nu\) has no atoms, and \(\mathbf{M}_{0}(\delta F)\) is a finite set of points.

Under this assumption, the unique decomposition \(\mathbb{P}=\sum_{k \ge0}\mathbb{P}_{k}\) with measures \(\mathbb{P}_{k}\in \mathcal{M}(\mu_{k},\nu_{k})\) from Proposition 3.2 corresponds to the irreducible domains \((I_{k},I_{k})\), i.e., \(J_{k}=I_{k}\).

Finally, we observe that the construction of the left-monotone martingale transport plan will be elaborated separately on each irreducible component; see Theorem 4.5(ii) below. Therefore, without loss of generality, it suffices to provide the construction for an irreducible pair \((\mu,\nu)\), i.e.,

$$\begin{aligned} \delta c(x):=-\int_{x}^{\infty}\delta F(\xi)\,d\xi>0 \quad \mbox{for all}& x\in I. \end{aligned}$$
(3.6)

3.4 First intuitions

In this subsection, we provide a construction of the left-monotone transport plan, for an irreducible pair \((\mu,\nu)\) of measures in convex order, under the simplifying condition

$$\begin{aligned} \mathbf{M}(\delta F)=\mathbf{M}_{0}(\delta F)=\{m_{1}\} \quad \mbox{for some}& \ell_{\mu}< m_{1}< r_{\mu}, \end{aligned}$$
(3.7)

so that \(\delta F\) is strictly increasing on \((-\infty,m_{1}]\).

The definition of the left-monotone transport map suggests that \(T_{u}\) is nondecreasing and \(T_{d}\) nonincreasing. This is a first guess which will be verified under our simplifying condition (3.7). However, we emphasize that it will turn out to be wrong in the more general case studied in Sect. 4, but will serve to guide our intuition.

As a first consequence of the non-increase of \(T_{d}\) and the non-decrease of \(T_{u}\), we see that they have at most a countable number of discontinuities. Therefore, since \(\mu\) has no atoms, we may choose the maps \(T_{d}\) and \(T_{u}\) to be right-continuous. In order to allow a decreasing map \(T_{d}\), we guess that there exists some bifurcation point \(m\) such that

$$\begin{aligned} &T_{d}(x)=T_{u}(x) \quad \mbox{for } x\le m, \quad \mbox{and } \\ &T_{d}:(m,\infty)\to(-\infty,m) \quad \mbox{nonincreasing,} \\ &T_{u}:(m,\infty)\to(m,\infty)\quad \mbox{nondecreasing.} \end{aligned}$$

We denote by \(T_{d}^{-1}\), \(T_{u}^{-1}\) the right-continuous generalized inverses of \(T_{d}\) and \(T_{u}\), respectively. Since \(\nu\) has no atoms, we observe that for \(\mu\)-a.e. \(x\ge m\),

$$ \{x'\ge m:T_{u}(x')=T_{u}(x)\} = \{x'\ge m:T_{d}(x')=T_{d}(x)\} = \{x \}. $$
(3.8)

By the representation (3.4) of the left-monotone transport map, we have \(X\sim_{\mathbb{P}}\mu\) and the martingale condition \(\mathbb{E}^{\mathbb{P}}[Y|X]=X\) holds true. It remains to impose the mass conservation condition \(Y\sim_{\mathbb{P}}\nu\), i.e., \(\mathbb{P}[Y\in dy]=\nu(dy)\).

(i) Mass conservation condition. We consider separately the domains on both sides of the bifurcation point \(m\).

Upper support. Let \(y>m\) be a point of the support of \(\nu\). Then \(y:=T_{u}(x)\) for some \(x\ge m\), and

$$\begin{aligned} \mathbb{P}[Y\in dy] =& \mathbb{E}\big[q(X)\mathbf{1} _{\{T_{u}(X) \in dy\}}\big] = q(x)\,dF_{\mu}(x) \end{aligned}$$

by (3.8). Then the mass conservation condition in this case is

$$\begin{aligned} dF_{\nu}(T_{u}) =& q\,dF_{\mu}. \end{aligned}$$
(3.9)

Lower support. Let \(y< m\) be a point of the support of \(\nu\). Then \(y=T_{d}(x)\) for some \(x>m\), and

$$\begin{aligned} \mathbb{P}[Y\in dy] =& dF_{\mu}(y) + \mathbb{E}\big[\big(1-q(X) \big)\mathbf{1} _{\{T_{d}(X)\in dy\}}\big] = dF_{\mu}(y) -\big(1-q(x) \big)\,dF_{\mu}(x) \end{aligned}$$

by (3.8), where the last minus sign is due to the decrease of \(T_{d}\) on \((m,\infty)\). The mass conservation condition is then

$$\begin{aligned} d \delta F(T_{d}) =& -(1-q)\,dF_{\mu}. \end{aligned}$$
(3.10)

We are then reduced to the system of ODEs (3.9), (3.10) on \([m,\infty)\), with the boundary condition \(T_{u}(m)=T_{d}(m)=m\). Recall that we have to solve for the unknowns \(T_{u}\), \(T_{d}\), and also for the bifurcation level \(m\).

(ii) Determining the bifurcation point. Subtracting (3.10) from (3.9), we obtain \(dF_{\nu}(T_{u})=dF_{\mu}+d\delta F(T_{d})\). Integrating between \(m\) and \(x\) and using the boundary condition \(T_{u}(m)=T_{d}(m)=m\), we see that

$$\begin{aligned} F_{\nu}(T_{u})=F_{\mu}+\delta F(T_{d}) \quad \mbox{on}& [m,\infty). \end{aligned}$$
(3.11)

We expect that \(T_{u}\) and \(T_{d}\) are in a one-to-one relation. Since \(F_{\nu}\) is nondecreasing, the last equation allows indeed expressing \(T_{u}\) in terms of \(T_{d}\) by using the right-continuous inverse \(F_{\nu}^{-1}\). However, expressing \(T_{d}\) in terms of \(T_{u}\) requires that \(m\le m_{1}\) so that \(T_{d}\) takes values in the domain where \(\delta F\) is strictly increasing and thus has a continuous inverse \(\delta F^{-1}\). Then, using again (3.11), it follows from the non-decrease of \(F_{\nu}\) and the fact that \(x\le T_{u}(x)\) that

$$ \delta F(x) \le F_{\nu}\big(T_{u}(x)\big)-F_{\mu}(x) = \delta F \big(T_{d}(x)\big) \le\delta F(m) \quad \mbox{for all } x\ge m. $$

Consequently, the only possible choice for \(m\le m_{1}\) is

$$\begin{aligned} m =& m_{1}. \end{aligned}$$

(iii) Solving for \(T_{d}\) and \(T_{u}\). We continue our derivation under the simplifying condition (3.7). First, by (3.11), we express \(T_{u}\) in terms of \(T_{d}\) as

$$ T_{u}(x) = g\big(x,T_{d}(x)\big),\quad x\ge m, \quad \mbox{with } g(x,y):=F_{\nu}^{-1}\big(F_{\mu}(x)+\delta F(y)\big), $$
(3.12)

where we extend the definition of \(F_{\nu}^{-1}\) by setting \(F_{\nu}^{-1}=\infty\) on \((1,\infty)\) and \(F_{\nu}^{-1}= -\infty\) on \((-\infty,0)\). Next, by the definition of \(q\) together with (3.9), (3.10) and (3.12), we have

$$\begin{aligned} x\,dF_{\mu}= \big(qT_{u}+(1-q)T_{d}\big)\,dF_{\mu} =& T_{u}\,dF_{\nu}(T _{u})-T_{d}\,d\delta F(T_{d}) \\ =& g(x,T_{d})\big(dF_{\mu}+d\delta F(T_{d})\big) -T_{d}\,d\delta F(T _{d}). \end{aligned}$$

We are then reduced to the ordinary differential equation

$$\begin{aligned} \big(g(x,T_{d})-T_{d}\big)\,d\delta F(T_{d}) +\big(g(x,T_{d})-x\big)\,dF _{\mu}=0 \quad \mbox{on}& [m,\infty). \end{aligned}$$
(3.13)

Observe that

$$\begin{aligned} d_{y}g(x,y)\,d\delta F(y) =& (dF_{\nu}^{-1})\big(F_{\mu}(x)+\delta F(y) \big)\,dF_{\mu}(x)\,d\delta F(y) \\ =& d_{x}g(x,y)\,dF_{\mu}(x). \end{aligned}$$
(3.14)

Then, using

$$\begin{aligned} d_{x}\int_{m}^{T_{d}} \big(g(x,\xi)-\xi\big)\,d\delta F(\xi) =& \big(g(x,T_{d})-T_{d}\big)\,d\delta F(T_{d}) \\ &{}+\bigg(\int_{m}^{T_{d}} d_{y} g(x,y)\bigg)\,dF_{\mu}(x) \\ =& \big(g(x,T_{d})-T_{d}\big)\,d\delta F(T_{d}) \\ &{}+\big(g(x,T_{d})-g(x,m)\big)\,dF_{\mu}(x), \end{aligned}$$

we rewrite (3.13) as

$$\begin{aligned} d_{x}\int_{m}^{T_{d}} \big(g(x,\zeta)-\zeta\big)\,d\delta F(\zeta) + \big(g(x,m)-x\big)\,dF_{\mu}(x) =& 0, \end{aligned}$$

which provides by direct integration, and using the boundary condition \(T_{d}(m)=m\), that

$$\begin{aligned} G^{m}(T_{d},x)=0 \quad \mbox{for } x\ge m, \end{aligned}$$
(3.15)

where

$$\begin{aligned} G^{m}(t,x) &:= -\int_{t}^{m} \big(g(x,\zeta)-\zeta\big)\,d\delta F( \zeta) \\ & \phantom{:=:}+\int_{m}^{x}\big(g(\xi,m)-\xi\big)\,dF_{\mu}(\xi), \quad t\le m\le x. \end{aligned}$$
(3.16)

We finally verify that (3.15) uniquely defines \(T_{d}(x)\in(- \infty,m]\).

  • First, note that the function \(t\mapsto G^{m}(t,x)\) is continuous and strictly increasing for \(x\ge m\ge t\). Indeed, the continuity is inherited from the continuity of \(\delta F\). Next, for \(\zeta\le m< x\), it follows that we have \(F_{\mu}(x)>F_{\mu}(\zeta )\) or, equivalently, \(F_{\mu}(x)+\delta F(\zeta)>F_{\nu}( \zeta)\). Then \(g(x,\zeta)=F_{\nu}^{-1}(F_{\mu}(x)+\delta F( \zeta)) > \zeta\), and the strict increase of \(G^{m}\) in \(t\) is inherited from the strict increase of \(\delta F\) on \((-\infty,m_{1})\).

  • At \(t=m\), we compute that \(G^{m}(m,x)=\int_{m}^{x} (g(\xi,m)-\xi)\,dF _{\mu}(\xi)>0\) for \(x>m\). The last strict inequality follows from the fact that \(g(x,m)>x\) for all \(x>m\), under our simplifying condition (3.7), and the strict increase of \(F_{\mu}\) in a right neighborhood of \(m\).

  • Finally, as \(t\searrow-\infty\), we now show that \(G^{m}(-\infty,x)<0\) for all \(x>m\). By (3.14), we observe that

    $$\begin{aligned} d_{x} G^{m}(-\infty,x) =& -\bigg(\int_{-\infty}^{m}d_{\zeta}g(x, \zeta)\bigg)\,dF_{\mu}+\big(g(x,m)-x\big)\,dF_{\mu} \\ =& \big(g(x,-\infty)-x\big)\,dF_{\mu}= \big(F_{\nu}^{-1}\circ F _{\mu}(x)-x\big)\,dF_{\mu}. \end{aligned}$$

    By direct integration, this provides

    $$\begin{aligned} G^{m}(-\infty,x) =& G^{m}(-\infty,m)+\int_{m}^{x} \big(F_{\nu} ^{-1}\circ F_{\mu}(\xi)-\xi\big)\,dF_{\mu}(\xi) = \gamma(x), \end{aligned}$$

    where

    $$\begin{aligned} \gamma(x) := \int_{-\infty}^{F_{\nu}^{-1}\circ F_{\mu}(x)} \xi\,dF _{\nu}(\xi) -\int_{-\infty}^{x} \xi\,dF_{\mu}(\xi) \quad \mbox{for}& x\in\mathbb{R}. \end{aligned}$$
    (3.17)

    Notice that \(\gamma(-\infty)=0\), and since \(\mu\) and \(\nu\) have the same mean, \(\gamma(\infty)=0\). We next analyze the maximum of \(\gamma\). Since \(d\gamma(x)=(F_{\nu}^{-1}\circ F_{\mu}(x)-x)\,dF _{\mu}(x)\), we may restrict to a point \(x^{*}\in\mathrm{supp}(\mu)\) of local maximum of \(\gamma\), so that we obtain \(F_{\nu}^{-1}(F _{\mu}(x^{*})-) \le x^{*}\le F_{\nu}^{-1}(F_{\mu}(x^{*}))\), and therefore

    $$ \gamma(x^{*})=\int_{-\infty}^{x^{*}}\xi\,d\delta F(\xi)=-\int(x ^{*}-\xi)^{+}\,d\delta F(\xi)< 0 $$

    by the irreducibility condition (3.6) of the pair \((\mu,\nu)\).
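To make the construction concrete, here is a small numerical sketch (our own illustration, not the authors' code) for the pair of log-normal marginals used in Fig. 1 of Sect. 4.2, with the assumed normalization that both marginals have unit mean; this reproduces the value \(m_{1}\approx0.731\) reported there. The sketch locates \(m_{1}\), recovers \(T_{d}(x)\) by root-finding on (3.15) and \(T_{u}(x)\) from (3.12), on a range of \(x\) where the argument of \(F_{\nu}^{-1}\) stays inside \([0,1)\), so that the extension conventions stated after (3.12) are not needed:

```python
# Numerical sketch of Sect. 3.4 (illustration only): mu, nu log-normal with common mean 1
# and log-variances 0.04 and 0.32, as in Fig. 1 below.
import numpy as np
from scipy.stats import lognorm
from scipy.integrate import quad
from scipy.optimize import brentq, minimize_scalar

s_mu, s_nu = np.sqrt(0.04), np.sqrt(0.32)
F_mu = lambda x: lognorm.cdf(x, s_mu, scale=np.exp(-s_mu**2 / 2))
F_nu = lambda x: lognorm.cdf(x, s_nu, scale=np.exp(-s_nu**2 / 2))
f_mu = lambda x: lognorm.pdf(x, s_mu, scale=np.exp(-s_mu**2 / 2))
f_nu = lambda x: lognorm.pdf(x, s_nu, scale=np.exp(-s_nu**2 / 2))
F_nu_inv = lambda u: lognorm.ppf(u, s_nu, scale=np.exp(-s_nu**2 / 2))
dF = lambda x: F_nu(x) - F_mu(x)                    # delta F = F_nu - F_mu
g = lambda x, y: F_nu_inv(F_mu(x) + dF(y))          # eq. (3.12)

# unique maximizer m1 of delta F, cf. (3.7); ~0.731 with this parameterization
m1 = minimize_scalar(lambda x: -dF(x), bounds=(0.01, 5.0), method='bounded').x

def G(t, x, m=m1):
    # G^m(t, x) of (3.16), the Stieltjes integrals written with the densities
    i1 = quad(lambda z: (g(x, z) - z) * (f_nu(z) - f_mu(z)), t, m)[0]
    i2 = quad(lambda z: (g(z, m) - z) * f_mu(z), m, x)[0]
    return -i1 + i2

# keep F_mu(x) + dF(m1) < 1, so that F_nu_inv is only evaluated inside (0, 1)
x_hi = lognorm.ppf(0.999 * (1.0 - dF(m1)), s_mu, scale=np.exp(-s_mu**2 / 2))

def T_d(x):
    # the unique root of G(., x) = 0 below m1, cf. (3.15)
    return x if x <= m1 else brentq(lambda t: G(t, x), 1e-6, m1)

def T_u(x):
    return x if x <= m1 else g(x, T_d(x))           # eq. (3.12) again

print(round(m1, 3))
for x in np.linspace(m1 + 0.05, x_hi, 5):
    print(round(x, 3), round(T_d(x), 3), round(T_u(x), 3))
```

On this range, the printed values illustrate that \(T_{d}\) decreases and \(T_{u}\) increases away from the bifurcation point \(m_{1}\), in agreement with Fig. 1.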

4 Explicit construction of the left-monotone martingale transport plan

4.1 Preliminaries

We recall that our construction will be accomplished separately on each irreducible component, and consequently we may assume without loss of generality that the pair \((\mu,\nu)\) is irreducible so that (3.6) holds true.

Recall also the function \(g\) introduced in (3.12). In order to relax the simplifying condition (3.7), we need to introduce, for a measurable subset \(A\in{\mathcal {B}}_{\mathbb{R}}\) with \(\delta F\) increasing on \(A\), the analogue of (3.16), namely

$$ G^{m}_{A}(t,x) := -\int_{t}^{m} \big(g(x,\zeta)-\zeta\big) \mathbf{1} _{A}(\zeta)\,d\delta F(\zeta) +\int_{m}^{x} \big(g(\xi,m)- \xi\big)\,dF_{\mu}(\xi) $$
(4.1)

for \(t\le m\le x\). Notice that \(G^{m}_{A}\) is continuous in \(t\) by the continuity of \(\delta F\). Recall from Assumption 3.7 that \(\mathbf{M}_{0}(\delta F)\) is a finite set, so that

$$\begin{aligned} \mathbf{M}_{0}(\delta F)=\{m^{0}_{1},\ldots,m^{0}_{n}\} \quad \mbox{for some } -\infty< m^{0}_{1}< \cdots< m^{0}_{n}< \infty. \end{aligned}$$

We also need to introduce the set

$$\begin{aligned} B_{0}:=\{x\in\mathbb{R}:~\delta F~\mbox{increasing in a right neighborhood of}~x\}, \qquad x_{0}:=\inf B_{0}.& \end{aligned}$$

Here, \(x\in B_{0}\) means that for all \(\varepsilon>0\), we may find \(x_{\varepsilon}\in(x,x+\varepsilon)\) such that \(\delta F(x_{\varepsilon})> \delta F(x)\). Observe that

$$\begin{aligned} x_{0}< m^{0}_{1} \quad \mbox{and} \quad \delta F=0 \mbox{ on }(-\infty,x_{0}], \end{aligned}$$

where the first inequality is a direct consequence of the definition of \(x_{0}\) and \(m^{0}_{1}\), and the second property follows from the characterization (3.1) of the dominance \(\mu\preceq\nu\) in the convex order.

Recall the function \(\gamma\) of (3.17). Our construction uses recursively the following ingredients:

\(\mathbf{(I_{1})}\) :

\(m_{0}\in\{-\infty\}\cup\mathbf{M}_{0}( \delta F)\) and \(A_{0}\subset B_{0}\cap(-\infty,m_{0})\) with \(\delta F>0\) on \(A_{0}\), satisfying \(G_{A_{0}}^{m_{0}}(-\infty, \cdot)=\gamma\) and \(\int_{-\infty}^{m_{0}}\mathbf{1} _{A_{0}}\,d \phi(\delta F)=\int_{-\infty}^{m_{0}}\,d\phi(\delta F)\) for all nondecreasing maps \(\phi\);

\(\mathbf{(I_{2})}\) :

\(\bar{x}_{0}\in B_{0}\cap[m_{0},m^{0}_{n})\) and \(t_{0}\in A_{0}\cup\{-\infty\}\) satisfying \(\delta F(t_{0})= \delta F(\bar{x}_{0})\ge0\) and \(G_{A_{0}}^{m_{0}}(t_{0},\bar{x}_{0})=0\).

Lemma 4.1

Let \(m_{1}:=\min(\mathbf{M}_{0}(\delta F)\cap(\bar{x}_{0},\infty))\), \(A_{1}:=( A_{0}\setminus[t_{0},m_{0}]) \cup(\bar{x}_{0},m _{1})\). Then:

  1. (i)

    \(\delta F>0\) on \(A_{1}\), \(G_{A_{1}}^{m_{1}}(-\infty ,\cdot)=\gamma\), and \(\int_{-\infty}^{m_{1}}\mathbf{1} _{A_{1}}d \phi(\delta F)=\int_{-\infty}^{m_{1}}\,d\phi(\delta F)\) for all nondecreasing maps \(\phi\).

  2. (ii)

    For all \(x\ge m_{1}\) with \(\delta F(x)\le\delta F(m _{1})\), there exists a unique scalar \(t^{m_{1}}_{A_{1}}(x)\in A_{1}\) such that \(G^{m_{1}}_{A_{1}}(t^{m_{1}}_{A_{1}}(x),x)=0\).

  3. (iii)

    The function \(x\mapsto t^{m_{1}}_{A_{1}}(x)\) is decreasing \(\mu\)-a.e. on \([m_{1},x_{1}]\), where we define \(x _{1} :=\inf\{x>m_{1}:g(x,t^{m_{1}}_{A_{1}}(x))\le x\}\).

  4. (iv)

    If \(x_{1}<\infty\), then \(x_{1}\in B_{0}\cap[m_{1},m ^{0}_{n})\setminus\mathbf{M}_{0}(\delta F)\), and \(\delta F(t^{m_{1}} _{A_{1}}(x_{1}))=\delta F(x_{1})\ge0\).

The proof of this lemma is reported in Sect. 7.1.

4.2 Explicit construction

We start by defining

$$ \textstyle\begin{array}{c} T_{d}(x) = T_{u}(x) = x \quad \mbox{for } x\le x_{0}, \end{array} $$

and we continue the construction of the maps \(T_{d}\), \(T_{u}\) along the following steps.

Step 1. Set \(m_{0}:=-\infty\), \(A_{0}:=\emptyset \), \(\bar{x}_{0}:=x_{0}\), \(t_{0}=-\infty\), and notice that \(\mathbf{(I_{1})}\), \(\mathbf{(I_{2})}\) are obviously satisfied by these ingredients. We may then apply Lemma 4.1 and obtain \(m_{1}:=m^{0}_{1}\), the smallest point in \(\mathbf{M}_{0}(\delta F)\), and \(A_{1}, x_{1},t_{1}:=t_{A_{1}}^{m_{1}}(x_{1})\). Define the maps \(T_{d}\), \(T_{u}\) on \((x_{0},x_{1})\) by

$$ \textstyle\begin{array}{c} T_{d}(x) = T_{u}(x) = x \quad \mbox{for } x_{0}< x\le m_{1}, \\ T_{d}(x):=t^{m_{1}}_{A_{1}}(x) \quad\mbox{and}\quad T_{u}(x):=g \big(x,T_{d}(x)\big) \quad \mbox{for } m_{1}\le x< x_{1}. \end{array} $$

If \(x_{1}=\infty\), this completes the construction, and we set \(m_{j}=x_{j}=\infty\) for all \(j>1\). See Fig. 1 below for such an example. Otherwise, Lemma 4.1 guarantees that the new ingredients \((m_{1},A_{1},x_{1},t_{1})\) satisfy Conditions \(\mathbf{(I_{1})}\), \(\mathbf{(I_{2})}\), and we may continue with the next step.

Fig. 1 Maps \(T_{d}\) and \(T_{u}\) built from two log-normal densities with variances 0.04 and 0.32; \(m_{1}=0.731\)

Step i. Suppose that the pair of maps \((T_{d},T _{u})\) is defined on \((-\infty,x_{i-1})\) for some quadruple \((m_{i-1},A_{i-1},x_{i-1},t_{i-1})\) satisfying Conditions \(\mathbf{(I_{1})}\), \(\mathbf{(I_{2})}\). We may then apply Lemma 4.1 and obtain \(m_{i}:=\min(\mathbf{M}_{0}(\delta F) \cap(x_{i-1},\infty))\) and \(A_{i}, x_{i},t_{i}:=t_{A_{i}} ^{m_{i}}(x_{i})\). Define the maps \(T_{d}\), \(T_{u}\) on \((x_{i-1},x_{i})\) by

$$ \textstyle\begin{array}{c} T_{d}(x) = T_{u}(x) = x \quad \mbox{for } x_{i-1}< x\le m_{i}, \\ T_{d}(x):=t^{m_{i}}_{A_{i}}(x) \quad\mbox{and}\quad T_{u}(x):=g \big(x,T_{d}(x)\big) \quad \mbox{for } m_{i}\le x< x_{i}. \end{array} $$

If \(x_{i}=\infty\), this completes the construction, and we set \(m_{j}=x_{j}=\infty\) for all \(j> i\). Otherwise, Lemma 4.1 guarantees that the new ingredients \((m_{i},A_{i},x_{i},t_{i})\) satisfy Conditions \(\mathbf{(I_{1})}\), \(\mathbf{(I_{2})}\), and we may continue with the next step.

Since \(\mathbf{M}_{0}(\delta F)\) is assumed to be finite, the last iteration can only have a finite number of steps. We observe that we may extend to the case where \(\mathbf{M}_{0}(\delta F)\) is countable; the delicate case of an accumulation point of \(\mathbf{M}_{0}(\delta F)\) could be addressed by means of transfinite induction. We deliberately choose to avoid such technicalities in order to focus on the main properties of the above construction.

Remark 4.2

(Some properties of \(T_{d}\)) From the above construction of \(T_{d}\), we see that

(i) \(T_{d}\) is right-continuous and decreasing on each interval \((m_{i},x_{i})\) \(\mu\)-a.e.; (ii) In general, the restriction of \(T_{d}\) to \(\bigcup_{i\ge0}(m_{i},x_{i})\) fails to be nonincreasing. However, for \(i\neq j\), we have \(T_{d}((m_{i},x_{i}))\cap T_{d}((m_{j},x_{j}))= \emptyset\). Consequently, the right-continuous inverse \(T_{d}^{-1}\) of \(T_{d}\) is well defined.

Remark 4.3

(Some properties of \(T_{u}\)) From the above construction of \(T_{u}\), we see that

(i) \(T_{u}\) is right-continuous, \(T_{u}([m_{i},x_{i}]) \subset[m_{i},x_{i}]\), and \(T_{u}(x)>x\) for \(x\in(m_{i},x_{i})\) for all \(i\);

(ii) \(T_{u}\) is nondecreasing, and strictly increasing \(\mu\)-a.e. The last property will be clear from Theorem 4.5(ii) below, and implies that the right-continuous inverse \(T_{u}^{-1}\) of \(T_{u}\) is well defined.

Remark 4.4

One could extend the above construction to the case where \(\mathbf{M}_{0}(\delta F)\) is countable with no point of right accumulation, thus weakening the conditions of Assumption 3.7. However, the condition in this assumption that \(\nu\) has no atoms is more difficult to bypass because then the ODEs in Theorem 4.5(i) fail to hold, in general, due to the fact that \(T_{d}^{-1}\circ T_{d}(x)\) and \(T_{u}^{-1}\circ T_{u}(x)\) may be strictly larger than the singleton \(\{x\}\).

4.3 The left-monotone martingale transport plan

The last construction provides, under Assumptions 3.5 and 3.7, our martingale version of the Fréchet–Hoeffding coupling for an irreducible pair \((\mu,\nu)\) with domain \((I,I)\), namely

$$\begin{aligned} T_{*}(x,dy) :=& \mathbf{1} _{D}(x) \delta_{\{x\}}(dy) \\ &{}+\mathbf{1} _{I\setminus D}(x) \big(q(x)\delta_{\{T_{u}(x)\}}(dy) +(1-q)(x) \delta_{\{T_{d}(x)\}}(dy) \big) \end{aligned}$$
(4.2)

with

$$\begin{aligned} D \;:=\; \bigcup\limits_{{i\ge0}}(x_{i-1},m_{i}] \quad\mbox{and} \quad& q(x) \;:=\; \frac{x-T_{d}(x)}{T_{u}(x)-T_{d}(x)}. \end{aligned}$$
(4.3)

We recall that our construction has a finite number of steps, \(N\le n\) say, due to our condition that \(\mathbf{M}_{0}(\delta F)\) is finite, and that the union in the definition of the set \(D\) is finite by our convention that \(m_{j+1}=x_{j}=\infty\) for all \(j\ge N\). Observe also from our previous construction that \(T_{d}(x)< x<T_{u}(x)\) on each \((m_{i},x_{i})\). Therefore, \(q\) takes values in \([0,1]\).

Theorem 4.5

Let \(\mu\preceq\nu\) be two probability measures on ℝ.

  1. (i)

    Assume that \((\mu,\nu)\) is irreducible, with domain \((I,I)\), and satisfies Assumptions  3.5 and 3.7. Then the probability measure \(\mathbb{P}_{*}(dx,dy):= \mu(dx)T_{*}(x,dy)\) on \(I\times I\) is the unique left-monotone transport plan in \(\mathcal{M}_{2}(\mu,\nu)\). Moreover, \(T_{u}\) and \(T_{d}\) solve the ODEs

    $$\begin{aligned} d(\delta F\circ T_{d}) &= -(1-q)\,dF_{\mu}, \\ d(F_{\nu}\circ T_{u}) &= q\,dF_{\mu} \end{aligned}$$

    whenever \(x\in[m_{i},x_{i})\) and \(T_{d}(x)\in\mathrm{int}(A_{i})\).

  2. (ii)

    Let \((\mu_{k},\nu_{k})_{k\ge0}\) be the decomposition of \((\mu,\nu)\) into irreducible components with corresponding domains \((I_{k},J_{k})_{k\ge0}\), as introduced in Proposition 3.2. Consider also the decomposition \(\mathbb{P}=\sum_{k\ge0} \mathbb{P}_{k}\in\mathcal{M}(\mu,\nu)\) with \(\mathbb{P}_{k}\in\mathcal{M}(\mu_{k},\nu_{k})\), \(k\ge0\). Then \(\mathbb{P}\) is left-monotone if and only if \(\mathbb{P}_{k}\) is left-monotone for all \(k\ge1\).

The proof of part (i) is reported in Sect. 7.1. Part (ii) is obvious given the decomposition of Proposition 3.2.

We conclude this subsection by the following remarkable property of \(T_{d}\) which uses the notation (3.5).

Proposition 4.6

Let \((\mu,\nu)\) be an irreducible component satisfying Assumptions 3.5 and 3.7. Let \(i\ge1\) be such that \(m_{i-}=m_{i}\). Then \(T_{d}(m_{i})=m_{i}\). If in addition \(F_{\mu}\), \(F _{\nu}\) are twice differentiable near \(m_{i}\), then \(T_{d}\) is also differentiable on \([m_{i},m_{i}+h)\) for some \(h>0\), with right derivatives at \(m_{i}\) given by

$$\begin{aligned} T_{d}'(m_{i}+)=-1/2 \quad\textit{and}\quad T_{d}''(m_{i}+)=+\infty . \end{aligned}$$

Proof

We denote \(f_{\mu}:=F'_{\mu}\), \(f_{\nu}:=F'_{\nu}\), \(\delta f:=f _{\nu}-f_{\mu}\).

By construction, we have \(T_{d}(m_{i})=m_{i}\) and the differentiation of the identity \(G_{A_{i}}^{m_{i}}(T_{d}(x),x)=0\) reproduces the mass conservation condition (3.13). This ordinary differential equation shows that \(T_{d}\) inherits the differentiability of \(F_{\nu}\) and \(F_{\mu}\) on \((m_{i},m_{i}+h)\) for some \(h>0\), with

$$\begin{aligned} T_{d}'(x) =& -\frac{g(x,T_{d}(x))-x}{g(x,T_{d}(x))-T_{d}(x)} \; \frac{f _{\mu}(x)}{\delta f(T_{d}(x))}, \quad x\in(m_{i},m_{i}+h). \end{aligned}$$

Let \(\varepsilon:=x-T_{d}(x)\) and recall that \(g(x,x)=x\). Then it follows from direct calculation that

$$\begin{aligned} g(x,T_{d})-x =& -\varepsilon\frac{\delta f}{f_{\nu}}(x) +\frac{ \varepsilon^{2}}{2}\bigg(\frac{\delta f'}{f_{\nu}} -\Big(\frac{ \delta f}{f_{\nu}} \Big)^{2} \frac{f_{\nu}'}{f_{\nu}} \bigg)(x) +o( \varepsilon^{2}), \\ \delta f\big( T_{d}(x) \big) =& \delta f(x) -\varepsilon\delta f'(x) +o(\varepsilon), \end{aligned}$$

where \(o\) is a continuous function with \(o(0)=0\). Then, for \(x\in(m_{i},m_{i}+h)\),

$$\begin{aligned} T_{d}'(x) =& \frac{-\frac{\delta f}{f_{\nu}} +\frac{1}{2}\varepsilon (\frac{\delta f'}{f_{\nu}} -(\frac{\delta f}{f_{\nu}})^{2} \frac{f _{\nu}'}{f_{\nu}} )+o(\varepsilon)}{1-\frac{\delta f}{f_{\nu}} + \frac{1}{2}\varepsilon(\frac{\delta f'}{f_{\nu}} -(\frac{\delta f}{f _{\nu}})^{2} \frac{f_{\nu}'}{f_{\nu}} )+o(\varepsilon)} \;\frac{f _{\mu}}{\delta f-\varepsilon\delta f'+o(\varepsilon)} (x). \end{aligned}$$

Notice that \(0\le x-m_{i}\le\varepsilon\). Then, since we have \(f_{\mu}(m_{i})=f_{\nu}(m_{i})\), we obtain that \(\delta f(x)=(x-m_{i})\delta f'(x)+o(\varepsilon)\) and therefore

$$\begin{aligned} T_{d}'(x) =& \frac{-\delta f(x)+\frac{1}{2}\varepsilon\delta f'(x)+o(\varepsilon)}{\delta f(x)-\varepsilon\delta f'(x)+o(\varepsilon)} \\ =& \frac{-(x-m_{i})+\frac{1}{2}\varepsilon+o(\varepsilon)}{(x-m_{i})-\varepsilon+o(\varepsilon)} \\ =& \frac{-\frac{1}{2}+\frac{x-m_{i}}{\varepsilon}+o(1)}{1-\frac{x-m_{i}}{\varepsilon}+o(1)}, \quad x\in(m_{i},m_{i}+h), \end{aligned}$$
(4.4)

where we recall that \(\varepsilon=x-T_{d}(x)\). Since \(T_{d}\) is nonincreasing, this implies further that \(0\le x-m_{i}\le\frac{1}{2}\varepsilon\). Moreover, by the convergence \(T_{d}(x)\to m_{i}\) as \(x\searrow m_{i}\), we see that \(m_{i}=T_{d}(x)+(m_{i}-x)T_{d}'(x)+o(x-m_{i})\) and thus \(\frac{x-T_{d}(x)}{x-m_{i}}=1-T_{d}'(x)+o(1)\). Substituting this in (4.4), we get

$$\begin{aligned} T_{d}'(x) =& \frac{\frac{1}{2}(1+T_{d}'(x))+o(1)}{-T_{d}'(x)+o(1)}, \quad x\in(m_{i},m_{i}+h), \end{aligned}$$

from which we conclude that \(T_{d}'(x)\to-1/2\) as \(x\searrow m_{i}\).

Finally, we compute \(T_{d}''(m_{i})\). By the ODE satisfied by \(T_{d}\) and the smoothness of \(g\), it follows that \(T_{d}'\) is differentiable at any \(x>m_{i}\). We then differentiate the ODE satisfied by \(T_{d}\) and use Taylor expansions as above. The result follows from direct calculation by sending \(x\searrow m_{i}\). □
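The right slope \(-1/2\) can also be observed numerically, reusing the functions \(T_{d}\) and \(m_{1}\) from the sketch at the end of Sect. 3.4 (same log-normal example):

```python
# Finite-difference slopes of T_d just to the right of m1; they should approach -1/2
# as h decreases, in line with Proposition 4.6.
for h in (4e-2, 2e-2, 1e-2):
    print(h, (T_d(m1 + 2 * h) - T_d(m1 + h)) / h)
```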

5 Martingale one-dimensional Brenier theorem

5.1 Derivation of the optimal semi-static hedging strategy

Similarly to our construction, the optimal semi-static hedging strategy will be obtained separately on each irreducible component. Consequently, we may assume without loss of generality that the pair \((\mu,\nu)\) is irreducible.

We start by following the same line of argument as in the proof of Theorem 2.2. Our objective is to construct a triple

$$\begin{aligned} (\varphi_{*},\psi_{*},h_{*})\in\mathcal{D}_{2} \quad \mbox{such that}\quad& \mu(\varphi_{*})+\nu(\psi_{*})= \mathbb{E} ^{\mathbb{P}_{*}}[c(X,Y)]. \end{aligned}$$
(5.1)

This will provide equality in (3.3) with the optimality of \(\mathbb{P}_{*}\) for the optimal transportation problem \(P_{2}\) and the optimality of \((\varphi_{*},\psi_{*},h_{*})\) for the dual problem \(D_{2}\).

By the definition of the dual set \(\mathcal{D}_{2}\), we observe that the requirement (5.1) is equivalent to

$$ \varphi_{*}(X)+\psi_{*}(Y)+h_{*}(X)(Y-X)-c(X,Y)=0\quad \mathbb{P}_{*}\mbox{-a.s. for some function}~h_{*} $$
(5.2)

and that the function \(\varphi_{*}\) is determined from \((\psi_{*},h _{*})\) by

$$\begin{aligned} \varphi_{*}(x) & \phantom{:}= \max_{y\in\mathbb{R}}H(x,y), \\ \mbox{where } H(x,y) &:=c(x,y)-\psi_{*}(y)-h_{*}(x)(y-x), \quad x,y\in\mathbb{R}. \end{aligned}$$
(5.3)

Recall the set \(D\) defined in (4.3) on which we have \(T_{d}(x)=T_{u}(x)=x\), \(x\in D\), and the right-continuous inverse functions \(T_{d}^{-1}\), \(T_{u}^{-1}\) defined in Remark 4.2(ii) and Remark 4.3(ii). From the perfect replication property (5.2), it follows that \(h_{*}\) is determined on \(D^{c}\) in terms of \(\psi_{*}\) by

$$\begin{aligned} h_{*}(x) =& \frac{(c(x,\cdot)-\psi_{*})\circ T_{u}(x) -(c(x,\cdot)- \psi_{*})\circ T_{d}(x)}{(T_{u}-T_{d})(x)} \quad \mbox{for }x\in D^{c}. \end{aligned}$$
(5.4)

Since \(T_{u}\) and \(T_{d}\) are maximizers in (5.3), it follows from the first order condition that

$$\begin{aligned} \psi_{*}'\circ T_{u}(x) =& c_{y}\big(x,T_{u}(x)\big)-h_{*}(x), \quad x\in D^{c}, \end{aligned}$$
(5.5)
$$\begin{aligned} \psi_{*}'\circ T_{d}(x) =& c_{y}\big(x,T_{d}(x)\big)-h_{*}(x), \quad x\in D^{c}, \end{aligned}$$
(5.6)
$$\begin{aligned} \psi_{*}'(x) =& c_{y}(x,x)-h_{*}(x) \quad \mbox{for }x\in D. \end{aligned}$$
(5.7)

Differentiating (5.4) and using (5.5) and (5.6), we see that for \(x\in D^{c}\),

$$\begin{aligned} h_{*}' =& \frac{d}{dx}\bigg(\frac{c(\cdot,T_{u}) - c(\cdot,T_{d})}{T _{u} - T_{d}} \bigg) \\ &{}+\frac{T_{u}' - T_{d}'}{T_{u} - T_{d}}\; \frac{\psi_{*}(T_{u}) - \psi_{*}(T_{d})}{T_{u} - T_{d}} +\frac{T_{d}'(c_{y}(\cdot,T_{d})-h _{*}) - T_{u}'(c_{y}(\cdot,T_{u})-h_{*})}{T_{u} - T_{d}}, \end{aligned}$$

which leads by direct calculation to

$$\begin{aligned} h_{*}' =& \frac{c_{x}(\cdot,T_{u})-c_{x}(\cdot,T_{d})}{T_{u}-T_{d}} \quad \mbox{on }D^{c}. \end{aligned}$$
(5.8)

This determines \(h_{*}\) on \(D^{c}\) up to irrelevant constants. By evaluating (5.6) at the point \(T_{d}^{-1}(x)\in D^{c}\), for \(x\in D\), it follows from (5.7) that

$$\begin{aligned} c_{y}(x,x)-h_{*}(x) = c_{y}\big(T_{d}^{-1}(x),x\big)-h_{*}\circ T_{d} ^{-1}(x), \quad x\in D. \end{aligned}$$
(5.9)

Since \(T_{d}\) and \(T_{u}\) take values in \(D\) and \(D^{c}\), respectively, and \(h_{*}\) is determined by (5.9) on \(D\), we see that \(h_{*}|_{D^{c}}\) is determined by (5.8), and (5.5), (5.6) determine \(\psi_{*}\) on ℝ.

5.2 Main result

The previous formal derivations suggest the following candidate functions for the semi-static hedging strategy. Up to a constant, the dynamic hedging component \(h_{*}\) is defined at each continuity point by

$$\begin{aligned} h_{*}' &=\frac{c_{x}(\cdot,T_{u})-c_{x}(\cdot,T_{d})}{T_{u}-T_{d}} \quad \mbox{on}~D^{c}, \end{aligned}$$
(5.10)
$$\begin{aligned} h_{*} &=h_{*}\circ T_{d}^{-1}+c_{y}(\cdot,\cdot)-c_{y}(T_{d}^{-1}, \cdot) \quad \mbox{on}~D. \end{aligned}$$
(5.11)

The payoff function \(\psi_{*}\) is defined up to a constant on each continuity interval by

$$\begin{aligned} \psi_{*}' &=c_{y}(T_{u}^{-1},\cdot)-h_{*}\circ T_{u}^{-1} \quad \mbox{on }D^{c}, \end{aligned}$$
(5.12)
$$\begin{aligned} \psi_{*}' &=c_{y}(T_{d}^{-1},\cdot)-h_{*}\circ T_{d}^{-1} \quad \mbox{on }D. \end{aligned}$$
(5.13)

The corresponding function \(\varphi_{*}\) is given by

$$\begin{aligned} \varphi_{*}(x) &= \mathbb{E}^{\mathbb{P}_{*}}[c(X,Y)-\psi_{*}(Y)|X=x] \\ &= q(x)\big(c(x,\cdot)-\psi_{*}\big)\circ T_{u}(x) +\big(1-q(x) \big)\big(c(x,\cdot)-\psi_{*}\big)\circ T_{d}(x), \quad x\in\mathbb{R}. \end{aligned}$$
(5.14)

Finally, we define \(h_{*}\) and \(\psi_{*}\) from (5.10)–(5.13) by imposing that

$$ c(\cdot,T_{u})-\psi_{*}(T_{u}) -\big(c(\cdot,T_{d})-\psi_{*}(T_{d}) \big) -(T_{u}-T_{d})h_{*} \quad\mbox{is continuous.} $$
(5.15)

The last requirement is obviously possible as the number of jumps of \(T_{d}\) and \(T_{u}\) is finite, due to our assumption that \(\mathbf{M} _{0}(\delta F)\) is finite. Indeed, (5.15) determines \(\psi_{*}(T_{u})\) from \(\psi_{*}(T_{d})\) at discontinuity points, from left to right.
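Continuing the numerical sketch of Sects. 3.4 and 4 (same log-normal pair; it reuses m1, x_hi, T_d and T_u from the earlier sketch), the following illustration integrates (5.10)–(5.13) along the two maps for the coupling \(c(x,y)=xy^{2}\), which satisfies \(c_{xyy}=2>0\), builds \(\varphi_{*}\) from (5.14), and checks the superhedging inequality on a grid. The gauge choices \(h_{*}(m_{1})=0\) and \(\psi_{*}(m_{1})=0\) are ours; they amount to adding an affine function of \(y\) to \(\psi_{*}\) and adjusting \(\varphi_{*}\), \(h_{*}\) accordingly:

```python
# Semi-static hedge for c(x,y) = x*y**2 (so c_xyy = 2 > 0) on the grid where the maps
# were constructed; reuses m1, x_hi, T_d, T_u from the earlier sketch.
import numpy as np
from scipy.integrate import cumulative_trapezoid

c    = lambda x, y: x * y**2
c_x  = lambda x, y: y**2
c_y  = lambda x, y: 2.0 * x * y
c_xy = lambda x, y: 2.0 * y

xs = np.concatenate([[m1], np.linspace(m1 + 5e-3, x_hi, 120)])
td = np.array([T_d(x) for x in xs])           # td[0] = m1
tu = np.array([T_u(x) for x in xs])           # tu[0] = m1

den = tu - td
den[0] = 1.0                                  # placeholder at the degenerate point x = m1

# h_* on D^c from (5.10), with the gauge choice h_*(m1) = 0
dh = (c_x(xs, tu) - c_x(xs, td)) / den
dh[0] = c_xy(m1, m1)                          # limiting value of (5.10) as x -> m1
h = cumulative_trapezoid(dh, xs, initial=0.0)

# psi_* along the two branches, from (5.12)-(5.13), with the gauge choice psi_*(m1) = 0
psi_u = cumulative_trapezoid(c_y(xs, tu) - h, tu, initial=0.0)   # psi_* at the points T_u(xs)
psi_d = cumulative_trapezoid(c_y(xs, td) - h, td, initial=0.0)   # psi_* at the points T_d(xs)
ys   = np.concatenate([td[::-1], tu[1:]])     # increasing grid of y values
vals = np.concatenate([psi_d[::-1], psi_u[1:]])
psi  = lambda y: np.interp(y, ys, vals)

# phi_* from (5.14)
q = (xs - td) / den
q[0] = 0.0                                    # immaterial: both branches coincide at x = m1
phi = q * (c(xs, tu) - psi(tu)) + (1.0 - q) * (c(xs, td) - psi(td))

# superhedging check: phi(x) + psi(y) + h(x)(y - x) >= c(x, y) on the grid
gap = (phi[:, None] + psi(ys)[None, :]
       + h[:, None] * (ys[None, :] - xs[:, None]) - c(xs[:, None], ys[None, :]))
print(gap.min())   # nonnegative in theory; tiny negative values reflect discretization error
```

In exact arithmetic, Theorem 5.1 below guarantees that the gap is nonnegative, with equality along \(y=T_{d}(x)\) and \(y=T_{u}(x)\); the reported minimum should therefore be of the order of the discretization error.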

Theorem 5.1

Let \((\mu,\nu)\) be an irreducible pair (without loss of generality) satisfying Assumptions 3.5 and 3.7. Assume further that \(\varphi_{*}^{+}\in\mathbb{L}^{1}(\mu)\), \(\psi_{*}^{+} \in\mathbb{L}^{1}(\nu)\), and that the partial derivative of the coupling function \(c_{xyy}\) exists and \(c_{xyy}>0\) on \(\mathbb{R} \times\mathbb{R}\). Then:

  (i) \((\varphi_{*},\psi_{*},h_{*})\in\mathcal{D}_{2}\).

  (ii) The strong duality holds for the martingale transportation problem, \(\mathbb{P}_{*}\) is a solution of \(P_{2}( \mu,\nu)\), and \((\varphi_{*},\psi_{*},h_{*})\) is a solution of \(D_{2}(\mu,\nu)\), i.e.,

    $$\begin{aligned} \int c\big(x,T_{*}(x,dy)\big)\mu(dx) =& \mathbb{E}^{\mathbb{P}_{*}}[c(X,Y)] = P_{2}(\mu,\nu)=D_{2}(\mu,\nu) \\ =&\mu(\varphi_{*})+\nu(\psi_{*}). \end{aligned}$$

Remark 5.2

(Symmetry: the right-monotone martingale transport plan)

(i) Suppose that \(c_{xyy}<0\). Then the upper bound \(P_{2}(\mu,\nu)\) is attained by the right-monotone martingale transport map

$$\begin{aligned} \bar{\mathbb{P}}_{*}(dx,dy) := \bar{\mu}(dx)\bar{T}_{*}(x,dy), \end{aligned}$$

where \(\bar{T}_{*}\) is defined as in (4.2) with the pair of probability measures \((\bar{\mu},\bar{\nu})\) by

$$\begin{aligned} F_{\bar{\mu}}(x):=1-F_{\mu}(-x), \qquad & F_{\bar{\nu}}(y) := 1-F_{\nu}(-y). \end{aligned}$$

To see this, we rewrite the optimal transportation problem equivalently with modified inputs

$$\begin{aligned} &\bar{c}(x,y):=c(-x,-y),\qquad \bar{\mu}\big((-\infty,x]\big):=\mu\big([-x,\infty)\big), \\ &\bar{\nu}\big((-\infty,y]\big) :=\nu\big([-y, \infty)\big), \end{aligned}$$

so that \(\bar{c}_{xyy}>0\) as required in Theorem 5.1. Note that the martingale constraint is preserved by the map \((x,y) \mapsto (-x,-y)\).
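In more detail (a small verification added here): if \(\mathbb{P}\in\mathcal{M}_{2}(\mu,\nu)\) and \(\bar{\mathbb{P}}\) denotes the law of \((\bar{X},\bar{Y}):=(-X,-Y)\) under ℙ, then \(\bar{X}\sim\bar{\mu}\), \(\bar{Y}\sim\bar{\nu}\), and

$$ \mathbb{E}^{\bar{\mathbb{P}}}[\bar{Y}|\bar{X}] = -\mathbb{E}^{\mathbb{P}}[Y|X] = -X = \bar{X}, \qquad \mathbb{E}^{\bar{\mathbb{P}}}\big[\bar{c}(\bar{X},\bar{Y})\big] = \mathbb{E}^{\mathbb{P}}\big[c(X,Y)\big], $$

so that \(\mathbb{P}\mapsto\bar{\mathbb{P}}\) is a bijection between \(\mathcal{M}_{2}(\mu,\nu)\) and \(\mathcal{M}_{2}(\bar{\mu},\bar{\nu})\) which leaves the optimal value unchanged.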

(ii) Suppose that \(c_{xyy}>0\). Then the lower bound problem is explicitly solved by the right-monotone martingale transport plan. Indeed, it follows from the first part (i) of the present remark that

$$\begin{aligned} \inf_{\mathbb{P}\in\mathcal{M}_{2}(\mu,\nu)} \mathbb{E}^{\mathbb{P}}[c(X,Y)] =& - \sup_{\mathbb{P}\in\mathcal{M}_{2}(\mu,\nu)} \mathbb{E}^{\mathbb{P}}[-c(X,Y)] \\ =& \mathbb{E}^{\bar{\mathbb{P}}_{*}}[c(X,Y)] = \int c\big(x,\bar{T}_{*}(x,dy)\big)\mu(dx). \end{aligned}$$

Remark 5.3

The martingale counterpart of the Spence–Mirrlees condition is \(c_{xyy}> 0\). We now argue that this condition is the natural requirement in the present setting. Indeed, the optimization problem is not affected by the modification of the coupling function from \(c\) to \(\bar{c}(x,y):=c(x,y)+a(x)+b(y)+h(x)(y-x)\) for any \(a\in\mathbb{L} ^{1}(\mu)\), \(b\in\mathbb{L}^{1}(\nu)\) and \(h\in\mathbb{L}^{0}\). Since \(c_{xyy}=\bar{c}_{xyy}\), the condition \(c_{xyy}> 0\) is stable under this class of transformations of the coupling function.

Remark 5.4

(Comparison with Beiglböck and Juillet [4]) The remarkable notion of left-monotone martingale transport was introduced by Beiglböck and Juillet [4], where existence and uniqueness are proved.

1. We first show that their conditions on the coupling function fall in the context of our Theorem 5.1:

  • The first class of coupling functions considered in [4] has the form \(c(x,y)=h(y-x)\) for some differentiable function \(h\) whose derivative is strictly concave. Notice that this form of coupling essentially falls under our condition \(c_{xyy}>0\), since \(c_{xyy}(x,y)=-h'''(y-x)\ge0\) wherever \(h'''\) exists.

  • The second class of coupling functions considered in [4] has the form \(c(x,y)=\psi(x)\phi(y)\), where \(\psi\) is a nonnegative decreasing function and \(\phi\) a nonnegative strictly concave function. This class also essentially falls under our condition \(c_{xyy}>0\), since \(c_{xyy}(x,y)=\psi'(x)\phi''(y)\ge0\).

2. The proof of [4] does not use the dual formulation of the martingale optimal transport problem. Instead, they extend the concept of cyclical monotonicity to the martingale context and provide an existence result without an explicit characterization of the maps \((T_{d},T_{u})\). Also, our derivation of the optimal semi-static hedging strategy \((\varphi_{*},\psi_{*},h_{*})\) is new. We recall, however, that the result of [4] does not require our Assumption 3.7.

3. Our construction agrees with the example given by two log-normal distributions \(\mu= \mu_{0}:=e^{\mathcal{N}(-\sigma^{2}_{1}/2, \sigma^{2}_{1})}\) and \(\nu= \nu_{0}:=e^{\mathcal{N}(-\sigma^{2}_{2}/2, \sigma^{2}_{2})}\), \(\sigma^{2}_{1}<\sigma_{2}^{2}\), illustrated in Fig. 2 of [4]. Using our construction, we reproduce the left-monotone transference map in Fig. 1. Indeed, in this case, \(x_{0}=-\infty\), \(\delta F\) has a unique local (and therefore global) maximizer \(m_{1}\), and \(x_{1}=\infty\). The left-monotone transport plan is thus obtained from our construction after Step 1, i.e., no further steps are needed in this case.

Example 5.5

We provide an example where \(\delta F\) has two local maxima and the construction needs two steps. Let \(\mu\) and \(\nu\) be defined by

$$ \mu= \mu_{1}:=\mathcal{N}(1,0.5), \qquad \nu= \nu_{1}:=\frac{1 }{3} \big( \mathcal{N}(1,2)+\mathcal{N}(0.6,0.1)+ \mathcal{N}(1.4,0.3) \big), $$

where \(\nu_{1}\) is the uniform mixture of the three normal distributions above. Clearly, \(\mu\) and \(\nu\) have mean 1 and \(\mu\preceq \nu\). A direct computation shows that \(\delta F\) has two local maxima, \(m_{1}=-0.15\) and \(m_{2}=0.72\). Figure 2 below reports the maps \(T_{u}\) and \(T_{d}\) as obtained from our construction.

Fig. 2  \(\delta F\) has two local maxima (top), and \(T_{d}\), \(T_{u}\) corresponding to \(\mu_{1}\), \(\nu_{1}\) (bottom)
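As a quick numerical check of this example (an illustrative sketch added here, not part of the original text), one may evaluate \(\delta F=F_{\nu_{1}}-F_{\mu_{1}}\) on a grid and locate its interior local maxima; reading the second parameter of \(\mathcal{N}(\cdot,\cdot)\) as the standard deviation reproduces the two maxima reported above.

    import numpy as np
    from scipy.stats import norm

    x = np.linspace(-3.0, 5.0, 8001)
    F_mu = norm.cdf(x, loc=1.0, scale=0.5)
    F_nu = (norm.cdf(x, loc=1.0, scale=2.0)
            + norm.cdf(x, loc=0.6, scale=0.1)
            + norm.cdf(x, loc=1.4, scale=0.3)) / 3.0
    dF = F_nu - F_mu                                  # delta F on the grid

    d = np.diff(dF)                                   # discrete derivative of delta F
    loc_max = x[1:-1][(d[:-1] > 0) & (d[1:] <= 0)]    # sign changes + -> -
    print(loc_max)                                    # two points, close to m1 and m2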

Remark 5.6

(Comparison with Hobson and Neuberger [27]) Our Theorem 5.1 does not apply to the coupling function \(c(x,y)=|x-y|\) considered by Hobson and Neuberger [27]. More importantly, the corresponding maps \(T^{\mathrm{HN}}_{u}\) and \(T^{\mathrm{HN}}_{d}\) introduced in [27] are both nondecreasing, with \(T^{\mathrm{HN}}_{d}(x)< x< T ^{\mathrm{HN}}_{u}(x)\) for all \(x\in\mathbb{R}\). Our solution \((T_{d},T_{u})\) is thus of a different nature; moreover, in contrast with \((T^{\mathrm{HN}}_{d},T^{\mathrm{HN}}_{u})\), our left-monotone martingale transport map \(T_{*}\) does not depend on the particular coupling function \(c\), as long as \(c_{xyy}>0\).

However, by following the line of argument of the proof of Theorem 5.1, we may recover the solution of Hobson and Neuberger [27]. As a matter of fact, our method of proof is similar to that of [27], as the dual problem \(D_{2}\) is exactly the Lagrangian obtained by the penalization of the objective function by Lagrange multipliers.

5.3 Some examples

Example 5.7

(Variance swap) The coupling function here is \(c(x,y)= (\ln(y/x))^{2}\), where \(\mu\) and \(\nu\) have support in \((0,\infty)\). In particular, it satisfies the requirement of Theorem 5.1 that \(c_{xyy}>0\). Then the optimal upper bound is given by

$$\begin{aligned} P_{2}(\mu,\nu) =& \int_{0}^{\infty}\bigg(q(x)\Big(\ln \frac{T_{u}(x) }{x}\Big)^{2} +(1-q)(x)\Big(\ln\frac{T_{d}(x) }{x} \Big)^{2} \bigg)\mu(dx), \end{aligned}$$

where \(q\) is set to an arbitrary value on \(D\). In Fig. 3, we have plotted \(\varphi_{*}\), \(\psi_{*}\) and \(h_{*}\) with marginal distributions \(\mu=\mu_{0}:=e^{\mathcal{N}(-\sigma_{1}^{2}/2,\sigma _{1}^{2})}\) and \(\nu=\nu_{0}:=e^{\mathcal{N}(-\sigma_{2}^{2}/2,\sigma _{2}^{2})}\), \(\sigma_{1}^{2}=0.04<\sigma_{2}^{2}=0.32\). We recall that the corresponding maps \(T_{d}\), \(T_{u}\) are plotted in Fig. 1. The expression for \(\psi_{*}'\) is

$$\begin{aligned} \psi_{*}'(x) = \frac{2 }{x} \ln\frac{x }{T_{u}^{-1}(x)} + 2 \int_{x _{0}}^{T_{u}^{-1}(x)} \frac{ \ln\frac{T_{u}(\xi) }{T_{d}(\xi)} }{\xi(T_{u}(\xi)-T_{d}(\xi))}\,d\xi. \end{aligned}$$

In particular, \(\psi_{*}''(x)=\frac{2 }{x^{2}}\) for all \(x\leq m_{1}\).

Fig. 3  Superreplication strategy for a 2-period variance swap given two log-normal densities with variances 0.04 and 0.32
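For completeness, the upper bound displayed in Example 5.7 reduces to a one-dimensional integral which can be evaluated by standard quadrature once the maps are available; a minimal sketch (with hypothetical function names, assuming \(T_{u}\), \(T_{d}\) and the density of \(\mu\) are supplied as callables, none of which is constructed here):

    import numpy as np
    from scipy.integrate import quad

    def variance_swap_bound(Tu, Td, mu_pdf, lo=1e-6, hi=50.0):
        """P_2(mu,nu) for c(x,y) = (ln(y/x))^2, as in Example 5.7."""
        def integrand(x):
            tu, td = Tu(x), Td(x)
            if tu == td:                         # x in D: the kernel leaves x in place
                return 0.0
            q = (x - td) / (tu - td)             # weight q(x) of the left-monotone kernel
            return (q * np.log(tu / x) ** 2
                    + (1.0 - q) * np.log(td / x) ** 2) * mu_pdf(x)
        value, _ = quad(integrand, lo, hi, limit=200)
        return value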

Example 5.8

Consider the coupling function \(c(x,y)=-(\frac{y }{x})^{p}\), \(p>1\), and let the measures \(\mu\), \(\nu\) be supported in \((0,\infty)\). This payoff function also satisfies the condition of Theorem 5.1 that \(c_{xyy}>0\). The best upper bound is then given by

$$\begin{aligned} P_{2}(\mu,\nu) = -\int_{0}^{\infty}\bigg(q(x)\Big( \frac{T_{u}(x)}{x}\Big)^{p} +(1-q)(x)\Big(\frac{T_{d}(x)}{x}\Big)^{p} \bigg) \mu(dx). \end{aligned}$$
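For the record, the condition \(c_{xyy}>0\) claimed in Examples 5.7 and 5.8 is elementary to verify on \((0,\infty)\times(0,\infty)\):

$$ c(x,y)=\Big(\ln\frac{y}{x}\Big)^{2}:\ c_{xy}=-\frac{2}{xy},\ c_{xyy}=\frac{2}{xy^{2}}>0; \qquad c(x,y)=-\Big(\frac{y}{x}\Big)^{p}:\ c_{xy}=\frac{p^{2}y^{p-1}}{x^{p+1}},\ c_{xyy}=\frac{p^{2}(p-1)\,y^{p-2}}{x^{p+1}}>0 \quad\mbox{for }p>1. $$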

6 The \(n\)-marginals martingale transport

In this section, we provide a direct extension of our results to the martingale transportation problem under finitely many marginals constraints. Fix an integer \(n\ge2\), and let \(X=(X_{1},\ldots,X_{n})\) be a vector of \(n\) random variables denoting the prices of some financial asset at dates \(t_{1}<\cdots<t_{n}\). Consider the probability measures \(\mu=(\mu_{1},\ldots,\mu_{n})\in(\mathcal{P}_{\mathbb{R}})^{n}\) with \(\mu_{1}\preceq\cdots\preceq\mu_{n}\) in the convex order and

$$\begin{aligned} \int|\xi|\mu_{i}(d\xi)< \infty \quad\mbox{and}\quad& \int\xi\mu _{i}(d\xi)=X_{0} \quad \mbox{for all } i=1,\ldots,n. \end{aligned}$$

Similarly to the two-marginals case, we introduce the set

$$\begin{aligned} \mathcal{M}_{n}(\mu) :=& \{\mathbb{P}\in\mathcal{P}_{n}(\mu): X~\mbox{is a $\mathbb{P}$-martingale} \}, \end{aligned}$$

where \(\mathcal{P}_{n}(\mu)\) was defined in (2.6). In the present martingale version, we introduce the one-step ahead martingale transport maps defined by means of the \(n-1\) pairs of maps \((T_{d}^{i},T_{u}^{i})\) by

$$\begin{aligned} T_{*}^{i}(x_{i},\cdot) :=& \mathbf{1} _{D_{i}}\delta_{\{x_{i}\}} + \mathbf{1} _{D_{i}^{c}} \big(q_{i}(x_{i})\delta_{T_{u}^{i}(x_{i})} +(1-q _{i})(x_{i})\delta_{T_{d}^{i}(x_{i})} \big), \end{aligned}$$

where \(q_{i}(\xi):=(\xi-T_{d}^{i}(\xi))/(T_{u}^{i}-T_{d}^{i})( \xi)\) for \(\xi\in D_{i}^{c}\) and \((D_{i},T^{i}_{d},T^{i}_{u})_{i=1, \ldots, n-1}\) are defined as in Sect. 4.2 with the pair \((\mu_{i},\mu_{i+1})\).
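For illustration, sampling one path from the candidate optimizer \(\mu_{1}(dx_{1})\prod_{i}T_{*}^{i}(x_{i},dx_{i+1})\) only requires the one-step data; a minimal sketch (the callables and their names are ours, assumed given rather than constructed here):

    import numpy as np

    def sample_path(x1, steps, rng=None):
        """steps = [(in_D_i, Td_i, Tu_i), ...] for i = 1, ..., n-1; returns (x_1, ..., x_n)."""
        rng = rng or np.random.default_rng()
        path, x = [x1], x1
        for in_D, Td, Tu in steps:
            if in_D(x):
                y = x                                  # on D_i the kernel is delta_{x_i}
            else:
                td, tu = Td(x), Tu(x)
                q = (x - td) / (tu - td)               # q_i(x_i) as defined above
                y = tu if rng.random() < q else td     # binomial martingale step
            path.append(y)
            x = y
        return path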

The \(n\)-marginals martingale transport problem is defined by

$$\begin{aligned} P_{n}(\mu) = \sup_{\mathbb{P}\in\mathcal{M}_{n}(\mu)} \mathbb{E} ^{\mathbb{P}}[c(X)], \end{aligned}$$

where the map \(c:\mathbb{R}^{n}\to\mathbb{R}\) is of the form

$$ c(x_{1},\ldots,x_{n}) = \sum_{i=1}^{n-1} c^{i}(x_{i},x_{i+1}) $$

for some upper semicontinuous functions \(c^{i}:\mathbb{R}\times \mathbb{R}\to\mathbb{R}\) with linear growth (or Condition (2.2)), \(i=1,\ldots,n-1\).

The dual problem is defined by

$$ D_{n}(\mu) := \inf_{(u,h)\in\mathcal{D}_{n}}\sum_{i=1}^{n}\mu_{i}(u _{i}), $$

where \(u=(u_{1},\ldots,u_{n})\) with components \(u_{i}:\mathbb{R} \to\mathbb{R}\) and \(h=(h_{1},\ldots,h_{n-1})\) with components \(h_{i}:\mathbb{R}^{i}\to\mathbb{R}\), taken from the set of dual variables

$$ \mathcal{D}_{n} := \bigg\{ (u,h):u_{i}^{+}\in\mathbb{L}^{1}(\mu_{i}), h_{i}\in\mathbb{L}^{0}(\mathbb{R}^{i}), ~\mbox{and}~ \bigoplus_{i=1} ^{n} u_{i} + \sum_{i=1}^{n-1} h_{i}^{\otimes^{i}} \ge c \bigg\} . $$

Here, \(\oplus_{i=1}^{n} u_{i}(x)=\sum_{i\le n}u_{i}(x_{i})\) and \(h_{i}^{\otimes^{i}}(x)=h_{i}(x_{1},\ldots,x_{i})(x_{i+1}-x_{i})\).
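For instance, with \(n=3\), the inequality defining \(\mathcal{D}_{3}\) reads: for all \((x_{1},x_{2},x_{3})\in\mathbb{R}^{3}\),

$$ u_{1}(x_{1})+u_{2}(x_{2})+u_{3}(x_{3}) +h_{1}(x_{1})(x_{2}-x_{1})+h_{2}(x_{1},x_{2})(x_{3}-x_{2}) \;\ge\; c(x_{1},x_{2},x_{3}), $$

i.e., static positions in European payoffs \(u_{i}(X_{i})\) with maturity \(t_{i}\), combined with a self-financing dynamic position in the underlying asset rebalanced at each date on the basis of the prices observed so far.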

Similarly to the two-marginals problems studied before, the weak duality inequality \(P_{n}(\mu)\le D_{n}(\mu)\) is obvious, and the following result establishes equality under suitable conditions.

To derive the structure of the optimal hedging strategy, we consider the two-marginals problems for \((\mu_{i},\mu_{i+1})\) with coupling functions \(c^{i}\). By Theorem 5.1, we have for \(i=1,\ldots,n-1\) that

$$\begin{aligned} P^{i}_{2}(\mu_{i},\mu_{i+1}) := \sup_{\mathbb{P}\in\mathcal{M}(\mu_{i},\mu_{i+1})}\mathbb{E}^{ \mathbb{P}}[c^{i}(X,Y)] =& \inf_{(\varphi,\psi,h)\in\mathcal{D}^{i}_{2}} \big(\mu_{i}(\varphi )+\mu_{i+1}(\psi)\big) \\ =&\mu_{i}(\varphi_{i}^{*})+\mu_{i+1}(\psi_{i}^{*}), \end{aligned}$$

where \(\mathcal{D}^{i}_{2}\) is defined as in (3.2) with \(c^{i}\) substituted for \(c\), and \((\varphi_{i}^{*},\psi_{i}^{*},h_{i} ^{*})\in\mathcal{D}^{i}_{2}\) are defined as in (5.10)–(5.14) with \(c^{i}\) substituted for \(c\) and \((T_{u}^{i},T_{d}^{i})\) substituted for \((T_{u},T_{d})\). Finally, we define

$$\begin{aligned} u^{*}_{i}(x_{i}):=\mathbf{1} _{\{i< n\}}\varphi^{*}_{i}(x_{i})+\mathbf{1} _{\{i>1\}}\psi^{*}_{i-1}(x_{i}), \quad i=1,\ldots,n, \end{aligned}$$

and \(u^{*}:=(u^{*}_{1},\ldots,u^{*}_{n})\), \(h^{*}:=(h^{*}_{1},\ldots ,h^{*}_{n-1})\).
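With this choice, and identifying \(h_{i}^{*}\) with the map \((x_{1},\ldots,x_{i})\mapsto h_{i}^{*}(x_{i})\), the two-marginals dual inequalities pool into the required global one: for every \(x\in\mathbb{R}^{n}\),

$$ \bigoplus_{i=1}^{n} u^{*}_{i}(x_{i}) + \sum_{i=1}^{n-1} h_{i}^{*\otimes^{i}}(x) = \sum_{i=1}^{n-1}\Big(\varphi_{i}^{*}(x_{i})+\psi_{i}^{*}(x_{i+1})+h_{i}^{*}(x_{i})(x_{i+1}-x_{i})\Big) \ge \sum_{i=1}^{n-1} c^{i}(x_{i},x_{i+1}) = c(x), $$

where the inequality holds for each summand since \((\varphi_{i}^{*},\psi_{i}^{*},h_{i}^{*})\in\mathcal{D}^{i}_{2}\) by Theorem 5.1(i). This is the feasibility \((u^{*},h^{*})\in\mathcal{D}_{n}\) used in the proof of Theorem 6.1 below.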

Theorem 6.1

Let \((\mu_{i})_{1\le i\le n}\) be probability measures on \(\mathbb{R}\) without atoms, with \(\mu_{1}\preceq\cdots\preceq\mu _{n}\) in convex order, \((\mu_{i-1},\mu_{i})\) irreducible, and \(\mathbf{M}_{0}(F_{\mu_{i}}-F_{\mu_{i-1}})\) finite for all \(1< i \le n\). Assume further that

  • \(c^{i}\) have linear growth, the cross derivatives \(c^{i}_{xyy}\) exist and satisfy \(c^{i}_{xyy}>0\);

  • \(\varphi_{i}^{*}\), \(\psi_{i}^{*}\) satisfy the integrability conditions \((\varphi_{i}^{*})^{+} \in\mathbb{L}^{1}(\mu_{i})\), \((\psi_{i}^{*})^{+}\in\mathbb{L}^{1}(\mu_{i+1})\).

Then the strong duality holds, the probability measure

$$ \mathbb{P}^{*}_{n}(dx)=\mu_{1}(dx_{1})\prod_{i=1}^{n-1}T_{*}^{i}(x _{i},dx_{i+1}) $$

on \(\mathbb{R}^{n}\) is optimal for the martingale transportation problem \(P_{n}(\mu)\), and \((u^{*},h^{*})\) is optimal for the dual problem \(D_{n}(\mu)\), i.e.,

$$ \mathbb{P}^{*}_{n}\in\mathcal{M}_{n}(\mu), ~~ (u^{*},h^{*})\in\mathcal{D}_{n}, \quad\mathit{and}\quad\mathbb{E} ^{\mathbb{P}^{*}_{n}}[c(X)]=P_{n}(\mu)=D_{n}(\mu)=\sum_{i=1}^{n}\mu _{i}(u^{*}_{i}). $$

Proof

Clearly, we have \(\mathbb{P}^{*}_{n}\in\mathcal{M}_{n}(\mu)\), which gives the inequality \(\mathbb{E}^{\mathbb{P}^{*}_{n}}[c(X)] \le P_{n}(\mu)\). We next observe that \((u^{*},h^{*})\in\mathcal{D} _{n}\) from our construction. Then

$$ D_{n}(\mu)\le\sum_{i\le n}\mu_{i}(u_{i}^{*})= \mathbb{E}^{ \mathbb{P}^{*}_{n}}[c(X)]. $$

Here the equality holds because, under \(\mathbb{P}^{*}_{n}\), each pair \((X_{i},X_{i+1})\) has law \(\mu_{i}(dx_{i})T_{*}^{i}(x_{i},dx_{i+1})\), so that \(\mathbb{E}^{\mathbb{P}^{*}_{n}}[c^{i}(X_{i},X_{i+1})]=\mu_{i}(\varphi_{i}^{*})+\mu_{i+1}(\psi_{i}^{*})\) by Theorem 5.1. The required result then follows from the weak duality inequality \(P_{n}(\mu)\le D_{n}(\mu)\). □

Remark 6.2

The optimal lower bound for a coupling function as in Theorem 6.1 is attained by the mirror solution introduced in Remark 5.2.

Example 6.3

(Discrete monitoring variance swaps) This is a continuation of our Example 5.7. Suppose that \((\mu_{i})_{1\le i\le n}\) have support in \((0,\infty)\) with mean \(X_{0}\) and satisfy the conditions of Theorem 6.1. Let \(c(x_{1},\ldots,x_{n}):=\sum_{i=1} ^{n} (\ln\frac{x_{i} }{x_{i-1}})^{2}\), with the convention \(x_{0}:=X_{0}\). Then

$$\begin{aligned} P_{n}(\mu) =& \int\bigg(\ln\frac{\xi}{X_{0}}\bigg)^{2}\mu_{1}(d \xi) \\ &{} + \sum_{i=1}^{n-1} \int_{0}^{\infty}\bigg(q_{i}(\xi) \Big(\ln\frac{T ^{i}_{u}(\xi)}{\xi}\Big)^{2} +(1-q_{i})(\xi) \Big(\ln\frac{T^{i} _{d}(\xi)}{\xi}\Big)^{2} \bigg)\mu_{i}(d\xi). \end{aligned}$$

This optimal bound depends on all the marginals. The optimal lower bound is attained by our mirror solution; see Remark 6.2.

Remark 6.4

In a related robust hedging problem, Hobson and Klimmek [26] derived an optimal upper bound for a derivative \(c(x_{1},\ldots,x_{n})=\sum_{i=1}^{n-1}c^{0}(x_{i},x_{i+1})\). They also deal with the pricing of variance swaps in a continuous-time framework. The difference with our problem above is that they are only given the marginal distribution \(\mu_{n}\) of \(X_{n}\). See also Kahalé [31]. We would like to emphasize that [26] assume the variance kernel \(c^{0}\) to satisfy the conditions \(c^{0}(x,x)=c^{0}_{y}(x,x)=0\), \((x-y)c^{0}_{xy}+c^{0}_{x}>0\), together with our Spence–Mirrlees condition \(c^{0}_{xyy}>0\). These conditions on \(c^{0}\) seem entirely appropriate in the continuous-time setting. In the context of our problem with finitely many given marginals \(\mu_{1},\ldots,\mu_{n}\), notice that, apart from the Spence–Mirrlees condition, none of these requirements is preserved by the transformation of Remark 5.3.

7 Proofs of the main results

7.1 Construction of the left-monotone map

This section is devoted to the proof of Theorem 4.5.

Proof of Lemma 4.1

(i) That \(\delta F>0\) on \(A_{1}\) is obvious by construction. Also, for a nondecreasing function \(\phi\), the equality \(\int_{-\infty}^{m_{1}}\mathbf{1} _{A_{1}}d \phi(\delta F)=\int_{-\infty}^{m_{1}}\,d\phi(\delta F)\) follows immediately from the corresponding property verified by the pair \((m_{0},A_{0})\), the definition of \(A_{1}\), and the fact that \(\delta F(t_{0})=\delta F(\bar{x}_{0})\).

We next verify that \(G^{m_{1}}_{A_{1}}(-\infty,\cdot)=\gamma\), where

$$\begin{aligned} G^{m_{1}}_{A_{1}}(-\infty,x) = -\int_{-\infty}^{m_{1}} \big( g(x, \xi)- \xi\big) 1_{A_{1}}(\xi)\,d\delta F + \int_{m_{1}}^{x} \big(g( \xi,m_{1})-\xi\big)\,dF_{\mu}(\xi). \end{aligned}$$

By direct differentiation, we see that

$$\begin{aligned} dG^{m_{1}}_{A_{1}}(-\infty,x) =& \bigg(-\int_{-\infty}^{m_{1}} \!\! \mathbf{1} _{A_{1}}(\zeta)\,d_{\zeta}g(x,\zeta) +g(x,m_{1})-x\bigg)\,dF _{\mu}(x) \\ =& \big(F_{\nu}^{-1}\circ F_{\mu}(x)-x\big)\,dF_{\mu}(x), \end{aligned}$$

where the last equality follows from the first part of (i). We then rewrite

$$\begin{aligned} G^{m_{1}}_{A_{1}}(-\infty,x) =& G^{m_{1}}_{A_{1}}(-\infty,\bar{x} _{0})+\int_{\bar{x}_{0}}^{x} \big(F_{\nu}^{-1}\circ F_{\mu}(\xi)- \xi\big)\,dF_{\mu}(\xi). \end{aligned}$$
(7.1)

Since \(A_{1}=(A_{0}\setminus(t_{0},m_{0}]) \cup[\bar{x}_{0},m_{1}]\) and \(G_{A_{0}}^{m_{0}}(t_{0},\bar{x}_{0})=0\), we compute that

$$\begin{aligned} G^{m_{1}}_{A_{1}}(-\infty,\bar{x}_{0}) =& -\int_{-\infty}^{t_{0}} \big(g(\bar{x}_{0},\zeta)-\zeta\big)\mathbf{1} _{A_{0}}(\zeta)\,d\delta F(\zeta) +G_{A_{0}}^{m_{0}}(t_{0},\bar{x}_{0}) \\ &{} -\int_{\bar{x}_{0}}^{m_{1}}\big(g(\bar{x}_{0},\zeta)-\zeta\big)\,d \delta F(\zeta) +\int_{m_{1}}^{\bar{x}_{0}}\big(g(\xi,m_{1})-\xi \big)\,dF_{\mu}(\xi) \\ =& G_{A_{0}}^{m_{0}}(-\infty,\bar{x}_{0}) -\int_{\bar{x}_{0}}^{m _{1}}\big(g(\bar{x}_{0},\zeta)-\zeta\big)\,d\delta F(\zeta) \\ &{}+\int_{m_{1}}^{\bar{x}_{0}}\big(g(\xi,m_{1})-\xi\big)\,dF_{\mu}( \xi) \\ =& G_{A_{0}}^{m_{0}}(-\infty,\bar{x}_{0}) -\int_{\bar{x}_{0}}^{m _{1}}g(\bar{x}_{0},\zeta)\,d\delta F(\zeta) +\int_{m_{1}}^{\bar{x} _{0}}g(\xi,m_{1})\,dF_{\mu}(\xi) \\ &{}+\int_{\bar{x}_{0}}^{m_{1}}\zeta\, dF_{\nu}(\zeta) \\ =& G_{A_{0}}^{m_{0}}(-\infty,\bar{x}_{0}), \end{aligned}$$

where the last equality follows from a direct change of variables in the second and third terms. Plugging this into (7.1), it follows from a direct change of variables in the integral that

$$\begin{aligned} G^{m_{1}}_{A_{1}}(-\infty,x) =& G_{A_{0}}^{m_{0}}(-\infty,x) = \gamma(x). \end{aligned}$$

(ii) As \(m_{1}\in\mathbf{M}_{0}(\delta F)\), the definition of \(\mathbf{M}_{0}(\delta F)\) implies that \(F_{\mu}(x)>F_{ \mu}(\zeta)\) for all \(x>m_{1}\) and \(\zeta\in A_{1}\). Since \(\nu\) has no atoms, its right-continuous inverse \(F_{\nu}^{-1}\) is strictly increasing, implying that \(g(x,\zeta)-\zeta>g(\zeta, \zeta)-\zeta=F_{\nu}^{-1}\circ F_{\nu}(\zeta)-\zeta\). Moreover, since \(\delta F\) is strictly increasing on \(A_{1}\), we see that \(F_{\nu}\) is strictly increasing on \(A_{1}\), and therefore \(F_{\nu}^{-1}\circ F_{\nu}(\zeta)=\zeta\). Hence, \(g(x,\zeta)- \zeta>0\) on \(A_{1}\), and it follows that for \(t< m_{1}\le x\),

$$ t\mapsto G^{m_{1}}_{A_{1}}(t,x) \quad \mbox{is continuous, strictly increasing on } A_{1}, \mbox{ and flat on } (-\infty,m_{1}]\setminus A_{1}. $$

We next verify that \(G^{m_{1}}_{A_{1}}(m_{1},x):=\int_{m_{1}}^{x} (g( \xi,m_{1})-\xi)\,dF_{\mu}(\xi)>0\) as long as \(\delta F(m_{1})> \delta F(x)\). Indeed, for \(\xi\in(m_{1},x)\), we have \(\delta F(m _{1})>\delta F(\xi)\), implying that \(g(\xi,m_{1})>F_{\nu}^{-1} \circ F_{\nu}(\xi)\) by the increase of \(F_{\nu}^{-1}\). Notice that the right-continuous inverse \(F_{\nu}^{-1}\) satisfies \(F_{\nu}^{-1} \circ F_{\nu}(\xi)\ge\xi\). Then \(g(\xi,m_{1})>\xi\), and we deduce that \(G^{m_{1}}_{A_{1}}(m_{1},x)>0\) from the fact that \(F_{\mu}\) is strictly increasing in a right neighborhood of \(m_{1}\), by the definition of \(\mathbf{M}_{0}(\delta F)\).

Then, in order to establish the existence and uniqueness of \(t^{m_{1}}_{A_{1}}(x)\), it remains to verify that \(G^{m_{1}}_{A_{1}}(- \infty,x)=\gamma(x)<0\) for all \(x\ge m_{1}\).

Since \(\delta F\) increases from zero at the left extreme of its support, and increases to zero at the right extreme of its support, we see that \(\gamma(x)<0\) near both extremes of this support. Next, let \(x^{*}\) be any possible local maximizer of \(\gamma\). It follows from the first order condition in the expression (7.1) that \(\gamma\) is flat off \(\mbox{supp}(\mu)\), so we may assume that \(x^{*}\) is either an interior point of \(\mbox{supp}(\mu)\) or a left accumulation point of \(\mbox{supp}(\mu)\). In both cases, it follows from the first order condition that

$$\begin{aligned} F_{\nu}^{-1}\big(F_{\mu}(x^{*})-\big) \le x^{*} \le F_{\nu}^{-1} \big(F_{\mu}(x^{*})\big). \end{aligned}$$

If \(F_{\nu}^{-1}\) is continuous at the point \(F_{\mu}(x^{*})\), then \(\delta F(x^{*})=0\) and it follows that

$$ \gamma(x^{*}) = \int_{(-\infty,x^{*}]}\xi\,d\delta F(\xi) = - \int_{(-\infty,x^{*}]}(x^{*}-\xi)\,d\delta F(\xi) = -\int(x^{*}- \xi)^{+}\,d\delta F(\xi). $$

Here the second equality uses \(\delta F(x^{*})=0\). Since the pair \((\mu,\nu)\) is irreducible, it follows from (3.6) that \(\gamma(x^{*})<0\).

In the alternative case where \(F_{\nu}^{-1}\) jumps at the point \(F_{\mu}(x^{*})\), notice that \(F_{\nu}\) is flat to the right of \(F_{\nu}^{-1}\circ F_{\mu}(x^{*})\), and therefore the conclusion \(\gamma(x^{*})<0\) holds in this case as well.

(iii) Direct differentiation reveals that

$$\begin{aligned} dG_{A_{1}}^{m_{1}}\big(t_{A_{1}}^{m_{1}}(x),x\big) =& -\Big(g\big(t _{A_{1}}^{m_{1}}(x),x\big)-t_{A_{1}}^{m_{1}}(x)\Big)\, d\big(\delta F \circ t_{A_{1}}^{m_{1}}\big)(x) \\ &{}+\big(g(x,m_{1})-x\big)\,dF_{\mu}(x). \end{aligned}$$

The required result follows immediately from the restriction of \(t_{A_{1}}^{m_{1}}(x)\) to take values in a set of increase of \(\delta F\).

(iv) Suppose \(x_{1}<\infty\). Then since the possible jumps of \(F_{\nu}^{-1}\) are positive, it follows from the definition of \(x_{1}\) that \(g(x_{1},t_{A_{1}}^{m_{1}}(x_{1}))=x_{1}\), and \(F_{\mu}(x_{1})+\delta F(t_{A_{1}}^{m_{1}}(x_{1}))\) is a continuity point of \(F_{\nu}^{-1}\). Consequently, \(\delta F(t_{A_{1}}^{m_{1}}(x _{1}))=\delta F(x_{1})\), and

$$\begin{aligned} x_{1} =& \inf\big\{ x>m_{1}:~\delta F\big(t_{A_{1}}^{m_{1}}(x) \big)\le\delta F(x) \big\} . \end{aligned}$$
(7.2)

Since \(t_{1}:=t_{A_{1}}^{m_{1}}(x_{1})\in A_{1}\), we see that \(x_{1}\in B_{0}\) is necessarily a point of (right) increase of \(\delta F\), and we have

– either \(t_{1}\in[\bar{x}_{0},m_{1}]\), implying that \(\delta F(x _{1})=\delta F(t_{1})\ge\delta F(\bar{x}_{0})\ge0\);

– or \(t_{1}\in A_{0}\setminus(t_{0},m_{0}]\), implying again that \(\delta F(x_{1})\ge0\).

Finally, since \(\delta F\) increases to zero at the right extreme of its support, it follows from the fact that \(x_{1}\in B_{0}\) and \(\delta F(x_{1})\ge0\) that \(x_{1}\le m_{n}\), and by (7.2) together with the non-increase of \(t_{A_{1}}^{m_{1}}\), we see that \(x_{1}\notin\mathbf{M}_{0}(\delta F)\). □

Proof of Theorem 4.5(i)

By construction, the probability measure \(\mathbb{P}_{*}\) satisfies the left-monotonicity property of Definition 3.4. In the rest of this proof, we verify that \(\mathbb{P}_{*}\in\mathcal{M}_{2}(\mu,\nu)\). By the uniqueness result of Beiglböck and Juillet [4, Theorem 1.5 and Corollary 1.6], this implies that \(\mathbb{P}_{*}\) is the unique left-monotone transport plan.

First, by the definition of \(\mathbb{P}_{*}\) in (4.2), \(X\sim_{\mathbb{P}_{*}}\mu\) and \(\mathbb{E}^{\mathbb{P}_{*}}[Y|X]=X\). It remains to verify that \(Y\sim_{\mathbb{P}_{*}}\nu\). We argue as in the beginning of Sect. 7.1, considering separately the following alternatives for any point \(y\in\mathbb{R}\):

Case 1. \(y=y_{d}\in D\cap B_{0}\) corresponds to some point \(x\) such that \(y_{d}=T_{d}(x)\), and we see from the definition of \(\mathbb{P}_{*}\) that

$$\begin{aligned} \mathbb{P}_{*}[Y\in dy] = dF_{\mu}\big(T_{d}(x)\big)-(1-q)\,dF_{\mu}(x) \quad\mbox{and}\quad& dF_{\nu}\big(T_{u}(x)\big)=q\,dF_{\mu}. \end{aligned}$$

Since \(dF_{\nu}(T_{u})=q\,dF_{\mu}\) and \(T_{u}(x)=g(x,T_{d}(x))\), this provides

$$\begin{aligned} \mathbb{P}_{*}[Y\in dy] = d\big(F_{\mu}(T_{d})-F_{\mu}+F_{\nu}(T _{u})\big)(x) = dF_{\nu}(y). \end{aligned}$$

Case 2. \(y=y_{u}\in D^{c}\) corresponds to some \(x\) such that \(y_{u}=T_{u}(x)\). By the definition of \(\mathbb{P}_{*}\) and the fact that \(dF_{\nu}(T_{u})=q\,dF_{\mu}\), we see that

$$ \mathbb{P}_{*}[Y\in dy] = q\,dF_{\mu}(x) = dF_{\nu}(y). $$

Case 3. In the remaining alternative \(y\in D\setminus B_{0}\), we observe that the function \(\delta F\) is flat near \(y\), and there is no \(x\neq y\) such that \(T_{d}(x)=y\) or \(T_{u}(x)=y\). Then it follows from the definition of \(\mathbb{P}_{*}\) that

$$ \mathbb{P}_{*}[Y\in dy]=dF_{\mu}(y)=dF_{\nu}(y). $$

It remains to justify the ODEs satisfied by \(T_{u}\) and \(T_{d}\) as reported in part (i) of Theorem 4.5. Recall from Step i of the construction in Sect. 4.2 that \(T_{d}(x)\) is defined by the integral equation \(G_{A_{i}}(T_{d}(x),x)=0\) for \(m_{i}\le x< x_{i}\), where \(G_{A}\) is defined in (4.1). Differentiating this integral equation at a continuity point of \(T_{d}\), we see that

$$\begin{aligned} 0 =& -\big(F_{\nu}^{-1}\circ F_{\mu}(x)-x\big)\,dF_{\mu}(x) +\Big(g \big(x,T_{d}(x)\big)-F_{\nu}^{-1}\circ F_{\mu}(x)\Big)\,dF_{\mu}(x) \\ &{} +\Big(g\big(x,T_{d}(x)\big)-T_{d}(x)\Big)\,d\delta F\big(T_{d}(x) \big) \\ =& \Big(g\big(x,T_{d}(x)\big)-x\Big)\,dF_{\mu}(x) +\Big(g\big(x,T_{d}(x) \big)-T_{d}(x)\Big)\,d\delta F\big(T_{d}(x)\big). \end{aligned}$$

Since \(T_{u}=g(\cdot,T_{d})\), this is the required ODE for \(T_{d}\). The ODE for \(T_{u}\) is then obtained by using the relation \(T_{u}=g(\cdot,T_{d})\). □

7.2 Optimal semi-static strategy: proof of Theorem 5.1

Following the line of argument of the proof of Theorem 2.2, we see from the weak duality (3.3) that

$$\begin{aligned} \mathbb{E}^{\mathbb{P}_{*}}[c(X,Y)] \le P_{2}(\mu,\nu) \le D_{2}( \mu,\nu). \end{aligned}$$

In view of the identity \(\mu(\varphi_{*})+\nu(\psi_{*})=\mathbb{E}^{\mathbb{P}_{*}}[c(X,Y)]\), which follows by integrating (5.14), the proof of Theorem 5.1 is completed by the following result.

Lemma 7.1

Let \(\mu\), \(\nu\) be as in Assumptions 3.5 and 3.7, and suppose that the payoff function \(c\) satisfies \(c_{xyy}>0\). Then \(\varphi_{*}\oplus\psi_{*}+h_{*}^{\otimes}\ge c\).

Proof

(i) We first verify that the second order condition for a local maximum of \(H(x,\cdot)\) is satisfied on \(D^{c}\); recall that \(H(x,y)=c(x,y)-\psi_{*}(y)-h_{*}(x)(y-x)\), so that the claimed inequality \(\varphi_{*}\oplus\psi_{*}+h_{*}^{\otimes}\ge c\) amounts to \(\varphi_{*}(x)\ge H(x,y)\) for all \(x,y\). Differentiating (5.5), (5.6) and using the expression of \(h_{*}'\) in (5.10), we see that

$$\begin{aligned} H_{yy}(\cdot,T_{u})\,dT_{u} &= c_{yy}(\cdot,T_{u})\,dT_{u}-d\psi_{*}'(T _{u}) \\ &= \frac{c_{x}(\cdot,T_{u})-c_{x}(\cdot,T_{d})}{T_{u}-T_{d}}\,dx -c _{xy}(\cdot,T_{u})\,dx \end{aligned}$$

on \(D^{c}\). Since \(c_{xyy}>0\), this implies that

$$ H_{yy}(\cdot,T_{u})T_{u}'=\frac{c_{x}(\cdot,T_{u})-c_{x}(\cdot,T _{d})}{T_{u}-T_{d}} -c_{xy}(\cdot,T_{u})< 0, $$

and by the non-decrease of \(T_{u}\), it follows that \(H_{yy}(\cdot,T _{u})<0\). Similarly,

$$\begin{aligned} H_{yy}(\cdot,T_{d}) T_{d}' &= \big(c_{yy}(\cdot,T_{d})-\psi_{*}'' \circ T_{d}\big)T_{d}' \\ &= \frac{c_{x}(\cdot,T_{u})-c_{x}(\cdot,T_{d})}{T_{u}-T_{d}} -c _{xy}(\cdot,T_{d}) \;>\; 0 \end{aligned}$$

on \(D^{c}\), and by the non-increase of \(T_{d}\), this implies that \(H_{yy}(\cdot,T_{d})<0\).

(ii) We next show that \(y\mapsto H(\cdot,y)\) is increasing before \(T_{d}\) and decreasing after \(T_{u}\). In particular, this implies that

$$ \varphi_{*}(x) =\max_{y\in[T_{d}(x),T_{u}(x)]}H(x,y) \quad \mbox{for all } x\in\mathbb{R}. $$

Set \(y:=T_{u}(x)\), let \(m_{i}\) be the local maximum from which \((T_{d},T_{u})(x)\) is constructed, and consider an arbitrary \(y'=T_{u}(x')>y\) for some \(x'>x\). We only report the proof for the case \(x'\in(m_{j},x_{j}]\) for some \(j\ge i\); the remaining case \(x'\in(x_{j},m_{j+1}]\) for some \(j\ge i\) is treated similarly. Recalling that \(H_{y}(x,T_{u}(x))=0\), we decompose

$$\begin{aligned} H_{y}(x,y') =& H_{y}(x,y')-H_{y}(x,m_{j}\vee y) +\sum_{k=i+1}^{j} (A _{k}+B_{k}), \end{aligned}$$

where the last sum is set to zero whenever \(i=j\), and

$$ A_{k}:=H_{y}(x,m_{k})-H_{y}(x,x_{k-1}), \quad \quad B_{k}:=H_{y}(x,x_{k-1})-H _{y}\big(x,m_{k-1}\vee T_{u}(x)\big). $$

We next compute from the expression of \(h_{*}\) in (5.10), (5.11) that

$$\begin{aligned} H_{y}(x,y')-H_{y}(x,m_{j}\vee y) =& \int_{m_{j}\vee y}^{y'}\big(c _{yy}(x,\xi')-\psi''_{*}(\xi')\big)\,d\xi' \\ \le& \int_{m_{j}\vee y}^{y'}\Big(c_{yy}(x,\xi')-c_{yy}\big(T_{u} ^{-1}(\xi'),\xi'\big)\Big)\,d\xi' \\ =& -\int_{m_{j}\vee y}^{y'} \int_{x}^{T_{u}^{-1}(\xi')} c_{xyy}( \xi,\xi')\,d\xi\,d\xi' < 0, \end{aligned}$$

where the first inequality follows from the second order condition verified in (i), and the final strict inequality from \(c_{xyy}>0\) together with \(T_{u}^{-1}(\xi')>x\). Similarly, we compute that

$$\begin{aligned} A_{k} =& \int_{(x_{k-1},m_{k}]} \big(c_{yy}(x,\xi')-\psi''_{*}( \xi')\big)\,d\xi' \\ \le& \int_{x_{k-1}}^{m_{k}} \Big(c_{yy}(x,\xi')-c_{yy}\big(T_{d} ^{-1}(\xi'),\xi'\big)\Big)\,d\xi' \\ =& -\int_{x_{k-1}}^{m_{k}} \int_{x}^{T_{d}^{-1}(\xi')}c_{xyy}( \xi,\xi')\,d\xi\,d\xi' < 0, \end{aligned}$$

where we used again the second order condition verified in (i). Finally,

$$\begin{aligned} B_{k} =& \int_{m_{k-1}\vee T_{u}(x)}^{x_{k-1}} \big(c_{yy}(x,\xi')-\psi _{*}''(\xi')\big)\,d\xi' \\ \le& \int_{m_{k-1}\vee T_{u}(x)}^{x_{k-1}} \Big(c_{yy}(x,\xi')-c_{yy} \big(T_{u}^{-1}(\xi'),\xi'\big)\Big)\,d\xi' \\ =& -\int_{m_{k-1}\vee T_{u}(x)}^{x_{k-1}} \int_{x}^{T_{u}^{-1}(\xi')}c_{xyy}( \xi,\xi')\,d\xi\,d\xi' < 0. \end{aligned}$$

A similar argument shows that \(H_{y}(x,y')>0\) for \(y'< T_{d}(x)\), which proves the claimed monotonicity of \(H(x,\cdot)\) outside \([T_{d}(x),T_{u}(x)]\).

(iii) We next show that \(H(\cdot,T_{d})=H(\cdot,T_{u})\). Denote \(\delta H:=H(\cdot,T_{u})-H(\cdot,T_{d})\) and compute

$$\begin{aligned} \delta H' =& c_{x}(\cdot,T_{u})-c_{x}(\cdot,T_{d})-(T_{u}-T_{d})h _{*}' \\ &{} +\big(c_{y}(\cdot,T_{u})-\psi_{*}'(T_{u})-h_{*}\big)T_{u}' - \big(c_{y}(\cdot,T_{d})-\psi_{*}'(T_{d})-h_{*}\big)T_{d}' \end{aligned}$$

in the distributional sense. By the definition of \(\psi_{*}\) and \(h_{*}\), it follows that \(\delta H'= 0\) at any continuity point. Since \(\delta H\) is continuous by our construction, see (5.15), this shows that \(\delta H(x)=\delta H(m_{i})=0\), where \(m_{i}\) is the local maximizer from which \((T_{d},T_{u})(x)\) is defined and where \(T_{u}(m_{i})=T_{d}(m_{i})=m_{i}\).

(iv) We finally show that \(T_{u}\) and \(T_{d}\) are global maximizers of \(y\mapsto H(\cdot,y)\). Let \(x\in D^{c}\) and denote by \(m\) the local maximizer from which \(T_{d}(x)\) and \(T_{u}(x)\) are constructed. For fixed \(T=T_{u}(t)\in(m,T_{u}(x))\), it follows from similar calculations as in the previous step that

$$\begin{aligned} \partial_{x} \big(H(\cdot,T_{u})-H(\cdot,T)\big) =& c_{x}(\cdot,T _{u})-c_{x}(\cdot,T)-(T_{u}-T)h_{*}' \\ =& (T_{u}-T)\bigg( \frac{c_{x}(\cdot,T_{u})-c_{x}(\cdot,T)}{T_{u}-T} -\frac{c_{x}( \cdot,T_{u})-c_{x}(\cdot,T_{d})}{T_{u}-T_{d}} \bigg) \\ >&0 \end{aligned}$$

by the condition \(c_{xyy}>0\). Then \(H(\cdot,T_{u})-H(\cdot,T)=\int _{t}^{\cdot}\partial_{x} \big(H(\cdot,T_{u})-H(\cdot,T)\big)>0\), since this difference vanishes at \(x=t\), where \(T_{u}(t)=T\).

By a similar calculation, we also show that \(H\big(x,T_{d}(x)\big)-H(x,T) \ge0\) for all \(T\in (T_{d}(x),m)\). Since \(H\big(x,T_{u}(x)\big)=H\big(x,T_{d}(x)\big)\) by the previous step, this completes the proof that \(T_{d}(x)\) and \(T_{u}(x)\) are global maximizers of \(y\mapsto H(x,y)\). Together with \(\varphi_{*}(x)=q(x)H\big(x,T_{u}(x)\big)+\big(1-q(x)\big)H\big(x,T_{d}(x)\big)\), which follows from (5.14) and the martingale property of \(T_{*}(x,\cdot)\), this yields \(\varphi_{*}(x)\ge H(x,y)\) for all \(y\), i.e., \(\varphi_{*}\oplus\psi_{*}+h_{*}^{\otimes}\ge c\). □