1 Introduction

Statement of the problem

The valuation and hedging of contingent claims are major concerns in finance, both from a theoretical and a practical point of view. The continuous-time theory is well established (see for instance Karatzas and Shreve [16, Chap. 2]). But in practice, hedging can be performed only at discrete times, say \(t_{0} = 0 < t_{1} < \cdots< t_{N}= T\), yielding a residual risk. Here, we intend to hedge the claim \(H_{T}\) at time \(T\) using \(d\) hedging instruments with price processes \(X= ( X^{(1)}, \dots, X^{(d)})\). So the local P&L \({\mathcal{E}}_{n}\) associated with the hedging times \({t_{n}}\) and \({t_{n+1}}\) can be written as

$$ {\mathcal{E}}_{n}= V_{t_{n+1}}- V_{t_{n}}- {\langle{\vartheta_{t_{n}}}, X_{t_{n+1}}-X_{t_{n}} \rangle}. $$
(1.1)

Here, \(V\) stands for the valuation process and \(\vartheta= (\vartheta^{(1)}, \dots, \vartheta^{(d)})\) for the hedging process; \(\vartheta^{(i)}\) denotes the number of shares invested in the \(i\)th hedging instrument. Since we work with discounted prices, we may assume that the non-risky asset has zero drift.
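To fix ideas, the local P&L (1.1) is straightforward to evaluate along discretised paths; the following minimal sketch (in Python; the function name and array layout are ours, for illustration only) computes \({\mathcal{E}}_{0},\dots,{\mathcal{E}}_{N-1}\) from paths of \(V\), \(X\) and \(\vartheta\).

```python
import numpy as np

def local_pnl(V, X, theta):
    """Local P&L (1.1): E_n = V_{t_{n+1}} - V_{t_n} - <theta_{t_n}, X_{t_{n+1}} - X_{t_n}>.

    V     : shape (N+1,)   -- valuation along the hedging grid
    X     : shape (N+1, d) -- prices of the d hedging instruments
    theta : shape (N, d)   -- hedge ratios, adapted (chosen at t_n)
    Returns the array (E_0, ..., E_{N-1}).
    """
    dV = np.diff(V)                      # V_{t_{n+1}} - V_{t_n}
    dX = np.diff(X, axis=0)              # X_{t_{n+1}} - X_{t_n}
    return dV - np.einsum("nd,nd->n", theta, dX)
```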

In high-frequency hedging, the impact of discrete-time compared to continuous-time hedging is small (see for instance the convergence results by Bertsimas et al. [4] for smooth payoffs, with convergence in \(L^{2}\) and in distribution, and the results by Gobet and Temam [14] for irregular payoffs, which usually modify the convergence rate). In low-frequency hedging, such as in energy markets (see Christodoulou et al. [7]), the risk of the local P&L is larger and may become an issue. Our aim is to find valuation/hedging rules \((V,\vartheta)\) minimising this risk. We differ from existing results (for instance, those related to the quadratic local risk minimisation of Föllmer and Schweizer [10, 24]) by dealing with a risk function \(\ell\) penalising profits (\({\mathcal{E}}_{n}< 0 \)) and losses (\({\mathcal{E}}_{n}> 0\)) asymmetrically. So the integrated local risk under study takes the form

$$ \mathscr{E}_{N}(V,\vartheta) = \sum_{n=0}^{N-1} \mathbb{E}[ \ell({ \mathcal{E}}_{n}) ]. $$

The simplest case of such a risk function \(\ell\) is

$$ \ell_{\gamma}(y) = \big( 1 + \gamma\operatorname{Sgn}(y)\big)^{2} y^{2}/2, $$
(1.2)

with \(\gamma\in(0,1)\) to penalise losses more than profits (see Fig. 1). We define the above sign function as \(\operatorname{Sgn}(y) :=\mathbf{I}_{\{y>0\}} - \mathbf{I}_{\{y<0 \}}\). Such a risk function is inspired by the asymmetric quadratic loss of Newey and Powell [18] in the context of statistical estimation; it was later studied by Bellini et al. [3] to define a coherent risk measure known as the expectile, where the \(\alpha\)-expectile is linked to the loss function (1.2) with \(\alpha=\frac{(1+\gamma)^{2}}{2(1+\gamma^{2}) }\in[\frac{1}{2 },1]\). Expectiles are known to be appropriate risk measures when one wishes to weight profits and losses differently, as in our framework.

Fig. 1 Plot of the risk function \(\ell_{\gamma}\) for different \(\gamma\)
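For readers who wish to experiment numerically, the following minimal sketch implements \(\ell_{\gamma}\) and the associated expectile level \(\alpha\) (the function names `ell` and `expectile_level` are ours; note that `np.sign(0) = 0` matches the convention \(\operatorname{Sgn}(0)=0\)).

```python
import numpy as np

def ell(y, gamma):
    """Asymmetric quadratic risk function (1.2); losses (y > 0) are weighted more."""
    return (1.0 + gamma * np.sign(y)) ** 2 * y ** 2 / 2.0

def expectile_level(gamma):
    """Expectile level alpha = (1+gamma)^2 / (2*(1+gamma^2)) linked to ell_gamma."""
    return (1.0 + gamma) ** 2 / (2.0 * (1.0 + gamma ** 2))

print(ell(np.array([-1.0, 1.0]), 0.5))   # profit vs loss of same size: 0.125 vs 1.125
print(expectile_level(0.5))              # 0.9; gamma = 0 recovers alpha = 1/2
```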

In this setting, our aim is to study the asymptotics of the minimum

$$ \min_{ \substack{(V, \vartheta) \in{\mathcal{A}}_{V,\vartheta} \\V_{T} = H_{T} } } \mathscr{E}_{N}(V,\vartheta) $$
(1.3)

as the number \(N\) of hedging dates becomes larger. To simplify, we take equidistant hedging times \({t_{n}}= n\varepsilon_{N}\) with time step \(\varepsilon_{N}= T/N\). The minimum (1.3) is computed over the set \({\mathcal{A}}_{V,\vartheta}\) of all adapted (to the underlying filtration \(({\mathcal{F}}_{t})_{t\geq0}\)) and appropriately integrable pairs \((V,\vartheta)\), under the replication constraint \(V_{T} = H_{T}\).

There are a few results in this direction. In [20], Pham deals with an \(L^{p}\)-risk function of the losses and a fixed number of trading dates. In [21], Pochart and Bouchaud consider the expected shortfall risk function; their study focuses on numerics for a fixed number of dates and does not provide any asymptotic analysis as \(N\to\infty\). In [1], Abergel and Millot study pseudo-optimal strategies and obtain asymptotic results under the condition that the risk function is of class \({\mathrm{C}}^{3}\); their analysis thus excludes the prototype risk function (1.2). Indeed, the discontinuity of the second derivative \(\ell_{\gamma}''\) complicates the analysis and fully changes the nature of the subsequent results. As a comparison, in [1, Sect. 4.1, “Complete markets case”], the limiting valuation/hedging rule does not depend on the risk function (provided that it is of class \({\mathrm{C}}^{3}\)), whereas in our setting, the limit strongly depends on \(\ell_{\gamma}\) (only piecewise \({\mathrm{C}}^{2}\)) through the parameter \(\gamma>0\). In short, the existing references consider settings and difficulties different from ours.

Exogenous reference valuation and \(f\)-PDE valuation

The minimisation problem (1.3) appears attractive, but its study in the asymptotic regime \(N\to\infty\) is tough in the case of the asymmetric risk function (1.2). To tackle this problem, we slightly change the approach. See Fig. 2 for an overview of our analysis.

Fig. 2 Diagram of the analysis of the problem

First, we suppose the hedging instruments are modelled by a stochastic differential equation (SDE) with drift \({\mu}\) and diffusion \({\sigma}\). We also consider contingent claims of the form \(H_{T} = h( X_{T})\). Second, we suppose that the contingent claim is evaluated exogenously by a valuation process \(V_{t} = { v}(t,{ X_{t}})\) for some function \({ v}\). For instance, \({ v}\) could be given by a mark-to-model value promoted by the regulator or the central counterparty (CCP). The latter imposes a minimum margin requirement which the hedging entity has to comply with. Given this exogenous reference valuation, the trader will determine how to hedge on each interval \([ {t_{n}},{t_{n+1}}]\) by choosing an adapted valuation/hedging rule \((\tilde{V}_{t_{n}},\tilde{\vartheta}_{t_{n}})\) and considering the related conditional local risk

$$ {\mathcal{R}}_{n}(\gamma):= \mathbb{E}[ \ell_{\gamma}( V_{t_{n+1}}- \tilde{V}_{t_{n}}- {\langle\tilde{\vartheta}_{t_{n}}, X_{t_{n+1}}-X_{t_{n}} \rangle} ) \vert{ \mathcal{F}}_{t_{n}}]. $$
(1.4)

In addition, the valuation/hedging rule of the trader will be parametrised by a possibly nonlinear function \(f\). Inspired by the connection between dynamic risk valuations, nonlinear partial differential equations (PDEs) and nonlinear backward SDEs developed by many authors such as El Karoui et al. [9], Peng [19], Cheridito et al. [6], Crépey [8, Chap. 4], Zhang [25, Chaps. 4, 5 and 12], we introduce the concept of an \(f\)-PDE valuation. Let

$$ {\sigma}: [0,T] \times{\mathbb{R}}^{d} \to{\mathbb{R}}^{d \times d},\qquad f: [0,T] \times{\mathbb{R}}^{d} \times{ \mathbb{R}}\times{\mathbb{R}}^{d} \times{\mathbb{R}}^{d \times d} \to{\mathbb{R}}$$

be continuous functions. Let \(\tau\in(0,T]\) be a time horizon and \({ v}(\tau,\cdot)\) a reference valuation at time \(\tau\). Given \(\tau\) and \({ v}(\tau,\cdot)\), the function \(u_{\tau}: [0,\tau]\times{\mathbb{R}}^{d} \to{\mathbb{R}}\) is a solution to the \(f\)-PDE if it satisfies

$$\begin{aligned} &{\partial_{t} u_{\tau}}(t,x)+ \frac{1}{2}\operatorname{Tr}[ {\sigma }{\sigma}^{\intercal}D^{2}_{x} u_{\tau}](t,x)+ f\big(t, x, u_{\tau}(t,x), D_{x} u_{\tau}(t,x), D^{2}_{x} u_{\tau}(t,x)\big) \\ &= 0 \end{aligned}$$
(1.5)

for all \((t,x)\in[0,\tau) \times{\mathbb{R}}^{d}\) with the terminal condition

$$ u_{\tau}(\tau,x)={ v}(\tau,x) $$

at time \(\tau\). The \(f\)-PDE valuation is the mapping from \((\tau, { v}(\tau,\cdot))\) to the solution \(u_{\tau}\) of the \(f\)-PDE (1.5). We refer to \(f\) as the kernel. The heuristic justification from a mathematical finance point of view works as follows. As previously mentioned, in [9, 6, 8, 25], many valuation and hedging problems can be recast in terms of (first- and second-order) nonlinear backward SDEs (with a nonlinearity \(f\) which accounts for imperfections, frictions, uncertainties, etc.), and in a Markovian setting, these BSDEs are tightly related to (first- and second-order) nonlinear PDEs of the form (1.5) via Feynman–Kac formulas. The kernel \(f\equiv0\) corresponds to the usual (frictionless) risk-neutral valuation [16, Chap. 2]; the case of pricing with uncertain volatility in dimension 1 (i.e., \(\sigma\in[\underline{\sigma}, \overline{\sigma}]\)) is related to the Black–Scholes–Barenblatt PDE, derived by Avellaneda et al. [2], of the form \({\partial_{t} u_{\tau}} +\frac{1}{2}\sup_{\underline{\sigma}\leq \sigma\leq \overline{\sigma}} (\sigma^{2}{\partial_{x}^{2} u_{\tau}})=0\), yielding a nonlinear \(f\) depending on the second derivative; other kernels appear for instance in [9] or [25, Chap. 12]. Moreover, in all these continuous-time representations, under mild conditions, the hedging portfolio is computed as the space derivative of the PDE solution along the asset path. Loosely speaking, the \(f\)-PDE is typically the nonlinear valuation/hedging rule of a trader who accounts for some frictions or uncertainties modelled by \(f\). In [19], Peng establishes a converse by showing that any coherent dynamic valuation must be given by a BSDE with some \(f\). All this justifies parametrising valuation/hedging rules through an \(f\)-PDE.

So far, \(f\) is arbitrary and therefore, we are in a position to potentially consider all valuation/hedging strategies with the most usual friction or uncertainty types. In the conditional local risk expression given by (1.4), consistently with the above heuristics, we then set

$$ \tilde{V}_{t_{n}}=u^{(n+1)}({t_{n}}, X_{t_{n}}),\qquad \tilde{\vartheta}_{t_{n}}=D_{x} u^{(n+1)}({t_{n}}, X_{t_{n}}), $$

where we denote \(u^{(n+1)}:=u_{t_{n+1}}\). Note that parametrising the \(f\)-PDE solutions by the maturity \(\tau\) allows the strategy on each interval \([{t_{n}},{t_{n+1}}]\) to target the reference valuation at time \({t_{n+1}}\).

Our contributions

Our first main result is to prove the existence (Theorem 2.6) of the following limit, called the asymptotic risk,

$$ \mathscr{R}_{\gamma}({ v}, f) = \lim_{ N\to\infty} \frac{ 1 }{ \varepsilon_{N}} \sum_{n=0}^{N-1} \mathbb{E}[ { \mathcal{R}}_{n}(\gamma)]. $$
(1.6)

Moreover, we give an explicit expression for \(\mathscr{R}_{\gamma}({ v}, f)\) depending on \(\gamma, { v}, f, {\sigma}, X\) and \(T\). Then we discuss the existence of an optimal kernel \(f^{*}\) such that the \(f^{*}\)-PDE valuation minimises the asymptotic risk in the sense that

$$ \mathscr{R}_{\gamma}({ v}, f^{*})\leq\mathscr{R}_{\gamma}({ v}, f) $$
(1.7)

for any admissible \(f\). In the one-dimensional case, this optimal kernel \(f^{*}\) is explicit (see (2.18)) and depends on the risk parameter \(\gamma\); its only input variables are the second derivative of the reference valuation and the volatility of the price process. This result is interesting in its own right, apart from the problem (1.3): when the reference price is exogenously given (as in the CCP case mentioned above), we obtain the optimal valuation/hedging kernel to use in order to minimise the risk measured in an asymmetric manner.

We can go further in order to propose a candidate for the solution to (1.3), according to the diagram of Fig. 2. This is a situation where the trader would like to use an endogenous reference valuation consistent with her \(f^{*}\)-PDE valuation/hedging rule. In other words, we choose the reference valuation as the solution to the \(f^{*}\)-PDE (1.5). Here, the payoff \(h: {\mathbb{R}}\to{\mathbb{R}}\) is the \(f^{*}\)-PDE terminal condition at time \(T\). We denote by \({ v}^{*}\) the resulting valuation. In dimension one, this PDE takes the form

$$\begin{aligned} &{\partial_{t} { v}^{*}}(t,x)+ \frac{1}{2}{\sigma}^{2}(t,x) {\partial_{x}^{2} { v}^{*}}(t,x) \\ &+ c_{1}^{*}{\sigma}^{2}(t,x)\big({\partial_{x}^{2} { v}^{*}}(t,x)\big)^{+}- c_{2}^{*}{ \sigma}^{2}(t,x)\big({\partial_{x}^{2} { v}^{*}}(t,x)\big)^{-}=0 \end{aligned}$$

for some constants \(c_{1}^{*}\geq0\) and \(c_{2}^{*}\leq0\) depending on the risk parameter \(\gamma\). When \({\sigma}\) is constant, observe that the above PDE coincides with the aforementioned Black–Scholes–Barenblatt equation from [2] (with adjusted \(\underline{{\sigma}}\) and \(\overline{{\sigma}}\) depending on \({\sigma}\) and \(\gamma\)). In higher dimensions, \({ v}^{*}\) solves a fully nonlinear PDE with a nonlinear term depending on the Hessian \(D^{2}_{x} { v}^{*}\) (see the nonlinear PDE (2.12)). All in all, this gives an endogenously consistent way to value the claim \(h\) while accounting for local hedging errors measured with the asymmetric risk function \(\ell_{\gamma}\); to the best of our knowledge, this is an original contribution.

Summing up, instead of minimising (1.3) and then taking the limit in \(N\) after rescaling by \(\varepsilon_{N}\), we first take the limit in \(N\) of the cumulated integrated local risk for a wide class of \(f\)-PDE valuations and then minimise over all kernels \(f\); see the diagram of Fig. 2. We do not prove that interchanging the minimisation and the limit is valid in this setting. In other words, we do not claim that the limit of the minimum (1.3) rescaled by \(\varepsilon_{N}\) corresponds to \(\mathscr{R}_{\gamma}({ v}^{*},f^{*})\) and that

$$\begin{aligned} \tilde{V}_{t_{n}}\approx{ v}^{*}({t_{n}}, X_{t_{n}}),\qquad \tilde{\vartheta}_{t_{n}}\approx D_{x} { v}^{*}({t_{n}}, X_{t_{n}}). \end{aligned}$$
(1.8)

However, our numerical tests in dimension 1 seem to corroborate this (see in particular Table 2 and Figs. 8, 9). Proving this rigorously remains an open problem, which we leave for future work.

The paper is structured as follows. Below, we present the notations and conventions used throughout the paper. In Sect. 2, we define the stochastic setting, then state the assumptions and the main results. The reader only interested in the main theoretical results (without their proofs) can stick to Sects. 1 and 2. The proofs are gathered in Sect. 3. In order to convey the main proof ideas behind Theorem 2.6, we have split Sect. 3 into several parts: we first give an overview of the main steps of the proof, and then separate the proof into independent sub-results (see Propositions 3.1, 3.2 and Lemma 3.3), which are combined to obtain Theorem 2.6. Further technical results are collected in the Appendix. Section 4 contains our numerical experiments.

Usual notations

Let \(d \in{\mathbb{N}}\) and \(a,b\) in \({\mathbb{R}}^{d}\). We denote by \({\langle a, b \rangle}=\sum_{i=1}^{d} a_{i} b_{i}\) the scalar product on \({\mathbb{R}}^{d}\), used for both row and column vectors \(a\) and \(b\). We set \({\left\vert a\right\vert} = \sqrt{ {\langle a, a \rangle} }\). We denote by \({\mathcal{M}}^{d}\) the set of all \(d\times d\) matrices with real entries. By \({\mathcal{S}}^{d}\), we denote the set of all symmetric matrices in \({\mathcal{M}}^{d}\). For \(A\in{\mathcal{M}}^{d}\), we denote by \(\operatorname{Tr}[A]\) and \(A^{\intercal}\), respectively, the trace and the transpose of \(A\), and set \({\left\vert A\right\vert} = \sqrt{\operatorname{Tr}[ A A^{\intercal} ]}\).

Let \(E, E'\) be two generic Euclidean spaces and \(\phi: [0,T] \times E \to E'\) a function. In this paper, we say that \(\phi\) satisfies a local regularity condition in time and space if for some real \(q>0\), the coefficient

$$ \|\phi\|_{\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}} :=\sup_{ \substack{t,t'\in[0,T]\\t\neq t'}} \sup_{ \substack{x,x'\in E\\x\neq x'}} \frac{ {\vert\phi(t,x)-\phi(t',x') \vert}}{(|t-t'|^{1/2} + {\left \vert x-x'\right\vert}) (1 + {\left\vert x\right\vert}^{q} + {\left\vert x'\right\vert}^{q})} $$

is finite; then \(\phi\) is said to be in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\). We are aware that \(\|\phi\|_{\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}}\) depends on \(q\), but in the following, the precise value of \(q\) is unimportant and we prefer to avoid the reference to \(q\) in the notation \(\|\phi\|_{\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}}\) for the sake of simplicity.

Observe that \(\phi\in\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\) means that \(\phi\) is locally 1/2-Hölder-continuous in time and Lipschitz-continuous in space and has polynomial growth in space uniformly in time. Furthermore, for any \(\phi_{1}\) and \(\phi_{2}\) in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\), the product \(\phi_{1}\phi_{2}\), the pair \((\phi_{1},\phi_{2})\) and the composition \(\phi_{1}(t,\phi_{2}(t,\cdot))\) are also in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\).

The set \({\mathrm{C}}^{1,2}([0,T] \times E; E')\) denotes the set of functions \(\phi: [0,T] \times E\to E'\) such that the partial derivatives \(\partial_{t} \phi, \partial_{x_{i}} \phi, \partial_{x_{i}} \partial_{x_{j}} \phi\) exist and are continuous for all indices \(i,j\). When \(E = {\mathbb{R}}^{d}\) and the codomain \(E'\) is unambiguous, we simply write \({\mathrm{C}}^{1,2}([0,T] \times{\mathbb{R}}^{d})\).

For every function \(\phi\in{\mathrm{C}}^{1,2}([0,T] \times{\mathbb{R}}^{d}; {\mathbb{R}}) \), we denote its gradient in space by the row vector \(D_{x} \phi= ({\partial_{x_{i}} \phi})_{1\le i\le d}\) and its Hessian by \(D^{2}_{x} \phi=({\partial_{x_{i}} {\partial_{x_{j}} \phi}})_{1\le i,j\le d}\). Also, let \({{\mathcal{L}}}_{t} \phi: [0,T] \times{\mathbb{R}}^{d}\to{ \mathbb{R}}\) be given by

$$ {{\mathcal{L}}}_{t} \phi(t,x)= {\partial_{t} \phi}(t,x)+ \frac{1}{2} \operatorname{Tr}[{\sigma}{\sigma}^{\intercal} D^{2}_{x} \phi ](t,x). $$

Notice that \(\phi, {\partial_{t} \phi}, D_{x} \phi, D^{2}_{x} \phi \in\text{H}^{1/2,1}_{ \mathrm{loc},\mathrm{ pol}}\) is a sufficient condition to have \(\phi\in{\mathrm{C}}^{1,2}\) and to be able to apply Itô’s formula.

2 Model, assumptions and main results

2.1 Probabilistic risk model

We fix a finite time horizon \(T > 0\). Let \(W = (W^{(1)}, \dots, W^{(d)}) : [0,T] \times\Omega\to{ \mathbb{R}}^{d}\) be a standard Brownian motion on a probability space \((\Omega, {\mathcal{F}}, \mathbb{P})\). Let \(\mathbb{F}= ({\mathcal{F}}_{t})_{t\in[0,T]}\) be the augmented and completed filtration generated by \(W\). We consider the \(\mathbb{F}\)-adapted process \(X= ( X^{(1)}, \dots, X^{(d)} ) : [0,T] \times\Omega\to{ \mathbb{R}}^{d}\) satisfying the stochastic differential equation (SDE)

$$ {\mathrm{d}}{ X_{t}}= {\mu}(t,{ X_{t}}){\mathrm{d}}t+ {\sigma}(t,{ X_{t}}){ \mathrm{d}}W_{t} $$
(2.1)

with initial value \(X_{0} = x_{0} \in{\mathbb{R}}^{d}\). The coefficients \({\mu}:[0,T] \times{\mathbb{R}}^{d}\to{\mathbb{R}}^{d}\) and \({\sigma}:[0,T] \times{\mathbb{R}}^{d}\to{\mathcal{M}}^{d}\) are Lipschitz in space, uniformly in time (see Assumption 2.1 later).

Given \(N\in{\mathbb{N}}\) equidistant hedging times \(t_{0} = 0 < t_{1}< \cdots< t_{N} = T\) on the interval \([0,T]\) with \({t_{n}}= n\varepsilon_{N}\) and \(\varepsilon_{N}= T/N\), we write

$$ \varphi_{t}^{N}:=\sup\{ {t_{n}}: {t_{n}}\le t \},\qquad \bar{\varphi}_{t}^{N}:=\inf\{ {t_{n}}: {t_{n}}> t \}, $$

and the increment of \(X\) from \({t_{n}}\) to \({t_{n+1}}\) is \(\Delta X_{n} : = X_{t_{n+1}}- X_{t_{n}}\).

In the following, we systematically consider the risk function \(\ell_{\gamma}\) defined in (1.2). It is a convex and continuously differentiable function satisfying \(\ell_{\gamma}(0) = \ell_{\gamma}'(0) = 0\) and \(\ell_{\gamma}(y) = \ell_{-\gamma}(-y)\). In addition, it is symmetric if and only if \(\gamma= 0\). Further, \(\ell_{\gamma}'\) is a piecewise continuously differentiable function with \(\ell_{\gamma}''\) being discontinuous as soon as \(\gamma\neq0\),

$$ \ell_{\gamma}'(y) = \big(1 +\gamma\operatorname{Sgn}(y)\big)^{2} y, \qquad\ell_{\gamma}''(y) = \big(1 + \gamma\operatorname{Sgn}(y) \big)^{2}, $$
(2.2)

where \(\ell_{\gamma}''\) is extended to 0 by setting \(\ell_{\gamma}''(0) = 1\), owing to \(\operatorname{Sgn}(0) = 0\) (see Fig. 3). In the sequel, we assume \(\gamma\in[0,1)\).

Fig. 3 Plot of the derivatives of the risk function \(\ell_{\gamma}\) for different \(\gamma\): \(\ell_{\gamma}'\) (left) and \(\ell_{\gamma}''\) (right)

For a payoff function \(h: {\mathbb{R}}^{d} \to{\mathbb{R}}\), the inputs of our approach are a reference valuation \({ v}: [0,T] \times{\mathbb{R}}^{d} \to{\mathbb{R}}\) such that \({ v}(T,\cdot) = h(\cdot)\) and a kernel

$$ f: [0,T] \times{\mathbb{R}}^{d} \times{\mathbb{R}}\times{ \mathbb{R}}^{d} \times{\mathcal{S}}^{d} \to{\mathbb{R}}. $$

Both are assumed to be smooth functions (see Assumptions 2.2 and 2.3). We consider the \(f\)-PDE valuation giving rise to the family of functions \(u_{t_{n+1}}:[0,{t_{n+1}}] \times{\mathbb{R}}^{d}\to{\mathbb{R}} \) indexed by the hedging times \({t_{n+1}}\). These functions are the solutions to the PDE (1.5) with the terminal condition \(u_{t_{n+1}}({t_{n+1}},\cdot) = { v}({t_{n+1}},\cdot)\) at time \({t_{n+1}}\). Also, they are assumed to be smooth in the sense of Assumption 2.4. In this context, we set \(u^{(n+1)}= u_{t_{n+1}}\) and define the local P&L \({\mathcal{E}}_{n}: \Omega\to{\mathbb{R}}\) (see (1.1)) by

$$ {\mathcal{E}}_{n}:= u^{(n+1)}({t_{n+1}}, X_{t_{n+1}})- u^{(n+1)}({t_{n}}, X_{t_{n}})- D_{x} u^{(n+1)}({t_{n}}, X_{t_{n}})\Delta X_{n} $$
(2.3)

and the conditional local risk by

$$ {\mathcal{R}}_{n}(\gamma):= \mathbb{E}[\ell_{\gamma}({\mathcal{E}}_{n}) \vert{\mathcal{F}}_{t_{n}}]. $$
(2.4)

As explained in the introduction, our aim is to analyse the asymptotic behaviour of the integrated conditional local risk after appropriate renormalisation, i.e.,

$$ \mathscr{R}_{N,\gamma}({ v},f) := \frac{1}{\varepsilon_{N}} \sum_{n= 0}^{N-1} \mathbb{E}[{\mathcal{R}}_{n}(\gamma)]. $$
(2.5)

2.2 Asymptotic risk \(\mathscr{R}_{\gamma}\) given a reference valuation \({ v}\) and a kernel \(f\)

Here, we study the asymptotic risk \(\mathscr{R}_{\gamma}\) (as defined in (1.6)) when a reference valuation \({ v}\) and a kernel \(f\) are given. We state the following assumptions.

Assumption 2.1

The coefficients \({\mu}\)\(: [0,T] \times{\mathbb{R}}^{d}\to{\mathbb{R}}^{d} \) and \({\sigma}\)\(: [0,T] \times{\mathbb{R}}^{d}\to{\mathcal{M}}^{d} \) are in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\).

Assumption 2.2

The reference valuation \({ v}\)\(: [0,T] \times{\mathbb{R}}^{d}\to{\mathbb{R}}\) is in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\). Further, \(D_{x} { v}\text{ and }D^{2}_{x} { v}\) exist and are in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\).

Assumption 2.3

The kernel \(f\)\(:[0,T] \times{\mathbb{R}}^{d}\times{\mathbb{R}}\times{ \mathbb{R}}^{d}\times{\mathcal{S}}^{d}\to{\mathbb{R}}\) is in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\).

Assumption 2.4

For all \(\tau\in(0,T]\), there is a unique classical solution \(u_{\tau}\) to the PDE (1.5) with the terminal condition \(u_{\tau}(\tau,\cdot) = { v}(\tau,\cdot)\) at time \(\tau\). In addition,

$$ {\partial_{t} u_{\tau}},{\partial_{x_{i}} u_{\tau}},{\partial_{x_{i}} {\partial_{x_{j}} u_{\tau}}}, {\partial_{t} {\partial_{x_{i}} u_{\tau}}},{\partial_{x_{i}} {\partial_{x_{j}} {\partial_{x_{k}} u_{\tau}}}} $$

exist and are in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\).

Assumption 2.5

The symmetric matrix \(( {\sigma}^{\intercal}(D^{2}_{x} { v}){\sigma})(t,{ X_{t}})\) is not zero, \({\mathrm{d}}t\otimes{\mathrm{d}}\mathbb{P}\)-a.e. (this is a non-degeneracy condition).

For stating the asymptotic result below, we need to introduce an extra Brownian motion \(B\), independent of \(W\) and with the same dimension as \(W\). Both are defined on a suitably extended probability space. Whenever necessary, the expectation with respect to the distribution of \(B\), or \(W\), or both, is denoted by \(\mathbb{E}^{B}\), or \(\mathbb{E}^{W}\), or \(\mathbb{E}^{W\otimes B}\).

Theorem 2.6

Let \(B = (B^{(1)}, \dots, B^{(d)} ) : [0,1] \times\Omega\to{ \mathbb{R}}^{d}\) be a standard Brownian motion independent from \(W\). Consider \(\mathscr{R}_{N,\gamma}({ v},f)\) given by (2.5) in the form

$$ \mathscr{R}_{N,\gamma}({ v},f) = \frac{1}{\varepsilon_{N}} \sum_{n= 0}^{N-1 } \mathbb{E}[ \ell_{\gamma}({\mathcal{E}}_{n}) ], $$

where \({\mathcal{E}}_{n}\) is given by (2.3). Under Assumptions 2.1–2.5, the limit of \(\mathscr{R}_{N,\gamma}({ v},f)\) as \(N\to\infty\) exists and is given by

$$\begin{aligned} \mathscr{R}_{\gamma}({ v},f) = \mathbb{E}\bigg[ \int_{0}^{T} \int_{0}^{1}& \ell_{\gamma}'' \bigg( \int_{0}^{\theta} B_{\theta'}^{ \intercal} G_{t} {\mathrm{d}}B_{\theta'} - F_{t} \theta\bigg) \\ &\times\bigg( F_{t}^{2} \theta- F_{t} \int_{0}^{\theta} B_{ \theta'}^{\intercal} G_{t}{\mathrm{d}}B_{\theta'} + {\left\vert G_{t} B_{\theta}\right\vert}^{2}/2\bigg) {\mathrm {d}}\theta{\mathrm{d}}t\bigg], \end{aligned}$$
(2.6)

where

$$\begin{aligned} F_{t} & = f\big( t,{ X_{t}}, { v}(t,{ X_{t}}), D_{x} { v}(t,{ X_{t}}), D^{2}_{x} { v}(t,{ X_{t}})\big) \in{\mathbb{R}}, \\ G_{t} & = \big( {\sigma}^{\intercal} (D^{2}_{x} { v}) {\sigma} \big)(t,{ X_{t}})\in{\mathcal{S}}^{d}. \end{aligned}$$

The long and delicate proof is postponed to Sect. 3.

If the risk function \(\ell\) is different from \(\ell_{\gamma}\) but has the same characteristics (i.e., \(\ell\) convex, \(\ell(0)=\ell'(0)=0\), \(\ell''(0-)=\ell_{\gamma}''(0-)\) and \(\ell''(0+)=\ell_{\gamma}''(0+)\)), we obtain the same limit (2.6) for the asymptotic risk, because only the left and right limits of the second derivative of the risk function at zero matter. The full derivation follows the same arguments as for \(\ell_{\gamma}\), and the details are left to the reader.

2.3 Optimising over the kernel \(f\)

Here, we study the optimisation problem over the kernel \(f\) described in (1.7). To specify the definition of the optimal kernel \(f^{*}\), we rewrite the asymptotic risk in (2.6) as a functional \(\mathscr{R}_{\gamma}: \Omega_{{ v}}\times\Omega_{f} \to{ \mathbb{R}}\) given by

$$ \mathscr{R}_{\gamma}({ v},f) = \mathbb{E}\bigg[ \int_{0}^{T} {R_{ \gamma}}\left( G_{t}, F_{t} \right) {\mathrm{d}}t\bigg], $$

where \({R_{\gamma}}:{\mathcal{S}}^{d}\times{\mathbb{R}}\to{ \mathbb{R}}\) is defined as

$$\begin{aligned} {R_{\gamma}}(S,a) = \mathbb{E}\bigg[ \int_{0}^{1} &\ell_{\gamma}'' \big( (B_{\theta}^{\intercal} S B_{\theta}- \operatorname{Tr}[S] \theta)/2 - a\theta\big) \\ &\times\big( a^{2}\theta- a (B_{\theta}^{\intercal} S B_{\theta}- \operatorname{Tr}[S] \theta)/2 + (B_{\theta}^{\intercal} S^{ \intercal} S B_{\theta})/2 \big) {\mathrm{d}}\theta\bigg], \end{aligned}$$
(2.7)

with

$$ \Omega_{{ v}} = \big\{ { v}\in\text{H}^{1/2,1}_{\mathrm {loc},\mathrm{ pol}}: D_{x} { v}, D^{2}_{x} { v}\in\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}} \big\} , \quad\Omega_{f} = \big\{ f: f\in\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\big\} . $$

To go from (2.6) to (2.7), we have applied Itô’s formula to simplify the stochastic integral with respect to \(B\). We prefer to keep the statement (2.6) with the stochastic integrals because this is the representation that comes up directly from the proof. We aim at proving the existence of minimisers to the variational problem

$$ \min_{f\in\Omega_{f}} \mathscr{R}_{\gamma}({ v},f) $$
(2.8)

for all \({ v}\in\Omega_{{ v}}\). Observe that the minimiser \(f^{\dagger}(t,x,y,z,A)\) defined (for any fixed \((t,x,y,z,A)\)) by

$$ f^{\dagger}(t,x,y,z,A)= \underset{a\in{\mathbb{R}}}{\operatorname{argmin}}{R_{\gamma}} \left( ( {\sigma}^{\intercal} A{\sigma})(t,x), a\right) $$

is also a minimiser to (2.8) provided it is in \(\Omega_{f}\). To see this, we just need to integrate over \(t\in[0,T]\) and take the expectation on both sides of

$$\begin{aligned} &{R_{\gamma}}\left( G(t,{ X_{t}}), f^{\dagger} \big( t,{ X_{t}}, { v}(t,{ X_{t}}), D_{x} { v}(t,{ X_{t}}), D^{2}_{x} { v}(t,{ X_{t}}) \big) \right) \\ &\le{R_{\gamma}}\big( G(t,{ X_{t}}), F(t,{ X_{t}})\big). \end{aligned}$$

This is why we seek a minimiser of \(a \mapsto{R_{\gamma}}(S,a)\) for a given symmetric matrix \(S\).
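As a numerical illustration of this objective, here is a Monte Carlo sketch of \(R_{\gamma}(S,a)\) based on (2.7); it uses the fact, also exploited in the proof of Proposition 2.8 below, that for fixed \(\theta\) the integrand only involves the marginal \(B_{\theta}\overset{(\mathrm{d})}{=}\sqrt{\theta}\,G\) with \(G\) a standard normal vector. The function name and discretisation choices are ours.

```python
import numpy as np

def R_gamma(S, a, gamma, n_mc=200_000, n_theta=50, seed=0):
    """Monte Carlo sketch of R_gamma(S, a) as in (2.7).

    For fixed theta, the integrand only involves B_theta, whose law is that of
    sqrt(theta)*G with G standard normal; we average over a midpoint theta-grid
    and Gaussian samples."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n_mc, S.shape[0]))
    quad = np.einsum("ni,ij,nj->n", G, S, G)          # G^T S G
    quad2 = np.einsum("ni,ij,nj->n", G, S.T @ S, G)   # |S G|^2 = G^T S^T S G
    val = 0.0
    for th in (np.arange(n_theta) + 0.5) / n_theta:
        # B^T S B - Tr[S]*theta has the law of theta*(quad - Tr[S])
        Z = th * (quad - np.trace(S)) / 2.0 - a * th
        Q = a**2 * th - a * th * (quad - np.trace(S)) / 2.0 + th * quad2 / 2.0
        val += np.mean((1.0 + gamma * np.sign(Z)) ** 2 * Q)
    return val / n_theta

S = np.array([[1.0, 0.3], [0.3, -0.5]])
print(R_gamma(S, 0.1, gamma=0.5))
```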

We now prove the existence of a minimiser.

Proposition 2.7

Let \(\gamma\in[0,1)\) and \(S \in{\mathcal{S}}^{d}\). Consider the minimisation problem

$$ \min_{a \in{\mathbb{R}}} {R_{\gamma}}(S, a). $$
(2.9)

Under the hypotheses of Theorem 2.6, there exists a global minimiser \(a^{*}\in{\mathbb{R}}\) such that \({R_{\gamma}}(S,a^{*}) \le{R_{\gamma}}(S, a)\) for all \(a \in{\mathbb{R}}\).

If \(a^{*}\) is unique, a natural candidate for \(f^{*}\) is then given by

$$ f^{*}(t,x,y,z,A) = a^{*}\big( {\sigma}^{\intercal}(t,x)A {\sigma}(t,x) \big), $$
(2.10)

for any \((t,x,y,z, A) \in[0,T] \times{\mathbb{R}}^{d} \times{\mathbb{R}} \times{\mathbb{R}}^{d} \times{\mathcal{S}}^{d}\).

Proof of Proposition 2.7

First, we show that the function \({R_{\gamma}}(S,a)\) is coercive and continuous in \(a\). For any \(\theta\in(0,1]\), we consider \(Z^{a}_{\theta}= (B_{\theta}^{\intercal} S B_{\theta}- \operatorname{Tr}[S]\theta)/2 - a \theta\). Through simple computations, we check that \(Z^{a}_{\theta}\) is continuous in \(a\) and integrable with respect to \({\mathrm{d}}\mathbb{P}^{B}\otimes{\mathrm{d}}\theta\).

Regarding the coercivity, we exhibit a coercive function which bounds \({R_{\gamma}}(S,a)\) from below. Owing to the boundedness of \(\ell_{\gamma}''\), we estimate \({\mathrm{d}}\mathbb{P}^{B}\otimes{\mathrm{d}}\theta\)-almost surely

$$\begin{aligned} \ell_{\gamma}'' (Z^{a}_{\theta}) ( -aZ^{a}_{\theta}+ {\left\vert SB_{\theta}\right\vert}^{2}/2 ) & \ge(1 - \gamma)^{2} (a^{2}\theta+ {\left\vert SB_{\theta}\right \vert}^{2}/2 ) \\ &\phantom{=:} - (1 + \gamma)^{2} |a| ( | B_{\theta}^{\intercal} S B_{\theta}| + | \operatorname{Tr}[S]|\theta)/2. \end{aligned}$$

By integrating in \(\theta\) and taking the expectation of this estimate, we get

$$ {R_{\gamma}}(S, a) \ge(1 - \gamma)^{2} ( a^{2}/2 + \operatorname{Tr}[S^{\intercal} S]/4 ) - (1 + \gamma)^{2} |a| ( \mathbb{E}[| G^{\intercal} S G| ] + |\operatorname{Tr}[S]| )/4, $$

where \(G\) is a standard normal random vector. Then we conclude that \(a \mapsto{R_{\gamma}}(S,a)\) is coercive.

Regarding the continuity, we first take \(S = {\mathbf{0}}\) and obtain

$$ {R_{\gamma}}({\mathbf{0}},a) = \big(1 + \gamma\operatorname{Sgn}(-a) \big)^{2} a^{2}/2. $$

Therefore, \(a\mapsto{R_{\gamma}}({\mathbf{0}},a) \) is a continuous and strictly convex function and hence has a unique global minimiser given by \(a^{*}= 0\). Now we take \(S \neq{\mathbf{0}}\) and decompose \({R_{\gamma}}(S,a)\) as

$$ {R_{\gamma}}(S, a) = - \mathbb{E}\bigg[ \int_{0}^{1} aZ^{a}_{\theta }\ell_{\gamma}'' ( Z^{a}_{\theta} ){\mathrm{d}}\theta\bigg] + \frac{1}{2}\mathbb{E}\bigg[ \int_{0}^{1} \ell_{\gamma}'' ( Z^{a}_{\theta}) {\left\vert SB_{\theta}\right\vert}^{2}{\mathrm {d}}\theta\bigg]. $$
(2.11)

By plugging in the expression of \(\ell_{\gamma}''\) (see (2.2)), we get

$$ a \mapsto a Z^{a}_{\theta}\ell_{\gamma}'' ( Z^{a}_{\theta} ) = (1 + \gamma^{2}) a Z^{a}_{\theta} + 2\gamma a|Z^{a}_{\theta}|, $$

which is continuous \({\mathrm{d}}\mathbb{P}^{B}\otimes{\mathrm {d}}\theta\)-almost surely and bounded by \((1 + \gamma)^{2} |a||Z^{a}_{\theta}|\) (integrable with respect to \({\mathrm{d}}\mathbb{P}^{B}\otimes{\mathrm {d}}\theta\) locally uniformly in \(a\)). By the dominated convergence theorem, we conclude that the first term of the decomposition in (2.11) is continuous in \(a\). Also, we estimate \(|\ell_{\gamma}'' ( Z^{a}_{\theta}) \, {\left\vert SB_{\theta}\right\vert}^{2} | \le(1 + \gamma)^{2} {\left\vert SB_{\theta}\right\vert}^{2} \), which is integrable uniformly in \(a\). Because \(B^{\intercal}_{\theta} S B_{\theta}\) has a density with respect to Lebesgue measure (see the proof of Proposition A.3 in the Appendix), we get that \(Z^{a}_{\theta} \neq0\) \({\mathrm{d}}\mathbb{P}^{B}\otimes{\mathrm{d}}\theta\)-almost surely. It holds that

$$ a \mapsto\ell_{\gamma}'' ( Z^{a}_{\theta}){\left\vert SB_{\theta}\right\vert}^{2} $$

is continuous \({\mathrm{d}}\mathbb{P}^{B}\otimes{\mathrm{d}}\theta \)-almost surely, due to the continuity of \(\ell_{\gamma}''\) on \({\mathbb{R}}^{*}\). Now, we conclude that the second term of the decomposition in (2.11) is also continuous in \(a\) by applying again the dominated convergence theorem. Therefore, we have proved that \({R_{\gamma}}(S,a)\) is continuous in \(a\).

Take \(\alpha\in{\mathbb{R}}\) large enough such that \(K = \{a:{R_{\gamma}}(S,a) \le\alpha\}\) is nonempty. Because of the continuity and coercivity of \({R_{\gamma}}(S,a)\), \(K\) is compact. Then by Weierstrass’ theorem, we conclude the announced result. □

Here, we have just shown the existence of a minimiser \(a^{*}\) to problem (2.9) for a given symmetric matrix \(S\). The regularity of \(a^{*}(S)\) has not been analysed, because uniqueness has not been proved. In fact, the uniqueness and smoothness of \(f^{*}\) for problem (2.8) is challenging in the general case. Of course, if \(a^{*}(S)\) is unique, then we can define \(f^{*}\) as in (2.10).

Then, a natural candidate for the endogenous valuation/hedging rule (as explained in the introduction) is given by the solution to the nonlinear \(f^{*}\)-PDE

$$\begin{aligned} \textstyle\begin{cases} {\partial_{t} { v}^{*}}(t,x)+ \frac{1}{2}\operatorname{Tr}[{\sigma }{\sigma}^{ \intercal} D^{2}_{x} { v}^{*} ](t,x)+a^{*}\big( {\sigma}^{\intercal }(t,x)D^{2}_{x} { v}^{*}(t,x){\sigma}(t,x)\big)=0, \\ { v}^{*}(T,x)=h(x). \end{cases}\displaystyle \end{aligned}$$
(2.12)

This PDE is fully nonlinear with a nonlinear term depending on the Hessian. Unfortunately, in full generality, we are not able to prove the existence/uniqueness of a solution \({ v}^{*}\) satisfying Assumption 2.2. Also proving that the new kernel \(f^{*}\) fulfils Assumption 2.3 is not straightforward. Fortunately, the one-dimensional case provides us with a quasi-explicit formulation for \(a^{*}\), which hopefully is a first step in the analysis of the PDE (2.12). Further investigation is left to future research.

Coming back to the initial problem (1.3), we conjecture that

$$\begin{aligned} \tilde{V}_{t_{n}}\approx{ v}^{*}({t_{n}}, X_{t_{n}}),\qquad \tilde{\vartheta}_{t_{n}}\approx D_{x} { v}^{*}({t_{n}}, X_{t_{n}}), \end{aligned}$$

which will be numerically tested in Sect. 4.

2.4 Quasi-explicit solution in the one-dimensional case

In this section, we present a quasi-explicit formulation of the optimal kernel \(f^{*}\) in the one-dimensional case. Here, \(( B_{\theta}^{\intercal} S B_{\theta}- \operatorname{Tr}[S]\theta)/2 \) becomes \((B_{\theta}^{2} - \theta) y/2\) for \(y = S \in{\mathbb{R}}\). So we can rewrite the function \({R_{\gamma}}(S,a )\) given by (2.7) as

$$ {R_{\gamma}}(y,a) = \mathbb{E}\biggl[ \int_{0}^{1} \ell_{\gamma}'' \big( y (B_{\theta}^{2} - \theta)/2 - a\theta\big) \big( a^{2} \theta- ay (B_{\theta}^{2} - \theta)/2 + y^{2}B^{2}_{\theta}/2 \big) {\mathrm{d}}\theta\biggr]. $$
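For numerical experimentation, here is a Monte Carlo sketch of this one-dimensional functional; it uses \(B_{\theta}\overset{(\mathrm{d})}{=}\sqrt{\theta}\,G\) and the relation \(\ell_{\gamma}''(\theta z)=\ell_{\gamma}''(z)\) for \(\theta>0\) (both used in the proof of Proposition 2.8 below), so that the \(\theta\)-integral contributes a factor \(1/2\). The grid search for the minimiser is a crude illustration of ours.

```python
import numpy as np

def R_1d(y, a, gamma, n_mc=200_000, seed=0):
    """Monte Carlo sketch of R_gamma(y, a) in dimension one, using
    B_theta ~ sqrt(theta)*G and ell''(theta*z) = ell''(z) for theta > 0,
    so the theta-integral reduces to a factor 1/2."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal(n_mc)
    Z = y * (G**2 - 1.0) / 2.0 - a
    Q = a**2 - a * y * (G**2 - 1.0) / 2.0 + y**2 * G**2 / 2.0
    return 0.5 * np.mean((1.0 + gamma * np.sign(Z)) ** 2 * Q)

# grid search for a*(y); by Proposition 2.8(a) below, a*(y) = c1* * y for y > 0
y, gamma = 1.0, 0.5
grid = np.linspace(-1.0, 1.0, 201)
a_star = grid[np.argmin([R_1d(y, a, gamma) for a in grid])]
print(a_star)
```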

Let \(a^{*}\in{\mathbb{R}}\) be a global minimiser of \(a \mapsto{R_{\gamma}}(y, a)\). In the following proposition, we sum up some interesting properties of \(a^{*}\). We denote by \(\Phi_{{\mathcal{N}}}\) the cumulative distribution function of the standard normal distribution and by \(\phi_{{\mathcal{N}}}= \Phi_{{\mathcal{N}}}'\) its density.

Proposition 2.8

Let \(\gamma\in[0,1)\).

(a) Let \(c_{1}^{*} \in{\mathbb{R}}\) and \(c_{2}^{*} \in{\mathbb{R}}\) be global minimisers of

$$ c \mapsto{R_{\gamma}}(1,c)\quad\textit{and}\quad c \mapsto{R_{ \gamma}}(- 1,c), $$

respectively. Then \(a^{*}(y) = c_{1}^{*}y\mathbf{I}_{\{y > 0\}} + c_{2}^{*}y \mathbf{I}_{\{y < 0\}}\) is a global minimiser of

$$ a \mapsto{R_{\gamma}}(y, a). $$

(b) The mappings

$$ c\mapsto{R_{\gamma}}(1,c)\quad\textit{and} \quad c\mapsto{R_{\gamma}}(- 1,c) $$

are strictly convex. Thus \(c_{1}^{*}\) and \(c_{2}^{*}\) are uniquely characterised by

$$ (1 + \gamma^{2} )c_{1}^{*} + \gamma T(c_{1}^{*}) = 0\quad\textit{and} \quad(1 + \gamma^{2} )c_{2}^{*} - \gamma T(c_{2}^{*}) = 0, $$

respectively, where

$$ T(c) = 2c\mathbf{I}_{\{2c + 1\le0\}} + \big(8c \Phi_{{\mathcal{N}}}( - \sqrt{2c + 1} ) - 4\phi_{{\mathcal{N}}}(\sqrt{2c + 1} )\sqrt{2c + 1} - 2c\big)\mathbf{I}_{\{2c + 1 > 0\}}. $$

Therefore, the minimiser \(a^{*}(y)\) is unique.

(c) For \(\gamma=0\), we have \(c_{1}^{*}=c_{2}^{*}=0\), and for \(\gamma\in(0,1)\), we have \(c_{1}^{*}>0>c_{2}^{*}\).

Proof

(a) First, for \(y = 0\), we get \({R_{\gamma}}(0,a) = (1 + \gamma\operatorname{Sgn}( - a))^{2}a^{2}/2 \). So \(a^{*}(0) = 0\). Now we consider the more interesting case \(y \neq0\). By setting \(c = a/y\), we rewrite \({R_{\gamma}}(y,a)\) as

$$\begin{aligned} &{R_{\gamma}}\left(y,c y\right) \\ & = \mathbb{E}\bigg[ \int_{0}^{1} \ell_{\gamma}''\big( (B_{\theta}^{2} - \theta)/2 - c\theta\big) \big( c^{2}\theta- c(B_{\theta}^{2} - \theta)/2 + B_{\theta}^{2}/2 \big) {\mathrm{d}}\theta\bigg] y^{2} \mathbf{I}_{\{y > 0\}} \\ & \phantom{=:}+ \mathbb{E}\bigg[ \int_{0}^{1} \ell_{\gamma}'' \big( - (B_{\theta}^{2} - \theta)/2 + c\theta\big) \big(c^{2}\theta- c( B_{\theta}^{2} - \theta)/2 + B_{\theta}^{2}/2\big) {\mathrm{d}}\theta\bigg]y^{2} \mathbf{I}_{\{y < 0\}}, \end{aligned}$$
(2.13)

because \(\ell_{\gamma}''(y\zeta) = \ell_{\gamma}''(\zeta)\) if \(y > 0\) and \(\ell_{\gamma}''(y\zeta) = \ell_{\gamma}''( - \zeta)\) if \(y < 0\), for any \(\zeta\in{\mathbb{R}}\).

Consider a global minimiser \(c^{*}(y)\) of \(c \mapsto{R_{\gamma}}(y, c y)\); then \(a^{*}(y) = c^{*}(y) y\) is also a global minimiser of \(a \mapsto{R_{\gamma}}(y, a)\). Because \((y,c)\mapsto{R_{\gamma}}\left(y, c y\right)\) is multiplicatively separable on \(\{y > 0\}\) and on \(\{y < 0\}\), we write \(c^{*}(y) = c_{1}^{*}\mathbf{I}_{\{y > 0\}} + c_{2}^{*}\mathbf{I}_{\{y < 0\}} \), where \(c_{1}^{*}\) and \(c_{2}^{*}\) are global minimisers of \(c \mapsto{R_{\gamma}}(1,c)\) and \(c \mapsto{R_{\gamma}}( - 1,c)\), respectively.

(b) Let \(G\) be a standard normal random variable. It will be useful later to know \(\mathbb{E}[G^{2}\mathbf{I}_{\{G < \alpha\}}]\) for any real \(\alpha\): we have

$$\begin{aligned} \mathbb{E}[G^{2}\mathbf{I}_{\{G < \alpha\}} ] & = - \alpha\phi_{{ \mathcal{N}}}\left(\alpha\right) + \Phi_{{\mathcal{N}}}\left( \alpha\right), \\ \mathbb{E}[G^{2}\mathbf{I}_{\{G > \alpha\}} ] & = \alpha\phi_{{ \mathcal{N}}}\left( - \alpha\right) + \Phi_{{\mathcal{N}}} \left( - \alpha\right), \\ \mathbb{E}[G^{2}\mathbf{I}_{\{ - \alpha< G < \alpha\}} ] & = - 2 \alpha\phi_{{\mathcal{N}}}\left( - \alpha\right) + \big(\Phi_{{ \mathcal{N}}}\left(\alpha\right) - \Phi_{{\mathcal{N}}}\left( - \alpha\right)\big). \end{aligned}$$
(2.14)

Now \(B_{\theta}\overset{({\mathrm{d}})}{=}\sqrt{\theta}G\) for all \(\theta\) in \([0,1]\). From (2.13), we get

$$ {R_{\gamma}}(1,c) = \frac{1 + \gamma^{2}}{2}T_{1}(c) + \gamma T_{2}(c), \quad{R_{\gamma}}(- 1,c) = \frac{1 + \gamma^{2}}{2}T_{1}(c) - \gamma T_{2}(c), $$
(2.15)

where

$$ \begin{aligned} T_{1}(c) & = \mathbb{E}[ c^{2} - c(G^{2} - 1)/2 + G^{2}/2 ] = c^{2} + 1/2, \\ T_{2}(c) & = \mathbb{E}\big[ \operatorname{Sgn}\big( (G^{2} - 1)/2 - c \big) \big( c^{2} - c(G^{2} - 1)/2 + G^{2}/2 \big) \big]. \end{aligned} $$

Considering \(\alpha(c) = \sqrt{2c + 1}\), we have

$$ \operatorname{Sgn}\big( (G^{2} - 1)/2 - c \big) = \mathbf{I}_{\{2c + 1 < 0\}} + \mathbf{I}_{\{2c + 1 > 0\}} ( \mathbf{I}_{\{G < -\alpha(c)\}} + \mathbf{I}_{\{G > \alpha(c)\}} - \mathbf{I}_{\{-\alpha(c) < G < \alpha(c)\}} ). $$

From the expectations in (2.14), we deduce

$$\begin{aligned} T_{2}(c) & = \mathbf{I}_{\{2c + 1 < 0\}} ( c^{2} + 1/2 ) \\ &\phantom{=:} +\mathbf{I}_{\{2c + 1 > 0\}} ( c^{2} + c/2 ) \mathbb{E}[ \mathbf{I}_{\{G < -\alpha(c)\}} + \mathbf{I}_{\{G > \alpha(c)\}} - \mathbf{I}_{\{ - \alpha(c) < G < \alpha(c)\}} ] \\ &\phantom{=:} + \mathbf{I}_{\{2c + 1 > 0\}} (1/2 - c/2 ) \\ &\phantom{=:+}\times\mathbb{E}[G^{2}\mathbf{I}_{\{G < -\alpha(c)\}} + G^{2} \mathbf{I}_{\{G > \alpha(c)\}} - G^{2}\mathbf{I}_{\{ - \alpha(c) < G < \alpha(c)\}} ] \\ & = \mathbf{I}_{\{2c + 1 < 0\}} (c^{2} + 1/2 ) + \mathbf{I}_{\{2c + 1 > 0\}} \beta(c), \end{aligned}$$

where

$$ \beta(c) = (c^{2} + 1/2 ) \Big(3 - 4\Phi_{{\mathcal{N}}}\big( \alpha(c)\big)\Big) + 2\left(1 - c\right)\alpha(c)\phi_{{ \mathcal{N}}}\big(\alpha(c)\big). $$

We easily check that \({R_{\gamma}}(1,c)\) and \({R_{\gamma}}( - 1,c)\) are \({\mathrm{C}}^{0}\) and piecewise \({\mathrm{C}}^{2}\). Let us compute their first derivatives for \(c < -1/2\) and \(c > -1/2\); these are

$$\begin{aligned} \partial_{c}{R_{\gamma}}(1,c) & = (1 + \gamma^{2} )c + \gamma \left(\mathbf{I}_{\{2c + 1 < 0\}} 2c + \mathbf{I}_{\{2c + 1 > 0\}} \beta'(c)\right) \\ & = \mathbf{I}_{\{2c + 1 < 0\}} \left(1 + \gamma\right)^{2}c + \big( (1 + \gamma^{2} )c + \gamma\beta'(c) \big)\mathbf{I}_{\{2c + 1 > 0\}}, \quad \end{aligned}$$
(2.16)
$$\begin{aligned} \partial_{c}{R_{\gamma}}(- 1,c) & = (1 + \gamma^{2} )c - \gamma \left(\mathbf{I}_{\{2c + 1 < 0\}} 2c + \mathbf{I}_{\{2c + 1 > 0\}} \beta'(c)\right) \\ & = \mathbf{I}_{\{2c + 1 < 0\}} (1 - \gamma)^{2}c + \big( (1 + \gamma^{2} )c - \gamma\beta'(c)\big)\mathbf{I}_{\{2c + 1 > 0\}}, \quad \end{aligned}$$
(2.17)

where

$$ \beta'(c) = 8c \Phi_{{\mathcal{N}}}( - \sqrt{2c + 1}) - 4\phi_{{ \mathcal{N}}}(\sqrt{2c + 1})\sqrt{2c + 1} - 2c. $$

Standard computations show that \(\partial_{c}{R_{\gamma}}( - 1,c) \) and \(\partial_{c}{R_{\gamma}}(1,c) \) are both continuous at \(c = -1/2\). Moreover, we see that \(\partial_{c}{R_{\gamma}}(1,c)\) and \(\partial_{c}{R_{\gamma}}( - 1,c)\) are strictly increasing in \(c\) under the condition that \(|\beta''(c)| \le2\) on \(\{2c + 1 > 0\}\). Indeed, we have

$$ \beta''(c) = 6 - 8 \Phi_{{\mathcal{N}}}(\sqrt{2c + 1} ) \in[ - 2,2], $$

due to \(\Phi_{{\mathcal{N}}}(\sqrt{2c + 1})\in[1/2,1]\) for all \(2c + 1 >0\). Because \({R_{\gamma}}(1,c)\), \({R_{\gamma}}( - 1,c)\) are both strictly convex, the optimal values \(c_{1}^{*}\) and \(c_{2}^{*}\) are unique and characterised respectively by \(\partial_{c}{R_{\gamma}}(1,c_{1}^{*}) = 0 \) and \(\partial_{c}{R_{\gamma}}( - 1,c_{2}^{*}) = 0\).

(c) The case \(\gamma=0\) is clear from (2.15). Now let \(\gamma\in(0,1)\). From the explicit representations (2.16) and (2.17), we directly get \(\partial_{c}{R_{\gamma}}(-1,0)=-\gamma\beta'(0)>0\) and \(\partial_{c}{R_{\gamma}}(1,0)=\gamma\beta'(0)=-4\gamma\phi_{{ \mathcal{N}}}(1) <0\). Therefore, since \({R_{\gamma}}(1,\cdot)\) is strictly convex and decreasing around 0, its minimum must be attained on the positive half-line, i.e., \(c_{1}^{*}>0\). Similarly, the minimum of \({R_{\gamma}}(-1,\cdot)\) must be attained on the negative half-line, i.e., \(c_{2}^{*}<0\). □

We depict the global minimiser \(a^{*}\) in Fig. 4. We report in Table 1 approximate values of \(c_{1}^{*}\) and \(c_{2}^{*}\), computed by a root-finding algorithm (see the sketch below).

Fig. 4 Global minimiser \(a^{*}(y)\)

Table 1 Optimal slopes \(c_{1}^{*}\) and \(c_{2}^{*}\)
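The characterisation in Proposition 2.8(b) translates directly into such a root-finding routine; here is a sketch using SciPy's `brentq` for \(\gamma\in(0,1)\) (the bracketing interval \([-10,10]\) is an ad hoc choice of ours, justified by the signs of the derivatives computed in the proof).

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

def T(c):
    """The function T(c) of Proposition 2.8(b)."""
    if 2.0 * c + 1.0 <= 0.0:
        return 2.0 * c
    s = np.sqrt(2.0 * c + 1.0)
    return 8.0 * c * norm.cdf(-s) - 4.0 * norm.pdf(s) * s - 2.0 * c

def optimal_slopes(gamma):
    """Unique roots c1* > 0 > c2* of (1+g^2)c + g*T(c) = 0 and (1+g^2)c - g*T(c) = 0,
    for gamma in (0, 1); existence and uniqueness is Proposition 2.8(b)-(c)."""
    g2 = 1.0 + gamma ** 2
    c1 = brentq(lambda c: g2 * c + gamma * T(c), 0.0, 10.0)
    c2 = brentq(lambda c: g2 * c - gamma * T(c), -10.0, 0.0)
    return c1, c2

print(optimal_slopes(0.5))
```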

In the spirit of (2.10), we set

$$ f^{*}(t,x, y,z, A) = f^{*}_{\gamma} \big({\sigma}^{2}(t,x)A\big) $$

with \(f^{*}_{\gamma}\) denoting the optimal kernel in dimension 1, given by

$$ f^{*}_{\gamma}(y) :=a^{*}(y) = c_{1}^{*} y \mathbf{I}_{\{y > 0\}} + c_{2}^{*} y \mathbf{I}_{\{y < 0\}}. $$
(2.18)
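To give an idea of how the resulting \({ v}^{*}\) can be computed, here is a minimal explicit finite-difference sketch of the one-dimensional \(f^{*}\)-PDE (the display preceding (2.12)), assuming a constant \({\sigma}\), Dirichlet-type boundary handling, and a time step small enough for stability (roughly \(\Delta t\lesssim\Delta x^{2}/({\sigma}^{2}(1+2c_{1}^{*}))\)). This is an illustration of ours, not the scheme used in Sect. 4, and the slopes in the example are placeholder values.

```python
import numpy as np

def f_star(y, c1, c2):
    """Optimal one-dimensional kernel (2.18): c1*y for y > 0, c2*y for y < 0."""
    return np.where(y > 0.0, c1 * y, c2 * y)

def solve_fstar_pde(h, sigma, c1, c2, x_min, x_max, T, n_x=400, n_t=4000):
    """Explicit backward scheme for
         d_t v + 0.5*sigma^2 * v_xx + f*(sigma^2 * v_xx) = 0,  v(T, x) = h(x),
    with constant sigma; boundary nodes keep their terminal values
    (second derivative set to zero there)."""
    x = np.linspace(x_min, x_max, n_x)
    dx, dt = x[1] - x[0], T / n_t
    v = h(x)
    for _ in range(n_t):
        vxx = np.zeros_like(v)
        vxx[1:-1] = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / dx ** 2
        y = sigma ** 2 * vxx
        v = v + dt * (0.5 * y + f_star(y, c1, c2))   # march from t to t - dt
    return x, v

# example: call payoff; c1, c2 are illustrative placeholders for c1*, c2*
x, v = solve_fstar_pde(lambda s: np.maximum(s - 1.0, 0.0), sigma=0.2,
                       c1=0.2, c2=-0.1, x_min=0.0, x_max=3.0, T=1.0)
```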

3 Proof of Theorem 2.6

The proof is long and technical. For this reason, we split it into different stages.

First, we study the conditional local risk \({\mathcal{R}}_{n}(\gamma)\) on the interval \([{t_{n}},{t_{n+1}}]\) by using a time-space rescaling argument (see Sect. 3.1). This rescaling turns out to be essential to pass to the limit later.

Second, we derive an explicit approximation of the conditional local risk \({\mathcal{R}}_{n}(\gamma)\) (see Sect. 3.2).

Finally, we prove that the remainder terms converge almost surely to 0. For this, we show that the Greeks of \(u_{\tau}(t,\cdot)\) converge to those of \({ v}(\tau,\cdot)\) as \(t\uparrow\tau\) (see Sect. 3.3). Also, to get the limit in (2.6), we need to pass to the limit in \(\ell''_{\gamma}\) along some random sequences; this is possible since their limiting point equals 0 (the discontinuity point of \(\ell''_{\gamma}\)) on a set of measure zero (defined later in (3.16)).

In the proof, we use several constants \(C_{n,N}(\xi)\) depending polynomially on the space variable \(\xi\) (uniformly in the interval \([{t_{n}},{t_{n+1}}]\) and in the number of time steps). To simplify, we write \(C_{n,N}(\xi) \in C_{\mathrm{pol}}\) if for some real \(q > 0\),

$$ \sup_{N\in{\mathbb{N}}} \sup_{ 0\le n\le N- 1} \sup_{\xi\in{ \mathbb{R}}^{d}} \frac{| C_{n,N}(\xi)|}{1 + |\xi|^{q}} < + \infty. $$

This upper bound depends on the polynomial bounds on the functions \({\mu}, {\sigma}, f, { v}\) and \(u\).

3.1 Preliminary time-space rescaling and conditioning

We start with a few observations.

– Thanks to the Markov property of the SDE and in view of our smoothness assumptions, \({\mathcal{R}}_{n}(\gamma)\) is a continuous function of \({t_{n}}\) and \(X_{t_{n}}\) only (see (2.4)).

– \({\mathcal{R}}_{n}(\gamma)\) goes to zero at rate \(\varepsilon_{N}^{2}\): we prove that the remainder is a second-order stochastic Taylor expansion (i.e., of order \(\varepsilon_{N}\)) and that it appears inside \(\ell_{\gamma}\), which is positively homogeneous of degree 2. After rescaling by \(\varepsilon_{N}\), we therefore expect a non-zero limit for the aggregated value of \({\mathcal{R}}_{n}(\gamma)\) (see (2.5)).

– Note that \(\ell_{\gamma}''\) has a jump discontinuity at zero (see (2.2)). To decompose the conditional local risk, we thus need to apply a stronger version of Itô’s formula, known as the Itô–Tanaka formula.

In view of a Taylor–Itô expansion, we consider the process \(X^{ \varepsilon_{N}} = ( X^{\varepsilon_{N}}_{\theta})_{\theta\in[0,1]}\) satisfying

$$\begin{aligned} {\mathrm{d}} X^{\varepsilon_{N}}_{\theta}&= \varepsilon_{N}{\mu}({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) {\mathrm {d}}\theta+ \varepsilon_{N}^{1/2}{\sigma}({t_{n}}+ \theta\varepsilon_{N}, X^{ \varepsilon_{N}}_{\theta}) {\mathrm{d}}B_{\theta}, \\ X^{\varepsilon_{N}}_{0} &= \xi\in{\mathbb{R}}^{d}, \end{aligned}$$
(3.1)

where \(B\) is an extra Brownian motion independent from \(W\). This is a time-space rescaling of the original process starting from \(\xi\) at \({t_{n}}\).
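To make the scaling concrete, here is a one-dimensional Euler sketch of (3.1) (the scheme, names and parameters are ours, for illustration); by the identification below, the simulated path approximates in law \(( X_{{t_{n}}+ \theta\varepsilon_{N}}^{{t_{n}},\xi})_{\theta\in[0,1]}\).

```python
import numpy as np

def euler_rescaled_sde(mu, sigma, t_n, eps, xi, n_steps=200, seed=0):
    """Euler scheme for the rescaled SDE (3.1) on theta in [0, 1] (d = 1):
       dX_theta = eps * mu(t_n + theta*eps, X_theta) dtheta
                + sqrt(eps) * sigma(t_n + theta*eps, X_theta) dB_theta,
    started from X_0 = xi; mu and sigma are callables (t, x) -> float."""
    rng = np.random.default_rng(seed)
    dtheta = 1.0 / n_steps
    path = np.empty(n_steps + 1)
    path[0] = xi
    for k in range(n_steps):
        t = t_n + k * dtheta * eps
        dB = np.sqrt(dtheta) * rng.standard_normal()
        path[k + 1] = (path[k] + eps * mu(t, path[k]) * dtheta
                       + np.sqrt(eps) * sigma(t, path[k]) * dB)
    return path

# example: mu(t,x) = 0.1*x, sigma(t,x) = 0.2*x, one hedging interval of length 0.01
path = euler_rescaled_sde(lambda t, x: 0.1 * x, lambda t, x: 0.2 * x,
                          t_n=0.5, eps=0.01, xi=1.0)
```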

Denoting by \(X^{t,\xi}\) the SDE solution starting from \(\xi\) at time \(t\), we notice that the processes \(( X_{{t_{n}}+ \theta\varepsilon_{N}}^{{t_{n}},\xi})_{\theta \in[0,1]} \) and \(( X^{\varepsilon_{N}}_{\theta})_{\theta\in[0,1]} \) have the same distribution: both satisfy the same SDE, driven by Brownian motions that are independent from \({\mathcal{F}}_{{t_{n}}}\). Thus we can rewrite \({\mathcal{R}}_{n}(\gamma)\) (see (2.4)) as a continuous function of \({t_{n}}\) and \(X_{t_{n}}\), expressed through \(X^{\varepsilon_{N}}\). Setting

$$\begin{aligned} & T^{\varepsilon_{N}}({t_{n}},\xi) \\ & =\varepsilon_{N}^{ - 2} \mathbb{E}^{B}\big[ \ell_{\gamma}\big( u^{(n+1)}({t_{n+1}}, X^{\varepsilon_{N}}_{1}) - u^{(n+1)}({t_{n}},\xi)- D_{x} u^{(n+1)}({t_{n}}, \xi)( X^{\varepsilon_{N}}_{1} - \xi) \big) \big] \end{aligned}$$

leads to

$$ {\mathcal{R}}_{n}(\gamma)= \varepsilon_{N}^{2} T^{\varepsilon_{N}}({t_{n}}, X_{t_{n}}). $$
(3.2)

3.2 Stochastic expansion of the conditional local risk at time \(t_{n}\)

Proposition 3.1

Using the notations and assumptions of Theorem 2.6, define the functions \(F^{(n+1)}:[0,{t_{n+1}}] \times{\mathbb{R}}^{d} \to{\mathbb{R}} \) and \(G^{(n+1)}: [0,{t_{n+1}}] \times{\mathbb{R}}^{d} \to{ \mathcal{S}}^{d}\) by

$$\begin{aligned} F^{(n+1)}(t,\cdot) & = f\big(t,\cdot, u^{(n+1)}(t,\cdot), D_{x} u^{(n+1)} (t,\cdot), D^{2}_{x} u^{(n+1)} (t,\cdot)\big), \\ G^{(n+1)}(t,\cdot) & = \big({\sigma}^{\intercal}(D^{2}_{x} u^{(n+1)}){ \sigma}\big) (t,\cdot). \end{aligned}$$
(3.3)

For any \({t_{n}}\) and \(\xi\in{\mathbb{R}}^{d}\), let \(X^{\varepsilon_{N}}: [0,1] \times\Omega\to{\mathbb{R}}^{d}\) be the strong solution to the SDE (3.1) with \(X^{\varepsilon_{N}}_{0} = \xi\), and let \({\mathcal{E}}^{\varepsilon_{N}}: [0,1] \times\Omega\to{ \mathbb{R}}\) be the stochastic process defined by

$$ {\mathcal{E}}^{\varepsilon_{N}}_{\theta}= u^{(n+1)}({t_{n}}+ \theta \varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) - u^{(n+1)}({t_{n}}, \xi)- {\langle D_{x} u^{(n+1)}({t_{n}},\xi), X^{\varepsilon_{N}}_{\theta}- \xi \rangle} $$
(3.4)

so that

$$ T^{\varepsilon_{N}}({t_{n}},\xi)= \varepsilon_{N}^{ - 2} \mathbb{E}^{B}[ \ell_{\gamma}({\mathcal{E}}^{\varepsilon_{N}}_{1}) ]. $$
(3.5)

Then we have the local risk decomposition

$$ \begin{aligned} T^{\varepsilon_{N}}({t_{n}},\xi)= \mathbb{E}^{B} \bigg[ \int_{0}^{1} &\ell_{\gamma}'' \Big( E_{\theta}\big( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)\big) + R^{\varepsilon_{N}}_{\theta}({t_{n}}, \xi)\Big) \\ &\times Q_{\theta}\big( G^{(n+1)}({t_{n}},\xi), F^{(n+1)}({t_{n}}, \xi)\big){\mathrm{d}}\theta\bigg] + C_{n,N}(\xi) \varepsilon_{N}^{1/2}, \end{aligned} $$

where

$$\begin{aligned} E_{\theta}(S,y) & = \int_{0}^{\theta} B_{\theta'}^{\intercal} S { \mathrm{d}}B_{\theta'} - y \theta , \end{aligned}$$
(3.6)
$$\begin{aligned} Q_{\theta}(S,y) &= y^{2}\theta- y\int_{0}^{\theta} B_{\theta'} ^{ \intercal} S {\mathrm{d}}B_{\theta'} +{\left\vert S B_{\theta} \right\vert}^{2}/2 , \end{aligned}$$
(3.7)
$$\begin{aligned} R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)& = {\mathcal{E}}^{ \varepsilon_{N}}_{\theta}/\varepsilon_{N}- E_{\theta}\big( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)\big), \end{aligned}$$
(3.8)

for some constant \(C_{n,N}(\xi)\in C_{\mathrm{pol}}\).

The proof of Proposition 3.1 is delicate. We postpone it to Sect. 3.5. In order to perform a second-order stochastic expansion, we need \(u^{(n+1)}\) and \(D_{x} u^{(n+1)}\) to be in \({\mathrm{C}}^{1,2}\) to apply Itô’s formula. Additionally, we require \({\sigma}\), \(D_{x} u^{(n+1)}\), \(D^{2}_{x} u^{(n+1)}\), \({\partial_{t} D_{x} u^{(n+1)}}\) and \(D^{2}_{x} D_{x} u^{(n+1)}\) to have polynomial growth to obtain proper integrability along the computations. Finally, we ask for \({\sigma}\) and \(D^{2}_{x} u^{(n+1)}\) to be in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\), which is useful in the stochastic expansion of the gradient \(D_{x} u^{(n+1)}\). All the above conditions are satisfied thanks to our assumptions.

3.3 Approximation of sensitivities in small time

First, notice that the above expansion of \(T^{\varepsilon_{N}}({t_{n}},\xi)\) depends on \(u^{(n+1)}\), the solution of the PDE (1.5) on the subinterval \([{t_{n}},{t_{n+1}}]\), whose length goes to 0. Therefore, by invoking a small-time approximation argument, we replace \(u^{(n+1)}\) and its first and second derivatives by the terminal value \({ v}({t_{n+1}},\cdot)\) and its first and second derivatives. Notice that the reference valuation \({ v}\) is independent of \(\varepsilon_{N}\). This is the content of the following statement, proved in Appendix A.1.

Proposition 3.2

With the notations and assumptions of Theorem 2.6, there exists some constant \(C_{n,N}(\xi)\in C_{\mathrm{pol}}\) such that

$$\begin{aligned} | u^{(n+1)}({t_{n}},\xi)- { v}({t_{n+1}},\xi)| &\le C_{n,N}(\xi) \varepsilon_{N}^{1/2}, \end{aligned}$$
(3.9)
$$\begin{aligned} {\vert D_{x} u^{(n+1)}({t_{n}},\xi)- D_{x} { v}({t_{n+1}},\xi) \vert} &\le C_{n,N}( \xi) \varepsilon_{N}^{1/2}, \end{aligned}$$
(3.10)
$$\begin{aligned} { \vert D^{2}_{x} u^{(n+1)}({t_{n}},\xi)- D^{2}_{x} { v}({t_{n+1}},\xi) \vert} &\le C_{n,N}( \xi) \varepsilon_{N}^{1/2}. \end{aligned}$$
(3.11)

3.4 Aggregation of local risk and passage to the limit

We set

$$\begin{aligned} F(t,\xi)& = f\big(t,\xi,{ v}(t,\xi),D_{x} { v}(t,\xi),( D^{2}_{x} { v})(t,\xi)\big) \in{\mathbb{R}}, \\ G(t,\xi)& = \big({\sigma}^{\intercal}(D^{2}_{x} { v}){\sigma} \big)(t,\xi)\in{\mathcal{S}}^{d}. \end{aligned}$$
(3.12)

Replacing \(\xi\) by \(X_{t_{n}}\) in the expansion of \(T^{\varepsilon_{N}}({t_{n}},\xi)\) in Proposition 3.1 leads to

$$\begin{aligned} T^{\varepsilon_{N}}({t_{n}}, X_{t_{n}})& = \mathbb{E}^{B}\bigg[ \int_{0}^{1} \ell_{\gamma}'' \Big(E_{\theta}\big( G^{(n+1)}({t_{n}}, X_{t_{n}}), F^{(n+1)}({t_{n}}, X_{t_{n}})\big) + R^{\varepsilon_{N}}_{\theta}({t_{n}}, X_{t_{n}})\Big) \\ &\quad\ \qquad\ \quad \times Q_{\theta}\big( G^{(n+1)}({t_{n}}, X_{t_{n}}), F^{(n+1)}({t_{n}}, X_{t_{n}})\big){\mathrm{d}}\theta\bigg] \\ &\phantom{=:}+ C_{n,N}( X_{t_{n}})\varepsilon_{N}^{1/2}, \end{aligned}$$

where \(C_{n,N}( X_{t_{n}})\in C_{\mathrm{pol}}\). By replacing \(u^{(n+1)}({t_{n}},\cdot)\) by its terminal value \({ v}({t_{n+1}},\cdot)\) in \(F^{(n+1)}({t_{n}},\cdot)\) and \(G^{(n+1)}({t_{n}},\cdot)\) (see (3.3)), we get \(F({t_{n+1}},\cdot)\) and \(G({t_{n+1}},\cdot)\) (see (3.12)). Hence,

$$\begin{aligned} T^{\varepsilon_{N}}({t_{n}}, X_{t_{n}})& = \mathbb{E}^{B}\bigg[ \int_{0}^{1} \ell_{\gamma}'' \Big(E_{\theta}\left( G({t_{n+1}}, X_{t_{n}}), F({t_{n+1}}, X_{t_{n}})\right) + \bar{R}^{\varepsilon_{N}}_{\theta}({t_{n}}, X_{t_{n}})\Big) \\ &\qquad\qquad\quad\times Q_{\theta}\big( G({t_{n+1}}, X_{t_{n}}), F({t_{n+1}}, X_{t_{n}}) \big) {\mathrm{d}}\theta\bigg] \\ &\phantom{=:}+ \bar{C}^{\varepsilon_{N}}({t_{n}}, X_{t_{n}})+ C_{n,N}( X_{t_{n}}) \varepsilon_{N}^{1/2}, \end{aligned}$$

where

$$\begin{aligned} \bar{R}^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)&:=E_{\theta}\big( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)\big) - E_{\theta}\big( G({t_{n+1}},\xi), F({t_{n+1}},\xi)\big) \\ &\phantom{=::}+ R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi), \end{aligned}$$
(3.13)
$$\begin{aligned} \bar{C}^{\varepsilon_{N}}({t_{n}},\xi)&:=\mathbb{E}^{B}\bigg[ \int_{0}^{1} \ell_{\gamma}'' \Big( E_{\theta}\big( G({t_{n+1}},\xi), F({t_{n+1}}, \xi)\big) + \bar{R}^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)\Big) \\ &\phantom{:}\qquad\quad\quad\ \ \times\Big(Q_{\theta}\big( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)\big) \\ &\phantom{:}\qquad\quad\quad\quad\ \ \ \ - Q_{\theta}\big( G({t_{n+1}},\xi), F({t_{n+1}}, \xi)\big)\Big){\mathrm{d}}\theta\bigg]. \end{aligned}$$
(3.14)

In the sequel, we require estimates of \(\bar{R}^{\varepsilon_{N}}_{\theta}({t_{n}}, X_{t_{n}})\) and \(\bar{C}^{\varepsilon_{N}}({t_{n}}, X_{t_{n}})\), summarised in the following lemma, proved later in Sect. 3.6.

Lemma 3.3

Under the assumptions of Theorem 2.6, for any \(p \geq1\), there exists a constant \(K_{p}\) such that

(a)

$$\begin{aligned} \mathbb{E}\Big[ \sup_{0 \leq n\leq N- 1} \sup_{\theta\in [0,1]}&\big|E_{\theta}\big( G^{(n+1)}({t_{n}}, X_{t_{n}}), F^{(n+1)}({t_{n}}, X_{t_{n}}) \big) \\ &- E_{\theta}\big( G({t_{n+1}}, X_{t_{n}}), F({t_{n+1}}, X_{t_{n}}) \big)\big|^{p}\Big] \le K_{p} \varepsilon_{N}^{p/2}; \end{aligned}$$

(b) \(\sup_{0 \leq n\leq N- 1}\mathbb{E}[ | \bar{C}^{\varepsilon_{N}}({t_{n}}, X_{t_{n}})| ]\le K_{1}\varepsilon_{N}^{1/2}\);

(c) \(\sup_{0 \leq n\leq N- 1} \sup_{\theta\in[0,1]} | \bar{R}^{ \varepsilon_{N}}_{\theta}({t_{n}}, X_{t_{n}})| \longrightarrow0\) as \(N\to\infty\), \({\mathrm{d}}\mathbb{P}^{W}\otimes{\mathrm{d}}\mathbb{P}^{B}\)-a.s.

Proof of Theorem 2.6

We have \(\varepsilon_{N}^{ - 1}{\mathcal{R}}_{n}(\gamma)= T^{\varepsilon _{N}}({t_{n}}, X_{t_{n}})\varepsilon_{N}\) from the definition of \(T^{\varepsilon_{N}}\) in (3.2). By summing over \(0 \le n\le N- 1\), we obtain

$$\begin{aligned} & \varepsilon_{N}^{ - 1} \mathbb{E}\bigg[ \sum_{n = 0}^{N - 1}{ \mathcal{R}}_{n}(\gamma)\bigg] \\ &= \mathbb{E}\bigg[ \sum_{n= 0}^{N- 1} T^{\varepsilon_{N}}({t_{n}}, X_{t_{n}})\varepsilon_{N}\bigg] = \mathbb{E}\bigg[ \int_{0}^{T} T^{ \varepsilon_{N}}( \varphi_{t}^{N}, X_{ \varphi_{t}^{N}} ){\mathrm{d}}t \bigg] \\ & =\mathbb{E}^{W\otimes B}\bigg[ \int_{0}^{T} \int_{0}^{1} \ell _{\gamma}'' \Big( E_{\theta}\big( G( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}), F( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}) \big) + \bar{R}^{\varepsilon_{N}}_{\theta}( \varphi_{t}^{N},X_{ \varphi_{t}^{N}}) \Big) \\ & \qquad\qquad\qquad\qquad\times Q_{\theta}\big( G( \bar{\varphi}_{t}^{N}, X_{ \varphi_{t}^{N}}), F( \bar{\varphi}_{t}^{N}, X_{ \varphi_{t}^{N}}) \big){\mathrm{d}}\theta\ {\mathrm{d}}t\bigg] \\ & \phantom{=:}+ \sum_{n = 0}^{N - 1} \mathbb{E}\big[ \bar{C}^{\varepsilon_{N}}({t_{n}}, X_{t_{n}})\varepsilon_{N}+ C_{n,N}( X_{t_{n}})\varepsilon_{N}^{3/2} \big]. \end{aligned}$$
(3.15)

The last sum goes to 0 as \(N\to\infty\), due to Lemma 3.3 and Proposition 3.1. It remains to determine the limit of the first term in (3.15). We achieve this result by applying the dominated convergence theorem, as follows.

1) Because \({\sigma},{ v},D_{x} { v},D^{2}_{x} { v},f\in\text{H}^{1/2,1}_{ \mathrm{loc},\mathrm{ pol}}\) (hence continuous in time and space) and \(X\) has continuous paths, we get \({\mathrm{d}}\mathbb{P}^{W}\)-a.s. for any \(t\) that

$$\begin{aligned} & \big({\sigma}^{\intercal}(D^{2}_{x} { v}){\sigma}\big) ( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}) \underset{N\to\infty}{\longrightarrow}\big({\sigma}^{\intercal}( D^{2}_{x} { v}){\sigma}\big) (t,X_{t}), \\ & f\big( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}},{ v}( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}),D_{x} { v}( \bar {\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}), D^{2}_{x} { v}( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}})\big) \\ & \underset{N\to\infty}{\longrightarrow}f\big(t,X_{t},{ v}(t,X_{t}), D_{x} { v}(t,X_{t}),D^{2}_{x} { v}(t,X_{t})\big). \end{aligned}$$

Hence we have \({\mathrm{d}}\mathbb{P}^{W}\otimes{\mathrm{d}}\mathbb{P}^{B}\)-a.s. for any \(\theta,t\) that

$$\begin{aligned} E_{\theta}\big( G( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}), F( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}})\big) & \underset{N\to\infty}{\longrightarrow}E_{\theta}\big( G(t,X_{t}), F(t,X_{t}) \big), \\ Q_{\theta}\big( G( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}), F( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}})\big) & \underset{N\to\infty}{\longrightarrow}Q_{\theta}\big( G(t,X_{t}), F(t,X_{t}) \big), \end{aligned}$$

because \(E_{\theta}\) and \(Q_{\theta}\) (see (3.6) and (3.7)) are continuous in \(S\) and \(y\), \({\mathrm{d}}\mathbb{P}^{B}\otimes{\mathrm{d}}\theta\)-a.s. Also, from part (c) of Lemma 3.3, we have

$$ \sup_{0\leq n\leq N- 1} \sup_{\theta\in[0,1]} | \bar{R}^{ \varepsilon_{N}}_{\theta}({t_{n}}, X_{t_{n}})| \underset{N\to\infty}{\longrightarrow}0, $$

\({\mathrm{d}}\mathbb{P}^{W}\otimes{\mathrm{d}}\mathbb {P}^{B}\)-almost surely.

2) Since the second derivative \(\ell_{\gamma}''\) is discontinuous at 0 and the set

$$\begin{aligned} {\mathcal{A}}:=\Big\{ (\omega,t,\theta)\in\Omega\times[0,T] \times[0,1]\ :\ E_{\theta}\Big( G\big(t,X_{t}(\omega)\big), F\big(t,X_{t}( \omega)\big)\Big)(\omega) = 0\Big\} \end{aligned}$$
(3.16)

has measure zero (see Proposition A.3 in the Appendix), it holds that

$$\begin{aligned} &\ell_{\gamma}''\Big(E_{\theta}\big( G( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}), F( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}) \big) + \bar{R}^{\varepsilon_{N}}_{\theta}( \varphi_{t}^{N},X_{ \varphi_{t}^{N}})\Big) \\ & \underset{N\to\infty}{\longrightarrow}\ell_{\gamma}''\Big(E_{\theta}\big( G(t,X_{t}), F(t,X_{t})\big)\Big), \end{aligned}$$

\({\mathrm{d}}\mathbb{P}^{W}\otimes{\mathrm{d}}\mathbb {P}^{B}\otimes{\mathrm{d}}t \otimes{\mathrm{d}}\theta\)-almost surely.
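In steps 2) and 3) (and again in Sect. 3.5 below), we only use elementary properties of \(\ell_{\gamma}''\) which can be read off directly from (1.2): for \(y\neq0\),

$$ \ell_{\gamma}'(y) = \big(1+\gamma\operatorname{Sgn}(y)\big)^{2} y, \qquad \ell_{\gamma}''(y) = \big(1+\gamma\operatorname{Sgn}(y)\big)^{2}. $$

Hence \(\ell_{\gamma}''\) depends only on the sign of its argument: it is bounded by \((1+\gamma)^{2}\), has a jump of size \(4\gamma\) at 0, and satisfies \(\ell_{\gamma}''(\lambda y)=\ell_{\gamma}''(y)\) for every \(\lambda>0\).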

3) Because of the boundedness of \(\ell_{\gamma}''\) and the polynomial growth of \({\sigma},{ v},D_{x} { v},D^{2}_{x} { v}\), we have

$$\begin{aligned} & \Big|\ell_{\gamma}''\left(E_{\theta}\big( G( \bar{\varphi }_{t}^{N},X_{ \varphi_{t}^{N}}), F( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}) \big) + \bar{R}^{\varepsilon_{N}}_{\theta}( \varphi_{t}^{N},X_{ \varphi_{t}^{N}})\right) \\ &\times Q_{\theta}\big( G( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}), F( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}})\big)\Big| \\ &\le C\bigg(1 + \sup_{t\in[0,T]}|X_{t}| + {\left\vert B_{\theta}\right\vert} + {\left\vert\int_{0}^{\theta} B_{\theta'}{\mathrm{d}}B_{\theta'}^{\intercal}\right\vert}\bigg)^{q} \end{aligned}$$

for some positive constants \(C\) and \(q\). By dominated convergence, we conclude that

$$\begin{aligned} &\mathbb{E}^{W\otimes B}\bigg[\int_{0}^{T}\int_{0}^{1}\ell_{\gamma }''\Big(E_{\theta}\big( G( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}), F( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}) \big) + \bar{R}^{\varepsilon_{N}}_{\theta}( \varphi_{t}^{N},X_{ \varphi_{t}^{N}})\Big) \\ &\phantom{=:}\qquad\qquad\quad\ \ \times Q_{\theta}\big( G( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}}), F( \bar{\varphi}_{t}^{N},X_{ \varphi_{t}^{N}})\big){\mathrm{d}}\theta\ {\mathrm{d}}t\bigg] \\ &\underset{N\to\infty}{\longrightarrow}\mathbb{E}^{W\otimes B} \bigg[\int_{0}^{T}\int_{0}^{1}\ell_{\gamma}''\Big(E_{\theta}\big( G(t,X_{t}), F(t,X_{t})\big)\Big)Q_{\theta}\big( G(t,X_{t}), F(t,X_{t}) \big){\mathrm{d}}\theta\ {\mathrm{d}}t\bigg]. \end{aligned}$$

This completes the proof of Theorem 2.6. □

3.5 Proof of Proposition 3.1

For the sake of conciseness, we set \(u = u^{(n+1)}\). By substituting \(X^{\varepsilon_{N}}_{\theta}\) in (3.1) into \({\mathcal{E}}^{\varepsilon_{N}}_{\theta}\) in (3.4), we get

$$\begin{aligned} {\mathcal{E}}^{\varepsilon_{N}}_{\theta}& = u ({t_{n}}+ \theta \varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) - u({t_{n}},\xi)- \varepsilon_{N}\int_{0}^{\theta}D_{x} u({t_{n}},\xi){\mu}({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ){\mathrm{d}} \theta' \\ &\phantom{=:} -\varepsilon_{N}^{1/2}\int_{0}^{\theta}D_{x} u({t_{n}},\xi){ \sigma}({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ){\mathrm{d}}B_{\theta'}, \end{aligned}$$
(3.17)

where \(D_{x} u(\cdot,\cdot)\) is a row vector.

In the proof, we apply the Itô–Tanaka formula to \(\ell_{\gamma}({\mathcal{E}}^{\varepsilon_{N}}_{\theta})\) between \(\theta= 0\) and \(\theta= 1\) and perform some Taylor–Itô expansions in terms of \(\varepsilon_{N}\). Because \(u,{\partial_{t} u},D_{x} u,D^{2}_{x} u\) are in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\), we have \(u\in C^{1,2}([{t_{n}},{t_{n+1}}]\times{\mathbb{R}}^{d};{ \mathbb{R}})\). Applying Itô’s formula to \(u ({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) \) yields

$$\begin{aligned} u({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) - u({t_{n}}, \xi)& =\varepsilon_{N}^{1/2}\int_{0}^{\theta} (D_{x} u { \sigma}) ({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{ \varepsilon_{N}} ){\mathrm{d}}B_{\theta'} \\ & \phantom{=:}+\varepsilon_{N}\int_{0}^{\theta} ({{\mathcal{L}}}_{{t_{n}}+ \theta'\varepsilon_{N}}u ) ({t_{n}}+ \theta'\varepsilon_{N},X_{ \theta'}^{\varepsilon_{N}} ){\mathrm{d}}\theta' \\ & \phantom{=:}+\varepsilon_{N}\int_{0}^{\theta}\big( (D_{x} u ){\mu}\big) ({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ){\mathrm{d}} \theta'. \end{aligned}$$
(3.18)

Here, we write \((\Delta D_{x} u) (t,\zeta) = D_{x} u (t,\zeta) - D_{x} u ({t_{n}},\xi)\) for any \(t\in[{t_{n}},{t_{n+1}}]\) and \(\zeta\in{\mathbb{R}}^{d}\). Substituting (3.18) into (3.17) leads to

$$\begin{aligned} {\mathcal{E}}^{\varepsilon_{N}}_{\theta}& = \varepsilon_{N}^{1/2} \int_{0}^{\theta}\big( (\Delta D_{x} u ){\sigma}\big) ({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ){\mathrm{d}}B_{ \theta'} \\ & \phantom{=:}+\varepsilon_{N}\int_{0}^{\theta} ({{\mathcal{L}}}_{{t_{n}}+ \theta'\varepsilon_{N}}u ) ({t_{n}}+ \theta'\varepsilon_{N},X_{ \theta'}^{\varepsilon_{N}} ){\mathrm{d}}\theta' \\ & \phantom{=:}+\varepsilon_{N}\int_{0}^{\theta}\big( (\Delta D_{x} u ){\mu} \big) ({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ){\mathrm{d}}\theta'. \end{aligned}$$
(3.19)

Now we first use that \(u\) solves the PDE (1.5) to simplify the second term above. Then we apply the Itô–Tanaka formula to the convex function \(\ell_{\gamma}\) (see [23, Theorem VI.1.5 and Corollary VI.1.6]) composed with the process \({\mathcal{E}}^{\varepsilon_{N}}_{\theta}\) between \(\theta= 0\) and \(\theta= 1\). Because \(\ell_{\gamma}'(y) = \ell_{\gamma}''(y)y\) for all \(y \in{\mathbb{R}}\), we get

$$\begin{aligned} \ell_{\gamma}({\mathcal{E}}^{\varepsilon_{N}}_{1} )& = - \varepsilon_{N}\int_{0}^{1}\ell_{\gamma}'' ({\mathcal{E}}^{ \varepsilon_{N}}_{\theta}){\mathcal{E}}^{\varepsilon_{N}}_{\theta }F^{(n+1)}({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}){\mathrm {d}}\theta \\ & \phantom{=:}+\varepsilon_{N}^{1/2}\int_{0}^{1}\ell_{\gamma}'' ({\mathcal{E}}^{ \varepsilon_{N}}_{\theta}){\mathcal{E}}^{\varepsilon_{N}}_{\theta }\big( (\Delta D_{x} u ){\sigma}\big) ({t_{n}}+ \theta \varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}){\mathrm{d}}B_{\theta} \\ & \phantom{=:}+\varepsilon_{N}\int_{0}^{1}\ell_{\gamma}'' ({\mathcal{E}}^{ \varepsilon_{N}}_{\theta}){\mathcal{E}}^{\varepsilon_{N}}_{\theta }\big( (\Delta D_{x} u ){\mu}\big) ({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}){\mathrm{d}}\theta \\ &\phantom{=:} +\frac{1}{2}\varepsilon_{N}\int_{0}^{1}\ell_{\gamma}'' ({ \mathcal{E}}^{\varepsilon_{N}}_{\theta}) {\big\vert \big( (\Delta D_{x} u ){\sigma}\big) ({t_{n}}+ \theta \varepsilon_{N},X^{\varepsilon_{N}}_{\theta})\big\vert }^{2}{ \mathrm{d}}\theta. \end{aligned}$$

Considering \(T^{\varepsilon_{N}}({t_{n}},\xi)\) in (3.5), taking the expectation of the above expression and dividing by \(\varepsilon_{N}^{2}\) gives

$$ T^{\varepsilon_{N}}({t_{n}},\xi)= T^{\varepsilon_{N}}_{1}({t_{n}}, \xi)+ T^{\varepsilon_{N}}_{2}({t_{n}},\xi)+ T^{\varepsilon_{N}}_{3}({t_{n}}, \xi), $$
(3.20)

where

$$\begin{aligned} T^{\varepsilon_{N}}_{1}({t_{n}},\xi)&:= -\varepsilon_{N}^{ - 1}\mathbb{E}^{B}\bigg[\int_{0}^{1}\ell _{\gamma}'' ({\mathcal{E}}^{\varepsilon_{N}}_{\theta}){\mathcal{E}}^{ \varepsilon_{N}}_{\theta}F^{(n+1)}({t_{n}}+ \theta\varepsilon_{N}, X^{ \varepsilon_{N}}_{\theta}){\mathrm{d}}\theta\bigg], \end{aligned}$$
(3.21)
$$\begin{aligned} T^{\varepsilon_{N}}_{2}({t_{n}},\xi)&:= \frac{1}{2}\varepsilon_{N}^{ - 1}\mathbb{E}^{B}\bigg[\int_{0}^{1} \ell_{\gamma}'' ({\mathcal{E}}^{\varepsilon_{N}}_{\theta}) {\big\vert \big( (\Delta D_{x} u ){\sigma}\big) ({t_{n}}+ \theta \varepsilon_{N},X^{\varepsilon_{N}}_{\theta})\big\vert }^{2}{ \mathrm{d}}\theta\bigg], \end{aligned}$$
(3.22)
$$\begin{aligned} T^{\varepsilon_{N}}_{3}({t_{n}},\xi)&:= \varepsilon_{N}^{ - 1}\mathbb{E}^{B}\bigg[\int_{0}^{1}\ell_{\gamma }'' ({\mathcal{E}}^{\varepsilon_{N}}_{\theta}){\mathcal{E}}^{ \varepsilon_{N}}_{\theta}\big( (\Delta D_{x} u ){\mu}\big) ({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}){\mathrm {d}}\theta \bigg]. \end{aligned}$$
(3.23)

Here we have used that the stochastic integral in \(\ell_{\gamma}({\mathcal{E}}^{\varepsilon_{N}}_{1})\) has expectation zero, which follows directly from \(\mathbb{E}[ \int_{0}^{1}|{\mathcal{E}}^{\varepsilon_{N}}_{\theta }|^{4}{\mathrm{d}}\theta]< + \infty\) and from the polynomial growth of \({\sigma}\) and \(D_{x} u\) (because \({\sigma},D_{x} u\in\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\)). Now we analyse the expansion of \({\mathcal{E}}^{\varepsilon_{N}}_{\theta}\) and then apply it to \(T_{i}^{\varepsilon_{N}}({t_{n}},\xi)\) for \(i = 1,2,3\).

1) Stochastic Taylor expansion of \(( (\Delta D_{x} u ){\sigma}) ({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta})\) and \({\mathcal{E}}^{\varepsilon_{N}}_{\theta}\). We approximate \(\Delta D_{x} u\) up to order \(\varepsilon_{N}^{1/2}\) by setting

$$\begin{aligned} (\Delta D_{x} u ) ({t_{n}}+ \theta\varepsilon_{N}, X^{ \varepsilon_{N}}_{\theta}) = \varepsilon_{N}^{1/2}B_{\theta}^{ \intercal}\big({\sigma}^{\intercal} (D^{2}_{x} u )\big)({t_{n}}, \xi)+ r_{\theta}^{\varepsilon_{N}}, \end{aligned}$$
(3.24)

where \(\Delta D_{x} u\) and \(r_{\theta}^{\varepsilon_{N}}\) are row vectors.

Lemma 3.4

Let\(p\ge2\). Under the assumptions of Theorem 2.6, we have

(a) \(\sup_{0\leq n\leq N- 1} \sup_{\theta\in[0,1]}\mathbb{E}^{B}[ {\vert r_{\theta}^{\varepsilon_{N}} \vert}^{p} ]\le C_{n,N}(\xi) \varepsilon_{N}^{p}\),

(b) \(\sup_{0\leq n\leq N- 1} \sup_{\theta\in[0,1]}\mathbb{E}^{B}[ {\vert(\Delta D_{x} u ) ({t_{n}}+ \theta\varepsilon_{N},X^{\varepsilon _{N}}_{\theta}) \vert}^{p} ] \le C_{n,N}(\xi)\varepsilon_{N}^{p/2}\),

for some constant \(C_{n,N}(\xi)\in C_{\mathrm{pol}}\).

Proof of Lemma 3.4

(b) This follows directly from Lemma A.1 and from part (a), using standard computations.

(a) Because \(D_{x} u,{\partial_{t} D_{x} u},D^{2}_{x} u, D^{2}_{x}D_{x} u\) are in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\), we have that \(D_{x} u\) is in \(C^{1,2}([{t_{n}},{t_{n+1}}]\times{\mathbb{R}}^{d})\). By applying Itô’s formula to \(D_{x} u ({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon _{N}}_{\theta})\), we get

$$\begin{aligned} r_{\theta}^{\varepsilon_{N}} & = \varepsilon_{N}\int_{0}^{\theta} \big({{\mathcal{L}}}_{{t_{n}}+ \theta'\varepsilon_{N}}D_{x} u + { \mu}^{\intercal}(D^{2}_{x} u)\big) ({t_{n}}+ \theta'\varepsilon_{N},X_{ \theta'}^{\varepsilon_{N}} ){\mathrm{d}}\theta' \\ & \phantom{=:}+\varepsilon_{N}^{1/2}\int_{0}^{\theta}{\mathrm{d}}B_{\theta'}^{ \intercal} \Big(\big({\sigma}^{\intercal} (D^{2}_{x} u )\big) ({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ) - \big({ \sigma}^{\intercal} (D^{2}_{x} u )\big)({t_{n}},\xi)\Big). \end{aligned}$$

Using the Hölder inequality, the Burkholder–Davis–Gundy (BDG) inequality and the polynomial growth conditions on the functions (because \({\sigma},D_{x} u,{\partial_{t} D_{x} u},D^{2}_{x} u, D_{x}^{2}D_{x} u \in\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\)), we estimate

$$\begin{aligned} & \mathbb{E}^{B}[{\vert r_{\theta}^{\varepsilon_{N}} \vert}^{p} ] \\ &\le2^{p - 1}\varepsilon_{N}^{p}\int_{0}^{\theta}\mathbb{E}^{B} \big[ {\big\vert \big({{\mathcal{L}}}_{{t_{n}}+ \theta'\varepsilon_{N}}D_{x} u + {\mu}^{\intercal}(D^{2}_{x} u)\big) ({t_{n}}+ \theta'\varepsilon _{N},X_{\theta'}^{\varepsilon_{N}} )\big\vert }^{p} \big]{\mathrm{d}}\theta' \\ &\phantom{=:} +2^{p - 1}C_{\text{BDG}}\varepsilon_{N}^{p/2}\int_{0}^{\theta} \mathbb{E}^{B}\big[\big\vert \big({\sigma}^{\intercal} (D^{2}_{x} u ) \big) ({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ) \\ &\phantom{=:}\qquad\qquad\qquad\qquad\qquad\quad- \big({\sigma}^{ \intercal} (D^{2}_{x} u )\big)({t_{n}},\xi)\big\vert ^{p}\big]{\mathrm{d}} \theta' . \end{aligned}$$

Using the growth conditions from the assumptions and applying the bounds (A.1) in Lemma A.1 to \({\sigma}^{\intercal}(D^{2}_{x} u)\in\text{H}^{1/2,1}_{\mathrm {loc},\mathrm{ pol}} \) (because \({\sigma}\) and \(D^{2}_{x} u\) are in \(\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\)), we get the announced estimate. □

Plugging the decomposition (3.24) into the expression of \({\mathcal{E}}^{\varepsilon_{N}}_{\theta}\) in (3.19) gives

$$ {\mathcal{E}}^{\varepsilon_{N}}_{\theta}= \varepsilon_{N}E_{\theta }\big( G^{(n+1)}({t_{n}},\xi), F^{(n+1)}({t_{n}},\xi)\big) + \varepsilon_{N}R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi), $$
(3.25)

where

$$ E_{\theta}\big( G^{(n+1)}({t_{n}},\xi), F^{(n+1)}({t_{n}},\xi)\big) = \int_{0}^{\theta}B_{\theta'}^{\intercal} G^{(n+1)}({t_{n}},\xi){ \mathrm{d}}B_{\theta'} - F^{(n+1)}({t_{n}},\xi)\theta $$
(3.26)

and

$$\begin{aligned} R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)& = - \int_{0}^{\theta} \big( F^{(n+1)}({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{ \varepsilon_{N}}) - F^{(n+1)}({t_{n}},\xi)\big){\mathrm{d}}\theta' \\ & \phantom{=:}+\int_{0}^{\theta}B_{\theta'}^{\intercal}\big({\sigma}^{ \intercal} (D^{2}_{x} u )\big)({t_{n}},\xi)\big({\sigma}({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ) - {\sigma}({t_{n}}, \xi)\big){\mathrm{d}}B_{\theta'} \\ & \phantom{=:}+\varepsilon_{N}^{ - 1/2}\int_{0}^{\theta} r_{\theta'}^{ \varepsilon_{N}}{\sigma}({t_{n}}+ \theta'\varepsilon_{N},X_{ \theta'}^{\varepsilon_{N}} ){\mathrm{d}}B_{\theta'} \\ & \phantom{=:}+\varepsilon_{N}^{1/2}\int_{0}^{\theta}B_{\theta'}^{\intercal} \big({\sigma}^{\intercal} (D^{2}_{x} u )\big)({t_{n}},\xi){\mu}({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ){\mathrm{d}} \theta' \\ & \phantom{=:}+\int_{0}^{\theta} r_{\theta'}^{\varepsilon_{N}}{\mu}({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ){\mathrm{d}} \theta'. \end{aligned}$$
(3.27)
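For orientation, note that in dimension \(d=1\), Itô's formula gives \(\int_{0}^{\theta}B_{\theta'}\,{\mathrm{d}}B_{\theta'}=(B_{\theta}^{2}-\theta)/2\), so that (3.26) reduces to the explicit expression

$$ E_{\theta}\big( G^{(n+1)}({t_{n}},\xi), F^{(n+1)}({t_{n}},\xi)\big) = G^{(n+1)}({t_{n}},\xi)\,\frac{B_{\theta}^{2}-\theta}{2} - F^{(n+1)}({t_{n}},\xi)\,\theta. $$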

Lemma 3.5

Under the assumptions of Theorem 2.6, we have

(a) \(\sup_{\theta\in[0,1]}\mathbb{E}^{B}[ |E_{\theta}( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)) |^{2} ]\le C_{n,N}(\xi)\),

(b) \(\sup_{\theta\in[0,1]}\mathbb{E}^{B}[ | R^{\varepsilon_{N}}_{ \theta}({t_{n}},\xi)|^{2} ]\le C_{n,N}(\xi)\varepsilon_{N}\),

for some constant \(C_{n,N}(\xi)\in C_{\mathrm{pol}}\).

Proof

(a) From (3.26), we get

$$\begin{aligned} & \mathbb{E}^{B}\Big[\Big|E_{\theta}\Big(\big({\sigma }^{\intercal}( D^{2}_{x} u ){\sigma}\big)({t_{n}},\xi), F^{(n+1)}({t_{n}},\xi) \Big)\Big|^{2}\Big] \\ & \le2 | F^{(n+1)}({t_{n}},\xi)|^{2} \theta^{2} + 2 \mathbb{E}^{B} \bigg[\bigg| \int_{0}^{\theta}B_{\theta'}^{\intercal}\Big(\big({ \sigma}^{\intercal}(D^{2}_{x} u){\sigma}\big)\Big)({t_{n}},\xi){ \mathrm{d}}B_{\theta'}\bigg|^{2}\bigg] , \end{aligned}$$

and we conclude by using the Itô isometry and the growth conditions on the coefficients (because \({\sigma},u,D_{x} u,D^{2}_{x} u,f\in\text{H}^{1/2,1}_{\mathrm {loc},\mathrm{ pol}} \)).

(b) From (3.27), we estimate

$$\begin{aligned} & \mathbb{E}^{B}[ | R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)|^{2} ] \\ &\le5\int_{0}^{1}\mathbb{E}^{B}\big[ | F^{(n+1)}({t_{n}}+ \theta' \varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ) - F^{(n+1)}({t_{n}}, \xi)|^{2} \big]{\mathrm{d}}\theta' \\ &\quad +5\mathbb{E}^{B}\bigg[\int_{0}^{1}\big\vert B_{\theta'}^{ \intercal}\big({\sigma}^{\intercal} (D^{2}_{x} u )\big)({t_{n}}, \xi)\big({\sigma}({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{ \varepsilon_{N}} ) - {\sigma}({t_{n}},\xi)\big)\big\vert ^{2}{\mathrm{d}} \theta'\bigg] \\ &\quad +5\varepsilon_{N}^{ - 1}\mathbb{E}^{B}\bigg[\int_{0}^{1} \vert r_{ \theta'}^{\varepsilon_{N}}{\sigma}({t_{n}}+ \theta'\varepsilon_{N},X_{ \theta'}^{\varepsilon_{N}} ) \vert^{2}{\mathrm{d}}\theta' \bigg] \\ &\quad +5\varepsilon_{N}\int_{0}^{1}\mathbb{E}^{B}\big[\big|B_{\theta'}^{ \intercal}\big({\sigma}^{\intercal} (D^{2}_{x} u )\big)({t_{n}}, \xi){\mu}({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{ \varepsilon_{N}} )\big|^{2}\big]{\mathrm{d}}\theta' \\ &\quad +5\int_{0}^{1}\mathbb{E}^{B}[ | r_{\theta'}^{\varepsilon_{N}}{ \mu}({t_{n}}+ \theta'\varepsilon_{N},X_{\theta'}^{\varepsilon_{N}} ) |^{2} ]{\mathrm{d}}\theta' \end{aligned}$$

for all \(\xi\in{\mathbb{R}}^{d}\), \(n\in\{0,\ldots,N - 1\}\) and \(\theta\in[0,1]\). Now we deduce the inequality (b) by using \(f\), \(u\), \(D_{x} u\), \(D^{2}_{x} u\), \({\sigma}\in\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\) and by applying Lemmas A.1 and 3.4. □

2) Expansion of \(T^{\varepsilon_{N}}_{1}({t_{n}},\xi)\) and \(T^{\varepsilon_{N}}_{3}({t_{n}},\xi)\). From \(\ell_{\gamma}''(y) = \ell_{\gamma}''(y/\varepsilon_{N})\) for all \(y \in{\mathbb{R}}\) (which holds because \(\ell_{\gamma}''\) depends only on the sign of its argument and \(\varepsilon_{N}>0\)) and the expansion of \({\mathcal{E}}^{\varepsilon_{N}}_{\theta}\) in (3.25), we get

$$ \ell_{\gamma}'' ({\mathcal{E}}^{\varepsilon_{N}}_{\theta}) = \ell _{\gamma}''\Big(E_{\theta}\big( G^{(n+1)}({t_{n}},\xi), F^{(n+1)}({t_{n}}, \xi)\big) + R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)\Big). $$
(3.28)

By combining this with (3.21) and (3.25), we obtain

$$\begin{aligned} T^{\varepsilon_{N}}_{1}({t_{n}},\xi)&= -\mathbb{E}^{B}\bigg[\int_{0}^{1} \ell_{\gamma}''\Big(E_{\theta}\big( G^{(n+1)}({t_{n}},\xi), F^{(n+1)}({t_{n}}, \xi)\big)+ R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)\Big) \\ &\phantom{=:}\qquad\qquad\times F^{(n+1)}({t_{n}},\xi)E_{\theta}\big( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)\big){\mathrm{d}}\theta\bigg] \\ &\phantom{=:}+C_{1}^{\varepsilon_{N}}({t_{n}},\xi), \end{aligned}$$

where

$$\begin{aligned} C_{1}^{\varepsilon_{N}}({t_{n}},\xi):=-\mathbb{E}^{B}\bigg[\int_{0}^{1}& \ell_{\gamma}''\Big(E_{\theta}\big( G^{(n+1)}({t_{n}},\xi), F^{(n+1)}({t_{n}}, \xi)\big) + R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)\Big) \\ &\times R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi) F^{(n+1)}({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}){\mathrm {d}}\theta \bigg] \\ -\mathbb{E}^{B}\bigg[\int_{0}^{1}&\ell_{\gamma}'' ({\mathcal{E}}^{ \varepsilon_{N}}_{\theta})\big( F^{(n+1)}({t_{n}}+ \theta \varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) - F^{(n+1)}({t_{n}}, \xi)\big) \\ &\times E_{\theta}\big( G^{(n+1)}({t_{n}},\xi), F^{(n+1)}({t_{n}}, \xi)\big){\mathrm{d}}\theta\bigg] . \end{aligned}$$
(3.29)

The estimates of \(C_{1}^{\varepsilon_{N}}({t_{n}},\xi)\) and \(T^{\varepsilon_{N}}_{3}({t_{n}},\xi)\) are summarised in the following lemma.

Lemma 3.6

Under the assumptions of Theorem 2.6, we have

(a) \(\mathbb{E}^{B}[ |C_{1}^{\varepsilon_{N}}({t_{n}},\xi)| ]\le C_{n,N}( \xi)\varepsilon_{N}^{1/2}\),

(b) \(\mathbb{E}^{B}[ | T^{\varepsilon_{N}}_{3}({t_{n}},\xi)| ]\le C_{n,N}( \xi)\varepsilon_{N}^{1/2}\),

for some constant \(C_{n,N}(\xi)\in C_{\mathrm{pol}}\).

Proof

(a) From (3.29), it readily follows that

$$\begin{aligned} |C_{1}^{\varepsilon_{N}}({t_{n}},\xi)| &\le K\mathbb{E}^{B}\bigg[ \int_{0}^{1} | F^{(n+1)}({t_{n}}+ \theta\varepsilon_{N}, X^{ \varepsilon_{N}}_{\theta}) | | R^{\varepsilon_{N}}_{\theta}({t_{n}}, \xi)|{\mathrm{d}}\theta\bigg] \\ & \phantom{=:}+K\mathbb{E}^{B}\bigg[\int_{0}^{1} | F^{(n+1)}({t_{n}}+ \theta \varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) - F^{(n+1)}({t_{n}}, \xi)| \\ &\phantom{=:}\qquad\qquad\quad\,\,\,\times\big|E_{\theta}\big( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)\big)\big|{\mathrm{d}}\theta\bigg], \end{aligned}$$

where \(K\) is an upper bound for \(\ell_{\gamma}''\). For the first term above, we use that \(F^{(n+1)}\) has polynomial growth in its arguments (because \(u^{(n+1)},D_{x} u^{(n+1)},D^{2}_{x} u^{(n+1)}, f\in \text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\)) and Lemma 3.5 (b). For the second term, applying the Cauchy–Schwarz inequality with Lemmas A.1 and 3.5 (a) yields

$$ |C_{1}^{\varepsilon_{N}}({t_{n}},\xi)|\le C_{n,N}(\xi)\varepsilon_{N}^{1/2} $$

as announced.

(b) Similarly to (a), from (3.23) we write

$$\begin{aligned} | T^{\varepsilon_{N}}_{3}({t_{n}},\xi)| & \le K\int_{0}^{1} \mathbb{E}^{B}\big[\big|E_{\theta}\big( G^{(n+1)}({t_{n}},\xi), F^{(n+1)}({t_{n}}, \xi)\big) + R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)\big| \\ &\phantom{=}\qquad\qquad\quad\times\big|\big( (\Delta D_{x} u ){\mu} \big) ({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) \big|\big]{\mathrm{d}}\theta \\ & \le K\int_{0}^{1}\sqrt{\mathbb{E}^{B}\big[\big|E_{\theta}\big( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)\big) + R^{\varepsilon_{N}}_{\theta}({t_{n}}, \xi)\big|^{2}\big]} \\ &\phantom{=:}\qquad\,\,\,\,\,\, \times\sqrt{\mathbb{E}^{B}\big[\big|\big( ( \Delta D_{x} u ){\mu}\big) ({t_{n}}+ \theta\varepsilon_{N}, X^{ \varepsilon_{N}}_{\theta})\big|^{2}\big]}{\mathrm{d}}\theta. \end{aligned}$$

It is now straightforward to conclude that the above is bounded by \(C_{n,N}(\xi)\varepsilon_{N}^{1/2}\) by using Lemmas A.1, 3.4 and 3.5. □

3) Expansion of \(T^{\varepsilon_{N}}_{2}({t_{n}},\xi)\). Using the expansion of \(\Delta D_{x} u\) in (3.24), we obtain

$$\begin{aligned} & {\left\vert\big( (\Delta D_{x} u ){\sigma}\big) ({t_{n}}+ \theta \varepsilon_{N},X^{\varepsilon_{N}}_{\theta})\right\vert}^{2} \\ & = \varepsilon_{N} {\big\vert B_{\theta}^{\intercal}\big({\sigma}^{\intercal }(D^{2}_{x} u)\big)({t_{n}},\xi){\sigma}({t_{n}}+ \theta\varepsilon _{N},X^{\varepsilon_{N}}_{\theta})\big\vert }^{2} + {\vert r_{\theta}^{\varepsilon_{N}}{\sigma}({t_{n}}+ \theta\varepsilon _{N},X^{\varepsilon_{N}}_{\theta}) \vert}^{2} \\ & \phantom{=:}+2\varepsilon_{N}^{1/2}\big\langle B_{\theta}^{\intercal}\big({ \sigma}^{\intercal}(D^{2}_{x} u)\big)({t_{n}},\xi){\sigma}({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) , r_{\theta}^{ \varepsilon_{N}}{\sigma}({t_{n}}+ \theta\varepsilon_{N}, X^{ \varepsilon_{N}}_{\theta})\big\rangle . \end{aligned}$$

Using the identity \({\sigma}(t,\zeta) = \Delta{\sigma}(t,\zeta) + {\sigma}({t_{n}}, \xi)\) in the first term of the previous equation, we get

$$\begin{aligned} {\left\vert\big( (\Delta D_{x} u ){\sigma}\big) ({t_{n}}+ \theta \varepsilon_{N},X^{\varepsilon_{N}}_{\theta})\right\vert}^{2} = \varepsilon_{N} {\big\vert B_{\theta}^{\intercal}\big({\sigma}^{\intercal }(D^{2}_{x} u){\sigma}\big)({t_{n}},\xi)\big\vert }^{2}+ c _{\theta}^{\varepsilon_{N}}({t_{n}},\xi), \end{aligned}$$
(3.30)

where

$$\begin{aligned} & c_{\theta}^{\varepsilon_{N}}({t_{n}},\xi) \\ & = \varepsilon_{N} {\big\vert B_{\theta}^{\intercal}\big({\sigma}^{\intercal }(D^{2}_{x} u)\big)({t_{n}},\xi)\Delta{\sigma}({t_{n}}+ \theta \varepsilon_{N},X^{\varepsilon_{N}}_{\theta})\big\vert }^{2} + {\vert r_{\theta}^{\varepsilon_{N}}{\sigma}({t_{n}}+ \theta\varepsilon _{N},X^{\varepsilon_{N}}_{\theta}) \vert}^{2} \\ & \phantom{=:}+2\varepsilon_{N}\big\langle B_{\theta}^{\intercal}\big({\sigma}^{ \intercal}(D^{2}_{x} u)\big)({t_{n}},\xi)\Delta{\sigma}({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) , B_{\theta}^{ \intercal}\big({\sigma}^{\intercal}(D^{2}_{x} u){\sigma}\big)({t_{n}}, \xi)\big\rangle \\ &\phantom{=:} +2\varepsilon_{N}^{1/2}\big\langle B_{\theta}^{\intercal}\big({ \sigma}^{\intercal}(D^{2}_{x} u)\big)({t_{n}},\xi){\sigma}({t_{n}}+ \theta\varepsilon_{N}, X^{\varepsilon_{N}}_{\theta}) , r_{\theta}^{ \varepsilon_{N}}{\sigma}({t_{n}}+ \theta\varepsilon_{N}, X^{ \varepsilon_{N}}_{\theta})\big\rangle . \end{aligned}$$
(3.31)

From (3.30) and (3.28), the expression of \(T^{\varepsilon_{N}}_{2}\) in (3.22) becomes

$$\begin{aligned} T^{\varepsilon_{N}}_{2}({t_{n}},\xi)= \frac{1}{2}\mathbb{E}^{B} \bigg[\int_{0}^{1}&\ell_{\gamma}''\Big(E_{\theta}\big( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)\big) + R^{\varepsilon_{N}}_{\theta}({t_{n}}, \xi)\Big) \\ & \times {\big\vert B_{\theta}^{\intercal}\big({\sigma}^{\intercal }(D^{2}_{x} u){\sigma}\big)({t_{n}},\xi)\big\vert }^{2}{ \mathrm{d}}\theta\bigg] +C_{2}^{\varepsilon_{N}}({t_{n}},\xi), \end{aligned}$$

where

$$\begin{aligned} C_{2}^{\varepsilon_{N}}({t_{n}},\xi):=\frac{1}{2}\mathbb{E}^{B} \bigg[\int_{0}^{1}&\ell_{\gamma}''\Big(E_{\theta}\big( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)\big) + R^{\varepsilon_{N}}_{\theta}({t_{n}}, \xi)\Big) \\ & \times\varepsilon_{N}^{ - 1}c_{\theta}^{\varepsilon_{N}}({t_{n}}, \xi){\mathrm{d}}\theta\bigg]. \end{aligned}$$
(3.32)

The estimate of \(C_{2}^{\varepsilon_{N}}({t_{n}},\xi)\) is summarised in the following lemma.

Lemma 3.7

Under the assumptions of Theorem 2.6, we have

$$ |C_{2}^{\varepsilon_{N}}({t_{n}},\xi)|\le C_{n,N}(\xi)\varepsilon_{N}^{1/2}, $$

for some constant \(C_{n,N}(\xi)\in C_{\mathrm{pol}}\).

Proof

From the expression for \(c_{\theta}^{\varepsilon_{N}}({t_{n}},\xi)\) in (3.31), we write

$$\begin{aligned} \mathbb{E}^{B}[ |c_{\theta}^{\varepsilon_{N}}({t_{n}},\xi)| ] & \le\varepsilon_{N}\mathbb{E}^{B}\big[ {\big\vert B_{\theta}^{\intercal}\big({\sigma}^{\intercal }(D^{2}_{x} u)\big)({t_{n}},\xi)\Delta{\sigma}({t_{n}}+ \theta \varepsilon_{N},X^{\varepsilon_{N}}_{\theta})\big\vert }^{2} \big] \\ & \phantom{=:}+\mathbb{E}^{B}[ {\vert r_{\theta}^{\varepsilon_{N}}{\sigma}({t_{n}}+ \theta\varepsilon _{N},X^{\varepsilon_{N}}_{\theta}) \vert}^{2} ] \\ & \phantom{=:}+2\varepsilon_{N}\sqrt{\mathbb{E}^{B}\big[ {\big\vert B_{\theta}^{\intercal}\big({\sigma}^{\intercal }(D^{2}_{x} u)\big)({t_{n}},\xi)\Delta{\sigma}({t_{n}}+ \theta \varepsilon_{N},X^{\varepsilon_{N}}_{\theta})\big\vert }^{2} \big]} \\ & \phantom{=:} \quad\times\sqrt{\mathbb{E}^{B}\big[ {\big\vert B_{\theta}^{\intercal}\big({\sigma}^{\intercal }(D^{2}_{x} u){\sigma}\big)({t_{n}},\xi)\big\vert }^{2} \big]} \\ & \phantom{=:}+2\varepsilon_{N}^{1/2}\sqrt{\mathbb{E}^{B}\big[ {\big\vert B_{\theta}^{\intercal}\big({\sigma}^{\intercal }(D^{2}_{x} u)\big)({t_{n}},\xi){\sigma}({t_{n}}+ \theta\varepsilon _{N},X^{\varepsilon_{N}}_{\theta})\big\vert }^{2} \big]} \\ & \phantom{=:} \quad\times\sqrt{\mathbb{E}^{B}\big[ {\big\vert r_{\theta}^{\varepsilon_{N}}{\sigma}({t_{n}}+ \theta \varepsilon_{N},X^{\varepsilon_{N}}_{\theta})\big\vert }^{2} \big]} \\ & \leq C_{n,N}(\xi)\varepsilon_{N}^{3/2}. \end{aligned}$$

Again we have used the polynomial growth conditions on \({\sigma},D^{2}_{x} u\) and the local regularity of \({\sigma}\in\text{H}^{1/2,1}_{\mathrm{loc},\mathrm{ pol}}\) together with Lemma A.1, as well as Lemma 3.4 (a). Consequently, in view of the definition (3.32) of \(C_{2}^{\varepsilon_{N}}({t_{n}},\xi)\), we get the estimate

$$ |C_{2}^{\varepsilon_{N}}({t_{n}},\xi)|\le\frac{1}{2}|\ell_{\gamma }''|_{\infty} \, \varepsilon_{N}^{ - 1} \sup_{\theta\in[0,1]} \mathbb{E}^{B}[ |c_{\theta}^{\varepsilon_{N}}({t_{n}},\xi)| ], $$

which leads to the announced result. □

4) Expansion of \(T^{\varepsilon_{N}}({t_{n}},\xi)\). From (3.20) and the previous expansions of \(T_{i}^{\varepsilon_{N}}({t_{n}},\xi)\) for \(i = 1,2,3\), we deduce

$$\begin{aligned} T^{\varepsilon_{N}}({t_{n}},\xi)& = \mathbb{E}^{B}\bigg[\int_{0}^{1} \ell_{\gamma}''\Big(E_{\theta}\big( G^{(n+1)}({t_{n}},\xi), F^{(n+1)}({t_{n}}, \xi)\big) + R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)\Big) \\ & \qquad\qquad\,\,\,\,\, \times Q_{\theta}\big( G^{(n+1)}({t_{n}}, \xi), F^{(n+1)}({t_{n}},\xi)\big){\mathrm{d}}\theta\bigg] \\ &\phantom{=:} +C_{1}^{\varepsilon_{N}}({t_{n}},\xi)+ C _{2}^{\varepsilon_{N}}({t_{n}}, \xi)+ T^{\varepsilon_{N}}_{3}({t_{n}},\xi), \end{aligned}$$

where \(Q_{\theta}\) is defined in (3.7). Since \(C_{1}^{\varepsilon_{N}}({t_{n}},\xi)\), \(C_{2}^{\varepsilon_{N}}({t_{n}},\xi)\) and \(T^{\varepsilon_{N}}_{3}({t_{n}},\xi)\) satisfy \(\varepsilon_{N}^{1/2}\)-bounds, we get the result of Proposition 3.1.  □

3.6 Proof of Lemma 3.3

(a) From the definition of \(E_{\theta}\) in (3.6), it follows that

$$\begin{aligned} & \big|E_{\theta}\big( G^{(n+1)}({t_{n}}, X_{t_{n}}), F^{(n+1)}({t_{n}}, X_{t_{n}})\big) - E_{\theta}\big( G({t_{n+1}}, X_{t_{n}}), F({t_{n+1}}, X_{t_{n}})\big)\big| \\ & \le\big| f\big({t_{n}}, X_{t_{n}}, u^{(n+1)}({t_{n}}, X_{t_{n}}), D_{x} u^{(n+1)}({t_{n}}, X_{t_{n}}),D^{2}_{x} u^{(n+1)}({t_{n}}, X_{t_{n}}) \big) \\ & \phantom{=::}-f\big({t_{n+1}}, X_{t_{n}}, { v}({t_{n+1}}, X_{t_{n}}), D_{x} { v}({t_{n+1}}, X_{t_{n}}),D^{2}_{x} { v}({t_{n+1}}, X_{t_{n}}) \big) \big| \\ & \phantom{=:}+ {\big\vert \big({\sigma}^{\intercal} (D^{2}_{x} u^{(n+1)} ){\sigma }\big)({t_{n}},X_{t_{n}})- \big({\sigma}^{\intercal} (D^{2}_{x} { v}){\sigma}\big)({t_{n+1}},X_{t_{n}})\big\vert } {\left\vert\int_{0}^{\theta} B_{\theta'}{\mathrm{d}}B_{\theta '}^{\intercal}\right\vert} \\ & \leq C_{n,N}( X_{t_{n}})\varepsilon_{N}^{1/2} \left(1+ {\left\vert\int_{0}^{\theta} B_{\theta'}{\mathrm{d}}B_{\theta '}^{\intercal}\right\vert} \right), \end{aligned}$$

for some constant \(C_{n,N}( X_{t_{n}})\in C_{\mathrm{pol}}\), where we have used Proposition 3.2 and the assumptions on the coefficients, prices and Greeks. Thanks to the BDG inequalities, we conclude the proof of (a).

(b) From \(\bar{C}^{\varepsilon_{N}}({t_{n}},\xi)\) in (3.14), we get

$$\begin{aligned} \mathbb{E}[ | \bar{C}^{\varepsilon_{N}}({t_{n}}, X_{t_{n}})| ] \le|\ell_{\gamma}''|_{\infty}\int_{0}^{1}\mathbb{E}\big[& \big|Q_{\theta}\big( G^{(n+1)}({t_{n}}, X_{t_{n}}), F^{(n+1)}({t_{n}}, X_{t_{n}}) \big) \\ &- Q_{\theta}\big( G({t_{n+1}}, X_{t_{n}}), F({t_{n+1}}, X_{t_{n}}) \big) \big|\big]{\mathrm{d}}\theta. \end{aligned}$$

Considering the expression of \(Q_{\theta}\) in (3.7), we can apply the same arguments as for (a). Further details are left to the reader.

(c) Let \(p\geq1\) and set \(Z_{N}:=\sup_{0\leq n\leq N- 1} \sup_{\theta\in[0,1]} | \bar{R}^{ \varepsilon_{N}}_{\theta}({t_{n}}, X_{t_{n}})|^{p}\). From the definition (3.13) of \(\bar{R}^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)\), we can write

$$\begin{aligned} \mathbb{E}[Z_{N}] & \le2^{p - 1}\mathbb{E}\Big[ \sup_{0\leq n \leq N- 1} \sup_{\theta\in[0,1]}\big|E_{\theta}\big( G^{(n+1)}({t_{n}}, X_{t_{n}}), F^{(n+1)}({t_{n}}, X_{t_{n}})\big) \\ &\hspace{40mm}-E_{\theta}\big( G({t_{n+1}}, X_{t_{n}}), F({t_{n+1}}, X_{t_{n}}) \big)\big|^{p}\Big] \\ &\phantom{=:}+2^{p - 1}\mathbb{E}\Big[ \sup_{0\leq n\leq N- 1} \sup_{\theta \in[0,1]} | R^{\varepsilon_{N}}_{\theta}({t_{n}}, X_{t_{n}})|^{p} \Big]\le K_{p} N\varepsilon_{N}^{p/2} \end{aligned}$$

due to (a) and Lemma 3.8 below. Finally, applying Lemma A.2 to the above \(Z_{N}\) with \(p > 4\), we are done.  □

In the proof, we have used the following result, useful to justify the a.s. convergence to 0 of remainder terms.

Lemma 3.8

Let \(R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)\) be given by (3.8) and \(p\geq1\). Under the assumptions of Theorem 2.6, there exists a finite positive constant \(K_{p}\), depending on the coefficients \({\mu}\), \({\sigma}\), \(f\), \(u^{(n+1)}\) and their derivatives, such that

$$\begin{aligned} \mathbb{E}\Big[ \sup_{0\leq n\leq N- 1} \sup_{\theta\in[0,1]} | R^{ \varepsilon_{N}}_{\theta}({t_{n}}, X_{t_{n}})|^{p}\Big]\le K_{p} N \varepsilon_{N}^{p/2}. \end{aligned}$$

Proof

We first claim that we have the upper bound

$$ \mathbb{E}^{B}\Big[ \sup_{\theta\in[0,1]} | R^{\varepsilon _{N}}_{\theta}({t_{n}},\xi)|^{p} \Big]\le C_{n,N}(\xi) \varepsilon_{N}^{p/2} $$
(3.33)

for some constant \(C_{n,N}(\xi)\in C_{\mathrm{pol}}\). With this control at hand, we complete the proof by using the rough inequality

$$ \mathbb{E}\Big[ \sup_{0 \leq n \leq N - 1} \sup_{\theta\in[0,1]} | R^{\varepsilon_{N}}_{\theta}({t_{n}}, X_{t_{n}})|^{p} \Big] \leq \sum_{n = 0}^{N - 1} \mathbb{E}\Big[ \sup_{\theta\in[0,1]} | R^{ \varepsilon_{N}}_{\theta}({t_{n}}, X_{t_{n}})|^{p} \Big]. $$

So it is enough to show (3.33). For the control of \(R^{\varepsilon_{N}}_{\theta}({t_{n}},\xi)\), we follow the proof of Lemma 3.5 (b); the adaptation is straightforward, taking general \(p \geq 1\) instead of \(p = 2\). We then handle the supremum over \(\theta\) inside the expectation by using the BDG inequalities. The other arguments are unchanged, leading to the announced estimate. We leave the details to the reader. □

4 Numerical experiments

In this section, we compute a numerical approximation of \({ v}^{*}\) in (2.12), a solution to the \(f\)-PDE (1.5) with the optimal kernel \(f^{*}\) defined in (2.10). In Sect. 2.4, we have obtained a quasi-explicit formulation of the optimal kernel \(f^{*}_{\gamma}\) (see (2.18)) in the one-dimensional case. Therefore, we only perform numerical experiments in dimension \(d=1\), with the risk parameter \(\gamma\in\{0.0,0.1,0.2,0.3\}\). First, in Sect. 4.1, we present the numerical solution for a set of European options. Then, in Sect. 4.2, we compute the asymptotic risk \(\mathscr{R}_{\gamma}({ v}^{*},f)\) for different kernels \(f\in\{f^{*}_{0},f^{*}_{0.1},f^{*}_{0.2},f^{*}_{0.3}\}\), confirming the optimality of \(f^{*}_{\gamma}\). Finally, in Sect. 4.3, we compare numerically the solution to the \(f\)-PDE with the solution to the minimisation problem (1.3). We aim to check the conjecture that one can interchange the limit in \(N\) and the minimisation over strategies in our setting (see the diagram in Fig. 2). Equivalently, we verify whether the solution to the minimisation problem in discrete time (see (1.3)) corresponds, for \(N\) large, to the solution of the nonlinear \(f^{*}\)-PDE (1.5).

4.1 The \(f^{*}\)-PDE valuation for different options

Here we show the numerical solution to (2.12) for different option payoffs \(h\) (as terminal condition), under the assumption that the underlying process \(X\) satisfies the SDE (2.1) with \({\sigma}(t,x) = {\sigma}x\). We consider the value function \(U(t,x)\), the solution to the \(f\)-PDE valuation (written in forward form by reversing time)

$$ \frac{\partial U}{\partial t}(t,x)=\alpha(x){\partial_{x}^{2} U}(t,x) +f\big(2 \alpha(x){\partial_{x}^{2} U}(t,x)\big), \qquad(t,x)\in(0,T]\times{ \mathbb{R}}, $$
(4.1)

where \(\alpha(x)=\frac{1}{2}{\sigma}^{2} x^{2}\) and \(f:{\mathbb{R}}\to{\mathbb{R}}\) is a function to be chosen. Since (4.1) is of second order in space and of first order in time, its numerical resolution requires one initial condition and two boundary conditions. The payoff of European options with maturity \(T\), denoted by \(h(x)\), serves as initial condition for (4.1). We have chosen the following options:

(i) call option with payoff function \(h(x)= (x-K_{0} )^{+}\) and put option with payoff function \(h(x)= (K_{0}-x )^{+}\), where \(x\mapsto x^{+}=\max(x,0)\) and \(K_{0}\) is the strike price;

(ii) asset-or-nothing call option with payoff function \(h(x)=x\mathbf{I}_{\{x-K_{0}>0\}}\) and asset-or-nothing put option with \(h(x)=x\mathbf{I}_{\{x-K_{0}<0\}}\), where \(K_{0}\) is the strike price;

(iii) bull spread option with payoff function \(h(x)= (x-K_{1} )^{+}- (x-K_{2} )^{+}\) and bear spread option with \(h(x)= (K_{2}-x )^{+}- (K_{1}-x )^{+}\), where \(K_{1},K_{2}\) are strike prices with \(K_{2}>K_{1}\).

We examine the asset-or-nothing options because of their discontinuous payoffs, and the spread options because of their change of convexity. We are aware that these payoffs do not satisfy the assumptions of Theorem 2.6, but we believe that those hypotheses are only sufficient and that the previous asymptotic analysis also applies to such payoffs. The payoffs are collected in the sketch below.
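For reference, here is a minimal numpy sketch of the payoffs (i)–(iii), with strike defaults taken from the parameter sets below:

```python
import numpy as np

def call(x, K0=100.0):  return np.maximum(x - K0, 0.0)   # (x - K0)^+
def put(x, K0=100.0):   return np.maximum(K0 - x, 0.0)   # (K0 - x)^+
def aon_call(x, K0=100.0):  return x * (x > K0)          # asset-or-nothing call
def aon_put(x, K0=100.0):   return x * (x < K0)          # asset-or-nothing put
def bull_spread(x, K1=90.0, K2=110.0):  return call(x, K1) - call(x, K2)
def bear_spread(x, K1=90.0, K2=110.0):  return put(x, K2) - put(x, K1)
```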

In the following numerical examples, we consider the following parameters:

| Set | Strike | Volatility | Maturity |
| --- | --- | --- | --- |
| A (vanilla and digital) | \(K_{0}=100\) | \({\sigma}=0.3\) | \(T=1\) |
| B (spread) | \((K_{1},K_{2})=(90,110)\) | \({\sigma}=0.3\) | \(T=1\) |

Space discretisation

Here, we detail our numerical scheme. We look for a second-order accurate solution to the PDE (4.1) on a finite domain \(L=[0,x_{\max}]\). Let \(I\in{\mathbb{N}}\). We discretise \(L\) uniformly into \(I+1\) points \(\{ x_{0},x_{1},\ldots,x_{I-1},x_{I}\}\) such that \(\Delta x=x_{\max}/I\) and \(x_{i}=i\Delta x\) for \(0\le i\le I\). Assuming that \(U\) is smooth enough, we get the second-order approximation of the second derivative of \(U\) as

$$\begin{aligned} \frac{U_{i+1}-2U_{i}+U_{i-1}}{\Delta x^{2}} & ={\partial_{x}^{2} U} (x_{i})+ \mathbf{O}(\Delta x^{2}), \end{aligned}$$

for every \(1\le i\le I-1\), with \(U_{i}\) denoting \(U(x_{i})\). Thanks to the second-order approximation, we obtain from (4.1) a semi-discretisation

$$ {\partial_{t} U_{i}}= \alpha_{i} \frac{U_{i+1}-2U_{i}+U_{i-1}}{\Delta x^{2}} +f\bigg(2\alpha_{i} \frac{U_{i+1}-2U_{i}+U_{i-1}}{\Delta x^{2}}\bigg), $$
(4.2)

for every \(1\le i\le I-1\), where the factor \(\alpha_{i}\) is \(\alpha(x)\) evaluated in each \(x_{i}\). Assuming that \(f\) in (4.1) is Lipschitz-continuous, the system of equations (4.2) is a second-order approximation of the PDE (4.1) and can be viewed in matrix form as

$$ \frac{{\mathrm{d}}\mathbf{U}}{{\mathrm{d}}t}=A\mathbf {U}+f(2A\mathbf{U}), $$

where \(A\) is the coefficient matrix and \(\mathbf{U}\) the discrete solution. Besides the system in (4.2), \(\mathbf{U}\) satisfies \(U_{0}=b_{\min}\) and \(U_{I}=b_{\max}\), where \(b_{\min}\) and \(b_{\max}\) represent Dirichlet-type boundary conditions imposed on the numerical solution. Therefore, the matrix \(A\) is of the form

$$ A_{00}=0,\quad A_{i,i-1}=\alpha_{i}/\Delta x^{2},\quad A_{i,i}=-2 \alpha_{i}/\Delta x^{2},\quad A_{i,i+1}=\alpha_{i}/\Delta x^{2}, \quad A_{II}=0, $$

for \(1\le i\le I-1\).

After the space discretisation, there remains a system of ordinary differential equations

$$ \frac{{\mathrm{d}}\mathbf{U}}{{\mathrm{d}}t}=A\mathbf{U}+\mathbf {F},\qquad \mathbf{U}(0)=h,\qquad\mathbf{F}:=f(2A\mathbf{U}). $$
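As an illustration, here is a minimal numpy sketch of this semi-discretisation with the parameters of Sect. 4.1. The kernel `f` below is only a neutral placeholder: the optimal kernel \(f^{*}_{\gamma}\) of (2.18), with the constants \(c_{1}^{*},c_{2}^{*}\) of Table 1, should be plugged in instead.

```python
import numpy as np

x_max, I = 400.0, 200                 # domain [0, x_max], I+1 grid points
sigma, K0 = 0.3, 100.0
dx = x_max / I
x = np.linspace(0.0, x_max, I + 1)    # x_i = i * dx
alpha = 0.5 * sigma**2 * x**2         # alpha(x) = sigma^2 x^2 / 2

# Tridiagonal matrix A of (4.2); the zero first and last rows correspond
# to A_00 = A_II = 0, so the boundary nodes carry no dynamics.
A = np.zeros((I + 1, I + 1))
for i in range(1, I):
    A[i, i - 1] = alpha[i] / dx**2
    A[i, i] = -2.0 * alpha[i] / dx**2
    A[i, i + 1] = alpha[i] / dx**2

U0 = np.maximum(x - K0, 0.0)          # call payoff as initial condition

def f(y):
    # Placeholder kernel: with f = 0 the scheme reduces to the linear
    # (gamma = 0) valuation PDE; replace by f*_gamma from (2.18).
    return np.zeros_like(y)
```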

Time discretisation

Now we apply a second-order method in time. Let \(J\in{\mathbb{N}}\). We divide the time interval \([0,T]\) into \(J\) subintervals with constant time step \(\Delta t=T/J\).

Denote by \(\mathbf{U}^{j}\) (resp. \(\mathbf{F}^{j}\)) the vector \(\mathbf{U}\) (resp. \(\mathbf{F}\)) evaluated at \(t=j\Delta t\). Because \(\mathbf{F}\) depends nonlinearly on \(\mathbf{U}\), we combine Adams–Moulton (AM) methods with Adams–Bashforth (AB) methods to construct a predictor–corrector algorithm with AM and AB of the same order. Here we apply the second-order Adams–Bashforth (AB2) method to predict \(\mathbf{F}^{j+1}\), and we use the predicted value \({\bar{F}}^{j+1}\) within the second-order Adams–Moulton (AM2) method:

1) We predict \((\mathbf{U}^{j+1},\mathbf{F}^{j+1})\) with AB2 which gives us

$$\begin{aligned} {\bar{F}}^{j+1} =f (2A{\bar{U}}^{j+1} ), \quad{\bar{U}}^{j+1} =\mathbf{U}^{j}+\Delta t\bigg(\frac{3}{2} (A\mathbf{U}^{j}+\mathbf{F}^{j} )-\frac{1}{2}(A\mathbf{U}^{j-1}+\mathbf{F}^{j-1} )\bigg). \end{aligned}$$

2) We correct \((\mathbf{U}^{j+1},\mathbf{F}^{j+1})\) with AM2 which gives us

$$\begin{aligned} \mathbf{F}^{j+1} =f (2A\mathbf{U}^{j+1} ), \quad\mathbf{U}^{j+1} = \mathbf{U}^{j}+\Delta t\bigg(\frac{1}{2}(A\mathbf{U}^{j+1}+ {\bar{F}}^{j+1} )+\frac{1}{2}(A\mathbf{U}^{j}+\mathbf{F}^{j} ) \bigg). \end{aligned}$$

Here, \(f\) is computed as the optimal kernel \(f^{*}_{\gamma}\) given in (2.18); the constants \(c_{1}^{*}\) and \(c_{2}^{*}\), computed with a root-finding algorithm, are reported in Table 1. Since the scheme looks two steps back, we need an initialisation step. Therefore we use the AB1 (forward Euler) and the AM1 (backward Euler) methods for the prediction and the correction part, respectively, so that

$$ \begin{aligned} {\bar{U}}^{1} & =\mathbf{U}^{0}+\Delta t (A\mathbf{U}^{0}+ \mathbf{F}^{0} ), &\quad\mathbf{F}^{0} &=f (2A\mathbf{U}^{0} ), \\ \mathbf{U}^{1} & =\mathbf{U}^{0}+\Delta t (A\mathbf{U}^{1}+ {\bar{F}}^{1} ), &\quad{\bar{F}}^{1} &=f (2A{\bar {U}}^{1} ). \end{aligned} $$

Initial boundary conditions

Regarding the boundary conditions, we work on the space domain \(L=[0,x_{\max}]\), where \(x_{\max}\) is taken large enough. We use Dirichlet boundary conditions \(U(t,0)=b_{\min}(t)\) and \(U(t,x_{\max})=b_{\max}(t)\) for all \(t\in[0,T]\), with the left boundary value \(b_{\min}(t)=h(0)\) and the right boundary value \(b_{\max}(t)=h(x_{\max})\). For the numerical solution, we fix \(x_{\max} = 400\), \(I=200\) and \(J=200\).
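Putting the pieces together, a sketch of the full time stepping, continuing the space-discretisation sketch above (`A`, `f` and `U0` are the objects built there); re-imposing the Dirichlet values after each step is our implementation choice for the boundary treatment:

```python
import numpy as np

def evolve(A, f, U0, T=1.0, J=200):
    """AB2/AM2 predictor-corrector for dU/dt = A U + f(2 A U), the linear
    part A U^{j+1} treated implicitly and the source through the predicted
    F_bar; one AB1/AM1 (forward/backward Euler) step initialises."""
    dt = T / J
    n = U0.size
    M1 = np.eye(n) - dt * A          # AM1 (backward Euler) system matrix
    M2 = np.eye(n) - 0.5 * dt * A    # AM2 (trapezoidal) system matrix
    b_min, b_max = U0[0], U0[-1]     # Dirichlet values h(0), h(x_max)

    def clamp(U):
        U[0], U[-1] = b_min, b_max   # re-impose the boundary conditions
        return U

    U_prev = U0.copy()
    F_prev = f(2.0 * A @ U_prev)
    # Initialisation: AB1 predictor, AM1 corrector.
    U_bar = U_prev + dt * (A @ U_prev + F_prev)
    F_bar = f(2.0 * A @ U_bar)
    U = clamp(np.linalg.solve(M1, U_prev + dt * F_bar))
    F = f(2.0 * A @ U)
    for _ in range(J - 1):
        # Predict with AB2 ...
        U_bar = U + dt * (1.5 * (A @ U + F) - 0.5 * (A @ U_prev + F_prev))
        F_bar = f(2.0 * A @ U_bar)
        # ... and correct with AM2, implicit in A U^{j+1}.
        U_prev, F_prev = U, F
        U = clamp(np.linalg.solve(M2, U + 0.5 * dt * (F_bar + A @ U + F)))
        F = f(2.0 * A @ U)
    return U
```

For example, `U_T = evolve(A, f, U0)` returns the approximation of \(U(T,\cdot)\), i.e., of the valuation \({ v}^{*}(0,\cdot)\).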

In Fig. 5, we show the vanilla option values for different risk parameters \(\gamma\); analogous plots for digital and spread options are given in Figs. 6 and 7, respectively. We observe that the numerical solutions are increasing as functions of \(\gamma\). Intuitively, as the seller's risk aversion increases, it is natural that he asks for a higher option price. According to Proposition 2.8, \(y\mapsto f^{*}_{\gamma}(y)\) is nonnegative for all \(\gamma\in(0,1)\). Therefore, the nonlinear source term of the PDE (4.1) is nonnegative irrespective of the sign of the second derivative. Our risk-averse valuation thus adds a risk premium to the risk-neutral one whenever the underlying price varies too quickly, i.e., proportionally to the Greek gamma.

Fig. 5

Vanilla options: numerical approximation of \(f^{*}\)-PDE solution \(U\) at final time (i.e., initial time \(t=0\) for the solution to (2.12)) for different \(\gamma\)

Fig. 6

Digital options: numerical approximation of \(f^{*}\)-PDE solution \(U\) at final time (initial time for the solution to (2.12)) for different \(\gamma\)

Fig. 7

Spread options: numerical approximation of \(f^{*}\)-PDE solution \(U\) at final time (initial time for the solution to (2.12)) for different \(\gamma\)

4.2 The asymptotic risk \(\mathscr{R}_{\gamma}({ v},f)\) for different kernels \(f\)

Here we test the asymptotic risk \(\mathscr{R}_{\gamma}({ v},f)\) (see (2.6)) for different kernels \(f\), given a reference valuation \({ v}\). Consider \({ v}^{*}_{\gamma}(t,\cdot)=U(T-t,\cdot)\), where \(U\) is the solution to the PDE (4.1) (in forward form) using the optimal kernel \(f^{*}_{\gamma}\) given by Proposition 2.8. Then we confirm numerically the optimality of \(f^{*}_{\gamma}\) for the reference valuation \({ v}^{*}_{\gamma}\) by computing \(\mathscr{R}_{\gamma}({ v}^{*}_{\gamma},f^{*}_{\gamma'})\) for a different \(\gamma'\); recall that the optimality in Propositions 2.7 and 2.8 is in the sense of

$$\begin{aligned} \mathscr{R}_{\gamma}({ v},f^{*}_{\gamma})\leq\mathscr{R}_{\gamma}({ v},f) \end{aligned}$$
(4.3)

for any \({ v}, f\), and in particular for \({ v}={ v}^{*}_{\gamma}\) and \(f=f^{*}_{\gamma'}\). To this end, we approximate \(\mathscr{R}_{\gamma}({ v}^{*}_{\gamma},f^{*}_{\gamma'})\) by forward Monte Carlo simulations of \(X\). In addition, we use the numerical PDE solution to compute the partial derivatives of \({ v}^{*}_{\gamma}\). We denote the resulting estimate by \(\hat{\mathscr{R}}_{N,M}(\gamma,\gamma')\), where \(N\) is the number of time steps and \(M\) is the number of paths \(( X_{t_{n}})_{n=0}^{N}\).
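To fix ideas, here is a possible Monte Carlo sketch of \(\hat{\mathscr{R}}_{N,M}\) under the Black–Scholes dynamics \({\mathrm{d}}X_{t}={\sigma}X_{t}\,{\mathrm{d}}W_{t}\). The normalisation \(\varepsilon_{N}^{-1}\sum_{n}\mathbb{E}[\ell_{\gamma}({\mathcal{E}}_{n})]\) is an assumption of this sketch (the exact normalisation is the one of (2.6)), and `v`, `dv` stand for the valuation and its delta interpolated from the PDE grid.

```python
import numpy as np

def ell(y, gamma):
    # Risk function (1.2): (1 + gamma * Sgn(y))^2 * y^2 / 2.
    return 0.5 * (1.0 + gamma * np.sign(y)) ** 2 * y ** 2

def estimate_risk(v, dv, gamma, sigma=0.3, X0=100.0, T=1.0, N=20,
                  M=500_000, seed=0):
    # v(t, x), dv(t, x): vectorised valuation and delta, assumed to be
    # interpolated from the numerical PDE solution of Sect. 4.1.
    rng = np.random.default_rng(seed)
    eps = T / N
    X = np.full(M, X0)
    total = 0.0
    for n in range(N):
        Z = rng.standard_normal(M)
        # Exact lognormal step of the driftless Black-Scholes SDE.
        X_next = X * np.exp(-0.5 * sigma**2 * eps + sigma * np.sqrt(eps) * Z)
        pnl = (v((n + 1) * eps, X_next) - v(n * eps, X)
               - dv(n * eps, X) * (X_next - X))   # local P&L as in (1.1)
        total += ell(pnl, gamma).mean()
        X = X_next
    return total / eps   # assumed normalisation eps^{-1} sum_n E[ell(E_n)]
```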

Set \({\sigma}=0.3\), \(N=20\), \(M=5\times10^{5}\) and \(X_{0}\in\{90,110\}\). The number of time steps used in the PDE resolution between each time step of the MC algorithm is 50. We study the following options:

(i) call option with \(K_{0}=100\) and \(T=1\);

(ii) bear spread option with \(K_{1}=80\), \(K_{2}=120\) and \(T=1\).

Let \(\gamma\in[0,1)\) and \(X_{0}\in{\mathbb{R}}_{+}\) be fixed. Thanks to Theorem 2.6 and Proposition 2.7 (see also (4.3)), we expect the minimum of \(\hat{\mathscr{R}}_{N,M}(\gamma,\gamma')\) over \(\gamma'\) to be attained at \(\gamma'=\gamma\). In Table 2, we report the numerical approximation \(\hat{\mathscr{R}}_{N,M}(\gamma,\gamma')\) for \((\gamma,\gamma')\in\{0.0,0.1,0.2,0.3\}^{2}\) to verify this claim.

Table 2 Asymptotic risk estimate \(\hat{\mathscr{R}}_{N,M}\) for \(N=20\) and \(M=5\times10^{5}\)

4.3 The \(f^{*}\)-PDE valuation/hedging rule and the discrete-time problem solution

Let \(U_{\gamma}\) be the solution to the forward \(f^{*}_{\gamma}\)-PDE (4.1). Here, we compare the \(f^{*}_{\gamma}\)-PDE valuation/hedging rule \(\varphi^{*}_{\gamma}(t,\cdot)=(U_{\gamma}(T-t,\cdot), {\partial_{x} U_{\gamma}}(T-t,\cdot))\) for \(t\in[0,T]\) and the discrete-time problem solution \(\varphi^{N}_{\gamma}({t_{n}},\cdot)=(V^{N}_{\gamma}({t_{n}},\cdot), \delta^{N}_{\gamma}({t_{n}},\cdot))\) for \(0\le n\le N\), where \(N\) is the number of hedging times. We approximate \(\varphi^{N}_{\gamma}\) by \(\varphi^{N,M}_{\gamma}\) by using a regression Monte Carlo (RMC) algorithm, where \(M\) is the number of Monte Carlo paths.

RMC algorithm

Here we present our RMC algorithm, which is a variation of the hedged Monte Carlo algorithm (proposed by Potters et al. [22]) with a fixed point stage. We determine the option value by working step by step from \(T=N\Delta t\) back to the present \(t=0\), where \(\Delta t\) is the time step. We denote the underlying asset price \(X\) at time \({t_{n}}=n\Delta t\) by \(X_{n}\); the option value \({\mathcal{V}}_{n}(X_{n})\) at time \({t_{n}}\) depends only on the current asset price \(X_{n}\). We introduce the hedge \(\delta_{n}(X_{n})\), which is the amount of the underlying asset held in the portfolio at time \({t_{n}}\) when the asset price is \(X_{n}\).

The average risk, over all paths of the underlying process, is given by

$$ {\mathcal{R}}_{n}=\big\langle \ell_{\gamma}\big( {\mathcal {V}}_{n+1}(X_{n+1})-{ \mathcal{V}}_{n}(X_{n})-\delta_{n}(X_{n}) (X_{n+1}-X_{n} )\big) \big\rangle _{M}, $$

where the angle brackets \(\langle\cdots\rangle_{M} \) denote the average over the sampled asset values. The functional minimisation of \({\mathcal{R}}_{n}\) with respect to \({\mathcal{V}}_{n}(X_{n})\) and \(\delta_{n}(X_{n})\) yields equations that determine the option value and the hedge, provided that \({\mathcal{V}}_{n+1}\) is known. We generate a set of \(M\) paths \(X_{n}^{m}\), where \(n\) is the time index and \(m\) the path index. We decompose \({\mathcal{V}}_{n}\) and \(\delta_{n}\) over a set of \(K\) basis functions \(L_{k}^{n}\) and \(C_{k}^{n}\); the use of local basis functions in RMC is presented in [13]. We choose \(L_{k}^{n}\) and \(C_{k}^{n}\) as piecewise linear and piecewise constant functions, respectively, on each cell of a partition of the real line. In addition, we use adaptive breakpoints as proposed by Bouchard and Warin in [5], i.e.,

$$ {\mathcal{V}}_{n}^{K}(x):=\sum_{k=1}^{K}a_{k}^{n}L_{k}^{n}(x), \qquad\delta_{n}^{K}(x):=\sum_{k=1}^{K}b_{k}^{n}C_{k}^{n}(x). $$

In other words, we reduce the original functional optimisation problem (find the functions \({\mathcal{V}}_{n}\) and \(\delta_{n}\)) to a numerical optimisation (find the coefficients \(a_{k}^{n}\) and \(b_{k}^{n}\)). We obtain a good approximation of the true functional solution provided that \(K\) is large enough. We then solve \(N\) minimisation problems backward in time from the maturity \(T\), where \({\mathcal{V}}_{N}(x)\) is equal to the payoff function \(h\). For each step \(n\), we minimise

$$ \frac{1}{M}\sum_{m=1}^{M}\ell_{\gamma}\big( {\mathcal{E}}_{n,m}^{K} ({\mathcal{V}}_{n+1},a^{n},b^{n} )\big), $$

where

$$ {\mathcal{E}}_{n,m}^{K} ({\mathcal{V}},a,b ) :={\mathcal{V}}(X_{n+1}^{m})- \sum_{k=1}^{K}a_{k}L_{k}^{n}(X_{n}^{m})-\sum _{k=1}^{K}b_{k}C_{k}^{n}(X_{n}^{m}) (X_{n+1}^{m}-X_{n}^{m} ). $$

Thanks to the choice of the risk function \(\ell\), we can write \(\ell_{\gamma}(y)= \big(y w_{\gamma}(y)\big)^{2}/2\) with the weight function \(w_{\gamma}(y)=1+\gamma\operatorname{Sgn}(y)\); the constant factor \(1/2\) is irrelevant for the minimisation and is dropped below. Then for each \(n\in\{N-1,\ldots,0\}\), we solve, starting from the quadratic optimal solution, the fixed point problem

$$\begin{aligned} (a^{n,0},b^{n,0} ) & :=\underset{(a,b)}{\operatorname{argmin}} \frac{1}{M}\sum_{m=1}^{M}\!\big( {\mathcal{E}}_{n,m}^{K} ({ \mathcal{V}}_{n+1},a,b )\big)^{2}, \\ (a^{n,p+1},b^{n,p+1} ) & :=\underset{(a,b)}{\operatorname{argmin}} \frac{1}{M}\sum_{m=1}^{M}\!\Big( {\mathcal{E}}_{n,m}^{K} ({ \mathcal{V}}_{n+1},a,b ) w_{\gamma}\big({\mathcal{E}}_{n,m}^{K} ({ \mathcal{V}}_{n+1},a^{n,p},b^{n,p} )\big) \Big)^{2} \end{aligned}$$

for every \(p\in\{0,\ldots,P-1\}\), where \({\mathcal{V}}_{n+1}={\mathcal{V}}_{n+1}^{K,P}\),

$$ {\mathcal{V}}_{n+1}^{K,P}:=\sum_{k=1}^{K}a_{k}^{n+1,P}L_{k}^{n+1}, \qquad\delta_{n+1}^{K,P}:=\sum_{k=1}^{K}b_{k}^{n+1,P}C_{k}^{n+1}. $$

The weighted least squares problems are solved using standard procedures. From a practical point of view, we have used the C++ library StOpt (see the documentation by Gevret et al. [12]) to implement the above RMC with local basis functions and adaptive breakpoints. Even though we do not establish any theoretical convergence result, the previous algorithm is strongly related to an RMC method for computing generalised BSDEs proposed by Gobet et al. [15, 17]. In the following, we denote the optimal strategy \(({\mathcal{V}}_{n}^{K,P}(\cdot),\delta_{n}^{K,P}(\cdot))\) by \(\varphi^{N,M}_{\gamma}({t_{n}},\cdot)\).
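For concreteness, here is a schematic numpy version of one backward step of this fixed-point scheme. It substitutes a global polynomial basis for the local piecewise-linear/piecewise-constant bases with adaptive breakpoints handled by StOpt, so it is only a simplified sketch, not the implementation used for the figures.

```python
import numpy as np

def backward_step(X_n, X_np1, V_np1, gamma, deg=3, P=20):
    # One backward step of the RMC fixed point; V_np1 holds the next-date
    # values V_{n+1}(X_{n+1}^m) on the M simulated paths.
    L = np.vander(X_n, deg + 1)                    # basis for the value V_n
    D = np.vander(X_n, deg + 1) * (X_np1 - X_n)[:, None]  # hedge * increment
    Phi = np.hstack([L, D])                        # joint design matrix
    # p = 0: unweighted (quadratic-risk) least squares solution.
    coef = np.linalg.lstsq(Phi, V_np1, rcond=None)[0]
    for _ in range(P):
        resid = V_np1 - Phi @ coef                 # E_{n,m} at previous iterate
        w = 1.0 + gamma * np.sign(resid)           # weights w_gamma(E_{n,m})
        # Weighted least squares: minimise sum_m (w_m * residual_m)^2.
        coef = np.linalg.lstsq(Phi * w[:, None], V_np1 * w, rcond=None)[0]
    a, b = coef[:deg + 1], coef[deg + 1:]
    return L @ a, np.vander(X_n, deg + 1) @ b      # V_n, delta_n on the paths
```

Iterating this step backward from the terminal values \(h(X_{N}^{m})\) yields the estimates \(({\mathcal{V}}_{n}^{K,P},\delta_{n}^{K,P})\) for all \(n\).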

Set of parameters

Regarding the RMC algorithm, we set \(M=8\times10^{5}\), \(N=40\), \(K=80\) and \(P=20\). For the underlying process, we set \({\sigma}=0.3\), \(X_{0} = 100\) and \(T= 1\). Here, we compare the optimal valuation/hedging rule \(\varphi^{*}_{\gamma}(t,\cdot)\) and the discrete-time problem solution \(\varphi^{N}_{\gamma}({t_{n}},\cdot)\) for a call option with strike \(K_{0}=100\) and a bear spread option with strikes \(K_{1}=80\), \(K_{2}=120\).

Thanks to the previous algorithm, we compute the option value \(V^{N,M}_{\gamma}(t_{n},\cdot)\). From the finite difference scheme in Sect. 4.1, we have the value function \(U_{\gamma}(T-t_{n},\cdot)\). Here we consider \(\gamma\in\{0.0,0.1,0.2,0.3\}\) and \(t_{n}\in\{0.1,0.3\}\). In Fig. 8, we present the relative error \(V^{N,M}_{\gamma}(t_{n},\cdot)/U_{\gamma}(T-t_{n},\cdot)-1\) for a call option. We show analogous plots for a bear spread option in Fig. 9. We observe that the relative errors seem to confirm numerically the conjecture (1.8): the optimal price in discrete time for a large number of hedging times coincides asymptotically with the \(f^{*}\)-PDE solution.

Fig. 8

Relative error \(V^{N,M}_{\gamma}(t_{n},\cdot)/U_{\gamma}(T-t_{n},\cdot)-1\) for a call option

Fig. 9

Relative error \(V^{N,M}_{\gamma}(t_{n},\cdot)/U_{\gamma}(T-t_{n},\cdot)-1\) for a bear option