
1 Introduction

A Lévy process X(t) has the structure \(X(t)=at+\sigma W(t)+J(t)\) where W(t) is standard Brownian motion (BM) and J(t) an independent pure jump process (see further below). This class of processes has been used in numerous application areas, of which we in particular mention finance [14, 35] and queueing [15].

Calculations for a Lévy process are, however, in general more difficult than for BM, and an abundance of expressions that are explicit for BM are not so even in the most popular parametric Lévy models. Simulation of X(t) is therefore one of the main computational tools. For example in finance, it is most often the simplest vehicle for evaluating option prices of the form

$$\begin{aligned} {{\mathbb {E}}}\Psi \bigl (X(0:T)\bigr ) \end{aligned}$$
(1)

where T is the maturity time, X(0 : T) stands for the whole path \(\bigl \{X(t)\bigr \}_{0\le t\le T}\), and \(\Psi \) is a suitable path functional.

The simulation of the \(at+\sigma W(t)\) component is straightforward, so we assume that X(t) is a pure jump process. The main characteristic of such a process is its Lévy measure \(\nu \), which with a few exceptions we throughout assume absolutely continuous with density \(n(x)\ge 0\). Conditions needed on \(\nu \) are \(\int _{|x|>\varepsilon } \nu ({\textrm{d}} x)<\infty \) and \(\int _{|x|\le \varepsilon } x^2\nu ({\textrm{d}} x)<\infty \) for some (and then all) \(\varepsilon >0\). The process is said to have finite activity if \(\lambda =\int _{{\mathbb {R}}} \nu ({\textrm{d}} x)<\infty \) and is then a compound Poisson process with Poisson rate \(\lambda \) and density \(n(x)/\lambda \) of the jumps. Sample paths of X(t) are of finite variation if and only if \(\int _{|x|\le \varepsilon } x\nu ({\textrm{d}} x)<\infty \). The picture is roughly that jumps in \([x,x+{\textrm{d}} x)\) occur at Poisson rate \(n(x)\,{\textrm{d}} x\) and independently for different values of x. In the infinite variation case, X(t) is, however, only completely specified by n(x) up to a drift term (see further Sect. 2). The process is called spectrally positive or negative if \(n(x)\equiv 0\) for \(x<0\), resp. \(x>0\); otherwise, we refer to it as two-sided. In finance, the most popular classes of jump processes are the NIG (Normal Inverse Gaussian), tempered stable (TS or CGMY), VG (Variance Gamma) and Meixner ones, and we survey these in Sect. 3.

Exact simulation of the whole path X(0 : T) is obviously impossible due to the presence of infinitely many jumps of the process. One could hope that one can perform exact simulation of X(T) for any given T and thereby a discrete skeleton \(X(h),X(2h),\ldots \) for any h. As surveyed briefly in Sect. 7, this is simple for VG, with a little added effort also possible for NIG and CGMY with finite variation, and presumably possible but quite tedious for Meixner. In general, this is however not feasible and we focus on two approximative alternatives. They both consist in simulating the finite number of jumps which are in some sense “big” as a compound Poisson process, and replacing the infinity of the remaining “small” ones with an easily simulated approximation. The path X(0 : T) can then be obtained by assigning i.i.d. uniform [0, T] locations to the jumps and possibly filling in some information provided by the particular form of the approximation. The first of these approaches is classical and widely applied, and simply defines the big jumps as those of absolute value \(>\varepsilon \); we refer to this as the \(\varepsilon \)-algorithm. These jumps are those coming from the part of \(\nu \) concentrated on \(\{|x|>\varepsilon \}\). By definition, this is a finite measure and so the corresponding contribution to X can be simulated as a compound Poisson process. The second approach, which does not appear to have been considered in the simulation literature, relies on a completely monotone (CM) structure

$$\begin{aligned} n(x)\ =\ \int _0^\infty {\textrm{e}}^{-xt}\, V({\textrm{d}} t)\ =\ \int _0^\infty {\textrm{e}}^{-xt}v(t)\, {\textrm{d}} t\end{aligned}$$
(2)

of the Lévy density where V is a Radon measure with density v. This holds in many main examples and represents the jumps as an infinite mixture of exponential(t) jumps with the rate t having weight v(t)/t (see further Sect. 5). The compound Poisson part is then obtained by restricting V to (0, A) for some \(A<\infty \), meaning that exponential jumps with mean \(< 1/A\) are left out. We refer to the method as the CM-algorithm. In both approaches, the computational effort as measured by the Poisson mean goes to infinity as \(\varepsilon \rightarrow 0\), resp. \(A\rightarrow \infty \). As for the approximation of the small jump part, the standard choice in the \(\varepsilon \)-algorithm is a normal with the same mean and variance and is substantiated in [6] by a limit result as \(\varepsilon \rightarrow 0\) (further relevant references pertaining to this are [16, 37]). However, we shall also consider gamma alternatives in 2–3 variants and illustrate by examples that these perform at least as well, in some cases even convincingly better. Doing so, our point of view is largely empirical: for the practitioner, comparison of approaches as \(\varepsilon \rightarrow 0\) matters less than performance for \(\varepsilon \) so moderate that the computational effort is within reach. As \(\varepsilon \rightarrow 0\), the small jumps contribute less, and hence limit results become less relevant. Similar remarks apply to the CM-algorithm. We also point out that in some types of applications, the approximation of the small jumps need not necessarily be simulated, but instead it may be used via conditional Monte Carlo for providing smooth density estimates and variance reduction.

2 Lévy Processes

For the general theory of Lévy processes, see e.g. [33] and [10]. A jump process is constructed from a Poisson random measure \(L({\textrm{d}} t,{\textrm{d}} x)\) on \((0,\infty )\times {\mathbb {R}}{\setminus }\{0\}\) with intensity measure \({\textrm{d}} t\otimes \nu ({\textrm{d}} x)\). In the finite variation case \(\int _{|x|\le 1} |x|\,\nu ({\textrm{d}} x)<\infty \), one has

$$\begin{aligned} X(t)\ {}&=\ \int _{s\le t,\,x\in {\mathbb {R}}} x\,L({\textrm{d}} s,{\textrm{d}} x)\,,\quad \kappa (\theta )\ =\ \int _{-\infty }^\infty \bigl ({\textrm{e}}^{\theta x}-1\bigr )\,\nu ({\textrm{d}} x) \end{aligned}$$
(3)

at least for \(\Re (\theta )=0\) and in our examples in a strip containing the imaginary axis. Here \(\kappa (\theta )=\log {{\mathbb {E}}}{\textrm{e}}^{\theta X(1)}\) is the so-called Lévy exponent or cumulant function. In the infinite variation case, there are too many small jumps for these integrals to converge. Instead, so-called compensation is needed and consists in appropriate centerings and limits. Traditionally, jumps of absolute size \(<1\) are centered, which leads to

$$\begin{aligned} X(t)\ {}&=\ at+\lim _{{\varepsilon \rightarrow 0}}\left\{ \int _{{s\le t,\,\varepsilon<|x|<\infty }} x\,L({\textrm{d}} s,{\textrm{d}} x)-t\int _{{\varepsilon <|x|\le 1}}x\,\nu ({\textrm{d}} x)\right\} \,,\end{aligned}$$
(4a)
$$\begin{aligned} \kappa (\theta )\ {}&=\ a\theta +\int _{{-\infty }}^{\infty } \bigl ({\textrm{e}}^{{\theta x}}-1-\theta x{\mathbb {I}}(|x|\le 1)\bigr )\,\nu ({\textrm{d}} x) \end{aligned}$$
(4b)

for some a. Obviously, taking 1 as truncation point is arbitrary, and other choices lead to different values of a. If the mean \({{\mathbb {E}}} X(1)=\kappa '(0)\) is finite, it may be more convenient to center all jumps, and one then has

$$\begin{aligned} X(t)\ {}&=\ t\kappa '(0)+\lim _{\varepsilon \rightarrow 0}\left\{ \int _{s\le t,\,|x|>\varepsilon } x\,L({\textrm{d}} s,{\textrm{d}} x)-t\int _{|x|>\varepsilon }x\,\nu ({\textrm{d}} x)\right\} \,,\end{aligned}$$
(5a)
$$\begin{aligned} \kappa (\theta )\ {}&=\ \theta \kappa '(0)+\int _{-\infty }^\infty \bigl ({\textrm{e}}^{\theta x}-1-\theta x\bigr )\,\nu ({\textrm{d}} x)\,. \end{aligned}$$
(5b)

The cumulants \(\kappa _k\) of X(1) are given as the kth derivatives \(\kappa ^{(k)}(0)\) of \(\kappa (\theta )\) at \(\theta =0\). In particular, \(\kappa _1={{\mathbb {E}}} X(1)\), \(\kappa _2={\mathbb {V}ar} X(1)\), and the skewness and (excess) kurtosis are \(\kappa _3/\kappa _2^{3/2}\), resp. \(\kappa _4/\kappa _2^2\). For \(k\ge 2\), one alternatively has

$$\begin{aligned} \kappa _k\ =\ \int _{-\infty }^\infty x^k\,\nu ({\textrm{d}} x), \end{aligned}$$
(6)

and this expression is also valid for \(k=1\) in the finite variation case.

3 Main Examples

In the absolutely continuous case, define the Lévy density \(n(x)= {\textrm{d}} \nu (x)/{\textrm{d}} x\) as the density of the Lévy measure w.r.t. Lebesgue measure.

The NIG process [9] has parameters \(\alpha ,\delta >0\), \(\beta \in (-\alpha ,\alpha )\) and \(\mu \in {\mathbb {R}}\). The Lévy density is

$$\begin{aligned} n(x)\ =\ \frac{\alpha \delta }{\pi |x|}K_1\bigl (\alpha |x|\bigr ) {\textrm{e}}^{\beta x},\ \ x\in {\mathbb {R}}, \end{aligned}$$
(7)

where as usual \(K_1(z)\) denotes the modified Bessel function of the third kind with index 1. The cumulant function and the density of X(1) are, respectively,

$$\begin{aligned}\begin{gathered} \kappa (s)\ =\ \mu s+\delta \left( \sqrt{\alpha ^2-\beta ^2}-\sqrt{\alpha ^2-(\beta +s)^2}\right) \,,\ {-\alpha -\beta<\Re (s)<\alpha -\beta }\,, \\ \frac{\alpha \delta }{\pi }\exp \bigl \{\delta \sqrt{\alpha ^2-\beta ^2} +\beta (x-\mu )\bigr \}\frac{K_1\bigl (\alpha \sqrt{\delta ^2+(x-\mu )^2}\bigr )}{\sqrt{\delta ^2+(x-\mu )^2}}\,. \end{gathered}\end{aligned}$$

The Meixner (MX) process [18, 28, 35] has parameters \(a,d>0\), \(b\in (-\pi ,\pi )\) and \(m\in {\mathbb {R}}\). The Lévy density is

$$\begin{aligned} n(x)\ =\ d \frac{\exp \{bx/a\}}{x\sinh (\pi x/a)}\ =\ 2d \frac{\exp \bigl \{bx/a-\pi |x|/a\bigr \}}{|x|\bigl (1-\exp \{-2\pi |x|/a\}\bigr )}\,. \end{aligned}$$
(8)

The cumulant function and the density of X(1) are, respectively,

$$\begin{aligned} \nonumber \kappa (s)\ =\ 2d\log \bigl (\cos (b/2)\bigr )-2d\log \bigl (\cos ((as+b)/2)\bigr )+ms\,, \ \ {-\tfrac{\pi +b}{a}<\Re (s)<\tfrac{\pi -b}{a}}\,,\\ \frac{(2\cos (b/2))^{2d}}{2a\pi \Gamma (2d)}{\textrm{e}}^{b(x-m)/a}\bigl |\Gamma (d+{\textrm{i}}(x-m)/a)\bigr |^2\,. \end{aligned}$$
(9)

For the tempered stable (TS) process [3, 12, 24]

$$\begin{aligned} n(x)\ =\ \delta _\pm {\textrm{e}}^{-\beta _\pm |x|}/|x|^{\alpha _\pm +1} \end{aligned}$$
(10)

where \(\delta _+,\beta _+\) are for \(x>0\) and \(\delta _-,\beta _-\) for \(x<0\). When \(\delta _+=\delta _-\), \(\alpha _+=\alpha _-\), the TS process goes under the acronym CGMY process in particular in finance, where the traditional notation is \(\delta _+=\delta _-=C\), \(\alpha _+=\alpha _-=Y\), G instead of \(\beta _-\) and M instead of \(\beta _+\). Cf. the author names in [12]! In terms of the positive jumps, \(\alpha _+<0\) corresponds to a compound Poisson process, \(\alpha _+=0\) to a gamma process where X(1) is gamma distributed with shape parameter \(\delta _+\) and rate parameter \(\beta _+\), \(0<\alpha _+<1\) to infinite activity but finite variation, and \(1\le \alpha _+<2\) to infinite variation. The cumulant function is

$$\begin{aligned} \kappa (s)\ =\ \delta _-\Gamma (-\alpha _-)\bigl ((\beta _-+s)^{\alpha _-}-\beta _-^{\alpha _-}\bigr )+ \delta _+\Gamma (-\alpha _+)\bigl ((\beta _+-s)^{\alpha _+}-\beta _+^{\alpha _+}\bigr )\,, \end{aligned}$$
(11)

\(-\beta _-<\Re (s)<\beta _+\). Here and at other places in the theory, exceptions apply when \(\alpha _+\) or \(\alpha _-\) or both equals 0 or 1. The case \(\alpha _+=\alpha _-=0\) is the VG process (the difference between two gamma processes).

Starting from [12, 13], the density of X(1) in the TS process has traditionally been computed by Fourier inversion via (11). However, it is pointed out in [3] that the density can be expressed as

$$\begin{aligned} f(x)\ =\ \exp \{-\beta x-\delta \Gamma (-\alpha )\beta ^\alpha \}f_0(x) \end{aligned}$$
(12)

where \(f_0\) is the density of a strictly \(\alpha \)-stable \(S_\alpha (\sigma ,1,0)\) distribution with \(\sigma =\bigl (-\delta \Gamma (-\alpha )\cos (\pi \alpha /2)\bigr )^{1/\alpha }\). See also [27, 30]. Given the availability of software for stable distributions, (12) provides an easy approach to numerical computations.
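For concreteness, a minimal sketch of such a computation (in Python; the function name is ours, and it is assumed that SciPy's levy_stable routine in its "S1" parameterization matches the \(S_\alpha (\sigma ,1,0)\) convention used here) could look as follows:

```python
import numpy as np
from scipy.special import gamma as Gamma
from scipy.stats import levy_stable

def ts_density(x, alpha, beta, delta):
    """Density of X(1) for a spectrally positive TS process, 0 < alpha < 1, via (12)."""
    # scale of the strictly alpha-stable S_alpha(sigma, 1, 0) factor f_0
    sigma = (-delta * Gamma(-alpha) * np.cos(np.pi * alpha / 2)) ** (1.0 / alpha)
    levy_stable.parameterization = "S1"   # assumed to match S_alpha(sigma, 1, 0)
    f0 = levy_stable.pdf(x, alpha, 1.0, loc=0.0, scale=sigma)
    # exponential tilting factor of (12)
    return np.exp(-beta * np.asarray(x) - delta * Gamma(-alpha) * beta ** alpha) * f0
```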

In all these examples, one has

$$\begin{aligned} n(x)\,\sim \,\frac{\delta }{x^{1+\alpha ^*}}\ \ \text {as } x\downarrow 0\end{aligned}$$
(13)

for some \(\delta \) and some \(\alpha ^*\in [0,2)\) (subject to this, \(\alpha ^*\) is sometimes referred to as the Blumenthal-Getoor index). In fact, for TS this holds since \({\textrm{e}}^{-\beta x}\rightarrow 1\), whereas one has \(\alpha ^*=1\) for NIG and MX, as follows from known asymptotics of \(K_1\), resp. \(1-\exp \{-2\pi x/a\}\) \(\sim 2\pi x/a\).

4 The \(\varepsilon \)–Algorithm

Typically, the positive and negative jumps are simulated separately, so we consider only the spectrally positive case in the following.

When truncating the jumps to \([\varepsilon ,\infty )\), the exactly simulated compound Poisson part of X(1) is \(X_{\varepsilon ,\infty }(1)=\) \(\sum _1^N Y_n(\varepsilon )\) where N is Poisson \(\lambda (\varepsilon )\) and \(Y_1(\varepsilon ),Y_2(\varepsilon ),\ldots \) are i.i.d. with density \(g(x;\varepsilon )\) with

$$\begin{aligned} \lambda (\varepsilon )=\int _\varepsilon ^\infty n(x)\,{\textrm{d}} x\,,\quad g(x;\varepsilon )=\frac{n(x)}{\lambda (\varepsilon )}\,,\ \varepsilon<x<\infty . \end{aligned}$$

Some approximation \(\widehat{X}_{0,\varepsilon }(1)\) of jumps of value \(<\varepsilon \) is then used, and one returns the r.v. \(\widehat{X}_{0,\varepsilon }(1)+X_{\varepsilon ,\infty }(1)\). For these approximations, one typically needs the cumulants of \(X_{0,\varepsilon }(1)\) which according to (6) are \(\kappa _{k;0,\varepsilon }=\) \(\int _0^\varepsilon x^k\,\nu ({\textrm{d}} x)\) if either \(k\ge 2\), or \(k=1\) and the process has finite variation; in the infinite variation case, \(\kappa _{1;0,\varepsilon }=0\) subject to (5a). In practice, \(\int _0^\varepsilon x^k\,\nu ({\textrm{d}} x)\) is seldom explicit, but needs to be evaluated by numerical integration. Alternatively, one may note that subject to (13), one has

$$\begin{aligned} \kappa _{k;0,\varepsilon }\ =\ \int _0^\varepsilon x^k\,\nu ({\textrm{d}} x)\ \sim \ \delta \frac{\varepsilon ^{k-\alpha ^*}}{k-\alpha ^*} \quad \text {if }\alpha ^*<1,\ k\ge 1\text { or }1\le \alpha ^*<2,\ k\ge 2.\end{aligned}$$
(14)

The most naive choice is \(\widehat{X}_{0,\varepsilon }(1)\equiv 0\). However, it was suggested in [11] and [32] to take \(X_{0,\varepsilon }(t)\) as a BM with fitted mean and variance when \(\varepsilon <1\). Supporting limit theorems were given in [6], establishing the validity of this procedure when X is not too close to the finite activity case \(\int \nu ({\textrm{d}} x)<\infty \) and \(\nu \) satisfies some weak smoothness conditions (a simple proof under the stronger condition (13) follows by paralleling the proof of Proposition 3 below). We shall here suggest gamma alternatives in two variants.

Recall that the gamma distribution with shape parameter r and rate parameter b has density \(b^rx^{r-1}{\textrm{e}}^{-bx}/\Gamma (r)\) and cumulant function \(\log \bigl (b/(b-z)\bigr )^r={-r\log (1-z/b)}\) with kth derivative \(r(k-1)! b^{-k}(1-z/b)^{-k}\). Thus the kth cumulant is \(\kappa _k= r(k-1)!/b^k\); in particular the skewness is \((2r/b^3)/(r/b^2)^{3/2}\ =\ 2 r^{-1/2}\). Given a distribution or a set of data with cumulants \(\kappa _k^\#\), the most obvious possibility is to fit the mean and variance which leads to

$$\begin{aligned} r\ =\ \frac{(\kappa _1^\#)^2}{\kappa _2^\#}\,,\quad b\ =\ \frac{\kappa _1^\#}{\kappa _2^\#}\,. \end{aligned}$$
($\Gamma _1$)

One could also consider a three-parameter gamma family by allowing a shift m, and fitting the mean, variance and skewness then gives

$$\begin{aligned} r\ =\ \frac{4(\kappa _2^\#)^3}{(\kappa _3^\#)^2}\,,\quad b\ =\ \frac{2\kappa _2^\#}{\kappa _3^\#}\,,\quad m\ =\ \kappa _1^\#-\frac{2(\kappa _2^\#)^2}{\kappa _3^\#}\,. \end{aligned}$$
($\Gamma _2$)

Note that for a Lévy process, (\(\Gamma _1\)) does not make sense in the infinite variation case since then \(\kappa _{1;0,\varepsilon }=0\) subject to (5). For a subordinator (a spectrally positive process with a non-negative linear drift), (\(\Gamma _2\)) may be controversial because it may destroy the property of the process being non-decreasing. The normal approximation has the same problem, but not (\(\Gamma _1\)). Both of (\(\Gamma _1\)), (\(\Gamma _2\)) asymptotically agree with the normal approximation as \(\varepsilon \downarrow 0\). This follows since (14) implies that \(b\rightarrow \infty \) in both cases, which in turn implies that the fitted gamma distribution is asymptotically normal.
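A minimal sketch of the two moment fits (the function names are ours, not from any library):

```python
def fit_gamma1(k1, k2):
    """(Gamma_1): match mean k1 and variance k2 of a gamma(r, b) distribution."""
    b = k1 / k2          # rate
    r = k1 ** 2 / k2     # shape
    return r, b

def fit_gamma2(k1, k2, k3):
    """(Gamma_2): match mean, variance and skewness of a shifted gamma(r, b) + m."""
    r = 4 * k2 ** 3 / k3 ** 2   # shape, from skewness 2/sqrt(r) = k3/k2^{3/2}
    b = 2 * k2 / k3             # rate, from variance r/b^2 = k2
    m = k1 - r / b              # shift, from mean m + r/b = k1
    return r, b, m
```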

Efficiently generating r.v.’s from the density \(g(x;\varepsilon )=n(x)/\lambda (\varepsilon )\), \(x>\varepsilon \) may not always be trivial. However, a general set-up covering many examples is

$$\begin{aligned} (n_1n_2)\ \ \ &n(x)=\frac{n_1(x)}{n_2(x)} {\text { for } x>0} \text { with }n_1(x){\text { strictly decreasing, }n'_2(x)>0}, \\ &{n_1(x)\text { integrable on } (x_0,\infty )\text { and }1/n_2(x)\text { on } (\varepsilon ,x_0) \text { for all } 0<\varepsilon<x_0<\infty }. \end{aligned}$$

In the TS situation, \(n_1(x)=d{\textrm{e}}^{-\beta x}\), \(n_2(x)=x^{1+\alpha }\); for the positive jumps of MX, one may take \(n_1(x)=2d \exp \bigl \{bx/a-\pi x/a\bigr \}\), \(n_2(x)=x\bigl (1-\exp \{-2\pi x/a\}\bigr )\); etc.

Even for the TS case, the c.d.f. of \(g(x;\varepsilon )\) is not explicitly available. Thus inversion is not feasible and acceptance-rejection (A-R) seems the reasonable approach. What suggests itself is to either use the exponential\((\beta )\) distribution on \((\varepsilon ,\infty )\) as proposal and accept w.p. proportional to \(1/x^{1+\alpha }\), or to use the Pareto\((\alpha )\) distribution on \((\varepsilon ,\infty )\) as proposal and accept w.p. proportional to \({\textrm{e}}^{-\beta x}\). However, the first proposal puts too little mass on small or moderate x, where most of the mass of the target lies, and the second leads to a high rejection rate for large or moderate x. So, a reasonable compromise is to choose some threshold \({x_0}\) and use the Pareto proposal on \((\varepsilon ,x_0)\) and the exponential on \((x_0,\infty )\). An equivalent formulation is to decompose \(X_{\varepsilon ,\infty }\) into two compound Poisson terms, one having jumps in \((\varepsilon ,x_0]\) and the other having jumps in \((x_0,\infty )\). Note that the proposal on \((\varepsilon ,x_0)\) (a truncated Pareto) is easily simulated by inversion as \(\bigl (1/\varepsilon ^\alpha -\alpha \mu _1(x_0)U\bigr )^{-1/\alpha }\) with U uniform(0, 1), cf. [4, p. 39].
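A possible implementation of this compromise for the TS case is sketched below (the function name, the quad-based normalizations and the overall organization are ours, and no attempt at optimization has been made):

```python
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng()

def sample_big_jump(eps, x0, alpha, beta, delta):
    """One draw from g(x; eps) = n(x)/lambda(eps), n(x) = delta*exp(-beta*x)/x^(1+alpha)."""
    n = lambda x: delta * np.exp(-beta * x) / x ** (1 + alpha)
    lam1, _ = quad(n, eps, x0)        # jump intensity on (eps, x0]
    lam2, _ = quad(n, x0, np.inf)     # jump intensity on (x0, inf)
    if rng.random() < lam1 / (lam1 + lam2):
        # truncated Pareto proposal on (eps, x0), simulated by inversion;
        # accept w.p. exp(-beta*(x - eps))
        while True:
            u = rng.random()
            x = (eps ** -alpha - (eps ** -alpha - x0 ** -alpha) * u) ** (-1.0 / alpha)
            if rng.random() < np.exp(-beta * (x - eps)):
                return x
    else:
        # shifted exponential proposal on (x0, inf); accept w.p. (x0/x)^(1+alpha)
        while True:
            x = x0 + rng.exponential(1.0 / beta)
            if rng.random() < (x0 / x) ** (1 + alpha):
                return x
```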

In order to analyze this A-R procedure in the general set-up of \((n_1n_2)\), define for a fixed \(\varepsilon >0\)

$$\begin{aligned}\begin{gathered} \lambda _1(x_0)=\int _\varepsilon ^{x_0} n(x)\,{\textrm{d}} x\,,\ \ \mu _1({x_0})=\int _\varepsilon ^{x_0} \frac{1}{n_2(x)}\,{\textrm{d}} x\,, \ \ C_1(x_0)=\frac{n_1(\varepsilon )\mu _1({x_0})}{\lambda _1({x_0})}\,,\\ \lambda _2({x_0})=\int _{x_0} ^\infty n(x)\,{\textrm{d}} x\,,\ \ \mu _2({x_0})=\int _{x_0}^\infty n_1(x)\,{\textrm{d}} x\,, \ \ C_2(x_0)=\frac{\mu _2({x_0})}{\lambda _2({x_0})n_2({x_0})}\,.\end{gathered}\end{aligned}$$

The target distributions are then

$$f_1(x)=\frac{n_1(x)}{\lambda _1(x_0)n_2(x)},\ \varepsilon<x<x_0,\text {\ \ and\ \ } f_2(x)=\frac{n_1(x)}{\lambda _2(x_0)n_2(x)},\ x_0<x<\infty ,$$

and the proposals are

$$g_1(x)=\frac{1}{\mu _1(x_0)n_2(x)},\ \varepsilon<x<x_0,\text {\ \ and\ \ } g_2(x)=\frac{n_1(x)}{\mu _2(x_0)},\ x_0<x<\infty .$$

Then \(f_1(x)\le C_1(x_0)g_1(x)\) and \(f_2(x)\le C_2(x_0)g_2(x)\), and we may use A-R with acceptance probabilities

$$\frac{f_1(x)}{C_1(x_0)g_1(x)}=\frac{n_1(x)\mu _1(x_0)}{\lambda _1(x_0)C_1(x_0)},\quad \frac{f_2(x)}{C_2(x_0)g_2(x)}=\frac{\mu _2(x_0)}{\lambda _2(x_0)C_2(x_0)n_2(x)}$$

for r.v. generation from \(f_1\), resp. \(f_2\). This gives expected numbers \(C_1(x_0)\), \(C_2(x_0)\) of samplings from \(g_1(x)\), resp. \(g_2(x)\), and as measure \(E(x_0)\) of the computational effort, we shall use the total number of these samplings, i.e.

$$E(x_0)\ =\ \lambda _1({x_0})C_1({x_0})+\lambda _2({x_0})C_2({x_0})\ =\ n_1(\varepsilon )\mu _1({x_0})+\frac{\mu _2({x_0})}{n_2({x_0})}\,.$$

Of course, if the costs to generate from \(g_1(x)\), resp. \(g_2(x)\) are very different, \(E(x_0)\) needs to be modified to reflect this disparity.

Proposition 1

Consider the function \(E(x_0)\), \(\varepsilon \le x_0\le \infty \). If \(n_2'(x_0)/n_2(x_0)\rightarrow 0\) as \(x_0\rightarrow \infty \), then \(E(x_0)\) attains its minimum for some \(\varepsilon<x_0^*<\infty \) satisfying \(\psi (x_0^*)=0\) where \(\psi (x_0)=\) \(n_2(x_0)\bigl (n_1(\varepsilon )-n_1(x_0)\bigr )-\mu _2(x_0)n'_2(x_0)\). In particular, for the TS case \(x_0^*\) is the unique solution in \((\varepsilon ,\infty )\) of

$$\begin{aligned} x_0^*({\textrm{e}}^{\beta (x_0^*-\varepsilon )}-1) \ =\ \frac{1+\alpha }{\beta }.\end{aligned}$$
(15)

Proof

We have \(\frac{{\textrm{d}}}{{\textrm{d}} x_0}E(x_0)= \psi (x_0)/n_2(x_0)^2\). Here \(\psi (\varepsilon )=-\mu _2(\varepsilon )n'_2(\varepsilon )<0\). As \(x_0\rightarrow \infty \), we have \(\liminf \bigl (n_1(\varepsilon )-n_1(x_0)\bigr )>0\) and \(\mu _2(x_0)\rightarrow 0\), and so \(n_2'(x_0)/n_2(x_0)\rightarrow 0\) implies \(\psi (x_0)>0\) for all large \(x_0\). This gives the first part of the result. For the second on the TS case, we get

$$\begin{aligned}\psi (x_0)\ =\ {x_0^{1+\alpha }(d{\textrm{e}}^{-\beta \varepsilon }-d{\textrm{e}}^{-\beta {x_0}})- d({\textrm{e}}^{-\beta {x_0}}/\beta )\cdot (1+\alpha )x_0^\alpha }\,.\end{aligned}$$

Dividing by \(d{\textrm{e}}^{-\beta {x_0}}x_0^\alpha \) and rearranging shows that \(\psi (x_0^*)=0\) is the same as (15). For uniqueness of the solution, note that the l.h.s. of (15) is strictly increasing in \(x_0^*\) with limits 0 at \(x_0^*=\varepsilon \) and \(\infty \) at \(x_0^*=\infty \). \(\square \)
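In practice, \(x_0^*\) can be computed from (15) with an off-the-shelf root finder, as in the following sketch (the initial bracketing bound is ad hoc and the function name is ours):

```python
import numpy as np
from scipy.optimize import brentq

def optimal_threshold(eps, alpha, beta):
    """Solve (15): x0*(exp(beta*(x0 - eps)) - 1) = (1 + alpha)/beta on (eps, inf)."""
    h = lambda x0: x0 * (np.exp(beta * (x0 - eps)) - 1.0) - (1.0 + alpha) / beta
    hi = eps + 1.0
    while h(hi) < 0.0:       # expand until the increasing l.h.s. brackets the root
        hi *= 2.0
    return brentq(h, eps, hi)
```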

5 Using Complete Monotonicity Structure

Again, we consider only the spectrally positive case and assume the Lévy density n(x) to be completely monotone in the sense of (2). We refer to the measure \(V({\textrm{d}} t)\) as the reference measure and to v(t) as the reference density. See, e.g., [34] for background on complete monotonicity and a huge list of examples. Motivation and financial examples are in [12, 19, 21].

Example 1

We check here that complete monotonicity holds in our main examples. We use the rule that if m(x) is completely monotone with reference density v(t), \(t>0\), then \({\textrm{e}}^{-\beta x}m(x)\) is completely monotone with reference density \(v(t-\beta )\) for \(t>\beta \) and \(=0\) for \(0<t<\beta \).

In the NIG case, this rule together with the standard formula \(K_1(x)=\) \( \ x\int _1^{\infty } {\textrm{e}}^{-xt} (t^2-1)^{1/2} {\textrm{d}} t\) and elementary substitutions gives the expression

$$\begin{aligned} v(t)\ =\ \frac{\delta }{\pi }\sqrt{(t+\beta )^2-\alpha ^2}\,,\ \ t>\alpha -\beta , \end{aligned}$$

for the reference density for the positive part of the Lévy measure. For MX, let \(\chi (t)=\lceil t\rceil \) be the step function equal to \(n+1\) for \(t\in (n,n+1]\). Then

$$\begin{aligned}\frac{1}{1-{\textrm{e}}^{-x}}\ {}&=\ 1+{\textrm{e}}^{-x}+{\textrm{e}}^{-2x}+\cdots \ =\ 1-{\textrm{e}}^{-x}+2({\textrm{e}}^{-x}-{\textrm{e}}^{-2x})+3({\textrm{e}}^{-2x}-{\textrm{e}}^{-3x})+\cdots \\ {}&=\ \sum _{n=0}^\infty \, (n+1)x\!\!\int _n^{n+1} {\textrm{e}}^{-xt}\,{\textrm{d}} t \ =\ x\int _0^\infty {\textrm{e}}^{-xt}\chi (t)\,{\textrm{d}} t \end{aligned}$$

which gives

$$\begin{aligned}v(t)\ =\ {2d\,\chi \bigl (a(t-\pi /a+b/a)/(2\pi )\bigr )\,,\ \ t>\pi /a-b/a}. \end{aligned}$$

Finally for the TS case, it is shown in [12] that \(v(t)=\delta (t-\beta )^\alpha /\Gamma (1+\alpha )\), \(t>\beta \), which in turn is an easy consequence of \(\int _0^\infty {\textrm{e}}^{-xt} t^\alpha \,{\textrm{d}} t\) \(=\Gamma (1+\alpha )/x^{1+\alpha }\).

In all three examples, the reference density v(t) grows at rate \(t^{\alpha ^*}\) as \(t\rightarrow \infty \), with \(\alpha ^*\) as in (13). This is in fact no coincidence since Feller’s Tauberian theorem [17, p. 445] implies that \(V(t)=\) \(\int _0^t v(s)\,{\textrm{d}} s\) \(\sim \delta t^{1+\alpha ^*}/\Gamma (2+\alpha ^*)\). Hence by formal differentiation,

$$\begin{aligned} v(t)\ \sim \ \delta (1+\alpha ^*)t^{\alpha ^*}/\Gamma (2+\alpha ^*) \ =\ \delta t^{\alpha ^*}/\Gamma (1+\alpha ^*).\end{aligned}$$
(16)

We stress that this is formal: the known rigorous result in this direction requires (beyond existence of v) that v is monotone, cf. [36]. However, we shall take (16) as an assumption for the further developments to follow.

In the following, we use that (2), (6) and Fubini’s theorem give the representation

$$\begin{aligned} \int _0^\infty x^kn(x)\,{\textrm{d}} x\ =\ \int _0^\infty \left( \int _0^\infty x^k{\textrm{e}}^{-tx}\,{\textrm{d}} x\right) v(t)\, {\textrm{d}} t\ =\ \int _0^\infty \frac{k!}{t^{k+1}}v(t)\,{\textrm{d}} t \end{aligned}$$
(17)

of the cumulants for \(k=0,1,\ldots \). As in Sect. 4, we decompose the Lévy density n into two components, here taken as

$$\begin{aligned} n_{0,A}(x)\ =\ \int _0^A {\textrm{e}}^{-xt}v(t)\,{\textrm{d}} t\,,\quad n_{A,\infty }(x)\ =\ \int _ A^\infty {\textrm{e}}^{-xt}v(t)\,{\textrm{d}} t\,. \end{aligned}$$

The corresponding decomposition of X is written as \(X=X_{0,A}+X_{A,\infty }\). The key to our algorithm using complete monotonicity is the following:

Proposition 2

Assume that the measure \(V({\textrm{d}} t)/t\) is finite, i.e. that the process has finite activity, and let

$$\begin{aligned} \mu =\int _0^\infty \frac{v(t)}{t}\,{\textrm{d}} t\,,\quad \lambda =\int _0^\infty n(x)\,{\textrm{d}} x\,. \end{aligned}$$

Then \(\mu =\lambda \). Let further T be standard exponential, Y an independent r.v. with density \(\tfrac{v(t)}{t\mu }\) and Z one with density \(n(x)/\lambda \). Then \(T/Y= Z\) in distribution.

Proof

Taking \(k=0\) in (17) gives \(\lambda =\mu \). We then get

$$\begin{aligned}{\mathbb {P}}\bigl (T/Y\in \,{\textrm{d}} x\bigr )\ {}&=\ \int _0^\infty {\mathbb {P}}\bigl (T/Y\in \,{\textrm{d}} x\,\big |\,Y=t\bigr )\frac{v(t)}{t\mu }\,{\textrm{d}} t\\ {}&=\ \int _0^\infty t{\textrm{e}}^{-tx}\,\frac{v(t)}{t\mu }\,{\textrm{d}} t\ =\ \frac{n(x)}{\mu }\ =\ {\mathbb {P}}(Z\in {\textrm{d}} x)\,.\end{aligned}$$

\(\square \)

This suggests that in the finite variation case, we can generate a r.v. X approximately distributed as X(1) as follows (more details on the individual steps are given below):

(1) Choose \(A<\infty \), let \(\lambda =\int _0^A v(t)/t\,{\textrm{d}} t\) and generate N as Poisson\((\lambda )\).

(2) Generate \(X_1=\sum _{n=1}^N T_n/Y_n(A)\) where the \(T_n\) are standard exponential and the \(Y_n(A)\) have density \(v(t)/(\lambda t)\), \(0<t<A\).

(3) Generate \(X_2\) as some approximation to \(X_{A,\infty }(1)\).

(4) Return \(X=X_1+X_2\).

In the infinite variation case subject to (5), replace \(X_1\) in (2) by

$$\sum _{n=1}^N \frac{T_n}{Y_n(A)}-\int _0^\infty xn_{0,A}(x)\,{\textrm{d}} x \ = \ \sum _{n=1}^N \frac{T_n}{Y_n(A)}-\int _0^A \frac{v(t)}{t^2}\,{\textrm{d}} t $$

and X in (4) by \(\kappa '(0)+X_1+X_2\). In both cases, \(X\rightarrow X(1)\) as \(A\rightarrow \infty \).
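For the TS case with \(\alpha <1\), a minimal sketch of steps (1)–(4) could look as follows. The A-R step for \(Y_n(A)\) uses a simple uniform proposal on \((\beta ,A)\) and is not optimized, the small-jump part is approximated by the \((\Gamma _1)\) fit of its first two cumulants computed from (17), and all function names are ours:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as Gamma

rng = np.random.default_rng()

def cm_sample(A, alpha, beta, delta):
    """Approximate draw of X(1), spectrally positive TS with alpha < 1 and A > beta."""
    v = lambda t: delta * (t - beta) ** alpha / Gamma(1 + alpha)   # reference density, t > beta
    # (1) Poisson number of "big" (exponential) jumps
    lam, _ = quad(lambda t: v(t) / t, beta, A)
    N = rng.poisson(lam)
    # (2) big-jump part: each jump is Exp(1)/Y with Y having density v(t)/(lam*t) on (beta, A)
    bound = (A - beta) ** alpha / beta           # sup of (t - beta)^alpha / t on (beta, A)
    X1 = 0.0
    for _ in range(N):
        while True:
            t = rng.uniform(beta, A)
            if rng.random() < ((t - beta) ** alpha / t) / bound:
                break
        X1 += rng.exponential(1.0) / t
    # (3) small-jump part: (Gamma_1) gamma fit with cumulants from (17)
    k1, _ = quad(lambda t: v(t) / t ** 2, A, np.inf)
    k2, _ = quad(lambda t: 2.0 * v(t) / t ** 3, A, np.inf)
    X2 = rng.gamma(k1 ** 2 / k2, k2 / k1)        # shape, scale
    # (4)
    return X1 + X2
```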

That \(\lambda \) in step (1) is finite follows by the Radon property of \(V({\textrm{d}} t)\). The shape of the part \(n_{0,A}\) of n corresponding to the simulated large jumps is illustrated in Fig. 1. The process in the example is TS with \(\alpha =0.8\), variance \(\kappa _2=1\), kurtosis \(K=2\), and there are four values of A, determined by the value of \(\rho \), defined as the proportion \({\mathbb {V}ar}\bigl (X_{A,\infty }(1)\bigr )/\kappa _2\) of the total variance provided by the small jumps (see further Sect. 6).

Fig. 1  \(n(x)\) and \(n_{0,A}(x)\)

As for the approximation in (3), the most obvious choice is a normal distribution with the correct mean and variance, and this is in fact justified by the following result (recall that W denotes BM):

Proposition 3

Define \(X^*_{A,\infty }(t)\ =\ \bigl (X_{A,\infty }(t)-t{{\mathbb {E}}} X_{A,\infty }(1)\bigr )\big /\sqrt{{\mathbb {V}ar}\,X_{A,\infty }(1)}\). Then \(X^{*}_{{A,\infty }}\, \xrightarrow {\,\scriptscriptstyle \smash {\mathcal {D}}\,} \, W\) in the Skorokhod space \(D[0,\infty )\) as \(A\rightarrow \infty \).

Proof

Let \(\kappa _k^*\) be the kth cumulant of \(X^*_{A,\infty }(1)\). Then \(\kappa _k^*\) is of order \(A^{\alpha -k}A^{(2-\alpha )k/2}=A^{\alpha (1-k/2)}\) for \(k>2\) since by (17)

$$\begin{aligned} \frac{\Gamma (1+\alpha )}{\delta }\int _0^\infty x^kn_{A,\infty }(x)\,{\textrm{d}} x\ &=\ k!\int _A^\infty \frac{(t-\beta )^\alpha }{t^{k+1}}\,{\textrm{d}} t\\ {}&\sim \ k!\int _A^\infty t^{\alpha -k-1}\,{\textrm{d}} t\ =\ \frac{k!A^{\alpha -k}}{k-\alpha }\,. \end{aligned}$$

Hence \(\kappa _k^*\rightarrow 0\) for \(k>2\) and obviously, \(\kappa _1^*=0\), \(\kappa _2^*=1\). Thus all cumulants and hence all moments of \(X^*_{A,\infty }(1)\) converge to those of the standard normal r.v. W(1). This implies \(X^*_{A,\infty }(1) \xrightarrow {\,\scriptscriptstyle \smash {\mathcal {D}}\,} W(1)\) (e.g. [22, Exercise 11 p. 101]), from which the asserted convergence in function space follows from Chap. 15 in [22]. \(\square \)

Gamma distributions fitted by \((\Gamma _1)\) or \((\Gamma _2)\) are appealing alternatives to the normal approximation and perform again significantly better in the numerical examples to be given in Sect. 6. A gamma form of \(n_{A,\infty }(x)\) comes up directly: one can use (16) and standard asymptotics of the upper incomplete gamma function to infer that

$$\begin{aligned}n_{A,\infty }(x)\ {}&\sim \ \int _A^\infty {\textrm{e}}^{-tx} \delta t^\alpha /\Gamma (1+\alpha )\,{\textrm{d}} t\ =\ \frac{\delta }{x^{1+\alpha }\Gamma (1+\alpha )} \int _{Ax}^\infty {\textrm{e}}^{-y} y^\alpha \,{\textrm{d}} y\\ {}&\sim \ \frac{\delta }{x^{1+\alpha }\Gamma (1+\alpha )} (Ax)^\alpha {\textrm{e}}^{-Ax}\ \sim \ \frac{\delta }{\Gamma (1+\alpha )}\frac{A^\alpha }{x} {\textrm{e}}^{-Ax} \end{aligned}$$

for any given fixed x. However, the first \(\sim \) is not valid if Ax is small or moderate, and in fact the gamma distribution with shape parameter \(\delta A^\alpha /\Gamma (1+\alpha )\) and rate parameter A substantially underestimates the order of \(X_{A,\infty }(1)\). For example, its mean is 1.2 for \(\alpha =0.8\), \(\kappa _2=2\), \(K=2\) and \(\rho =0.75\), whereas the correct value is \({{\mathbb {E}}} X_{A,\infty }(1)=5.5\).

6 Numerical Examples

As illustration of the \(\varepsilon \)- and CM-algorithms, we considered spectrally positive TS processes with varying parameters. Such a process can be parametrized with the variance \(\kappa _2\), the kurtosis K and \(\alpha \), and one then has

$$\begin{aligned}\beta =\sqrt{\frac{(2-\alpha )(3-\alpha )}{{\kappa _2}K}}\,,\quad \delta =\frac{\kappa _2}{\Gamma (2-\alpha )}\beta ^{2-\alpha }\,,\end{aligned}$$

cf. [3]. We considered three values 0.2, 0.8, 1.4 of \(\alpha \) and three values 1/2, 2, 8 of K, and normalized by taking \(\kappa _2=1\). We further considered the normal as well as the two gamma approximations (\(\Gamma _1\)), \((\Gamma _2\)) of the small jumps, and as measure of performance, we took the \(L^2\)–distance

$$\begin{aligned} d\ =\ \int _0^\infty \bigl (f(x)-\widehat{f}(x)\bigr )^2{\textrm{d}} x \end{aligned}$$
(18)

between the true density f(x) of X(1) and an estimate \(\widehat{f}(x)\) provided by simulation. Here f(x) was evaluated by (12), using the Matlab routines for stable distributions. For \(\widehat{f}(x)\), we simulated \(M=10^6\) replicates \(Z_1,\ldots ,Z_M\) of \(X_{\varepsilon ,\infty }(1)\) and used the conditional Monte Carlo estimator

$$\begin{aligned} \widehat{f}(x)\ =\ \frac{1}{M}\sum _{m=1}^M\xi (x-Z_m)\end{aligned}$$
(19)

where \(\xi (\cdot )\) is the density of the approximation in question to \(X_{0,\varepsilon }(1)\). Cf. e.g. [4, p. 146] and [2] (see also [26] for more sophisticated applications of the technique), but note also that conditional Monte Carlo cannot universally replace generation of a r.v. distributed according to \(\xi (\cdot )\); e.g., this is needed when simulating a discrete skeleton. Numerically, (18) was computed by a discrete approximation with step length 0.01 in the interval \({{\mathbb {E}}} X(1)\pm 3\) (recall that X(1) was normalized to standard deviation 1).
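A minimal sketch of the estimator (19), with a gamma density \(\xi \) as obtained from the \((\Gamma _1)\) fit (function and variable names are ours):

```python
import numpy as np
from scipy.stats import gamma as gamma_dist

def cond_mc_density(x, Z, r, b):
    """f_hat(x) = (1/M) * sum_m xi(x - Z_m), xi = gamma density with shape r and rate b."""
    x = np.atleast_1d(x)
    diffs = x[:, None] - np.asarray(Z)[None, :]            # shape (len(x), M)
    return gamma_dist.pdf(diffs, a=r, scale=1.0 / b).mean(axis=1)
```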

The truncation parameters \(\varepsilon \), resp. A, for the two algorithms were chosen such that the variance of the approximated small jumps equaled various fractions \(\rho \) of the total variance \(\kappa _2=1\) of all jumps. For the \(\varepsilon \)-algorithm, this means that for a given \(\rho \)

$$\begin{aligned}\rho \ =\ \int _0^\varepsilon x^2\frac{\delta {\textrm{e}}^{-\beta x}}{x^{1+\alpha }}\,{\textrm{d}} x\ =\ \frac{\delta }{\beta ^{2-\alpha }}\int _0^{\varepsilon \beta } y^{2-\alpha -1}{\textrm{e}}^{-y}\,\textrm{d}y =\ \frac{\delta }{\beta ^{2-\alpha }}\Gamma (\varepsilon \beta ;2-\alpha )\Gamma (2-\alpha )\end{aligned}$$

where \(\Gamma (\cdot \,;2-\alpha )\) is the regularized lower incomplete Gamma function with parameter \(2-\alpha \). Thus

$$\begin{aligned} \varepsilon \ =\ \frac{1}{\beta }\Gamma ^{-1}\left( \frac{\rho \beta ^{2-\alpha }}{\delta \Gamma (2-\alpha )};\, 2-\alpha \right) \,. \end{aligned}$$

For the CM-algorithm, we have instead by (17) that

$$\begin{aligned}\rho \ {}&=\ \int _0^\infty x^2\,{\textrm{d}} x\int _A^\infty {\textrm{e}}^{-tx}\frac{\delta (t-\beta )^\alpha }{\Gamma (1+\alpha )}\,{\textrm{d}} t \ =\ \frac{2\delta }{\Gamma (1+\alpha )} \int _A^\infty \frac{(t-\beta )^{\alpha }}{t^3}\,{\textrm{d}} t\\ {}&=\ \frac{2\delta }{\Gamma (1+\alpha )}\int _B^\infty \frac{y^\alpha }{(\beta +y)^3}\,{\textrm{d}} y \end{aligned}$$

where \(B=A-\beta \), and this equation was solved numerically.
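A sketch of how \(\varepsilon \) and A could be computed for a given \(\rho \) (assuming the normalization \(\kappa _2=1\) and \(0<\rho <1\); the bracketing interval of the root finder is ad hoc, and the function names are ours):

```python
import numpy as np
from scipy.special import gammaincinv, gamma as Gamma
from scipy.integrate import quad
from scipy.optimize import brentq

def eps_for_rho(rho, alpha, beta, delta):
    # rho = (delta/beta^(2-alpha)) * P(2-alpha, eps*beta) * Gamma(2-alpha),
    # with P the regularized lower incomplete gamma function
    p = rho * beta ** (2 - alpha) / (delta * Gamma(2 - alpha))
    return gammaincinv(2 - alpha, p) / beta

def A_for_rho(rho, alpha, beta, delta):
    # rho = (2*delta/Gamma(1+alpha)) * int_A^inf (t-beta)^alpha / t^3 dt
    small_var = lambda A: 2 * delta / Gamma(1 + alpha) * quad(
        lambda t: (t - beta) ** alpha / t ** 3, A, np.inf)[0]
    return brentq(lambda A: small_var(A) - rho, beta + 1e-10, 1e8)
```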

In Tables 1 and 2, S is the skewness of X(1) and \(\lambda \) is the Poisson mean in the compound Poisson sum of the simulated “large” jumps, that is,

$$\begin{aligned} \lambda \ =\ \int _\varepsilon ^\infty n(x)\,{\textrm{d}} x\ =\ \int _\varepsilon ^\infty \frac{\delta {\textrm{e}}^{-\beta x}}{x^{1+\alpha }}\,{\textrm{d}} x\,,\quad \lambda \ =\ \int _0^A \frac{v(t)}{t}\,{\textrm{d}} t\ =\ \int _0^B \frac{\delta t^\alpha }{(t+\beta )\Gamma (1+\alpha )}\,{\textrm{d}} t \end{aligned}$$

in the two cases. The \(L_2\) distances in (18) are denoted by \(d_N\) for the normal approximation and by \(d_{\Gamma _1}\), \(d_{\Gamma _2}\) for the two gamma ones. Graphs of f(x) and the \(\widehat{f}(x)\) are in Fig. 2 for some selected parameter combinations from Tables 1 and 2.

Fig. 2  Left: \(\varepsilon \)-alg., \(\alpha =0.2\), \(K=2\), \(\rho =0.50\), \(d_N=1.3\,e^{-1}\), \(d_{\Gamma 1}=6.7\,e^{-3}\), \(d_{\Gamma 2}=3.5\,e^{-2}\); middle: CM-alg., \(\alpha =0.8\), \(K=8\), \(\rho =0.75\), \(d_N=7.6\,e^{-3}\), \(d_{\Gamma 1}=2.6\,e^{-3}\), \(d_{\Gamma 2}=5.3\,e^{-5}\); right: \(\varepsilon \)-alg., \(\alpha =1.4\), \(K=2\), \(\rho =0.75\), \(d_N=2.0\,e^{-2}\), \(d_{\Gamma 2}=1.1\,e^{-4}\)

Our interpretation of Fig. 2 is that an \(L_2\)-distance of \(\,e^{-4}\) or less corresponds to an almost perfect fit, whereas one of order \(\,e^{-3}\) is sufficient for most practical purposes and one of order \(\,e^{-2}\) or more is inadequate. With this in mind, we were quite surprised to see how well both algorithms perform already for values of \(\rho \) as large as \(75\%\) and \(50\%\), or equivalently for values of \(\lambda \) as small as those reported in Tables 1 and 2. One further notes that both algorithms improve as K gets smaller or \(\alpha \) larger, which is in agreement with limit theorems given in [3] stating roughly that the distribution of X(1) gets closer to normal in the two cases.

Taking \(\lambda \) as measure of computational effort is certainly not unambiguous. On top of this comes the effort of generating the r.v.’s \(Y_n(\varepsilon ),Y_n(A)\) with densities proportional to n(x), \(\varepsilon<x<\infty \), resp. v(t)/t, \(0<t<A\). However, this issue is largely implementation dependent. We have given one suggestion (based on (15)) for the \(\varepsilon \)-algorithm in Sect. 4 and give a similar A-R scheme for the CM-algorithm and TS case in the appendix. Both are certainly amenable to improvement. Comparison of the \(\varepsilon \)- and CM-algorithms shows that \(\lambda \) is slightly higher for the CM-algorithm. However, the values of \(\lambda \) reported in the tables are quite small and thus \(1+\lambda \) could be a fairer measure than \(\lambda \), taking into account also the generation of the Poisson r.v.’s in addition to the Y. This makes the difference even smaller. As for precision, values of order \(e^{-5}\) should not be compared, as they do not improve when \(\rho \) is decreased, which could presumably be due to the discretization. Once this is said, the \(\Gamma _2\) scheme most often gives better precision than the \(\Gamma _1\) one, and both improve upon the normal, in some cases even significantly. The \(\varepsilon \)-algorithm gives slightly more precise estimates for a given \(\rho \) than the CM one, but most often not by much. Altogether, which one to prefer may depend on case-dependent issues such as the ease of generating the \(Y_n(\varepsilon )\) or \(Y_n(A)\).

Table 1 \(\rho =75\%\)
Table 2 \(\rho =50\%\)

Concerning the chosen values 1/2, 2, 8 of the kurtosis K, we remark that for financial data K is most often of order 1–3 for daily log-return series, but higher values occur when calibrating parameters, cf. Table 1 in [3]. Sampling at higher frequencies than daily will also increase K, and hence one may expect that larger values of \(\lambda \) than the ones in our tables will be needed for good precision.

7 Exact Simulation of X(h) and Other Methods

In our main examples, it is fairly straightforward to generate a r.v. distributed as X(h) in a NIG process. Indeed, one description of the process is as a BM W with drift \(\beta \) subordinated by an inverse Gaussian subordinator \(\chi (t)\). In more detail, if \(W_1\) is another independent BM with drift \(\gamma \) and \(\chi (t)=\inf \{s>0:\,W_1(s)>\delta t\}\), then \(W(\chi (t))+\mu t\) is distributed as X(t) in a NIG\((\delta ,\alpha ,\beta ,\mu )\) process with \(\alpha =\sqrt{\beta ^2+\gamma ^2}\). Here a r.v. distributed as \(\chi (t)\) need not be simulated via the relation to \(W_1\) but can be directly generated. For X(h), just replace \(\delta \) by \(\delta h\) and \(\mu \) by \(\mu h\). These facts are surveyed in, e.g., [4, p. 343] and implemented in, e.g., [25]. A similar but easier exact subordination construction applies to the VG process. Asymptotic subordination algorithms for TS and MX are in [27].
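A minimal sketch of this subordination construction (numpy's Wald sampler is used for the inverse Gaussian r.v.; its mean/shape convention, and the match with the NIG parametrization above, are assumptions that should be checked in a given application):

```python
import numpy as np

rng = np.random.default_rng()

def nig_increment(h, alpha, beta, delta, mu):
    """One draw distributed as X(h) for a NIG(delta, alpha, beta, mu) process."""
    gam = np.sqrt(alpha ** 2 - beta ** 2)
    Z = rng.wald(delta * h / gam, (delta * h) ** 2)   # inverse Gaussian subordinator at h
    return mu * h + beta * Z + np.sqrt(Z) * rng.standard_normal()
```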

For the spectrally positive TS process with finite variation (\(\alpha <1\)), it was noted in [3] that a r.v. distributed as X(1) can be generated by an A-R scheme, using (12) with the \(S_\alpha (\sigma ,1,0)\) r.v. Z as proposal and acceptance probability \({\textrm{e}}^{-\beta z}\) when \(Z=z\); for the standard algorithm to generate Z, see [4, p. 332]. Two-sided processes are of course generated by taking the difference between the positive and negative parts. The simplicity of this scheme should be compared to other approaches in the literature, e.g. [8, 23]. It was also remarked in [3] that the situation is more complicated when \(\alpha \ge 1\), since then X(1) is supported by the whole of \({\mathbb {R}}\) and \({\textrm{e}}^{-\beta z}\) is unbounded there. We suggest here an exact scheme based on asymptotic properties of stable densities. The details are in the Appendix but are included more for the sake of completeness than because we think the scheme is more attractive than the simple and efficient \(\varepsilon \)- and CM-algorithms.

A general comment on the method of discrete skeletons is that it gives little information on the whole path X(0 : T) unless one uses a skeleton with a quite small h and thereby a considerable computational effort.

We are not aware of methods for exact simulation of X(h) in the MX process. One could potentially use the explicit form of the density, cf. (9), via A-R, but a difficulty is to find suitable bounds for the complex gamma function.

Another approximate method is based on using a series expansion of the form \(X(T)=\sum _1^\infty \bigl \{H(\Gamma _n,V_n)-c_nT\bigr \}\) where the \(\Gamma _n\) are the arrival epochs of a standard Poisson process and the \(V_n\) are i.i.d. (possibly multivariate) r.v.’s independent of the \(\Gamma _n\), see the surveys in [31] and [4] XII.4. In the implementation, one truncates to \(n\le N\) terms. Since \(H(\cdot ,v)\) is typically decreasing for fixed v, this method is hardly intrinsically different from the \(\varepsilon \)-algorithm. Calculation of H is not always straightforward. We are not aware of systematic studies of the error term \(\sum _{N+1}^\infty \ldots \).

8 Maxima, Minima and Other Path Functionals

In Sects. 4–6 and 7, we have concentrated on simulation of X(T) alone, say \(T=h\) or \(T=1\) (there is no loss of generality in taking \(T=1\) since \(X(T)=X_T(1)\) where \(X_T\) is the process obtained by replacing the Lévy measure \(\nu \) by \(T\nu \)). In the financial context, this covers European options, where \(\Psi \) in (1) is a function of X(T) alone. E.g. \(\Psi \bigl (X(0:T)\bigr )={\textrm{e}}^{-rT}\bigl [Z(0){\textrm{e}}^{X(T)}-K\bigr ]^+\) for a European call with strike K. For many other options, \(\Psi \) is, however, more complicated. E.g. for a down-and-in barrier option

$$\begin{aligned} \Psi \bigl (X(0:T)\bigr )\ =\ {\textrm{e}}^{-rT}\bigl [Z(0){\textrm{e}}^{X(T)}-K\bigr ]^+{\mathbb {I}}\bigl (Z(0){\textrm{e}}^{X(t)}\le L\text { for some }t\le T\bigr )\,. \end{aligned}$$

One therefore needs to know also the minimum \(m_T=\inf _{t\le T}X(t)\) of X(0 : T), which typically is close to the value at some negative jump. Minima or maxima also come up in the context of queues modeled by Lévy input, where key processes Y such as workload, queue length etc. are obtained by reflecting the input X at 0. This means

$$\begin{aligned} Y(T)\ =\ \bigl (Y(0)+X(T)\bigr )\vee \max _{t\le T}\bigl (X(T)-X(t)\bigr )\,. \end{aligned}$$

In particular, \(Y(T){\mathop {=}\limits ^{d}}M_T\) where \(M_T=\sup _{t\le T}X(t)\) in the case \(Y(0)=0\) of an initially empty queue. If X is simulated as a discrete skeleton with step size h, the path of Y is approximated by \(Y_h(0)=Y(0)\) and the Lindley recursion

$$\begin{aligned} Y_h\bigl ((n+1)h\bigr )\ =\ \bigl [Y_h(nh)+X\bigl ((n+1)h\bigr )-X(nh)\bigr ]^+\,, \end{aligned}$$

leading to

$$\begin{aligned} Y_h(Nh)\ =\ \bigl (Y(0)+X(Nh)\bigr )\vee \max _{n\le N}\bigl (X(Nh)-X(nh)\bigr ) \ {\mathop {=}\limits ^{d}}\ \max _{n\le N} X(nh) \end{aligned}$$

where the final \({\mathop {=}\limits ^{d}}\) requires \(Y(0)=0\). For these facts, see Sects. III.6–7, IX.2 of [1].
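A minimal sketch of the Lindley recursion along a skeleton, given simulated increments \(X((n+1)h)-X(nh)\) (function and variable names are ours):

```python
import numpy as np

def reflect_skeleton(y0, increments):
    """Return Y_h(0), Y_h(h), ..., Y_h(Nh) from Y(0) = y0 via the Lindley recursion."""
    Y = np.empty(len(increments) + 1)
    Y[0] = y0
    for n, dX in enumerate(increments):
        Y[n + 1] = max(Y[n] + dX, 0.0)    # [Y_h(nh) + increment]^+
    return Y
```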

We mention several strategies to access a minimum or maximum, say m(T), without recommending any particular one (in fact, such a choice may depend on the particular application context and a more extensive numerical study). One strategy is just to simulate a sufficiently fine skeleton exactly, when possible, and then take the minimum along the skeleton. This may be supplemented with continuity corrections as developed in [5, 20], that is, r.v.’s approximating

$$\begin{aligned}\min _{nh\le t\le (n+1)h} X(t)\,\big |\, X_{(n+1)h}, X_{nh}\,.\end{aligned}$$

If exact simulation of a skeleton is not feasible, one may instead generate the skeleton approximately by one of the compound Poisson algorithms of Sects. 4, 5, allocate uniform [0, T] locations to the Poisson jump times \(\tau _n\) , and supplement the minimum along the \(\tau _n\) by invoking bridge r.v.’s of the form

$$\begin{aligned}\min _{\tau _n\le t\le \tau _{n+1}} \bigl (\widehat{X}_{0:\varepsilon }(t)-\widehat{X}_{0:\varepsilon }(\tau _n)\bigr ) \,\big |\, \widehat{X}_{0:\varepsilon }(\tau _n)\,.\end{aligned}$$

The distribution and hence generation of such bridge minima is standard when \(\widehat{X}_{0:\varepsilon }\) is generated by using the normal approximation. For our gamma approximations, they may be efficiently generated by invoking the relation between gamma, beta and Dirichlet distributions, as developed in [7] and implemented in [25].

Similar remarks apply to other and possibly more complicated path functionals. For example, for a Parisian option one needs to know the first time the asset price \({\textrm{e}}^{X(t)}\) makes an excursion of length \(>D\) below some level L.