The Black–Scholes (BS) model for the value \(V(S,t)\) of a vanilla option is based on several assumptions about the market. In particular, the BS model assumes that the price \(S_{t}\) of the asset on which the option is written follows a geometric Brownian motion with a constant volatility σ. Further, transaction costs are neglected, and trading of the underlying is supposed to have no influence on the price \(S_{t}\). As has been discussed extensively, the value function \(V(S,t)\) for standard options (“plain vanilla”) of the European type satisfies the Black–Scholes equation (1.5),

$$\displaystyle{{ \partial V \over \partial t} +{ 1 \over 2}\sigma ^{2}S^{2}{\partial ^{2}V \over \partial S^{2}} + rS{\partial V \over \partial S} - rV = 0\,. }$$
(7.1)

Solutions of this linear equation are subject to the terminal condition V (S, T) = Ψ(S), where Ψ defines the payoff.

The BS-model is the core example of a complete market. In these idealized markets, the risk exposure to variations in the underlying can be hedged away. The corresponding hedging strategy is unique. Hence vanilla options modeled by Assumptions 1.2 have a unique price, given by the cost of the replication strategy (Appendix A.4). Essentially, Chaps. 4 through 6 have applied numerical methods to complete markets.

For the more realistic incomplete markets, there are no perfect hedges, and a risk remains. Each hedging strategy leads to a specific model with its own price [84]. The hedger compensates for the remaining risk in incomplete markets by charging an additional risk premium. Hence the value function or expected value is not the price for which the option is sold. Depending on how the convenient assumption of completeness of the BS market is lost, different models are set up, calling for different numerical approaches. This Chap. 7 is devoted to computational tools for incomplete markets.

Relaxing several of the assumptions of the Black–Scholes market, nonlinear extensions of the BS equation can be derived. These “nonlinear Black–Scholes type equations” are of the form

$$\displaystyle{{ \partial V \over \partial t} +{ 1 \over 2}\left [\hat{\sigma }(S,t,{ \partial ^{2}V \over \partial S^{2}} )\right ]^{2}S^{2}{\partial ^{2}V \over \partial S^{2}} + rS{\partial V \over \partial S} - rV = 0\,. }$$
(7.2)

In this class of models, the volatility \(\hat{\sigma }\) is a function that may incorporate several types of nonlinearity. The standard PDE (7.1) is included for \(\hat{\sigma }\equiv \sigma\). In Sect. 7.1 we describe three scenarios leading to three different functions \(\hat{\sigma }\) of the volatility. A nonlinear PDE such as (7.2) requires special numerical treatment, which will be the focus of Sect. 7.2.

Another stream of research beyond Black and Scholes is devoted to jump processes (Sect. 7.3). One of the numerical approaches is based on partial integro-differential equations (PIDE). Some highly efficient methods apply the Fourier transform; a basic approach will be discussed in Sect. 7.4.

7.1 Nonlinearities in Models for Financial Options

In this section we briefly discuss three sources of nonlinearity of \(\hat{\sigma }\) in (7.2). We start with transaction costs based on Leland’s approach [245], and touch the more sophisticated model of Barles and Soner [24]. Then we turn to specifying ranges of volatility. Finally we address feedback by market illiquidity.

7.1.1 Leland’s Model of Transaction Costs

The Black–Scholes model is built on the idea of rebalancing the portfolio continuously. But in financial reality this continuous trading would cause arbitrarily high trading costs. Keeping transaction costs low forces the hedger to abandon the optimal Black–Scholes hedging. But without the ideal BS hedging, the model suffers from hedging errors. As a compromise, the hedger seeks a balance that keeps both the transaction costs and the hedging errors low.

Suppose that instead of rebalancing continuously, trading is only possible at discrete time instances, a time step Δt apart (Δt fixed and finite). We assume a transaction cost rate proportional to the trading volume νS:

$$\displaystyle{\mbox{ trading}\ \nu \ \mbox{ assets costs the amount}\ c\vert \nu \vert S}$$

for some cost parameter c.

Here we sketch a heuristic derivation of a model due to [187, 245]. The discussion of this model parallels that for the Black–Scholes model, now adapted to the discrete scenario. The stochastic changes of the asset with price S and of a riskless bond with price B are

$$\displaystyle\begin{array}{rcl} \varDelta S& =& \mu S\varDelta t +\sigma S\varDelta W\,, {}\\ \varDelta B& =& rB\varDelta t\,. {}\\ \end{array}$$

The portfolio with value Π is taken in the form

$$\displaystyle{\varPi =\alpha S +\beta B\,,}$$

with α units of the asset and β units of the bond. Suppose the portfolio is self-financing in the sense SΔα + BΔβ = 0, which is sufficient for ΔΠ = αΔS + βΔB. Further assume that trading is such that the portfolio Π replicates the value of the option.

By definition, ν = Δα. After one time interval, ν = Δα assets are traded, with transaction costs cS | Δα |. The change in the value of the portfolio is

$$\displaystyle{ \begin{array}{rcl} \varDelta \varPi & =&\alpha \varDelta S +\beta \varDelta B - cS\vert \varDelta \alpha \vert \\ & =&(\alpha \mu S +\beta rB)\varDelta t +\alpha \sigma S\varDelta W - cS\vert \varDelta \alpha \vert \,. \end{array} }$$
(7.3)

Let V be the value function of the option. Itô’s lemma adapted to the discrete scenario gives

$$\displaystyle{\varDelta V ={ \partial V \over \partial S} \;\varDelta S + \left ({\partial V \over \partial t} +{ \sigma ^{2} \over 2}S^{2}{\partial ^{2}V \over \partial S^{2}} \right )\varDelta t\,.}$$

By the no-arbitrage principle ΔV = ΔΠ holds for the replicating and self-financing portfolio. And coefficient matching will give further information. But first let us approximate the Δα-term.

From BS theory we expect \(\alpha \approx { \partial V \over \partial S}\). So ν = Δα will be approximated by

$$\displaystyle\begin{array}{rcl} & &{ \partial V (S +\varDelta S,t +\varDelta t) \over \partial S} -{ \partial V (S,t) \over \partial S} {}\\ & & ={ \partial ^{2}V (S,t) \over \partial S^{2}} \;\varDelta S +{ \partial ^{2}V (S,t) \over \partial S\,\partial t} \;\varDelta t +\mathrm{ t.h.o.}\,, {}\\ \end{array}$$

invoking Taylor’s expansion. After substituting ΔS we realize that the term of lowest order is

$$\displaystyle{\sigma S\,{\partial ^{2}V (S,t) \over \partial S^{2}} \;\varDelta W\,.}$$

In summary, by (7.3) the transaction costs in ΔΠ can be approximated by

$$\displaystyle{-cS\vert \varDelta \alpha \vert = -c\sigma S^{2}\Big\vert {\partial ^{2}V (S,t) \over \partial S^{2}} \Big\vert \;\vert \varDelta W\vert +\mathrm{ t.h.o.}\,,}$$

which is path-dependent. Leland [245] boldly suggested approximating |ΔW| ≈ E(|ΔW|). Exercise 7.1 shows that

$$\displaystyle{\mathsf{E}(\vert \varDelta W\vert ) = \sqrt{\varDelta t}\;\sqrt{{2 \over \pi }} \;.}$$

In this way, the trading cost term − cS | Δα | is approximated by the deterministic expression

$$\displaystyle{ -c\sigma S^{2}\Big\vert {\partial ^{2}V (S,t) \over \partial S^{2}} \Big\vert \;\sqrt{\varDelta t}\;\sqrt{{2 \over \pi }} \,. }$$
(7.4)

This may be seen as a further assumption, motivated by the above reasoning. The approximation (7.4) of the transaction costs and its artificial parameter \(\sqrt{ 2/\pi } \approx 0.8\) reflect the lack of a unique price in incomplete markets.
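The expectation that enters (7.4) can be checked by simulation. The following is a minimal sketch, not part of the derivation: the function name `mean_abs_increment`, the sample size, the seed, and the value Δt = 0.01 are illustrative assumptions.

```python
import math
import random

def mean_abs_increment(dt, n_samples=200_000, seed=1):
    """Monte Carlo estimate of E(|dW|) for a Wiener increment dW ~ N(0, dt)."""
    rng = random.Random(seed)          # fixed seed (an assumption) for reproducibility
    sd = math.sqrt(dt)                 # standard deviation of the increment
    total = sum(abs(rng.gauss(0.0, sd)) for _ in range(n_samples))
    return total / n_samples

dt = 0.01                              # illustrative time step
estimate = mean_abs_increment(dt)
exact = math.sqrt(dt) * math.sqrt(2.0 / math.pi)   # closed form used in (7.4)
print(estimate, exact)
```

The two printed numbers should agree up to the sampling error, which shrinks like one over the square root of the sample size.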

With this somewhat artificial approximation (7.4) of the trading costs −cS|Δα|, coefficient matching in ΔV = ΔΠ first equates the remaining stochastic terms,

$$\displaystyle{\alpha \sigma S\varDelta W =\sigma S{\partial V \over \partial S} \;\varDelta W\,,}$$

or \(\alpha ={ \partial V \over \partial S}\), which is the famous “delta hedging,” consistent with the modeling of Δα above. The remaining terms are deterministic. Use \(\beta B + S{ \partial V \over \partial S} =\varPi = V\) to obtain

$$\displaystyle{ \begin{array}{rcl} &&\left (\mu S{\partial V \over \partial S} + rV - rS{\partial V \over \partial S} \right )\,\varDelta t - cS\vert \varDelta \alpha \vert \\ && = \left ({\partial V \over \partial t} +{ \sigma ^{2} \over 2}S^{2}{\partial ^{2}V \over \partial S^{2}} +\mu S{\partial V \over \partial S} \right )\,\varDelta t\,. \end{array} }$$
(7.5)

The μ-terms cancel out. Equation (7.5) with transaction costs replaced by (7.4), after division by Δt, leads to a variant of the Black–Scholes equation. With the coefficient

$$\displaystyle{ \gamma:= \sqrt{{2 \over \pi }} \left ({ 2c \over \sigma \sqrt{\varDelta t}}\right ) }$$
(7.6)

the resulting equation is

$$\displaystyle{{ \partial V \over \partial t} +{ 1 \over 2}\sigma ^{2}S^{2}{\partial ^{2}V \over \partial S^{2}} +{ 1 \over 2}\sigma ^{2}S^{2}\gamma \,\Big\vert {\partial ^{2}V \over \partial S^{2}} \Big\vert + rS{\partial V \over \partial S} - rV = 0\,. }$$
(7.7)

Formally, this becomes the standard Black–Scholes equation with a modified volatility

$$\displaystyle{ \hat{\sigma }^{2}(\varGamma ):=\sigma ^{2}[1 +\gamma \ \mathrm{ sign}(\varGamma )]\;, }$$
(7.8)

with \(\varGamma:={ \partial ^{2}V \over \partial S^{2}}\). For convex payoff, sign(Γ) = 1. This amounts to augmenting the volatility to a constant \(\hat{\sigma }>\sigma\) (Leland’s scenario). In this case the PDE (7.7) is again linear. But note that, for instance, for barrier options Γ does change sign, and the PDE is nonlinear and of the general type of Eq. (7.2). For c = 0 (no transaction costs), (7.7) specializes to the BS-equation. To have a well-posed PDE, Δt must be such that γ < 1; otherwise \(\hat{\sigma }^{2}\) in (7.8) is not positive where Γ < 0. In particular, Δt → 0 does not make sense, because by (7.6) γ grows without bound as Δt → 0.
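To get a feeling for the size of γ, the following sketch evaluates (7.6) and (7.8). The function names, the convention sign(0) = +1, and the sample values (c = 0.002, σ = 0.2, weekly rebalancing) are assumptions made for illustration only.

```python
import math

def leland_gamma(c, sigma, dt):
    """Coefficient gamma of (7.6) for cost rate c, volatility sigma, time step dt."""
    return math.sqrt(2.0 / math.pi) * 2.0 * c / (sigma * math.sqrt(dt))

def leland_sigma_hat(sigma, gamma, Gamma):
    """Modified volatility of (7.8) for a given value Gamma of the second derivative."""
    sign = 1.0 if Gamma >= 0 else -1.0        # sign(0) = +1 is an assumed convention
    variance = sigma**2 * (1.0 + gamma * sign)
    if variance <= 0.0:
        # happens for Gamma < 0 and gamma >= 1: the PDE (7.7) is not well posed
        raise ValueError("gamma >= 1 with negative Gamma")
    return math.sqrt(variance)

gamma = leland_gamma(c=0.002, sigma=0.2, dt=1.0 / 52.0)   # illustrative values
print(gamma)
print(leland_sigma_hat(0.2, gamma, Gamma=+1.0))   # convex region: volatility above sigma
print(leland_sigma_hat(0.2, gamma, Gamma=-1.0))   # concave region: volatility below sigma
```

Shrinking `dt` in this sketch drives `gamma` above 1, at which point the second call raises the error; this mirrors the remark that Δt → 0 does not make sense.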

7.1.2 The Barles and Soner Model of Transaction Costs

Barles and Soner [24] assume a price process \(\mathrm{d}S_{t} = S_{t}(\mu \,\mathrm{d}t +\sigma \,\mathrm{d}W_{t})\), with constant volatility σ, 0 ≤ t ≤ T, and model transactions using the following variables:

  • \(\alpha _{t}\) shares of the asset with price \(S_{t}\),

  • \(\beta _{t}\) shares of the bond,

  • \(L_{t}\) cumulative transfer from cash to stock, nondecreasing, L(0) = 0,

  • \(M_{t}\) cumulative transfer from stock to cash, nondecreasing, M(0) = 0.

Consequently,

$$\displaystyle\begin{array}{rcl} \alpha _{t}& =& \alpha _{0} + L_{t} - M_{t}\,, {}\\ \beta _{t}& =& \beta _{0} -\int _{0}^{t}S_{\tau } \cdot (1 + c)\,\mathrm{d}L_{\tau } +\int _{ 0}^{t}S_{\tau } \cdot (1 - c)\,\mathrm{d}M_{\tau } +\int _{ 0}^{t}r\beta _{\tau }\,\mathrm{d}\tau \,. {}\\ \end{array}$$

That is, in both cases, buying and selling of stock, transaction costs \(\int S_{\tau }\,c\) are charged to β, where c again denotes proportional transaction costs. The further derivation of [24] is based on a utility function. The final result is

$$\displaystyle{{ \partial V \over \partial t} +{ 1 \over 2}\sigma ^{2}S^{2}{\partial ^{2}V \over \partial S^{2}} \cdot \left [1 + f\left (\mathrm{e}^{r(T-t)}a^{2}S^{2}{\partial ^{2}V \over \partial S^{2}} \right )\right ] + rS{\partial V \over \partial S} - rV = 0\,, }$$
(7.9)

where a is a parameter representing proportional transaction costs and risk aversion. The function f is the unique solution of the ODE

$$\displaystyle{{\mathrm{d}f(x) \over \mathrm{d}x} ={ f(x) + 1 \over 2\sqrt{xf(x)} - x}\quad \text{ with }\ f(0) = 0.}$$

The resulting function f is singular at x = 0 (Exercise 7.2). Figure 7.1 shows the difference between the BS-solution and the solution of the corresponding nonlinear model (7.9).

Fig. 7.1

V (S, Tt): difference between the solution of the Black–Scholes equation (7.1) and the solution of (7.9); K = 100, r = 0.1, σ = 0.2, a = 0.02, T = 1. With kind permission of Pascal Heider

7.1.3 Specifying a Range of Volatility

The two models of transaction costs above come up with a nonlinear volatility function \(\hat{\sigma }(\varGamma )\). Usually this function is not known, and is subject to speculation (modeling). It is easier to specify a range of volatility, assuming that \(\hat{\sigma }\) lies within an interval or band

$$\displaystyle{0 <\sigma _{\mathrm{min}} \leq \sigma \leq \sigma _{\mathrm{max}} <\infty \,.}$$

This is the uncertain-volatility model of [16, 17, 250].

The derivation starts as above, leading to (7.5) with c = 0. (Here transaction costs are not considered.) Formally, the result is the Black–Scholes equation (BSE), except that σ is not a constant but is considered a stochastic variable σ(t):

$$\displaystyle{{\partial V \over \partial t} +{ 1 \over 2}\sigma (t)^{2}S^{2}{\partial ^{2}V \over \partial S^{2}} + rS{\partial V \over \partial S} - rV = 0\;.}$$

This is a PDE with stochastic control parameter σ(t). There is an ambitious theory for such controlled diffusion processes, see the monograph [233]. To avoid the use of this methodology, we adopt a simplified argument, similar to that in [375].

Following an argument of Black and Scholes, we construct a portfolio of one option (value V), and hedge it with −α units of the underlying asset,

$$\displaystyle{\varPi = V -\alpha S\;.}$$

Assuming a change in the value of this portfolio in the form ΔΠ = ΔV − αΔS, we have as above

$$\displaystyle{\varDelta \varPi ={ \partial V \over \partial S} \varDelta S + \left ({\partial V \over \partial t} +{ \sigma ^{2} \over 2}S^{2}{\partial ^{2}V \over \partial S^{2}} \right )\varDelta t -\alpha \varDelta S\;.}$$

The choice \(\alpha ={ \partial V \over \partial S}\) eliminates the risk represented by the ΔW-terms. This results in

$$\displaystyle{ \varDelta \varPi = \left ({\partial V \over \partial t} +{ \sigma ^{2} \over 2}S^{2}{\partial ^{2}V \over \partial S^{2}} \right )\varDelta t\,. }$$
(7.10)

Note that the return ΔΠ of the portfolio still depends on the unknown stochastic σ(t); we write ΔΠ(σ).

Now we define artificially two specific functions \(\sigma ^{+}(t)\) and \(\sigma ^{-}(t)\), chosen such that the return ΔΠ(σ) increases by the maximum amount, or by the least amount:

  • \(\sigma ^{+}(t)\) chosen such that \(\varDelta \varPi (\sigma ^{+})\) is a maximum,

  • \(\sigma ^{-}(t)\) chosen such that \(\varDelta \varPi (\sigma ^{-})\) is a minimum.

These returns reflect the best case and the worst case as seen by the holder. For every function σ(t) the no-arbitrage principle holds. Hence both cases \(\sigma ^{+}(t)\) and \(\sigma ^{-}(t)\) result in a return ΔΠ = rΠΔt. This can be summarized as

$$\displaystyle\begin{array}{rcl} & & \sigma ^{+}\ \mathrm{maximizes}\quad \mathop{\mathrm{max}}\limits _{\sigma _{\min } \leq \sigma \leq \sigma _{\mathrm{max}}}\varDelta \varPi (\sigma ) = r\varPi \varDelta t\,, {}\\ & & \sigma ^{-}\ \mathrm{minimizes}\quad \mathop{\mathrm{min}}\limits _{\sigma _{\min } \leq \sigma \leq \sigma _{\mathrm{max}}}\varDelta \varPi (\sigma ) = r\varPi \varDelta t\,. {}\\ \end{array}$$

In view of the expression (7.10) for ΔΠ(σ), the two artificial functions \(\sigma ^{+}\), \(\sigma ^{-}\) enter via the term

$$\displaystyle{\sigma ^{2}{\partial ^{2}V \over \partial S^{2}} \,.}$$

For ΔΠ to become a maximum or minimum, \(\sigma ^{+}\) (or \(\sigma ^{-}\)) will equal \(\sigma _{\mathrm{min}}\) or \(\sigma _{\mathrm{max}}\), depending on the sign of \(\varGamma ={ \partial ^{2}V \over \partial S^{2}}\). To become a maximum, set

$$\displaystyle{ \sigma ^{+}(\varGamma ):= \left \{\begin{array}{@{}l@{\quad }l@{}} \sigma _{\mathrm{max}}\quad &\text{if}\ \varGamma \geq 0\,, \\ \sigma _{\mathrm{min}} \quad &\text{if}\ \varGamma <0\,.\\ \quad \end{array} \right. }$$
(7.11)

And to become a minimum, set

$$\displaystyle{ \sigma ^{-}(\varGamma ):= \left \{\begin{array}{@{}l@{\quad }l@{}} \sigma _{\mathrm{max}}\quad &\text{if}\ \varGamma <0\,, \\ \sigma _{\mathrm{min}} \quad &\text{if}\ \varGamma \geq 0\,.\\ \quad \end{array} \right. }$$
(7.12)

Equations (7.11) and (7.12) define two specific control functions σ, which after substitution into the equation ΔΠ(σ) = rΠΔt yield two nonlinear PDEs

$$\displaystyle{{ \partial V \over \partial t} +{ 1 \over 2}\hat{\sigma }(\varGamma )^{2}S^{2}{\partial ^{2}V \over \partial S^{2}} + rS{\partial V \over \partial S} - rV = 0\,, }$$
(7.13)

with \(\hat{\sigma }=\sigma ^{+}\) and \(\hat{\sigma }=\sigma ^{-}\) from (7.11)/(7.12). Let us denote the corresponding solutions \(V^{+}\) and \(V^{-}\). Since \(\sigma ^{+}\) yields the maximum return, we expect \(V \leq V^{+}\), and similarly, \(V^{-}\leq V\). This provides the range \(V^{-}\leq V \leq V^{+}\) for the option price.
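The switching rules (7.11) and (7.12) are easy to state in code. The sketch below (function names and the sample band 0.15 ≤ σ ≤ 0.25 are illustrative assumptions) also confirms by brute force over the band that the selected value maximizes or minimizes the term σ²Γ.

```python
def sigma_plus(Gamma, sigma_min, sigma_max):
    """Control (7.11): the volatility in the band that maximizes sigma^2 * Gamma."""
    return sigma_max if Gamma >= 0 else sigma_min

def sigma_minus(Gamma, sigma_min, sigma_max):
    """Control (7.12): the volatility in the band that minimizes sigma^2 * Gamma."""
    return sigma_max if Gamma < 0 else sigma_min

sigma_min, sigma_max = 0.15, 0.25                       # illustrative band
band = [sigma_min + k * (sigma_max - sigma_min) / 100 for k in range(101)]

for Gamma in (-2.0, -0.5, 0.0, 0.5, 2.0):
    best = max(s**2 * Gamma for s in band)              # brute-force maximum
    worst = min(s**2 * Gamma for s in band)             # brute-force minimum
    sp = sigma_plus(Gamma, sigma_min, sigma_max)
    sm = sigma_minus(Gamma, sigma_min, sigma_max)
    print(Gamma, sp, sm, sp**2 * Gamma >= best - 1e-15, sm**2 * Gamma <= worst + 1e-15)
```

Because σ²Γ is monotone in σ for fixed Γ, the extremum always sits at an endpoint of the band, which is why no interior values of σ need to be considered.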

In the special case of vanilla options, the convexity of V(S, ·) implies Γ ≥ 0 and hence \(\sigma ^{+} =\sigma _{\mathrm{max}}\) and \(\sigma ^{-} =\sigma _{\mathrm{min}}\); the nonlinearity is not effective then. The monotonicity of V with respect to σ is clear for vanilla options, but is not valid, for example, for barrier options. And convexity of V(S, ·) is lost for barrier options, butterfly spreads, digital options, and many other options [303]. The great potential of the uncertain-volatility model is illustrated by Fig. 7.2. For the example of a butterfly spread and an uncertainty interval 0.15 ≤ σ ≤ 0.25, we show the band \(V^{-}\leq V \leq V^{+}\), with two Black–Scholes curves therein. The payoff of a butterfly spread is illustrated schematically in Fig. 1.25d, see also Exercise 7.3. The functions \(V^{-}\), \(V^{+}\) were calculated with the methods to be explained in Sect. 7.2. For barrier options, the success of the method is doubtful because of the high sensitivity w.r.t. σ close to the barrier. Then the bandwidth may be so large that it is not of practical use. Such an example is shown in Fig. 7.3.

Fig. 7.2

V(S, 0) of a European butterfly spread, uncertain-volatility model of Avellaneda et al., Sect. 7.1.3; with K = 100, \(K_{1} = 85\), \(K_{2} = 115\), r = 0.13, \(\sigma _{\mathrm{min}} = 0.15\), \(\sigma _{\mathrm{max}} = 0.25\), δ = 0.03, T = 0.27. Four curves are shown: the bounding functions \(V^{+}\) (orange curve) and \(V^{-}\) (green curve), and V of the standard Black–Scholes model with constant volatilities σ = 0.15 (the steeper curve, in blue) and σ = 0.25 (the lower profile, in violet)

Fig. 7.3

V(S, 0) of a European up-and-out barrier call, uncertain-volatility model of Avellaneda et al., Sect. 7.1.3; with barrier B = 115, and K = 100, r = 0.1, \(\sigma _{\mathrm{min}} = 0.1\), \(\sigma _{\mathrm{max}} = 0.3\), δ = 0, T = 0.2. In addition to the two bounding curves \(V^{+}\) (orange) and \(V^{-}\) (green), three V curves are shown of the standard Black–Scholes model with constant volatilities σ = 0.1 (blue) and σ = 0.2, 0.3

7.1.4 Market Illiquidity

As pointed out by [140, 141, 330], the assumption that a big investor can trade large amounts of an asset without affecting its price is not realistic. There will be a feedback, and the assumption of an infinite market liquidity may fail. Frey and Stremme [141] and Schönbucher and Wilmott [330] introduce a market liquidity parameter λ, with 0 ≤ λ ≤ 1, and derive the nonlinear PDE

$$\displaystyle{{ \partial V \over \partial t} +{ 1 \over 2}{ \sigma ^{2}S^{2} \over (1 -\lambda { \partial ^{2}V \over \partial S^{2}} )^{2}}\,{\partial ^{2}V \over \partial S^{2}} + rS{\partial V \over \partial S} - rV = 0\,. }$$
(7.14)

Here we do not discuss further details. Note that this model is also of the form of Eq. (7.2).

7.2 Numerical Solution of Nonlinear Black–Scholes Equations

All the nonlinear PDEs of Sect. 7.1 fall under the general type of equation

$$\displaystyle{{ \partial V \over \partial t} +{ 1 \over 2}\hat{\sigma }^{2}(S,t,{ \partial ^{2}V \over \partial S^{2}} )S^{2}{\partial ^{2}V \over \partial S^{2}} + (r-\delta )S{\partial V \over \partial S} - rV = 0\,, }$$
(7.15)

which we are going to solve next. In this form, Eq. (7.15) represents the value of a European-style option. There is no analytical solution known for (7.15), so a numerical approach is needed also in the European case.

For an American-style option, a penalization can be applied, and an additional nonlinear term appears in (7.15). A penalty approach (e.g., [119, 133]) is to add the penalty \(\hat{p}\max (\varPsi -V,\,0)\), where Ψ denotes the payoff, and the penalty parameter \(\hat{p}\) is chosen large, say, \(\hat{p} = 10^{6}\). The resulting PDE is

$$\displaystyle{{ \partial V \over \partial t} +{ 1 \over 2}\hat{\sigma }^{2}(S,t,{ \partial ^{2}V \over \partial S^{2}} )S^{2}{\partial ^{2}V \over \partial S^{2}} + (r-\delta )S{\partial V \over \partial S} - rV +\hat{ p}\max (\varPsi -V,\,0) = 0\,. }$$
(7.16)

In the continuation region, for V ≥ Ψ, the penalty term is zero, and (7.15) results. For \(\hat{p} \rightarrow \infty\), think of dividing the equation by \(\hat{p}\) to be convinced that V sticks close to Ψ. In Chap. 4, we could preserve the linear equation by the elegant complementarity approach. In (7.16) the PDE is already nonlinear through the volatility function \(\hat{\sigma }\), and thus the nonlinear penalty term does not cause further harm.

7.2.1 Transformation

The transformation (4.3) of Chap. 4 is not valid here, because the volatility \(\hat{\sigma }\) is no longer constant. But assuming constant r, δ, the independent variables S, t can be transformed similarly. The transformation from variables S, t, V to x, τ, u is

$$\displaystyle{ x:=\log { S \over K}\,,\ \ \tau:={ 1 \over 2}\sigma _{0}^{2} \cdot (T - t)\,,\ \ u(x,\tau ):=\mathrm{ e}^{-x}{V (S,t) \over K} \,. }$$
(7.17)

\(\sigma _{0}\) is a scaling parameter. As a result of the transformation, \(V_{S} = u + u_{x}\) and \(SV_{SS} = u_{x} + u_{xx}\). Here we use the notations \(V_{S}\), \(V_{SS}\), \(u_{\tau }\), \(u_{x}\), \(u_{xx}\) for partial derivatives. And (7.15) becomes

$$\displaystyle{ -u_{\tau } + \tilde{\sigma }^{2}(x,\tau,u_{ x},u_{xx})(u_{x} + u_{xx}) +{ 2(r-\delta ) \over \sigma _{0}^{2}} u_{x} -{ 2\delta \over \sigma _{0}^{2}}u = 0 }$$
(7.18)

with

$$\displaystyle{ \tilde{\sigma }:={ 1 \over \sigma _{0}} \,\hat{\sigma }\left (S,t,{ \partial ^{2}V \over \partial S^{2}} \right ) ={ 1 \over \sigma _{0}} \,\hat{\sigma }\left (K\mathrm{e}^{x},\,T -{ 2\tau \over \sigma _{0}^{2}},\,{ \mathrm{e}^{-x} \over K} (u_{x} + u_{xx})\right )\,. }$$
(7.19)
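For readers who want to retrace the step from (7.15) to (7.18), here is a sketch of the chain-rule computation. It uses only (7.17), that is \(S = K\mathrm{e}^{x}\) and \(V = K\mathrm{e}^{x}u\), so that \(\partial x/\partial S = 1/S\) and \(\partial \tau /\partial t = -{1 \over 2}\sigma _{0}^{2}\):

$$\displaystyle\begin{array}{rcl} V _{t}& =& -{ 1 \over 2}\sigma _{0}^{2}\,K\mathrm{e}^{x}u_{\tau }\,, \\ V _{S}& =& { 1 \over S}\,{ \partial \over \partial x}\left (K\mathrm{e}^{x}u\right ) = u + u_{x}\,, \\ V _{SS}& =& { 1 \over S}\,{ \partial \over \partial x}\left (u + u_{x}\right ) ={ 1 \over S}\,(u_{x} + u_{xx})\,. \end{array}$$

Substituting into (7.15) and dividing by \({1 \over 2}\sigma _{0}^{2}K\mathrm{e}^{x}\) gives

$$\displaystyle{-u_{\tau } +{ \hat{\sigma }^{2} \over \sigma _{0}^{2}} (u_{x} + u_{xx}) +{ 2(r-\delta ) \over \sigma _{0}^{2}} (u + u_{x}) -{ 2r \over \sigma _{0}^{2}}u = 0\,.}$$

Collecting the u-terms, \({2(r-\delta ) \over \sigma _{0}^{2}} -{ 2r \over \sigma _{0}^{2}} = -{ 2\delta \over \sigma _{0}^{2}}\), yields (7.18) with \(\tilde{\sigma }=\hat{\sigma }/\sigma _{0}\).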

(Transform (7.16) in Exercise 7.4.) For example, for Leland’s model,

$$\displaystyle{\tilde{\sigma }^{2} = 1 +\gamma \,\mathrm{ sign}(u_{ x} + u_{xx})\,.}$$

For all of the models of Sect. 7.1 the nonlinearity is of the type

$$\displaystyle{ \tilde{\sigma }^{2}(x,\tau,s) \cdot s\ \ \mathrm{with}\ \ s:= u_{ x} + u_{xx}\,, }$$
(7.20)

with \(\tilde{\sigma }\) from (7.19).

The payoffs Ψ of the options are transformed as well. Let \(u^{{\ast}}\) denote the transformed payoff. For the payoff of a vanilla put,

$$\displaystyle{V (S,T) = K\mathrm{e}^{x}u(x,0) = (K - S)^{+} = K(1 -\mathrm{ e}^{x})^{+}}$$

and hence

$$\displaystyle{u(x,0) = u^{{\ast}}(x):= (\mathrm{e}^{-x} - 1)^{+}\,.}$$

Similarly, for a vanilla call,

$$\displaystyle{u(x,0) = u^{{\ast}}(x):= (1 -\mathrm{ e}^{-x})^{+}\,.}$$

Exotic options are treated similarly (Exercise 7.3).
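The transformation (7.17) and the transformed vanilla payoffs can be sketched in a few lines. The function names and the sample numbers below are illustrative assumptions; the formulas are those of (7.17) and of the payoffs \(u^{{\ast}}\) above.

```python
import math

def to_transformed(S, t, V, K, T, sigma0):
    """Map (S, t, V) to (x, tau, u) according to (7.17)."""
    x = math.log(S / K)
    tau = 0.5 * sigma0**2 * (T - t)
    u = math.exp(-x) * V / K
    return x, tau, u

def from_transformed(x, tau, u, K, T, sigma0):
    """Inverse map from (x, tau, u) back to (S, t, V)."""
    S = K * math.exp(x)
    t = T - 2.0 * tau / sigma0**2
    V = K * math.exp(x) * u
    return S, t, V

def payoff_put(x):
    """Transformed payoff u*(x) of a vanilla put."""
    return max(math.exp(-x) - 1.0, 0.0)

def payoff_call(x):
    """Transformed payoff u*(x) of a vanilla call."""
    return max(1.0 - math.exp(-x), 0.0)

K, T, sigma0 = 100.0, 1.0, 0.2            # illustrative parameters
x, tau, u = to_transformed(120.0, 0.25, 23.5, K, T, sigma0)
print(from_transformed(x, tau, u, K, T, sigma0))   # recovers (S, t, V) up to rounding
```

Multiplying \(u^{{\ast}}(x)\) by \(K\mathrm{e}^{x}\) gives back the payoff in the original variables, which is a convenient sanity check when implementing other payoffs.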

Finally, boundary conditions are chosen (as in Sect. 4.4) and transformed. For example, applying (4.27) for a vanilla call of the European type,

$$\displaystyle\begin{array}{rcl} u(x_{\mathrm{max}},\tau )& =&{ \mathrm{e}^{-x_{\mathrm{max}}} \over K} V (S_{\mathrm{max}},t) {}\\ & =&{ \mathrm{e}^{-x_{\mathrm{max}}} \over K} (S_{\mathrm{max}}\mathrm{e}^{-\delta (T-t)} - K\mathrm{e}^{-r(T-t)}) {}\\ & =& \mathrm{e}^{-\delta (T-t)} -\exp (-r(T - t) - x_{\mathrm{ max}}) {}\\ & =& \exp (-\tau { 2\delta \over \sigma _{0}^{2}}) -\exp (-\tau {2r \over \sigma _{0}^{2}} - x_{\mathrm{max}})\,, {}\\ u(x_{\mathrm{min}},\tau )& =& 0\,. {}\\ \end{array}$$

For a vanilla put and \(S_{\mathrm{min}} \approx 0\) one may choose

$$\displaystyle\begin{array}{rcl} u(x_{\mathrm{min}},\tau )& =&{ 1 \over K}\mathrm{e}^{-x_{\mathrm{min}} }K\mathrm{e}^{-r(T-t)} =\exp (-\tau {2r \over \sigma _{0}^{2}} - x_{\mathrm{min}})\,, {}\\ u(x_{\mathrm{max}},\tau )& =& 0\,. {}\\ \end{array}$$

For vanilla American-style options with penalty formulation (7.16), the nonzero boundary conditions are just that u is in contact with the payoff,

$$\displaystyle\begin{array}{rcl} & & u(x_{\mathrm{min}}) = u^{{\ast}}(x_{\mathrm{ min}}) =\mathrm{ e}^{-x_{\mathrm{min}} } - 1\quad \ \text{ for a put, and } {}\\ & & u(x_{\mathrm{max}}) = u^{{\ast}}(x_{\mathrm{ max}}) = 1 -\mathrm{ e}^{-x_{\mathrm{max}} }\quad \text{ for a call}. {}\\ \end{array}$$
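The European boundary values in the transformed variables can be sketched as follows. The function names and the sample parameters are assumptions for illustration; the last lines compare the call value with the same boundary condition written in the original variables S and t.

```python
import math

def call_boundary_xmax(tau, x_max, r, delta, sigma0):
    """u(x_max, tau) for a European vanilla call; u(x_min, tau) is set to 0."""
    return (math.exp(-2.0 * delta * tau / sigma0**2)
            - math.exp(-2.0 * r * tau / sigma0**2 - x_max))

def put_boundary_xmin(tau, x_min, r, sigma0):
    """u(x_min, tau) for a European vanilla put; u(x_max, tau) is set to 0."""
    return math.exp(-2.0 * r * tau / sigma0**2 - x_min)

# Illustrative parameters (assumptions, not taken from the text)
K, T, t = 100.0, 1.0, 0.4
r, delta, sigma0, x_max = 0.1, 0.03, 0.2, 2.0
tau = 0.5 * sigma0**2 * (T - t)

# Same boundary value expressed in the original variables
S_max = K * math.exp(x_max)
direct = math.exp(-x_max) / K * (S_max * math.exp(-delta * (T - t))
                                 - K * math.exp(-r * (T - t)))
print(call_boundary_xmax(tau, x_max, r, delta, sigma0), direct)
```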

7.2.2 Discretization

Finite differences in a standard fashion as in Chap. 4, with the same grid, lead to nonlinear equations for the vector \(w^{(\nu )}\) of approximate values at time level \(\tau _{\nu } =\tau _{\nu -1} +\varDelta \tau\). The equidistant x-spacing with mesh size Δx consists of m subintervals, see Sect. 4.2.2. As before, the components \(w_{0}\) and \(w_{m}\) are defined by boundary conditions. The finite differences include

$$\displaystyle\begin{array}{rcl} \delta _{x}w_{i,\nu }&:=&{ w_{i+1,\nu } - w_{i-1,\nu } \over 2\varDelta x} \,, {}\\ \delta _{xx}w_{i,\nu }&:=&{ w_{i+1,\nu } - 2w_{i,\nu } + w_{i-1,\nu } \over \varDelta x^{2}} \,, {}\\ \end{array}$$

where \(\varDelta x^{2}\) is understood as \((\varDelta x)^{2}\). For the discretization replace s of (7.20) by \(\bar{s}\) with

$$\displaystyle{\bar{s}_{i,\nu }:= (\delta _{x} +\delta _{xx})w_{i,\nu } ={ w_{i+1,\nu } - w_{i-1,\nu } \over 2\varDelta x} +{ w_{i+1,\nu } - 2w_{i,\nu } + w_{i-1,\nu } \over \varDelta x^{2}} \,.}$$

Substituting into the PDEs is the next step. Here we confine ourselves to the European case (7.15); the discretization of (7.16) is analogous and left to the reader. Define

$$\displaystyle\begin{array}{rcl} \mathcal{L}_{i,\nu }:& =& \tilde{\sigma }^{2}(x_{ i},\tau _{\nu },\delta _{x}w_{i,\nu },\delta _{xx}w_{i,\nu })(\delta _{x}w_{i,\nu } +\delta _{xx}w_{i,\nu }) {}\\ & & +{2(r-\delta ) \over \sigma _{0}^{2}} \delta _{x}w_{i,\nu } -{ 2\delta \over \sigma _{0}^{2}}w_{i,\nu } {}\\ \end{array}$$

to arrive at the θ-approach

$$\displaystyle{{ -w_{i,\nu +1} + w_{i,\nu } \over \varDelta \tau } +\theta \mathcal{L}_{i,\nu +1} + (1-\theta )\mathcal{L}_{i,\nu } = 0\,. }$$
(7.21)

Recall that this includes Crank–Nicolson for \(\theta ={ 1 \over 2}\), and for θ = 1 the fully implicit Euler (BDF). The \(\tilde{\sigma }\) of the above examples is represented by the discretization \(\tilde{\sigma }(x_{i},\tau _{\nu },\bar{s}_{i,\nu })\) with

$$\displaystyle{ \begin{array}{rcl} \bar{s}_{i,\nu }& =&w_{i-1,\nu }\left (-{ 1 \over 2\varDelta x} +{ 1 \over \varDelta x^{2}}\right ) -{ 2 \over \varDelta x^{2}}w_{i,\nu } + w_{i+1,\nu }\left ({ 1 \over 2\varDelta x} +{ 1 \over \varDelta x^{2}}\right ) \\ & =&\alpha \,w_{i-1,\nu } -{ 2 \over \varDelta x^{2}}\,w_{i,\nu } +\beta \, w_{i+1,\nu }\,,\end{array} }$$
(7.22)

where we denote

$$\displaystyle{ \alpha:= -{ 1 \over 2\varDelta x} +{ 1 \over \varDelta x^{2}}\,,\quad \beta:={ 1 \over 2\varDelta x} +{ 1 \over \varDelta x^{2}}\,, }$$
(7.23)

and reuse the notation \(\tilde{\sigma }\) for the three-argument version. Now the discretized version of the operator \(\mathcal{L}_{i,\nu }\) is

$$\displaystyle{ \mathcal{L}_{i,\nu } =\tilde{\sigma } ^{2}(x_{ i},\tau _{\nu },\bar{s}_{i,\nu })\bar{s}_{i,\nu } +{ r-\delta \over \sigma _{0}^{2}\varDelta x}(w_{i+1,\nu } - w_{i-1,\nu }) -{ 2\delta \over \sigma _{0}^{2}}w_{i,\nu } }$$
(7.24)

and the θ-method reads

$$\displaystyle{ -w_{i,\nu +1} + w_{i,\nu } +\theta \varDelta \tau \mathcal{L}_{i,\nu +1} + (1-\theta )\varDelta \tau \mathcal{L}_{i,\nu } = 0\,. }$$
(7.25)

With the vector notation \(w^{(\nu )}\) as in Chap. 4 and a vector function F this is written

$$\displaystyle{F(w^{(\nu +1)},w^{(\nu )}) = 0\,.}$$

For the fully implicit BDF method (θ = 1), the ith equation of the vector equation F = 0 reads

$$\displaystyle{ \begin{array}{rcl} F_{i} =&-&w_{i}^{(\nu +1)} + w_{i}^{(\nu )} \\ & +&\varDelta \tau \bigg[\tilde{\sigma }^{2}(x_{ i},\tau _{\nu +1},\alpha w_{i-1}^{(\nu +1)} -{ 2 \over \varDelta x^{2}}w_{i}^{(\nu +1)} +\beta w_{ i+1}^{(\nu +1)})\cdot \\ & &\qquad \qquad (\alpha w_{i-1}^{(\nu +1)} -{ 2 \over \varDelta x^{2}}w_{i}^{(\nu +1)} +\beta w_{ i+1}^{(\nu +1)}) \\ &-&{ r-\delta \over \sigma _{0}^{2}\varDelta x}w_{i-1}^{(\nu +1)} -{ 2\delta \over \sigma _{0}^{2}}w_{i}^{(\nu +1)} +{ r-\delta \over \sigma _{0}^{2}\varDelta x}w_{i+1}^{(\nu +1)}\bigg] = 0\,. \end{array} }$$
(7.26)

For i = 0 and i = m, boundary conditions enter. Their basic structure is

$$\displaystyle{ \begin{array}{rcl} &&F_{0}^{(\nu )}:= u(x_{\mathrm{min}},\tau _{\nu }) - w_{0}^{(\nu )}\,, \\ &&F_{m}^{(\nu )}:= u(x_{\mathrm{max}},\tau _{\nu }) - w_{m}^{(\nu )}\,.\end{array} }$$
(7.27)

In the θ-method (7.25) boundary conditions enter in the form \(\theta F^{(\nu +1)} + (1-\theta )F^{(\nu )}\). The nonlinear equation \(F(w^{(\nu +1)},w^{(\nu )}) = 0\) with components defined by (7.26)/(7.27) represents a discretization of (7.15). It is solved iteratively by Newton’s method.
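As an illustration, the following sketch implements the fully implicit scheme (7.26)/(7.27) with a Newton iteration for a European call under Leland's volatility, where \(\tilde{\sigma }^{2}(s)\,s = s +\gamma \vert s\vert\). It is a minimal pure-Python sketch under stated assumptions, not a tuned solver: the names `thomas`, `bs_call`, `implicit_leland_call`, the choice \(\sigma _{0} =\sigma\), the grid limits, and the tolerances are all hypothetical. Since Γ ≥ 0 for a call, the result can be compared with the Black–Scholes formula evaluated at the volatility \(\sigma \sqrt{1+\gamma }\).

```python
import math

def thomas(sub, diag, sup, rhs):
    """Solve a tridiagonal system by elimination without pivoting."""
    n = len(rhs)
    d, r = diag[:], rhs[:]
    for i in range(1, n):
        f = sub[i] / d[i - 1]
        d[i] -= f * sup[i - 1]
        r[i] -= f * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (r[i] - sup[i] * x[i + 1]) / d[i]
    return x

def bs_call(S, K, r, sigma, T):
    """Black-Scholes value of a European call without dividends (reference only)."""
    N = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    return S * N(d1) - K * math.exp(-r * T) * N(d1 - sigma * math.sqrt(T))

def implicit_leland_call(K, r, delta, sigma, T, gamma,
                         m=300, n_steps=100, x_min=-3.0, x_max=3.0):
    """Value at S = K, t = 0 from the fully implicit scheme with Newton's method."""
    sigma0 = sigma                                  # scaling parameter (assumed choice)
    dx = (x_max - x_min) / m
    dtau = 0.5 * sigma0**2 * T / n_steps
    alpha = -1.0 / (2 * dx) + 1.0 / dx**2           # (7.23)
    beta = 1.0 / (2 * dx) + 1.0 / dx**2
    q = (r - delta) / (sigma0**2 * dx)
    p = 2.0 * delta / sigma0**2
    x = [x_min + i * dx for i in range(m + 1)]
    w = [max(1.0 - math.exp(-xi), 0.0) for xi in x]  # transformed call payoff
    for nu in range(1, n_steps + 1):
        tau = nu * dtau
        w_old = w[:]
        # boundary values of the European call; they make F_0 = F_m = 0
        w[0] = 0.0
        w[m] = (math.exp(-2 * delta * tau / sigma0**2)
                - math.exp(-2 * r * tau / sigma0**2 - x_max))
        for _ in range(50):                          # Newton iteration
            sub, sup = [0.0] * (m + 1), [0.0] * (m + 1)
            diag, F = [-1.0] * (m + 1), [0.0] * (m + 1)
            for i in range(1, m):
                s = alpha * w[i - 1] - 2.0 / dx**2 * w[i] + beta * w[i + 1]  # (7.22)
                g = s + gamma * abs(s)               # nonlinear term of type (7.20)
                dg = 1.0 + gamma * (1.0 if s >= 0 else -1.0)
                F[i] = -w[i] + w_old[i] + dtau * (g + q * (w[i + 1] - w[i - 1]) - p * w[i])
                sub[i] = dtau * (dg * alpha - q)     # dF_i / dw_{i-1}
                diag[i] = -1.0 - dtau * (2.0 * dg / dx**2 + p)
                sup[i] = dtau * (dg * beta + q)      # dF_i / dw_{i+1}
            step = thomas(sub, diag, sup, [-Fi for Fi in F])
            w = [wi + si for wi, si in zip(w, step)]
            if max(abs(si) for si in step) < 1e-12:
                break
    i0 = round(-x_min / dx)                          # grid index of x = 0, i.e. S = K
    return K * math.exp(x[i0]) * w[i0]

price = implicit_leland_call(100.0, 0.1, 0.0, 0.2, 1.0, gamma=0.5)
print(price, bs_call(100.0, 100.0, 0.1, 0.2 * math.sqrt(1.5), 1.0))
```

The Jacobian is tridiagonal, so each Newton step costs one tridiagonal solve. For payoffs whose Γ changes sign, the same loop applies, but the comparison with a single Black–Scholes value is no longer available.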

7.2.3 Convergence of the Discrete Equations

The above numerical scheme is of the form

$$\displaystyle{F(\varDelta \tau,\varDelta x,\nu,i,w_{i,\nu },\tilde{w}) = 0}$$

where \(\tilde{w}\) stands for the vector of all \(w_{k,l}\). For such a scheme convergence to the unique viscosity solution (Appendix C.5) can be proved, provided F satisfies three conditions [23], namely,

  • stability,

  • consistency, and

  • monotonicity.

An additional property must be assumed, not for the numerical scheme but for the equation, namely strong uniqueness. For the uniqueness we refer to the specialized literature [89].

The proof that a particular numerical scheme satisfies all three of these criteria can be quite involved [176, 177, 303]. Checking stability and consistency is rather standard, and has been widely discussed in previous chapters. Here we concentrate on the monotonicity of the scheme, which is a new aspect as compared to the investigations for the linear equation in Chap. 4.

Definition 7.1 (Monotone Scheme)

A discretization \(F(w^{(\nu +1)},w^{(\nu )})\) is monotone if for all i = 0, …, m

$$\displaystyle\begin{array}{rcl} \mathrm{(a)}\qquad & & F_{i}(w^{(\nu +1)} +\epsilon ^{(\nu +1)},\,w^{(\nu )} +\epsilon ^{(\nu )}) \geq F_{ i}(w^{(\nu +1)},w^{(\nu )})\ \text{ for all } {}\\ & & \epsilon ^{(\nu +1)}:= (0,\ldots,0,\epsilon _{ i-1}^{(\nu +1)},0,\epsilon _{ i+1}^{(\nu +1)},0,\ldots,0) \geq 0\ \text{ and} {}\\ & & \epsilon ^{(\nu )}:= (0,\ldots,0,\epsilon _{ i-1}^{(\nu )},\epsilon _{ i}^{(\nu )},\epsilon _{ i+1}^{(\nu )},0,\ldots,0) \geq 0\,, {}\\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} \mathrm{(b)}\qquad & & F_{i}(w^{(\nu +1)} +\epsilon ^{(\nu +1)},\,w^{(\nu )}) \leq F_{ i}(w^{(\nu +1)},w^{(\nu )})\ \text{ for all} {}\\ & & \epsilon ^{(\nu +1)}:= (0,\ldots,0,\epsilon _{ i}^{(\nu +1)},0,\ldots,0) \geq 0\,. {}\\ \end{array}$$

Translated into the fully implicit scheme (7.26)/(7.27), the condition (a) of monotonicity reads

$$\displaystyle\begin{array}{rcl} & & F_{i}(w_{i}^{(\nu +1)},\,w_{ i-1}^{(\nu +1)} +\epsilon _{ 1},\,w_{i+1}^{(\nu +1)} +\epsilon _{ 2},\,w_{i}^{(\nu )} +\epsilon _{ 3}) \geq {}\\ & & F_{i}(w_{i}^{(\nu +1)},\,w_{ i-1}^{(\nu +1)},\,w_{ i+1}^{(\nu +1)},\,w_{ i}^{(\nu )}) {}\\ \end{array}$$

for scalar \(\epsilon _{1}\), \(\epsilon _{2}\), \(\epsilon _{3}\), ε. Because of transitivity, it suffices to show separately

$$\displaystyle\begin{array}{rcl} & \mathrm{(a1)}\quad F_{i}(w_{i}^{(\nu +1)},\,w_{i-1}^{(\nu +1)}+\epsilon,\,w_{i+1}^{(\nu +1)},\,w_{i}^{(\nu )}) \geq F_{i}(w_{i}^{(\nu +1)},w_{i-1}^{(\nu +1)},w_{i+1}^{(\nu +1)},w_{i}^{(\nu )})& {}\\ & \mathrm{(a2)}\quad F_{i}(w_{i}^{(\nu +1)},\,w_{i-1}^{(\nu +1)},\,w_{i+1}^{(\nu +1)}+\epsilon,\,w_{i}^{(\nu )}) \geq F_{i}(w_{i}^{(\nu +1)},w_{i-1}^{(\nu +1)},w_{i+1}^{(\nu +1)},w_{i}^{(\nu )})& {}\\ & \mathrm{(a3)}\quad F_{i}(w_{i}^{(\nu +1)},\,w_{i-1}^{(\nu +1)},\,w_{i+1}^{(\nu +1)},\,w_{i}^{(\nu )}+\epsilon ) \geq F_{i}(w_{i}^{(\nu +1)},w_{i-1}^{(\nu +1)},w_{i+1}^{(\nu +1)},w_{i}^{(\nu )})& {}\\ \end{array}$$

for (a) to hold, and for (b)

$$\displaystyle{F_{i}(w_{i}^{(\nu +1)}+\epsilon,\,w_{ i-1}^{(\nu +1)},\,w_{ i+1}^{(\nu +1)},\,w_{ i}^{(\nu )}) \leq F_{ i}(w_{i}^{(\nu +1)},\,w_{ i-1}^{(\nu +1)},\,w_{ i+1}^{(\nu +1)},\,w_{ i}^{(\nu )})\,.}$$

Next we check under which conditions the scheme (7.26)/(7.27) is monotone. Heider [176] has shown that the scheme converges whenever the nonlinear term \(\tilde{\sigma }^{2}(x,\tau,s)s\) satisfies conditions (i)–(iii) of the following Theorem 7.2:

Theorem 7.2 (Convergence)

Assume \(\tilde{\sigma }^{2}(x,\tau,u_{x},u_{xx})\) in the form \(\tilde{\sigma }^{2}(x,\tau,s)\), with \(s = u_{x} + u_{xx}\) from (7.20), and

  (i) \(\tilde{\sigma }^{2}(x,\tau,s)\,s\) is continuous and monotone increasing in s,

  (ii) there exists a constant \(c_{+}> 0\) such that for all s and ε > 0

    $$\displaystyle{\tilde{\sigma }^{2}(x,\tau,s+\epsilon ) \cdot (s+\epsilon ) \geq \tilde{\sigma }^{2}(x,\tau,s) \cdot s + c_{ +}\epsilon \,,\ \mathit{\text{ and }}}$$

  (iii) Δx is small enough such that

    $$\displaystyle{c_{+}{2 -\varDelta x \over \varDelta x} -{ 2(r-\delta ) \over \sigma _{0}^{2}} \geq 0\ \mathit{\text{ and }}\ c_{+}{2 +\varDelta x \over \varDelta x} +{ 2(r-\delta ) \over \sigma _{0}^{2}} \geq 0\,.}$$

Then the fully implicit BDF scheme (7.26)/(7.27) converges to the viscosity solution of (7.15).

Proof

Here we confine ourselves to the proof of monotonicity. As noted above, we can proceed componentwise and check (a1), (a2), (a3), and (b) separately. We begin with 0 < i < m.

To show (a1), perturb \(w_{i-1}^{(\nu +1)} \rightarrow w_{i-1}^{(\nu +1)}+\epsilon\) for ε > 0. Then \(\bar{s}_{i,\nu } \rightarrow \bar{ s}_{i,\nu }+\alpha \epsilon\), and

$$\displaystyle\begin{array}{rcl} & & F_{i}(w_{i}^{(\nu +1)},\,w_{ i-1}^{(\nu +1)}+\epsilon,\,w_{ i+1}^{(\nu +1)},\,w_{ i}^{(\nu )}) = {}\\ & & \quad - w_{i}^{(\nu +1)} + w_{ i}^{(\nu )} +\varDelta \tau \bigg [\tilde{\sigma }^{2}(x_{ i},\tau _{\nu +1},\bar{s}_{i,\nu }+\alpha \epsilon )(\bar{s}_{i,\nu }+\alpha \epsilon ) {}\\ & & \quad -{ r-\delta \over \sigma _{0}^{2}\varDelta x}(w_{i-1}^{(\nu +1)}+\epsilon ) -{ 2\delta \over \sigma _{0}^{2}}w_{i}^{(\nu +1)} +{ r-\delta \over \sigma _{0}^{2}\varDelta x}w_{i+1}^{(\nu +1)}\bigg] {}\\ & & \geq -w_{i}^{(\nu +1)} + w_{ i}^{(\nu )} +\varDelta \tau \bigg [\tilde{\sigma }^{2}(x_{ i},\tau _{\nu +1},\bar{s}_{i,\nu })\bar{s}_{i,\nu } + c_{+}\epsilon \alpha {}\\ & & \quad -{ r-\delta \over \sigma _{0}^{2}\varDelta x}w_{i-1}^{(\nu +1)} -{ 2\delta \over \sigma _{0}^{2}}w_{i}^{(\nu +1)} +{ r-\delta \over \sigma _{0}^{2}\varDelta x}w_{i+1}^{(\nu +1)} -{ r-\delta \over \sigma _{0}^{2}\varDelta x}\epsilon \bigg]\,, {}\\ \end{array}$$

where the inequality is due to (ii). Compare with \(F_{i}\) in (7.26)/(7.27) and note the two extra terms. By (iii), with α from (7.23), they are

$$\displaystyle{c_{+}\epsilon \alpha -{ r-\delta \over \sigma _{0}^{2}\varDelta x}\epsilon ={ \epsilon \over 2\varDelta x}\left [c_{+}{2 -\varDelta x \over \varDelta x} -{ 2(r-\delta ) \over \sigma _{0}^{2}} \right ] \geq 0\,.}$$

So we have shown (a1), the first of the four criteria of monotonicity.

To show (a2), perturb \(w_{i+1}^{(\nu +1)} \rightarrow w_{i+1}^{(\nu +1)}+\epsilon\). Then \(\bar{s}_{i,\nu } \rightarrow \bar{ s}_{i,\nu }+\epsilon \beta\) and the perturbed \(F_{i}\) is

$$\displaystyle\begin{array}{rcl} & & -w_{i}^{(\nu +1)} + w_{ i}^{(\nu )} +\varDelta \tau \bigg [\tilde{\sigma }^{2}(x_{ i},\tau _{\nu +1},\bar{s}_{i,\nu }+\beta \epsilon )(\bar{s}_{i,\nu }+\beta \epsilon ) {}\\ & & -{ r-\delta \over \sigma _{0}^{2}\varDelta x}w_{i-1}^{(\nu +1)} -{ 2\delta \over \sigma _{0}^{2}}w_{i}^{(\nu +1)} +{ r-\delta \over \sigma _{0}^{2}\varDelta x}w_{i+1}^{(\nu +1)} +\epsilon { r-\delta \over \sigma _{0}^{2}\varDelta x}\bigg]\,. {}\\ \end{array}$$

Again we obtain a lower bound by (ii), and arrive at the sum of two extra terms

$$\displaystyle{c_{+}\epsilon \beta +\epsilon { r-\delta \over \sigma _{0}^{2}\varDelta x}\,,}$$

which is ≥ 0 by (iii). So the perturbed \(F_{i}\) is greater than or equal to the unperturbed \(F_{i}\), and (a2) is satisfied.

The assertion (a3) is clearly satisfied since the perturbation \(w_{i}^{(\nu )} \rightarrow w_{i}^{(\nu )}+\epsilon\) only affects the term outside the brackets.

To show (b), perturb \(w_{i}^{(\nu +1)} \rightarrow w_{i}^{(\nu +1)}+\epsilon\). Then \(\bar{s}_{i,\nu } \rightarrow \bar{ s}_{i,\nu } -{ 2\epsilon \over \varDelta x^{2}}\), and \(F_{i}\) is perturbed to

$$\displaystyle\begin{array}{rcl} & & -w_{i}^{(\nu +1)} -\epsilon +w_{ i}^{(\nu )} +\varDelta \tau \bigg [\tilde{\sigma }^{2}(x_{ i},\tau _{\nu +1},\bar{s}_{i,\nu } -\epsilon { 2 \over \varDelta x^{2}})(\bar{s}_{i,\nu } -\epsilon { 2 \over \varDelta x^{2}}) {}\\ & & -{ r-\delta \over \sigma _{0}^{2}\varDelta x}w_{i-1}^{(\nu +1)} -{ 2\delta \over \sigma _{0}^{2}}w_{i}^{(\nu +1)} -{ 2\delta \over \sigma _{0}^{2}}\epsilon +{ r-\delta \over \sigma _{0}^{2}\varDelta x}w_{i+1}^{(\nu +1)}\bigg]\,. {}\\ \end{array}$$

By the monotonicity (i) and by ε > 0, δ ≥ 0, the above is less than or equal to the unperturbed \(F_{i}\); that is, (b) holds true.

Finally, monotonicity must be checked for \(F_{0}\) and \(F_{m}\). For θ = 1, \(F_{0}\) depends on \(w_{0}^{(\nu +1)}\) and \(F_{m}\) depends on \(w_{m}^{(\nu +1)}\). Hence only (b) needs to be checked, which is clearly satisfied.

This ends the proof that the conditions (i), (ii), (iii) imply monotonicity of the fully implicit scheme.

Example 7.3 (Leland’s Model)

Let us inspect whether the criteria (i), (ii), (iii) of Theorem 7.2 are satisfied for Leland’s model of transaction costs. For (i) we require \(\vert \gamma \vert < 1\). With some simple manipulations, one shows that (ii) is satisfied with \(c_{+} = 1-\gamma\). And for (iii) to hold, the grid size Δx must be small enough (Exercise 7.5). Specifically, for zero dividend rate δ = 0, the θ-method is

$$\displaystyle\begin{array}{rcl} & & -w_{i}^{(\nu +1)} + w_{ i}^{(\nu )} +\varDelta \tau \cdot \theta \,[\tilde{\sigma }^{2}(\bar{s}_{ i}^{(\nu +1)})\bar{s}_{ i}^{(\nu +1)} +{ 2r \over \sigma _{0}^{2}} \delta _{x}w_{i}^{(\nu +1)}] {}\\ & & +\varDelta \tau (1-\theta )\,[\tilde{\sigma }^{2}(\bar{s}_{ i}^{(\nu )})\bar{s}_{ i}^{(\nu )} +{ 2r \over \sigma _{0}^{2}} \delta _{x}w_{i}^{(\nu )}] = 0\,. {}\\ \end{array}$$

Sufficient conditions for the Crank–Nicolson scheme (θ = 1∕2) to converge include (i), (ii), (iii), and in addition (iv) and (v):

(iv) There exists a constant \(c_{-} > 0\) such that for all ε > 0 and s

$$\displaystyle{\tilde{\sigma }^{2}(x,\tau,s-\epsilon )(s-\epsilon ) \geq \tilde{\sigma }^{2}(x,\tau,s)s - c_{ -}\epsilon \,,}$$

(v)

$$\displaystyle{\varDelta \tau \leq { \varDelta x^{2} \over c_{-}}\,{ \sigma _{0}^{2} \over \sigma _{0}^{2} +\varDelta x\,\delta }\,,}$$

see [176, 177]. Condition (iv) holds for Leland’s model with \(c_{-} = 1+\gamma\), and for the uncertain-volatility model with \(c_{-} =\sigma _{\max }^{2}\). Conditions (iii) and (v) amount to stability bounds on Δx and Δτ. We emphasize that in the case of nonlinear models, unconditional stability does not hold!
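The conditions can be checked numerically for a concrete model. The following is a minimal sketch, assuming that Leland's nonlinearity takes the scaled form \(\tilde{\sigma }^{2}(s)\,s = (1 +\gamma \,\mathrm{sign}(s))\,s\) with 0 ≤ γ < 1; this form, the function names `leland_flux` and `check_leland`, and the sample parameter values are assumptions made for illustration and are not taken from the text.

```python
import numpy as np

def leland_flux(s, gamma):
    """Assumed scaled Leland nonlinearity: sigma_tilde^2(s) * s = (1 + gamma*sign(s)) * s."""
    return (1.0 + gamma * np.sign(s)) * s

def check_leland(gamma, dx, r, delta, sigma0):
    """Check (ii), (iii), (iv) on a sample grid and return the step bound from (v)."""
    c_plus, c_minus = 1.0 - gamma, 1.0 + gamma
    s = np.linspace(-5.0, 5.0, 201)
    eps = np.linspace(1e-3, 3.0, 50)
    S, E = np.meshgrid(s, eps)
    tol = 1e-12  # guards against rounding in the comparisons
    # (ii): flux(s + eps) >= flux(s) + c_plus * eps
    cond_ii = bool(np.all(leland_flux(S + E, gamma)
                          >= leland_flux(S, gamma) + c_plus * E - tol))
    # (iv): flux(s - eps) >= flux(s) - c_minus * eps
    cond_iv = bool(np.all(leland_flux(S - E, gamma)
                          >= leland_flux(S, gamma) - c_minus * E - tol))
    # (iii): grid-size condition on dx
    q = 2.0 * (r - delta) / sigma0**2
    cond_iii = bool(c_plus * (2.0 - dx) / dx - q >= 0.0
                    and c_plus * (2.0 + dx) / dx + q >= 0.0)
    # (v): bound on the time step for Crank-Nicolson
    dtau_max = dx**2 / c_minus * sigma0**2 / (sigma0**2 + dx * delta)
    return cond_ii, cond_iii, cond_iv, dtau_max

result = check_leland(gamma=0.2, dx=0.05, r=0.06, delta=0.0, sigma0=0.3)
print(result)
```

The grid check is only a spot test on sampled values of s and ε, not a proof; it is meant to make the roles of \(c_{+}\) and \(c_{-}\) as smallest and largest slope visible.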

The above has discussed convergence towards the viscosity solution. An application of the uncertain-volatility model to a butterfly is shown in Fig. 7.2. Another illustration is the barrier option in Fig. 7.3. When in case of an American-style option a penalty approach is applied, further assumptions are needed to assert convergence to the solution for \(\hat{p} \rightarrow \infty\), even though one keeps \(\hat{p}\) fixed.

7.3 Option Valuation Under Jump Processes

In this section, we sketch some instruments of Lévy processes as background to the application of partial integro-differential equations. The focus is on one important example, namely Merton’s jump diffusion, and on strategies for a numerical valuation of options under such processes. This is no introduction to Lévy processes; for expositions on Lévy processes consult, for instance, [84, 328, 339].

For a Lévy process X t , all increments \(X_{t+\varDelta t} - X_{t}\) are stochastically independent. Further, they are stationary, which means that every increment \(X_{t+\varDelta t} - X_{t}\) has the distribution of \(X_{\varDelta t}\). Instead of requiring continuity, Lévy processes must be “càdlàg” (Footnote 2): For all t, the process X t is right-continuous (\(X_{t} = X_{t^{+}}\)), and the left limit \(X_{t^{-}}\) exists. Important examples of Lévy processes are the Wiener process (Sect. 1.6.1), and the Poisson process (Sect. 1.9).

7.3.1 Characteristic Functions

A classification of Lévy processes X t is based on the Fourier transformation (Footnote 3)

$$\displaystyle{ \phi _{X_{t}}(\zeta ):= \mathsf{E}(\exp (\mathrm{i}\zeta X_{t}))\,. }$$
(7.28)

The function \(\phi _{X_{t}}\) singles out characteristic properties of a random variable X t . \(\phi _{X_{t}}\) is called characteristic function of X t , and \(\psi _{X_{t}}(\zeta )\) [shorter: ψ(ζ)] defined by \(\exp (t\psi (\zeta )) =\phi _{X_{t}}(\zeta )\) is the characteristic exponent. It suffices to take t = 1, since the distribution of X 1 characterizes the process. The characteristic exponent ψ(ζ) satisfies the Lévy–Khinchin representation

$$\displaystyle{ \psi (\zeta ) = \mathrm{i}\gamma \zeta -{ 1 \over 2}\sigma ^{2}\zeta ^{2} +\int \limits _{ -\infty }^{\infty }\left (\exp (\mathrm{i}\zeta x) - 1 -\mathrm{i}\zeta x\,\mathbf{1}_{\{\vert x\vert \leq 1\}}\right )\,\nu (\mathrm{d}x). }$$
(7.29)

The three terms in this representation characterize different aspects of X t . \(\gamma \in \mathbb{R}\) corresponds to a deterministic trend, σ 2 to the variance of a diffusion (Brownian-motion) part of X t , and ν is a measure on \(\mathbb{R}\) characterizing the activity of jumps \(\varDelta X_{t}:= X_{t} - X_{t^{-}}\),

$$\displaystyle{\nu (A):= \mathsf{E}\,[\#\{t \in [0,1]\,\mid \;\varDelta X_{t}\neq 0,\;\varDelta X_{t} \in A\}]\,.}$$

The Lévy measure ν(A) counts the (expected) number of jumps of “size” within A per unit time [84]. ν(A) is not a probability measure. For the Lévy measure ν, require \(\int _{\mathbb{R}}\min (x^{2},1)\,\nu (\mathrm{d}x) <\infty\) and \(\nu (\{0\}) = 0\). In the integrand of (7.29), the subtracted term \(\mathrm{i}\zeta x\,\mathbf{1}_{\{\vert x\vert \leq 1\}}\) causes the integrand to be of the order \(O(\vert x\vert ^{2})\) for x → 0. This compensation along with the constraints on ν implies existence of the integral. For many important Lévy processes, ν(dx) has a convenient representation

$$\displaystyle{ \nu (\mathrm{d}x) = f_{\mathrm{L}}(x)\,\mathrm{d}x }$$
(7.30)

with a Lévy density f L. The three items γ, σ 2, ν (“characteristic triplet”) characterize a Lévy process in a unique way.

Example 7.4 (Compound Poisson Process)

For a Poisson process J t with jump intensity λ, a compound Poisson process is

$$\displaystyle{X_{t}:=\sum _{ j=1}^{J_{t} }\varDelta X_{\tau _{j}}\,,}$$

where the jump sizes \(\varDelta X_{\tau _{j}}\) are assumed i.i.d. with distribution density f, and independent of the Poisson process J. The characteristic function \(\phi _{X_{t}}(\zeta )\) of the compound Poisson process (cP) is

$$\displaystyle{ \begin{array}{rcl} \mathsf{E}(\exp (\mathrm{i}\zeta X_{t}))& =&\exp [\lambda t\,(\phi _{\varDelta X}(\zeta ) - 1)] \\ & =&\exp \left [t\int _{\mathbb{R}}(\mathrm{e}^{\mathrm{i}\zeta x} - 1)\nu (\mathrm{d}x)\right ] \end{array} }$$
(7.31)

with Lévy measure ν(dx) = λf(x) dx. The first of the equations in (7.31) uses rules of the conditional expectation [84], whereas the second just applies (7.28) with the definition (B.4) of the expectation, including \(\int _{\mathbb{R}}\nu (\mathrm{d}x) =\lambda\). The characteristic exponent ψ cP is the integral in (7.31), γ = σ = 0.
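The first equation in (7.31) can be checked by simulation. The sketch below assumes normally distributed jump sizes with mean `mu_J` and standard deviation `sigma_J` (one possible choice of the density f; the function names, the seed, and the parameter values are illustrative assumptions). It compares the empirical characteristic function of sampled values of X t with the closed form exp[λt(ϕ ΔX (ζ) − 1)].

```python
import numpy as np

def sample_compound_poisson(lam, t, mu_J, sigma_J, n_samples, rng):
    """Sample X_t = sum of J_t i.i.d. N(mu_J, sigma_J^2) jumps, J_t ~ Poisson(lam*t)."""
    n_jumps = rng.poisson(lam * t, size=n_samples)
    z = rng.standard_normal(n_samples)
    # A sum of n independent normals is N(n*mu_J, n*sigma_J^2).
    return n_jumps * mu_J + np.sqrt(n_jumps) * sigma_J * z

def cf_compound_poisson(zeta, lam, t, mu_J, sigma_J):
    """Closed form exp[lam*t*(phi_jump(zeta) - 1)] for normal jump sizes."""
    phi_jump = np.exp(1j * mu_J * zeta - 0.5 * sigma_J**2 * zeta**2)
    return np.exp(lam * t * (phi_jump - 1.0))

rng = np.random.default_rng(0)
lam, t, mu_J, sigma_J = 2.0, 1.5, -0.3, 0.4
x = sample_compound_poisson(lam, t, mu_J, sigma_J, 200_000, rng)

for zeta in (0.5, 1.0, 2.0):
    empirical = np.mean(np.exp(1j * zeta * x))
    exact = cf_compound_poisson(zeta, lam, t, mu_J, sigma_J)
    print(zeta, abs(empirical - exact))
```

The printed differences are Monte Carlo errors and should shrink roughly like the inverse square root of the sample size.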

As in (1.65), financial models typically arise in exponential form. For such exponential Lévy processes there is a useful criterion for the martingale property, and hence for risk-neutral valuation:

Lemma 7.5 (Martingale Criterion)

Let X t be a Lévy process. \(\mathrm{e}^{X_{t}}\) is a martingale if and only if ψ X (−i) = 0 and \(\mathsf{E}(\mathrm{e}^{X_{t}}) <\infty\).

Proof

We extend ζ to complex numbers, and note that

$$\displaystyle{\mathsf{E}(\mathrm{e}^{X_{t} }) = \mathsf{E}(\mathrm{e}^{-\mathrm{i}\mathrm{i}X_{t} }) =\phi _{X_{t}}(-\mathrm{i}) =\mathrm{ e}^{t\psi (-\mathrm{i})}\,.}$$

Then by independence and stationarity,

$$\displaystyle{\mathsf{E}(\mathrm{e}^{X_{t} }\,\vert \,\mathcal{F}_{s}) -\mathrm{ e}^{X_{s} } =\mathrm{ e}^{X_{s} }\left (\mathsf{E}(\mathrm{e}^{X_{t-s} }) - 1\right ) =\mathrm{ e}^{X_{s} }\left (\mathrm{e}^{(t-s)\psi (-\mathrm{i})} - 1\right )\,,}$$

which vanishes for all s < t if and only if ψ(−i) = 0.

( Exercise 7.6) □

In finance applications, with an asset price S t for t ≥ 0, the absence of arbitrage implies that the discounted price \(\mathrm{e}^{-rt}S_{t}\) is a martingale with respect to a risk-neutral measure. This suggests representing S t in the form S t = S 0 exp(rt + X t ). Then the discounted price is \(\mathrm{e}^{-rt}S_{t} = S_{0}\mathrm{e}^{X_{t}}\), which is the situation to which Lemma 7.5 applies.

Example 7.6 (Brownian Motion with Drift)

A Lévy process X t is Brownian motion if and only if ν ≡ 0 (no jump). For ease of comparison with (1.71) and (1.76) we take the drift γ in the form \(\gamma =\mu -{1 \over 2}\sigma ^{2}\). For the Brownian motion with drift (Bwd) X t := γt + σW t we use a result from probability (Footnote 4) and conclude for the characteristic exponent

$$\displaystyle{\psi _{\mathrm{Bwd}}(\zeta ) = \mathrm{i}(\mu -{1 \over 2}\sigma ^{2})\zeta -{ 1 \over 2}\sigma ^{2}\zeta ^{2}\,.}$$

Clearly, ψ Bwd(−i) = μ. Hence by Lemma 7.5, \(\mathrm{e}^{X_{t}}\) is a martingale for μ = 0. Consequently, the discounted

$$\displaystyle{S_{0}\mathrm{e}^{-rt}\exp (rt + X_{ t}) = S_{0}\mathrm{e}^{-rt}\exp [(r -{ 1 \over 2}\sigma ^{2})t +\sigma W_{ t}]}$$

is a martingale. This recovers the well-known riskless drift rate r for a numerical simulation of GBM in the Black–Scholes model.
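Both statements of Example 7.6 can be illustrated in a few lines. The sketch below evaluates ψ Bwd at ζ = −i and then estimates the mean of the discounted asset price by Monte Carlo with the riskless drift r. The function name `psi_bwd`, the seed, and the parameter values are assumptions chosen for illustration.

```python
import numpy as np

def psi_bwd(zeta, mu, sigma):
    """Characteristic exponent of X_t = (mu - sigma^2/2) t + sigma W_t."""
    return 1j * (mu - 0.5 * sigma**2) * zeta - 0.5 * sigma**2 * zeta**2

# psi_Bwd(-i) equals mu, so exp(X_t) is a martingale exactly for mu = 0.
mu, sigma = 0.0, 0.3
print(psi_bwd(-1j, mu, sigma))

# Monte Carlo check: with mu = 0 the discounted price S_0 exp(X_T) has mean S_0.
rng = np.random.default_rng(1)
S0, r, T, n = 10.0, 0.06, 1.0, 400_000
W_T = np.sqrt(T) * rng.standard_normal(n)
S_T = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * W_T)
discounted_mean = np.exp(-r * T) * S_T.mean()
print(discounted_mean)
```

The sample mean differs from S 0 only by the sampling error of the simulation.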

Example 7.7 (Merton’s Jump Diffusion)

We now combine Examples 7.4 and 7.6. As a special case of Example 7.4 we choose as in Sect. 1.9 the jump sizes ΔY in the log process Y t : = logS t to be normally distributed, \(\varDelta Y \sim \mathcal{N}(\mu _{\mathrm{J}},\sigma _{\mathrm{J}}^{2})\) (logq in Sect. 1.9). Furnished with a drifted Brownian motion, this is Merton’s jump-diffusion model (1.74) with jump intensity λ and \(\gamma =\mu -{1 \over 2}\sigma ^{2}\). The Lévy density of the compound Poisson process is λ times the density of the normal distribution,

$$\displaystyle{ f_{\mathrm{L}}(x) = f_{\mathrm{cP}}(x):=\lambda \,{ 1 \over \sigma _{\mathrm{J}}\sqrt{2\pi }}\,\exp \left [-{(x -\mu _{\mathrm{J}})^{2} \over 2\sigma _{\mathrm{J}}^{2}} \right ]\,. }$$
(7.32)

Since the two processes are independent, and by the exponential structure in (7.28), the two characteristic exponents add:

$$\displaystyle\begin{array}{rcl} \psi (\zeta )& =& \psi _{\mathrm{Bwd}}(\zeta ) +\psi _{\mathrm{cP}}(\zeta ) {}\\ & =& \mathrm{i}\gamma \zeta -{ 1 \over 2}\sigma ^{2}\zeta ^{2} +\int _{ \mathbb{R}}(\mathrm{e}^{\mathrm{i}\zeta x} - 1)\nu (\mathrm{d}x) {}\\ & {}\\ \end{array}$$

and

$$\displaystyle{\psi (-\mathrm{i}) =\gamma +{1 \over 2}\sigma ^{2} +\int _{ \mathbb{R}}(\mathrm{e}^{x} - 1)\nu (\mathrm{d}x)\,.}$$

Similarly to Exercise 22, we calculate the integral for general ζ,

$$\displaystyle{\int _{-\infty }^{\infty }(\mathrm{e}^{\mathrm{i}\zeta x} - 1)f_{\mathrm{ cP}}(x)\,\mathrm{d}x =\lambda \left (\exp \left [\mathrm{i}\mu _{\mathrm{J}}\zeta -{ 1 \over 2}\sigma _{\mathrm{J}}^{2}\zeta ^{2}\right ] - 1\right )\,,}$$

and set ζ = −i.

Hence, to see whether S t = exp(Y t ) is a martingale, check \(\psi (-\mathrm{i}) =\gamma +{1 \over 2}\sigma ^{2} +\lambda (\exp [\mu _{\mathrm{ J}} +{ 1 \over 2}\sigma _{\mathrm{J}}^{2}] - 1)\). By Lemma 7.5, a martingale can be obtained by choosing a drift with

$$\displaystyle{\gamma = -{\sigma ^{2} \over 2} -\lambda \left (\exp \left [\mu _{\mathrm{J}} +{ 1 \over 2}\sigma _{\mathrm{J}}^{2}\right ] - 1\right )\,.}$$

This makes \(S_{0}\mathrm{e}^{-rt}\exp (rt +\gamma t +\sigma W_{t} +\sum _{ j=1}^{J_{t}}\log q_{j})\) a martingale. When applied to simulation of SDEs under the risk-neutral measure for Monte Carlo, this risk-neutral valuation amounts to the drift rate in Example 1.21. That is, the SDE is

$$\displaystyle{{ \mathrm{d}S \over S} = (r -\lambda (\exp [\mu _{\mathrm{J}} +{ 1 \over 2}\sigma _{\mathrm{J}}^{2}] - 1))\,\mathrm{d}t +\sigma \,\mathrm{ d}W_{ t} + (q - 1)\,\mathrm{d}J_{t}\,.}$$

In case of a dividend yield with rate δ, the term δdt is subtracted on the right-hand side, as in Sect. 3.5.
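The drift compensation of Example 7.7 can be verified directly. The following sketch evaluates the characteristic exponent of Merton's model at ζ = −i with the compensating drift γ, and then checks by Monte Carlo that S 0 exp(X T ) has mean S 0. The function name `psi_merton`, the seed, and the parameter values (borrowed from the setting K = 10, r = 0.06, σ = 0.3 used elsewhere in this chapter) are assumptions for illustration.

```python
import numpy as np

def psi_merton(zeta, gamma, sigma, lam, mu_J, sigma_J):
    """Characteristic exponent: drifted Brownian motion plus compound Poisson part."""
    jump = lam * (np.exp(1j * mu_J * zeta - 0.5 * sigma_J**2 * zeta**2) - 1.0)
    return 1j * gamma * zeta - 0.5 * sigma**2 * zeta**2 + jump

sigma, lam, mu_J, sigma_J = 0.3, 0.1, -0.3, 0.4
c = np.exp(mu_J + 0.5 * sigma_J**2) - 1.0     # E(q - 1) for lognormal q
gamma = -0.5 * sigma**2 - lam * c             # compensating drift
print(abs(psi_merton(-1j, gamma, sigma, lam, mu_J, sigma_J)))

# Monte Carlo: X_T = gamma*T + sigma*W_T + sum of J_T normal log-jumps.
rng = np.random.default_rng(2)
S0, T, n = 10.0, 1.0, 400_000
n_jumps = rng.poisson(lam * T, size=n)
jumps = n_jumps * mu_J + np.sqrt(n_jumps) * sigma_J * rng.standard_normal(n)
X_T = gamma * T + sigma * np.sqrt(T) * rng.standard_normal(n) + jumps
discounted_mean = np.mean(S0 * np.exp(X_T))   # discounted S_T equals S_0 exp(X_T)
print(discounted_mean)
```

Since S T = S 0 exp(rT + X T ), the discount factor cancels the term rT, so only X T needs to be sampled.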

For other models, a risk-neutral growth rate can be obtained in an analogous way. A table of risk-neutral drift rates is given in [332, p. 80]. For a jump diffusion, jumps are comparatively “rare”: there is only a finite number of them in any finite time interval. Apart from Merton’s model, another jump-diffusion model is Kou’s model, which works with an asymmetric double exponential distribution of jump sizes [229].

There are Lévy processes of infinite activity: Then in every time interval an infinite number of jumps occurs. Examples include the VG-process (Variance Gamma) [253], the NIG-process (Normal Inverse Gaussian), the hyperbolic process [114] and the CGMY process [67]. Specifically for VG and NIG, see also [155]. Time deformation plays an important role for constructing Lévy processes. For example, with a Wiener process W t and a Gamma process G t as subordinator replacing time, VG can be represented as

$$\displaystyle{S_{t} = S_{0}\mathrm{e}^{rt+X_{t} }\text{ with }X_{t} =\theta G_{t} +\sigma W_{G_{t}}\,.}$$

This includes GBM with the standard time G t = t and parameter θ = −σ 2∕2. Such a subordinating process G t can be regarded as “business time,” which runs faster than the calendar time when the trading volume is high, and slower otherwise. Then, for a Wiener process W t , a class of Lévy processes is defined by \(W_{G_{t}}\). With a t-grid as in Algorithm 1.8, a time-changed process can be generated as \(W_{j} = W_{j-1} + Z\sqrt{G_{j\varDelta t } - G_{(\,j-1)\varDelta t}}\) ( Exercise 2.11).
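The time-change recipe can be sketched as follows for the VG process. The sketch assumes that the Gamma subordinator G t has mean rate 1 and variance rate `nu_vg`, so that its increments over a step Δt are Gamma distributed with shape Δt∕`nu_vg` and scale `nu_vg`; this parametrization, the function name, and all parameter values are assumptions and are not fixed by the text.

```python
import numpy as np

def simulate_vg_paths(theta, sigma, nu_vg, T, n_steps, n_paths, rng):
    """Paths of X_t = theta*G_t + sigma*W_{G_t} on an equidistant t-grid."""
    dt = T / n_steps
    # Increments of the Gamma subordinator ("business time") per grid step.
    dG = rng.gamma(shape=dt / nu_vg, scale=nu_vg, size=(n_paths, n_steps))
    Z = rng.standard_normal((n_paths, n_steps))
    # Wiener increment over the random time increment dG: Z * sqrt(dG).
    dX = theta * dG + sigma * np.sqrt(dG) * Z
    X = np.zeros((n_paths, n_steps + 1))
    X[:, 1:] = np.cumsum(dX, axis=1)
    return X

rng = np.random.default_rng(3)
theta, sigma, nu_vg, T = -0.1, 0.2, 0.3, 1.0
X = simulate_vg_paths(theta, sigma, nu_vg, T, n_steps=50, n_paths=20_000, rng=rng)
print(X[:, -1].mean(), theta * T)   # sample mean versus E(X_T) = theta*T
```

Replacing `dG` by the constant `dt` recovers a Brownian motion with drift θ on calendar time.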

7.3.2 Option Valuation with PIDEs

Assume European options based on a price process S t = S 0 exp(rt + X t ), where X t is a Lévy process such that \(\mathrm{e}^{X_{t}}\) is a martingale, with Lévy measure ν, and the integral \(\int _{\vert y\vert \geq 1}\mathrm{e}^{2y}\,\nu (\mathrm{d}y)\) exists. Then the value function V (S, t) satisfies

$$\displaystyle{ \begin{array}{rcl} &&{\partial V (S,t) \over \partial t} +{ 1 \over 2}\sigma ^{2}S^{2}{\partial ^{2}V \over \partial S^{2}} + rS{\partial V \over \partial S} - rV \\ && +\int _{\mathbb{R}}\left [V (S\mathrm{e}^{y},t) - V (S,t) - (\mathrm{e}^{y} - 1)S{\partial V (S,t) \over \partial S} \right ]\,\nu (\mathrm{d}y) = 0 \end{array} }$$
(7.33)

A proof can be found in [84, pp. 385–387].

Definition 7.8 (PIDE)

An equation of the above type (7.33) is called partial integro-differential equation (PIDE).

The integral term in (7.33) complicates the numerical solution since it is a nonlocal term accumulating information on all −∞ < y < ∞, in contrast to the local character of the partial derivatives. For general Lévy processes, the three terms under the integral cannot be separated, since otherwise the integrals may fail to converge. They can be separated in the case of Merton’s jump-diffusion model, because this process is of finite activity, \(\lambda =\nu (\mathbb{R}) <\infty\).

In what follows, we discuss Merton’s jump-diffusion process, with lognormal distribution for \(q =\mathrm{ e}^{y}\). The integral in (7.33) can be split into three terms with three integrals

$$\displaystyle{\int _{\mathbb{R}}V (S\mathrm{e}^{y},t)\nu (\mathrm{d}y) - V (S,t)\int _{ \mathbb{R}}\nu (\mathrm{d}y) - S{\partial V (S,t) \over \partial S} \int _{\mathbb{R}}(\mathrm{e}^{y} - 1)\nu (\mathrm{d}y)\,.}$$

In view of ν(dy) = λf(y)dy, factors λ show up. Here f is the normal density, and the integrals become expectations. Then the first integral can be written \(\lambda \,\mathsf{E}(V (S\mathrm{e}^{y},t))\), and the second integral is λ. The third integral, \(\lambda \,\mathsf{E}(\mathrm{e}^{y} - 1)\), does not depend on V and can be calculated beforehand since the distribution for \(q =\mathrm{ e}^{y}\) is stipulated (Footnote 5). The lognormal density for q is

$$\displaystyle{f_{q}(x) ={ 1 \over \sqrt{2\pi }\,\sigma _{\mathrm{J}} \cdot x}\exp \bigg\{ -{ (\log x -\mu _{\mathrm{J}})^{2} \over 2\sigma _{\mathrm{J}}^{2}} \bigg\}\,\mathbf{1}_{\{x>0\}}}$$

and we recover the constant of Example 7.7:

$$\displaystyle\begin{array}{rcl} c:& =& \int _{0}^{\infty }(x - 1)f_{ q}(x)\,\mathrm{d}x {}\\ & =& \int _{-\infty }^{\infty }(\mathrm{e}^{y} - 1)f(\,y)\,\mathrm{d}y =\exp \left [\mu _{\mathrm{ J}} +{ 1 \over 2}\sigma _{\mathrm{J}}^{2}\right ] - 1\,. {}\\ \end{array}$$
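As a quick plausibility check, the constant c can also be obtained by numerical quadrature of the second integral. The sketch below uses Gauss–Hermite quadrature from NumPy; the function name `jump_compensator` and the number of nodes are illustrative assumptions.

```python
import numpy as np

def jump_compensator(mu_J, sigma_J, n_nodes=40):
    """Approximate c = E(e^y - 1) for y ~ N(mu_J, sigma_J^2) by Gauss-Hermite quadrature."""
    # hermgauss returns nodes and weights for the weight function exp(-x^2).
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    y = mu_J + np.sqrt(2.0) * sigma_J * nodes      # substitution y = mu_J + sqrt(2)*sigma_J*x
    return np.sum(weights * (np.exp(y) - 1.0)) / np.sqrt(np.pi)

mu_J, sigma_J = -0.3, 0.4
c_quad = jump_compensator(mu_J, sigma_J)
c_exact = np.exp(mu_J + 0.5 * sigma_J**2) - 1.0
print(c_quad, c_exact)
```

For a different jump-size distribution without a closed form for c, the same idea applies with a quadrature rule suited to that density.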

With the precalculated number c, the resulting Eq. (7.33) can be ordered into

$$\displaystyle{{ \partial V \over \partial t} +{ 1 \over 2}\sigma ^{2}S^{2}{\partial ^{2}V \over \partial S^{2}} + (r -\lambda c)S{\partial V \over \partial S} - (\lambda +r)V +\lambda \mathsf{E}(V (qS,t)) = 0\,. }$$
(7.34)

The last term is an integral taken over the unknown solution function V (S, t). So the resulting equation is a PIDE, a special case of (7.33). Note that the product λc is the drift compensation in Example 7.7. The standard Black–Scholes PDE (7.1) is included for λ = 0. A simplified derivation of (7.34) can be found in Appendix A.4. For further discussions, see for example [84, 270, 365, 375].

7.3.3 Transformation of the PIDE

We approach the PIDE (7.34) with the transformation

$$\displaystyle{ \tau:= T - t\,,\quad x:=\log S\,,\quad u(x,\tau ):= V (\mathrm{e}^{x},T-\tau )\,, }$$
(7.35)

which appears moderate as compared to (4.3). Substituting accordingly

$$\displaystyle{u_{x} ={ \partial V \over \partial S} S\,,\quad u_{xx} = u_{x} + S^{2}{\partial ^{2}V \over \partial S^{2}} }$$

into (7.34) leads to

$$\displaystyle{-u_{\tau } +{ 1 \over 2}\sigma ^{2}(u_{ xx} - u_{x}) + (r -\lambda c)u_{x} - (\lambda +r)u +\lambda \mathsf{E}(V (q\mathrm{e}^{x},T-\tau )) = 0\,,}$$

which is organized into

$$\displaystyle{u_{\tau } -{ 1 \over 2}\sigma ^{2}u_{ xx} - (r -\lambda c -{ 1 \over 2}\sigma ^{2})u_{ x} + (\lambda +r)u -\lambda \mathsf{E}(V (q\mathrm{e}^{x},T-\tau )) = 0\,.}$$

After the above transformation \(S =\mathrm{ e}^{x}\) we next transform the jump-size variable \(q =\mathrm{ e}^{y}\). Ignoring the factor λ, the integral term changes to

$$\displaystyle{ \begin{array}{rcl} &&\mathsf{E}(V (q\mathrm{e}^{x},T-\tau )) = \mathsf{E}(V (\mathrm{e}^{x+y},T-\tau )) = \mathsf{E}(u(x + y,\tau )) \\ && =\int _{\mathbb{R}}u(x + y,\tau )f(\,y)\,\mathrm{d}y =\int _{\mathbb{R}}u(z,\tau )f(z - x)\,\mathrm{d}z\,, \end{array} }$$
(7.36)

where we have applied the substitution z: = x + y. The function f for Merton’s jump-diffusion model is the density of \(y =\log q \sim \mathcal{N}(\mu _{\mathrm{J}},\sigma _{\mathrm{J}}^{2})\). In summary, the PIDE of Merton’s jump-diffusion model is

Problem 7.9 (Merton’s Jump-Diffusion PIDE)

$$\displaystyle{ \begin{array}{rcl} u_{\tau }&-&{1 \over 2}\sigma ^{2}u_{ xx} - (r -\lambda c -{ 1 \over 2}\sigma ^{2})u_{ x} + (\lambda +r)u \\ &-&\lambda \int _{\mathbb{R}}u(z,\tau )f(z - x)\,\mathrm{d}z = 0\,, \\ & &\mathit{\text{with }}\ f(\,y) ={ 1 \over \sqrt{2\pi }\sigma _{\mathrm{J}}}\exp \left [-{(\,y -\mu _{\mathrm{J}})^{2} \over 2\sigma _{\mathrm{J}}^{2}} \right ] \\ & &\mathit{\text{and }}\ \ c =\exp [\mu _{\mathrm{J}} +{ 1 \over 2}\sigma _{\mathrm{J}}^{2}] - 1\,.\\ \end{array} }$$
(7.37)

This is the problem to be solved numerically.

7.3.4 Numerical Approximation

For an approximation of the integral (7.36) we truncate the domain to a finite interval \(x_{\min } \leq x \leq x_{\max }\). In view of the meaning of the integral, this truncation amounts to disregarding large jumps. This might be seen as a weakness of the approach, but jumps that large are highly improbable. The simplest discretization approach is to use an equidistant x-grid with

$$\displaystyle{\varDelta x:={ x_{\max } - x_{\min } \over m} \,,\quad x_{i}:= x_{\min } + i\varDelta x\,,\quad i = 0,\ldots,m\,,}$$

for a suitable integer m. As in Chap. 4, the time-stepping nodes are τ ν , and the approximations of u(x i , τ ν ) are denoted by w i, ν . The integral in (7.37) is evaluated at each node (x, τ) = (x i , τ ν ). That is, for each i, ν, the numbers

$$\displaystyle{\int _{\mathbb{R}}u(z,\tau _{\nu })f(z - x_{i})\,\mathrm{d}z\, \approx \,\int _{x_{\mathrm{min}}}^{x_{\mathrm{max}} }u(z,\tau _{\nu })f(z - x_{i})\,\mathrm{d}z}$$

are to be approximated. Applying the composite trapezoidal sum (C.2) with

$$\displaystyle{f_{i,l}:= f(x_{l} - x_{i}) = f((l - i)\varDelta x)\,,}$$

the approximation of the integral for each i, ν is

$$\displaystyle{ \varDelta x\bigg[{w_{0,\nu }f_{i,0} \over 2} +\sum _{ l=1}^{m-1}w_{ l,\nu }f_{i,l} +{ w_{m,\nu }f_{i,m} \over 2} \bigg]\,. }$$
(7.38)

The numbers f i, l are elements of a Toeplitz matrix.Footnote 6 That is, the entries take only 2m + 1 different numbers. Due to the exponential structure of f, the elements in the northeast and southwest corners of the f i, l -matrix go to zero. In this sense, this Toeplitz matrix has a “banded” structure. In summary, for each i, ν the integral is approximated by a scalar product of the row vector

$$\displaystyle{\varDelta x\,\left ({f_{i,0} \over 2} \,,\,f_{i,1}\,,\,\ldots,f_{i,m-1}\,,\,{ f_{i,m} \over 2} \right )}$$

times the vector \(w^{(\nu )}\). In (7.38) the first term \(w_{0,\nu }\) and the last term \(w_{m,\nu }\) (where boundary conditions enter) must be treated separately in case we deal with the short vector \((w_{1},\ldots,w_{m-1})\) as in Sect. 4.2.3. Now assemble all the rows into an (m + 1) × (m + 1) matrix C. Then for all i within time level ν, the integrals are represented by the product

$$\displaystyle{Cw^{(\nu )}\,.}$$
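The assembly of C can be sketched as follows. The code fills the matrix from the 2m + 1 distinct values f(kΔx), k = −m, …, m, and applies the trapezoidal end weights of (7.38) to the first and last columns. The function names and the grid parameters are assumptions for illustration; the normal density is the one of Problem 7.9.

```python
import numpy as np

def normal_density(y, mu_J, sigma_J):
    """Density f of y = log q ~ N(mu_J, sigma_J^2)."""
    return np.exp(-(y - mu_J)**2 / (2.0 * sigma_J**2)) / (np.sqrt(2.0 * np.pi) * sigma_J)

def build_C(x_min, x_max, m, mu_J, sigma_J):
    """(m+1) x (m+1) quadrature matrix with entries dx * f((l - i) dx), end columns halved."""
    dx = (x_max - x_min) / m
    k = np.arange(-m, m + 1)
    vals = normal_density(k * dx, mu_J, sigma_J)   # the 2m+1 distinct Toeplitz values
    i = np.arange(m + 1)[:, None]
    l = np.arange(m + 1)[None, :]
    C = dx * vals[l - i + m]                       # entry (i, l) depends only on l - i
    C[:, 0] *= 0.5                                 # trapezoidal weight for w_0
    C[:, -1] *= 0.5                                # trapezoidal weight for w_m
    return C

C = build_C(x_min=-3.0, x_max=3.9, m=100, mu_J=-0.3, sigma_J=0.4)
print(C.shape, C[50].sum())   # a row far from the boundaries sums to about 1
```

With a vector `w` of length m + 1, the product `C @ w` then approximates the integrals for all i at once. Rows near the boundaries sum to less than 1, which reflects the truncation of the integration domain.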

Neglecting the fact that many of its elements are close to zero, the matrix C is dense, which reflects the nonlocal character of the integral. This is in contrast to the local character of standard finite differences with its tridiagonal matrices. The transformation (7.35) is different from (4.3), but tridiagonal matrices can be derived from (7.37) in a similar way as done in Chap. 4. The dense matrix C adds to the tridiagonal matrices, which makes the solution of linear systems with full matrices in each time step ν → ν + 1 more expensive. In an attempt to save costs, splitting has been suggested. This means evaluating the integral at the previous time level (ν). In this way, the multiplication Cw only shows up on the right-hand side among the known terms. The tridiagonality of the left-hand side matrices is maintained, and the method still converges. Up to boundary conditions, this splitting can be represented by an Euler-type implicit scheme

$$\displaystyle{{ w^{(\nu +1)} - w^{(\nu )} \over \varDelta \tau } = Gw^{(\nu +1)} +\lambda Cw^{(\nu )}\,, }$$
(7.39)

where the matrix G represents the local information of the differentials. Neither G nor C is symmetric. We leave it to the reader to set up the system of equations (Exercise 7.7) (Footnote 7). The matrices G and C are used for the analysis; no matrix is needed for the algorithm. For an illustration of how a larger intensity λ increases the value of an option, see Fig. 7.4.

Fig. 7.4

V (S, 0) of a European put option, solution of Problem 7.9; parameters as in Example 1.21: K = 10, r = 0.06, σ = 0.3, T = 1, with Merton’s jump diffusion, μ J = −0.3, σ J = 0.4, and three values of jump intensity λ: 0 (lower curve in red, no jump), 0.1 (green curve), and 0.2 (top curve, in blue); x min = −3, x max = log(K) + 1.6 = 3.9. The chosen value of μ J = −0.3 corresponds to q = exp(μ J) = 0.74, or a 26% fall in the asset price
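A minimal sketch of the splitting scheme (7.39) for the European put of Problem 7.9 might look as follows. Several choices here are assumptions and not prescribed by the text: central differences for \(u_{x}\) and \(u_{xx}\), the Dirichlet boundary values \(K\mathrm{e}^{-r\tau } -\mathrm{ e}^{x_{\min }}\) at x min and 0 at x max, the grid sizes, and the use of a dense solver in place of a tridiagonal one (chosen only to keep the example short). The matrices are formed explicitly for readability.

```python
import numpy as np

def merton_put_imex(K=10.0, r=0.06, sigma=0.3, T=1.0, lam=0.1,
                    mu_J=-0.3, sigma_J=0.4, x_min=-3.0, x_max=None,
                    m=300, n_steps=200):
    """Implicit Euler for the local part, integral term taken at the old time level."""
    if x_max is None:
        x_max = np.log(K) + 1.6
    dx = (x_max - x_min) / m
    dtau = T / n_steps
    x = x_min + dx * np.arange(m + 1)
    c = np.exp(mu_J + 0.5 * sigma_J**2) - 1.0

    # Quadrature matrix C from (7.38): entries dx * f(x_l - x_i), end columns halved.
    diff = x[None, :] - x[:, None]
    C = dx * np.exp(-(diff - mu_J)**2 / (2.0 * sigma_J**2)) / (np.sqrt(2.0 * np.pi) * sigma_J)
    C[:, 0] *= 0.5
    C[:, -1] *= 0.5

    # Tridiagonal G for 0.5*sigma^2*u_xx + (r - lam*c - 0.5*sigma^2)*u_x - (lam + r)*u.
    a = 0.5 * sigma**2 / dx**2
    b = (r - lam * c - 0.5 * sigma**2) / (2.0 * dx)
    n = m - 1                                   # number of interior nodes
    idx = np.arange(n)
    A = np.zeros((n, n))                        # A = I - dtau * G on interior nodes
    A[idx, idx] = 1.0 + dtau * (2.0 * a + lam + r)
    A[idx[1:], idx[:-1]] = -dtau * (a - b)      # subdiagonal
    A[idx[:-1], idx[1:]] = -dtau * (a + b)      # superdiagonal

    w = np.maximum(K - np.exp(x), 0.0)          # payoff of the put at tau = 0
    for nu in range(1, n_steps + 1):
        tau = nu * dtau
        left = K * np.exp(-r * tau) - np.exp(x_min)   # assumed boundary value at x_min
        right = 0.0                                   # assumed boundary value at x_max
        rhs = w[1:-1] + dtau * lam * (C @ w)[1:-1]    # explicit integral term
        rhs[0] += dtau * (a - b) * left
        rhs[-1] += dtau * (a + b) * right
        w_new = np.empty_like(w)
        w_new[0], w_new[-1] = left, right
        w_new[1:-1] = np.linalg.solve(A, rhs)
        w = w_new
    return x, w

values = {}
for lam in (0.0, 0.1, 0.2):
    x, w = merton_put_imex(lam=lam)
    values[lam] = float(np.interp(np.log(10.0), x, w))   # V(S=10, t=0)
print(values)
```

For λ = 0 the scheme reduces to an implicit Euler discretization of the Black–Scholes equation in log-price, which gives a simple consistency check against the closed-form put value.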

Since the splitting can deteriorate the accuracy, a fixed point iteration has been suggested [105]. The integral term E(V ) with its truncation and discretization makes it harder to control the errors involved. For example, [85] gives an estimate of the error induced by truncating the integral, as well as a convergence proof for finite differences applied to general Lévy models. Codes for American options based on a penalty formulation or on an LCP formulation can be easily modified and extended by an integral term. The techniques of Chap. 4 or Chap. 5 can be applied. Application of FFT increases the efficiency [105]. Typically, each Lévy process calls for a separate algorithm. A Monte Carlo approach is described in [272]. For Merton’s model and European options, an analytic solution is given in [270], which allows testing corresponding algorithms.
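A reference value for such tests can be sketched with the well-known series representation for lognormal jumps: conditioning on the number n of jumps turns the price into a Poisson-weighted sum of Black–Scholes values with adjusted rate and volatility. The sketch below is written from this standard representation and not copied from [270]; the function names and the truncation `n_terms` are assumptions.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_put(S, K, r, sigma, T):
    """Black-Scholes value of a European put without dividends."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return K * math.exp(-r * T) * norm_cdf(-d2) - S * norm_cdf(-d1)

def merton_put_series(S, K, r, sigma, T, lam, mu_J, sigma_J, n_terms=60):
    """Poisson-weighted sum of Black-Scholes puts, conditional on n jumps."""
    c = math.exp(mu_J + 0.5 * sigma_J**2) - 1.0
    lam_bar = lam * (1.0 + c)
    total = 0.0
    for n in range(n_terms):
        sigma_n = math.sqrt(sigma**2 + n * sigma_J**2 / T)
        r_n = r - lam * c + n * (mu_J + 0.5 * sigma_J**2) / T
        weight = math.exp(-lam_bar * T) * (lam_bar * T)**n / math.factorial(n)
        total += weight * bs_put(S, K, r_n, sigma_n, T)
    return total

# Parameters of Fig. 7.4, evaluated at S = K = 10.
for lam in (0.0, 0.1, 0.2):
    print(lam, merton_put_series(10.0, 10.0, 0.06, 0.3, 1.0, lam, -0.3, 0.4))
```

For λ = 0 only the term n = 0 contributes, and the series reduces to the Black–Scholes put.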

7.4 Application of the Fourier Transform

The Fourier transform \(\mathcal{F}\) of a real function f is defined by (Footnote 8)

$$\displaystyle{ \mathcal{F}[\,f(u)]:=\int _{ y=-\infty }^{\infty }\mathrm{e}^{\mathrm{i}uy}f(\,y)\,\mathrm{d}y\,. }$$
(7.40)

This requires integrability of f. The inverse Fourier transformation is

$$\displaystyle{ \mathcal{F}^{-1}[g(x)] ={ 1 \over 2\pi }\int _{u=-\infty }^{\infty }\mathrm{e}^{-\mathrm{i}xu}g(u)\,\mathrm{d}u\,. }$$
(7.41)

A sufficiently well-behaved f is recovered by the inversion,

$$\displaystyle{f = \mathcal{F}^{-1}\mathcal{F}f\,.}$$

We perform this process of transform and inverse transform for a function c(k) to be defined below. The application of the Fourier transform in our context and the outline of three steps of the subsequent analysis is symbolized as follows:

$$\displaystyle\begin{array}{rcl} & & \!\!\!\!\!\!\!\!\!\!\!\!\!\!\!(1) {}\\ c(k)& \circ \longrightarrow \bullet & g(u) = {{\rm integral}} {}\\ & & \quad \downarrow (2) {}\\ c(k)& \circ \longleftarrow \bullet & g(u) = {{\rm formula}} {}\\ & & \!\!\!\!\!\!\!\!\!\!\!\!\!\!\!(3) {}\\ & & {}\\ \end{array}$$

Step (1) is the forward Fourier transform (7.40) of a function c(k). The result is an integral expression g(u). In our context this integral can be solved analytically (step (2)), which produces a formula for g(u). The inverse transformation (7.41) in step (3) is approximated numerically by the Fast Fourier Transformation (FFT), based on (C.7). The detour (1)–(3) is worth the effort, because the FFT calculation of c(k) is faster to evaluate than the original c(k).

Recall the characteristic function (7.28) ϕ of a Lévy process X t . These functions are the Fourier transform of the density function of X,

$$\displaystyle{ \phi _{X_{t}}(u):= \mathsf{E}(\exp (\mathrm{i}uX_{t})) =\int _{ -\infty }^{\infty }\mathrm{e}^{\mathrm{i}ux}f_{\mathrm{ density}X}(x)\,\mathrm{d}x = \mathcal{F}[\,f_{\mathrm{density}X}]\,. }$$
(7.42)

The characteristic functions ϕ of many processes X are known and available as analytical expressions, for example, in [84, 235, 332].

In the following, we investigate a European call with vanilla payoff \(\varPsi (S) = (S - K)^{+}\) with an arbitrary underlying Lévy process S t . The integral representation of the call’s value under the risk-neutral measure Q is

$$\displaystyle\begin{array}{rcl} V (S_{t},t;\,K)& =& \mathrm{e}^{-r(T-t)}\mathsf{E}_{\mathsf{ Q}}[\varPsi (S_{T})\,\vert \,S_{t}] {}\\ & =& \mathrm{e}^{-r(T-t)}\int _{ S_{T}=K}^{\infty }(S_{ T} - K)\,f_{\mathrm{density}}(S_{T})\,\mathrm{d}S_{T}\,, {}\\ \end{array}$$

where f is the density of S T of the Lévy process starting at t with the value S t . Transform

$$\displaystyle{ S_{T} =\mathrm{ e}^{s},\ K =\mathrm{ e}^{k},\ \mathrm{d}S_{ T} =\mathrm{ e}^{s}\mathrm{d}s\,; }$$
(7.43)

note that \(k \in \mathbb{R}\). Then

$$\displaystyle{V (S_{t},t;\,K) =\mathrm{ e}^{-r(T-t)}\int _{ k}^{\infty }(\mathrm{e}^{s} -\mathrm{ e}^{k})\hat{f}(s)\,\mathrm{d}s\,,}$$

where \(\hat{f}(s) =\mathrm{ e}^{s}f(\mathrm{e}^{s})\) is the density of log S, similarly to Sect. 1.8.2. Following [68], in order to make the function integrable, we scale the integral with a factor exp(αk), where α is a constant:

$$\displaystyle{ c(k):=\mathrm{ e}^{\alpha k}\mathrm{e}^{-r(T-t)}\int _{ k}^{\infty }(\mathrm{e}^{s} -\mathrm{ e}^{k})\hat{f}(s)\,\mathrm{d}s =\mathrm{ e}^{\alpha k}V (S_{ t},t;\,K) }$$
(7.44)

and denote its Fourier transform by \(\mathcal{F}[c(u)]\). We leave the choice of the scaling parameter α open until later.

As outlined above, when \(\mathcal{F}[c]\) is calculated, then the call’s value V (S, t) is recovered from the inverse Fourier transformation,

$$\displaystyle{V (S_{t},t;\mathrm{e}^{k}) =\bigg ({1 \over 2\pi }\int _{-\infty }^{\infty }\mathrm{e}^{-\mathrm{i}uk}\mathcal{F}[c(u)]\,\mathrm{d}u\bigg) \cdot \mathrm{ e}^{-\alpha k}\,,}$$

which can be approximated efficiently by the Fast Fourier Transform (FFT). This outlines the program of the three steps (1), (2), (3), and now we turn to its realization.

The Fourier transform of c(k) is

$$\displaystyle\begin{array}{rcl} \mathcal{F}[c(u)]& =& \int _{k=-\infty }^{\infty }\mathrm{e}^{\mathrm{i}uk}c(k)\,\mathrm{d}k {}\\ & =& \int _{-\infty }^{\infty }\mathrm{e}^{\mathrm{i}uk}\mathrm{e}^{\alpha k}\mathrm{e}^{-r(T-t)}\int _{ s=k}^{\infty }(\mathrm{e}^{s} -\mathrm{ e}^{k})\hat{f}(s)\,\mathrm{d}s\,\mathrm{d}k {}\\ & =& \mathrm{e}^{-r(T-t)}\int _{ k=-\infty }^{\infty }\int _{ s=k}^{\infty }\mathrm{e}^{(\mathrm{i}u+\alpha )k}(\mathrm{e}^{s} -\mathrm{ e}^{k})\hat{f}(s)\,\mathrm{d}s\,\mathrm{d}k {}\\ & =& \mathrm{e}^{-r(T-t)}\int _{ s=-\infty }^{\infty }\int _{ k=-\infty }^{s}\mathrm{e}^{(\mathrm{i}u+\alpha )k}(\mathrm{e}^{s} -\mathrm{ e}^{k})\hat{f}(s)\,\mathrm{d}k\,\mathrm{d}s\,, {}\\ \end{array}$$

where the last equation holds since

$$\displaystyle{\{\,k \leq s <\infty \,\mid -\infty <k <\infty \,\} = \{\, -\infty <k \leq s\,\mid -\infty <s <\infty \,\}\,.}$$

This leads to

$$\displaystyle{ \begin{array}{rcl} \mathcal{F}[c(u)]& =&\mathrm{e}^{-r(T-t)}\int _{ -\infty }^{\infty }\hat{f}(s)\int _{ -\infty }^{s}[\mathrm{e}^{(\mathrm{i}u+\alpha )k+s} -\mathrm{ e}^{(\mathrm{i}u+\alpha +1)k}]\,\mathrm{d}k\,\mathrm{d}s \\ & =&\mathrm{e}^{-r(T-t)}\int _{ -\infty }^{\infty }\hat{f}(s)\bigg[{\mathrm{e}^{s}\mathrm{e}^{(\mathrm{i}u+\alpha )k} \over \mathrm{i}u+\alpha } -{\mathrm{e}^{(\mathrm{i}u+\alpha +1)k} \over \mathrm{i}u +\alpha +1} \bigg]_{k=-\infty }^{s}\,\mathrm{d}s\,. \end{array} }$$
(7.45)

For the integral to exist, we require the factor \(\mathrm{e}^{\alpha k}\) to vanish for \(k \rightarrow -\infty\), which leads us to choose α > 0. That is, the factor exp(αk) acts as a damping of the integrand. The bracketed term in (7.45) is

$$\displaystyle{{(\mathrm{i}u +\alpha +1)\mathrm{e}^{s(\mathrm{i}u+\alpha +1)} - (\mathrm{i}u+\alpha )\mathrm{e}^{s(\mathrm{i}u+\alpha +1)} \over \mathrm{i}u(2\alpha + 1) +\alpha (\alpha +1) - u^{2}} \,,}$$

and we come up with

$$\displaystyle{\mathcal{F}[c(u)] ={ \mathrm{e}^{-r(T-t)} \over \mathrm{i}u(2\alpha + 1) +\alpha (\alpha +1) - u^{2}}\int _{-\infty }^{\infty }\hat{f}(s)\mathrm{e}^{\mathrm{i}s(u-(\alpha +1)\mathrm{i})}\,\mathrm{d}s\,.}$$

We denote the integral therein by ϕ(u − (α + 1)i), because it is the characteristic function of the density \(\hat{f}\), evaluated at the complex argument u − (α + 1)i. For ϕ an analytic expression is known. Hence

$$\displaystyle{ \mathcal{F}[c(u)] ={ \mathrm{e}^{-r(T-t)}\,\phi (u - (\alpha +1)\mathrm{i}) \over \alpha ^{2} +\alpha -u^{2} + \mathrm{i}u(2\alpha + 1)} =: g(u) }$$
(7.46)

can be considered to be a known function g, and step (2) is completed. For the final choice of the parameter α > 0 we further require \(g(u) = \mathcal{F}[c(u)]\) to be integrable as well. Since the integration is along real values of u, one has to take care that the denominator has only imaginary roots in u. The choice of α is discussed in the literature [68, 235]. Usually α = 3 works well.
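The requirement on the roots can be checked directly. As a short worked step (not part of the original derivation), the denominator of (7.46) factorizes as

$$\displaystyle{\alpha ^{2} +\alpha -u^{2} + \mathrm{i}u(2\alpha + 1) = (\mathrm{i}u+\alpha )(\mathrm{i}u +\alpha +1)\,,}$$

so its roots are u = iα and u = i(α + 1). For α > 0 both roots are purely imaginary and different from zero, hence g has no pole on the real axis, along which the inverse transformation integrates.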

The inverse Fourier transformation evaluates

$$\displaystyle{\mathrm{e}^{-\alpha k}{1 \over 2\pi }\int _{-\infty }^{\infty }\mathrm{e}^{-\mathrm{i}ku}g(u)\,\mathrm{d}u\,.}$$

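The integral is real since the call value is real. Think of g from (7.46) being split into real part and imaginary part, \(g(u) = g_1(u) + \mathrm{i}g_2(u)\). Because c(k) is real, its Fourier transform satisfies \(g(-u) = \overline{g(u)}\), and we conclude that \(g_1(u)\) is an even function, and \(g_2(u)\) is an odd function. Then the imaginary part \(\cos (ku)g_2(u) -\sin (ku)g_1(u)\) of the integrand is odd in u, and its integral over the real line vanishes. The real part of the integrand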

$$\displaystyle{\cos (ku)g_{1}(u) +\sin (ku)g_{2}(u)}$$

is even, and the value of the call is

$$\displaystyle{ V (S_{t},t;\mathrm{e}^{k}) ={ \mathrm{e}^{-\alpha k} \over \pi } \int _{0}^{\infty }\mathrm{e}^{-\mathrm{i}ku}g(u)\,\mathrm{d}u\,. }$$
(7.47)

Next, the semi-infinite integration interval is truncated to a finite length A. For most Lévy models the truncation error can be made arbitrarily small because the characteristic function ϕ decays exponentially fast at infinity. With the restriction to the integration interval 0 ≤ u ≤ A and M − 1 subintervals of equal length Δu, the discrete grid points are

$$\displaystyle{u_{j}:= j\varDelta u = j{ A \over M - 1}\,,\quad j = 0,\ldots,M - 1\,.}$$

Choosing the trapezoidal sum (C.2) for the quadrature, the approximation is

$$\displaystyle{ \int _{0}^{\infty }\mathrm{e}^{-\mathrm{i}ku}g(u)\,\mathrm{d}u\, \approx \,{ A \over M - 1}\sum _{j=0}^{M-1}\beta _{ j}\,g(u_{j})\,\mathrm{e}^{-\mathrm{i}ku_{j} } }$$
(7.48)

with weights \(\beta _{0} =\beta _{M-1} ={ 1 \over 2}\) and \(\beta _{j} = 1\) for 1 ≤ j ≤ M − 2. The trapezoidal sum goes along with a sampling error of the order \(O(\varDelta u^{2})\).
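As a plausibility check of (7.47) and (7.48) for one fixed strike, the following Python sketch evaluates the trapezoidal sum directly and takes its real part. It assumes the Black–Scholes model as test case, for which \(\hat{f}\) is a normal density and ϕ is known in closed form. The function names and all parameter values (S_t = 100, K = 100, r = 0.05, σ = 0.2, T − t = 1, α = 3, A = 64, M = 257) are illustrative assumptions and are not taken from the text.

```python
import cmath
import math


def bs_char_fn(z, s0, r, sigma, tau):
    """Characteristic function of log S_T under geometric Brownian motion.

    Assumption: risk-neutral Black-Scholes dynamics; z may be complex.
    """
    m = math.log(s0) + (r - 0.5 * sigma**2) * tau  # mean of log S_T
    return cmath.exp(1j * z * m - 0.5 * sigma**2 * tau * z * z)


def g(u, alpha, s0, r, sigma, tau):
    """The function g(u) of (7.46)."""
    phi = bs_char_fn(u - (alpha + 1) * 1j, s0, r, sigma, tau)
    denom = alpha**2 + alpha - u**2 + 1j * u * (2 * alpha + 1)
    return math.exp(-r * tau) * phi / denom


def call_by_trapezoid(strike, s0, r, sigma, tau, alpha=3.0, A=64.0, M=257):
    """Approximate (7.47) by the trapezoidal sum (7.48) for one strike."""
    k = math.log(strike)
    du = A / (M - 1)
    total = 0.0
    for j in range(M):
        beta = 0.5 if j in (0, M - 1) else 1.0  # trapezoidal weights
        u = j * du
        # only the real part of the integrand contributes to the call value
        total += beta * (cmath.exp(-1j * k * u) * g(u, alpha, s0, r, sigma, tau)).real
    return math.exp(-alpha * k) / math.pi * du * total


def bs_call(strike, s0, r, sigma, tau):
    """Closed-form Black-Scholes call value, used as reference."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-r * tau) * cdf(d2)


price = call_by_trapezoid(100.0, 100.0, 0.05, 0.2, 1.0)
reference = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
print(price, reference)
```

For this Gaussian test case the two printed values should agree closely, because the integrand decays like a Gaussian in u. For other Lévy models only the function `bs_char_fn` would have to be exchanged, and A and M may need to be larger.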

So far, the log-strike k = log K is not specified. The aim is to exploit the potential of FFT, which calculates sums of the type

$$\displaystyle{ \sum _{j=0}^{M-1}a_{ j}\,\mathrm{e}^{-\mathrm{i}\nu j{ 2\pi \over M} } }$$
(7.49)

for complex numbers \(a_{0},\ldots,a_{M-1}\), one sum for each ν. This amounts to calculating a vector of M such sums, for ν = 0, …, M − 1. Applying FFT we gain the possibility to calculate the above for M strikes simultaneously. Let us calculate the call values for the log-strike values

$$\displaystyle{ k_{\nu }:= -b +\varDelta k\cdot \nu \,,\quad \nu = 0,\ldots,M - 1\,, }$$
(7.50)

for suitable values of b and Δk, which define the k-range and the strike spacing of interest. Substituting these values \(k_{\nu }\) into the above sum (7.48) produces

$$\displaystyle{{ A \over M - 1}\sum _{j=0}^{M-1}\beta _{ j}\,g(u_{j})\exp \left [-\mathrm{i}(-b +\varDelta k\,\nu )j{ A \over M - 1}\right ]\,.}$$

The argument of the exponential function is

$$\displaystyle{\mathrm{i}bj{ A \over M - 1} -\mathrm{i}\nu j\varDelta k{ A \over M - 1}\,.}$$

To apply FFT aiming at (7.49), steps Δk and \(\varDelta u ={ A \over M-1}\) must be chosen such that

$$\displaystyle{ \varDelta k{ A \over M - 1} =\varDelta k\,\varDelta u ={ 2\pi \over M} \,. }$$
(7.51)

Then the sum in (7.48) is

$$\displaystyle{{ A \over M - 1}\sum _{j=0}^{M-1}\beta _{ j}g(u_{j})\exp \left [\mathrm{i}bj{ A \over M - 1}\right ]\,\mathrm{e}^{-\mathrm{i}\nu j{ 2\pi \over M} }\,,}$$

which is a sum of the type (7.49), to be evaluated by the standard FFT for the complex numbers

$$\displaystyle{ a_{j}:={ A \over M - 1}\,\beta _{j}g(u_{j})\exp \left [\mathrm{i}bj{ A \over M - 1}\right ]\,,\quad j = 0,\ldots,M - 1\,. }$$
(7.52)

This completes the calculation of a bunch of European call values: The integral in (7.47) is approximated by the FFT sum (7.49) with coefficients (7.52). For the highly efficient calculation of the FFT sums (7.49) consult standard literature on numerical analysis (such as [306]), and related software packages.

The above method amounts to a fast algorithm in case option prices are to be calculated on a grid of many strikes, all options with the same maturity T. The log-strike grid of the values \(k_{\nu }\) is defined by (7.50) with the parameters b and Δk, which in turn are based on A, M. By (7.51),

$$\displaystyle{\varDelta k ={ 2\pi \over A}{ M - 1 \over M} \,.}$$

To cover log-strikes in the at-the-money range around k = 0, one aims at

$$\displaystyle{b ={ (M - 1)\varDelta k \over 2} \,.}$$

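Efficiency of FFT is maximal for M a power of 2. Equation (7.51) couples the strike spacing Δk to the quadrature step Δu: for fixed M, a finer grid in u enforces a coarser grid in k, and vice versa. This restriction requires a careful choice of the parameters M and A.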
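The whole procedure (7.46) through (7.52) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production implementation: the Black–Scholes model with S_t = 1 serves as test case (so that k = 0 is at the money), the parameter values r = 0.05, σ = 0.2, T − t = 1, α = 3, A = 128, M = 512 are chosen for illustration, and the function names as well as the small recursive radix-2 FFT are hypothetical helpers that keep the example free of external libraries. The quadrature factor Δu = A∕(M − 1) is folded into the coefficients a_j, so that the FFT output equals the trapezoidal sum (7.48) at each k_ν.

```python
import cmath
import math


def fft(a):
    """Radix-2 FFT: sum_j a[j] * exp(-2*pi*i*nu*j/M), nu = 0..M-1, as in (7.49).

    len(a) must be a power of 2.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for nu in range(n // 2):
        tw = cmath.exp(-2j * math.pi * nu / n) * odd[nu]
        out[nu] = even[nu] + tw
        out[nu + n // 2] = even[nu] - tw
    return out


def g(u, alpha, r, sigma, tau):
    """g(u) of (7.46); assumption: Black-Scholes model with S_t = 1."""
    z = u - (alpha + 1) * 1j
    m = (r - 0.5 * sigma**2) * tau  # mean of log S_T for S_t = 1
    phi = cmath.exp(1j * z * m - 0.5 * sigma**2 * tau * z * z)
    denom = alpha**2 + alpha - u**2 + 1j * u * (2 * alpha + 1)
    return math.exp(-r * tau) * phi / denom


def fft_call_grid(r, sigma, tau, alpha=3.0, A=128.0, M=512):
    """Call values on the log-strike grid (7.50); returns (log-strikes, values)."""
    du = A / (M - 1)
    dk = 2.0 * math.pi / (M * du)  # coupling (7.51)
    b = 0.5 * (M - 1) * dk         # centers the grid around k = 0
    a = []
    for j in range(M):
        beta = 0.5 if j in (0, M - 1) else 1.0  # trapezoidal weights
        u = j * du
        a.append(du * beta * g(u, alpha, r, sigma, tau) * cmath.exp(1j * b * u))
    sums = fft(a)
    ks = [-b + dk * nu for nu in range(M)]
    # real part of the sum, undamped by exp(-alpha*k), cf. (7.47)
    values = [math.exp(-alpha * k) / math.pi * s.real for k, s in zip(ks, sums)]
    return ks, values


def bs_call(strike, s0, r, sigma, tau):
    """Closed-form Black-Scholes call value, used as reference."""
    d1 = (math.log(s0 / strike) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-r * tau) * cdf(d2)


ks, values = fft_call_grid(0.05, 0.2, 1.0)
for k, v in zip(ks, values):
    if abs(k) < 0.1:  # print a few near-the-money strikes with their reference
        print(math.exp(k), v, bs_call(math.exp(k), 1.0, 0.05, 0.2, 1.0))
```

With these assumed parameters the grid has Δu ≈ 0.25 and Δk ≈ 0.049, so only about a dozen of the 512 strikes fall into the range | k | < 0.3. The grid values far away from k = 0 should not be trusted: for strongly negative k the factor exp(−αk) amplifies rounding errors of the sum.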

In this section, we have explained the basic FFT approach of Carr and Madan [68]. The Fast Fourier Transform can be applied also for early-exercise options [248]. A novel transform is based on Fourier-cosine expansions [125], which is also applied to barrier options [126]. The resulting algorithms converge exponentially fast. In summary, FFT-based methods have shown a rich potential, in particular for option pricing under Lévy models.

7.5 Notes and Comments

On Sect. 7.1

For a critical account of Leland’s approach see [380]. The nonlinear version (7.6)–(7.8) is due to [187]. A piecewise linear treatment is suggested in [77]. The paper [18] discusses Eq. (7.7), suggesting a modification for the case γ ≥ 1, where \(\hat{\sigma }^{2}\) would be negative for Γ < 0. For bounds on V in case of “misspecified” volatility, see [118]. For related work, consult also [116, 156, 159].

Apart from the one-factor case, ranges for parameters play a role also in multiasset cases. For example, consider two assets with prices \(S_{1}\), \(S_{2}\), and assume a correlation in the range \(-1 \leq \rho _{\min } \leq \rho \leq \rho _{\max } \leq 1\). In the Black–Scholes equation (6.5), the term

$$\displaystyle{\rho \sigma _{1}\sigma _{2}S_{1}S_{2}{ \partial ^{2}V \over \partial S_{1}\partial S_{2}}}$$

occurs. Depending on the sign of the cross derivative \({ \partial ^{2}V \over \partial S_{1}\partial S_{2}}\), ρ is chosen either as \(\rho _{\min }\) or \(\rho _{\max }\) in order to characterize a “worst case,” see [362].

To complete the introduction into more general models we outline the Dupire equation in Appendix A.6.

On Sect. 7.2

For reference and examples consult [134, 176, 177]. The assumption of a constant \(c_{+}\) in Theorem 7.2 is not always satisfied easily. For example, in the Barles and Soner model of Sect. 7.1.2 and a payoff with jump discontinuity (as for a digital option), \(c_{+} = c_{+}(\varDelta x) = O(\varDelta x^{2})\), which affects the assumptions of Theorem 7.2, and has strong implications on stability. Apart from nonsmooth payoffs, also the PDE itself typically is not smooth. For American options, the penalty term in (7.16) causes a lack of smoothness. Also the volatility function \(\tilde{\sigma }\) may be nonsmooth. This happens, for example, in Leland’s model when \(V_{SS}\) changes sign. Newton’s method then works with a generalized derivative. The higher the degree of “non-smoothness,” the worse the convergence rate of CN. The BDF method (7.26)/(7.27) is highly recommended. An a priori check of convergence criteria is advisable.

On Sect. 7.3

The definition of Lévy processes includes stochastic continuity. A table of Lévy densities \(f_{L}\) is found in [332, p. 154]. The Lévy-Khinchin representation (7.29) is a scalar setting; [69] develops analytic expressions for the characteristic function of time-changed Lévy processes in a general vector setting. In this framework, Heston’s stochastic-volatility model can be represented as time-changed Brownian motion.

For time-changed Lévy processes, consult [11, 67, 69, 84]. Time-changed Lévy processes have been successfully applied to match empirical data. For processes with density function (Merton, VG, NIG), Algorithm 1.18 can be applied [309]. Lévy-process models have been extended by incorporating stochastic volatilities [67, 212]. A subordinator τ(t) can be constructed as integral of a square-root process.

Pham [299] investigates properties of American options. Heston presents the characteristic function for his model in [178]. His model extended by jump diffusion [30] can be cast into the above framework: In this case a two-dimensional PDE is considered. For computational approaches see [68, 55, 85, 104, 105, 263].

On Sect. 7.4

Choosing the weights \(w_{j}\) of Simpson’s sums instead of those of the trapezoidal sums, the integrations get more accurate. An application to VG is found in [68]. Modifications and extensions of the above basic approach are described and reviewed in [235]. For references on transform methods in option pricing, see [126].

7.6 Exercises

7.1. Let ΔW be the increment of a Wiener process, see Sect. 1.6.1. Show

$$\displaystyle{\mathsf{E}(\vert \varDelta W\vert ) = \sqrt{\varDelta t}\,\sqrt{{2 \over \pi }} \,.}$$

7.2 (Barles–Soner Model). 

The differential equation of Barles and Soner is

$$\displaystyle{{\mathrm{d}f(x) \over \mathrm{d}x} ={ f(x) + 1 \over 2\sqrt{xf(x)} - x}\ \text{ with }f(0) = 0\,.}$$

(a) By numerical computations, analyze the solution for − 2 ≤ x ≤ 2.

(b) Construct an approximating function \(\hat{f}(x)\) in a piecewise fashion.

7.3 (Payoffs of Spreads). 

We consider portfolios of two or more options of the same type with the same underlying stock. \(K_{1}\), \(K_{2}\), K are strikes with \(K_{1} <K_{2}\).

(a) A butterfly spread is a portfolio with

• one long call with strike \(K_{1}\),

• one long call with strike \(K_{2}\),

• two short calls with strike \(K ={ K_{1}+K_{2} \over 2}\).

The payoff is

$$\displaystyle{\varPsi (S) = \left \{\begin{array}{@{}l@{\quad }l@{}} 0 \quad &\text{for}\ S \leq K_{1} \\ S - K_{1}\quad &\text{for}\ K_{1} <S \leq K \\ K_{2} - S\quad &\text{for}\ K <S \leq K_{2} \\ 0 \quad &\text{for}\ K_{2} <S\,. \end{array} \right.}$$

(b) A bull spread is a portfolio with

• one long call with strike \(K_{1}\),

• one short call with strike \(K_{2}\).

The payoff is

$$\displaystyle{\varPsi (S) = \left \{\begin{array}{@{}l@{\quad }l@{}} 0 \quad &\text{for}\ S \leq K_{1} \\ S - K_{1} \quad &\text{for}\ K_{1} <S \leq K_{2} \\ K_{2} - K_{1}\quad &\text{for}\ K_{2} <S\,. \end{array} \right.}$$

For both spreads (a) and (b) explain and sketch the payoff. Apply the transformation (7.17) (Exercise 7.4) to derive the transformed payoff u (x). For (b), apply the transformation with K 2.

7.4 (Transformation of Nonlinear Black–Scholes Models). 

According to Sect. 7.2, consider the following nonlinear PDE

$$\displaystyle{V _{t} +{ 1 \over 2}\sigma ^{2}(t,S,V _{ SS})S^{2}V _{ SS} + (r-\delta )SV _{S} - rV +\hat{ p}\max (\varPsi -V,0) = 0\,,}$$

where \(\sigma ^{2}(t,S,V _{SS})\) depends on the particular model; r is the risk-free interest rate and δ is the continuous dividend yield. Apply the transformation (7.17)

$$\displaystyle{x =\log (S/K),\ \tau =\sigma _{ 0}^{2}(T - t)/2,\ u(x,\tau ) =\mathrm{ e}^{-x}V (S,t)/K,}$$

with K > 0 and a model-dependent parameter \(\sigma _{0}\), and derive a PDE for u.

7.5 (Convergence of the Fully Implicit Method). 

Two out of the three criteria for monotonicity in Theorem 7.2 are (i) and (ii). For

(a) Leland’s model of transaction costs, with parameter γ, and

(b) the model of uncertain volatility with \(\sigma _{\min } \leq \sigma \leq \sigma _{\max }\),

show that (i) and (ii) are satisfied. What are the constants \(c_{+}\)? For (b), σ of (7.12) suffices.

7.6. For a Lévy process \(X_{t}\) adapted to a filtration \(\mathcal{F}_{t}\) show

$$\displaystyle{\mathsf{E}(\mathrm{e}^{X_{t} }\,\vert \,\mathcal{F}_{s}) -\mathrm{ e}^{X_{s} } = \mathsf{E}(\mathrm{e}^{X_{t-s} }) -\mathrm{ e}^{X_{0} }\,.}$$

7.7 (Project: Implementing a PIDE). 

Set up a computer program to solve Merton’s jump diffusion (7.37) numerically. To this end, concentrate on European-style vanilla options. Set up boundary conditions using (4.27), and apply a BDF implicit scheme. Think of how to choose \(x_{\min }\), \(x_{\max }\) in relation to the strike K. Hint: For testing the core part of the program, set the jump intensity λ = 0 and compare to the Black–Scholes value.

7.8 (Fourier Transform).  

Consider the Fourier transform

$$\displaystyle{\mathcal{F}[\,f(u)]:=\int _{ -\infty }^{\infty }\mathrm{e}^{\mathrm{i}uy}f(\,y)\,\mathrm{d}y\,.}$$

For the example \(f(\,y):=\mathrm{ e}^{-a\vert y\vert }\) and complex a show that

$$\displaystyle{\int _{-A}^{A}\mathrm{e}^{\mathrm{i}uy}f(\,y)\,\mathrm{d}y}$$

converges for A → ∞ and Re(a) > 0.