1 Introduction

Since the seminal work of Black and Scholes (1973), a large number of stochastic processes designed for option pricing have been introduced in the financial literature. A common way to communicate option prices consists in computing the implied volatility (IV) obtained by inverting the Black and Scholes option price formula. IVs measure the uncertainty about how far an underlying asset might fall or rise over a given period. Unlike the historical volatility of the asset, IVs reflect market expectations of future volatility. As a result, IVs play an important role in financial markets, and option pricing models aim to replicate at least the main empirical features of asset time series. For a wide set of financial assets, Cont (2001) stated five stylized statistical facts about the asset’s volatility. First, high volatility events tend to cluster in time. Second, there is a long memory effect in volatility, which occurs when the effects of volatility shocks decay slowly. Third, the asset’s volatility is negatively correlated with the asset’s return. Fourth, trading volume is correlated with the asset’s volatility. Fifth, there is an asymmetry in time scales of the volatility, which is also known as the time reversal asymmetry of financial time series data. This has been emphasized by Zumbach and Lynch (2001) and called the “Zumbach effect” by Blanc et al. (2016). Besides these empirical findings on the asset’s volatility, common features of IV surfaces are the volatility term structure effects, the at-the-money skew and the smile (see e.g. Aït-Sahalia and Lo 1998; Aït-Sahalia et al. 2021). Designing an option pricing model that includes most of these characteristics is a challenging task.

In the quest for a suitable option pricing model, this article extends previous works on affine stochastic volatility (SV) models stemming from the Black and Scholes framework. While simple and easy to use, the Black and Scholes model relies on strong assumptions that are at odds with the behavior observed in market data. Merton (1976) and Bates (1991) relax the assumption of the Black and Scholes (1973) model that the asset’s return is normally distributed by inserting a jump component driven by a Poisson process. Later, Heston (1993) proposes a closed-form solution for an option on an asset with an SV model, which relaxes the assumption of constant variance of the Black and Scholes model. He also introduces a correlation between the variance and the asset price in order to replicate the volatility smile and to explain the skewness of the asset’s return. Heston’s framework has been extended in several directions. Among them, we can cite stochastic volatility jump (SVJ) models, introduced by Bakshi et al. (1997), Duffie et al. (2000), Bates (2000) and Eraker (2004), to mention a few pioneering works. The common point of these models, known as affine stochastic volatility models, is the heteroskedastic property of the instantaneous change in variance through the square root term of the variance in the diffusion volatility. However, other authors like Christoffersen et al. (2006) and Ignatieva et al. (2009) have studied non-affine stochastic volatility models in which the instantaneous change in variance is proportional to any power of the variance in the diffusion volatility. They conclude that the choice between affine and non-affine stochastic volatility models amounts to finding the right balance between fit flexibility and estimation properties. This motivates us to employ an affine SV model for its tractability and economic interpretation documented in the literature (see e.g. Heston 1993; Bakshi et al. 1997; Duffie et al. 2000).

Notice that, with a view to expanding affine SVJ models, some authors like Gatheral (2012) investigated stochastic volatility models with jumps in both asset prices and the variance, denoted SVJJ models. Although this setting maintains the clustering property of SV models, SVJJ models are less parsimonious than SVJ models and are harder to fit to observed option prices. Another popular family of SV models considers that the log volatility behaves as a fractional Brownian motion with a small Hurst exponent. Among seminal contributions about this family of SV models we list the fractional SV model of Comte et al. (2010) and the rough volatility model of Gatheral et al. (2014) that, in contrast to fractional SV, has a Hurst exponent less than 0.5. Even if these models are remarkably consistent with major features of financial time series, as stated by Blanc et al. (2016), they fail to produce return distributions with sufficiently fat tails, even with fluctuating volatility, since the asset’s return in these models is conditionally Gaussian. Moreover, these models are unable to explain in a realistic way how they generate fat tails, long memory in volatility and volatility clustering.

Nevertheless, most affine SV models do not allow for the Zumbach effect. This motivates authors like Zumbach et al. (2013), Jaisson and Rosenbaum (2015) and Blanc et al. (2016) to modify SV models in order to integrate this effect. The last two works are based on quadratic Hawkes processes (HPs), which extend the self-exciting mechanism of HPs introduced by Hawkes (1971a). In particular, Blanc et al. (2016) reformulate the general HP in such a way that intra-day price changes have an intensity proportional to powers of past price variations. Motivated by empirical results, the authors truncate this intensity at the second-order term and ignore the first-order term. However, the recent works of Hainaut and Moraux (2018a), Hainaut and Moraux (2018b) and Njike and Hainaut (2022) have shown the role of the linear HP when it comes to replicating features of the asset’s return time series. In combination with the self-excitation property of the linear HP, Hainaut (2021) has designed a non-Markov linear HP with memory for claims processes. The author generates this memory effect by considering a linear HP with a kernel function that admits a Fourier transform representation. In this work, we explore the benefit of such a setup in an option pricing model. In this setting, the linear HP increases the probability of observing a jump of the asset price, and therefore the future volatility, after a jump occurrence, whilst the memory kernel of the linear HP lets past prices affect future prices over a longer horizon.

This work contributes to the financial literature in three ways. First, we introduce an option pricing model where the underlying asset price dynamics is self-excited and allows for a longer effect of past prices through a linear HP. Second, although our model is non-Markov, we derive a closed-form expression for European call option prices. For hedging applications, the Greeks associated with the three sources of risk (price risk, volatility risk and jump risk) are obtained straightforwardly by taking derivatives of our formula with respect to the asset price, the variance or the arrival rate of jumps. Third, we provide a calibration procedure of our model to real market option data.

The rest of the article proceeds as follows. Section 2 presents the affine Heston-style model with self-exciting jumps and long memory. This section focuses on distributional properties of the arrival rate of jumps and the log-return through an infinite dimensional Markov representation of our asset price model and a discretisation scheme. Section 3 develops the option pricing model under our setting. Section 4 provides a description of the Euro Stoxx 50 option data. Therein, we present our estimation procedure, discuss the estimated parameters and assess the sensitivity of option prices to memory and self-excitation parameters, underlying price, volatility and jump risks. Concluding comments are developed in Sect. 5.

2 Model specification

We define all the processes on a probability space \(\Omega \) endowed with a risk neutral measure \({\mathbb {Q}}\) and a filtration \(\left( {\mathcal {F}}_{t}\right) _{t\ge 0}\). We focus exclusively on option pricing and assume that the pricing measure can be retrieved from European option prices. For the sake of simplicity, we assume a constant interest rate r.

Hereafter we consider an SVJ model with self-excited jumps in returns driven by a Hawkes process with a memory kernel:

$$\begin{aligned} \frac{\text {d}S_{t}}{S_{t}}= & {} r\text {d}t+\sqrt{V_{t}} \left( \rho \text {d}W_{t}^{1}+\sqrt{1-\rho ^{2}}\text {d}W_{t}^{2} \right) \\{} & {} +\text {d}\left( \sum _{l=1}^{N_{t}}\left( e^{J_{l}}-1\right) \right) -\lambda _{t}{\mathbb {E}}\left( e^{J}-1\right) \text {d}t\nonumber \\ \text {d}V_{t}= & {} \left( \theta _{V}-k_{V}V_{t}\right) \text {d}t +\sigma _{V}\sqrt{V_{t}}\text {d}W_{t}^{1}\nonumber \end{aligned}$$
(1)

where \(k_{V}\), \(\frac{\theta _{V}}{k_{V}}\) and \(\sigma _{V}\) are respectively the speed of mean reversion, the long-run level and the volatility coefficient of the variance process \(V_{t}\). \(W_{t}^{1}\) and \(W_{t}^{2}\) are two independent standard Brownian motions. \(\rho \) is the correlation between the Brownian shocks driving the stock price and those driving its variance. The variance process is the square root process introduced by Cox et al. (1985) and referred to as CIR. To ensure that zero is unattainable for the variance \(V_{t}\), we assume that the Feller condition \(2\theta _{V}>\sigma _{V}^{2}\) holds. Recall that an affine stochastic volatility model implies that the instantaneous change in variance is heteroskedastic via the \(\sqrt{V_{t}}\) term in the diffusion volatility. We introduce memory and contagion in the dynamics of stock prices by considering an arrival intensity \(\lambda _{t}\) driven by the following equation:

$$\begin{aligned} \lambda _{t}|{\mathcal {F}}_{t}=\lim _{h\rightarrow 0}\frac{{\mathbb {E}} \left( N_{t+h}|{\mathcal {F}}_{t}\right) -N_{t}}{h}= & {} \alpha +\left( \lambda _{0}-\alpha \right) g\left( t\right) +\eta \int _{0}^{t}g\left( t-u\right) \text {d}L_{u},\nonumber \\ \end{aligned}$$
(2)

with \(\alpha \) the constant reversion level, \(\lambda _{0}\) the initial intensity at time \(t=0\), and \(L_{t}=\sum _{l=1}^{N_{t}}\left| J_{l}\right| \) the sum of the absolute values of the jump sizes up to time t. The intensity jumps by \(\eta \left| J_{l}\right| \) when the \(l^{th}\) shock in the asset’s return dynamics occurs. Therefore, the occurrence of an event increases the probability of a further shock. The jump sizes J are independent and identically distributed with a double exponential density \(v\left( z\right) \) on \({\mathbb {R}}\):

$$\begin{aligned} v\left( z\right)= & {} p\rho ^{+}\exp \left( -\rho ^{+}z\right) 1_{\left\{ z\ge 0\right\} }-\left( 1-p\right) \rho ^{-}\exp \left( -\rho ^{-}z\right) 1_{\left\{ z<0\right\} }, \end{aligned}$$

with \(\rho ^{+}>1\), \(\rho ^{-}<0\) and \(p\in \left]0,1\right[\). The condition \(\rho ^{+}>1\) ensures that the jump size of the asset price has a finite expectation. Using the previous equation, it follows that the moment generating function (mgf) \(\psi \left( .,.\right) \) of the pair \(\left( J,\left| J\right| \right) \) is given by:

$$\begin{aligned} \psi \left( z_{1},z_{2}\right) :={\mathbb {E}}\left( e^{z_{1}J+z_{2}\left| J\right| }\right)= & {} \frac{p\rho ^{+}}{\rho ^{+}-\left( z_{1}+z_{2}\right) } +\frac{\left( 1-p\right) \rho ^{-}}{\rho ^{-}-\left( z_{1}-z_{2}\right) }. \end{aligned}$$

In particular, the expectations of J and \(\left| J\right| \) are weighted sums of the average sizes of positive and negative shocks:

$$\begin{aligned} {\mathbb {E}}\left( J\right)= & {} p\frac{1}{\rho ^{+}}+\left( 1-p\right) \frac{1}{\rho ^{-}}, \end{aligned}$$
(3)
$$\begin{aligned} {\mathbb {E}}\left( \left| J\right| \right)= & {} p\frac{1}{\rho ^{+}}-\left( 1-p\right) \frac{1}{\rho ^{-}}. \end{aligned}$$
(4)

We consider double exponential jumps as in Kou (2002) in order to replicate the asymmetry of the log-return distribution. Note that our SVJ model reduces to the Heston model if the jump sizes are null.
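The jump quantities above can be checked numerically. The following is a minimal Python sketch, not the authors' code: the function names and the parameter values \(p=0.4\), \(\rho ^{+}=20\), \(\rho ^{-}=-15\) are hypothetical and chosen only for illustration. It evaluates \(\psi \), Eqs. (3) and (4), and the compensator term \({\mathbb {E}}\left( e^{J}-1\right) =\psi \left( 1,0\right) -1\) appearing in Eq. (1).

```python
import math
import random

def psi(z1, z2, p, rho_plus, rho_minus):
    """Joint mgf E[exp(z1*J + z2*|J|)] of a double exponential jump J."""
    # The expectation is finite only if z1 + z2 < rho_plus and z1 - z2 > rho_minus.
    if not (z1 + z2 < rho_plus and z1 - z2 > rho_minus):
        raise ValueError("mgf is not finite for these arguments")
    return (p * rho_plus / (rho_plus - (z1 + z2))
            + (1.0 - p) * rho_minus / (rho_minus - (z1 - z2)))

def mean_jump(p, rho_plus, rho_minus):
    return p / rho_plus + (1.0 - p) / rho_minus      # Eq. (3)

def mean_abs_jump(p, rho_plus, rho_minus):
    return p / rho_plus - (1.0 - p) / rho_minus      # Eq. (4)

def sample_jump(rng, p, rho_plus, rho_minus):
    """Draw J: Exp(rho_plus) with probability p, otherwise minus Exp(-rho_minus)."""
    if rng.random() < p:
        return rng.expovariate(rho_plus)
    return -rng.expovariate(-rho_minus)

# Hypothetical parameter values, used only for illustration.
P, RHO_PLUS, RHO_MINUS = 0.4, 20.0, -15.0
compensator = psi(1.0, 0.0, P, RHO_PLUS, RHO_MINUS) - 1.0  # E(e^J - 1) in Eq. (1)
```

Differentiating \(\psi \) at the origin with respect to \(z_{1}\) or \(z_{2}\) recovers Eqs. (3) and (4), which gives a quick consistency check of the three formulas.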

In Eq. (2), the memory kernel function g is a positive definite function such that \(g\left( 0\right) =1\) and \(\lim _{t\rightarrow \infty }g\left( t\right) =0\). In this setting, the speed of reversion depends upon the decay rate of \(g\left( .\right) \). The function \(g\left( .\right) \) is the kernel function defining the memory of the stock price process. If it decreases quickly, the contagion effect between stock price jumps is limited in time. Otherwise, the process remembers the history of stock price jumps for a long period. In the present work, we consider four functions \(g\left( .\right) \) that are Fourier transforms of positive measures on \({\mathbb {R}}\):

  • The first function \(g_{L}\left( .\right) \) is the Fourier transform of the Laplace or double exponential measure denoted by \(v_{L}\left( .\right) \). Let \(\beta \in {\mathbb {R}}^{+}\) be constant. The symmetric exponential measure \(v_{L}\left( .\right) \) defined by the following relation:

    $$\begin{aligned} v_{L}\left( \text {d}u\right)= & {} \frac{1}{2}\left( \beta \exp \left( -\beta u\right) 1_{\left\{ u\ge 0\right\} }+\beta \exp \left( \beta u\right) 1_{\left\{ u<0\right\} }\right) \text {d}u, \end{aligned}$$
    (5)

    generates the following memory kernel:

    $$\begin{aligned} g_{L}\left( h\right)= & {} \int _{{\mathbb {R}}}v_{L}\left( u\right) e^{ihu}\text {d}u =\frac{\beta ^{2}}{\beta ^{2}+h^{2}}. \end{aligned}$$
    (6)

    The mean of \(v_{L}\left( .\right) \) is \(\mu _{L}=0\) and its variance is \(\sigma _{L}^{2}=\frac{2}{\beta ^{2}}\). \(g_{L}\left( .\right) \) is a power decreasing function. Since \(\int _{0}^{\infty }g_{L} \left( u\right) \text {d}u=\frac{\pi \beta }{2}\), in this case the pair \(\left( N,\lambda \right) \) is a stationary HP if \(\beta <\frac{2}{\pi \eta {\mathbb {E}}\left( \left| J\right| \right) }\).

  • The second function \(g_{G}\left( .\right) \) is the Fourier transform of a Gaussian spectral measure \(v_{G}\left( .\right) \)

    $$\begin{aligned} v_{G}\left( \text {d}u\right)= & {} \sqrt{\frac{\beta }{\pi }}e^{-\beta u^{2}}\text {d}u. \end{aligned}$$
    (7)

    The mean of \(v_{G}\left( .\right) \) is \(\mu _{G}=0\) and its variance is \(\sigma _{G}^{2}=\frac{1}{2\beta }\). Given that \(v_{G}\left( .\right) \) is symmetric and \(\int _{{\mathbb {R}}^{+}}e^{-\beta u^{2}}\cos \left( hu\right) \text {d}u=\frac{1}{2}\sqrt{\frac{\pi }{\beta }}e^{-\frac{h^{2}}{4\beta }}\), the memory kernel is equal to

    $$\begin{aligned} g_{G}\left( h\right)= & {} e^{-\frac{h^{2}}{4\beta }}, \end{aligned}$$
    (8)

    where \(\beta \in \mathbb {R^{+}}\). Since \(\int _{0}^{\infty }g_{G}\left( u\right) \text {d}u=\sqrt{\pi \beta }\), in this case the pair \(\left( N,\lambda \right) \) is a stationary HP if \(\sqrt{\beta }<\frac{1}{\sqrt{\pi }\eta {\mathbb {E}}\left( \left| J\right| \right) }\).

  • The third function \(g_{Log}\left( .\right) \) is the Fourier transform of the logistic distribution defined as follows:

    $$\begin{aligned} v_{Log}\left( \text {d}u\right)= & {} \frac{e^{-\frac{u}{\beta }}}{\beta \left( 1+e^{-\frac{u}{\beta }}\right) ^{2}}\text {d}u, \end{aligned}$$
    (9)

    where \(\beta \in \mathbb {R^{+}}\). The mean of \(v_{Log}\left( .\right) \) is \(\mu _{Log}=0\) and its variance is \(\sigma _{Log}^{2}=\frac{\beta ^{2}\pi ^{2}}{3}\). The memory kernel is equal to

    $$\begin{aligned} g_{Log}\left( h\right)= & {} {\left\{ \begin{array}{ll} \frac{\pi \beta h}{\sinh \left( \pi \beta h\right) } &{} \quad \text {if }h>0\\ 1 &{} \quad \text {if }h=0 \end{array}\right. }. \end{aligned}$$
    (10)

    Since \(\int _{0}^{\infty }g_{Log}\left( u\right) \text {d}u=\frac{\pi }{4\beta }\), in this case the pair \(\left( N,\lambda \right) \) is a stationary HP if \(\beta >\frac{\pi \eta {\mathbb {E}}\left( \left| J\right| \right) }{4}\).

  • The last memory kernel \(g_{C}\left( .\right) \) is the Fourier transform of a Cauchy spectral measure \(v_{C}\left( .\right) \)

    $$\begin{aligned} v_{C}\left( \text {d}u\right)= & {} \frac{1}{\pi }\frac{\beta }{\beta ^{2}+u^{2}}\text {d}u, \end{aligned}$$
    (11)

    where \(\beta \in \mathbb {R^{+}}\) and the moments of \(v_{C}\left( .\right) \) are undefined. Given that this measure is symmetric, the memory kernel \(g_{C}\left( .\right) \) is an exponentially decreasing function

    $$\begin{aligned} g_{C}\left( h\right)= & {} e^{-\beta \left| h\right| }, \end{aligned}$$
    (12)

    considered in the following as the exponential kernel because the delays after the arrival of jumps are non-negative. Since \(\int _{0}^{\infty }g_{C}\left( u\right) \text {d}u=\frac{1}{\beta }\), in this case the pair \(\left( N,\lambda \right) \) is a stationary HP if \(\beta >\eta {\mathbb {E}}\left( \left| J\right| \right) \).
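The four kernels and their stationarity conditions can be summarised in a short Python sketch. The function names, the dictionary `KERNEL_INTEGRAL` and the Simpson integrator are illustrative choices, not part of the original model code; the sketch only restates Eqs. (6), (8), (10) and (12) and the integrals quoted in the list above.

```python
import math

def g_laplace(h, beta):
    return beta**2 / (beta**2 + h**2)                 # Eq. (6)

def g_gauss(h, beta):
    return math.exp(-h**2 / (4.0 * beta))             # Eq. (8)

def g_logistic(h, beta):
    x = math.pi * beta * abs(h)                       # Eq. (10)
    if x < 1e-12:
        return 1.0
    return 0.0 if x > 700.0 else x / math.sinh(x)     # guard against overflow in sinh

def g_cauchy(h, beta):
    return math.exp(-beta * abs(h))                   # Eq. (12)

# Closed-form values of the integral of g over [0, infinity) quoted in the text.
KERNEL_INTEGRAL = {
    "L": lambda beta: math.pi * beta / 2.0,
    "G": lambda beta: math.sqrt(math.pi * beta),
    "Log": lambda beta: math.pi / (4.0 * beta),
    "C": lambda beta: 1.0 / beta,
}

def is_stationary(kernel, beta, eta, mean_abs_jump):
    """Check eta * E|J| * (integral of g) < 1 for the kernel label L, G, Log or C."""
    return eta * mean_abs_jump * KERNEL_INTEGRAL[kernel](beta) < 1.0

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for j in range(1, n):
        total += (4.0 if j % 2 else 2.0) * f(a + j * step)
    return total * step / 3.0
```

For instance, `simpson(lambda u: g_gauss(u, 0.8), 0.0, 200.0, 20000)` can be compared with `KERNEL_INTEGRAL["G"](0.8)`. The Laplace kernel decays only quadratically, so a truncated numerical integral converges slowly towards \(\frac{\pi \beta }{2}\).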

Within the literature, the Hawkes process with exponential kernel has an intensity \(\lambda _{t}\) usually defined as:

$$\begin{aligned} \lambda _{t}|{\mathcal {F}}_{t}=\lim _{h\rightarrow 0}\frac{{\mathbb {E}} \left( N_{t+h}|{\mathcal {F}}_{t}\right) -N_{t}}{h}= & {} \alpha +\left( \lambda _{0}-\alpha \right) e^{-\beta t}+\eta \int _{0}^{t}e^{-\beta \left( t-u\right) }\text {d}L_{u},\nonumber \\ \end{aligned}$$
(13)

where \(\beta \) is the constant rate of mean reversion. From Hawkes (1971b), the pair \(\left( N,\lambda \right) \) is a stationary Hawkes process if the following inequality holds: \(\eta {\mathbb {E}}\left( \left| J\right| \right) \int _{0}^{\infty }e^{-\beta u}\text {d}u<1\). This condition is met by choosing the intensity parameters \(\eta \) and \(\beta \) such that \(\eta {\mathbb {E}}\left( \left| J\right| \right) <\beta \). This stationarity property of the Hawkes process is required to ensure that the long-term expectation, denoted \(\lambda _{\infty }=\lim _{t\rightarrow \infty }{\mathbb {E}}\left( \lambda _{t}\right) \), is finite and equal to \(\lambda _{\infty }=\frac{\beta \alpha }{\beta -\eta {\mathbb {E}}\left( \left| J\right| \right) }\).

The first order moment of the HP intensity for the kernel functions in Eqs. (6), (8) and (10) can be obtained by inverse Fourier transform. In fact, from Eq. (2), we use the properties of the expectation and apply the Fourier transform to obtain:

$$\begin{aligned} {\mathcal {F}}\left( {\mathbb {E}}\left( \lambda _{t}\mid {\mathcal {F}}_{0}\right) \right) \left( \omega \right)= & {} \alpha {\mathcal {F}}\left( 1\right) \left( \omega \right) +\left( \lambda _{0}-\alpha \right) {\mathcal {F}} \left( g\left( t\right) \right) \left( \omega \right) \\{} & {} +\eta {\mathbb {E}}\left( \left| J\right| \right) {\mathcal {F}} \left( \int _{0}^{t}g\left( t-u\right) {\mathbb {E}}\left( \lambda _{u} \mid {\mathcal {F}}_{0}\right) \text {d}u\right) \left( \omega \right) \nonumber \\= & {} 2\alpha \pi \delta \left( \omega \right) +\left( \lambda _{0} -\alpha \right) v_{k}\left( \omega \right) +\eta {\mathbb {E}} \left( \left| J\right| \right) v_{k}\left( \omega \right) {\mathcal {F}} \left( {\mathbb {E}}\left( \lambda _{t}\mid {\mathcal {F}}_{0}\right) \right) \left( \omega \right) \nonumber \end{aligned}$$
(14)

where \(\delta \left( .\right) \) is the Dirac Delta function. Then, given that

$$\begin{aligned} {\mathcal {F}}\left( {\mathbb {E}}\left( \lambda _{t}\mid {\mathcal {F}}_{0}\right) \right) \left( \omega \right)= & {} \frac{2\alpha \pi \delta \left( \omega \right) +\left( \lambda _{0}-\alpha \right) v_{k}\left( \omega \right) }{1-\eta {\mathbb {E}} \left( \left| J\right| \right) v_{k}\left( \omega \right) }\\= & {} \frac{2\alpha \pi \delta \left( \omega \right) }{1-\eta {\mathbb {E}} \left( \left| J\right| \right) v_{k}\left( \omega \right) }+\frac{\left( \lambda _{0} -\alpha \right) v_{k}\left( \omega \right) }{1-\eta {\mathbb {E}}\left( \left| J\right| \right) v_{k}\left( \omega \right) }\nonumber \end{aligned}$$
(15)

it follows that

$$\begin{aligned} {\mathbb {E}}\left( \lambda _{t}\mid {\mathcal {F}}_{0}\right)= & {} \alpha {\mathcal {F}}^{-1}\left( \frac{2\pi \delta \left( \omega \right) }{1-\eta {\mathbb {E}}\left( \left| J\right| \right) v_{k}\left( \omega \right) } \right) \left( t\right) \\{} & {} +\left( \lambda _{0}-\alpha \right) {\mathcal {F}}^{-1} \left( \frac{v_{k}\left( \omega \right) }{1-\eta {\mathbb {E}}\left( \left| J\right| \right) v_{k}\left( \omega \right) }\right) \left( t\right) \nonumber \\= & {} \alpha \frac{1}{2\pi }\int _{{\mathbb {R}}}e^{-i\omega t} \frac{2\pi \delta \left( \omega \right) }{1-\eta {\mathbb {E}}\left( \left| J\right| \right) v_{k}\left( \omega \right) }\text {d}\omega \nonumber \\{} & {} +\left( \lambda _{0}-\alpha \right) \frac{1}{2\pi } \int _{{\mathbb {R}}}e^{-i\omega t}\frac{v_{k}\left( \omega \right) }{1-\eta {\mathbb {E}}\left( \left| J\right| \right) v_{k}\left( \omega \right) }\text {d}\omega \nonumber \\= & {} \frac{\alpha }{1-\eta {\mathbb {E}}\left( \left| J\right| \right) v_{k}\left( 0\right) }+\left( \lambda _{0}-\alpha \right) \frac{1}{2\pi } \int _{{\mathbb {R}}}e^{-i\omega t}\frac{v_{k}\left( \omega \right) }{1-\eta {\mathbb {E}}\left( \left| J\right| \right) v_{k}\left( \omega \right) }\text {d}\omega .\nonumber \end{aligned}$$
(16)

The previous stability conditions ensure that this latter integral exists. Notice that there is no closed-form expression of this expected value of the HP intensity, except for the HP intensity with exponential kernel. Using either the Laplace transform technique or the ODE technique of Fonseca and Zaatour (2013), the first order moment of the HP intensity with exponential kernel is given by

$$\begin{aligned} {\mathbb {E}}\left( \lambda _{t}\mid {\mathcal {F}}_{0}\right)= & {} \lambda _{0}e^{\left( \eta {\mathbb {E}}\left( \left| J\right| \right) -\beta \right) t}-\frac{\beta \alpha }{\eta {\mathbb {E}}\left( \left| J\right| \right) -\beta }\left( 1-e^{\left( \eta {\mathbb {E}}\left( \left| J\right| \right) -\beta \right) t}\right) . \end{aligned}$$
(17)
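Eq. (17) is easy to evaluate and to sanity-check. The sketch below is illustrative only (function names and the numerical values used in any call are assumptions): it implements Eq. (17) and the long-run level \(\lambda _{\infty }\), which Eq. (17) should approach as t grows when \(\eta {\mathbb {E}}\left( \left| J\right| \right) <\beta \).

```python
import math

def expected_intensity(t, lam0, alpha, beta, eta, mean_abs_jump):
    """E(lambda_t | F_0) for the exponential kernel, Eq. (17)."""
    c = eta * mean_abs_jump - beta          # negative in the stationary case
    return lam0 * math.exp(c * t) - beta * alpha / c * (1.0 - math.exp(c * t))

def long_run_intensity(alpha, beta, eta, mean_abs_jump):
    """lambda_infinity = beta*alpha / (beta - eta*E|J|), defined if eta*E|J| < beta."""
    if eta * mean_abs_jump >= beta:
        raise ValueError("stationarity condition eta*E|J| < beta is violated")
    return beta * alpha / (beta - eta * mean_abs_jump)
```

Taking expectations in the exponential-kernel dynamics gives \(\frac{\text {d}}{\text {d}t}{\mathbb {E}}\left( \lambda _{t}\right) =\beta \alpha +\left( \eta {\mathbb {E}}\left( \left| J\right| \right) -\beta \right) {\mathbb {E}}\left( \lambda _{t}\right) \), and a finite-difference derivative of `expected_intensity` can be compared with this right-hand side.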

To the best of our knowledge, it is not possible to derive analytically moments of HP with non-exponential kernel listed above using the ordinary differential equation (ODE) technique of Fonseca and Zaatour (2013). As an alternative, we will derive a quasi-analytic formula of the conditional moment generating function of these HP intensities with non-exponential kernel.

To compare the shapes of the memory kernels, we rank the kernels above by their degree of decay and assume that their spectral densities have the same variance \(\sigma ^{2}\). It follows that the parameter \(\beta \) of the memory kernels \(\left( g_{k}\right) _{k\in \left\{ L,G,Log\right\} }\) is a function of \(\sigma ^{2}\). As the variance of the Cauchy spectral measure is not defined, we find a \(\beta \) by minimizing the sum of squared differences between the Log and Cauchy memory kernels. Figure 1 compares these four memory kernels when the variance \(\sigma ^{2}=16\). The left plot reveals that the Laplace kernel has a slower decay than the Gaussian and logistic kernels. Further, in the long run the exponential kernel seems to decay more slowly than the Gaussian and logistic kernels, while the opposite situation occurs in the short run. The right graph shows that the decay rate of the kernel increases with the thickness of the distribution tails.
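The variance matching described above can be sketched as follows. The mapping from \(\sigma ^{2}\) to \(\beta \) inverts the variance formulas stated for each spectral measure. The least-squares step is an assumption on our part: the evaluation grid on \(\left[ 0,5\right] \) and the brute-force search over \(\beta _{C}\) are hypothetical choices, since the text does not specify how the minimisation is carried out.

```python
import math

def betas_from_variance(sigma2):
    """Invert sigma_L^2 = 2/beta^2, sigma_G^2 = 1/(2 beta), sigma_Log^2 = beta^2 pi^2 / 3."""
    return {
        "L": math.sqrt(2.0 / sigma2),
        "G": 1.0 / (2.0 * sigma2),
        "Log": math.sqrt(3.0 * sigma2) / math.pi,
    }

def g_logistic(h, beta):
    x = math.pi * beta * abs(h)
    if x < 1e-12:
        return 1.0
    return 0.0 if x > 700.0 else x / math.sinh(x)

def fit_cauchy_beta(beta_log, h_max=5.0, n_h=500, b_lo=0.5, b_hi=10.0, n_b=951):
    """Grid search for beta_C minimising the squared gap to the logistic kernel.

    The grid bounds and sizes are illustrative assumptions, not values from the text.
    """
    hs = [h_max * j / n_h for j in range(n_h + 1)]
    target = [g_logistic(h, beta_log) for h in hs]
    best_b, best_sse = None, float("inf")
    for k in range(n_b):
        b = b_lo + (b_hi - b_lo) * k / (n_b - 1)
        sse = sum((g - math.exp(-b * h)) ** 2 for g, h in zip(target, hs))
        if sse < best_sse:
            best_b, best_sse = b, sse
    return best_b

betas = betas_from_variance(16.0)
beta_cauchy = fit_cauchy_beta(betas["Log"])
```

With \(\sigma ^{2}=16\), the fitted value should land near the \(\beta _{C}=2.71\) reported with Fig. 1, although the exact figure depends on the grid used.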

Fig. 1 Comparison of Laplace, Gaussian and Logistic kernels for \(\sigma _{L}^{2}=\sigma _{G}^{2}=\sigma _{Log}^{2}=16\). The Cauchy kernel is computed with \(\beta _{C}=2.71\)

The common point of these four memory kernels is the stability of the Hawkes intensity \(\lambda _{t}\) under a certain condition. Unlike the exponential kernel, the other kernels can enable the Hawkes intensity \(\lambda _{t}\) to remember past events for a long time, depending on their speed of decay. Considering the exponential kernel (12), the pair \(\left( N_{t},\lambda _{t}\right) _{t\ge 0}\) is a Markov process and the intensity \(\lambda _{t}\) is the solution of the following stochastic differential equation (SDE):

$$\begin{aligned} \text {d}\lambda _{t}= & {} \beta \left( \alpha -\lambda _{t}\right) \text {d}t+\eta \text {d}L_{t}. \end{aligned}$$
(18)
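Because Eq. (18) is Markov, the exponential-kernel intensity is simple to simulate. The sketch below uses a standard thinning scheme, which is our choice for illustration and not a method taken from this article: between jumps the intensity moves deterministically towards \(\alpha \), so \(\max \left( \lambda _{t},\alpha \right) \) bounds it until the next event. All function names are hypothetical.

```python
import math
import random

def simulate_exponential_hawkes(T, lam0, alpha, beta, eta, p, rho_plus, rho_minus, rng):
    """Simulate jump times and sizes on [0, T] for the intensity of Eq. (18) by thinning."""
    t, lam = 0.0, lam0
    events = []                                   # list of (jump time, jump size J)
    while True:
        lam_bar = max(lam, alpha)                 # upper bound until the next jump
        w = rng.expovariate(lam_bar)              # candidate waiting time
        if t + w > T:
            break
        t += w
        lam = alpha + (lam - alpha) * math.exp(-beta * w)   # deterministic drift
        if rng.random() * lam_bar <= lam:         # accept with probability lam/lam_bar
            if rng.random() < p:
                J = rng.expovariate(rho_plus)     # positive double exponential jump
            else:
                J = -rng.expovariate(-rho_minus)  # negative double exponential jump
            events.append((t, J))
            lam += eta * abs(J)                   # self-excitation: jump of eta*|J|
    return events

def intensity(t, events, lam0, alpha, beta, eta):
    """Evaluate Eq. (13) at time t from the simulated jump history."""
    excitation = sum(math.exp(-beta * (t - u)) * abs(J) for u, J in events if u <= t)
    return alpha + (lam0 - alpha) * math.exp(-beta * t) + eta * excitation
```

Averaging `intensity(T, ...)` over many simulated paths gives a Monte Carlo estimate that can be compared with Eq. (17).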

For the kernel functions in Eqs. (6), (8) and (10), the pair \(\left( N_{s},\lambda _{s}\right) _{s\ge 0}\) is not a Markov process and we have very little information on its behavior conditionally on the filtration \({\mathcal {F}}_{t}\) for \(t\le s\). For non-Markov processes we cannot rely on Itô calculus to determine quantities such as the mgf of the arrival rate of shocks in the asset’s return dynamics. However, by using the spectral representation of the kernel \(g_{k}\left( .\right) \) for \(k\in \left\{ L,G,Log,C\right\} \), we reformulate \(\left( N_{s},\lambda _{s}\right) _{s\ge 0}\) as an infinite dimensional Markov process in the complex plane. To do so, we rewrite \(g_{k}\left( .\right) \) in Eq. (2) as a Fourier transform of the measure \(v_{k}\left( .\right) \) and change the order of integration:

$$\begin{aligned} \lambda _{t}= & {} \alpha +\left( \lambda _{0}-\alpha \right) g_{k} \left( t\right) +\eta \int _{0}^{t}\int _{{\mathbb {R}}}e^{i\left( t-u\right) \xi }v_{k}\left( \text {d}\xi \right) \text {d}L_{u}\\= & {} \alpha +\left( \lambda _{0}-\alpha \right) g_{k}\left( t\right) +\eta \int _{{\mathbb {R}}}\left( \int _{0}^{t}e^{i\left( t-u\right) \xi } \text {d}L_{u}\right) v_{k}\left( \text {d}\xi \right) .\nonumber \end{aligned}$$
(19)

It follows that the process \(Y_{t}^{\xi }=\int _{0}^{t}e^{i\left( t-u\right) \xi }\text {d}L_{u}\), defined on the complex plane \({\mathbb {C}}\), is the solution of the following SDE:

$$\begin{aligned} \text {d}Y_{t}^{\xi }= & {} i\xi Y_{t}^{\xi }\text {d}t+\text {d}L_{t}, \end{aligned}$$
(20)

and the SDE of the Hawkes intensity is given by

$$\begin{aligned} \text {d}\lambda _{t}= & {} \left( \lambda _{0}-\alpha \right) \frac{\text {d}g_{k}\left( t\right) }{\text {d}t}\text {d}t+\eta \int _{{\mathbb {R}}}\text {d}Y_{t}^{\xi }v_{k} \left( \text {d}\xi \right) . \end{aligned}$$
(21)
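As a short check of Eq. (20), one can factor the time dependence out of the integral defining \(Y_{t}^{\xi }\) and apply the product rule. The factor \(e^{it\xi }\) is continuous with finite variation, so no covariation term appears:

$$\begin{aligned} Y_{t}^{\xi }&= e^{it\xi }\int _{0}^{t}e^{-iu\xi }\text {d}L_{u},\\ \text {d}Y_{t}^{\xi }&= i\xi e^{it\xi }\left( \int _{0}^{t}e^{-iu\xi }\text {d}L_{u}\right) \text {d}t+e^{it\xi }e^{-it\xi }\text {d}L_{t} = i\xi Y_{t}^{\xi }\text {d}t+\text {d}L_{t}. \end{aligned}$$

Differentiating Eq. (19) with respect to t and exchanging the differential with the integral over \(\xi \) then gives Eq. (21).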

Note that \(\int _{{\mathbb {R}}}\text {d}Y_{t}^{\xi }v_{k}\left( \text {d}\xi \right) \) is real. Indeed, writing \(Y_{t}^{\xi }\) as \(\sum _{l=1}^{N_{t}}e^{i\left( t-u_{l}\right) \xi }\left| J_{l}\right| \), with \(u_{l}\) the time of the \(l^{th}\) shock in the asset’s return dynamics, the imaginary parts at \(\xi \) and \(-\xi \) cancel because the measure \(v_{k}\left( .\right) \) is symmetric.

To benefit from the reformulation of the process \(\left( N_{t},\lambda _{t}\right) _{t\ge 0}\) as an infinite dimensional Markov process \(\left( N_{t},\lambda _{t},\left( Y_{t}^{\xi }\right) _{\xi \in {\mathbb {R}}}\right) _{t\ge 0}\), we use the discretisation scheme described below and, passing to the limit, we infer the mgf of the arrival rate of shocks in the asset’s return dynamics. The key point of this discretisation scheme is the approximation of the measure \(v_{k}\left( .\right) \) by a discrete measure with a finite number of atoms. For this purpose, we consider a symmetric partition \({\mathcal {E}}^{(n)}:=\{-\infty<\xi _{-n}^{\left( n\right) }<...<\xi _{0}^{(n)}=0<\xi _{1}^{(n)}<...<\xi _{n}^{(n)}<\infty \}\), with \(\xi _{-n}^{(n)}=-\xi _{n}^{(n)}\). For each interval \((\xi _{l}^{(n)},\xi _{l+1}^{(n)})\), the barycenter is defined as

$$\begin{aligned} b_{l+1}^{\left( n\right) }= & {} \frac{\xi _{l}^{(n)}+\xi _{l+1}^{(n)}}{2}, \quad \text {for }l\in \left\{ 0,\ldots ,n-1\right\} , \\ b_{l}^{\left( n\right) }= & {} \frac{\xi _{l}^{(n)}+\xi _{l+1}^{(n)}}{2}, \quad \text {for }l\in \left\{ -n,\ldots ,-1\right\} ,\nonumber \end{aligned}$$
(22)

while the mass of any atom with respect to the measure \(v_{k}\left( .\right) \) is

$$\begin{aligned} m_{l+1}^{\left( k,n\right) }= & {} \int _{\xi _{l}^{(n)}}^{\xi _{l+1}^{(n)}}v_{k} \left( \text {d}u\right) ,\text { for }l\in \left\{ 0,\ldots ,n-1\right\} ,\\ m_{l}^{\left( k,n\right) }= & {} \int _{\xi _{l}^{(n)}}^{\xi _{l+1}^{(n)}}v_{k} \left( \text {d}u\right) ,\text { for }l\in \left\{ -n,\ldots ,-1\right\} .\nonumber \end{aligned}$$
(23)

By construction \(\lim _{n\rightarrow \infty }\sum _{l=-n}^{n}m_{l}^{\left( k,n\right) }=1\) and the discrete approximation of the measure \(v_{k}\left( .\right) \) given a partition of size n, denoted by \({\tilde{v}}_{k}^{\left( n\right) }\left( .\right) \) is defined as follows

$$\begin{aligned} {\tilde{v}}_{k}^{\left( n\right) }\left( u\right)= & {} \sum _{l=-n}^{n}m_{l}^{\left( k,n\right) }\delta _{b_{l}^{\left( n\right) }} \left( u\right) , \end{aligned}$$
(24)

where \(\delta _{b_{l}^{(n)}}\left( u\right) \) is the Dirac measure located at the barycenter \(b_{l}^{(n)}\). According to Eqs. (22) and (23), and given that the measure \(v_{k}\left( .\right) \) is symmetric, for \(l\in \left\{ -n,\ldots ,n\right\} \backslash \left\{ 0\right\} \), \(m_{-l}^{\left( k,n\right) }=m_{l}^{\left( k,n\right) }\) and \(b_{-l}^{\left( n\right) }=-b_{l}^{\left( n\right) }\). It follows that the discrete measure \({\tilde{v}}_{k}^{\left( n\right) }\left( .\right) \) is also symmetric. We assume that the following conditions hold for the partition \({\mathcal {E}}^{(n)}\):

(i):

\(\xi _{-n}^{(n)}\rightarrow -\infty \) and \(\xi _{n}^{(n)}\rightarrow \infty \) when \(n\rightarrow \infty \).

(ii):

\(\max |\xi _{l+1}^{(n)}-\xi _{l}^{(n)}|\rightarrow 0\) when \(n\rightarrow \infty \).

(iii):

\({\mathcal {E}}^{(n)}\subset {\mathcal {E}}^{(n+1)}\).

In this setting, for any function \(f\left( .\right) \) that is integrable with respect to the measure \(v_{k}\left( .\right) \), we have that

$$\begin{aligned} \lim _{n\rightarrow \infty }\int _{-\infty }^{\infty }f\left( u\right) {\tilde{v}}_{k}^{\left( n\right) }\left( \text {d}u\right) = \int _{-\infty }^{\infty }f\left( u\right) v_{k}\left( \text {d}u\right) . \end{aligned}$$
(25)
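The discretisation in Eqs. (22) to (25) can be illustrated on the Laplace measure, whose distribution function is available in closed form. The sketch below assumes a uniform partition of \(\left[ -\xi _{\max },\xi _{\max }\right] \), which is one hypothetical choice compatible with conditions (i) to (iii) when n is doubled and \(\xi _{\max }\) grows; the function names are ours. Taking \(f\left( u\right) =e^{ihu}\) in Eq. (25), the discrete measure should reproduce the kernel \(g_{L}\left( h\right) \) of Eq. (6).

```python
import math

def laplace_cdf(u, beta):
    """Distribution function of the Laplace measure v_L in Eq. (5)."""
    return 0.5 * math.exp(beta * u) if u < 0 else 1.0 - 0.5 * math.exp(-beta * u)

def discretise(cdf, xi_max, n):
    """Return atoms (barycenter, mass) on a symmetric uniform partition, Eqs. (22)-(23)."""
    xi = [xi_max * l / n for l in range(-n, n + 1)]   # xi_{-n} < ... < xi_0 = 0 < ... < xi_n
    atoms = []
    for left, right in zip(xi[:-1], xi[1:]):
        barycenter = 0.5 * (left + right)             # Eq. (22)
        mass = cdf(right) - cdf(left)                 # Eq. (23)
        atoms.append((barycenter, mass))
    return atoms

def kernel_approx(h, atoms):
    """Fourier transform of the discrete measure in Eq. (24); real by symmetry."""
    return sum(mass * math.cos(b * h) for b, mass in atoms)
```

For example, with `atoms = discretise(lambda u: laplace_cdf(u, 1.0), 40.0, 4000)`, the value `kernel_approx(2.0, atoms)` can be compared with \(g_{L}\left( 2\right) =\frac{1}{5}\) for \(\beta =1\).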

To lighten further developments, we fix n and adopt the following notations:

  • \(Y_{t}^{(l)}:=Y_{t}^{(b_{l}^{(n)})}\) for \(l\in \left\{ -n,\ldots ,n\right\} \) and \(Y_{t}^{(l)}\) is the solution of the following SDE

    $$\begin{aligned} \text {d}Y_{t}^{(l)}=ib_{l}^{(n)}Y_{t}^{(l)}\text {d}t+\text {d}L_{t}^{\left( n\right) }, \end{aligned}$$
    (26)

    with \(L_{t}^{\left( n\right) }=\sum _{l=1}^{N_{t}^{\left( n\right) }}\left| J_{l}\right| \);

  • the approximated intensity in the discrete framework is

    $$\begin{aligned} \lambda _{t}^{\left( n\right) }=\alpha +\left( \lambda _{0}-\alpha \right) g_{k}\left( t\right) +\eta \sum _{l=-n}^{n} m_{l}^{\left( k,n\right) }Y_{t}^{(l)} \end{aligned}$$
    (27)

    and has the following SDE

    $$\begin{aligned} \text {d}\lambda _{t}^{\left( n\right) }=\left( \lambda _{0}-\alpha \right) \frac{\text {d}g_{k}\left( t\right) }{\text {d}t}\text {d}t +\eta \sum _{l=-n}^{n}m_{l}^{\left( k,n\right) }ib_{l}^{(n)}Y_{t}^{(l)} \text {d}t+\eta \sum _{l=-n}^{n}m_{l}^{\left( k,n\right) } \text {d}L_{t}^{\left( n\right) }.\nonumber \\ \end{aligned}$$
    (28)
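To make Eq. (27) concrete, the sketch below evaluates the approximated intensity from a given jump history and compares it with the direct evaluation of Eq. (2), here for the Gaussian kernel. It relies on the closed-form solution \(Y_{t}^{(l)}=\sum _{u_{j}\le t}e^{ib_{l}^{(n)}\left( t-u_{j}\right) }\left| J_{j}\right| \) of Eq. (26). The uniform partition, the function names and the jump history used in any call are assumptions made for illustration.

```python
import cmath
import math

def g_gauss(h, beta):
    return math.exp(-h * h / (4.0 * beta))            # Eq. (8)

def gaussian_atoms(beta, xi_max, n):
    """Atoms (barycenter, mass) of the Gaussian spectral measure v_G in Eq. (7)."""
    cdf = lambda u: 0.5 * (1.0 + math.erf(math.sqrt(beta) * u))
    xi = [xi_max * l / n for l in range(-n, n + 1)]
    return [(0.5 * (a + b), cdf(b) - cdf(a)) for a, b in zip(xi[:-1], xi[1:])]

def lambda_direct(t, jumps, lam0, alpha, eta, beta):
    """Eq. (2): jumps is a list of (time u_j, absolute size |J_j|)."""
    excitation = sum(g_gauss(t - u, beta) * a for u, a in jumps if u <= t)
    return alpha + (lam0 - alpha) * g_gauss(t, beta) + eta * excitation

def lambda_discrete(t, jumps, atoms, lam0, alpha, eta, beta):
    """Eq. (27): weighted sum of the complex processes Y_t^{(l)} over the atoms."""
    total = 0j
    for b, mass in atoms:
        y = sum(cmath.exp(1j * b * (t - u)) * a for u, a in jumps if u <= t)
        total += mass * y
    return alpha + (lam0 - alpha) * g_gauss(t, beta) + eta * total
```

The value returned by `lambda_discrete` is complex, but its imaginary part vanishes up to rounding because the atoms are symmetric, in line with the symmetry of \({\tilde{v}}_{k}^{\left( n\right) }\left( .\right) \).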

Proposition 1

The moment generating function of \(\lambda _{s}^{\left( n\right) }\) conditionally on the information at time \(t\le s\) is given by

$$\begin{aligned} {\mathbb {E}}\left( e^{\omega \lambda _{s}^{\left( n\right) }}|{\mathcal {F}}_{t}\right)= & {} \exp \left( F\left( t,s,\omega \right) +G\left( t,s,\omega \right) \lambda _{t}^{\left( n\right) }+\sum _{l=-n}^{n}H_{l}\left( t,s,b_{l}^{\left( n\right) }, \omega \right) m_{l}^{\left( k,n\right) }Y_{t}^{(l)}\right) \end{aligned}$$

for all \(\omega \in {\mathbb {C}}_{-}\) that satisfies the regularity condition \(\psi \left( 0,\eta \omega \sum _{l=-n}^{n}m_{l}^{\left( k,n\right) }\right) <\infty \). The functions \(H_{l}\) are defined as follows

$$\begin{aligned} H_{l}\left( t,s,b_{l}^{(n)},\omega \right)= & {} b_{l}^{\left( n\right) }\eta i\int _{t}^{s}e^{ib_{l}^{\left( n\right) }\left( u-t\right) }G \left( u,s,\omega \right) \text {d}u,\,\forall l\in \left\{ -n,\ldots ,n\right\} , \end{aligned}$$

F and G are time dependent functions that solve the following system of ODEs

$$\begin{aligned} \frac{\partial F}{\partial t}\left( t,s,\omega \right)= & {} -\left( \lambda _{0}-\alpha \right) \frac{\text {d}g_{k}\left( t\right) }{\text {d}t}G\left( t,s,\omega \right) \end{aligned}$$
(29)
$$\begin{aligned} \frac{\partial G}{\partial t}\left( t,s,\omega \right)= & {} 1-\psi \left( 0,\eta G\left( t,s,\omega \right) \sum _{l=-n}^{n}m_{l}^{\left( k,n\right) } +\sum _{l=-n}^{n}H_{l}\left( t,s,b_{l}^{\left( n\right) },\omega \right) m_{l}^{\left( k,n\right) }\right) \nonumber \\ \end{aligned}$$
(30)

with final conditions \(F\left( s,s,\omega \right) =0\), \(G\left( s,s,\omega \right) =\omega \). Here \(\psi \) is the joint moment generating function of the random jump size J and its absolute value: \(\psi \left( z_{1},z_{2}\right) :={\mathbb {E}}\left( e^{z_{1}J+z_{2}\left| J\right| }\right) \), and \(k\in \left\{ L,G,Log,C\right\} \) indicates the type of HP memory kernel.

Proof

In Appendix A. \(\square \)

Passing to the limit with respect to n and using the frequency derivative property of the Fourier transform, we obtain the following proposition on the conditional mgf of the HP intensity.

Proposition 2

The moment generating function of \(\lambda _{s}\) conditional on the information at time \(t\le s\) is given by

$$\begin{aligned} {\mathbb {E}}\left( e^{\omega \lambda _{s}}|{\mathcal {F}}_{t}\right)= & {} \exp \left( F\left( t,s,\omega \right) +G\left( t,s,\omega \right) \lambda _{t}\right) \nonumber \\{} & {} \times \exp \left( i\eta \int _{t}^{s}G\left( u,s,\omega \right) \int _{{\mathbb {R}}}le^{il\left( u-t\right) }Y_{t}^{l}v_{k}\left( l\right) \text {d}l\,\text {d}u\right) \nonumber \\ \end{aligned}$$
(31)

for all \(\omega \in {\mathbb {C}}_{-}\) that satisfy the regularity condition \(\psi \left( 0,\eta \omega \right) <\infty \). The functions F and G solve the following ODE and partial integro-differential equation (PIDE):

$$\begin{aligned} \frac{\partial F}{\partial t}\left( t,s,\omega \right)= & {} - \left( \lambda _{0}-\alpha \right) \frac{\text {d}g_{k}\left( t\right) }{\text {d}t}G\left( t,s,\omega \right) \end{aligned}$$
(32)
$$\begin{aligned} \frac{\partial G}{\partial t}\left( t,s,\omega \right)= & {} 1-\psi \left( 0,\eta G\left( t,s,\omega \right) +\eta \int _{t}^{s}G\left( u,s, \omega \right) \frac{\text {d}g_{k}\left( u-t\right) }{\text {d}t}\text {d}u\right) \nonumber \\ \end{aligned}$$
(33)

with final conditions \(F\left( s,s,\omega \right) =0\), \(G\left( s,s,\omega \right) =\omega \). \(k\in \left\{ L,G,Log,C\right\} \) indicates the type of HP memory kernel.

Proof

The approximated intensity \(\lambda _{t}^{\left( n\right) }\), defined in Eq. (27), converges almost surely to \(\lambda _{t}\) since, for fixed \(w\in \Omega \), the function \(z\mapsto Y_{t}^{(z)}\left( w\right) \) is integrable with respect to the measure \(v_{k}\) and, for any function \(f\left( .\right) \) integrable with respect to the measure \(v_{k}\left( .\right) \), we have that

$$\begin{aligned} \lim _{n\rightarrow \infty }\int _{-\infty }^{\infty }f\left( u\right) {\tilde{v}}_{k}^{\left( n\right) }\left( \text {d}u\right) =\int _{-\infty }^{\infty }f\left( u\right) v_{k}\left( \text {d}u\right) . \end{aligned}$$
(34)

It follows that \(N_{t}^{\left( n\right) }=\int _{0}^{t}\lambda _{u}^{\left( n\right) }\text {d}u\) converges almost surely to \(N_{t}\). For \(l\in \left\{ -n,\ldots ,n\right\} \), \(Y_{t}^{(l)}=\int _{0}^{t}e^{-b_{l}^{\left( n\right) }\left( t-u\right) }\text {d}L_{u}^{\left( n\right) }\). If the partition \({\mathcal {E}}^{(n)}\) satisfies assumptions (i)-(iii), then as n tends to \(\infty \), a barycenter \(b_{l}^{\left( n\right) }\) tends to an element of \({\mathbb {R}}_{+}\). Hence the sequence \(\left( Y_{t}^{\left( b_{l}^{(n)}\right) }\right) _{l=-n,\ldots ,n}\) tends to \(\left( Y_{t}^{\left( l\right) }\right) _{l\in {\mathbb {R}}}\) almost surely. Next, the dominated convergence theorem gives

$$\begin{aligned} {\mathbb {E}}\left( e^{\omega \lambda _{s}}|{\mathcal {F}}_{t}\right)= & {} \lim _{n\rightarrow \infty }{\mathbb {E}}\left( e^{\omega \lambda _{s}^{\left( n \right) }}|{\mathcal {F}}_{t}\right) , \end{aligned}$$

where from Proposition 1

$$\begin{aligned} {\mathbb {E}}\left( e^{\omega \lambda _{s}^{\left( n\right) }}|{\mathcal {F}}_{t}\right)= & {} \exp \left( F\left( t,s,\omega \right) +G\left( t,s,\omega \right) \lambda _{t}^{\left( n\right) }+\sum _{l=-n}^{n}H_{l}\left( t,s,b_{l }^{\left( n\right) },\omega \right) m_{l}^{\left( k,n\right) }Y_{t}^{(l)}\right) \end{aligned}$$

and

$$\begin{aligned} \sum _{l=-n}^{n}H_{l}\left( t,s,b_{l}^{(n)},\omega \right) m_{l}^{\left( k,n\right) }= & {} \sum _{l=-n}^{n}b_{l}^{\left( n\right) }\eta i\int _{t}^{s}e^{ib_{l}^{\left( n\right) }\left( u-t\right) }G\left( u,s,\omega \right) \text {d}u\,m_{l}^{\left( k,n\right) }. \end{aligned}$$

Taking into account the definition of the discretised measure \({\tilde{v}}_{k}^{\left( n\right) }\) in Eq. (24), we obtain

$$\begin{aligned} \int _{-\infty }^{\infty }H_{l}\left( t,s,l,\omega \right) {\tilde{v}}_{k}^{\left( n\right) } \left( l\right) \text {d}l= & {} \eta i\int _{-\infty }^{\infty } \left( \int _{t}^{s}le^{-il\left( u-t\right) }G\left( u,s,\omega \right) \text {d}u\right) {\tilde{v}}_{k}^{\left( n\right) }\left( l\right) \text {d}l. \end{aligned}$$

Passing to the limit we have

$$\begin{aligned} \int _{-\infty }^{\infty }H_{l}\left( t,s,l,\omega \right) v_{k}\left( l\right) \text {d}l= & {} \eta i\int _{-\infty }^{\infty }\left( \int _{t}^{s}le^{-il\left( u-t\right) }G \left( u,s,\omega \right) \text {d}u\right) v_{k}\left( l\right) \text {d}l\\= & {} \eta i\int _{t}^{s}G\left( u,s,\omega \right) \int _{-\infty }^{\infty }le^{-il \left( u-t\right) }v_{k}\left( l\right) \text {d}l\text {d}u\\= & {} \eta \int _{t}^{s}G\left( u,s,\omega \right) \frac{\text {d}g_{k} \left( u-t\right) }{\text {d}t}\text {d}u \end{aligned}$$

The second and last lines of the latter equation stem from Fubini’s theorem and the frequency differentiation property of the Fourier transform, respectively. Likewise, we show that the limit of \(\sum _{l=-n}^{n}H_{l}\left( t,s,b_{l}^{(n)},\omega \right) Y_{t}^{\left( l\right) }m_{l}^{\left( k,n\right) }\) is

$$\begin{aligned} \int _{-\infty }^{\infty }H_{l}\left( t,s,l,\omega \right) Y_{t}^{l}v_{k}\left( l\right) \text {d}l= & {} i\eta \int _{t}^{s}G\left( u,s,\omega \right) \int _{{\mathbb {R}}} le^{il\left( u-t\right) }Y_{t}^{l}v_{k}\left( l\right) \text {d}l\,\text {d}u, \end{aligned}$$

which concludes the proof. \(\square \)

Note that for \(t=0\), \(Y_{t}^{l}=0\) and we evaluate the moment generating function of \(\lambda _{s}\mid {\mathcal {F}}_{t}\) in Eq. (31) solely from \(\lambda _{0}\). For \(0<t<s\), however, we need to infer \(\left( \lambda _{t}^{\left( n\right) },\left( Y_{t}^{\left( l\right) }\right) _{l=-n,\cdots ,n}\right) \) from option prices observed over the interval [0, t]. This can be done using the filtering technique as in Hainaut and Moraux (2018b), and Njike and Hainaut (2022). Applying a finite difference approach to the mgf of the HP intensity in Proposition 2, we plot the expected value of the HP intensity as a function of time in Fig. 2. From this figure, we conclude that the expected HP intensity decreases over time at the same rate of decay as the corresponding kernel function.
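One possible reading of the finite difference step, assuming it refers to differentiating the mgf numerically at \(\omega =0\) to recover the conditional mean, is sketched below in Python. The helper `mean_from_mgf` and the toy Gamma mgf are hypothetical stand-ins for the mgf of Proposition 2; the difference is taken one-sidedly on \(\omega \le 0\), on the assumption that \({\mathbb {C}}_{-}\) denotes arguments with non-positive real part.

```python
def mean_from_mgf(mgf, step=1e-4):
    """Second-order one-sided finite difference of an mgf at omega = 0.

    Only non-positive arguments (0, -step, -2*step) are used, so the mgf
    never has to be evaluated at omega > 0.
    """
    return (3.0 * mgf(0.0) - 4.0 * mgf(-step) + mgf(-2.0 * step)) / (2.0 * step)


def gamma_mgf(omega, shape=3.0, rate=2.0):
    """Toy mgf with a known mean (Gamma law, mean = shape / rate).

    It only stands in for the conditional mgf of the intensity, which in the
    model would come from solving Eqs. (32)-(33) numerically.
    """
    return (1.0 - omega / rate) ** (-shape)


estimated_mean = mean_from_mgf(gamma_mgf)  # close to shape / rate = 1.5
```

Repeating such an evaluation over a grid of horizons s would give a curve of expected intensities of the kind shown in Fig. 2.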

Fig. 2

Expected value of the HP intensity \(\lambda _{t}\mid {\mathcal {F}}_{0}\). For comparison purposes, the parameters \(\beta \) of the Laplace, Gaussian and Logistic kernels are such that \(\sigma _{L}^{2}=\sigma _{G}^{2}=\sigma _{Log}^{2}=16\), while the Cauchy kernel is computed with \(\beta _{C}=2.71\). The other parameters are: \(\alpha =0.4\), \(\eta =0.2\), \(\lambda _{0}=50\), \(p=0.4\), \(\rho ^{+}=30\) and \(\rho ^{-}=-37\)

Let \(X_{t}:=\log S_{t}\) be the log of the asset price at time t. Considering the discretisation scheme on the partition \({\mathcal {E}}^{(n)}\), the asset price, denoted \(S_{t}^{\left( n\right) }\), is the solution of the following SDE:

$$\begin{aligned} \frac{\text {d}S_{t}^{\left( n\right) }}{S_{t}^{\left( n\right) }}= & {} r\text {d}t+\sqrt{V_{t}}\left( \rho \text {d}W_{t}^{1} +\sqrt{1-\rho ^{2}}\text {d}W_{t}^{2}\right) \nonumber \\{} & {} +\text {d}\left( \sum _{l=1}^{N_{t}^{\left( n\right) }} \left( e^{J_{l}}-1\right) \right) -\lambda _{t}^{\left( n\right) }{\mathbb {E}} \left( e^{J}-1\right) \text {d}t.\nonumber \\ \end{aligned}$$
(35)

By Itô’s formula, the discretised version of the process \(X_{t}\), denoted \(X_{t}^{\left( n\right) }:=\log S_{t}^{\left( n\right) }\), satisfies the following SDE

$$\begin{aligned} \text {d}X_{t}^{\left( n\right) }= & {} \left( r-\frac{1}{2}V_{t} -\lambda _{t}^{\left( n\right) }{\mathbb {E}}\left( e^{J}-1\right) \right) \text {d}t+\sqrt{V_{t}}\left( \rho \text {d}W_{t}^{1} +\sqrt{1-\rho ^{2}}\text {d}W_{t}^{2}\right) \nonumber \\{} & {} +\text {d}\left( \sum _{l=1}^{N_{t}^{\left( n\right) }}J_{l}\right) . \end{aligned}$$
(36)

The process \(X_{t}^{\left( n\right) }\) is then a Markov process with respect to the intensity \(\lambda _{t}^{\left( n\right) }\), the \(2n+1\) factors \(\left( Y_{t}^{\left( l\right) }\right) _{l=-n,\ldots ,n}\) and the counter \(N_{t}^{\left( n\right) }\). We can therefore deal with pricing and hedging problems using standard methods developed for stochastic models. It is shown in the proof of Proposition 2 that \(\left( \lambda _{t}^{\left( n\right) },\left( Y_{t}^{\left( l\right) }\right) _{l=-n,\ldots ,n},N_{t}^{\left( n\right) }\right) \) converges almost surely to \(\left( \lambda _{t},\left( Y_{t}^{l}\right) _{l\in {\mathbb {R}}},N_{t}\right) \). Consequently, the process \(X_{t}^{\left( n\right) }\) converges almost surely to \(X_{t}\). The next proposition establishes the characteristic function of \(X_{s}^{\left( n\right) }\) conditional on the information at time \(t\le s\).

Proposition 3

For all \(\omega \in {\mathbb {R}}\) that satisfy the regularity condition \(\psi \left( i\omega ,\eta \sum _{l=-n}^{n}m_{l}^{\left( k,n\right) }\right) <\infty \), the characteristic function of \(X_{s}^{\left( n\right) }\) conditional on the information at time \(t\le s\) is given by

$$\begin{aligned} \varUpsilon _{t,s}^{\left( n\right) }\left( \omega \right)&:= {\mathbb {E}}^{Q} \left( e^{i\omega X_{s}^{\left( n\right) }}\mid {\mathcal {F}}_{t}\right) \\&= \exp \left( i\omega \left( X_{t}^{\left( n\right) }+r\left( s-t\right) \right) \right) \\&\quad \times \exp \left( \theta _{V}\sigma _{V}^{-2}\left( k_{V}-i\omega \rho \sigma _{V}-{\bar{d}}\right) \left( s-t\right) -2\log \left( \frac{1-{\bar{g}} e^{-{\bar{d}}\left( s-t\right) }}{1-{\bar{g}}}\right) \right) \\&\quad \times \exp \left( V_{t}\sigma _{V}^{-2}\left( k_{V}-i\omega \rho \sigma _{V}-{\bar{d}}\right) \frac{1-e^{-{\bar{d}}\left( s-t\right) }}{1-{\bar{g}}e^{-{\bar{d}}\left( s-t\right) }}\right) \\&\quad \times \exp \left( A_{J}\left( t,s,\omega \right) +\sum _{l=-n}^{n} C_{l}\left( t,s,b_{l}^{\left( n\right) },\omega \right) m_{l}^{\left( k,n \right) }Y_{t}^{\left( l\right) }+D\left( t,s,\omega \right) \lambda _{t}^{\left( n\right) }\right) \end{aligned}$$

where

$$\begin{aligned} {\bar{d}}= & {} \sqrt{\left( i\omega \rho \sigma _{V}-k_{V}\right) ^{2} +\sigma _{V}^{2}\left( i\omega +\omega ^{2}\right) },\\ {\bar{g}}= & {} \frac{k_{V}-i\omega \rho \sigma _{V}-{\bar{d}}}{\left( k_{V} -i\omega \rho \sigma _{V}+{\bar{d}}\right) }, \end{aligned}$$

\(k\in \left\{ L,G,Log,C\right\} \) indicates the type of HP memory kernel. For all \(l\in \left\{ -n,\ldots ,n\right\} \),

$$\begin{aligned} C_{l}\left( t,s,b_{l}^{\left( n\right) },\omega \right)= & {} ib_{l}^{\left( n\right) }\eta \int _{t}^{s}e^{ib_{l}^{\left( n\right) } \left( u-t\right) }D\left( u,s,\omega \right) \text {d}u. \end{aligned}$$
(37)

\(A_{J}\) and D are functions that solve the following system of ODEs:

$$\begin{aligned} \left\{ \begin{aligned} \frac{\partial A_{J}}{\partial t}\left( t,s,\omega \right)&=-D\left( t,s,\omega \right) \left( \lambda _{0}-\alpha \right) \frac{\text {d}g_{k}\left( t\right) }{\text {d}t}\\ \frac{\partial D}{\partial t}\left( t,s,\omega \right)&=i\omega \left( \psi \left( 1,0\right) -1\right) -\Bigg (\psi \Big (i\omega ,\sum _{l=-n}^{n}C_{l}\left( t,s,b_{l}^{\left( n\right) }, \omega \right) m_{l}^{(k,n)}\\&\quad +\eta D\left( t,s,\omega \right) \Big )-1\Bigg ) \end{aligned}\right. \end{aligned}$$
(38)

with the terminal conditions \(A_{J}\left( s,s,\omega \right) =0\) and \(D\left( s,s,\omega \right) =0\).

Proof

In Appendix B. \(\square \)

The previous proposition presents the characteristic function as the product of the exponential of an affine function related to the jump component and the characteristic function of the standard Heston model (see Schoutens et al. 2004 and Albrecher et al. 2006 for more details). Passing to the limit with respect to n, we obtain the following proposition on the characteristic function of the process \(X_{s}\) under the risk neutral measure \({\mathbb {Q}}\).

Proposition 4

For all \(\omega \in {\mathbb {R}}\) that satisfy the regularity condition \(\psi \left( i\omega ,\eta \right) <\infty \), the characteristic function of \(X_{s}\) conditional on the information at time \(t\le s\) is given by

$$\begin{aligned} \varUpsilon _{t,s}\left( \omega \right)&:= {\mathbb {E}}^{Q} \left( e^{i\omega X_{s}}\mid {\mathcal {F}}_{t}\right) \\&= \exp \left( i\omega \left( X_{t}+r\left( s -t\right) \right) \right) \nonumber \\&\quad \times \exp \left( \theta _{V}\sigma _{V}^{-2}\left( k_{V} -i\omega \rho \sigma _{V}-{\bar{d}}\right) \left( s-t\right) -2\log \left( \frac{1-{\bar{g}}e^{-{\bar{d}}\left( s-t\right) }}{1-{\bar{g}}}\right) \right) \nonumber \\&\quad \times \exp \left( V_{t}\sigma _{V}^{-2}\left( k_{V} -i\omega \rho \sigma _{V}-{\bar{d}}\right) \frac{1-e^{-{\bar{d}} \left( s-t\right) }}{1-{\bar{g}}e^{-{\bar{d}}\left( s-t\right) }}\right) \nonumber \\&\quad \times \exp \left( A_{J}\left( t,s,\omega \right) +i\eta \int _{t}^{s}D\left( u,s,\omega \right) \int _{{\mathbb {R}}} le^{il\left( u-t\right) }Y_{t}^{l}v_{k}\left( l\right) \text {d}l\, \text {d}u+D\left( t,s,\omega \right) \lambda _{t}\right) \nonumber \end{aligned}$$
(39)

where

$$\begin{aligned} {\bar{d}}= & {} \sqrt{\left( i\omega \rho \sigma _{V}-k_{V}\right) ^{2} +\sigma _{V}^{2}\left( i\omega +\omega ^{2}\right) },\\ {\bar{g}}= & {} \frac{k_{V}-i\omega \rho \sigma _{V}-{\bar{d}}}{\left( k_{V} -i\omega \rho \sigma _{V}+{\bar{d}}\right) }. \end{aligned}$$

\(k\in \left\{ L,G,Log,C\right\} \) indicates the type of HP memory kernel. \(A_{J}\) and D satisfy the next ODE and PIDE

$$\begin{aligned} \left\{ \begin{aligned} \frac{\partial A_{J}}{\partial t}\left( t,s,\omega \right)&=-D\left( t,s,\omega \right) \left( \lambda _{0}-\alpha \right) \frac{\text {d}g_{k}\left( t\right) }{\text {d}t}\\ \frac{\partial D}{\partial t}\left( t,s,\omega \right)&=i\omega \left( \psi \left( 1,0\right) -1\right) \\&\quad -\left( \psi \left( i\omega ,\eta \int _{t}^{s}\frac{\text {d}g_{k}}{\text {d}h}\left( u-t\right) D \left( u,s,\omega \right) \text {d}u+\eta D\left( t,s,\omega \right) \right) -1\right) \end{aligned}\right. ,\nonumber \\ \end{aligned}$$
(40)

with the terminal conditions \(A_{J}\left( s,s,\omega \right) =0\) and \(D\left( s,s,\omega \right) =0\).

Proof

This proof is similar to that of Proposition 2: \(X_{s}^{\left( n\right) }\) converges to \(X_{s}\) almost surely and, by dominated convergence, the conditional characteristic function of the log-return process is equal to the limit with respect to n of the conditional characteristic function of the approximated log-return process. Grouping terms and using the frequency derivative property of the Fourier transform, we retrieve the result of Proposition 4. \(\square \)

Our setting takes into account several features. It allows for stochastic evolution of the asset price’s variance, includes a jump risk component in the asset price dynamics and replicates the clustering of jumps. We also generalize some existing valuation models. For instance, we retrieve the Black and Scholes option price formula by setting \(J_{l}=0\) and \(\theta _{V}=k_{V}=\sigma _{V}=0\), the Heston option price formula by setting \(J_{l}=0\), and the Bates and Merton option price formulas by setting \(\alpha =\lambda _{0}\), \(\eta =0\) and taking the jump size \(J_{l}\) log-normally distributed.

For comparison purposes, Fig. 3 plots the density of the log-return at 50 days \(X_{50d}\) under the risk neutral measure for our four HP kernels. From these graphs, we conclude that the log-return density at 50 days with the HP Cauchy kernel has the thinnest tail. The log-return densities at 50 days for our extension of the Heston model with the Laplace, Gaussian and Logistic kernels look identical, but zooming in reveals differences due to the decay rates of these kernel functions. Hence, the log-return density with the Laplace kernel, which decays more slowly than the Gaussian and Logistic ones, has the fattest tail.

Fig. 3

Distribution of \(X_{50d}\mid {\mathcal {F}}_{0}\) under the risk neutral measure. For comparison purposes, the parameters \(\beta \) of the Laplace, Gaussian and Logistic kernels are such that \(\sigma _{L}^{2}=\sigma _{G}^{2}=\sigma _{Log}^{2}=16\), while the Cauchy kernel is computed with \(\beta _{C}=2.71\). The other parameters are: \(S_{0}=1\), \(r=0.01\), \(\theta _{V}=0.4\), \(k_{V}=1\), \(\rho =-0.7\), \(\sigma _{V}=0.01\), \(v_{0}=0.01\), \(\alpha =0.4\), \(\eta =0.2\), \(\lambda _{0}=50\), \(p=0.4\), \(\rho ^{+}=30\) and \(\rho ^{-}=-37\)

3 Option pricing models

In this section, we develop the option pricing model under our modeling framework outlined in the previous section.

Like many authors (see e.g. Heston 1993; Bakshi et al. 1997; Gatheral 2012), we express the call option price using the delta probability decomposition of the Black-Scholes formula (1973). To this end, in the next proposition we introduce a convenient measure \({\mathbb {Q}}^{\star }\).

Proposition 5

Let \({\mathbb {Q}}^{\star }\) be a measure equivalent to \({\mathbb {Q}}\) on the measurable space \(\left( \Omega ,{\mathcal {F}}\right) \), defined by

$$\begin{aligned} \left. \frac{\text {d}{\mathbb {Q}}^{\star }}{\text {d}{\mathbb {Q}}} \right| _{{\mathcal {F}}_{t}}= & {} \exp \left( -\frac{1}{2} \int _{0}^{t}V_{u}\text {d}u+\int _{0}^{t}\sqrt{V_{u}} \left( \rho \text {d}W_{u}^{1}+\sqrt{1-\rho ^{2}}\text {d}W_{u}^{2}\right) \right) \\{} & {} \times \exp \left( -\left( \psi \left( 1,0\right) -1\right) \int _{0}^{t}\lambda _{u}\text {d}u+\sum _{l=1}^{N_{t}}J_{l}\right) . \end{aligned}$$

Then, under the measure \({\mathbb {Q}}^{\star }\),

  1. (i)

    \(W_{t}^{\star 1}\) and \(W_{t}^{\star 2}\) are Brownian motions defined as follows:

    $$\begin{aligned} \text {d}W_{t}^{\star 1}= & {} \text {d}W_{t}^{1}-\rho \sqrt{V_{t}}\text {d}t,\\ \text {d}W_{t}^{\star 2}= & {} \text {d}W_{t}^{2}-\sqrt{1-\rho ^{2}}\sqrt{V_{t}}\text {d}t. \end{aligned}$$
  2. (ii)

    \(N_{t}^{\star }\) is a counting process with the following intensity

    $$\begin{aligned} \lambda _{t}^{\star }= & {} \psi \left( 1,0\right) \lambda _{t}. \end{aligned}$$
    (41)

    Considering the discretisation scheme on the partition \({\mathcal {E}}^{(n)}\), this intensity, denoted \(\lambda _{t}^{\left( n\right) ,\star }\), is the solution of the following SDE

    $$\begin{aligned} \text {d}\lambda _{t}^{\left( n\right) ,\star }= & {} \left( \lambda _{0}^{\star }-\alpha ^{\star }\right) \frac{\text {d}g_{k} \left( t\right) }{\text {d}t}\text {d}t+\eta ^{\star }\sum _{l=-n}^{n} m_{l}^{\left( k,n\right) }ib_{l}^{(n)}Y_{t}^{(l),\star }\text {d}t +\eta ^{\star }\text {d}L_{t}^{\left( n\right) ,\star },\nonumber \\ \end{aligned}$$
    (42)

    where \(\lambda _{0}^{\star }=\psi \left( 1,0\right) \lambda _{0}\), \(\alpha ^{\star }=\psi \left( 1,0\right) \alpha \), \(\eta ^{\star }=\psi \left( 1,0\right) \eta \), \(L_{t}^{\left( n\right) ,\star }=\sum _{l=1}^{N_{t}^{\left( n\right) , \star }}\left| J_{l}^{\star }\right| \) and for all \(l\in \left\{ -n,\ldots ,n\right\} \)

    $$\begin{aligned}&\text {d}Y_{t}^{\left( l\right) ,\star }=ib_{l}^{\left( n\right) } Y_{t}^{\left( l\right) ,\star }\text {d}t+\text {d}L_{t}^{\left( n\right) ,\star }. \end{aligned}$$
    (43)

    \(J^{\star }\) is a double-exponentially distributed random variable with parameters:

    $$\begin{aligned} \rho ^{+,\star }= & {} \rho ^{+}-1, \end{aligned}$$
    (44)
    $$\begin{aligned} \rho ^{-,\star }= & {} \rho ^{-}+1, \end{aligned}$$
    (45)
    $$\begin{aligned} p^{\star }= & {} \frac{p\rho ^{+}\rho ^{-,\star }}{p\rho ^{+}\rho ^{-,\star } +\left( 1-p\right) \rho ^{-}\rho ^{+,\star }}, \end{aligned}$$
    (46)

    with the following joint moment generating function:

    $$\begin{aligned} \psi ^{\star }\left( z_{1},z_{2}\right) :={\mathbb {E}}\left( e^{z_{1}J^{\star } +z_{2}\left| J^{\star }\right| }\right)= & {} \frac{\psi \left( z_{1} +1,z_{2}\right) }{\psi \left( 1,0\right) }. \end{aligned}$$
    (47)
  3. (iii)

    the processes \(X_{t}\) and \(V_{t}\) satisfy the following SDEs:

    $$\begin{aligned} \text {d}X_{t}= & {} \left( r+\frac{1}{2}V_{t}-\lambda _{t}^{\star } \left( \psi ^{\star }\left( 1,0\right) -1\right) \right) \text {d}t\\{} & {} +\sqrt{V_{t}}\left( \rho \text {d}W_{t}^{\star 1}+\sqrt{1 -\rho ^{2}}\text {d}W_{t}^{\star 2}\right) +\text {d} \left( \sum _{l=1}^{N_{t}^{\star }}J_{l}^{\star }\right) ,\nonumber \end{aligned}$$
    (48)
    $$\begin{aligned} \text {d}V_{t}= & {} \left( \theta _{V}-\left( k_{V}-\rho \sigma _{V} \right) V_{t}\right) \text {d}t+\sigma _{V}\sqrt{V_{t}}\text {d}W_{t}^{\star 1}. \end{aligned}$$
    (49)

Proof

In Appendix C. \(\square \)
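The change of jump parameters in Eqs. (44)-(46) can be checked numerically against the identity in Eq. (47). The Python sketch below is only illustrative and rests on an assumed parameterisation that is not taken from the text: a jump is positive with probability p and exponential rate `rho_plus`, and negative otherwise with rate `rho_minus` stored as a positive magnitude (so that Eq. (45) reads as an increase of that magnitude by one). The function names `psi` and `starred_parameters` are hypothetical.

```python
def psi(z1, z2, p, rho_plus, rho_minus):
    """Joint mgf E[exp(z1*J + z2*|J|)] of a double-exponential jump J.

    Assumed convention: density p*rho_plus*exp(-rho_plus*x) for x > 0 and
    (1-p)*rho_minus*exp(rho_minus*x) for x < 0, with rho_minus > 0.
    """
    return (p * rho_plus / (rho_plus - z1 - z2)
            + (1.0 - p) * rho_minus / (rho_minus + z1 - z2))


def starred_parameters(p, rho_plus, rho_minus):
    """Jump parameters under Q*, following Eqs. (44)-(46)."""
    rho_plus_star = rho_plus - 1.0
    rho_minus_star = rho_minus + 1.0
    weight_up = p * rho_plus * rho_minus_star
    weight_down = (1.0 - p) * rho_minus * rho_plus_star
    p_star = weight_up / (weight_up + weight_down)
    return p_star, rho_plus_star, rho_minus_star


# Illustrative values in the spirit of the figure captions (magnitudes only).
p, rho_plus, rho_minus = 0.4, 30.0, 37.0
p_star, rho_plus_star, rho_minus_star = starred_parameters(p, rho_plus, rho_minus)

# Both sides of Eq. (47) at an arbitrary complex test point.
z1, z2 = 0.3j, 0.5
lhs = psi(z1, z2, p_star, rho_plus_star, rho_minus_star)
rhs = psi(z1 + 1.0, z2, p, rho_plus, rho_minus) / psi(1.0, 0.0, p, rho_plus, rho_minus)
gap = abs(lhs - rhs)  # expected to be at the level of rounding error
```

Under this convention the two sides agree algebraically, which is a quick way to catch sign mistakes when implementing the measure change.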

From this proposition and using the characteristic function of the log-return in Proposition 3, we value in the next proposition a European call option of maturity T and strike K, written on the terminal spot price \(S_{T}\) of some underlying asset.

Proposition 6

The price \(C\left( t,T,S,V,\lambda ,N,Y,K\right) \) of a European call option of maturity T, written on \(S_{t}\), is given by

$$\begin{aligned} C\left( t,T,S,V,\lambda ,N,Y,K\right)= & {} S_{t}P_{0}\left( t,T,S,V, \lambda ,N,Y,K\right) \\{} & {} -Ke^{-r\left( T-t\right) }P_{1}\left( t,T,S,V,\lambda ,N,Y,K\right) ,\nonumber \end{aligned}$$
(50)

where K is the strike price and \(P_{j}\equiv P_{j}\left( t,T,S,V,\lambda ,N,Y,K\right) \), \(j\in \left\{ 0,1\right\} \), are the probabilities of the call expiring in-the-money under the forward measure \({\mathbb {Q}}^{\star }\) and the risk neutral measure \({\mathbb {Q}}\) respectively. These probabilities are of the form

$$\begin{aligned} P_{0}= & {} \frac{1}{2}+\frac{1}{\pi }\int _{{\mathbb {R}}_{+}}\text {Re} \left( \exp \left( -i\omega \log K\right) \frac{\varUpsilon _{t,T}\left( \omega -i\right) }{i\omega e^{r\left( T-t\right) }S_{t}}\right) \text {d}\omega , \end{aligned}$$
(51)
$$\begin{aligned} P_{1}= & {} \frac{1}{2}+\frac{1}{\pi }\int _{{\mathbb {R}}_{+}}\text {Re} \left( \exp \left( -i\omega \log K\right) \frac{\varUpsilon _{t,T}\left( \omega \right) }{i\omega }\right) \text {d}\omega \end{aligned}$$
(52)

where \(\varUpsilon _{t,T}\) is the characteristic function of the process \(X_{t}\) under the measure \({\mathbb {Q}}\) defined in Proposition 4 with \(s=T\).

Proof

Considering the discretisation scheme on the partition \({\mathcal {E}}^{(n)}\), the European call option price is obtained as

$$\begin{aligned} C\left( t,T,S^{\left( n\right) },V,\lambda ^{\left( n\right) },N^{\left( n\right) },Y,K\right)= & {} C\left( t,T,X^{\left( n\right) },V,\lambda ^{\left( n\right) },N^{\left( n\right) }, Y,K\right) \nonumber \\= & {} e^{-r\left( T-t\right) }{\mathbb {E}}\left( \max \left( e^{X_{T}}-K,0\right) |{\mathcal {F}}_{t}\right) \nonumber \\= & {} e^{X_{t}}{\mathbb {E}}\left( \frac{e^{-rT}e^{X_{T}}}{e^{-rt}e^{X_{t}}} {\mathbb {I}}_{X_{T}^{\left( n\right) }>\log K}|{\mathcal {F}}_{t}\right) \nonumber \\{} & {} -Ke^{-r\left( T-t\right) }{\mathbb {E}}\left( {\mathbb {I}}_{X_{T }^{\left( n\right) }>\log K}|{\mathcal {F}}_{t}\right) \nonumber \\= & {} e^{X_{t}}{\mathbb {E}}^{\star }\left( {\mathbb {I}}_{X_{T}^{\left( n \right) }>\log K}|{\mathcal {F}}_{t}\right) \nonumber \\{} & {} -Ke^{-r\left( T-t\right) }{\mathbb {E}}\left( {\mathbb {I}}_{X_{T }^{\left( n\right) }>\log K}|{\mathcal {F}}_{t}\right) \nonumber \\= & {} S_{t}{\mathbb {Q}}^{\star }\left( X_{T}^{\left( n\right) }>\log K\right) \nonumber \\{} & {} -Ke^{-r\left( T-t\right) }{\mathbb {Q}}\left( X_{T}^{\left( n \right) }>\log K\right) \nonumber \\= & {} S_{t}P_{0}-Ke^{-r\left( T-t\right) }P_{1} \end{aligned}$$
(53)

where, as in the Black-Scholes formula (1973), \(P_{0}^{\left( n\right) }\equiv P_{0}\left( t,T,X^{\left( n\right) },V,\lambda ^{\left( n\right) },N^{\left( n\right) },Y,K\right) \) and \(P_{1}^{\left( n\right) }\equiv P_{1}\left( t,T,X^{\left( n\right) },V,\lambda ^{\left( n\right) },N^{\left( n\right) },Y,K\right) \) are the probabilities of the call expiring in-the-money under the measure \({\mathbb {Q}}^{\star }\) and the risk neutral measure \({\mathbb {Q}}\) respectively. Since \(C\left( T,T,X^{\left( n\right) },V,\lambda ^{\left( n\right) },N^{\left( n\right) },Y,K\right) =\max \left( e^{X_{T}}-K,0\right) \), these probabilities satisfy the terminal condition

$$\begin{aligned} P_{j}\left( T,T,X^{\left( n\right) },V,\lambda ^{\left( n\right) },N^{\left( n\right) },Y,K\right)= & {} {\left\{ \begin{array}{ll} 1 &{} \quad \text {if }X_{T}>\log K\\ 0 &{} \quad \text {otherwise} \end{array}\right. },\quad \forall j\in \left\{ 0,1\right\} .\nonumber \\ \end{aligned}$$
(54)

Given that the \(P_{j}\) are survival probabilities with respect to \(X_{T}\mid {\mathcal {F}}_{t}\) under the measure \({\mathbb {Q}}^{\star }\) and the risk neutral measure \({\mathbb {Q}}\) respectively, we apply the inversion theorem of Gil-Pelaez (1951) to get:

$$\begin{aligned} P_{0}^{\left( n\right) }= & {} \frac{1}{2}+\frac{1}{\pi }\int _{{\mathbb {R}}_{+}} \text {Re}\left( \frac{e^{-i\omega \log K}\varUpsilon _{t,T}^{\left( n\right) ,\star }\left( \omega \right) }{i\omega }\right) \text {d}\omega \end{aligned}$$
(55)
$$\begin{aligned} P_{1}^{\left( n\right) }= & {} \frac{1}{2}+\frac{1}{\pi }\int _{{\mathbb {R}}_{+}} \text {Re}\left( \frac{e^{-i\omega \log K}\varUpsilon _{t,T}^{\left( n\right) } \left( \omega \right) }{i\omega }\right) \text {d}\omega , \end{aligned}$$
(56)

where \(\varUpsilon _{t,T}^{\left( n\right) ,\star }\left( .\right) \) and \(\varUpsilon _{t,T}^{\left( n\right) }\left( .\right) \) are the characteristic functions of \(X_{T}\mid {\mathcal {F}}_{t}\) under the measure \({\mathbb {Q}}^{\star }\) and the risk neutral measure \({\mathbb {Q}}\) respectively. From the definition of the measure \({\mathbb {Q}}^{\star }\) we have

$$\begin{aligned} \varUpsilon _{t,T}^{\left( n\right) ,\star }\left( \omega \right)= & {} {\mathbb {E}}^{\star }\left( e^{i\omega X_{T}^{\left( n\right) }} \mid {\mathcal {F}}_{t}\right) \\= & {} {\mathbb {E}}\left( \frac{e^{-rT}e^{X_{T}^{\left( n\right) }}}{e^{-rt} e^{X_{t}^{\left( n\right) }}}e^{i\omega X_{T}^{\left( n\right) }}\mid {\mathcal {F}}_{t}\right) \\= & {} \frac{1}{e^{r\left( T-t\right) }S_{t}}{\mathbb {E}} \left( e^{i\left( \omega -i\right) X_{T}^{\left( n\right) }}\mid {\mathcal {F}}_{t}\right) \\= & {} \frac{\varUpsilon _{t,T}^{\left( n\right) }\left( \omega -i\right) }{e^{r\left( T-t\right) }S_{t}}. \end{aligned}$$

Substituting this into Eq. (55), passing to the limit with respect to n and using the frequency derivative property of the Fourier transform completes the proof. \(\square \)

The main difficulty in implementing this call option formula is the numerical approximation of the integrals in Eqs. (51) and (52). It is worth noting that an inaccurate implementation results in a poor fit to real market option data. One may think of using the Fourier transform, but the characteristic function \(\varUpsilon _{t,T}\) is only evaluated numerically, with possible discontinuities, and the integrands have a singularity at \(\omega =0\). However, these integrals can be evaluated efficiently via a trapezoidal quadrature approximation. Note that Lemma 1 in Witkovský (2001) enables one to compute the limits of the integrands in Eqs. (51) and (52) when \(\omega \) vanishes:

$$\begin{aligned} \text {Re}\left( \frac{e^{-i\omega \log K}}{i\omega }\frac{\varUpsilon _{t,T} \left( \omega -i\right) }{e^{r\left( T-t\right) }S_{t}}\right)\rightarrow & {} {\mathbb {E}}^{\star }\left( X_{T}\mid {\mathcal {F}}_{t}\right) -\log K\end{aligned}$$
(57)
$$\begin{aligned} \text {Re}\left( \frac{e^{-i\omega \log K}}{i\omega }\varUpsilon _{t,T} \left( \omega \right) \right)\rightarrow & {} {\mathbb {E}}\left( X_{T} \mid {\mathcal {F}}_{t}\right) -\log K \end{aligned}$$
(58)

where

$$\begin{aligned} {\mathbb {E}}^{\star }\left( X_{T}\mid {\mathcal {F}}_{t}\right)= & {} r\left( T-t\right) +\frac{1}{2}\frac{\theta _{V}}{k_{V}-\rho \sigma _{V}}\left( T-t\right) \\{} & {} +\left( V_{t}-\frac{\theta _{V}}{k_{V}-\rho \sigma _{V}} \right) \frac{1}{2\left( k_{V}-\rho \sigma _{V}\right) } \left( 1-e^{-\left( k_{V}-\rho \sigma _{V}\right) \left( T-t\right) }\right) ,\nonumber \end{aligned}$$
(59)
$$\begin{aligned} {\mathbb {E}}\left( X_{T}\mid {\mathcal {F}}_{t}\right)= & {} r\left( T-t\right) -\frac{1}{2}\frac{\theta _{V}}{k_{V}} \left( T-t\right) -\frac{1}{2}\left( V_{t}-\frac{\theta _{V}}{k_{V}} \right) \frac{1}{k_{V}}\left( 1-e^{-k_{V}\left( T-t\right) }\right) .\nonumber \\ \end{aligned}$$
(60)

are results proven in Appendix D.
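As a small worked check, the right-hand sides of Eqs. (59) and (60) can be coded directly, since both reduce to the expected integrated variance of a mean-reverting square-root process. The sketch below is a minimal illustration in Python; the function names are hypothetical, the parameter values are borrowed from the caption of Fig. 3, and the conversion of 50 days into a year fraction (50/365) is an assumption.

```python
import math


def expected_integrated_variance(v_t, theta, kappa, tau):
    """Integral over a horizon tau of E[V_u | F_t] for dV = (theta - kappa*V) dt + ..."""
    long_run = theta / kappa
    return long_run * tau + (v_t - long_run) * (1.0 - math.exp(-kappa * tau)) / kappa


def expected_log_return(v_t, r, theta_v, k_v, rho, sigma_v, tau, under_q_star):
    """Right-hand side of Eq. (59) if under_q_star is True, of Eq. (60) otherwise."""
    if under_q_star:
        # Under Q* the mean-reversion speed becomes k_V - rho*sigma_V and the
        # variance enters the drift with a plus sign.
        return r * tau + 0.5 * expected_integrated_variance(
            v_t, theta_v, k_v - rho * sigma_v, tau)
    return r * tau - 0.5 * expected_integrated_variance(v_t, theta_v, k_v, tau)


# Illustrative inputs (Fig. 3 caption); tau = 50/365 is an assumed day count.
v_t, r, theta_v, k_v, rho, sigma_v = 0.01, 0.01, 0.4, 1.0, -0.7, 0.01
tau = 50.0 / 365.0
mean_q = expected_log_return(v_t, r, theta_v, k_v, rho, sigma_v, tau, False)
mean_q_star = expected_log_return(v_t, r, theta_v, k_v, rho, sigma_v, tau, True)
```

These two numbers, minus \(\log K\), are the values that would be used for the integrands at \(\omega =0\) according to Eqs. (57) and (58).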

The trapezoidal quadrature approximations of the probabilities of the call expiring in-the-money for M discrete log-strikes \(\log K_{u}:=k_{u}=-k_{max}+\left( u-1\right) \Delta _{k}\), \(u=1,\ldots ,M\), equally spaced by \(\Delta _{k}=\frac{2k_{max}}{M}\), are given by:

$$\begin{aligned} P_{0}\approx & {} \frac{1}{2}+\frac{1}{\pi }\sum _{l=1}^{M}\gamma _{l} \text {Re}\left( \frac{e^{-i\omega _{l}k_{u}}}{i\omega _{l}} \frac{\varUpsilon _{t,T}\left( \omega _{l}-i\right) }{e^{r \left( T-t\right) }S_{t}}\right) \Delta _{\omega }\\ P_{1}\approx & {} \frac{1}{2}+\frac{1}{\pi }\sum _{l=1}^{M}\gamma _{l} \text {Re}\left( \frac{e^{-i\omega _{l}k_{u}}}{i\omega _{l}} \varUpsilon _{t,T}\left( \omega _{l}\right) \right) \Delta _{\omega } \end{aligned}$$

where \(\gamma _{l}=\frac{1}{2}1_{\left\{ l=1\right\} }+1_{\left\{ l\ne 1\right\} }\) are the trapezoidal quadrature weights and \(\omega _{l}=\left( l-1\right) \Delta _{\omega }\) for \(l=1,\ldots ,M\). The step \(\Delta _{\omega }=\frac{2\pi }{M\Delta _{k}}\) ensures that the absolute value of the integrand is sufficiently small for all \(\omega >\omega _{M}\).
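The quadrature above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results: the characteristic function is replaced by that of a toy Gaussian variable (for which the tail probability is known in closed form), the names `tail_probabilities`, `gaussian_cf`, `MEAN` and `STD` are hypothetical, and the integrand at \(\omega _{1}=0\) is replaced by its limit, mean minus log-strike, in the spirit of Eqs. (57)-(58).

```python
import cmath
import math


def tail_probabilities(cf, mean, k_max, n_points):
    """Trapezoidal Gil-Pelaez estimate of P(X > k_u) on the log-strike grid.

    cf   : characteristic function of X (callable on real omega)
    mean : E[X], used for the limit of the integrand at omega = 0
    """
    d_k = 2.0 * k_max / n_points
    d_omega = 2.0 * math.pi / (n_points * d_k)
    log_strikes = [-k_max + u * d_k for u in range(n_points)]
    probabilities = []
    for k_u in log_strikes:
        # First node omega_1 = 0: limit of the integrand, with weight 1/2.
        total = 0.5 * (mean - k_u)
        for l in range(1, n_points):
            omega = l * d_omega
            integrand = cmath.exp(-1j * omega * k_u) * cf(omega) / (1j * omega)
            total += integrand.real
        probabilities.append(0.5 + total * d_omega / math.pi)
    return log_strikes, probabilities


# Toy example: X is Gaussian, so P(X > k) is available in closed form.
MEAN, STD = 0.02, 0.2


def gaussian_cf(omega):
    return cmath.exp(1j * omega * MEAN - 0.5 * (STD * omega) ** 2)


log_strikes, probabilities = tail_probabilities(gaussian_cf, MEAN, k_max=5.0, n_points=128)
```

In this toy case the estimates can be compared with `0.5 * math.erfc((k - MEAN) / (STD * math.sqrt(2)))` for log-strikes well inside the grid; near the edges \(\pm k_{max}\) the approximation is expected to deteriorate because of the periodicity induced by the frequency step.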

From our call option price formula (50), we can derive Greeks for hedging purposes given the three sources of risk: the price risk \(S_{t}\), the volatility risk \(V_{t}\) and the jump risk \(\lambda _{t}\). The analytical expressions of the three deltas are given by

$$\begin{aligned} \Delta S= & {} \frac{\partial C}{\partial S}=P_{0}+S_{t} \frac{\partial P_{0}}{\partial S}-Ke^{-r\left( T-t\right) } \frac{\partial P_{1}}{\partial S},\\ \Delta V= & {} \frac{\partial C}{\partial V}=S_{t} \frac{\partial P_{0}}{\partial V}-Ke^{-r\left( T-t\right) }\frac{\partial P_{1}}{\partial V},\nonumber \\ \Delta \lambda= & {} \frac{\partial C}{\partial \lambda } =S_{t}\frac{\partial P_{0}}{\partial \lambda }-Ke^{-r\left( T -t\right) }\frac{\partial P_{1}}{\partial \lambda },\nonumber \end{aligned}$$
(61)

where for \(j\in \left\{ 0,1\right\} \)

$$\begin{aligned} \frac{\partial P_{0}}{\partial S}= & {} \frac{1}{\pi }\int _{{\mathbb {R}}_{+}} \text {Re}\left( \frac{e^{-i\omega \log K}\varUpsilon _{t,T}\left( \omega -i\right) }{e^{r \left( T-t\right) }S_{t}^{2}}\right) \text {d}\omega ,\\ \frac{\partial P_{1}}{\partial S}= & {} \frac{1}{\pi }\int _{{\mathbb {R}}_{+}} \text {Re}\left( \frac{e^{-i\omega \log K}\varUpsilon _{t,T}\left( \omega \right) }{S_{t}} \right) \text {d}\omega ,\\ \frac{\partial P_{0}}{\partial V}= & {} \frac{1}{\pi }\int _{{\mathbb {R}}_{+}} \text {Re}\left( B\left( t,T,\omega -i\right) \frac{e^{-i\omega \log K}\varUpsilon _{t,T} \left( \omega -i\right) }{i\omega e^{r\left( T-t\right) }S_{t}}\right) \text {d}\omega ,\\ \frac{\partial P_{1}}{\partial V}= & {} \frac{1}{\pi }\int _{{\mathbb {R}}_{+}} \text {Re}\left( B\left( t,T,\omega \right) \frac{e^{-i\omega \log K}\varUpsilon _{t,T} \left( \omega \right) }{i\omega }\right) \text {d}\omega ,\\ \frac{\partial P_{0}}{\partial \lambda }= & {} \frac{1}{\pi }\int _{{\mathbb {R}}_{+}} \text {Re}\left( D\left( t,T,\omega -i\right) \frac{e^{-i\omega \log K}\varUpsilon _{t,T} \left( \omega -i\right) }{i\omega e^{r\left( T-t\right) }S_{t}}\right) \text {d}\omega ,\\ \frac{\partial P_{1}}{\partial \lambda }= & {} \frac{1}{\pi }\int _{{\mathbb {R}}_{+}} \text {Re}\left( D\left( t,T,\omega \right) \frac{e^{-i\omega \log K}\varUpsilon _{t,T} \left( \omega \right) }{i\omega }\right) \text {d}\omega , \end{aligned}$$

with

$$\begin{aligned} B\left( t,T,\omega \right)= & {} \sigma _{V}^{-2}\left( k_{V}-i\omega \rho \sigma _{V}-{\bar{d}}\right) \frac{1-e^{-{\bar{d}}\left( s-t\right) }}{1-{\bar{g}}e^{-{\bar{d}}\left( s-t\right) }}, \end{aligned}$$

and with D satisfying the ODE in Eq. (40). Similarly, we can obtain the second and third order Greeks (such as gamma), which are the second and third order partial derivatives of the call option price with respect to the asset price, the variance and the arrival rate of jumps. Thus, the sensitivity of our call option price with respect to these three state variables has an analytic form that relies on the computation of integrals.
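As a minimal numerical sketch of how such integral representations can be evaluated, the Python snippet below computes the two in-the-money probabilities and the call price by a midpoint quadrature. It assumes a Gil-Pelaez type inversion consistent with the integrands above (an assumption about Eqs. 51 and 52, which are not reproduced here) and, purely for illustration, replaces \(\varUpsilon _{t,T}\) by the Black and Scholes characteristic function of the log-price without dividends, so that the output can be compared with a closed form. The function names, the truncation bound `upper` and the grid size `n` are hypothetical choices, not the implementation used in this article.

```python
import numpy as np
from math import erf, exp, log, sqrt


def bs_cf(u, S, r, sigma, tau):
    """Black-Scholes characteristic function of log S_T.

    Illustrative stand-in for Upsilon_{t,T}; NOT the model of the article.
    """
    m = log(S) + (r - 0.5 * sigma**2) * tau
    return np.exp(1j * u * m - 0.5 * sigma**2 * tau * u**2)


def itm_probabilities(S, K, r, sigma, tau, upper=200.0, n=20000):
    """Return (P0, P1) by midpoint quadrature on (0, upper)."""
    h = upper / n
    w = (np.arange(n) + 0.5) * h  # midpoints, so w = 0 is never evaluated
    phase = np.exp(-1j * w * log(K))
    f1 = np.real(phase * bs_cf(w, S, r, sigma, tau) / (1j * w))
    f0 = np.real(phase * bs_cf(w - 1j, S, r, sigma, tau)
                 / (1j * w * exp(r * tau) * S))
    p0 = 0.5 + h * f0.sum() / np.pi
    p1 = 0.5 + h * f1.sum() / np.pi
    return p0, p1


def fourier_call(S, K, r, sigma, tau):
    """Call price from the decomposition C = S * P0 - K * exp(-r * tau) * P1."""
    p0, p1 = itm_probabilities(S, K, r, sigma, tau)
    return S * p0 - K * exp(-r * tau) * p1


def bs_call(S, K, r, sigma, tau):
    """Closed-form Black-Scholes call, used only as a reference value."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    cdf = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    return S * cdf(d1) - K * exp(-r * tau) * cdf(d2)


def fd_delta(S, K, r, sigma, tau, bump=0.5):
    """Central finite-difference check of the price risk Delta S."""
    up = fourier_call(S + bump, K, r, sigma, tau)
    down = fourier_call(S - bump, K, r, sigma, tau)
    return (up - down) / (2.0 * bump)
```

In this Black and Scholes stand-in the price risk reduces to \(P_{0}\), so comparing `fd_delta` with the first output of `itm_probabilities` gives a quick sanity check of the quadrature before a richer characteristic function is plugged in.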

4 Illustration

This section has three goals. Firstly, we present the method for calibrating our Heston model extension to an implied volatility surface. Secondly, we show that our model achieves a better fit of implied volatilities than the standard Heston model. Finally, we study the sensitivity of the European call option price to the memory parameter \(\beta \) (Eqs. 6, 8, 10 and 12).

4.1 Data description

The data used in this section result from a filtering process applied to the implied volatility surface of the Euro Stoxx 50 as of the \(26^{th}\) of September 2019, shown in Fig. 4, with the spot price \(S_{0}=3541\) and the dividend rate \(q=0.00225\). From the raw data obtained from Bloomberg’s Option Monitor, we exclude options with maturity greater than one year, as they are less sensitive to volatility. The constant risk-free rate r is approximated by the three-month instantaneous forward rate on the \(26^{th}\) of September 2019 implied by the euro area yield curve for issuers rated triple A.

Fig. 4 Implied volatility surface of the Euro Stoxx 50 as of the \(26^{th}\) of September 2019, with the spot price \(S_{0}=3541\) and the dividend rate \(q=0.00225\)

4.2 Calibration and sensitivity analysis

To calibrate our SVJ models and the Heston model under the risk-neutral measure, we use the standard approach that consists in minimizing the sum of relative squared errors (SRSE) between IVs derived from our model and market IVs. More specifically, if we denote by \(\Theta \) the parameter set of our SVJ model, we evaluate the vector of parameters \({\hat{\Theta }}\) for which our model gives the closest prices to those observed in the market as follows:

$$\begin{aligned} {\hat{\Theta }}= & {} \underset{\Theta }{\text {argmin}}\left( \text {SRSE} \left( \Theta \right) +\gamma \max \left( \sigma _{V}^{2}-2\theta _{V},0\right) \right) ,\end{aligned}$$
(62)
$$\begin{aligned} \text {SRSE}\left( \Theta \right)= & {} \sum _{s=1}^{\#C}\left( \frac{\sigma _{IV}^{Market} \left( t,T_{s},S,V,\lambda ,N,Y,K_{s}\right) -\sigma _{IV} \left( t,T_{s},S,V,\lambda ,N,Y,K_{s},\Theta \right) }{\sigma _{IV}^{Market} \left( t,T_{s},S,V,\lambda ,N,Y,K_{s}\right) }\right) ^{2}, \end{aligned}$$
(63)

where \(K_{s}\) is the strike price of the s-th call option, \(T_{s}\) is its maturity, \(\sigma _{IV}^{Market}\) is the market implied volatility of the observed call option and \(\#C\) is the number of call options used. Note that the objective function is not convex and has no particular structure. Thus, a gradient-based optimization procedure is not guaranteed to reach a global minimum. As proposed by Hamida and Cont (2005), we add a convex penalization term \(\gamma \max \left( \sigma _{V}^{2}-2\theta _{V},0\right) \) to the objective function (63) in order to ensure that a gradient-based algorithm can be used.
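The penalized criterion of Eqs. (62) and (63) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model implied volatilities are taken as a precomputed array, and the names `srse`, `penalized_objective`, `sigma_v` and `theta_v` are hypothetical labels for the quantities \(\text {SRSE}\), the bracket in Eq. (62), \(\sigma _{V}\) and \(\theta _{V}\).

```python
import numpy as np


def srse(market_iv, model_iv):
    """Sum of relative squared errors between market and model IVs, as in Eq. (63)."""
    market_iv = np.asarray(market_iv, dtype=float)
    model_iv = np.asarray(model_iv, dtype=float)
    return float(np.sum(((market_iv - model_iv) / market_iv) ** 2))


def penalized_objective(market_iv, model_iv, sigma_v, theta_v, gamma=10000.0):
    """SRSE plus the penalty gamma * max(sigma_V^2 - 2 * theta_V, 0) of Eq. (62).

    The default gamma = 10000 is the value reported with Table 1.
    """
    penalty = gamma * max(sigma_v**2 - 2.0 * theta_v, 0.0)
    return srse(market_iv, model_iv) + penalty
```

In a full calibration, `model_iv` would be recomputed from the pricing formula for each candidate \(\Theta \) and the function passed to a numerical optimizer; that outer loop is deliberately left out of this sketch.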

Based on the filtered IV surface of the Euro Stoxx 50 as of the 26\(^{\text {th}}\) of September 2019, we find the set of model parameters minimizing the objective function (63). Table 1 presents the result of our calibration procedure applied to the Heston model and to its extensions with self-exciting jumps induced by a Laplace (6) (HHPL), Gaussian (8) (HHPG), Logistic (10) (HHPLog) or Cauchy (12) (HHPC) kernel. In Table 1, the SRSE measures the quality of fit of these models. We find that our Heston models with jumps in the asset price dynamics fit better, with SRSEs lower than the SRSE of the Heston model. In Fig. 5, we plot the market implied volatilities and the volatilities replicated through the Heston, HHPL, HHPG, HHPLog and HHPC models at maturities T equal to 22, 50, 85 and 113 days. We see that the implied volatilities derived from the HHPL, HHPG, HHPLog and HHPC models are the closest to the market ones and that these models outperform the Heston model, particularly for short maturities. From the Heston model to its extensions, we observe that the correlation \(\rho \) becomes close to \(-1\). This results from the contribution of the jump component to the explanation of the log-return volatility.
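The implied volatilities compared here are obtained by inverting the Black and Scholes formula, as recalled in the Introduction. A minimal sketch of that inversion by bisection is given below; it is not the routine used for Table 1. The spot price and the dividend rate in the usage note are those of Sect. 4.1, whereas the rate `r = 0.0`, the strike and the volatility are illustrative assumptions, and the function names are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(S, K, r, q, sigma, tau):
    """Black-Scholes call price with continuous dividend rate q."""
    d1 = (log(S / K) + (r - q + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return S * exp(-q * tau) * norm_cdf(d1) - K * exp(-r * tau) * norm_cdf(d2)


def implied_vol(price, S, K, r, q, tau, lo=1e-6, hi=5.0, tol=1e-10, max_iter=200):
    """Invert bs_call in sigma by bisection (the price is increasing in sigma)."""
    if not bs_call(S, K, r, q, lo, tau) <= price <= bs_call(S, K, r, q, hi, tau):
        raise ValueError("price is outside the range attainable on [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, q, mid, tau) < price:
            lo = mid  # volatility too low: move the lower bound up
        else:
            hi = mid  # volatility too high: move the upper bound down
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)
```

For instance, pricing a 113-day call with `bs_call(3541.0, 3600.0, 0.0, 0.00225, 0.15, 113 / 365)` and feeding the result back into `implied_vol` should recover a volatility close to 0.15. Bisection is chosen here for robustness rather than speed.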

Table 1 Estimated parameters with \(k_{max}=1\), \(M=2^{8}\) and \(\gamma =10000\) used for the trapezoidal quadrature approximation of the probabilities of the call expiring in-the-money (Eqs. 51 and 52)
Fig. 5 Market implied volatilities versus replicated implied volatilities from the Heston, HHPL, HHPG, HHPLog and HHPC models at maturities T equal to 22, 50, 85 and 113 days

Another important quality check when calibrating an implied volatility surface is the replication of the term structure of at-the-money (ATM) volatility skews. Denoted by \(\varPsi \left( T\right) \), the ATM volatility skew at expiry time \(T\ge t\) is the absolute value of the derivative of the implied volatility with respect to the strike, evaluated at the money:

$$\begin{aligned} \varPsi \left( T\right):= & {} \left| \frac{\partial \sigma _{IV}\left( t,T,S,V,\lambda ,N,Y,K,\Theta \right) }{\partial K}\right| _{K=S_{t}} \end{aligned}$$

For each model and for \(\epsilon \) small enough, we approximate \(\varPsi \left( T\right) \) by

$$\begin{aligned} {\hat{\varPsi }}\left( T\right)= & {} \left| \frac{\sigma _{IV}\left( t,T,S,V,\lambda ,N,Y,K +\epsilon ,{\hat{\Theta }}\right) -\sigma _{IV}\left( t,T,S,V,\lambda ,N,Y,K -\epsilon ,{\hat{\Theta }}\right) }{2\epsilon }\right| _{K=S_{t}}, \end{aligned}$$

where \({\hat{\Theta }}\) is the vector of calibrated parameters from Eq. (62). Empirical studies (see e.g. Gatheral et al. 2014) show that the observed term structure of the ATM volatility skew is well approximated by power-law functions of the form \(T^{H-\frac{1}{2}}\) with \(H\in \left( 0,\frac{1}{2}\right) \) the Hurst exponent. Gatheral et al. (2014) show that a value of H close to zero generates a volatility surface with a reasonable shape. From Table 2, we deduce that all models considered have an estimated Hurst exponent \({\hat{H}}\) between 0.09 and 0.14. The HHPLog model exhibits the smallest estimated Hurst exponent, with an R-squared of 0.88 for the linear regression of \(\log {\hat{\varPsi }}\left( T\right) \) on \(\log T\). From Fig. 6, we conclude that, compared to the Heston model, the other models better replicate the empirical short-term explosion of the ATM volatility skew.
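The two steps described above, the central-difference skew and the log-log regression that yields \({\hat{H}}\), can be sketched as follows. The snippet is an illustration only: `iv` stands for any function returning a model implied volatility for a strike and a maturity, and `toy_iv` is a synthetic surface built so that its skew is exactly a power law with \(H=0.1\); neither is taken from the calibrated models of Table 1.

```python
import numpy as np


def atm_skew(iv, spot, T, eps=1.0):
    """Central-difference estimate of |d sigma_IV / dK| at K = spot."""
    return abs(iv(spot + eps, T) - iv(spot - eps, T)) / (2.0 * eps)


def fit_hurst(maturities, skews):
    """Regress log(skew) on log(T); return (H_hat, R-squared).

    The slope of the regression estimates H - 1/2.
    """
    x = np.log(np.asarray(maturities, dtype=float))
    y = np.log(np.asarray(skews, dtype=float))
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = float(np.sum(residuals**2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return slope + 0.5, 1.0 - ss_res / ss_tot


SPOT = 3541.0  # spot price reported in Sect. 4.1


def toy_iv(K, T, H=0.1):
    """Synthetic IV, linear in strike, with ATM skew proportional to T**(H - 1/2)."""
    return 0.15 - 0.3 * T ** (H - 0.5) * (K - SPOT) / SPOT


maturities = np.array([22.0, 50.0, 85.0, 113.0]) / 365.0  # maturities of Fig. 5, in years
skews = [atm_skew(toy_iv, SPOT, T) for T in maturities]
H_hat, r2 = fit_hurst(maturities, skews)
```

On this synthetic surface the regression recovers the exponent used to build it, which is a convenient check before applying the same two functions to implied volatilities produced by a calibrated model.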

Table 2 Estimated Hurst exponent \({\hat{H}}\) of the ATM volatility skew and the R-squared of the linear regression model that explains the log-ATM volatility skew \(\log {\hat{\varPsi }}\left( T\right) \) given \(\log T\)
Fig. 6 The slope of the implied volatility skew as a function of the maturity over a range from one day to 2.5 years. Model parameters used are from Table 1

We now focus on the differences between our extensions of the Heston model. To distinguish them in terms of memory length, we draw in Fig. 7 their corresponding kernel functions given the fitted parameters in Table 1. We observe that the Gaussian kernel has the slowest decay, while the Logistic kernel has the fastest decay. Thus, the HHPG and HHPLog are respectively the models with the longest and shortest memory among our extensions of the Heston model. From Fig. 7, we also find that the memory of the HHPL and HHPG models increases with the parameter \(\beta \), whereas that of the HHPLog and HHPC models increases when the parameter \(\beta \) decreases. Figure 8 plots the probability density function of the log-return at 113 days. This figure allows us to link the memory length of the models to the thickness of the tail of the log-return distribution. In agreement with the results of Sect. 2, the tail thickness of the log-return distribution for our extensions of the Heston model is ordered by memory length.

Fig. 7 Sensitivity of kernel functions to the memory parameter \(\beta \). The initial value of \(\beta \) for each model is in Table 1

Fig. 8 Probability density function of the log-return \(f_{X_{0,113\,\text {days}}}\)

The impact of the memory parameter \(\beta \) of our affine Heston-style models on option pricing is studied through the implied volatilities and the price risk \(\Delta S\) (61). To inspect how recalling past price changes over a long period influences option prices, we vary the memory parameter \(\beta \) of our extensions of the Heston model. Figure 9 reveals that an increase in memory, indicated by a slower decaying kernel function \(g_{.}\left( .\right) \), induces an increase in the implied volatilities of out-of-the-money (OTM) call options (\(S<K\)). This increase in implied volatilities becomes more pronounced as the expiry rises. Figure 10 confirms this finding and brings out two points. Firstly, the impact of the memory parameter \(\beta \) is negligible in the short term, but increases with the expiry time T. Secondly, for the HHPLog model, which has the shortest memory, the impact of a small increase or decrease in memory remains negligible in the long term.

Fig. 9 Sensitivity to the memory parameter \(\beta \) of the HHPL model. Model parameters used are in Table 1

Fig. 10 Sensitivity of the implied volatility to the memory parameter \(\beta \) as a function of the expiry time T in months. Model parameters used are in Table 1

Fig. 11 The price risk \(\Delta S\) for the Heston, HHPL, HHPG, HHPLog and HHPC models. Model parameters used are in Table 1

The price risk \(\Delta S\) measures the sensitivity of the option price to the stock price. Recall that for options it is interpreted as the number of shares to hold (\(\Delta S>0\)) or to sell short (\(\Delta S<0\)) in order to hedge an option. Known as “delta hedging”, this is a very common strategy for arbitrage and for minimizing portfolio risk in the option market. From Fig. 11, the price risk \(\Delta S\) calculated from the Heston model is slightly higher than the one given by the other models for OTM call options, while for in-the-money (ITM) call options the price risk \(\Delta S\) in the Heston model is almost the same as in the other models. Figure 12 highlights the sensitivity of the price risk \(\Delta S\) to the self-excitation parameter \(\lambda _{0}\). We find that the increase in self-excitation caused by a rising \(\lambda _{0}\) implies a higher price risk \(\Delta S\) for OTM call options and a lower one for ITM call options. In contrast to the price risk \(\Delta S\), Fig. 13 shows that the implied volatilities derived from our extensions of the Heston model are more sensitive to the self-excitation parameter \(\lambda _{0}\) for OTM call options.

Although call option prices from the Heston model and its extensions have roughly the same sensitivity to changes in the underlying price, we find that these models induce different shapes of sensitivity to volatility and jump risks in the option price. The upper graphs in Fig. 14 show the sensitivity of call option prices expiring at 50 and 113 days to the volatility V. Known as the Greek “vega”, this sensitivity takes its greatest value around the ATM region for all models considered. Except for the Heston model, the vega remains positive and decreases exponentially to zero as the strike moves away from the spot price. In the lower graphs, we observe a similar shape for the change in the call option price in case of a jump. Denoted by \(\Delta \lambda \), it takes its greatest value in the OTM region and its curve widens when the expiry increases. It is worth noting that a longer memory does not necessarily result in a wider \(\Delta \lambda \) curve.
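To make the role of the three Greeks in Eq. (61) concrete, consider, as an illustrative assumption that is not part of the analysis above, a portfolio \(\Pi \) that is long one call and short \(\Delta S\) units of the underlying, with \(\Delta S\) held fixed over a short interval. A first-order expansion for small changes in the state variables, ignoring time decay and higher-order terms, suggests that the residual exposure of the delta-hedged position is carried by the volatility and jump sensitivities:

$$\begin{aligned} \Pi _{t}= & {} C-\Delta S\,S_{t},\\ \text {d}\Pi _{t}\approx & {} \Delta V\,\text {d}V_{t}+\Delta \lambda \,\text {d}\lambda _{t}. \end{aligned}$$

Under this simplification, models with similar \(\Delta S\) curves but different \(\Delta V\) and \(\Delta \lambda \) profiles would lead to different residual risks for the same delta hedge.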

Fig. 12 Sensitivity of the price risk \(\Delta S\) to the initial intensity \(\lambda _{0}\) of Hawkes processes. Model parameters used are in Table 1

Fig. 13 Sensitivity to the self-excitation parameter \(\lambda _{0}\) of the HHPL model. Model parameters used are in Table 1

Fig. 14 Vega and jump risk sensitivity of call prices expiring at 50 and 113 days. Model parameters used are in Table 1

5 Conclusions

In this article, we propose an extension of the Heston model that allows for self-excited jumps in the asset price and long memory of past price changes. This extension corresponds to an asset price process with a stochastic volatility and a jump component driven by a linear Hawkes process (HP). The memory feature of our model stems from the decay rate of the HP kernel function. Considering HP kernel functions that are Fourier transforms of the Laplace, Gaussian, Logistic and Cauchy measures, we generate different memory ranges, from the longest to the shortest. The self-excitation and memory properties of our framework enable us to have a realistic option pricing model that encompasses major characteristics of asset return time series, such as volatility clustering and long memory in volatility. However, the memory effect leads to a dependence on the past that makes our asset price process non-Markovian.

The HP whose kernel function is the Fourier transform of the Cauchy measure corresponds to the tractable exponential kernel. Except for the stochastic volatility model with jumps self-excited by this exponential kernel, our Heston extensions are non-Markovian processes. We use the Fourier transform representation of their kernel functions to establish a closed-form expression of the conditional moment generating functions of the HP’s intensity and the log-returns. Based on these results, we write the call option price according to the delta probability decomposition and derive Greeks for hedging purposes given the three sources of risk: the change in the underlying price, the volatility and the occurrence of a jump. By inverse Fourier transform of the conditional moment generating function of the log-returns, we find that, with equal parameters, the tails of its probability density function are fatter when the memory of the HP kernel function is longer.

A calibration of our models to the implied volatilities of the Euro Stoxx 50 as of the \(26^{th}\) of September 2019 achieves better performance than the one with the standard Heston model, particularly in the short term. Our models also provide a good fit of the term structure of the at-the-money volatility skew, approximated by power-law functions of the form \(T^{{\hat{H}}-\frac{1}{2}}\) with an estimated Hurst exponent \({\hat{H}}\) between 0.09 and 0.14. Considering the estimated parameters and varying the memory parameter, we see that the implied volatilities derived from our extensions of the Heston model increase as the memory length expands. This sensitivity to the memory length appears more pronounced for out-of-the-money call option prices. A similar conclusion is drawn with respect to the sensitivity of the implied volatilities to other self-excitation parameters. The analysis of the Greeks delta, vega and the jump risk reveals that our extensions of the Heston model have roughly the same delta curve as the standard Heston model. However, these models induce different shapes of sensitivity to volatility and jump risks in the option price, which do not seem to be related only to their memory lengths.

A next step will be to explore hedging strategies in our setting, in order to manage options whose underlying asset may experience some clustering of jumps. From a broader perspective, it would be interesting to extend this work to the pricing of bond, currency or exotic options, or of exotic derivatives which are path dependent. From an empirical point of view, it would be worthwhile to study the empirical relevance of such Hawkes memory kernels and to develop accurate procedures for calibrating them to financial time series.