1 Introduction

Interest rate risk management is of paramount importance for financial institutions such as banks and insurance companies. Several financial derivatives can be used for this purpose, e.g. interest rate swaps, zero-coupon futures, and options on such contracts. Developing pricing procedures for these derivatives is therefore essential. Indeed, the calibration of interest rate dynamics models relies on these pricing procedures. Furthermore, a strand of the literature studies the risk premium embedded in option prices; see for example Coval and Shumway (2001), Bakshi et al. (2023), Bakshi et al. (2022). In the context of interest rate options, Bakshi et al. (2023) recall the puzzling stylized fact of negative excess returns for both out-of-the-money call and put options on treasury futures, and propose a pricing kernel model that explains this feature. The ability to calculate option risk premia implied by interest rate models is therefore useful for analysing whether a given model is consistent with observed properties of option prices.

The main objective of this study is to provide procedures and formulas to obtain prices for several interest rate derivatives, namely swaps, swaptions, zero-coupon futures and European options on such futures. Expected excess return formulas for options on zero-coupon futures are also provided. The formulas presented are based on the discrete-time arbitrage-free Nelson–Siegel (DTAFNS) model of Eghbalzadeh et al. (2022), which is a discrete-time version of the original arbitrage-free Nelson–Siegel model developed in Christensen et al. (2011). This model has numerous advantages. Firstly, being within the family of affine term structure (ATS) models (see Duffie and Kan, 1996) where spot rates are linear combinations of risk factors, it is highly tractable. Secondly, it provides a clear interpretation for the factors driving term structure movements: they respectively drive the yield curve’s level, its slope and its curvature. Finally, it ensures absence of arbitrage.

Several other works also study the pricing of swaptions and other interest rate options in the context of multi-factor or ATS models. We list a few here. The pioneering works of Black et al. (1990) and Black and Karasinski (1991) show how to price zero-coupon options based respectively on a one-factor binomial tree model and a log-normal diffusion model. Munk (1999) demonstrates that the price of a European option on a coupon bond (e.g. a swaption) is roughly equal to some multiple of the price of a European option on a zero-coupon bond with maturity equal to the coupon bond’s stochastic duration. Collin-Dufresne and Goldstein (2002) propose to apply an Edgeworth expansion to approximate the density of the coupon bond price and obtain the price of a swaption. Singleton and Umantsev (2002) rely on Fourier inversion methods to calculate swaption prices in the ATS framework. Schrager and Pelsser (2006) propose to approximate the swap rate volatility under the swap measure, which is a low-variance martingale, by its time-zero value. This strategy leads to a closed-form formula for the swaption price.

The paper is structured as follows. Section 2 provides a description of the DTAFNS term structure model. Section 3 presents pricing procedures for swaptions, whereas Sect. 4 provides formulas for prices and expected excess returns of options on zero-coupon futures. Section 5 briefly discusses calibration methods for the DTAFNS model relying on option prices. Section 6 concludes.

2 The DTAFNS model

This section discusses interest rate dynamics in the DTAFNS model of Eghbalzadeh et al. (2022). Dynamics are provided under three probability measures: the physical measure, the risk-neutral measure and the forward measure. All three measures are required for the computation of prices and associated risk premia of derivatives considered in this study.

2.1 Risk-neutral dynamics in the DTAFNS model

This section provides a description of risk-neutral dynamics in the DTAFNS interest rate term structure model. Consider a discrete-time setting with monthly time points \(t=0,\ldots ,T\) and a time step of \(\Delta\) years between consecutive points. The filtration \(\mathcal {F}=\{\mathcal {F}_t\}^{T}_{t=0}\) characterizes the information flow in the market. The DTAFNS model assumes that the term structure of interest rates is determined by three factors: the long-term level of interest rates, the slope of the yield curve and its curvature. The time-t short rate applying over period \([t,t+1)\) is

$$\begin{aligned} r_t= X^{(1)}_t + X^{(2)}_t, \end{aligned}$$
(2.1)

with \(\{X_t\}^{T}_{t=0}\) denoting the term structure factor process, where time-t factors are the triplet \(X_t= [X^{(1)}_t, X^{(2)}_t, X^{(3)}_t]^\top\).

Under the risk-neutral measure \(\mathbb {Q}\), factors exhibit the following auto-regressive dynamics:

$$\begin{aligned} \left( {\begin{array}{c} X^{(1)}_{t+1}-X^{(1)}_t \\ X^{(2)}_{t+1}-X^{(2)}_t \\ X^{(3)}_{t+1}-X^{(3)}_t \\ \end{array} } \right)&= \underbrace{\left[ \begin{array}{ccc} 0 &{} 0 &{} 0 \\ 0 &{} \lambda &{} -\lambda \\ 0 &{} 0 &{} \lambda \end{array} \right] }_{ \kappa ^\mathbb {Q}} \underbrace{\left[ \begin{array}{c} \theta ^{\mathbb {Q}}_1-X^{(1)}_t \\ \theta ^{\mathbb {Q}}_2-X^{(2)}_t \\ \theta ^{\mathbb {Q}}_3-X^{(3)}_t \end{array} \right] }_{ \theta ^{\mathbb {Q}}-X_t } + \underbrace{\left( {\begin{array}{ccc} \Sigma _{1,1} &{} 0 &{} 0 \\ 0 &{} \Sigma _{2,2} &{} 0 \\ 0&{} 0 &{} \Sigma _{3,3} \\ \end{array} } \right) }_{\Sigma } \left( {\begin{array}{c} Z^{\mathbb {Q}}_{t+1,1} \\ Z^{\mathbb {Q}}_{t+1,2} \\ Z^{\mathbb {Q}}_{t+1,3} \\ \end{array} } \right) , \end{aligned}$$
(2.2)

where the scalar \(\lambda \in (0,1)\), the vector \(\theta ^{\mathbb {Q}}\) and the matrices \(\kappa ^{\mathbb {Q}}\) and \(\Sigma\), with \(\Sigma _{i,i}>0\), represent model parameters, and \(\{ Z^{\mathbb {Q}}_{t,i} \}^{T}_{t=1}\), \(i=1,2,3\), are \(\mathcal {F}\)-adapted standard Gaussian white noises with contemporaneous correlation \(\text {Corr}[Z^{\mathbb {Q}}_{t_1,i},Z^{\mathbb {Q}}_{t_2,j}] = \mathbbm {1}_{ \{t_1=t_2\} } \rho _{i,j}\) represented by the correlation matrix \(\rho = \left[ \rho _{i,j} \right] ^3_{i,j=1}\). We set \(\theta ^{\mathbb {Q}}_1=0\) since this parameter has no effect on the dynamics: the first row of \(\kappa ^{\mathbb {Q}}\) is zero.

As shown in Eghbalzadeh et al. (2022), the time-t price of a risk-free zero-coupon bond paying one dollar at maturity \(\mathcal {T}>t\) is, under this model,

$$\begin{aligned} P(t,\mathcal {T}) = A_\tau \exp \left[ -\Delta \mathcal {B}_\tau ^\top X_t \right] , \end{aligned}$$
(2.3)

where \(\tau =\mathcal {T}-t\), \(\mathcal {B}_\tau =\left[ \mathcal {B}^{(1)}_\tau , \,\, \mathcal {B}^{(2)}_\tau , \, \, \mathcal {B}^{(3)}_\tau \right] ^\top\) and

$$\begin{aligned} \mathcal {B}^{(1)}_\tau&= \tau , \quad \mathcal {B}^{(2)}_\tau = \dfrac{1-(1-\lambda )^\tau }{\lambda }, \quad \mathcal {B}^{(3)}_\tau = \frac{1-(1-\lambda )^{\tau -1}}{\lambda } - (\tau -1) (1-\lambda )^{\tau -1}, \end{aligned}$$
(2.4)
$$\begin{aligned} \log A_\tau&= -\Delta \theta ^{\mathbb {Q}}_2 \left( \mathcal {B}^{(1)}_\tau - \mathcal {B}^{(2)}_\tau \right) + \Delta \theta ^{\mathbb {Q}}_3 \mathcal {B}^{(3)}_\tau + \frac{1}{2} \Delta ^2\upsilon _\tau , \end{aligned}$$
(2.5)

with

$$\begin{aligned} \upsilon _\tau&= \left( \sum ^3_{i=1} \sum ^3_{j=1}\upsilon ^{(i,j)}_\tau \right) , \\ \upsilon ^{(1,1)}_\tau&= \Sigma ^2_{1,1}\dfrac{\tau (\tau -1)(2\tau -1)}{6}, \\ \upsilon ^{(2,2)}_\tau&= \frac{\Sigma ^2_{2,2}}{\lambda ^2} \left( \tau - 2 \left[ \frac{1-(1-\lambda )^\tau }{\lambda }\right] + \frac{1-(1-\lambda )^{2 \tau }}{1-(1-\lambda )^2}\right) , \\ \upsilon ^{(3,3)}_\tau&= \mathbbm {1}_{ \{ \tau> 1\} }\frac{\Sigma ^2_{3,3}}{\lambda ^2} \bigg [ \tau -2 + \zeta _0\left( (1-\lambda )^2,\tau -1\right) +\lambda ^2 \zeta _2\left( (1-\lambda )^2,\tau -1\right) \\&\quad - 2\zeta _0\left( (1-\lambda ),\tau -1\right) - 2\lambda \zeta _1\left( (1-\lambda ),\tau -1\right) + 2\lambda \zeta _1\left( (1-\lambda )^2,\tau -1\right) \bigg ], \\ \upsilon ^{(1,2)}_\tau&= \upsilon ^{(2,1)}_\tau = \rho _{1,2}\Sigma _{1,1} \Sigma _{2,2} \frac{1}{\lambda }\left( \frac{\tau (\tau -1)}{2} - \zeta _1 ((1-\lambda ),\tau )\right) , \\ \upsilon ^{(1,3)}_\tau&= \upsilon ^{(3,1)}_\tau = \mathbbm {1}_{ \{ \tau> 1\} }\rho _{1,3}\Sigma _{1,1} \Sigma _{3,3} \frac{1}{\lambda } \\&\qquad \bigg [\frac{\tau (\tau -1)}{2}-1 -\zeta _0\left( (1-\lambda ),\tau -1\right) -(\lambda +1)\zeta _1\left( (1-\lambda ),\tau -1\right) \\&\quad -\lambda \zeta _2\left( (1-\lambda ),\tau -1\right) \bigg ], \\ \upsilon ^{(2,3)}_\tau&= \upsilon ^{(3,2)}_\tau = \mathbbm {1}_{ \{ \tau > 1\} } \rho _{2,3}\Sigma _{2,2} \Sigma _{3,3} \\&\quad \bigg ( \frac{\tau -2- (2-\lambda )\zeta _0\left( (1-\lambda ),\tau -1\right) + (1-\lambda )\zeta _0\left( (1-\lambda )^2,\tau -1\right) }{\lambda ^2} \\&\quad + \frac{- \zeta _1\left( (1-\lambda ),\tau -1\right) + (1-\lambda )\zeta _1\left( (1-\lambda )^2,\tau -1\right) }{\lambda } \bigg ), \end{aligned}$$
(2.6)

and

$$\begin{aligned} \zeta _0(r,\tau )&\equiv \sum _{u=1}^{\tau -1}r^{u} = \dfrac{r-r^{\tau }}{1-r}, \end{aligned}$$
(2.7)
$$\begin{aligned} \zeta _1(r,\tau )&\equiv \sum _{u=1}^{\tau -1}u r^{u} = \dfrac{r- \tau r^{\tau }+(\tau -1) r^{\tau +1}}{(1-r)^2}, \end{aligned}$$
(2.8)
$$\begin{aligned} \zeta _2(r,\tau )&\equiv \sum _{u=1}^{\tau -1}u^2 r^{u} = \dfrac{ -(\tau -1)^2 r^{\tau +2} + (2\tau ^2-2\tau -1)r^{\tau +1} - \tau ^2 r^{\tau } + r^2+r}{(1-r)^3}. \end{aligned}$$
(2.9)
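As a quick numerical sanity check of (2.7)–(2.9), the closed forms can be compared with the direct sums. The sketch below is illustrative only; the function names `zeta_direct`, `zeta0`, `zeta1` and `zeta2` are hypothetical and not taken from the paper.

```python
def zeta_direct(k, r, tau):
    """Reference value: sum_{u=1}^{tau-1} u**k * r**u, computed term by term."""
    return sum(u**k * r**u for u in range(1, tau))


def zeta0(r, tau):
    """Closed form (2.7)."""
    return (r - r**tau) / (1.0 - r)


def zeta1(r, tau):
    """Closed form (2.8)."""
    return (r - tau * r**tau + (tau - 1) * r ** (tau + 1)) / (1.0 - r) ** 2


def zeta2(r, tau):
    """Closed form (2.9)."""
    num = (-((tau - 1) ** 2) * r ** (tau + 2)
           + (2 * tau**2 - 2 * tau - 1) * r ** (tau + 1)
           - tau**2 * r**tau + r**2 + r)
    return num / (1.0 - r) ** 3


if __name__ == "__main__":
    lam = 0.05  # assumed illustrative value of lambda
    for r in (1.0 - lam, (1.0 - lam) ** 2):
        print(r, zeta0(r, 12), zeta1(r, 12), zeta2(r, 12))
```

The closed forms require \(r \ne 1\), which holds for the arguments \(1-\lambda\) and \((1-\lambda )^2\) used in this paper since \(\lambda \in (0,1)\).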

Remark 2.1

For the rest of the paper, the convention \(\mathcal {B}^{(1)}_0=\mathcal {B}^{(2)}_0=\mathcal {B}^{(3)}_0=0\), \(A_0=1\) and \(\upsilon _{0}=0\) is used, which makes (2.3) hold for \(\tau =0\).
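The loadings in (2.4), together with the convention of Remark 2.1, can be cross-checked against a one-step recursion. The sketch below assumes the recursion \(\mathcal {B}_\tau = (I-\kappa ^{\mathbb {Q}})^\top \mathcal {B}_{\tau -1} + (1,1,0)^\top\), which follows from (2.1)–(2.2) by conditioning one period ahead; this recursion and the function names are our own illustration and are not stated in the paper.

```python
def loadings(lam, tau):
    """Closed-form B_tau from (2.4), with B_0 = 0 as in Remark 2.1."""
    if tau == 0:
        return [0.0, 0.0, 0.0]
    q = 1.0 - lam
    b1 = float(tau)
    b2 = (1.0 - q**tau) / lam
    b3 = (1.0 - q ** (tau - 1)) / lam - (tau - 1) * q ** (tau - 1)
    return [b1, b2, b3]


def loadings_recursive(lam, tau):
    """Same loadings built step by step (assumed recursion, see lead-in)."""
    b = [0.0, 0.0, 0.0]
    for _ in range(tau):
        # (I - kappa^Q)^T b + (1, 1, 0): the short rate loads on factors 1 and 2.
        b = [b[0] + 1.0,
             (1.0 - lam) * b[1] + 1.0,
             lam * b[1] + (1.0 - lam) * b[2]]
    return b


if __name__ == "__main__":
    print(loadings(0.05, 24))
    print(loadings_recursive(0.05, 24))
```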

2.2 Forward measure dynamics in the DTAFNS model

Prices of financial derivatives are expressed as expected discounted payoffs under the risk-neutral measure. However, the interaction between the stochastic discount factor and the derivative's payoff is often non-trivial and complicates pricing. A common technique to ease the calculation of prices in the context of stochastic interest rates is the change of numéraire (see for instance Geman et al., 1995). This approach relies on the construction of a new probability measure, called the forward measure, under which the price of a zero-coupon bond is used as a numéraire for discounting. This allows discounting directly with zero-coupon bond prices, thus circumventing the difficulty associated with representing the potentially complex dependence between the payoff and the stochastic discount factor.

The probability measure using the risk-free zero-coupon bond maturing at time \(\mathcal {T}\) as a numéraire is known as the \(\mathcal {T}\)-forward measure and is denoted by \(\mathbb {Q}^\mathcal {T}\). The Radon–Nikodym derivative for passing from the risk-neutral to the \(\mathcal {T}\)-forward measure, which is provided by Jamshidian (1996) or Brigo and Mercurio (2007), is

$$\begin{aligned} \dfrac{d\mathbb {Q}^{\mathcal {T}}}{d\mathbb {Q}}=\dfrac{B(0) P(\mathcal {T},\mathcal {T})}{P(0,\mathcal {T})B(\mathcal {T})}=\dfrac{D(0,\mathcal {T})}{P(0,\mathcal {T})}, \end{aligned}$$
(2.10)

where \(B(t)=\exp (\Delta \sum _{s=0}^{t-1}r_s)\) is the time-t value of the bank account numéraire associated with the risk-neutral measure and \(D(t_1,t_2) = B(t_1)/B(t_2)\) is the stochastic discount factor for any \(0\le t_1 \le t_2 \le T\). Note that the Radon–Nikodym derivative for passing from the forward measure back to the risk-neutral measure is \(\dfrac{d\mathbb {Q}}{d\mathbb {Q}^{\mathcal {T}}} = \left( \dfrac{d\mathbb {Q}^{\mathcal {T}}}{d\mathbb {Q}}\right) ^{-1}\).

Let \(\mathbb {E}^{\mathcal {T}}[\cdot ]\) represent the expectation under the \(\mathcal {T}\)-forward measure. Asset prices discounted by the zero-coupon price maturing at \(\mathcal {T}\) are martingales under the forward measure (Geman, 1989). As a consequence, as discussed in Brigo and Mercurio (2007), the time-t price \(H_t\) of an asset providing a payoff \(H_\mathcal {T}\) at time \(\mathcal {T}\) is

$$\begin{aligned} H_t= P(t,\mathcal {T})\mathbb {E}^{\mathcal {T}}\left[ H_\mathcal {T}|\mathcal {F}_t\right] . \end{aligned}$$
(2.11)

The following proposition defines so-called forward measure innovations and outlines their dynamics.

Proposition 2.1

For any \(\mathcal {T} \in \{1,\ldots ,T \}\), \(t<\mathcal {T}\) and \(\tau =\mathcal {T}-t\), conditional on \(\mathcal {F}_t\), the forward measure innovation defined as \(Z^{\mathcal {T}}_{t+1}=Z^{\mathbb {Q}}_{t+1}+ \Delta \rho \Sigma \mathcal {B}_{\tau -1}\) follows the multivariate Gaussian distribution with mean vector zero and covariance matrix \(\rho\) under the \(\mathcal {T}\)-forward measure.

Proof

See “Appendix”. \(\square\)

Corollary 2.1

Since the conditional distribution of \(Z^{\mathcal {T}}_{t+1}\) with respect to \(\mathcal {F}_t\) under the \(\mathcal {T}\)-forward measure does not depend on \(Z^{\mathcal {T}}_1, \dots ,Z^{\mathcal {T}}_t\), and since the latter variables characterize the information contained in \(\mathcal {F}_t\), elements of the sequence \(\{ Z^{\mathcal {T}}_{j} \}^{\mathcal {T}}_{j=1}\) are independent.

Based on the above results, we now provide an expression for the dynamics of term structure factors \(\{ X_t\}^T_{t=0}\) analogous to (2.2), but using instead the \(\mathcal {T}\)-forward measure innovations. Define

$$\begin{aligned} \theta ^{\mathcal {T}} = \theta ^{\mathbb {Q}}, \quad \kappa ^{\mathcal {T}} = \kappa ^{\mathbb {Q}}, \quad \eta ^{\mathcal {T}}_t=\Delta \Sigma \rho \Sigma \mathcal {B}_{\tau -1} = \Delta \Sigma \rho \Sigma \mathcal {B}_{\mathcal {T}-t-1}. \end{aligned}$$
(2.12)

A direct consequence of the application of Proposition 2.1 into (2.2) is that

$$\begin{aligned} X_{t+1}&=X_t-\eta ^{\mathcal {T}}_t+\kappa ^{\mathcal {T}} (\theta ^{\mathcal {T}}-X_t)+\Sigma Z_{t+1}^{\mathcal {T}}. \end{aligned}$$
(2.13)
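To make the drift adjustment in (2.12)–(2.13) concrete, the following minimal sketch computes \(\eta ^{\mathcal {T}}_t\) and applies one step of (2.13) for a given draw of the forward-measure innovation. The function names and the list-based representation of \(\Sigma\) (its diagonal) and \(\rho\) are illustrative assumptions.

```python
def loadings(lam, tau):
    """B_tau from (2.4), with B_0 = 0 (Remark 2.1)."""
    if tau == 0:
        return [0.0, 0.0, 0.0]
    q = 1.0 - lam
    return [float(tau),
            (1.0 - q**tau) / lam,
            (1.0 - q ** (tau - 1)) / lam - (tau - 1) * q ** (tau - 1)]


def eta_forward(delta, sig, rho, lam, T, t):
    """eta^T_t = Delta * Sigma rho Sigma B_{T-t-1}, see (2.12); sig holds diag(Sigma)."""
    b = loadings(lam, T - t - 1)
    return [delta * sig[i] * sum(rho[i][j] * sig[j] * b[j] for j in range(3))
            for i in range(3)]


def forward_step(x, z, delta, sig, rho, lam, theta, T, t):
    """One step of (2.13): returns X_{t+1} given X_t = x and innovation Z^T_{t+1} = z."""
    e = eta_forward(delta, sig, rho, lam, T, t)
    # kappa^Q (theta^Q - x), with the first row of kappa^Q equal to zero.
    drift = [0.0,
             lam * (theta[1] - x[1]) - lam * (theta[2] - x[2]),
             lam * (theta[2] - x[2])]
    return [x[i] - e[i] + drift[i] + sig[i] * z[i] for i in range(3)]
```

Since \(\mathcal {B}_0=0\), the adjustment vanishes at \(t=\mathcal {T}-1\), so the last step before \(\mathcal {T}\) has the same drift as under \(\mathbb {Q}\).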

Representation (2.13), along with some additional lemmas provided in “Appendix”, allows one to obtain the t-conditional distribution of \(X_{\mathcal {T}}\) under the \(\mathcal {T}\)-forward measure.

Proposition 2.2

Under the \(\mathcal {T}\)-forward measure, conditionally on \(\mathcal {F}_t\) and for \(t+n \le \mathcal {T}\), factors \(X_{t+n}\) follow the multivariate Gaussian distribution with mean vector \(\mathcal {M}_{t,n}=\left[ \mathcal {M}^{(i)}_{t,n}\right] ^3_{i=1}\) and covariance matrix \(\mathcal {V}_{n}=\left[ \mathcal {V}^{(i,j)}_{n}\right] ^3_{i,j=1}\) where

$$\begin{aligned} \mathcal {M}^{(1)}_{t,n}&=X^{(1)}_{t} - \sum _{l=0}^{n-1}\eta ^{\mathcal {T},(1)}_{t+l}, \\ \mathcal {M}^{(2)}_{t,n}&=X^{(2)}_{t}(1-\lambda )^{n}+(\theta ^\mathcal {T}_2-\theta ^\mathcal {T}_3)\left( 1-(1-\lambda )^n\right) - \sum _{l=0}^{n-1}\eta ^{\mathcal {T},(2)}_{t+l} (1- \lambda )^{n-1-l}\\&\quad + \lambda \bigg (nX^{(3)}_{t} (1-\lambda )^{n-1} + \theta ^{\mathcal {T}}_3 \left( \dfrac{1-(1-\lambda )^n}{\lambda }-n(1-\lambda )^{n-1} \right) \\&\quad -\sum _{l=0}^{n-1}(n-l-1) \eta ^{\mathcal {T},(3)}_{t+l} (1-\lambda )^{n-l-2}\bigg ),\\ \mathcal {M}^{(3)}_{t,n}&=X^{(3)}_{t}(1-\lambda )^{n}+ \theta ^\mathcal {T}_3\left( 1-(1-\lambda )^n\right) - \sum _{l=0}^{n-1}\eta ^{\mathcal {T},(3)}_{t+l} (1- \lambda )^{n-1-l}, \\ \mathcal {V}^{(1,1)}_{n}&= n \Sigma _{1,1}^2,\\ \mathcal {V}^{(2,2)}_{n}&= \Sigma _{2,2}^2 \left( 1+\zeta _0((1-\lambda )^2,n)\right) + \lambda ^2\Sigma _{3,3}^2(1-\lambda )^{-2}\zeta _2((1-\lambda )^2,n)\\&\quad +2\Sigma _{2,2} \lambda \Sigma _{3,3}\rho _{2,3}(1-\lambda )^{-1}\zeta _1\left( \left( 1-\lambda \right) ^2,n\right) ,\\ \mathcal {V}^{(3,3)}_{n}&= \Sigma _{3,3}^2 \left( 1+\zeta _0((1-\lambda )^2,n)\right) ,\\ \mathcal {V}^{(1,2)}_{n}&=\mathcal {V}^{(2,1)}_{n}=\Sigma _{1,1}\Sigma _{2,2}\rho _{1,2} \left( 1+\zeta _0(1-\lambda ,n) \right) +\lambda \Sigma _{1,1}\Sigma _{3,3}\rho _{1,3}\dfrac{\zeta _1(1-\lambda ,n)}{1-\lambda },\\ \mathcal {V}^{(1,3)}_{n}&=\mathcal {V}^{(3,1)}_{n}=\Sigma _{1,1}\Sigma _{3,3}\rho _{1,3} \left( 1+ \zeta _0(1-\lambda ,n)\right) ,\\ \mathcal {V}^{(2,3)}_{n}&=\mathcal {V}^{(3,2)}_{n}=\Sigma _{2,2}\Sigma _{3,3} \rho _{2,3}\left( 1+\zeta _0((1-\lambda )^2,n)\right) +\lambda \Sigma _{3,3}^2\dfrac{\zeta _1((1-\lambda )^2,n)}{1-\lambda }. \end{aligned}$$

Proof

See “Appendix”. \(\square\)
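The covariance entries of Proposition 2.2 can be checked numerically against the recursion \(\mathcal {V}_{k+1} = \Phi \mathcal {V}_k \Phi ^\top + \Sigma \rho \Sigma\) with \(\Phi = I-\kappa ^{\mathbb {Q}}\) and \(\mathcal {V}_0=0\), which follows from (2.13) because the drift adjustment \(\eta ^{\mathcal {T}}_t\) is deterministic. The sketch below is an illustration under that reading; all function names are hypothetical.

```python
def zeta0(r, n):
    return (r - r**n) / (1.0 - r)


def zeta1(r, n):
    return (r - n * r**n + (n - 1) * r ** (n + 1)) / (1.0 - r) ** 2


def zeta2(r, n):
    num = (-((n - 1) ** 2) * r ** (n + 2) + (2 * n**2 - 2 * n - 1) * r ** (n + 1)
           - n**2 * r**n + r**2 + r)
    return num / (1.0 - r) ** 3


def cov_closed_form(n, lam, sig, rho):
    """Covariance matrix V_n of Proposition 2.2; sig holds diag(Sigma)."""
    q = 1.0 - lam
    s1, s2, s3 = sig
    v = [[0.0] * 3 for _ in range(3)]
    v[0][0] = n * s1**2
    v[1][1] = (s2**2 * (1 + zeta0(q * q, n))
               + lam**2 * s3**2 * zeta2(q * q, n) / q**2
               + 2 * s2 * lam * s3 * rho[1][2] * zeta1(q * q, n) / q)
    v[2][2] = s3**2 * (1 + zeta0(q * q, n))
    v[0][1] = v[1][0] = (s1 * s2 * rho[0][1] * (1 + zeta0(q, n))
                         + lam * s1 * s3 * rho[0][2] * zeta1(q, n) / q)
    v[0][2] = v[2][0] = s1 * s3 * rho[0][2] * (1 + zeta0(q, n))
    v[1][2] = v[2][1] = (s2 * s3 * rho[1][2] * (1 + zeta0(q * q, n))
                         + lam * s3**2 * zeta1(q * q, n) / q)
    return v


def cov_recursive(n, lam, sig, rho):
    """V_n from V_{k+1} = Phi V_k Phi^T + Sigma rho Sigma, starting at V_0 = 0."""
    q = 1.0 - lam
    phi = [[1.0, 0.0, 0.0], [0.0, q, lam], [0.0, 0.0, q]]
    omega = [[sig[i] * rho[i][j] * sig[j] for j in range(3)] for i in range(3)]
    v = [[0.0] * 3 for _ in range(3)]
    for _ in range(n):
        pv = [[sum(phi[i][k] * v[k][j] for k in range(3)) for j in range(3)]
              for i in range(3)]
        v = [[sum(pv[i][k] * phi[j][k] for k in range(3)) + omega[i][j]
              for j in range(3)] for i in range(3)]
    return v
```

The recursive version is convenient when closed forms are not needed, whereas the closed form avoids a loop over \(n\) in calibration settings.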

The following quantities appearing in Proposition 2.2 can be further simplified.

Lemma 2.1

Considering the case \(n = \mathcal {T}-t\),

$$\begin{aligned} \sum _{l=0}^{\mathcal {T}-t-1}\eta ^{\mathcal {T},(1)}_{t+l}&=\Delta \Sigma _{1,1}\bigg [\Sigma _{1,1} \frac{(\mathcal {T}-t-1)(\mathcal {T}-t)}{2} \\&\quad +\dfrac{\Sigma _{2,2} \rho _{1,2}}{\lambda } \left( \mathcal {T}-t-1-\zeta _0\left( 1-\lambda ,\mathcal {T}-t\right) \right) \\&\quad +\Sigma _{3,3} \rho _{1,3}\bigg ( \dfrac{\mathcal {T}-t-1-\left( 1+\zeta _0\left( 1-\lambda ,\mathcal {T}-t-1\right) \right) }{\lambda }\\&\quad -\zeta _1\left( 1-\lambda , \mathcal {T}-t-1\right) \bigg )\bigg ]. \end{aligned}$$

Moreover, for \(i=2,3\),

$$\begin{aligned}&\sum _{l=0}^{\mathcal {T}-t-1}\eta ^{\mathcal {T},(i)}_{t+l} (1 \!-\! \lambda )^{\mathcal {T}-t-1-l}=\Delta \Sigma _{i,i}\bigg [\Sigma _{1,1} \rho _{i,1} \zeta _1\left( 1-\lambda ,\mathcal {T}-t\right) \\&\quad +\dfrac{\Sigma _{2,2} \rho _{i,2}}{\lambda } \left( \zeta _0\left( 1-\lambda ,\mathcal {T}-t\right) -\zeta _0 \left( \left( 1-\lambda \right) ^{2},\mathcal {T}-t\right) \right) \\&\quad +\Sigma _{3,3} \rho _{i,3}\bigg (\dfrac{\zeta _0\left( 1-\lambda ,\mathcal {T}-t\right) -(1-\lambda )^{-1}\zeta _0\left( \left( 1-\lambda \right) ^{2},\mathcal {T}-t\right) }{\lambda }\\&\quad -(1-\lambda )\zeta _1\left( \left( 1-\lambda \right) ^{2},\mathcal {T}-t-1\right) \bigg )\bigg ]. \end{aligned}$$

Lastly,

$$\begin{aligned}&\sum _{l=0}^{\mathcal {T}-t-1}(\mathcal {T}-t-l-1) \eta ^{\mathcal {T},(3)}_{t+l} (1-\lambda )^{\mathcal {T}-t-l-2}\\&\quad = \frac{\Delta \Sigma _{3,3}}{1-\lambda } \bigg ( \Sigma _{1,1} \rho _{3,1} \zeta _2\left( 1-\lambda ,\mathcal {T}-t\right) \\&\qquad+ \frac{\Sigma _{2,2} \rho _{3,2}}{\lambda } \left[ \zeta _1\left( 1-\lambda ,\mathcal {T}-t\right) \right. \left. - \zeta _1\left( (1-\lambda )^2,\mathcal {T}-t\right) \right] \\&\quad \quad + \Sigma _{3,3} \rho _{3,3} \left[ \frac{\zeta _1\left( 1-\lambda ,\mathcal {T}-t\right) }{\lambda } - \frac{1}{\lambda }\zeta _1\left( (1-\lambda )^2,\mathcal {T}-t\right) \right. \\&\qquad - (1-\lambda )^{-1}\zeta _2\left( (1-\lambda )^2,\mathcal {T}-t\right) \bigg] \bigg ). \end{aligned}$$

Proof

See “Appendix”. \(\square\)

2.3 Physical measure dynamics in the DTAFNS model

To determine option risk premia and expected excess returns, dynamics of interest rates under the physical measure \(\mathbb {P}\) must be specified. The \(\mathbb {P}\)-dynamics considered in Eghbalzadeh et al. (2022) are used here since they are shown in that paper to be naturally compatible with the \(\mathbb {Q}\)-dynamics outlined in Sect. 2.1; that paper also provides the form of the pricing kernel linking the \(\mathbb {P}\)-measure to the \(\mathbb {Q}\)-measure.

Under the physical measure \(\mathbb {P}\), factors are assumed to have the following auto-regressive dynamics:

$$\begin{aligned} \left( {\begin{array}{c} X^{(1)}_{t+1}-X^{(1)}_t \\ X^{(2)}_{t+1}-X^{(2)}_t \\ X^{(3)}_{t+1}-X^{(3)}_t \\ \end{array} } \right)&= \underbrace{\left[ \begin{array}{ccc} \kappa ^\mathbb {P}_{1,1} &{} 0 &{} 0 \\ 0 &{} \kappa ^\mathbb {P}_{2,2} &{} -\lambda \\ 0 &{} 0 &{} \kappa ^\mathbb {P}_{3,3} \end{array} \right] }_{ \kappa ^\mathbb {P}} \underbrace{\left[ \begin{array}{c} \theta ^{\mathbb {P}}_1-X^{(1)}_t \\ \theta ^{\mathbb {P}}_2-X^{(2)}_t \\ \theta ^{\mathbb {P}}_3-X^{(3)}_t \end{array} \right] }_{ \theta ^{\mathbb {P}}-X_t } + \underbrace{\left( {\begin{array}{ccc} \Sigma _{1,1} &{} 0 &{} 0 \\ 0 &{} \Sigma _{2,2} &{} 0 \\ 0&{} 0 &{} \Sigma _{3,3} \\ \end{array} } \right) }_{\Sigma } \left( {\begin{array}{c} Z^{\mathbb {P}}_{t+1,1} \\ Z^{\mathbb {P}}_{t+1,2} \\ Z^{\mathbb {P}}_{t+1,3} \\ \end{array} } \right) , \end{aligned}$$
(2.14)

where \(\kappa ^\mathbb {P}_{1,1} \in [0,1)\), \(\kappa ^\mathbb {P}_{i,i} \in (0,1)\), \(i=2,3\), and \(\{(Z^{\mathbb {P}}_{t,1},Z^{\mathbb {P}}_{t,2},Z^{\mathbb {P}}_{t,3})\}^T_{t=1}\) is again a 3-dimensional Gaussian standard white noise with \(\text {Corr}[Z^{\mathbb {P}}_{t_1,i},Z^{\mathbb {P}}_{t_2,j}] = \mathbbm {1}_{ \{t_1=t_2\} } \rho _{i,j}\). The \(\mathbb {P}\)-dynamics are slightly more flexible than the \(\mathbb {Q}\)-dynamics since the diagonal components of \(\kappa ^\mathbb {P}\) are not required to be either 0 or \(\lambda\). One possibility would be to impose \(\kappa ^\mathbb {P}_{1,1}=0\) to replicate the non-stationary dynamics of factor \(X^{(1)}\) under \(\mathbb {Q}\). Nevertheless, Eghbalzadeh et al. (2022) argue that not imposing this restriction provides a better fit for their dataset.

Under the above \(\mathbb {P}\)-dynamics, Eghbalzadeh et al. (2022) provide the relationship between the long-term mean parameters \(\theta ^\mathbb {P}\) and \(\theta ^\mathbb {Q}\) for both measures:

$$\begin{aligned} \theta _2^{\mathbb {Q}} = \lambda ^{-1}(\kappa _{2,2}^{\mathbb {P}} \theta _2^{\mathbb {P}} + \kappa _{3,3}^{\mathbb {P}} \theta _3^{\mathbb {P}} - \lambda \theta _3^{\mathbb {P}}), \quad \theta _3^{\mathbb {Q}} = \lambda ^{-1} \kappa _{3,3}^{\mathbb {P}} \theta _3^{\mathbb {P}} \end{aligned}$$

and

$$\begin{aligned} \theta _3^{\mathbb {P}}=\lambda \theta _3^{\mathbb {Q}}/\kappa ^{\mathbb {P}}_{3,3}, \quad \theta _2^{\mathbb {P}}=\frac{\lambda }{\kappa ^{\mathbb {P}}_{2,2}} \left[ \theta _2^{\mathbb {Q}} -\frac{\theta _3^{\mathbb {Q}}}{\kappa ^{\mathbb {P}}_{3,3}}(\kappa ^{\mathbb {P}}_{3,3}-\lambda ) \right] , \end{aligned}$$

whereas the \(\kappa ^{\mathbb {P}}_{i,i}\), \(i=1,2,3\) are allowed to vary freely.
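The two sets of relationships above are inverses of each other, which can be checked with a short round-trip computation. The sketch below is illustrative; the function names and numerical values are assumptions, not values from the paper.

```python
def theta_q_from_p(theta2_p, theta3_p, k22, k33, lam):
    """Map (theta_2^P, theta_3^P) to (theta_2^Q, theta_3^Q)."""
    theta2_q = (k22 * theta2_p + k33 * theta3_p - lam * theta3_p) / lam
    theta3_q = k33 * theta3_p / lam
    return theta2_q, theta3_q


def theta_p_from_q(theta2_q, theta3_q, k22, k33, lam):
    """Inverse map from (theta_2^Q, theta_3^Q) to (theta_2^P, theta_3^P)."""
    theta3_p = lam * theta3_q / k33
    theta2_p = (lam / k22) * (theta2_q - theta3_q * (k33 - lam) / k33)
    return theta2_p, theta3_p


if __name__ == "__main__":
    lam, k22, k33 = 0.05, 0.04, 0.07        # hypothetical parameter values
    t2q, t3q = theta_q_from_p(-0.01, 0.005, k22, k33, lam)
    print(t2q, t3q)
    print(theta_p_from_q(t2q, t3q, k22, k33, lam))  # recovers (-0.01, 0.005) up to rounding
```

One way to read these relationships is that they equate the constant terms of the drifts of \(X^{(2)}\) and \(X^{(3)}\) in (2.2) and (2.14): \(\kappa ^{\mathbb {P}}_{2,2}\theta ^{\mathbb {P}}_2-\lambda \theta ^{\mathbb {P}}_3 = \lambda \theta ^{\mathbb {Q}}_2-\lambda \theta ^{\mathbb {Q}}_3\) and \(\kappa ^{\mathbb {P}}_{3,3}\theta ^{\mathbb {P}}_3 = \lambda \theta ^{\mathbb {Q}}_3\).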

Remark 2.2

A straightforward extension of the model could consist in still using (2.14) for \(\mathbb {P}\)-dynamics, but instead considering a physical volatility matrix \(\Sigma ^{\mathbb {P}}\) whose components differ from (and are most likely lower than) those of the risk-neutral volatility \(\Sigma\). This could help produce negative excess returns for out-of-the-money options (either calls or puts) on zero-coupon futures, which are observed empirically as documented in Bakshi et al. (2023).

The following proposition, whose proof is in “Appendix”, is analogous to Proposition 2.2 and provides transition distributions for factors X under the physical measure \(\mathbb {P}\). This proposition is used subsequently in the derivation of option expected excess returns.

Proposition 2.3

Assume \(\kappa ^{\mathbb {P}}_{2,2} \ne \kappa ^{\mathbb {P}}_{3,3}\) and define \(\omega =\frac{1-\kappa ^{\mathbb {P}}_{3,3}}{1-\kappa ^{\mathbb {P}}_{2,2}}\). Under measure \(\mathbb {P}\), conditionally on \(\mathcal {F}_t\) and for \(t+n \le T\), factors \(X_{t+n}\) follow the multivariate Gaussian distribution with mean vector \(\mathcal {M}^\mathbb {P}_{t,n}=\left[ \mathcal {M}^{\mathbb {P},(i)}_{t,n}\right] ^3_{i=1}\) and covariance matrix \(\mathcal {V}^\mathbb {P}_{n}=\left[ \mathcal {V}^{\mathbb {P},(i,j)}_{n}\right] ^3_{i,j=1}\) where

$$\begin{aligned} \mathcal {M}^{\mathbb {P},(i)}_{t,n}&= X^{(i)}_{t}(1-\kappa ^{\mathbb {P}}_{i,i})^{n}+ \theta ^\mathbb {P}_i\left( 1-(1-\kappa ^{\mathbb {P}}_{i,i})^n\right) \\&\quad+ \mathbbm {1}_{ \{i=2\} } \lambda (X^{(3)}_{t}-\theta ^{\mathbb {P}}_3)\frac{1-\omega ^n}{1-\omega } (1-\kappa ^{\mathbb {P}}_{2,2} )^{n-1} \end{aligned}$$

and

$$\begin{aligned} \mathcal {V}^{\mathbb {P},(1,1)}_{n}&= {\left\{ \begin{array}{ll} \Sigma _{1,1}^2 \left( 1+\zeta _0((1-\kappa ^{\mathbb {P}}_{1,1})^2,n)\right) \quad \text { if } \kappa ^{\mathbb {P}}_{1,1}\in (0,1),\\ n \Sigma _{1,1}^2 \quad \text { if } \kappa ^{\mathbb {P}}_{1,1}=0, \end{array}\right. }\\ \mathcal {V}^{\mathbb {P},(2,2)}_{n}&= \Sigma _{2,2}^2 \left( 1+\zeta _0((1-\kappa ^{\mathbb {P}}_{2,2})^2,n)\right) \\&\quad + \lambda ^2 \Sigma ^2_{3,3} \frac{(1-\kappa ^{\mathbb {P}}_{2,2} )^{2n-2}}{(1-\omega )^2} \left[ \zeta _0\left( (1-\kappa ^{\mathbb {P}}_{2,2})^{-2},n\right) -2\omega ^n \zeta _0\left( \omega (1-\kappa ^{\mathbb {P}}_{3,3})^{-2},n\right) \right. \\&\quad \left. + \omega ^{2n} \zeta _0\left( (1-\kappa ^{\mathbb {P}}_{3,3})^{-2},n\right) \right] \\&\quad +2 \frac{\rho _{2,3} \lambda \Sigma _{2,2} \Sigma _{3,3}}{1-\omega } (1-\kappa ^{\mathbb {P}}_{2,2} )^{(2n-1)} \left[ \zeta _0\left( \frac{\omega }{(1-\kappa ^{\mathbb {P}}_{2,2})(1-\kappa ^{\mathbb {P}}_{3,3})},n\right) \right. \\&\quad \left. 
-\omega ^n \zeta _0\left( \frac{1}{(1-\kappa ^{\mathbb {P}}_{2,2})(1-\kappa ^{\mathbb {P}}_{3,3})},n\right) \right] \\ \mathcal {V}^{\mathbb {P},(3,3)}_{n}&=\Sigma _{3,3}^2 \left( 1+\zeta _0((1-\kappa ^{\mathbb {P}}_{3,3})^2,n)\right) \\ \mathcal {V}^{\mathbb {P},(1,2)}_{n}&=\Sigma _{1,1}\Sigma _{2,2}\rho _{1,2} \left[ 1+\zeta _0((1-\kappa ^{\mathbb {P}}_{1,1})(1-\kappa ^{\mathbb {P}}_{2,2}),n) \right] \\&\quad +\lambda \Sigma _{1,1}\Sigma _{3,3}\rho _{1,3} \frac{(1-\kappa ^{\mathbb {P}}_{1,1})^n (1-\kappa ^{\mathbb {P}}_{2,2})^{n-1}}{1-\omega } \times \\&\quad \left( \zeta _0 \left( \frac{\omega }{(1-\kappa ^{\mathbb {P}}_{1,1})(1-\kappa ^{\mathbb {P}}_{3,3})},n\right) - \omega ^n \zeta _0 \left( \frac{1}{(1-\kappa ^{\mathbb {P}}_{1,1})(1-\kappa ^{\mathbb {P}}_{3,3})},n\right) \right) ,\\ \mathcal {V}^{\mathbb {P},(1,3)}_{n}&=\Sigma _{1,1}\Sigma _{3,3}\rho _{1,3} \left[ 1+\zeta _0( (1-\kappa ^{\mathbb {P}}_{1,1})(1-\kappa ^{\mathbb {P}}_{3,3}),n) \right] ,\\ \mathcal {V}^{\mathbb {P},(2,3)}_{n}&=\Sigma _{2,2}\Sigma _{3,3} \rho _{2,3} \left[ 1+\zeta _0((1-\kappa ^{\mathbb {P}}_{2,2})(1-\kappa ^{\mathbb {P}}_{3,3}),n)\right] \\&\quad +\lambda \Sigma _{3,3}^2 \frac{(1-\kappa ^{\mathbb {P}}_{2,2} )^{n-1} (1-\kappa ^{\mathbb {P}}_{3,3} )^{n}}{1-\omega } \left( \zeta _0\left( \frac{1}{(1-\kappa ^{\mathbb {P}}_{2,2} )(1-\kappa ^{\mathbb {P}}_{3,3} )},n\right) \right. \\&\quad \left. - \omega ^n \zeta _0\left( (1-\kappa ^{\mathbb {P}}_{3,3})^{-2} ,n\right) \right) . \end{aligned}$$
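The mean vector of Proposition 2.3 can be verified by iterating the conditional mean of (2.14) one period at a time. The following sketch is an illustration with hypothetical function names; `kappa` holds the diagonal of \(\kappa ^{\mathbb {P}}\) and `theta` holds \(\theta ^{\mathbb {P}}\).

```python
def mean_p_closed_form(x, n, kappa, theta, lam):
    """Mean vector M^P_{t,n} of Proposition 2.3 (requires kappa[1] != kappa[2])."""
    m = [x[i] * (1 - kappa[i]) ** n + theta[i] * (1 - (1 - kappa[i]) ** n)
         for i in range(3)]
    omega = (1 - kappa[2]) / (1 - kappa[1])
    # Extra term for the second factor, coming from the -lambda entry of kappa^P.
    m[1] += (lam * (x[2] - theta[2]) * (1 - omega**n) / (1 - omega)
             * (1 - kappa[1]) ** (n - 1))
    return m


def mean_p_iterated(x, n, kappa, theta, lam):
    """Same mean obtained by applying the conditional mean of (2.14) n times."""
    m = list(x)
    for _ in range(n):
        m = [m[0] + kappa[0] * (theta[0] - m[0]),
             m[1] + kappa[1] * (theta[1] - m[1]) - lam * (theta[2] - m[2]),
             m[2] + kappa[2] * (theta[2] - m[2])]
    return m


if __name__ == "__main__":
    kappa, theta, lam = [0.01, 0.04, 0.07], [0.03, -0.01, 0.005], 0.05  # hypothetical
    x = [0.025, -0.02, 0.01]
    print(mean_p_closed_form(x, 12, kappa, theta, lam))
    print(mean_p_iterated(x, 12, kappa, theta, lam))
```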

Remark 2.3

Closed-form formulas for conditional moments of \(X_{t+n}\) given \(X_t\) under \(\mathbb {P}\) could also be derived in the case \(\kappa ^{\mathbb {P}}_{2,2} = \kappa ^{\mathbb {P}}_{3,3}\). However, since this case is very unlikely to occur in practice, it is omitted.

3 European swaption pricing

This section describes two pricing procedures for European swaptions and outlines their respective advantages. The first relies on a risk-neutral simulation, whereas the second uses the forward measure to perform the simulation.

3.1 Risk-neutral pricing of European swaptions

Swaptions are classified into three types: European, Bermudan, and American, which differ in their possible exercise dates. Whereas American and Bermudan swaptions allow the exercise of the option on multiple dates, the European swaption has a single possible exercise date. We shall focus on European swaptions in this study. The European swaption considered, which is a payer swaption, is a financial option that gives the holder the right to enter, at time \(T_\alpha\), into a swap with payment dates \(T_{\alpha +1},\dots , T_\beta\) on which the holder pays the strike rate as the fixed rate, and receives the prevailing floating rate on each payment date. Typically, the floating rate is tied to an interbank offered rate, such as LIBOR in the United Kingdom or the CDOR in Canada.

As shown in Brigo and Mercurio (2007), for \(t<T_\alpha\), the time-t price of a European payer swaption with maturity \(T_\alpha\), strike K, nominal value N and payment dates \(\{T_i\}^{\beta }_{i=\alpha +1}\) is

$$\begin{aligned} PS\left[ t; \{T_i\}^{\beta }_{i=\alpha };K;N \right] = \mathbb {E^Q}\left[ D(t,T_\alpha )\left( N \left( S_{\alpha ,\beta }(T_\alpha )-K \right) ^+ \sum _{i=\alpha +1}^{\beta } \delta _i P(T_\alpha ,T_i)\right) \bigg | \mathcal {F}_t \right] \end{aligned}$$
(3.1)

where \(\delta _i=T_i-T_{i-1}\). The time-t forward swap rate \(S_{\alpha ,\beta }(t)\) is

$$\begin{aligned} S_{\alpha ,\beta }(t)&=\dfrac{P(t,T_\alpha )-P(t,T_\beta )}{\sum _{i=\alpha +1}^\beta \delta _i P(t,T_i)}. \end{aligned}$$
(3.2)

The swap rate \(S_{\alpha ,\beta }(t)\) corresponds to a value of the fixed rate which would make the time-t value of the swap nil. The rationale underlying (3.1) is that a market participant could, while exercising the option, enter without fee into a receiver swap with the swap rate as the fixed rate. Combining both positions would lead to a net payment being the difference between the swap rate and the strike rate at each payment date.
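As a small worked example of (3.2), the sketch below computes the forward swap rate from a list of zero-coupon bond prices and evaluates the swaption payoff appearing inside the expectation in (3.1) at time \(T_\alpha\). The function names and the flat-curve inputs are illustrative assumptions.

```python
def forward_swap_rate(p_alpha, p_pay, deltas):
    """Swap rate (3.2): p_alpha = P(t, T_alpha), p_pay[i] = P(t, T_{alpha+1+i})."""
    annuity = sum(d * p for d, p in zip(deltas, p_pay))
    return (p_alpha - p_pay[-1]) / annuity


def payer_payoff(p_pay, deltas, strike, notional=1.0):
    """Payoff N (S - K)^+ sum_i delta_i P(T_alpha, T_i), using P(T_alpha, T_alpha) = 1."""
    annuity = sum(d * p for d, p in zip(deltas, p_pay))
    swap_rate = forward_swap_rate(1.0, p_pay, deltas)
    return notional * max(swap_rate - strike, 0.0) * annuity


if __name__ == "__main__":
    # Hypothetical flat curve: simple rate R per accrual period of length 0.5 year.
    R, d = 0.04, 0.5
    prices = [(1.0 + R * d) ** (-i) for i in range(1, 7)]
    print(forward_swap_rate(1.0, prices, [d] * 6))   # close to R on a flat curve
    print(payer_payoff(prices, [d] * 6, strike=0.03))
```

On such a flat curve the swap rate coincides with the flat rate, and the payoff equals \(N\left( 1-P(T_\alpha ,T_\beta )-K\sum _i \delta _i P(T_\alpha ,T_i)\right) ^+\), since \(P(T_\alpha ,T_\alpha )=1\).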

A straightforward approach to obtain the swaption price via (3.1) is to conduct a Monte-Carlo simulation of the term structure factors under the risk-neutral measure and to average discounted cash flows, thereby approximating the expectation in (3.1). Algorithm 1 summarizes this process. The risk-neutral approach for swaption pricing has the advantage of requiring a single simulation to price multiple swaptions at once, which could be desirable in a calibration exercise. The drawback of using such an approach is that it requires simulating the entire path of the term structure factors, which might not be needed if a single swaption needs to be priced, as explained in the following section.

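Algorithm 1: Monte-Carlo pricing of European swaptions under the risk-neutral measure \(\mathbb {Q}\) (figure a).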
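A minimal Python sketch of the risk-neutral Monte-Carlo procedure described in this subsection follows. It is one possible reading of the procedure and not a transcription of Algorithm 1. Several choices are assumptions made for illustration: the valuation time is \(t=0\), dates are integer period indices with accrual fractions \(\delta _i=\Delta (T_i-T_{i-1})\), bond coefficients are obtained from the one-step no-arbitrage recursion implied by (2.1)–(2.2) rather than from the closed forms (2.4)–(2.6), and all function names are hypothetical.

```python
import math
import random


def bond_coeffs(max_tau, lam, theta, sig, rho, delta):
    """log A_tau and B_tau for tau = 0..max_tau via a one-step recursion (assumed)."""
    omega = [[sig[i] * rho[i][j] * sig[j] for j in range(3)] for i in range(3)]
    ktheta = [0.0, lam * (theta[1] - theta[2]), lam * theta[2]]  # kappa^Q theta^Q
    log_a, b = [0.0], [[0.0, 0.0, 0.0]]
    for _ in range(max_tau):
        p = b[-1]
        quad = sum(p[i] * omega[i][j] * p[j] for i in range(3) for j in range(3))
        drift = sum(p[i] * ktheta[i] for i in range(3))
        log_a.append(log_a[-1] - delta * drift + 0.5 * delta**2 * quad)
        b.append([p[0] + 1.0, (1 - lam) * p[1] + 1.0, lam * p[1] + (1 - lam) * p[2]])
    return log_a, b


def bond_price(tau, x, log_a, b, delta):
    """Zero-coupon bond price (2.3) for time to maturity tau and factors x."""
    return math.exp(log_a[tau] - delta * sum(b[tau][i] * x[i] for i in range(3)))


def cholesky3(m):
    """Lower-triangular Cholesky factor of a 3x3 positive definite matrix."""
    l = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i + 1):
            s = m[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l


def swaption_mc_q(x0, t_alpha, pay_dates, strike, notional, lam, theta, sig, rho,
                  delta, n_paths, seed=0):
    """Monte-Carlo estimate of (3.1) at t = 0 by simulating (2.2) up to T_alpha."""
    rng = random.Random(seed)
    log_a, b = bond_coeffs(pay_dates[-1] - t_alpha, lam, theta, sig, rho, delta)
    chol = cholesky3(rho)
    dates = [t_alpha] + list(pay_dates)
    total = 0.0
    for _ in range(n_paths):
        x, int_r = list(x0), 0.0
        for _ in range(t_alpha):
            int_r += delta * (x[0] + x[1])  # accumulate Delta * r_s for D(0, T_alpha)
            e = [rng.gauss(0.0, 1.0) for _ in range(3)]
            z = [sum(chol[i][k] * e[k] for k in range(i + 1)) for i in range(3)]
            x = [x[0] + sig[0] * z[0],
                 x[1] + lam * (theta[1] - x[1]) - lam * (theta[2] - x[2]) + sig[1] * z[1],
                 x[2] + lam * (theta[2] - x[2]) + sig[2] * z[2]]
        prices = [bond_price(d - t_alpha, x, log_a, b, delta) for d in pay_dates]
        annuity = sum(delta * (dates[i + 1] - dates[i]) * prices[i]
                      for i in range(len(prices)))
        # N (S - K)^+ * annuity rewritten as N (1 - P(T_alpha, T_beta) - K * annuity)^+.
        total += math.exp(-int_r) * notional * max(1.0 - prices[-1] - strike * annuity, 0.0)
    return total / n_paths
```

Because the simulated paths do not depend on the strike or on the payment dates, the same paths could be reused to price several swaptions at once.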

3.2 Pricing swaptions under the forward measure

Calculating European swaption prices using Algorithm 1 requires simulating the entire path of risk-free rate factors, which might be numerically cumbersome in some situations. By applying a change of numéraire, we can obtain a pricing approach which is more time-efficient. Detailing such an approach in the context of the DTAFNS model is the objective of this subsection.

Considering the zero-coupon bond maturing at \(\mathcal {T}=T_\alpha\) as the new numéraire makes the computation of the swaption price much more convenient. In this case, the payer swaption price may be rewritten based on (2.11), (3.1) and (3.2), using \(P(T_\alpha ,T_\alpha )=1\), as

$$\begin{aligned}&PS\left[ t; \{T_i\}^{\beta }_{i=\alpha };K;N \right] =P(t,T_\alpha ) \mathbb {E}^{T_\alpha }\left[ \left( N \left( S_{\alpha ,\beta }(T_\alpha )-K \right) ^+ \sum _{i=\alpha +1}^{\beta } \delta _i P(T_\alpha ,T_i)\right) \bigg | \mathcal {F}_t \right] \\&\quad =P(t,T_\alpha ) \mathbb {E}^{T_\alpha }\left[ \left( N \left( 1 - P(T_\alpha ,T_\beta )-K \sum _{i=\alpha +1}^{\beta } \delta _i P(T_\alpha ,T_i) \right) ^+\right) \bigg | \mathcal {F}_t \right] . \end{aligned}$$
(3.3)

Equation (3.3) involves the t-conditional expectation of a function of time-\(T_\alpha\) zero-coupon bond prices, which are fully characterized by term structure factors \(X_{T_\alpha } =\left[ X^{(1)}_{T_\alpha }, X^{(2)}_{T_\alpha }, X^{(3)}_{T_\alpha }\right] ^\top\). As a result, Proposition 2.2 can be used to calculate (3.3).

Algorithm 2 outlines the procedure to price swaptions using this approach. When pricing a single swaption, the \(\mathcal {T}\)-forward measure simulation is much quicker than Algorithm 1 based on the risk-neutral measure, which requires simulating entire paths of the term structure factors.

Algorithm 2: Monte-Carlo pricing of a European swaption under the \(T_\alpha\)-forward measure (figure b).
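A minimal Python sketch of the forward-measure approach of this subsection follows; it is a possible reading of the procedure rather than a transcription of Algorithm 2. The assumptions are: valuation at \(t=0\), integer date indices with \(\delta _i=\Delta (T_i-T_{i-1})\), bond coefficients and the moments of \(X_{T_\alpha }\) obtained by iterating one-step recursions implied by (2.2) and (2.13) instead of the closed forms of Proposition 2.2, and hypothetical function names throughout.

```python
import math
import random


def omega_matrix(sig, rho):
    """Sigma rho Sigma for diagonal Sigma given by sig."""
    return [[sig[i] * rho[i][j] * sig[j] for j in range(3)] for i in range(3)]


def bond_coeffs(max_tau, lam, theta, omega, delta):
    """log A_tau and B_tau for tau = 0..max_tau via a one-step recursion (assumed)."""
    ktheta = [0.0, lam * (theta[1] - theta[2]), lam * theta[2]]
    log_a, b = [0.0], [[0.0, 0.0, 0.0]]
    for _ in range(max_tau):
        p = b[-1]
        quad = sum(p[i] * omega[i][j] * p[j] for i in range(3) for j in range(3))
        drift = sum(p[i] * ktheta[i] for i in range(3))
        log_a.append(log_a[-1] - delta * drift + 0.5 * delta**2 * quad)
        b.append([p[0] + 1.0, (1 - lam) * p[1] + 1.0, lam * p[1] + (1 - lam) * p[2]])
    return log_a, b


def bond_price(tau, x, log_a, b, delta):
    return math.exp(log_a[tau] - delta * sum(b[tau][i] * x[i] for i in range(3)))


def cholesky3(m):
    l = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i + 1):
            s = m[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l


def forward_moments(x0, n, lam, theta, omega, delta, b):
    """Mean and covariance of X_n under the n-forward measure, iterating (2.13)."""
    q = 1.0 - lam
    phi = [[1.0, 0.0, 0.0], [0.0, q, lam], [0.0, 0.0, q]]
    m, v = list(x0), [[0.0] * 3 for _ in range(3)]
    for k in range(n):
        bk = b[n - k - 1]  # B_{T - t - 1} with T = n and t = k
        eta = [delta * sum(omega[i][j] * bk[j] for j in range(3)) for i in range(3)]
        m = [m[0] - eta[0],
             m[1] + lam * (theta[1] - m[1]) - lam * (theta[2] - m[2]) - eta[1],
             m[2] + lam * (theta[2] - m[2]) - eta[2]]
        pv = [[sum(phi[i][l] * v[l][j] for l in range(3)) for j in range(3)]
              for i in range(3)]
        v = [[sum(pv[i][l] * phi[j][l] for l in range(3)) + omega[i][j]
              for j in range(3)] for i in range(3)]
    return m, v


def swaption_mc_forward(x0, t_alpha, pay_dates, strike, notional, lam, theta, sig,
                        rho, delta, n_draws, seed=0):
    """Monte-Carlo estimate of (3.3) at t = 0 by sampling X_{T_alpha} directly."""
    rng = random.Random(seed)
    omega = omega_matrix(sig, rho)
    log_a, b = bond_coeffs(pay_dates[-1], lam, theta, omega, delta)
    m, v = forward_moments(x0, t_alpha, lam, theta, omega, delta, b)
    chol = cholesky3(v)
    dates = [t_alpha] + list(pay_dates)
    total = 0.0
    for _ in range(n_draws):
        e = [rng.gauss(0.0, 1.0) for _ in range(3)]
        x = [m[i] + sum(chol[i][k] * e[k] for k in range(i + 1)) for i in range(3)]
        prices = [bond_price(d - t_alpha, x, log_a, b, delta) for d in pay_dates]
        annuity = sum(delta * (dates[i + 1] - dates[i]) * prices[i]
                      for i in range(len(prices)))
        total += notional * max(1.0 - prices[-1] - strike * annuity, 0.0)
    return bond_price(t_alpha, x0, log_a, b, delta) * total / n_draws
```

Only one three-dimensional Gaussian draw per scenario is needed here, compared with \(T_\alpha\) draws per path in a path-wise risk-neutral simulation.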

4 Zero-coupon futures and options on futures

This section discusses calculation steps for prices and expected excess returns associated with European options on risk-free zero-coupon futures.

4.1 Futures price

Consider a futures contract with maturity \(\mathcal {T}_2\) on a zero-coupon bond maturing on \(\mathcal {T}_3\). Its time-\(\mathcal {T}_1\) price \(F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}\) is given by

$$\begin{aligned} F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} = \mathbb {E^Q}\left[ P(\mathcal {T}_2,\mathcal {T}_3)\bigg | \mathcal {F}_{\mathcal {T}_1}\right] , \end{aligned}$$

see for instance Björk (2009). This expression can be calculated in closed form, as indicated by the following theorem whose proof is found in “Appendix”.

Theorem 4.1

The time-\(\mathcal {T}_1\) price of a zero-coupon bond futures whose maturity is \(\mathcal {T}_2\) and whose underlying risk-free zero-coupon bond matures at \(\mathcal {T}_3\) is

$$\begin{aligned} F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} = \tilde{A}_{\tau _2,\tau _3} \exp \left[ -\Delta \sum ^3_{i=1} \tilde{\mathcal {B}}^{(i)}_{\tau _3} X^{(i)}_{\mathcal {T}_1} \right] \end{aligned}$$
(4.1)

with \(\tau _2=\mathcal {T}_2-\mathcal {T}_1\), \(\tau _3=\mathcal {T}_3-\mathcal {T}_2\), \(\mathcal {V}_{\tau _2}\) being defined in Proposition 2.2 and

$$\begin{aligned} \tilde{A}_{\tau _2,\tau _3}&= A_{\tau _3} \exp \bigg [ \frac{\Delta ^2}{2} \mathcal {B}^\top _{\tau _3} \mathcal {V}_{\tau _2} \mathcal {B}_{\tau _3} -\Delta \mathcal {B}^{(2)}_{\tau _3} (\theta ^\mathbb {Q}_2-\theta ^\mathbb {Q}_3)\left( 1-(1-\lambda )^{\tau _2}\right) \\&\quad -\Delta \mathcal {B}^{(2)}_{\tau _3} \lambda \theta ^{\mathbb {Q}}_3\left( \dfrac{\zeta _0(1-\lambda ,\tau _2+1)}{1-\lambda }-\tau _2(1-\lambda )^{\tau _2-1} \right) \\&\quad -\Delta \mathcal {B}^{(3)}_{\tau _3} \theta ^\mathbb {Q}_3\left( 1-(1-\lambda )^{\tau _2}\right) \bigg ],\\ \tilde{\mathcal {B}}^{(1)}_{n}&= \mathcal {B}^{(1)}_{n}, \quad \tilde{\mathcal {B}}^{(2)}_{n} = \mathcal {B}^{(2)}_{n} (1-\lambda )^{n}, \quad \tilde{\mathcal {B}}^{(3)}_{n} = \mathcal {B}^{(3)}_{n} (1-\lambda )^{n} + \mathcal {B}^{(2)}_{n} \lambda n (1-\lambda )^{n-1}. \end{aligned}$$

4.2 Price for options on futures

The closed-form solution for futures prices leads to a Black–Scholes-type formula for the price of options on zero-coupon futures. Denote by \(\text {Call}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}(K)\) the time-t price of a European call option with strike K maturing at \(\mathcal {T}_1\) on a futures contract maturing at \(\mathcal {T}_2\) whose underlying asset is a zero-coupon bond maturing at \(\mathcal {T}_3\).

Recall the following result; see for instance Lemma A.1 of Godin (2019) for a proof. Suppose Y is a Gaussian random variable with mean \(\mu\) and standard deviation \(\sigma\). Then \(\mathbb {E} \left[ e^Y \mathbbm {1}_{\left\{ Y> y\right\} }\right] = e^{\mu +\sigma ^2/2} \bar{\Phi }\left( \frac{y-\mu -\sigma ^2}{\sigma } \right)\), where \(\bar{\Phi }\) is the survival function (i.e. one minus the CDF) of the standard Gaussian distribution.

Lemma 4.1

Suppose Y is a Gaussian random variable with mean \(\mu\) and standard deviation \(\sigma\). Then,

$$\begin{aligned} \mathbb {E}[ \max (0,e^Y -K) ]= & {} \mathbb {E}[ e^Y \mathbbm {1}_{\left\{ Y> \log (K)\right\} } ] -K \mathbb {E}[ \mathbbm {1}_{\left\{ Y> \log (K)\right\} } ]\\= & {} e^{\mu +\sigma ^2/2} \bar{\Phi }\left( \frac{\log (K)-\mu -\sigma ^2}{\sigma } \right) -K \bar{\Phi }\left( \frac{\log (K)-\mu }{\sigma } \right) . \end{aligned}$$
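Lemma 4.1 is easy to check numerically. The sketch below evaluates the closed-form expression using only the Python standard library and compares it with a crude Monte Carlo estimate. The function names are hypothetical and the parameter values in the usage line are arbitrary.

```python
import math
import random

def normal_sf(x):
    """Survival function of the standard Gaussian distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def lognormal_call_expectation(mu, sigma, strike):
    """Closed form of E[max(0, e^Y - K)] for Y ~ N(mu, sigma^2), as in Lemma 4.1."""
    log_k = math.log(strike)
    return (math.exp(mu + 0.5 * sigma ** 2)
            * normal_sf((log_k - mu - sigma ** 2) / sigma)
            - strike * normal_sf((log_k - mu) / sigma))

def monte_carlo_call_expectation(mu, sigma, strike, n_sims=100_000, seed=1):
    """Simulation estimate of the same expectation, used as a sanity check."""
    rng = random.Random(seed)
    total = sum(max(0.0, math.exp(rng.gauss(mu, sigma)) - strike)
                for _ in range(n_sims))
    return total / n_sims

# The two values should agree up to Monte Carlo error.
closed_form = lognormal_call_expectation(-0.05, 0.10, 0.95)
simulated = monte_carlo_call_expectation(-0.05, 0.10, 0.95)
```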

The following result is obtained by combining Proposition 2.2 with (4.1).

Lemma 4.2

Under the forward measure \(\mathbb {Q}^{\mathcal {T}_1}\), the distribution of the time-\(\mathcal {T}_1\) log-futures price \(\log F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}\) conditional on time-t information is Gaussian with mean \(\nu _{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}\) and variance \(\varsigma ^2_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}\), where

$$\begin{aligned} \nu _{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}= & {} \log \tilde{A}_{\tau _2,\tau _3} -\Delta \sum ^3_{i=1} \tilde{\mathcal {B}}^{(i)}_{\tau _3}\mathcal {M}^{(i)}_{t,\tau _1},\\ \varsigma ^2_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}= & {} \Delta \tilde{\mathcal {B}}^\top _{\tau _3} \mathcal {V}_{\tau _1} \tilde{\mathcal {B}}_{\tau _3}, \end{aligned}$$

with \(\tau _1 = \mathcal {T}_1-t\) and \(\tilde{\mathcal {B}}_\tau =\left[ \tilde{\mathcal {B}}^{(1)}_\tau , \,\, \tilde{\mathcal {B}}^{(2)}_\tau , \, \, \tilde{\mathcal {B}}^{(3)}_\tau \right] ^\top\).

Using Lemma 4.1, the time-t call option price is

$$\begin{aligned} \text {Call}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}(K)= & {} P(t,\mathcal {T}_1) \mathbb {E}^{\mathcal {T}_1}[ \max (0,F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} -K) \vert \mathcal {F}_t]\\= & {} P(t,\mathcal {T}_1) \bigg [e^{\nu _{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}+\varsigma ^2_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}/2} \bar{\Phi }\left( \frac{\log (K)-\nu _{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}-\varsigma ^2_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}}{\varsigma _{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}} \right) \\{} & {} \quad \quad \quad \quad -K \bar{\Phi }\left( \frac{\log (K)-\nu _{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}}{\varsigma _{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}} \right) \bigg ]. \end{aligned}$$

Furthermore, denoting by \(\text {Put}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}(K)\) the corresponding European put option, the put-call parity leads to

$$\begin{aligned} \text {Put}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}(K) = \text {Call}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}(K) - ( \mathbb {E}^{\mathcal {T}_1}[ F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} | \mathcal {F}_t]-K) P(t, \mathcal {T}_1) \end{aligned}$$

where Lemma 4.2 leads to

$$\begin{aligned} \mathbb {E}^{\mathcal {T}_1}[ F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} | \mathcal {F}_t] = \exp \left( \nu _{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} + \varsigma ^2_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}/2\right) . \end{aligned}$$
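As a minimal sketch, the call and put formulas above can be coded directly once the forward-measure mean \(\nu\) and standard deviation \(\varsigma\) of the log-futures price are known. Both are taken here as given inputs rather than computed from model parameters. The function and argument names are hypothetical.

```python
import math

def normal_sf(x):
    """Survival function of the standard Gaussian distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def futures_call_put(nu, varsigma, strike, discount):
    """Time-t prices of European call and put options on the zero-coupon futures.

    nu, varsigma : forward-measure mean and standard deviation of the
                   time-T1 log-futures price (inputs from Lemma 4.2)
    discount     : zero-coupon bond price P(t, T1)
    """
    log_k = math.log(strike)
    # Forward-measure expectation of the time-T1 futures price.
    expected_futures = math.exp(nu + 0.5 * varsigma ** 2)
    call = discount * (
        expected_futures * normal_sf((log_k - nu - varsigma ** 2) / varsigma)
        - strike * normal_sf((log_k - nu) / varsigma)
    )
    # Put-call parity.
    put = call - (expected_futures - strike) * discount
    return call, put
```

When the strike equals the forward-measure expected futures price, the parity term vanishes and the call and put prices coincide.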

4.3 Price for quadratic options on futures

A quadratic option on futures with time-\(\mathcal {T}_1\) payoff \(\left( \frac{F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}}{F_{t,\mathcal {T}_2,\mathcal {T}_3}}-1 \right) ^2\) can also be considered. Such option bears resemblance to a straddle option since it is more likely to produce higher payoffs in higher volatility environments for interest rates. Using Lemma 4.2, its time-t price is given by

$$\begin{aligned} \text {Quad}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}= & {} P(t,\mathcal {T}_1) \mathbb {E}^{\mathcal {T}_1} \left[ \left( \frac{F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}}{F_{t,\mathcal {T}_2,\mathcal {T}_3}}-1 \right) ^2 \bigg |\mathcal {F}_t\right] \\= & {} P(t,\mathcal {T}_1) \mathbb {E}^{\mathcal {T}_1} \left[ \frac{ \exp ( 2\log F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3})}{F^2_{t,\mathcal {T}_2,\mathcal {T}_3}} -2 \frac{ \exp ( \log F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} )}{F_{t,\mathcal {T}_2,\mathcal {T}_3}} + 1 \bigg |\mathcal {F}_t\right] \\= & {} P(t,\mathcal {T}_1) \left[ \frac{ \exp \left( 2\nu _{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} + 2\varsigma ^2_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}\right) }{F^2_{t,\mathcal {T}_2,\mathcal {T}_3}}\right. \\{} & {} \quad \left. -2 \frac{ \exp \left( \nu _{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} + \varsigma ^2_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}/2\right) }{F_{t,\mathcal {T}_2,\mathcal {T}_3}} + 1 \right] . \end{aligned}$$
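The quadratic option price only requires the first two moments of the lognormal futures price, so a short sketch suffices. The function below takes the forward-measure mean and variance of the log-futures price as given inputs. All names are hypothetical.

```python
import math

def quadratic_option_price(nu, varsigma_sq, futures_now, discount):
    """Time-t price of the quadratic option with payoff (F_T1 / F_t - 1)^2.

    nu, varsigma_sq : forward-measure mean and variance of log F_T1
    futures_now     : current futures price F_t
    discount        : zero-coupon bond price P(t, T1)
    """
    # E[F_T1^2] / F_t^2, using E[exp(2Y)] = exp(2 mu + 2 sigma^2).
    second_moment = math.exp(2.0 * nu + 2.0 * varsigma_sq) / futures_now ** 2
    # E[F_T1] / F_t, using E[exp(Y)] = exp(mu + sigma^2 / 2).
    first_moment = math.exp(nu + 0.5 * varsigma_sq) / futures_now
    return discount * (second_moment - 2.0 * first_moment + 1.0)
```

As a sanity check, if the forward-measure expected futures price equals the current futures price, the expression reduces to \(P(t,\mathcal {T}_1)\left( e^{\varsigma ^2}-1\right)\), which increases with the variance. This matches the straddle-like behaviour described above.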

4.4 Option expected excess returns

Consider a European-type derivative whose time-t price is \(\text {Price}_{t,\mathcal {T}_1}\) and whose time-\(\mathcal {T}_1\) payoff is \(\text {Payoff}_{\mathcal {T}_1}\). Its (periodic) expected excess return (EER) can be calculated in two ways:

$$\begin{aligned} \text {EER}^{\text {Approach 1}}_{t,\mathcal {T}_1}= & {} \frac{1}{\mathcal {T}_1-t} \log \frac{ \mathbb {E}^\mathbb {P} \left[ \text {Payoff}_{\mathcal {T}_1} |\mathcal {F}_t\right] }{\text {Price}_{t,\mathcal {T}_1}} - s(t,\mathcal {T}_1), \\ \text {EER}^{\text {Approach 2}}_{t,\mathcal {T}_1}= & {} \frac{1}{\mathcal {T}_1-t} \log \frac{ \mathbb {E}^\mathbb {P} \left[ D(t,\mathcal {T}_1) \text {Payoff}_{\mathcal {T}_1} |\mathcal {F}_t\right] }{\text {Price}_{t,\mathcal {T}_1}}, \end{aligned}$$
(4.2)

where the risk-free spot rate is obtained through \(s(t,\mathcal {T}_1) = -\frac{1}{\mathcal {T}_1-t} \log P(t,\mathcal {T}_1)\). The first formulation relies on a future value perspective, whereas the second sees the expected excess return through the lens of a present value. The second approach has the conceptual advantage of producing an exactly null premium if \(\mathbb {P}=\mathbb {Q}\). Nevertheless, it is more cumbersome to compute, and as such we consider the first approach in (4.2) in this work. Bakshi et al. (2023) also use a similar formulation in their work.
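The first formulation in (4.2) translates into a two-line computation. The sketch below is illustrative: the function names are hypothetical, and the horizon \(\mathcal {T}_1-t\) is assumed to be expressed in the same period units as the spot rate.

```python
import math

def spot_rate(discount, horizon):
    """Periodic risk-free spot rate s(t, T1) implied by the bond price P(t, T1)."""
    return -math.log(discount) / horizon

def expected_excess_return(expected_payoff, price, discount, horizon):
    """Approach 1 of (4.2): log expected gross return per period minus the spot rate.

    expected_payoff : physical-measure expectation of the time-T1 payoff
    price           : time-t price of the derivative
    discount        : zero-coupon bond price P(t, T1)
    horizon         : T1 - t, in periods
    """
    return math.log(expected_payoff / price) / horizon - spot_rate(discount, horizon)
```

Under this definition the EER is zero exactly when the price equals \(P(t,\mathcal {T}_1)\) times the physical-measure expected payoff.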

The expected excess returns for the European call, the European put and the quadratic option presented above are therefore respectively

$$\begin{aligned} \text {EER}^\text {Call}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}(K)= & {} \frac{1}{\mathcal {T}_1-t} \log \frac{ \mathbb {E}^\mathbb {P} \left[ \max (0;F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}-K)|\mathcal {F}_t\right] }{\text {Call}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}(K)} - s(t,\mathcal {T}_1), \end{aligned}$$
(4.3)
$$\begin{aligned} \text {EER}^\text {Put}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}(K)= & {} \frac{1}{\mathcal {T}_1-t} \log \frac{ \mathbb {E}^\mathbb {P} \left[ \max (0;K-F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3})|\mathcal {F}_t\right] }{\text {Put}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}(K)} - s(t,\mathcal {T}_1), \end{aligned}$$
(4.4)
$$\begin{aligned} \text {EER}^\text {Quad}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}= & {} \frac{1}{\mathcal {T}_1-t} \log \frac{ \mathbb {E}^\mathbb {P} \left[ \left( \frac{F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}}{F_{t,\mathcal {T}_2,\mathcal {T}_3}}-1 \right) ^2 \bigg |\mathcal {F}_t\right] }{\text {Quad}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}} - s(t,\mathcal {T}_1). \end{aligned}$$
(4.5)

The following result is obtained by combining Proposition 2.3 with (4.1).

Lemma 4.3

Assuming \(\kappa ^{\mathbb {P}}_{2,2} \ne \kappa ^{\mathbb {P}}_{3,3}\), the \(\mathbb {P}\)-distribution of the time-\(\mathcal {T}_1\) log-futures price \(\log F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}\) conditional on time-t information is Gaussian with mean \(\nu ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}\) and variance \((\varsigma ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3})^2\), where

$$\begin{aligned} \nu ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}= & {} \log \tilde{A}_{\tau _2,\tau _3} -\Delta \sum ^3_{i=1} \tilde{\mathcal {B}}^{(i)}_{\tau _3}\mathcal {M}^{\mathbb {P},(i)}_{t,\tau _1},\\ (\varsigma ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3})^2= & {} \Delta \tilde{\mathcal {B}}^\top _{\tau _3} \mathcal {V}^\mathbb {P}_{\tau _1} \tilde{\mathcal {B}}_{\tau _3}, \end{aligned}$$

with \(\tau _1 = \mathcal {T}_1-t\) and \(\tilde{\mathcal {B}}_\tau =\left[ \tilde{\mathcal {B}}^{(1)}_\tau , \,\, \tilde{\mathcal {B}}^{(2)}_\tau , \, \, \tilde{\mathcal {B}}^{(3)}_\tau \right] ^\top\).

Again, using Lemma 4.1, the time-t European call and put option expected payoffs are respectively

$$\begin{aligned} \mathbb {E}^{\mathbb {P}}[ \max (0,F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} -K) \vert \mathcal {F}_t]= & {} \bigg [e^{\nu ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}+(\varsigma ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3})^2/2} \times \\{} & {} \quad \bar{\Phi }\left( \frac{\log (K)-\nu ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}-(\varsigma ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3})^2}{\varsigma ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}} \right) \\{} & {} \quad -K \bar{\Phi }\left( \frac{\log (K)-\nu ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}}{\varsigma ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}} \right) \bigg ],\\ \mathbb {E}^{\mathbb {P}}[ \max (0,K-F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}) \vert \mathcal {F}_t]= & {} \mathbb {E}^{\mathbb {P}}[ \max (0,F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} -K) \vert \mathcal {F}_t] - ( \mathbb {E}^{\mathbb {P}}[ F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} | \mathcal {F}_t]-K) \end{aligned}$$

where, from Lemma 4.3,

$$\begin{aligned} \mathbb {E}^{\mathbb {P}}[ F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} | \mathcal {F}_t] = \exp \left( \nu ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} + (\varsigma ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3})^2/2\right) . \end{aligned}$$

Moreover, the quadratic option’s expected payoff under \(\mathbb {P}\) is

$$\begin{aligned} \mathbb {E}^{\mathbb {P}} \left[ \left( \frac{F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3}}{F_{t,\mathcal {T}_2,\mathcal {T}_3}}-1 \right) ^2 \bigg |\mathcal {F}_t\right]= & {} \mathbb {E}^{\mathbb {P}} \left[ \frac{ \exp ( 2\log F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3})}{F^2_{t,\mathcal {T}_2,\mathcal {T}_3}} -2 \frac{ \exp ( \log F_{\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} )}{F_{t,\mathcal {T}_2,\mathcal {T}_3}} + 1 \bigg |\mathcal {F}_t\right] \\= & {} \left[ \frac{ \exp \left( 2\nu ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} + 2(\varsigma ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3})^2\right) }{F^2_{t,\mathcal {T}_2,\mathcal {T}_3}}\right. \\{} & {} \quad \left. -2 \frac{ \exp \left( \nu ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3} + (\varsigma ^\mathbb {P}_{t,\mathcal {T}_1,\mathcal {T}_2,\mathcal {T}_3})^2/2\right) }{F_{t,\mathcal {T}_2,\mathcal {T}_3}} + 1 \right] \!. \end{aligned}$$

Substituting the above formulas into (4.3)–(4.5) provides values for the options’ expected excess returns.
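The full chain from Gaussian moments to call and put expected excess returns in (4.3)–(4.4) can be sketched as follows. This is a minimal illustration. The moments of the log-futures price under the forward measure and under \(\mathbb {P}\) are taken as given inputs, and all function and argument names are hypothetical.

```python
import math

def normal_sf(x):
    """Survival function of the standard Gaussian distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def expected_call_payoff(nu, varsigma, strike):
    """E[max(0, F - K)] when log F ~ N(nu, varsigma^2), from Lemma 4.1."""
    log_k = math.log(strike)
    return (math.exp(nu + 0.5 * varsigma ** 2)
            * normal_sf((log_k - nu - varsigma ** 2) / varsigma)
            - strike * normal_sf((log_k - nu) / varsigma))

def expected_put_payoff(nu, varsigma, strike):
    """E[max(0, K - F)], obtained from the call through parity."""
    expected_futures = math.exp(nu + 0.5 * varsigma ** 2)
    return expected_call_payoff(nu, varsigma, strike) - (expected_futures - strike)

def option_eers(nu_fwd, varsigma_fwd, nu_p, varsigma_p, strike, discount, horizon):
    """Call and put expected excess returns as in (4.3) and (4.4)."""
    spot = -math.log(discount) / horizon
    eers = {}
    for name, payoff in (("call", expected_call_payoff), ("put", expected_put_payoff)):
        # Price: discounted forward-measure expectation of the payoff.
        price = discount * payoff(nu_fwd, varsigma_fwd, strike)
        # Numerator of the EER: physical-measure expectation of the payoff.
        eers[name] = math.log(payoff(nu_p, varsigma_p, strike) / price) / horizon - spot
    return eers
```

If the \(\mathbb {P}\) moments coincide with the forward-measure moments, both excess returns computed this way are zero. A higher physical-measure mean raises the call's expected excess return and lowers the put's.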

5 Methods for the calibration of the DTAFNS model to option prices

While a full-blown calibration of the model to interest rate derivatives prices is out of scope, we briefly highlight potential approaches for such a purpose. Assume a set of d derivatives prices \(Y_t = [Y^{(1)}_{t},\ldots ,Y^{(d)}_{t}]\) is available in any period t, each of which is associated with a deterministically chosen strike price from \(K_t = [K^{(1)}_{t},\ldots ,K^{(d)}_{t}]\). Observed derivatives prices are assumed to be noisy versions of their true prices, and thus we can consider the following system of equations to depict the dynamics of derivatives prices:

$$\begin{aligned} Y_t = G(X_t, K_t) + N_t, \quad X_{t+1} = \mathbf{a} + b X_t + \Sigma Z_{t+1}, \quad t=0,\ldots ,T, \end{aligned}$$
(5.1)

where G is the non-linear function mapping risk factors and strikes into option prices, and the process \(N=\{ N_t\}^T_{t=0}\) is assumed to be a Gaussian d-dimensional white noise. Furthermore, \(\mathbf{a}=\kappa ^{\mathbb {P}} \theta ^{\mathbb {P}}\) and \(b= I-\kappa ^{\mathbb {P}}\) with I being the \(3\times 3\) identity matrix.

Since (5.1) involves a non-linear transformation G of the Gaussian latent factors, the conventional Kalman filter cannot be applied. Nevertheless, the unscented Kalman filter (UKF) developed in Julier and Uhlmann (1997) can be used, as the non-linear system (5.1) is a special case of their equations (1)–(2). The UKF generalizes the Kalman filter: it handles non-linearities through a deterministic sampling method, which leads to a better approximation of the filtered moments of the observable quantities. An alternative approach consists in using particle filters, which instead rely on stochastic sampling of the latent quantities. See for instance Del Moral (1997), Creal (2012) or Remillard (2013) for more information about particle filters. Both the UKF and particle filters have been applied to term structure model estimation in the literature, for instance by Christoffersen et al. (2014).
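To make the deterministic sampling idea concrete, the sketch below implements the basic unscented transform, the building block of the UKF, for a scalar-valued function of the three factors. It is not a full filter. The function g stands in for one component of the pricing map G, the tuning constant kappa follows the usual sigma-point parametrization, and all names are hypothetical.

```python
import math

def cholesky(matrix):
    """Lower-triangular factor L with L L^T = matrix (positive definite input)."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                low[i][j] = (matrix[i][j] - partial) / low[j][j]
    return low

def unscented_transform(g, mean, cov, kappa=0.0):
    """Approximate the mean and variance of g(X) for X ~ N(mean, cov)."""
    n = len(mean)
    # Matrix square root of (n + kappa) * cov; its columns generate the sigma points.
    root = cholesky([[(n + kappa) * c for c in row] for row in cov])
    points = [list(mean)]
    weights = [kappa / (n + kappa)]
    for j in range(n):
        column = [root[i][j] for i in range(n)]
        points.append([m + c for m, c in zip(mean, column)])
        points.append([m - c for m, c in zip(mean, column)])
        weights += [0.5 / (n + kappa)] * 2
    # Propagate each sigma point through g and form weighted moments.
    values = [g(p) for p in points]
    g_mean = sum(w * v for w, v in zip(weights, values))
    g_var = sum(w * (v - g_mean) ** 2 for w, v in zip(weights, values))
    return g_mean, g_var
```

For a linear g the transform reproduces the exact mean and variance. For a non-linear pricing function it gives an approximation based on only 2n + 1 function evaluations, in contrast to the random draws used by a particle filter.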

6 Conclusion

This paper describes how to calculate prices of swaptions and European options (either conventional or quadratic) on zero-coupon futures under the DTAFNS model. Expressions for the expected excess returns associated with the European options are also provided. Whereas Monte Carlo simulation is used for swaptions, closed-form solutions are provided for the zero-coupon futures options. All pricing expressions are obtained after deriving exact formulas for transition distributions of risk factors underlying the term structure dynamics under three measures: the physical measure, the risk-neutral measure and the forward measure. Future work could study the option risk premia produced by the DTAFNS model and determine whether they are consistent with empirical stylized facts such as those outlined in Bakshi et al. (2023). Extending the approach to American options could also be an interesting direction.