1 Introduction

Interest rate caps and swaptions are among the most widely traded interest rate derivatives in the world. According to the Bank for International Settlements, their combined notional value has exceeded 10 trillion dollars in recent years, many times that of exchange-traded options. Because of the size of these markets, accurate and efficient pricing and hedging of caps and swaptions have enormous practical importance. Pricing interest rate derivatives is more demanding than pricing bonds because derivatives are more sensitive to the higher-order moments of the distribution of the underlying rates; the models therefore need to capture interest rate volatilities as well as the interest rates themselves. Under the unified framework of dynamic term structure models (hereafter DTSMs), a benchmark in the term structure literature, the same set of risk factors is used to price bonds and derivatives. Consequently, the set of risk factors can be identified from observations of bond yields or swap rates alone; including derivative prices can improve the efficiency of the estimation but is not essential. Practitioners, on the other hand, generally apply Heath–Jarrow–Morton (HJM) type models to price interest rate derivatives. In these models the entire yield curve is taken as given, and factors independent of the yield curve, such as stochastic volatility and jumps, are sometimes added in a piecemeal fashion. This divergence foreshadows one of the key issues in the fast-growing literature on Libor-based interest rate derivatives, the so-called unspanned stochastic volatility (USV) puzzle (Footnote 1).

Interest rate caps and swaptions are derivatives written on Libor and swap rates, and the traditional view is that their prices are determined by the same set of risk factors that determine Libor and swap rates. However, several recent studies have shown that there seem to be risk factors that affect the prices of caps and swaptions but are not spanned by the underlying Libor and swap rates. Heidari and Wu (2003) show that while the three common term structure factors (i.e., the level, slope and curvature of the yield curve) can explain 99.5% of the variation in bond yields, they explain less than 60% of the variation in swaption implied volatilities. After including three additional volatility factors, the explanatory power increases to over 97%. Similarly, Collin-Dufresne and Goldstein (2002) show that there is a very weak correlation between changes in swap rates and returns on at-the-money (ATM) cap straddles: the \(R^2\)s of regressions of straddle returns on changes in swap rates are typically less than 20%. Furthermore, one principal component explains 80% of the regression residuals of straddles with different maturities. As straddles are approximately delta neutral and mainly exposed to volatility risk, they refer to the factor that drives straddle returns but is not affected by the term structure factors as “unspanned stochastic volatility” (hereafter USV). Jagannathan et al. (2003) find that an affine three-factor model can fit the LIBOR swap curve rather well. However, they identify significant shortcomings when confronting the model with data on caps and swaptions, and conclude that derivatives must be used when evaluating term structure models. On the other hand, Fan et al. (2003) provide evidence against the existence of USV: they show that swaptions can be hedged using bonds alone within an HJM model, and argue that the difference from previous studies results from the nonlinear dependence of derivative prices on the yield curve factors.
Li and Zhao (2006) show that the yield curve factors extracted using a quadratic term structure model can hedge bonds perfectly but cannot hedge interest rate caps, and that factors extracted from the unhedged components systematically improve hedging performance across moneyness. They argue that the difference is likely due to the fact that interest rate caps are more sensitive to volatility factors than swaptions. They also argue that DTSMs are better suited to addressing whether derivatives are redundant, since HJM-type models require both data sets for estimation. Overall, most studies suggest that interest rate derivatives are not redundant securities and cannot be hedged using bonds alone. In other words, bonds do not span interest rate derivatives. In Table 47.1 we regress weekly cap straddle returns at different moneyness and maturities on weekly changes in the three yield factors and obtain very similar results. In general, the \(R^2\)s in Table 47.1 are very small for straddles that are close to the money. For deep in-the-money (ITM) and out-of-the-money (OTM) straddles, the \(R^2\)s increase significantly. This is consistent with the fact that ATM straddles are more sensitive to volatility risk than straddles away from the money.

Table 47.1 Regression analysis of USV in caps market

The existence of USV has profound implications for term structure modeling, especially for the existing multifactor dynamic term structure models, a widely popular class of models that has generated a huge literature over the last decade. One of the main reasons for the popularity of these models is their tractability: they provide closed-form solutions for the prices of not only zero-coupon bonds, but also a wide range of interest rate derivatives (see, e.g., Duffie et al. 2001; Chacko and Das 2002; Leippold and Wu 2002). The closed-form formulae significantly reduce the computational burden of implementing these models and simplify their application in practice. However, almost all existing DTSMs assume that derivatives are redundant and can be perfectly hedged using bonds alone. Hence, the presence of USV in the derivatives market implies that one fundamental assumption underlying all DTSMs does not hold, and these models need to be substantially extended to incorporate the unspanned factors before they can be applied to derivatives. However, as Collin-Dufresne and Goldstein (2002) show, it is rather difficult to introduce USV in traditional DTSMs: one must impose highly restrictive assumptions on model parameters to guarantee that certain factors that affect derivative prices do not affect bond prices. In other words, affine term structure models (ATSMs) with USV are restricted versions of the existing ATSMs. Some recent papers, for example Bikbov and Chernov (2004), have tested the USV restrictions by comparing USV models to the nesting unrestricted ATSMs and rejected the USV restrictions when both models are fitted to both bonds and derivatives data. This approach, however, gives misleading conclusions. The USV for term structure models resembles the inclusion of stochastic volatility (SV) in the stock price process, where the natural comparison is between the Black–Scholes model and the SV model.
Similarly, the USV model should nest the traditional DTSM without USV for the purpose of statistical testing. Specifically, if the unrestricted three-factor affine model is a good fit for the term structure of interest rates, one should test whether adding one more factor, i.e., moving to a four-factor affine model with USV, helps capture the derivatives data.

Some recent studies have also provided evidence in support of the existence of USV using bond data alone. They show that the yield curve volatilities backed out from a cross-section of bond yields do not agree with the time-series volatilities filtered from yield data via GARCH or high-frequency estimates. This challenges the traditional DTSMs even more, since these models cannot be expected to capture option implied volatilities if they cannot even match realized yield curve volatilities. Specifically, Collin-Dufresne et al. (2004, CDGJ) show that the LIBOR volatility implied by an affine multi-factor specification from the swap rate curve can be negatively correlated with the time series of volatility obtained from a standard GARCH approach. In response, they argue that an affine four-factor USV model delivers both realistic volatility estimates and a good cross-sectional fit. Andersen and Benzoni (2006), using high-frequency data on bond yields, construct a model-free “realized yield volatility” measure by computing the empirical quadratic yield variation for a cross-section of fixed maturities. They find that the yield curve fails to span yield volatility, as the systematic volatility factors are largely unrelated to the cross-section of yields. They claim that a broad class of affine diffusive, Gaussian-quadratic and affine jump-diffusive models is incapable of accommodating the observed yield volatility dynamics. An important implication is that the bond markets per se are incomplete and yield volatility risk cannot be hedged by taking positions solely in the Treasury bond market. They also advocate using the empirical realized yield volatility measures more broadly as a basis for specification testing and (parametric) model selection within the term structure literature.
Thompson (2004), using LIBOR swap data, argues that affine models pass his newly proposed specification test when they are estimated with the time-series filtered yield volatility, but not when they are estimated with the cross-sectionally backed-out volatility. These studies of yield data alone suggest an alternative explanation for the failure of DTSMs to price derivatives effectively: because of their small convexity, bonds are not sensitive enough to volatility for it to be distinguished from measurement error. Efficient inference therefore requires derivatives data as well.

It can be argued in the same fashion that the unspanned factors are identified most efficiently by adding derivatives data to the analysis. Duarte (2008) shows that mortgage-backed security (MBS) hedging activity affects interest rate volatility and proposes a model that takes these effects as a measure of the stochastic volatility of the underlying term structure. However, it is unclear whether the realized volatility is indeed different from the implied volatility because of the MBS effects.

Li and Zhao (2009) provide one of the first nonparametric estimates of the probability densities of LIBOR rates under forward martingale measures, using caps with a wide range of strike prices and maturities (Footnote 2). The nonparametric estimates of LIBOR forward densities are conditional on the slope and volatility factors of LIBOR rates, while the level factor is automatically incorporated in existing methods (Footnote 3). They find that the forward densities depend significantly on the slope and volatility of LIBOR rates. For example, the forward densities become more dispersed (compact) when the slope of the term structure (the volatility of LIBOR rates) increases. Further analysis reveals a nonlinear relation between the forward densities and the volatility of LIBOR rates that depends on the slope of the term structure. With a flat (steep) term structure, higher volatility tends to lead to more dispersed (compact) forward densities. This result suggests that the speed of mean reversion of the volatility process depends on the slope of the term structure, a feature that has not been explicitly accounted for by existing term structure models. Additionally, the paper documents important impacts of mortgage market activities on the LIBOR forward densities even after controlling for both the slope and volatility factors. For example, the forward densities at intermediate maturities (3, 4, and 5 years) are more negatively skewed when refinancing activity, measured by the Mortgage Bankers Association of America (MBAA) refinance index, is high. Demand for out-of-the-money (OTM) floors by investors in mortgage-backed securities (MBS) to hedge potential losses from prepayments could lead to more negatively skewed forward densities. The impact of refinancing activity is most significant at intermediate maturities because the durations of most MBS are around 5 years.
The forward density at the 2-year maturity is more right-skewed when origination of adjustable-rate mortgages (ARMs), measured by the MBAA ARMs index, is high. Since every ARM contains an interest rate cap that caps the mortgage rate at a certain level, demand for OTM caps from ARM lenders to hedge their exposure to rising interest rates could lead to more right-skewed forward densities. The impact of ARMs is most significant at the 2-year maturity because most ARMs are reset within the first 2 years. These empirical results have important implications for the unspanned stochastic volatility puzzle by providing nonparametric and model-independent evidence of USV. The impacts of mortgage activities on the forward densities further suggest that the unspanned factors could be partially driven by activities in the mortgage markets.

The next question, naturally, is how best to incorporate USV into a term structure model so that it can price a wide spectrum of interest rate derivatives effectively. In contrast to the approach of adding USV restrictions to DTSMs, it is relatively easy to introduce USV in the Heath et al. (1992) (hereafter, HJM) class of models, which includes the LIBOR models of Brace et al. (1997) and Miltersen et al. (1997), the random field models of Goldstein (2000), and the string models of Santa-Clara and Sornette (2001). Indeed, any HJM model in which the forward rate curve has stochastic volatility and the volatility and yield shocks are not perfectly correlated exhibits USV. Therefore, in addition to the commonly known advantages of HJM models (such as perfectly fitting the initial yield curve), they offer the additional advantage of easily accommodating USV. Of course, the trade-off here is that in an HJM model, the yield curve is an input rather than a prediction of the model.

Recently, several HJM models with USV have been developed and applied to price caps and swaptions. Collin-Dufresne and Goldstein (2003) develop a random field model with stochastic volatility and correlation in forward rates. Applying the transform analysis of Duffie et al. (2000), they obtain closed-form formulae for a wide variety of interest rate derivatives. However, they do not calibrate their models to market prices of caps and swaptions. Han (2007) extends the model of LSS (2001) by introducing stochastic volatility and correlation in forward rates, and shows that stochastic volatility and correlation are important for reconciling the mispricing between caps and swaptions. Trolle and Schwartz (in press) develop a multifactor term structure model with unspanned stochastic volatility factors and correlation between innovations to forward rates and their volatilities.

Jarrow et al. (2007) develop a multifactor HJM model with stochastic volatility and jumps in LIBOR forward rates. The LIBOR rates follow the affine jump diffusions (hereafter, AJDs) of Duffie et al. (2000) and a closed-form solution for cap prices is provided. Given that a small number of factors can explain most of the variation in bond yields, they consider low-dimensional model specifications based on the first few (up to three) principal components of historical forward rates. Their model explicitly incorporates jumps in LIBOR rates, making it possible to differentiate between the importance of stochastic volatility and that of jumps for pricing interest rate derivatives.

In section two of this review, we discuss how the original DTSMs have difficulty in pricing and hedging interest rate derivatives, as shown in Li and Zhao (2006). In section three, we present the HJM model of Jarrow, Li and Zhao (2007). Finally, we provide nonparametric evidence from Li and Zhao (2009) showing that neither the realized nor the implied yield volatilities can be spanned by the yield curve factors.

2 Term Structure Models with Spanned Stochastic Volatility

We begin with a two-factor spot rate model with stochastic volatility as in Longstaff and Schwartz (1992). Under the risk-neutral measure Q the short rate r and its volatility V follow a two-dimensional square-root process,

$$\begin{array}{rcl}{\mathit{dr}}_{t}& =& {\kappa}_{r}\left ({\theta}_{r} - {r}_{t}\right )\mathit{dt} + \sqrt{{V}_{t}}{\mathit{dW}}_{1t}^{Q};\end{array}$$
(47.1)
$$\begin{array}{rcl}{\mathit{dV}}_{t}& =& {\kappa}_{V}\left ({\theta}_{V} - {V}_{t}\right )\mathit{dt} + \sigma \sqrt{{V}_{t}}{\mathit{dW}}_{2t}^{Q}.\end{array}$$
(47.2)

The price of the zero coupon bond P(t, T) with maturity T can be solved through the fundamental PDE for bond pricing,

$$\begin{array}{rcl} & & \frac{1} {2}V \left ({P}_{\mathit{rr}} + {\sigma}^{2}{P}_{V V}\right ) + {P}_{r}{\kappa}_{r}\left ({\theta}_{r} - r\right ) \\ & & \qquad + {P}_{V}{\kappa}_{V}\left ({\theta}_{V} - V \right ) + {P}_{t} = rP \end{array}$$
(47.3)

for \(0 \leq t \leq T\) with the terminal condition \(P(T, T) = 1\). The price \(P(t, T)\) and yield \(y(t, T)\) are functions of the state variables \(r_t\) and \(V_t\),

$$\begin{array}{rcl} P(t,T)& =& {e}^{A(T-t)+B(T-t){r}_{t}+C(T-t){V}_{t}},\end{array}$$
(47.4)
$$\begin{array}{rcl} y(t,T)& =& -\frac{\log \left (P(t,T)\right )} {T - t} \\ & =& -\frac{A(T - t)} {T - t} -\frac{B(T - t)} {T - t} {r}_{t}-\frac{C(T - t)} {T - t} {V}_{t},\qquad \;\;\;\end{array}$$
(47.5)

where A, B and C are coefficients depending on maturity. It is clear here that the state variables are linear combinations of the yield curve factors such as level and slope. In this sense, the stochastic volatility is spanned by the bond yields. Both bonds and bond derivatives can be priced through the fundamental PDE, and the same set of state variables enters their prices. We should note here that the volatility process in this model serves two roles. First, it helps to price the cross section of bonds, making the model more flexible than one-factor models in generating various shapes of the yield curve. Second, it is the volatility process of the short rate, and can therefore be inferred from the time series of the short rate alone. One essential question is therefore whether the volatility process inferred cross-sectionally fits the time-series properties stipulated by the model. A failure to fit can be due to model misspecification or to the fact that the volatility process cannot be identified from the yield curve factors. For illustration, we discuss the example given in Casassus et al. (2005),

$$\begin{array}{rcl}{\mathit{dr}}_{t}& =& {\kappa}_{r}\left ({\theta}_{t} - {r}_{t}\right )\mathit{dt} + \sqrt{{V}_{t}}{\mathit{dW}}_{1t}^{Q},\end{array}$$
(47.6)
$$\begin{array}{rcl} d\!{\theta}_{t}& =& {\kappa}_{r}\left ({\gamma}_{\theta}(t) - 2{\kappa}_{r}{\theta}_{t} + \frac{{V}_{t}} {{\kappa}_{r}} \right )\mathit{dt},\end{array}$$
(47.7)
$$\begin{array}{rcl}{\mathit{dV}}_{t}& =& {\mu}_{V}({V}_{t},t)\mathit{dt} + \sigma ({V}_{t},t){\mathit{dW}}_{2t}^{Q},\end{array}$$
(47.8)

where the long-run mean of the short rate \(\theta_t\) follows a pure drift process and the short rate volatility \(V_t\) follows a stochastic process with general drift and diffusion functions that depend only on the volatility itself. It can be shown that the zero coupon bond price \(P(t, T)\) depends only on the short rate and its long-run mean, not on the volatility, i.e.

$$P(t,T) = {e}^{A(t,T)+B(T-t){r}_{t}+C(T-t){\theta}_{t}}.$$
(47.9)

It can also be shown that the price of a European call option on the zero coupon bond, however, depends on the volatility \(V_t\). In this example, the call option cannot be hedged using bonds alone.
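To see why volatility is spanned in the first model but not in this example, one can substitute the exponential-affine form (47.4) into the PDE (47.3) and collect terms. The following is a sketch of that calculation, writing \(\tau = T - t\) and using primes for derivatives with respect to \(\tau\) (this notation is introduced here for illustration and is not taken from the cited papers):

$$\begin{array}{rcl} {B}^{{\prime}}(\tau )& =& -1 - {\kappa}_{r}B(\tau ), \\ {C}^{{\prime}}(\tau )& =& \frac{1} {2}B{(\tau )}^{2} + \frac{1} {2}{\sigma}^{2}C{(\tau )}^{2} - {\kappa}_{V}C(\tau ), \\ {A}^{{\prime}}(\tau )& =& {\kappa}_{r}{\theta}_{r}B(\tau ) + {\kappa}_{V}{\theta}_{V}C(\tau ), \end{array}$$

with \(A(0) = B(0) = C(0) = 0\), so that \(B(\tau ) = -(1 - {e}^{-{\kappa}_{r}\tau})/{\kappa}_{r}\). Because the equation for \(C\) contains the term \(\frac{1}{2}B(\tau)^{2}\), \(C(\tau)\) is not identically zero, and \(V_t\) therefore loads on the yields in (47.5). In the Casassus et al. (2005) example, by contrast, the loading on \(V_t\) in (47.9) is zero at every maturity, so no combination of bonds reveals \(V_t\).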

Therefore, it is an important exercise to test whether a sophisticated DTSM without a USV factor can be used to hedge interest rate derivatives. Dai and Singleton (2002) review many of the current dynamic term structure models, including the affine term structure models (ATSMs) of Duffie and Kan (1996), the quadratic term structure models (QTSMs) of ADG (2002), and many others (Footnote 4).

In a typical dynamic term structure model, the economy is represented by the filtered probability space \((\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{0 \leq t \leq T}, P)\), where \(\{\mathcal{F}_t\}_{0 \leq t \leq T}\) is the augmented filtration generated by an N-dimensional standard Brownian motion, W, on this probability space. It is usually assumed that \(\{\mathcal{F}_t\}_{0 \leq t \leq T}\) satisfies the usual hypotheses (see Protter 1990).

The ATSMs rely on the following assumptions:

  • The instantaneous interest rate \(r_t\) is an affine function of the N-dimensional state variables \(X_t\),

    $$r\left ({X}_{t}\right ) = {\beta}^{{\prime}}{X}_{t} + \alpha.$$
    (47.10)
  • The state variables follow a multivariate affine process,

    $$d{X}_{t} = K\left [\theta - {X}_{t}\right ]dt + \Sigma {S}_{t}d{W}_{t},$$
    (47.11)

    where \(S_t\) is a diagonal matrix whose elements are the square roots of affine functions of \(X_t\). Hence the conditional means and variances of the state variables are affine functions of the state variables.

  • The market price of risk is a function of the state variables,

    $$\zeta ({X}_{t}) = {\eta}_{0}{X}_{t}{S}_{t}^{-} + {\eta}_{1}{S}_{t},$$
    (47.12)

    where \(S_t^{-}\) is a diagonal matrix whose elements are the inverses of the corresponding elements of \(S_t\) wherever these are positive, and zero otherwise.

The zero coupon bond with time to maturity τ can be priced by risk-neutral pricing

$$\begin{array}{rcl} D(t,\tau )& =& {E}_{t}^{Q}\left [{e}^{-{\int \nolimits \nolimits}_{t}^{t+\tau}r\left ({X}_{s}\right )ds} \cdot 1\right ] \\ & =& {e}^{-A\left (\tau \right )-B{\left (\tau \right )}^{{\prime}}{X}_{t}}. \end{array}$$
(47.13)

The functions \(A(\tau)\) and \(B(\tau)\) satisfy a system of ordinary differential equations. The continuously compounded yield \(y(t, \tau)\) is

$$y(t,\tau ) = \frac{1} {\tau}\left [A(\tau ) + B{\left (\tau \right )}^{{\prime}}{X}_{t}\right ].$$
(47.14)

Interest rate derivatives can be priced similarly via risk-neutral pricing. Without any restrictions on the model parameters, the loadings on the state variables, \(B(\tau)\), are in general nonzero. Hence the state variables can always be backed out from a sufficient number of yields, leaving derivative prices redundant for identifying the state variables.
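The inversion described above can be sketched numerically. The snippet below is a minimal illustration, assuming the affine yield equation (47.14) and entirely hypothetical values for the intercepts and loadings; the function name `back_out_states` is made up for this example. It also shows what goes wrong when one factor has zero loading at every maturity: the loading matrix loses rank, which is the algebraic signature of an unspanned factor.

```python
import numpy as np

def back_out_states(yields, taus, A, B):
    """Solve y(t, tau_i) = [A(tau_i) + B(tau_i)' X_t] / tau_i for X_t by least squares.

    yields: (m,) observed yields; taus: (m,) maturities;
    A: (m,) intercepts A(tau_i); B: (m, N) matrix whose rows are B(tau_i)'.
    Returns the estimate of X_t and the numerical rank of the loading matrix.
    """
    yields, taus, A, B = map(np.asarray, (yields, taus, A, B))
    loadings = B / taus[:, None]          # row i is B(tau_i)' / tau_i
    target = yields - A / taus
    x_hat, _, rank, _ = np.linalg.lstsq(loadings, target, rcond=None)
    return x_hat, rank

# Hypothetical numbers, for illustration only (not estimates from any paper).
taus = np.array([1.0, 5.0, 10.0])
A = np.array([0.01, 0.08, 0.20])
B_spanned = np.array([[0.9, 0.3, 0.1],
                      [3.0, 2.0, 1.0],
                      [4.0, 4.5, 3.0]])
x_true = np.array([0.02, 0.01, 0.04])
yields = (A + B_spanned @ x_true) / taus

# With nonzero, linearly independent loadings, three yields recover three states.
x_hat, rank = back_out_states(yields, taus, A, B_spanned)

# If the third factor has zero loading at every maturity (an unspanned factor),
# the loading matrix is rank deficient and yields cannot pin that factor down.
B_usv = B_spanned.copy()
B_usv[:, 2] = 0.0
_, rank_usv = back_out_states(yields, taus, A, B_usv)
```

In the rank-deficient case the least-squares routine still returns a vector, but the component for the zero-loading factor is arbitrary, so additional instruments such as derivatives would be needed to identify it.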

The QTSMs rely on the following assumptions:

  • The instantaneous interest rate \(r_t\) is a quadratic function of the N-dimensional state variables \(X_t\),

    $$r\left ({X}_{t}\right ) = {X}_{t}^{{\prime}}\Psi {X}_{t} + {\beta}^{{\prime}}{X}_{t} + \alpha $$
    (47.15)
  • The state variables follow a multivariate Gaussian process,

    $${\mathit{dX}}_{t} = \left [\mu + \xi {X}_{t}\right ]dt + \Sigma {\mathit{dW}}_{t},$$
    (47.16)
  • The market price of risk is an affine function of the state variables,

    $$\zeta ({X}_{t}) = {\eta}_{0} + {\eta}_{1}{X}_{t}.$$
    (47.17)

Note that in the above equations \(\Psi\), \(\xi\), \(\Sigma\), and \(\eta_1\) are N-by-N matrices, \(\beta\), \(\mu\) and \(\eta_0\) are vectors of length N, and \(\alpha\) is a scalar. The quadratic relation between \(r_t\) and \(X_t\) has the desirable property that \(r_t\) is guaranteed to be nonnegative if \(\Psi\) is positive semidefinite (and invertible) and \(\alpha -\frac{1} {4}{\beta}^{{\prime}}{\Psi}^{-1}\beta \geq 0.\) Although \(X_t\) follows a Gaussian process in Equation (47.16), the interest rate \(r_t\) exhibits conditional heteroskedasticity because of the quadratic relationship between \(r_t\) and \(X_t\). As a result, the QTSMs are more flexible than the ATSMs in modeling volatility clustering in bond yields and correlations among the state variables.
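The nonnegativity condition follows from minimizing the quadratic form in (47.15): setting the gradient \(2\Psi X + \beta\) to zero gives the minimizer \(X^{*} = -\frac{1}{2}\Psi^{-1}\beta\) and the minimum value \(\alpha - \frac{1}{4}\beta^{\prime}\Psi^{-1}\beta\). The sketch below checks this numerically, assuming a symmetric positive definite \(\Psi\) and hypothetical parameter values chosen only for illustration.

```python
import numpy as np

def short_rate(x, Psi, beta, alpha):
    """Quadratic short rate r(X) = X' Psi X + beta' X + alpha, as in (47.15)."""
    return x @ Psi @ x + beta @ x + alpha

def short_rate_lower_bound(Psi, beta, alpha):
    """Minimum of r(X) over X, assuming Psi is symmetric positive definite.

    The first-order condition 2 Psi X + beta = 0 gives X* = -0.5 Psi^{-1} beta,
    and r(X*) = alpha - 0.25 beta' Psi^{-1} beta.
    """
    psi_inv_beta = np.linalg.solve(Psi, beta)
    x_star = -0.5 * psi_inv_beta
    return alpha - 0.25 * beta @ psi_inv_beta, x_star

# Hypothetical parameter values (assumptions for this example).
Psi = np.array([[2.0, 0.5],
                [0.5, 1.0]])
beta = np.array([0.2, -0.1])
alpha = 0.03

bound, x_star = short_rate_lower_bound(Psi, beta, alpha)

# Evaluate the short rate at many random states; none should fall below the bound.
rng = np.random.default_rng(0)
samples = rng.normal(size=(1000, 2))
rates = np.array([short_rate(x, Psi, beta, alpha) for x in samples])
```

With these values the bound is strictly positive, so every simulated short rate is positive as well.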

Consequently, the yield-to-maturity, y(t, τ), is a quadratic function of the state variables

$$y(t,\tau ) = \frac{1} {\tau}\left [{X}_{t}^{{\prime}}A(\tau ){X}_{t} + b{(\tau )}^{{\prime}}{X}_{t} + c(\tau )\right ].$$
(47.18)

In contrast, in the ATSMs the yields are linear in the state variables and therefore the correlations among the yields are solely determined by the correlations of the state variables. Although the state variables in the QTSMs follow a multivariate Gaussian process, the quadratic form of the yields helps to model the time-varying volatility and correlation of bond yields. Leippold and Wu (2002) show that a large class of fixed-income securities can be priced in closed form in the QTSMs using the transform analysis of Duffie et al. (2001). The details of the derivation are in the Appendix.

The first test for these models is whether they capture both the cross-sectional and time-series properties of bond yields, a topic reviewed in Dai and Singleton (2003). Even though the most sophisticated models can fit the cross section of bond prices very well and can capture the time-series properties of the first moment of the yield curve factors, they do not perform satisfactorily in capturing the second moment. The second test is whether these models can be used to price and hedge a cross section of interest rate derivatives. The second test is made somewhat easier by one major advantage of these DTSMs: they provide closed-form solutions for a wide range of interest rate derivatives.

The empirical results shown below are from Li and Zhao (2006), who study the performance of QTSMs in pricing and hedging interest rate caps. Even though the study is based on QTSMs, the empirical findings are common to ATSMs as well (Footnote 5). To price and hedge caps in the QTSMs, both model parameters and latent state variables need to be estimated. Due to the quadratic relationship between bond yields and the state variables, the state variables are not identified by the observed yields even in the univariate case in the QTSMs. Previous studies, such as ADG (2002), have used the efficient method of moments (EMM) of Gallant and Tauchen (1998) to estimate the QTSMs. Li and Zhao (2006) use the extended Kalman filter (EKF) to estimate model parameters and extract the latent state variables in one step. The details of the implementation of the EKF are in the Appendix.
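The identification problem in the univariate case can be illustrated with a small sketch. With one factor, the yield equation (47.18) is a scalar quadratic in the state, so a single observed yield is generally consistent with two different state values. The coefficients below are hypothetical and the helper name `states_consistent_with_yield` is made up for this example.

```python
import numpy as np

def states_consistent_with_yield(y, tau, a, b, c):
    """Real solutions x of y = (a x^2 + b x + c) / tau in a one-factor quadratic model."""
    roots = np.roots([a, b, c - tau * y])
    return np.sort(roots[np.isreal(roots)].real)

# Hypothetical coefficients a = A(tau), b = b(tau), c = c(tau) at tau = 2 years.
tau, a, b, c = 2.0, 0.5, 0.1, 0.02
x_true = 0.3
y_obs = (a * x_true**2 + b * x_true + c) / tau

# Two distinct states reproduce the same observed yield.
candidates = states_consistent_with_yield(y_obs, tau, a, b, c)
```

A filtering approach such as the EKF resolves this ambiguity by combining the cross-section of yields with the assumed time-series dynamics of the state, rather than inverting yields one date at a time.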

The pricing analysis can reveal two sources of potential model misspecification. One is the number of factors in the model, as a missing factor usually causes large pricing errors. An analogy is using the Black–Scholes model when the stock price is generated by a stochastic volatility model. The other is the assumed distribution of the innovations of each factor. If the innovations of a factor have a fat-tailed distribution, the convenient assumption of a Gaussian distribution will deliver large pricing errors as well. From a pricing study alone, therefore, we cannot determine whether one, the other, or both cause large pricing errors. Hedging analysis, on the other hand, focuses on changes in prices, so even if the marginal distribution of prices is highly non-Gaussian, the conditional distribution over a small time step can still be reasonably approximated by a Gaussian distribution. As a result, a deficiency in hedging, especially at high frequency, reveals more about potential missing factors than about the distributional assumptions in a model.

In Li and Zhao (2006), QTSMs can capture yield curve dynamics extremely well. First, given the estimated model parameters and state variables, they compute the one day ahead projection of yields based on the estimated model. Figure 47.1 shows that QTSM1 model projected yields are almost indistinguishable from the corresponding observed yields. Secondly, they examine the performance of the QTSMs in hedging zero-coupon bonds assuming that the filtered state variables are traded and use them as hedging instruments. The delta-neutral hedge is conducted for zero-coupon bonds of six maturities on a daily basis. Hedging performance is measured by variance ratio, which is defined as the percentage of the variations of an unhedged position that can be reduced by hedging. The results on the hedging performance in Table 47.2 show that in most cases the variance ratios are higher than 95%. This should not be surprising given the excellent fit of bond yields by the QTSMs.
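The variance ratio used here can be sketched in a few lines. The code below is a minimal illustration on synthetic data, interpreting "variations" as sample variance (an assumption) and treating the hedge as a delta hedge with respect to traded state variables; the function names and all numbers are hypothetical.

```python
import numpy as np

def delta_hedged_changes(price_changes, hedge_ratios, state_changes):
    """Change in the position minus the change in the hedging portfolio.

    hedge_ratios and state_changes are (T, N) arrays: the sensitivity of the
    position to each state variable and the change in that state variable.
    """
    hedge_portfolio = np.sum(np.asarray(hedge_ratios) * np.asarray(state_changes), axis=1)
    return np.asarray(price_changes) - hedge_portfolio

def variance_ratio(unhedged_changes, hedged_changes):
    """Fraction of the variance of the unhedged position removed by hedging."""
    unhedged = np.asarray(unhedged_changes, dtype=float)
    hedged = np.asarray(hedged_changes, dtype=float)
    return 1.0 - hedged.var(ddof=1) / unhedged.var(ddof=1)

# Synthetic example: price changes driven almost entirely by three state variables.
rng = np.random.default_rng(1)
T, N = 500, 3
state_changes = rng.normal(scale=0.01, size=(T, N))
hedge_ratios = np.tile([-1.5, -0.8, 0.4], (T, 1))
residual = rng.normal(scale=0.001, size=T)      # part the factors cannot explain
price_changes = np.sum(hedge_ratios * state_changes, axis=1) + residual

hedged = delta_hedged_changes(price_changes, hedge_ratios, state_changes)
vr = variance_ratio(price_changes, hedged)
```

A ratio close to one means the state variables account for nearly all of the variation in the position; a low ratio points to risk that the hedging instruments do not span.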

Fig. 47.1 The observed yields (dot) and the QTSM1 projected yields

Table 47.2 The performance of QTSMs in modeling bond yields

If the Libor and swap market and the cap market are well integrated, the estimated three-factor QTSMs should be able to hedge caps well. Based on the estimated model parameters, a delta-neutral hedge of weekly changes in difference cap prices is conducted using the filtered state variables as hedging instruments. It is also possible to use Libor zero-coupon bonds as hedging instruments by matching the hedge ratios of a difference cap with those of zero-coupon bonds. Daily rebalancing, i.e., adjusting the hedge ratios every day in response to changes in market conditions, is implemented to improve hedging performance. The daily change in a hedged position is therefore the difference between the daily change in the unhedged position and the daily change in the hedging portfolio. The latter equals the sum of the products of a difference cap’s hedge ratios with respect to the state variables and the changes in the corresponding state variables. Weekly changes are the accumulated daily changes. Hedging effectiveness is measured by the variance ratio, the percentage of the variation of an unhedged position that can be reduced by hedging. This measure is similar in spirit to the \(R^2\) in linear regression. The variance ratios of the three QTSMs in Table 47.3 show that all models have better hedging performance for ITM, short-term (maturities from 1.5 to 4 years) difference caps (Footnote 6) than for OTM, medium-term and long-term (maturities longer than 4 years) difference caps. A high percentage of the variation in long-term and OTM difference cap prices cannot be hedged. The maximally flexible model, QTSM1, again has better hedging performance than the other two models. To control for the possibility that the QTSMs are misspecified, in Panel B of Table 47.3 the hedging errors of each moneyness/maturity group are further regressed on the changes in the three yield factors. While the three yield factors can explain some additional hedging errors, their incremental explanatory power is not very significant.
Thus, even excluding the hedging errors that can be captured by the three yield factors, there is still a large fraction of the variation in difference cap prices that cannot be explained by the QTSMs. Table 47.4 reports the performance of the QTSMs in hedging cap straddles. Difference floor prices are computed from difference cap prices using put-call parity and are used to construct weekly straddle returns. As straddles are highly sensitive to volatility risk, both delta- and gamma-neutral hedges are needed. The variance ratios of QTSM1 are as low as the \(R^2\)s of the linear regressions of straddle returns on the yield factors in Table 47.1, suggesting that neither approach can explain much of the variation in straddle returns. Collin-Dufresne and Goldstein (2002) show that 80% of straddle regression residuals can be explained by one additional factor. Principal component analysis of ATM straddle hedging errors in Panel B of Table 47.4 shows that the first factor can explain about 60% of the total variation in hedging errors. The second and third factors each explain about 10% of hedging errors, and two additional factors combined can explain about another 10%. The correlation matrix of the ATM straddle hedging errors across maturities in Panel C shows that the hedging errors of short-term (2, 2.5, 3, 3.5, and 4 year), medium-term (4.5 and 5 year) and long-term (8, 9, and 10 year) straddles are highly correlated within each group, suggesting that there could be multiple unspanned factors.
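The principal component step can be sketched as follows. This is a minimal illustration on synthetic hedging errors with one common factor plus noise, not a reproduction of the results in Table 47.4; the function name and the data-generating numbers are assumptions made for the example.

```python
import numpy as np

def pca_explained_variance(errors):
    """Share of total variance explained by each principal component.

    errors: (T, M) array of hedging errors, one column per maturity.
    Returns the shares sorted from largest to smallest.
    """
    centered = errors - errors.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]   # eigvalsh returns ascending order
    return eigenvalues / eigenvalues.sum()

# Synthetic hedging errors for six maturities: one common factor plus noise.
rng = np.random.default_rng(2)
T, M = 400, 6
common_factor = rng.normal(size=(T, 1))
loadings = np.linspace(0.8, 1.2, M)[None, :]
errors = common_factor * loadings + 0.5 * rng.normal(size=(T, M))

shares = pca_explained_variance(errors)
```

When a single systematic factor drives the errors, the first share dominates; a more even spread across the leading shares would be consistent with several unspanned factors.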

Table 47.3 The performance of QTSMs in hedging interest rate caps
Table 47.4 Hedging interest rate cap straddles

To further understand whether the unspanned factors are related to stochastic volatility, we study the relationship between ATM cap implied volatilities and straddle hedging errors. Principal component analysis in Panel A of Table 47.5 shows that the first component explains 85% of the variation in cap implied volatilities. In Panel B, we regress straddle hedging errors on changes in the three yield factors and obtain \(R^2\)s that are close to zero. However, if we include the weekly changes in the first few principal components of cap implied volatilities, the \(R^2\)s increase significantly: for some maturities, the \(R^2\)s are above 90%. Although the time series of implied volatilities are very persistent, their differences are not, so we do not suffer from the well-known problem of spurious regression. In the extreme case in which we regress the straddle hedging errors of each maturity on changes in the yield factors and in the cap implied volatility of the same maturity, the \(R^2\)s in most cases are above 90%. These results show that straddle returns are mainly affected by volatility risk rather than by term structure factors.
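The comparison of the two regressions can be sketched with ordinary least squares. The data below are synthetic and constructed so that hedging errors load on a volatility component but not on the yield factors; they only illustrate the mechanics and are not the data behind Table 47.5. All variable names are hypothetical.

```python
import numpy as np

def r_squared(y, X):
    """R^2 of an OLS regression of y on the columns of X, with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

# Synthetic weekly changes: three yield factors and one implied-volatility component.
rng = np.random.default_rng(3)
T = 500
d_yield_factors = rng.normal(size=(T, 3))
d_vol_pc = rng.normal(size=(T, 1))

# Assumed data-generating process: errors driven by volatility changes only.
hedging_error = 2.0 * d_vol_pc[:, 0] + 0.3 * rng.normal(size=T)

r2_yields = r_squared(hedging_error, d_yield_factors)
r2_with_vol = r_squared(hedging_error, np.column_stack([d_yield_factors, d_vol_pc]))
```

In this construction the yield factors alone explain almost nothing, while adding the volatility component raises the fit sharply, which is the pattern the text describes for Panel B.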

Table 47.5 Straddle hedging errors and cap implied volatilities

As ATM straddles are mainly exposed to volatility risk, their hedging errors can serve as a proxy of the USV. Panels A and B of Table 47.6 report the \({R}^{2}\)s of regressions of hedging errors of difference caps and cap straddles across moneyness and maturity on changes of the three yield factors and the first five principal components of straddle hedging errors. In contrast to the regressions in Panel D of Table 47.6, which only include the three yield factors, the additional factors from straddle hedging errors significantly improve the \({R}^{2}\)s of the regressions: for most moneyness/maturity groups, the \({R}^{2}\)s are above 90%. Interestingly, for long-term caps, the \({R}^{2}\)s of ATM and OTM caps are actually higher than those of ITM caps. Therefore, a combination of the yield factors and the USV factors can explain cap prices across moneyness and maturity very well.

Table 47.6 ATM straddle hedging error as a proxy of systematic USV

While the above analysis is mainly based on the QTSMs, the evidence on USV is so compelling that the results should be robust to potential model misspecification. The fact that the QTSMs provide an excellent fit of bond yields but can explain only a small percentage of the variations of ATM straddle returns is a strong indication that the models miss some risk factors that are important for the cap market. While we estimate the QTSMs using only bond prices, we could also include cap prices in model estimation. We do not choose the second approach for several reasons. First, the current approach is consistent with the main objective of our study: to use risk factors extracted from the swap market to explain cap prices. Second, it is not clear that modifications of model parameters without changing the fundamental structure of the model could remedy the poor cross-sectional hedging performance of the QTSMs. In fact, if the QTSMs indeed miss some important factors, then no matter how they are estimated (using bonds or derivatives data), they are unlikely to have good hedging performance. Finally, Jagannathan et al. (2003) do not find significant differences between parameters of ATSMs estimated using LIBOR/swap rates and cap/swaption prices. The existence of USV strongly suggests that existing DTSMs need to relax their fundamental assumption that derivatives are redundant securities by explicitly incorporating USV factors. It also suggests that it might be more convenient to consider derivative pricing in the forward rate models of HJM (1992) or the random field models of Goldstein (2000) and Santa-Clara and Sornette (2001) because it is generally very difficult to introduce USV in DTSMs. For example, Collin-Dufresne and Goldstein (2002) show that highly restrictive assumptions on model parameters need to be imposed to guarantee that some state variables that are important for derivative pricing do not affect bond prices.
In contrast, they show that it is much easier to introduce USV in the HJM and random field class of models: any HJM or random field model in which the forward rate has a stochastic volatility exhibits USV. While it has always been argued that HJM and random field models are more appropriate for pricing derivatives than DTSMs, the reasoning given here is quite different. That is, in addition to the commonly known advantages of these models (for example, they can perfectly fit the initial yield curve, while DTSMs generally cannot), another advantage of HJM and random field models is that they can easily accommodate USV (see Collin-Dufresne and Goldstein (2002) for illustration).

The existence of USV suggests that existing DTSMs may not be directly applicable to derivatives because they all rely on the fundamental assumption that bonds and derivatives are driven by the same set of risk factors. In this paper, we provide probably the first empirical analysis of DTSMs in hedging interest rate derivatives and hope to resolve the controversy on USV through this exercise.

3 LIBOR Market Models with Stochastic Volatility and Jumps: Theory and Estimation

In this section, we develop a multifactor HJM model with stochastic volatility and jumps in LIBOR forward rates and discuss model estimation and comparison using a wide cross section of difference caps. Instead of modeling the unobservable instantaneous spot rate or forward rate, we focus on the LIBOR forward rates which are observable and widely used in the market.

3.1 Specification of the LIBOR Market Models

Throughout our analysis, we restrict the cap maturity \(T\) to a finite set of dates \(0 = {T}_{0} < {T}_{1} < \cdots < {T}_{K} < {T}_{K+1}\), and assume that the intervals \({T}_{k+1} - {T}_{k}\) are equally spaced by \(\delta\), a quarter of a year. Let \({L}_{k}\left (t\right ) = L\left (t,{T}_{k}\right )\) be the LIBOR forward rate for the actual period \(\left [{T}_{k},{T}_{k+1}\right ]\), and similarly let \({D}_{k}\left (t\right ) = D\left (t,{T}_{k}\right )\) be the price of a zero-coupon bond maturing on \({T}_{k}\). Thus, we have

$$L\left (t,{T}_{k}\right ) = \frac{1} {\delta}\left ( \frac{D\left (t,{T}_{k}\right )} {D\left (t,{T}_{k+1}\right )} - 1\right ),\text{for}\ k = 1,2,\ldots K.$$
(47.19)
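As a small numerical illustration of Equation (47.19), the sketch below converts a vector of zero-coupon bond prices into quarterly forward LIBOR rates. The flat 5% curve and the function name `forward_libor` are assumptions made for the example only.

```python
import numpy as np

def forward_libor(discount, delta=0.25):
    """Simple forward rates from zero-coupon bond prices, as in Eq. (47.19).

    discount[k] is D(t, T_k) on an equally spaced grid with spacing delta.
    Returns L(t, T_k) for every k that has a next maturity T_{k+1}.
    """
    d = np.asarray(discount, dtype=float)
    return (d[:-1] / d[1:] - 1.0) / delta

# Hypothetical flat curve: 5% continuously compounded zero rate (not from the paper).
maturities = np.arange(1, 9) * 0.25        # T_1, ..., T_8 in years
discount = np.exp(-0.05 * maturities)      # D(0, T_k)
forwards = forward_libor(discount)         # L(0, T_1), ..., L(0, T_7)
print(np.round(forwards, 6))
```

On a flat continuously compounded curve every forward rate equals \(\left ({e}^{0.05\delta} - 1\right )/\delta\), which gives a quick sanity check of the implementation.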

For LIBOR-based instruments, such as caps, floors and swaptions, it is convenient to consider pricing under the forward measure. Thus, we will focus on the dynamics of the LIBOR forward rates \({L}_{k}\left (t\right )\) under the forward measure \({\mathbb{Q}}^{k+1}\), which is essential for pricing caplets maturing at \({T}_{k+1}\). Under this measure, the discounted price of any security using \({D}_{k+1}\left (t\right )\) as the numeraire is a martingale. Therefore, the time-t price of a caplet maturing at \({T}_{k+1}\) with a strike price of \(X\) is

$$\mathit{Caplet}\left (t,{T}_{k+1},X\right ) = \delta {D}_{k+1}\left (t\right ){E}_{t}^{{\mathbb{Q}}^{k+1}}\left [{\left ({L}_{k}\left ({T}_{k}\right ) - X\right )}^{+}\right ],$$
(47.20)

where \({E}_{t}^{{\mathbb{Q}}^{k+1}}\) is taken with respect to \({\mathbb{Q}}^{k+1}\) given the information set at t. The key to valuation is modeling the evolution of \({L}_{k}\left (t\right )\) under \({\mathbb{Q}}^{k+1}\) realistically and yet parsimoniously enough to yield a closed-form pricing formula. To achieve this goal, we rely on the flexible affine jump-diffusions (AJDs) of Duffie et al. (2000) to model the evolution of LIBOR rates.

We assume that under the physical measure \(\mathbb{P}\), the dynamics of LIBOR rates are given by the following system of SDEs, for \(t \in \left [0,{T}_{k}\right )\) and \(k = 1,\ldots ,K\),

$$\frac{{\mathit{dL}}_{k}\left (t\right )} {{L}_{k}\left (t\right )} = {\alpha}_{k}\left (t\right )\mathit{dt} + {\sigma}_{k}\left (t\right )d{Z}_{k}\left (t\right ) +{\mathit{dJ}}_{k}\left (t\right ),$$
(47.21)

where \({\alpha}_{k}\left (t\right )\) is an unspecified drift term, \({Z}_{k}\left (t\right )\) is the k-th element of a K-dimensional correlated Brownian motion with a covariance matrix \(\Psi \left (t\right )\), and \({J}_{k}\left (t\right )\) is the k-th element of a K-dimensional independent pure jump process assumed independent of \({Z}_{k}\left (t\right )\) for all k. To introduce stochastic volatility and correlation, we could allow the volatility of each LIBOR rate \({\sigma}_{k}\left (t\right )\) and each individual element of \(\Psi \left (t\right )\) to follow a stochastic process. But such a model is unnecessarily complicated and difficult to implement. Instead, we consider a low dimensional model based on the first few principal components of historical LIBOR forward rates. We assume that the entire LIBOR forward curve is driven by a small number of factors \(N < K\) (\(N \leq 3\) in our empirical analysis). By focusing on the first N principal components of historical LIBOR rates, we can reduce the dimension of the model from K to N.

Following LSS (2001) and Han (2007), we assume that the instantaneous covariance matrix of changes in LIBOR rates shares the same eigenvectors as the historical covariance matrix. Suppose that the historical covariance matrix can be approximated as \(H = U{\Lambda}_{0}{U}^{{\prime}}\), where \({\Lambda}_{0}\) is a diagonal matrix whose diagonal elements are the first N largest eigenvalues in descending order, and the N columns of \(U\) are the corresponding eigenvectors.Footnote 7 Our assumption means that the instantaneous covariance matrix of changes in LIBOR rates with fixed time-to-maturity, \({\Omega}_{t}\), shares the same eigenvectors as \(H\). That is

$${\Omega}_{t} = U{\Lambda}_{t}{U}^{{\prime}},$$
(47.22)

where \({\Lambda}_{t}\) is a diagonal matrix whose i-th diagonal element, denoted by \({V}_{i}\left (t\right )\), can be interpreted as the instantaneous variance of the i-th common factor driving the yield curve evolution at t. We assume that \(V \left (t\right )\) follows the square-root process that has been widely used in the literature for modeling stochastic volatility (see, e.g., Heston 1993):

$${\mathit{dV}}_{i}\left (t\right ) = {\kappa}_{i}\left (\bar{{v}}_{i} - {V}_{i}\left (t\right )\right )\mathit{dt} + {\xi}_{i}\sqrt{{V}_{i} \left (t\right )}d\!\tilde{{W}}_{i}\left (t\right )$$
(47.23)

where \(\tilde{{W}}_{i}\left (t\right )\) is the i-th element of an N-dimensional independent Brownian motion assumed independent of \({Z}_{k}\left (t\right )\) and \({J}_{k}\left (t\right )\) for all k.Footnote 8
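The construction in Equations (47.22) and (47.23) can be sketched numerically: extract the leading eigenpairs of a historical covariance matrix, simulate one square-root variance process per factor, and assemble \({\Omega}_{t} = U{\Lambda}_{t}{U}^{{\prime}}\). Everything specific here is an assumption for illustration: the synthetic data, the parameter values, and the Euler scheme with full truncation (one common discretization; the text does not specify a simulation method).

```python
import numpy as np

def top_eigenpairs(H, n):
    """Largest n eigenvalues (descending) and eigenvectors of a symmetric matrix."""
    w, v = np.linalg.eigh(H)               # eigh returns ascending eigenvalues
    order = np.argsort(w)[::-1][:n]
    return w[order], v[:, order]

def simulate_sqrt_process(v0, kappa, vbar, xi, dt, n_steps, rng):
    """Euler scheme with full truncation for dV = kappa (vbar - V) dt + xi sqrt(V) dW."""
    v = np.empty(n_steps + 1)
    v[0] = v0
    for t in range(n_steps):
        v_pos = max(v[t], 0.0)             # truncate so the square root is defined
        shock = rng.standard_normal() * np.sqrt(dt)
        v[t + 1] = v[t] + kappa * (vbar - v_pos) * dt + xi * np.sqrt(v_pos) * shock
    return np.maximum(v, 0.0)

rng = np.random.default_rng(0)
# Hypothetical weekly changes in K = 6 forward rates (synthetic, for illustration only).
changes = 0.01 * rng.standard_normal((200, 6)) @ rng.standard_normal((6, 6))
H = np.cov(changes, rowvar=False)
lam0, U = top_eigenpairs(H, n=3)           # diagonal of Lambda_0 and U (K x N)

# One variance path per factor, started at and reverting to the historical eigenvalue.
paths = np.column_stack([
    simulate_sqrt_process(l, kappa=1.0, vbar=l, xi=0.01, dt=1 / 52, n_steps=104, rng=rng)
    for l in lam0
])
omega_end = U @ np.diag(paths[-1]) @ U.T   # Omega_t = U Lambda_t U' at the last date
```

Because \(U\) is held fixed, all time variation in \({\Omega}_{t}\) comes from the N variance paths, which is what reduces the dimension of the problem from K to N.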

While Equations (47.22) and (47.23) specify the instantaneous covariance matrix of LIBOR rates with fixed time-to-maturity, in applications we need the instantaneous covariance matrix of LIBOR rates with fixed maturities, \({\Sigma}_{t}\). At \(t = 0\), \({\Sigma}_{t}\) coincides with \({\Omega}_{t}\); for \(t > 0\), we obtain \({\Sigma}_{t}\) from \({\Omega}_{t}\) through interpolation. Specifically, we assume that \({U}_{s,j}\) is piecewise constant,Footnote 9 i.e., for time to maturity \(s \in \left ({T}_{k},{T}_{k+1}\right )\),

$${U}_{s}^{2} = \frac{1} {2}\left ({U}_{k}^{2} + {U}_{k+1}^{2}\right ).$$
(47.24)

We further assume that \({U}_{s,j}\) is constant for all caplets belonging to the same difference cap. For the family of the LIBOR rates with maturities \(T = \left ({T}_{1},{T}_{2},\ldots ,{T}_{K}\right )\), we denote by \({U}_{T-t}\) the time-t matrix that consists of rows of \({U}_{{T}_{k}-t},\) and therefore we have the time-t covariance matrix of the LIBOR rates with fixed maturities,

$${\Sigma}_{t} = {U}_{T-t}{\Lambda}_{t}{U}_{T-t}^{{\prime}}.$$
(47.25)

To stay within the family of AJDs, we assume that the random jump times arrive with a constant intensity \({\lambda}_{J}\), and conditional on the arrival of a jump, the jump size follows a normal distribution \(N\left ({\mu}_{J},{\sigma}_{J}^{2}\right )\). Intuitively, the conditional probability at time t of another jump within the next small time interval \(\Delta t\) is \({\lambda}_{J}\Delta t\) and, conditional on a jump event, the mean relative jump size is \(\mu =\exp \left ({\mu}_{J} + \frac{1} {2}{\sigma}_{J}^{2}\right ) - 1.\) Footnote 10 We also assume that the shocks driving LIBOR rates, volatility, and jumps (both jump time and size) are mutually independent.
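The jump specification can be checked by simulation. The sketch below assumes, as our reading of the text, that the normally distributed jump size is the jump in the log rate, so that the relative jump in the rate is the exponential of the jump size minus one; under that reading the sample mean should match \(\exp \left ({\mu}_{J} + \frac{1} {2}{\sigma}_{J}^{2}\right ) - 1\). All parameter values are hypothetical and are not estimates from the paper.

```python
import numpy as np

def mean_relative_jump(mu_j, sigma_j):
    """Mean relative jump size exp(mu_J + sigma_J^2 / 2) - 1 for normal log-jumps."""
    return np.exp(mu_j + 0.5 * sigma_j ** 2) - 1.0

rng = np.random.default_rng(1)
lam_j, mu_j, sigma_j = 0.5, -0.10, 0.20     # hypothetical intensity and jump parameters
n_paths, horizon = 200_000, 1.0

# Jump counts over the horizon are Poisson with mean lam_j * horizon.
counts = rng.poisson(lam_j * horizon, size=n_paths)
# Conditional on a jump, the log-jump is N(mu_j, sigma_j^2).
log_jumps = rng.normal(mu_j, sigma_j, size=n_paths)

mc_intensity = counts.mean() / horizon
mc_mean_jump = np.exp(log_jumps).mean() - 1.0
analytic = mean_relative_jump(mu_j, sigma_j)
print(mc_intensity, mc_mean_jump, analytic)
```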

Given the above assumptions, we have the following dynamics of LIBOR rates under the physical measure \(\mathbb{P}\),

$$\begin{array}{rcl} \frac{{\mathit{dL}}_{k}\left (t\right )} {{L}_{k}\left (t\right )} & =& {\alpha}_{k}\left (t\right )\mathit{dt} +{\sum \nolimits}_{j=1}^{N}{U}_{{T}_{k}-t,j}\sqrt{{V}_{j} \left (t\right )}{\mathit{dW}}_{j}\left (t\right ) \\ & & +{\mathit{dJ}}_{k}\left (t\right ),\ k = 1,2,\ldots ,K. \end{array}$$
(47.26)

To price caps, we need the dynamics of LIBOR rates under the appropriate forward measure. The existence of stochastic volatility and jumps results in an incomplete market and hence the non-uniqueness of forward martingale measures. Our approach for eliminating this non-uniqueness is to specify the market prices of both the volatility and jump risks to change from the physical measure \(\mathbb{P}\) to the forward measure \({\mathbb{Q}}^{k+1}\).Footnote 11 Following the existing literature, we model the volatility risk premium as \({\eta}_{j}^{k+1}\sqrt{{V}_{j} \left (t\right )},\) for \(j = 1,\ldots ,N\). For the jump risk premium, we assume that under the forward measure \({\mathbb{Q}}^{k+1},\) the jump process has the same distribution as that under \(\mathbb{P}\), except that the jump size follows a normal distribution with mean \({\mu}_{J}^{k+1}\) and variance \({\sigma}_{J}^{2}\). Thus, the mean relative jump size under \({\mathbb{Q}}^{k+1}\) is \({\mu}^{k+1} =\exp \left ({\mu}_{J}^{k+1} + \frac{1} {2}{\sigma}_{J}^{2}\right ) - 1.\) Our specification of the market prices of jump risks allows the mean relative jump size under \({\mathbb{Q}}^{k+1}\) to be different from that under \(\mathbb{P}\), accommodating a premium for jump size uncertainty. This approach, which is also adopted by Pan (2002), artificially absorbs the risk premium associated with the timing of the jump by the jump size risk premium. In our empirical analysis, we make the simplifying assumption that the volatility and jump risk premiums are linear functions of time-to-maturity, i.e., \({\eta}_{j}^{k+1} = {c}_{jv}\left ({T}_{k} - 1\right )\) and \({\mu}_{J}^{k+1} = {\mu}_{J} + {c}_{J}\left ({T}_{k} - 1\right )\).Footnote 12 Due to the no-arbitrage restriction, the risk premiums of shocks to LIBOR rates for different forward measures are intimately related to each other. If shocks to volatility and jumps were also correlated with shocks to LIBOR rates, then both volatility and jump risk premiums for different forward measures would also be closely related to each other.
However, in our model shocks to LIBOR rates are independent of those to volatility and jumps, and as a result, the change of measure of LIBOR shocks does not affect that of volatility and jump shocks. Due to stochastic volatility and jumps, the underlying LIBOR market is no longer complete and there is no unique forward measure. This gives us the freedom to choose the functional forms of \({\eta}_{j}^{k+1}\) and \({\mu}_{J}^{k+1}\). See Andersen and Brotherton-Ratcliffe (2001) for similar discussions.

Given the above market prices of risks, we can write down the dynamics of \(\log \left ({L}_{k}\left (t\right )\right )\) under the forward measure \({\mathbb{Q}}^{k+1},\)

$$\begin{array}{rcl} d\log ({L}_{k}\left (t\right ))& =& -\left ({\lambda}_{J}{\mu}^{k+1} + \frac{1} {2}{\sum \nolimits}_{j=1}^{N}{U}_{{T}_{k}-t,j}^{2}{V}_{j}\left (t\right )\right )\mathit{dt} \\ & & +{\sum \nolimits}_{j=1}^{N}{U}_{{T}_{k}-t,j}\sqrt{{V}_{j} \left (t\right )}{\mathit{dW}}_{j}^{{\mathbb{Q}}^{k+1}}\left (t\right ) \\ & & +{\mathit{dJ}}_{k}^{{\mathbb{Q}}^{k+1}}\left (t\right ). \end{array}$$
(47.27)

For pricing purpose, the above process can be further simplified to the following one which has the same distribution,

$$\begin{array}{rcl} d\log ({L}_{k}\left (t\right ))& =& -\left ({\lambda}_{J}{\mu}^{k+1} + \frac{1} {2}{\sum \nolimits}_{j=1}^{N}{U}_{{T}_{k}-t,j}^{2}{V}_{j}\left (t\right )\right )\mathit{dt} \\ & & +\sqrt{{\sum \nolimits}_{j=1}^{N}{U}_{{T}_{k}-t,j}^{2}{V}_{j}\left (t\right )}{\mathit{dZ}}_{k}^{{\mathbb{Q}}^{k+1}}\left (t\right ) \\ & & +{\mathit{dJ}}_{k}^{{\mathbb{Q}}^{k+1}}\left (t\right ), \end{array}$$
(47.28)

where \({Z}_{k}^{{\mathbb{Q}}^{k+1}}\left (t\right )\) is a standard Brownian motion under \({\mathbb{Q}}^{k+1}\). Now the dynamics of \({V}_{i}\left (t\right )\) under \({\mathbb{Q}}^{k+1}\) become

$${\mathit{dV}}_{i}\left (t\right ) = {\kappa}_{i}^{k+1}\left (\bar{{v}}_{i}^{k+1} - {V}_{i}\left (t\right )\right )\mathit{dt} + {\xi}_{i}\sqrt{{V}_{i} \left (t\right )}d\!\tilde{{W}}_{i}^{{\mathbb{Q}}^{k+1}}\left (t\right )$$
(47.29)

where \(\tilde{{W}}^{{\mathbb{Q}}^{k+1}}\) is independent of \({Z}^{{\mathbb{Q}}^{k+1}},\) \({\kappa}_{j}^{k+1} = {\kappa}_{j} - {\xi}_{j}{\eta}_{j}^{k+1},\) and \(\bar{{v}}_{j}^{k+1} = \frac{{\kappa}_{j}\bar{{v}}_{j}} {{\kappa}_{j}-{\xi}_{j}{\eta}_{j}^{k+1}} ,j = 1,\ldots ,N.\) The dynamics of \({L}_{k}\left (t\right )\) under the forward measure \({\mathbb{Q}}^{k+1}\) are completely captured by Equations (47.28) and (47.29).
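The mapping from physical to forward-measure volatility parameters is simple enough to tabulate. The sketch below applies \({\kappa}_{j}^{k+1} = {\kappa}_{j} - {\xi}_{j}{\eta}_{j}^{k+1}\) and \(\bar{{v}}_{j}^{k+1} = {\kappa}_{j}\bar{{v}}_{j}/{\kappa}_{j}^{k+1}\) together with the linear premium \({\eta}_{j}^{k+1} = {c}_{jv}\left ({T}_{k} - 1\right )\). The parameter values and function names are hypothetical, chosen only to show the direction of the effect when the premium coefficient is negative.

```python
def eta_linear(c_v, t_k):
    """Volatility risk premium coefficient, linear in maturity: c_v * (T_k - 1)."""
    return c_v * (t_k - 1.0)

def forward_measure_params(kappa, vbar, xi, eta):
    """Mean-reversion speed and long-run mean of V under the forward measure Q^{k+1}."""
    kappa_q = kappa - xi * eta
    if kappa_q <= 0.0:
        raise ValueError("kappa - xi * eta must be positive for a mean-reverting process")
    return kappa_q, kappa * vbar / kappa_q

# Hypothetical physical-measure parameters and a negative premium coefficient.
kappa, vbar, xi, c_v = 0.5, 0.04, 0.3, -0.1
for t_k in (1.0, 5.0, 10.0):
    kappa_q, vbar_q = forward_measure_params(kappa, vbar, xi, eta_linear(c_v, t_k))
    print(t_k, round(kappa_q, 4), round(vbar_q, 5))
```

With a negative coefficient, longer maturities get a larger \({\kappa}_{j}^{k+1}\) and a smaller \(\bar{{v}}_{j}^{k+1}\), while the product \({\kappa}_{j}^{k+1}\bar{{v}}_{j}^{k+1}\) stays equal to \({\kappa}_{j}\bar{{v}}_{j}\).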

Given that LIBOR rates follow AJDs under both the physical and forward measure, we can directly apply the transform analysis of Duffie et al. (2000) to derive a closed-form formula for cap prices. Denote the state variables at t as \({Y}_{t} = \left (\log \left ({L}_{k}\left (t\right )\right ),{V}_{t}\right )\) and the time-t expectation of \({e}^{u\cdot {Y}_{{T}_{k}}}\) under the forward measure \({\mathbb{Q}}^{k+1}\) as \(\psi \left (u,{Y}_{t},t,{T}_{k}\right ) \triangleq {E}_{t}^{{\mathbb{Q}}^{k+1}}\left [{e}^{u\cdot {Y}_{{T}_{k}}}\right ].\) Let \(u = \left ({u}_{0},{0}_{1\times N}\right )\); then the time-t expectation of LIBOR rate at \({T}_{k}\) equals,

$$\begin{array}{rcl}{E}_{t}^{{\mathbb{Q}}^{k+1}}\left \{\exp \left [{u}_{0}\log \left ({L}_{k}\left ({T}_{k}\right )\right )\right ]\right \}& =& \psi \left ({u}_{0},{Y}_{t},t,{T}_{k}\right ) \\ & =& \exp \Big{[}a(s) + {u}_{0}\log ({L}_{k}\left (t\right )) \\ & & \quad \quad \ + B{(s)}^{{\prime}}{V}_{t}\Big{]},\end{array}$$
(47.30)

where \(s = {T}_{k} - t\) and closed-form solutions of \(a(s)\) and \(B(s)\) (an N-by-1 vector) are obtained by solving a system of Riccati equations in the Appendix.

Following Duffie et al. (2000), we define

$${G}_{a,b}(y;\!{Y}_{t},\!{T}_{k},\! {\mathbb{Q}}^{k+1})\! =\! {E}_{t}^{{\mathbb{Q}}^{k+1}}\left [\!{e}^{a\cdot \log \left ({L}_{k}\left ({T}_{k}\right )\right )}{1}_{\left \{b\cdot \log \left ({L}_{k}\left ({T}_{k}\right )\right )\leq y\right \}}\!\right ],$$
(47.31)

and its Fourier transform,

$$\begin{array}{rcl}{\mathcal{G}}_{a,b}(v;{Y}_{t},{T}_{k}, {\mathbb{Q}}^{k+1})& =& {\int \nolimits \nolimits}_{R}{e}^{ivy}d{G}_{a,b}(y) \\ & =& {E}_{t}^{{\mathbb{Q}}^{k+1}}\left [{e}^{\left (a+ivb\right )\cdot \log \left ({L}_{k}\left ({T}_{k}\right )\right )}\right ] \\ & =& \psi \left (a + \mathit{ivb},{Y}_{t},t,{T}_{k}\right ).\end{array}$$
(47.32)

Lévy’s inversion formula gives

$${G}_{a,b}(y;{Y}_{t},{T}_{k}, {\mathbb{Q}}^{k+1}) = \frac{\psi \left (a,{Y}_{t},t,{T}_{k}\right )} {2} -\frac{1} {\pi}{\int \nolimits \nolimits}_{0}^{\infty}\frac{\mathrm{Im}\left [\psi \left (a + \mathit{ivb},{Y}_{t},t,{T}_{k}\right ){e}^{-ivy}\right ]} {v} \mathit{dv}.$$
(47.33)

The time-0 price of a caplet that matures at T k + 1 with a strike price of X equals

$$\mathit{Caplet}(0,{T}_{k+1},X) = \delta {D}_{k+1}\left (0\right ){E}_{0}^{{\mathbb{Q}}^{k+1}}\!\!\left [{\left ({L}_{k}\left ({T}_{k}\right ) - X\right )}^{+}\right ],$$
(47.34)

where the expectation is given by the inversion formula,

$$\begin{array}{rcl}{E}_{0}^{{\mathbb{Q}}^{k+1}}{\left [{L}_{k}({T}_{k}) - X\right ]}^{+}& =& {G}_{1,-1}(-\ln X;{Y}_{0},{T}_{k}, {\mathbb{Q}}^{k+1}) \\ & & -X{G}_{0,-1}(-\ln X;{Y}_{0},{T}_{k}, {\mathbb{Q}}^{k+1}). \\ & & \end{array}$$
(47.35)
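The inversion step can be tested in a special case where the answer is known. The sketch below is not the model of this section: since the functions \(a(s)\) and \(B(s)\) are derived in the Appendix, it assumes instead a hypothetical constant-variance, no-jump case in which \(\log {L}_{k}\left ({T}_{k}\right )\) is normal under \({\mathbb{Q}}^{k+1}\), so the transform is available explicitly and the result must agree with the Black formula. Following Duffie et al. (2000), the leading term of the inversion formula is evaluated at \(v = 0\), i.e., it is \(\psi \left (a,\cdot \right )/2\). The integration grid (`v_max`, `n`) and all inputs are illustrative choices.

```python
import math
import numpy as np

def levy_inversion(a, b, y, psi, v_max=200.0, n=200_001):
    """G_{a,b}(y) = psi(a)/2 - (1/pi) * int_0^inf Im[psi(a + i v b) e^{-i v y}] / v dv."""
    v = np.linspace(1e-8, v_max, n)        # start just above 0 to avoid 0/0
    integrand = np.imag(psi(a + 1j * v * b) * np.exp(-1j * v * y)) / v
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(v))  # trapezoid rule
    return float(np.real(psi(a))) / 2.0 - integral / math.pi

def caplet_expectation(fwd, strike, total_var):
    """E[(L - X)^+] via Eq. (47.35), for the hypothetical log-normal special case."""
    m = math.log(fwd) - 0.5 * total_var    # mean of log L_k(T_k); L_k is a martingale
    psi = lambda u: np.exp(u * m + 0.5 * u * u * total_var)
    y = -math.log(strike)
    return levy_inversion(1.0, -1.0, y, psi) - strike * levy_inversion(0.0, -1.0, y, psi)

def black_expectation(fwd, strike, total_var):
    """Closed-form benchmark: Black's formula for E[(L - X)^+]."""
    sd = math.sqrt(total_var)
    d1 = (math.log(fwd / strike) + 0.5 * total_var) / sd
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return fwd * cdf(d1) - strike * cdf(d1 - sd)

fwd, strike, total_var = 0.05, 0.06, 0.2 ** 2 * 2.0   # hypothetical inputs
delta, discount = 0.25, 0.90                           # hypothetical delta and D_{k+1}(0)
price_fourier = delta * discount * caplet_expectation(fwd, strike, total_var)
price_black = delta * discount * black_expectation(fwd, strike, total_var)
print(price_fourier, price_black)
```

Replacing `psi` with the exponential-affine transform of Equation (47.30) would, under the same inversion routine, give the caplet price for the stochastic volatility and jump specifications.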

The new models developed in this section nest some of the most important models in the literature, such as LSS (2001) (with constant volatility and no jumps) and Han (2007) (with stochastic volatility and no jumps). The closed-form formula for cap prices makes an empirical implementation of our model very convenient and provides some advantages over existing methods. For example, Han (2007) develops approximations of ATM cap and swaption prices using the techniques of Hull and White (1987). However, such an approach might not work well for away-from-the-money options. In contrast, our method would work well for all options, which is important for explaining the volatility smile.

In addition to introducing stochastic volatility and jumps, our multifactor HJM models also have advantages over the standard LIBOR market models of Brace et al. (1997), Miltersen et al. (1997), and their extensions often applied to caps in practice.Footnote 13 While our models provide a unified multifactor framework to characterize the evolution of the whole yield curve, the LIBOR market models typically make separate specifications of the dynamics of LIBOR rates with different maturities. As suggested by LSS (2001), the standard LIBOR models are “more appropriately viewed as a collection of different univariate models, where the relationship between the underlying factors is left unspecified.” In contrast, in our models the dynamics of LIBOR rates with different maturities under their related forward measures are internally consistent with each other given their dynamics under the physical measure and the market prices of risks. Once our models are estimated using one set of prices, they can be used to price and hedge other fixed-income securities.

3.2 Estimation Method and Results

We estimate our new market model using prices from a wide cross section of difference caps with different strikes and maturities. Every week we observe prices of difference caps with ten moneyness levels and 13 maturities. However, due to changing interest rates, we do not have enough observations in all moneyness/maturity categories throughout the sample. Thus, we focus on the 53 moneyness/maturity categories that have less than ten percent of missing values over the sample estimation period. The moneyness and maturity of all difference caps belong to the sets \(\{0.7, 0.8, 0.9, 1.0, 1.1\}\) and \(\{1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0\}\) (unit in years), respectively. The difference caps with time-to-maturity less than or equal to 5 years represent portfolios of two caplets, while those with time-to-maturity longer than 5 years represent portfolios of four caplets.

We estimate the model parameters by minimizing the sum of squared percentage pricing errors (SSE) of all relevant difference caps.Footnote 14 Consider the time series observations \(t = 1,\ldots ,\mathcal{T}\), of the prices of 53 difference caps with moneyness \({m}_{i}\) and time-to-maturities \({\tau}_{i}\), \(i = 1,\ldots ,M = 53\). Let \(\theta\) represent the model parameters which remain constant over the sample period. Let \(C\left (t,{m}_{i},{\tau}_{i}\right )\) be the observed price of a difference cap with moneyness \({m}_{i}\) and time-to-maturity \({\tau}_{i}\) and let \(\hat{C}\left (t,{m}_{i},{\tau}_{i},{V}_{t}\left (\theta \right ),\theta \right )\) denote the corresponding theoretical price under a given model, where \({V}_{t}\left (\theta \right )\) is the model implied instantaneous volatility at t given model parameters \(\theta\). For each i and t, denote the percentage pricing error as

$${u}_{i,t}\left (\theta \right ) = \frac{C\left (t,{m}_{i},{\tau}_{i}\right ) -\hat{C}\left (t,{m}_{i},{\tau}_{i},{V}_{t}\left (\theta \right ),\theta \right )} {C\left (t,{m}_{i},{\tau}_{i}\right )} ,$$
(47.36)

where \({V}_{t}\left (\theta \right )\) is defined as

$${V}_{t}\left (\theta \right ) {=\arg \min}_{\{{V}_{t}\}}{\sum \nolimits}_{i=1}^{M}{\left [\frac{C\left (t,{m}_{i},{\tau}_{i}\right ) -\hat{C}\left (t,{m}_{i},{\tau}_{i},{V}_{t},\theta \right )} {C\left (t,{m}_{i},{\tau}_{i}\right )} \right ]}^{2}.$$
(47.37)
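The inner step in Equation (47.37), backing out the latent variance on each date by minimizing squared percentage pricing errors for given parameters, can be sketched in one dimension. Everything model-specific here is a stand-in: `toy_price` is an undiscounted Black call used as a hypothetical substitute for \(\hat{C}\), the quotes are synthetic, and the golden-section search is just one simple optimizer (the text does not say which optimizer is used).

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def toy_price(fwd, strike, tau, v):
    """Hypothetical stand-in for C-hat: undiscounted Black call with variance rate v."""
    sd = math.sqrt(v * tau)
    d1 = (math.log(fwd / strike) + 0.5 * sd * sd) / sd
    return fwd * norm_cdf(d1) - strike * norm_cdf(d1 - sd)

def sse(v, quotes):
    """Sum of squared percentage pricing errors across instruments on one date."""
    return sum(((c - toy_price(f, k, tau, v)) / c) ** 2 for f, k, tau, c in quotes)

def implied_variance(quotes, lo=1e-6, hi=1.0, tol=1e-10):
    """Golden-section search for V_t = argmin SSE (one-dimensional case)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    while b - a > tol:
        if sse(c, quotes) < sse(d, quotes):
            b, d = d, c                    # minimum lies in [a, d]
            c = b - g * (b - a)
        else:
            a, c = c, d                    # minimum lies in [c, b]
            d = a + g * (b - a)
    return 0.5 * (a + b)

# Synthetic quotes (forward, strike, maturity, price) generated at a known variance.
v_true = 0.04
grid = [(0.05, 0.05 * m, tau) for m in (0.8, 0.9, 1.0, 1.1) for tau in (2.0, 5.0)]
quotes = [(f, k, tau, toy_price(f, k, tau, v_true)) for f, k, tau in grid]
v_hat = implied_variance(quotes)
print(v_hat, sse(v_hat, quotes))
```

In a full estimation this inner search would be repeated on every date for each candidate \(\theta\), and an outer optimizer would then minimize the pooled SSE over \(\theta\).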

We provide empirical evidence on the performance of six different models in capturing the cap volatility smile. The first three models, denoted as SV1, SV2 and SV3, allow one, two, and three principal components to drive the forward rate curve, respectively, each with its own stochastic volatility. The next three models, denoted as SVJ1, SVJ2 and SVJ3, introduce jumps in LIBOR rates in each of the previous SV models. SVJ3 is the most comprehensive model and nests all the others as special cases. We first examine the separate performance of each of the SV and SVJ models, then we compare performance across the two classes of models. The estimation of all models is based on the principal components extracted from historical LIBOR forward rates between June 1997 and July 2000.Footnote 15

The SV models contribute to cap pricing in four important ways. First, the three principal components capture variations in the levels of LIBOR rates caused by innovations in the “level”, “slope”, and “curvature” factors. Second, the stochastic volatility factors capture the fluctuations in the volatilities of LIBOR rates reflected in the Black implied volatilities of ATM caps.Footnote 16 Third, the stochastic volatility factors also introduce fatter tails in LIBOR rate distributions than implied by the log-normal model, which helps capture the volatility smile. Finally, given our model structure, innovations of stochastic volatility factors also affect the covariances between LIBOR rates with different maturities. The first three factors, however, are more important for our applications, because difference caps are much less sensitive to time varying correlations than swaptions.Footnote 17 Our discussion of the performance of the SV models focuses on the estimates of the model parameters and the latent volatility variables, and the time series and cross-sectional pricing errors of difference caps.

A comparison of the parameter estimates of the three SV models in Table 47.7 shows that the “level” factor has the most volatile stochastic volatility, followed, in decreasing order, by the “curvature” and “slope” factors. The long-run mean (\(\bar{{v}}_{1}\)) and volatility of volatility (\({\xi}_{1}\)) of the first volatility factor are much bigger than those of the other two factors. This suggests that the fluctuations in the volatilities of LIBOR rates are mainly due to the time varying volatility of the “level” factor. The estimates of the volatility risk premium of the three models are significantly negative, suggesting that the stochastic volatility factors of longer maturity LIBOR rates under the forward measure are less volatile with lower long-run mean and faster speed of mean reversion. This is consistent with the fact that the Black implied volatilities of longer maturity difference caps are less volatile than those of short-term difference caps.

Table 47.7 Parameter estimates of stochastic volatility models

Our parameter estimates are consistent with the volatility variables inferred from the prices of difference caps. The volatility of the “level” factor is the highest among the three (although at lower absolute levels in the more sophisticated models). It starts at a low level and steadily increases and stabilizes at a high level in the later part of the sample period. The volatility of the “slope” factor is much lower and relatively stable during the whole sample period. The volatility of the “curvature” factor is generally between that of the first and second factors. The steady increase of the volatility of the “level” factor is consistent with the increase of Black implied volatilities of ATM difference caps throughout our sample period. In fact, the correlations between the Black implied volatilities of most difference caps and the implied volatility of the “level” factor are higher than 0.8. The correlations between Black implied volatilities and the other two volatility factors are much weaker. The importance of stochastic volatility is obvious: the fluctuations in Black implied volatilities show that a model with constant volatility simply would not be able to capture even the general level of cap prices.

The other aspects of model performance are the time series and cross-sectional pricing errors of difference caps. The likelihood ratio tests in Panel A of Table 47.8 overwhelmingly reject SV1 and SV2 in favor of SV2 and SV3, respectively. The Diebold–Mariano statistics in Panel A of Table 47.8 also show that SV2 and SV3 have significantly smaller SSEs than SV1 and SV2, respectively, suggesting that the more sophisticated SV models improve the pricing of all caps. The time series of RMSEs of the three SV models over our sample periodFootnote 18 suggest that except for two special periods where all models have extremely large pricing errors, the RMSEs of all models are rather uniform with the best model (SV3) having RMSEs slightly above 5%. The two special periods with high pricing errors cover the period between the second half of December of 2000 and the first half of January of 2001, and the first half of October 2001, and coincide with high prepayments in mortgage-backed securities (MBS). Indeed, the MBAA refinancing index and prepayment speed (see Figure 3 of Duarte 2004) show that after a long period of low prepayments between the middle of 1999 and late 2000, prepayments dramatically increased at the end of 2000 and the beginning of 2001. There is also a dramatic increase of prepayments at the beginning of October 2001. As widely recognized in the fixed-income market,Footnote 19 excessive hedging demands for prepayment risk using interest rate derivatives may push derivative prices away from their equilibrium values, which could explain the failure of our models during these two special periods.Footnote 20

Table 47.8 Comparison of the performance of stochastic volatility models

In addition to overall model performance as measured by SSEs, we also examine the cross-sectional pricing errors of difference caps with different moneyness and maturities. We first look at the squared percentage pricing errors, which measure both the bias and variability of the pricing errors. Then we look at the average percentage pricing errors (the difference between market and model prices divided by the market price) to see whether SV models can on average capture the volatility smile in the cap market.

The Diebold–Mariano statistics of squared percentage pricing errors of individual difference caps between SV2 and SV1 in Panel B of Table 47.8 show that SV2 reduces the pricing errors of SV1 for some but not all difference caps. SV2 has the most significant reductions in pricing errors of SV1 for mid- and short-term around-the-money difference caps. On the other hand, SV2 has larger pricing errors for deep ITM difference caps. The Diebold–Mariano statistics between SV3 and SV2 in Panel C of Table 47.8 show that SV3 significantly reduces the pricing errors of many short- (2–3 years) and mid-term around-the-money, and long-term (6–10 years) ITM difference caps.
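For readers unfamiliar with the Diebold–Mariano comparison, the sketch below computes the statistic from two series of squared pricing errors. The long-run variance estimator (Bartlett kernel with a fixed number of lags), the sign convention (positive values mean the second model has smaller losses), and the synthetic data are all assumptions of this example; the exact implementation behind Table 47.8 is not described in this section.

```python
import numpy as np

def diebold_mariano(loss1, loss2, lags=4):
    """DM statistic for equal predictive accuracy; positive values favor model 2."""
    d = np.asarray(loss1, dtype=float) - np.asarray(loss2, dtype=float)
    n = d.size
    dc = d - d.mean()
    lrv = dc @ dc / n                       # lag-0 autocovariance
    for k in range(1, lags + 1):
        weight = 1.0 - k / (lags + 1.0)     # Bartlett kernel weight
        lrv += 2.0 * weight * (dc[k:] @ dc[:-k]) / n
    return d.mean() / np.sqrt(lrv / n)      # asymptotically standard normal under the null

rng = np.random.default_rng(2)
errors_a = 2.0 * rng.standard_normal(500)   # synthetic pricing errors, less accurate model
errors_b = rng.standard_normal(500)         # synthetic pricing errors, more accurate model
dm = diebold_mariano(errors_a ** 2, errors_b ** 2)
print(dm)
```

A large positive value here indicates that the second model has significantly smaller squared errors; swapping the arguments flips the sign of the statistic.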

Table 47.9 reports the average percentage pricing errors of all difference caps under the three SV models. Panel A of Table 47.9 shows that, on average, SV1 underprices short-term and overprices mid- and long-term ATM difference caps, and underprices ITM and overprices OTM difference caps. This suggests that SV1 cannot generate enough skewness in the implied volatilities to be consistent with the data. Panel B shows that SV2 has some improvements over SV1, mainly for some short-term (less than 3.5 years) ATM, and mid-term (3.5–5 years) slightly OTM difference caps. But SV2 has worse performance for most deep ITM (\(m = 0.7\) and \(0.8\)) difference caps: it actually worsens the underpricing of ITM caps. Panel C of Table 47.9 shows that relative to SV1 and SV2, SV3 has smaller average percentage pricing errors for most long-term (7–10 years) ITM, mid-term (3.5–5 years) OTM, and short-term (2–2.5 years) ATM difference caps, and bigger average percentage pricing errors for mid-term (3.5–6 years) ITM difference caps. There is still significant underpricing of ITM and overpricing of OTM difference caps under SV3.

Table 47.9 Average percentage pricing errors of stochastic volatility models

Overall, the results show that stochastic volatility factors are essential for capturing the time varying volatilities of LIBOR rates. The Diebold–Mariano statistics in Table 47.8 show that in general more sophisticated SV models have smaller pricing errors than simpler models, although the improvements are more important for close-to-the-money difference caps. The average percentage pricing errors in Table 47.9 show, however, that even the most sophisticated SV model cannot generate enough volatility skew to be consistent with the data. While previous studies, such as Han (2007), have shown that a three-factor stochastic volatility model similar to ours performs well in pricing ATM caps and swaptions, our analysis shows that the model fails to completely capture the volatility smile in the cap markets. Our findings highlight the importance of studying the relative pricing of caps with different moneyness to reveal the inadequacies of existing term structure models; the same inadequacies cannot be detected by studying only ATM options.

One important reason for the failure of SV models is that the stochastic volatility factors are independent of LIBOR rates. As a result, the SV models can only generate a symmetric volatility smile, but not the asymmetric smile or skew observed in the data. The pattern of the smile in the cap market is rather similar to that of index options: ITM calls (and OTM puts) are overpriced, and OTM calls (and ITM puts) are underpriced relative to the Black model. Similarly, the smile in the cap market could be due to a market expectation of dramatically declining LIBOR rates. In this section, we examine the contribution of jumps in LIBOR rates in capturing the volatility smile. Our discussion of the performance of the SVJ models parallels that of the SV models.

Parameter estimates in Table 47.10 show that the three stochastic volatility factors of the SVJ models closely resemble those of the SV models. The “level” factor still has the most volatile stochastic volatility, followed by the “curvature” and the “slope” factors. With the inclusion of jumps, the stochastic volatility factors in the SVJ models, especially that of the “level” factor, tend to be less volatile than those of the SV models (lower long-run mean and volatility of volatility). Negative estimates of the volatility risk premium show that the volatility of the longer maturity LIBOR rates under the forward measure has a lower long-run mean and a faster speed of mean reversion.

Table 47.10 Parameter estimates of stochastic volatility and jumps models

Most importantly, we find overwhelming evidence of strong negative jumps in LIBOR rates under the forward measure. To the extent that cap prices reflect market expectations of the future evolution of LIBOR rates, the evidence suggests that the market expects a dramatic decline in LIBOR rates over our sample period. Such an expectation might be justifiable given that the economy was in recession during a major part of our sample period. This is similar to the volatility skew in the index equity option market, which reflects investors' fear of a stock market crash such as that of 1987. Compared to the estimates from index options (see, e.g., Pan 2002), we see lower estimates of jump intensity (about 1.5% per annum), but much higher estimates of jump size. The positive estimates of a jump risk premium suggest that the jump magnitude of longer maturity forward rates tends to be smaller. Under SVJ3, the mean relative jump sizes, \(\exp \left ({\mu}_{J} + {c}_{J}\left ({T}_{k} - 1\right ) + {\sigma}_{J}^{2}/2\right ) - 1\), for one, five, and ten year LIBOR rates are −97%, −94%, and −80%, respectively. However, we do not find any incidents of negative moves in LIBOR rates under the physical measure with a size close to that under the forward measure. This big discrepancy between jump sizes under the physical and forward measures resembles that between the physical and risk-neutral measures for index options (see, e.g., Pan 2002). This could be a result of a huge jump risk premium.

The likelihood ratio tests in Panel A of Table 47.11 again overwhelmingly reject SVJ1 and SVJ2 in favor of SVJ2 and SVJ3, respectively. The Diebold–Mariano statistics in Panel A of Table 47.11 show that SVJ2 and SVJ3 have significantly smaller SSEs than SVJ1 and SVJ2, respectively, suggesting that the more sophisticated SVJ models significantly improve the pricing of all difference caps. The Diebold–Mariano statistics of squared percentage pricing errors of individual difference caps in Panel B of Table 47.11 show that SVJ2 significantly improves the performance of SVJ1 for long-, mid-, and short-term around-the-money difference caps. The Diebold–Mariano statistics in Panel C of Table 47.11 show that SVJ3 significantly reduces the pricing errors of SVJ2 for long-term ITM, and some mid- and short-term around-the-money difference caps. Table 47.12 shows the average percentage pricing errors also improve over the SV models.

Table 47.11 Comparison of the performance of stochastic volatility and jump models
Table 47.12 Average percentage pricing errors of stochastic volatility and jump models

Table 47.13 compares the performance of the SVJ and SV models. During the first 20 weeks of our sample, the SVJ models have much higher RMSEs than the SV models. As a result, the likelihood ratio and Diebold–Mariano statistics between the three pairs of SVJ and SV models over the entire sample are somewhat smaller than those of the sample period without the first 20 weeks. Nonetheless, all the SV models are overwhelmingly rejected in favor of their corresponding SVJ models by both tests. The Diebold–Mariano statistics of individual difference caps in Panels B, C, and D show that the SVJ models significantly improve the performance of the SV models for most difference caps across moneyness and maturity. The most interesting results are in Panel D, which shows that SVJ3 significantly reduces the pricing errors of SV3 for most ITM difference caps, strongly suggesting that the negative jumps are essential for capturing the asymmetric smile in the cap market.

Table 47.13 Comparison of the performance of SV and SVJ models

Our analysis shows that a low dimensional model with three principal components driving the forward rate curve, stochastic volatility of each component, and strong negative jumps captures the volatility smile in the cap markets reasonably well. The three yield factors capture the variations of the levels of LIBOR rates, while the stochastic volatility factors are essential to capture the time-varying volatilities of LIBOR rates. Even though the SV models can price ATM caps reasonably well, they fail to capture the volatility smile in the cap market. Instead, significant negative jumps in LIBOR rates are needed to capture the smile. These results highlight the importance of studying the pricing of caps across moneyness: the importance of negative jumps is revealed only through the pricing of away-from-the-money caps. Excluding the first 20 weeks and the two special periods, SVJ3 has a reasonably good pricing performance with an average RMSE of 4.5%. Given that the bid-ask spread is about 2–5% in our sample for ATM caps, and because ITM and OTM caps tend to have even higher percentage spreads,Footnote 21 this can be interpreted as a good performance.

Despite its good performance, there are strong indications that SVJ3 is misspecified and that the inadequacies of the model are related to MBS markets. For example, while SVJ3 works reasonably well for most of the sample period, it has large pricing errors in several special periods coinciding with high prepayment activities in the MBS markets. Moreover, even though we assume that the stochastic volatility factors are independent of LIBOR rates, Table 47.14 shows strong negative correlations between the implied volatility variables of the first factor and the LIBOR rates. This result suggests that when interest rates are low, cap prices become too high for the model to capture and the implied volatilities have to become abnormally high to fit the observed cap prices. One possible explanation of this “leverage” effect is that higher demand for caps to hedge prepayments from MBS markets in low interest rate environments could artificially push up cap prices and implied volatilities. Therefore, extending our models to incorporate factors from MBS markets seems to be a promising direction for future research.

Table 47.14 Correlations between LIBOR rates and implied volatility variables

4 Nonparametric Estimation of the Forward Density

For LIBOR-based instruments such as caps, floors, and swaptions, it is convenient to consider pricing using the forward measure approach. We will therefore focus on the dynamics of the LIBOR forward rate \({L}_{k}\left (t\right )\) under the forward measure \({\mathbb{Q}}^{k+1}\), which is essential for pricing caplets maturing at \({T}_{k+1}\). Under this measure, the discounted price of any security using \({D}_{k+1}\left (t\right )\) as the numeraire is a martingale. Thus, the time-t price of a caplet maturing at \({T}_{k+1}\) with a strike price of X is

$$C\left ({L}_{k}\left (t\right ),X,t,{T}_{k}\right ) = \delta {D}_{k+1}\left (t\right ){\int \nolimits}_{X}^{\infty}\left (y - X\right ){p}^{{\mathbb{Q}}^{k+1}}\left ({L}_{k}\left ({T}_{k}\right ) = y\vert {L}_{k}\left (t\right )\right )dy,$$
(47.38)

where \({p}^{{\mathbb{Q}}^{k+1}}\left ({L}_{k}\left ({T}_{k}\right ) = y\vert {L}_{k}\left (t\right )\right )\) is the conditional density of \({L}_{k}\left ({T}_{k}\right )\) under the forward measure \({\mathbb{Q}}^{k+1}\). Once we know the forward density, we can price any security whose payoff on \({T}_{k+1}\) depends only on \({L}_{k}\left (t\right )\) by discounting its expected payoff under \({\mathbb{Q}}^{k+1}\) using \({D}_{k+1}\left (t\right )\).
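As a rough numerical illustration of Equation (47.38), the sketch below integrates the caplet payoff against a forward density and compares the result with the Black formula. The log-normal density is only one possible choice (the LIBOR market model case), and all parameter values and function names are hypothetical, chosen purely for illustration.

```python
import math
import numpy as np

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_caplet(L, X, sigma, tau, delta, D):
    """Black caplet price: delta * D * [L N(d1) - X N(d2)]."""
    s = sigma * math.sqrt(tau)
    d1 = (math.log(L / X) + 0.5 * s * s) / s
    return delta * D * (L * norm_cdf(d1) - X * norm_cdf(d1 - s))

def lognormal_forward_density(y, L, sigma, tau):
    """Density of L_k(T_k) given L_k(t) = L for a driftless log-normal rate."""
    s = sigma * math.sqrt(tau)
    z = (np.log(y / L) + 0.5 * s * s) / s
    return np.exp(-0.5 * z * z) / (y * s * math.sqrt(2.0 * math.pi))

def caplet_by_integration(L, X, sigma, tau, delta, D, n=200_001):
    """delta * D * integral_X^inf (y - X) p(y) dy via the trapezoid rule."""
    upper = L * math.exp(10.0 * sigma * math.sqrt(tau))  # truncate the far right tail
    y = np.linspace(X, upper, n)
    f = (y - X) * lognormal_forward_density(y, L, sigma, tau)
    dy = y[1] - y[0]
    integral = dy * (f.sum() - 0.5 * (f[0] + f[-1]))
    return delta * D * integral

# Hypothetical inputs: 5% forward rate, ATM strike, 2 years to expiry.
params = dict(L=0.05, X=0.05, sigma=0.2, tau=2.0, delta=0.25, D=0.9)
price_black = black_caplet(**params)
price_integral = caplet_by_integration(**params)
print(price_black, price_integral)
```

The two prices should agree up to discretization error, which is the sense in which knowing the forward density is enough to price the caplet.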

Existing term structure models rely on parametric assumptions on the distribution of \({L}_{k}\left (t\right )\) to obtain closed-form pricing formulae for caplets. For example, the standard LIBOR market models of Brace et al. (1997) and Miltersen et al. (1997) assume that \({L}_{k}\left (t\right )\) follows a log-normal distribution and price caplets using the Black formula. The models of Jarrow et al. (2007) assume that \({L}_{k}\left (t\right )\) follows the affine jump-diffusions of Duffie et al. (2000).

4.1 Nonparametric Method

We estimate the distribution of \({L}_{k}\left (t\right )\) under \({\mathbb{Q}}^{k+1}\) using the prices of a cross section of caplets that mature at \({T}_{k+1}\) and have different strike prices. Following Breeden and Litzenberger (1978), we know that the density of \({L}_{k}\left (t\right )\) under \({\mathbb{Q}}^{k+1}\) is proportional to the second derivative of \(C\left ({L}_{k}\left (t\right ),t,{T}_{k},X\right )\) with respect to X,

$${p}^{{\mathbb{Q}}^{k+1}}\left ({L}_{k}\left ({T}_{k}\right )\vert {L}_{k}\left (t\right )\right ) = \frac{1} {\delta {D}_{k+1}\left (t\right )} \frac{{\partial}^{2}C\left ({L}_{k}\left (t\right ),t,{T}_{k},X\right )} {\partial {X}^{2}} {\vert}_{X={L}_{k}\left ({T}_{k}\right )}.$$
(47.39)
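The relation in Equation (47.39) can be checked numerically. The sketch below is a minimal illustration, not the estimator used in this chapter: it takes caplet prices from the Black formula (an assumption made only so that the true density is known), differentiates them twice in the strike by central finite differences, and compares the result with the log-normal density. All names and parameter values are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_caplet(L, X, sigma, tau, delta, D):
    """Black caplet price: delta * D * [L N(d1) - X N(d2)]."""
    s = sigma * math.sqrt(tau)
    d1 = (math.log(L / X) + 0.5 * s * s) / s
    return delta * D * (L * norm_cdf(d1) - X * norm_cdf(d1 - s))

def density_from_prices(price_fn, X, delta, D, h=1e-4):
    """Second strike derivative of the price, scaled by 1 / (delta * D)."""
    second = (price_fn(X + h) - 2.0 * price_fn(X) + price_fn(X - h)) / (h * h)
    return second / (delta * D)

def lognormal_density(y, L, sigma, tau):
    """Exact density of L_k(T_k) under the log-normal assumption."""
    s = sigma * math.sqrt(tau)
    z = (math.log(y / L) + 0.5 * s * s) / s
    return math.exp(-0.5 * z * z) / (y * s * math.sqrt(2.0 * math.pi))

# Hypothetical inputs.
L, sigma, tau, delta, D = 0.05, 0.2, 2.0, 0.25, 0.9
strike = 0.045
price_fn = lambda X: black_caplet(L, X, sigma, tau, delta, D)

density_fd = density_from_prices(price_fn, strike, delta, D)
density_exact = lognormal_density(strike, L, sigma, tau)
print(density_fd, density_exact)
```

With market data the price function is not known in closed form, which is why the second derivative has to be estimated from a smoothed cross section of caplet prices.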

In standard LIBOR market models, it is assumed that the conditional density of \({L}_{k}\left ({T}_{k}\right )\) depends only on the current LIBOR rate. This assumption, however, can be overly restrictive given the multifactor nature of term structure dynamics. For example, while the level factor can explain a large fraction (between 80 and 90%) of the variations of LIBOR rates, the slope factor still has significant explanatory power for interest rate variations. Moreover, there is overwhelming evidence that interest rate volatility is stochastic,Footnote 22 and it has been suggested that interest rate volatility is unspanned in the sense that it cannot be fully explained by the yield curve factors such as the level and slope factors.

One important innovation of our study is that we allow the volatility of \({L}_{k}\left (t\right )\) to be stochastic and the conditional density of \({L}_{k}\left ({T}_{k}\right )\) to depend on not only the level, but also the slope and volatility factors of LIBOR rates. Denote the conditioning variables as \(Z\left (t\right ) = \left \{s\left (t\right ),v\left (t\right )\right \}\), where \(s\left (t\right )\) (the slope factor) is the difference between the 10- and 2-year LIBOR forward rates and \(v\left (t\right )\) (the volatility factor) is the first principal component of EGARCH-filtered spot volatilities of LIBOR rates across all maturities. Under this generalization, the conditional density of \({L}_{k}\left ({T}_{k}\right )\) under the forward measure \({\mathbb{Q}}^{k+1}\) is given by

$${p}^{{\mathbb{Q}}^{k+1}}\left ({L}_{k}\left ({T}_{k}\right )\vert {L}_{k}\left (t\right ),Z\left (t\right )\right ) = \frac{1} {\delta {D}_{k+1}\left (t\right )} \frac{{\partial}^{2}C\left ({L}_{k}\left (t\right ),X,t,{T}_{k},Z\left (t\right )\right )} {\partial {X}^{2}} {\vert}_{X={L}_{k}\left ({T}_{k}\right )}.$$
(47.40)

Next we discuss how to estimate the SPDs by combining the forward and physical densities of LIBOR rates. Denote an SPD function as π. In general, π depends on multiple economic factors, and it is impossible to estimate it using interest rate caps alone. Given the available data, all we can estimate is the projection of π onto the future spot rate \({L}_{k}\left ({T}_{k}\right )\):

$${\pi}_{k}\left ({L}_{k}({T}_{k});{L}_{k}(t),Z(t)\right ) = {E}_{t}^{\mathbb{P}}\left [\pi \vert {L}_{k}({T}_{k});{L}_{k}(t),Z(t)\right ],$$
(47.41)

where the expectation is taken under the physical measure. Then the price of the caplet can be calculated as

$$\begin{array}{rcl} C\left ({L}_{k}\left (t\right ),X,t,{T}_{k},Z\left (t\right )\right )& =& \delta {E}_{t}^{\mathbb{P}}\left [\pi \cdot {\left ({L}_{k}\left ({T}_{k}\right ) - X\right )}^{+}\right ] \\ & =& \delta {\int \nolimits \nolimits}_{X}^{\infty}{\pi}_{k}\left (y\right )\left (y - X\right ){p}^{\mathbb{P}}\left ({L}_{k}\left ({T}_{k}\right ) = y\vert {L}_{k}\left (t\right ),Z\left (t\right )\right )dy,\end{array}$$
(47.42)

where the second equality is due to iterated expectation and \({p}^{\mathbb{P}}\left ({L}_{k}\left ({T}_{k}\right ) = y\vert {L}_{k}\left (t\right ),Z\left (t\right )\right )\) is the conditional density of \({L}_{k}\left ({T}_{k}\right )\) under the physical measure.

Comparing Equations (47.38) and (47.42), we have

$${\pi}_{k}\left ({L}_{k}({T}_{k});{L}_{k}(t),Z(t)\right ) = {D}_{k+1}\left (t\right )\frac{{p}^{{\mathbb{Q}}^{k+1}}\left ({L}_{k}\left ({T}_{k}\right )\vert {L}_{k}\left (t\right ),Z\left (t\right )\right )} {{p}^{\mathbb{P}}\left ({L}_{k}\left ({T}_{k}\right )\vert {L}_{k}\left (t\right ),Z\left (t\right )\right )}.$$
(47.43)
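To make Equation (47.43) concrete, the sketch below evaluates the projected SPD as the discounted ratio of a forward density to a physical density. Both densities are taken to be log-normal purely for illustration, differing only in a hypothetical physical drift; none of these numbers come from the estimates in this chapter.

```python
import math
import numpy as np

def lognormal_density(y, L, drift, sigma, tau):
    """Density of L(T) when ln L(T) ~ N(ln L + (drift - sigma^2 / 2) tau, sigma^2 tau)."""
    s = sigma * math.sqrt(tau)
    z = (np.log(y / L) - (drift - 0.5 * sigma ** 2) * tau) / s
    return np.exp(-0.5 * z * z) / (y * s * math.sqrt(2.0 * math.pi))

# Hypothetical inputs: the rate is a martingale under the forward measure
# (zero drift) and has a positive drift under the physical measure.
L, sigma, tau, D = 0.05, 0.2, 2.0, 0.9
physical_drift = 0.03

s = sigma * math.sqrt(tau)
y = np.linspace(L * math.exp(-8.0 * s), L * math.exp(8.0 * s), 100_001)
p_forward = lognormal_density(y, L, 0.0, sigma, tau)
p_physical = lognormal_density(y, L, physical_drift, sigma, tau)

# Projected SPD: discount factor times the density ratio.
spd = D * p_forward / p_physical

# Sanity check: the physical expectation of the SPD recovers the discount factor.
f = spd * p_physical
dy = y[1] - y[0]
expected_spd = dy * (f.sum() - 0.5 * (f[0] + f[-1]))
print(expected_spd, D)
```

In this stylized case the SPD declines monotonically in the future rate; the nonparametric estimates need not share that shape.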

Therefore, by combining the densities of L k T k under \({\mathbb{Q}}^{k+1}\) and \(\mathbb{P},\) we can estimate the projection of π onto L k (T k ). The SPDs contain rich information on how risks are priced in financial markets. While Aït-Sahalia and Lo (1998, 2000), Jackwerth (2000), Rosenberg and Engle (2002), and others estimate the SPDs using index options (i.e., the projection of π onto index returns), our analysis based on interest rate caps documents the dependence of the SPDs on term structure factors.

Similar to many existing studies, to reduce the dimensionality of the problem, we further assume that the caplet price is homogeneous of degree 1 in the current LIBOR rate:

$$C\left ({L}_{k}\left (t\right ),X,t,{T}_{k},Z\left (t\right )\right ) = \delta {D}_{k+1}\left (t\right ){L}_{k}\left (t\right ){C}_{M}\left ({M}_{k}(t),t,{T}_{k},Z\left (t\right )\right ),$$
(47.44)

where \({M}_{k}(t) = X/{L}_{k}\left (t\right )\) represents the moneyness of the caplet. Hence, for the rest of the paper we estimate the forward density of \({L}_{k}\left ({T}_{k}\right )/{L}_{k}\left (t\right )\) as the second derivative of the price function \({C}_{M}\) with respect to M:

$${p}^{{\mathbb{Q}}^{k+1}}\left (\frac{{L}_{k}\left ({T}_{k}\right )} {{L}_{k}\left (t\right )} \vert Z\left (t\right )\right ) = \frac{1} {\delta {D}_{k+1}\left (t\right )} \frac{{\partial}^{2}{C}_{M}\left ({M}_{k}(t),t,{T}_{k},Z\left (t\right )\right )} {\partial {M}^{2}} {\vert}_{M={L}_{k}\left ({T}_{k}\right )/{L}_{k}\left (t\right )}.$$
(47.45)
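The homogeneity assumption in Equation (47.44) holds exactly in the Black model, which gives a simple way to see what it means. The sketch below, with hypothetical parameter values, checks that scaling the forward rate and the strike by the same factor scales the caplet price by that factor, and that the price factors into \(\delta {D}_{k+1}{L}_{k}\) times a function of moneyness alone.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_caplet(L, X, sigma, tau, delta, D):
    """Black caplet price: delta * D * [L N(d1) - X N(d2)]."""
    s = sigma * math.sqrt(tau)
    d1 = (math.log(L / X) + 0.5 * s * s) / s
    return delta * D * (L * norm_cdf(d1) - X * norm_cdf(d1 - s))

def normalized_price(M, sigma, tau):
    """Black price per unit of delta * D * L as a function of moneyness M = X / L."""
    s = sigma * math.sqrt(tau)
    d1 = (-math.log(M) + 0.5 * s * s) / s
    return norm_cdf(d1) - M * norm_cdf(d1 - s)

# Hypothetical inputs.
L, X, sigma, tau, delta, D = 0.05, 0.045, 0.2, 2.0, 0.25, 0.9

price = black_caplet(L, X, sigma, tau, delta, D)
price_scaled = black_caplet(2.0 * L, 2.0 * X, sigma, tau, delta, D)
price_factored = delta * D * L * normalized_price(X / L, sigma, tau)
print(price, price_scaled / 2.0, price_factored)
```

The practical benefit is one fewer argument in the price function to be estimated nonparametrically.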

4.2 Empirical Results

In this section, we present nonparametric estimates of the probability densities of LIBOR rates under physical and forward martingale measures. In particular, we document the dependence of the forward densities on the slope and volatility factors of LIBOR rates.

Figure 47.2 presents nonparametric estimates of the forward densities at different levels of the slope and volatility factors at 2, 3, 4, and 5 year maturities. The two levels of the slope factor correspond to a flat and a steep forward curve, while the two levels of the volatility factor represent low and high volatility of LIBOR rates. The 95% confidence intervals are obtained through simulation. The forward densities should have a zero mean since LIBOR rates under appropriate forward measures are martingales. The expected log percentage changes of the LIBOR rates are slightly negative due to a Jensen's inequality adjustment. We normalize the forward densities so that they integrate to one. However, we do not have enough data at the right tail of the distribution at 4 and 5 year maturities. We do not extrapolate the data to avoid potential biases.

Fig. 47.2

Nonparametric estimates of the LIBOR forward densities at different levels of the slope and volatility factors. The slope factor is defined as the difference between the 10 and 2-year 3-month LIBOR forward rates. The volatility factor is defined as the first principal component of EGARCH-filtered spot volatilities and has been normalized to a mean that equals one. The two levels of the slope factor correspond to flat and steep term structures, while the two levels of the volatility factor correspond to low and high levels of volatility

Figure 47.2 documents three important features of the nonparametric LIBOR forward densities. First, the log-normal assumption underlying the popular LIBOR market models is grossly violated in the data, and the forward densities across all maturities are significantly negatively skewed. Second, all the forward densities depend significantly on the slope of the term structure. For example, moving from a flat to a steep term structure, the forward densities across all maturities become much more dispersed and more negatively skewed. Third, the forward densities also depend on the volatility factor. Under both flat and steep term structures, the forward densities generally become more compact when the volatility factor increases. This is consistent with a mean reverting volatility process: High volatility right now leads to low volatility in the future and more compact forward densities.

To better illustrate the dependence of the forward densities on the two conditioning variables, we also regress the quantiles of the forward densities on the two factors. We choose quantiles instead of moments of the forward densities in our regressions for two reasons. First, quantiles are much easier to estimate. While quantiles can be obtained from the CDF function, which is the first derivative of the price function, moments require integrations of the forward density, which is the second derivative of the price function. Second, a wide range of quantiles provide a better characterization of the forward densities than a few moments, especially for the tail behaviors of the densities.
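The first reason can be illustrated with a small sketch. Differentiating the caplet price in Equation (47.38) once in the strike gives \(\partial C/\partial X = -\delta {D}_{k+1}\left (1 - F\left (X\right )\right )\), so the forward CDF is \(F\left (X\right ) = 1 + \left (\partial C/\partial X\right )/\left (\delta {D}_{k+1}\right )\) and a quantile follows by inverting it. The code below uses Black prices and hypothetical parameters only so that the answer can be verified against the known log-normal median; it is not the estimator used in this chapter.

```python
import math

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_caplet(L, X, sigma, tau, delta, D):
    """Black caplet price: delta * D * [L N(d1) - X N(d2)]."""
    s = sigma * math.sqrt(tau)
    d1 = (math.log(L / X) + 0.5 * s * s) / s
    return delta * D * (L * norm_cdf(d1) - X * norm_cdf(d1 - s))

def forward_cdf(price_fn, X, delta, D, h=1e-5):
    """Forward CDF at X from the first strike derivative of the price."""
    slope = (price_fn(X + h) - price_fn(X - h)) / (2.0 * h)
    return 1.0 + slope / (delta * D)

def forward_quantile(price_fn, q, delta, D, lo, hi, tol=1e-12):
    """Invert the forward CDF by bisection on the strike interval [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if forward_cdf(price_fn, mid, delta, D) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical inputs.
L, sigma, tau, delta, D = 0.05, 0.2, 2.0, 0.25, 0.9
price_fn = lambda X: black_caplet(L, X, sigma, tau, delta, D)

median_from_prices = forward_quantile(price_fn, 0.5, delta, D, lo=0.001, hi=0.5)
median_exact = L * math.exp(-0.5 * sigma ** 2 * tau)  # log-normal median
print(median_from_prices, median_exact)
```

Only one numerical derivative is needed here, whereas a moment would require integrating a twice-differentiated price function.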

Suppose we consider I and J levels of the transformed slope and volatility factors in our empirical analysis. For a given level of the two conditioning variables \(\left ({s}_{i},{v}_{j}\right )\), we first obtain a nonparametric estimate of the forward density at a given maturity and its quantiles \({Q}_{x}\left ({s}_{i},{v}_{j}\right )\), where x can range from 0 to 100%. Then we consider the following regression model

$${Q}_{x}\left ({s}_{i},{v}_{j}\right ) = {b}_{0x} + {b}_{1x} \cdot {s}_{i} + {b}_{2x} \cdot {v}_{j} + {b}_{3x} \cdot {s}_{i} \cdot {v}_{j} + {\epsilon}_{x},$$
(47.46)

where i = 1, 2, …, I, and j = 1, 2, …, J. We include the interaction term to capture potential nonlinear dependence of the forward densities on the two conditioning variables.
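A minimal sketch of the regression in Equation (47.46) for a single quantile x is given below. The grid of slope and volatility levels, the coefficient values, and the function name are all hypothetical; the quantiles are generated without noise so that ordinary least squares recovers the coefficients exactly.

```python
import numpy as np

def fit_quantile_regression(s, v, Q):
    """OLS of Q on a constant, s, v, and the interaction s * v."""
    X = np.column_stack([np.ones_like(s), s, v, s * v])
    coef, *_ = np.linalg.lstsq(X, Q, rcond=None)
    return coef  # [b0, b1, b2, b3]

# Hypothetical grid: I = 8 slope levels and J = 6 volatility levels.
slope_levels = np.linspace(0.3, 2.4, 8)
vol_levels = np.linspace(0.6, 1.4, 6)
s_grid, v_grid = np.meshgrid(slope_levels, vol_levels, indexing="ij")
s_flat, v_flat = s_grid.ravel(), v_grid.ravel()

# Synthetic quantiles from made-up coefficients (illustration only).
true_coef = np.array([-0.10, -0.05, 0.08, -0.03])
Q_flat = (true_coef[0] + true_coef[1] * s_flat + true_coef[2] * v_flat
          + true_coef[3] * s_flat * v_flat)

estimated_coef = fit_quantile_regression(s_flat, v_flat, Q_flat)
print(estimated_coef)
```

Repeating the fit for each quantile x traces out the coefficients as functions of x.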

Figure 47.3 reports regression coefficients of the slope and volatility factors for the most complete range of quantiles at each maturity, i.e., \({b}_{1x}\) and \({b}_{2x}\) as a function of x. While Fig. 47.3 includes only the slope and volatility factors as explanatory variables, Fig. 47.4 contains their interaction term as well. In results not reported here, we also include lagged conditioning variables in our regressions, but their coefficients are generally not statistically significant.

Fig. 47.3

Impacts of the slope and volatility factors on LIBOR forward densities. This figure reports regression coefficients of different quantiles of the forward densities at 2, 3, 4, and 5 year maturities on the slope and volatility factors of LIBOR rates in Equation (47.46) without the interaction term

Fig. 47.4

Impacts of the slope and volatility factors (with their interaction term) on LIBOR forward densities. This figure reports regression coefficients of different quantiles of the forward densities at 2, 3, 4, and 5 year maturities on the slope and volatility factors of LIBOR rates and their interaction term in Equation (47.46)

The regression results in Fig. 47.3 are generally consistent with the main findings in Fig. 47.2. The slope coefficients are generally negative (positive) for the left (right) half of the distribution and become more negative or positive at both tails. Consistent with Fig. 47.2, this result suggests that when the term structure steepens, the forward densities become more dispersed and the effect is more pronounced at both tails. One exception to this result is that the slope coefficients become negative and statistically insignificant at the right tail at 5 year maturity. The coefficients of the volatility factor are generally positive (negative) for the left (right) half of the distribution. Although the volatility coefficients start to turn positive at the right tail of the distribution, they are not statistically significant. These results suggest that higher volatility leads to more compact forward densities, a result that is generally consistent with that in Fig. 47.2.

In Fig. 47.4, although the slope coefficients exhibit patterns similar to those in Fig. 47.3, the interaction term changes the volatility coefficients quite significantly. The volatility coefficients become largely insignificant and exhibit quite different patterns from those in Fig. 47.3. For example, the volatility coefficients at 2 and 3 year maturities are largely constant across different quantiles. At 4 and 5 year maturities, they even become negative (positive) for the left (right) half of the distribution. On the other hand, the coefficients of the interaction term exhibit patterns similar to those of the volatility coefficients in Fig. 47.3. These results suggest that the impacts of volatility on the forward densities depend on the slope of the term structure.

Figure 47.5 presents the volatility coefficients at different levels of the slope factor (i.e., \(\hat{{b}}_{2x} +\hat{{b}}_{3x} \cdot {s}_{i},\) where \({s}_{i} = 0.3\) or 2.4). We see clearly that the impact of volatility on the forward densities depends significantly on the slope factor. With a flat term structure, the volatility coefficients generally increase with the quantiles, especially at 3, 4, and 5 year maturities. The volatility coefficients are generally negative (positive) for the left (right) tail of the distribution, although not all of them are statistically significant. However, with a steep term structure, the volatility coefficients are generally positive (negative) for the left (right) half of the distribution for most maturities. Therefore, if the current volatility is high and the term structure is flat (steep), then volatility is likely to increase (decline) in the future. We observe a flat term structure during the early part of our sample, when the Fed raised interest rates to slow down the economy. It could be that the market was more uncertain about the future state of the economy because it felt that a recession was imminent. On the other hand, we observe a steep term structure after the internet bubble burst and the Fed aggressively cut interest rates. It could be that the market felt that the worst was over and thus was less uncertain about the future state of the economy.

Fig. 47.5

Nonlinear dependence of LIBOR forward densities on the volatility factor of LIBOR rates. This figure presents regression coefficients of quantiles of LIBOR forward densities on the volatility factor at different levels of the slope factor. The two levels of the slope factor represent flat and steep term structures

Our nonparametric analysis reveals important nonlinear dependence of the forward densities on both the slope and volatility factors of LIBOR rates. These results have important implications for one of the most important and controversial topics in the current term structure literature, namely the USV puzzle. While existing studies on USV mainly rely on parametric methods, our results provide nonparametric evidence on the importance of USV: Even after controlling for important bond market factors, such as level and slope, the volatility factor still significantly affects the forward densities of LIBOR rates. Even though many existing term structure models have modelled volatility as a mean-reverting process, our results show that the speed of mean reversion of volatility is nonlinear and depends on the slope of the term structure.

Some recent studies have documented interactions between activities in mortgage and interest rate derivatives markets. For example, in an interesting study, Duarte (2008) shows that ATM swaption implied volatilities are highly correlated with prepayment activities in the mortgage markets. Duarte (2008) extends the string model of Longstaff et al. (2001) by allowing the volatility of LIBOR rates to be a function of the prepayment speed in the mortgage markets. He shows that the new model has much smaller pricing errors for ATM swaptions than the original model with a constant volatility or a CEV model. Jarrow et al. (2007) also show that although their LIBOR model with stochastic volatility and jumps can price caps across moneyness reasonably well, the model pricing errors are unusually large during a few episodes with high prepayments in MBS. These findings suggest that if activities in the mortgage markets, notably the hedging activities of government sponsored enterprises, such as Fannie Mae and Freddie Mac, affect the supply/demand of interest rate derivatives, then this source of risk may not be fully spanned by the factors driving the evolution of the term structure.Footnote 23

In this section, we provide nonparametric evidence on the impact of mortgage activities on LIBOR forward densities. Our analysis extends Duarte (2008) in several important dimensions. First, by considering caps across moneyness, we examine the impacts of mortgage activities on the entire forward densities. Second, by explicitly allowing LIBOR forward densities to depend on the slope and volatility factors of LIBOR rates, we examine whether prepayment still has incremental contributions in explaining interest rate option prices in the presence of these two factors.Footnote 24 Finally, in addition to prepayment activities, we also examine the impacts of ARMs origination on the forward densities. Implicit in any ARM is an interest rate cap, which caps the mortgage rate at a certain level. Since lenders of ARMs implicitly sell a cap to the borrower, they might have incentives to hedge such exposures.Footnote 25

Our measures of prepayment and ARMs activities are the weekly refinancing and ARMs indexes based on the weekly surveys conducted by MBAA, respectively. The two indexes, as plotted in Fig. 47.6, tend to be positively correlated with each other. There is an upward trend in ARMs activities during our sample period, which is consistent with what happened in the housing market in the past few years.

Fig. 47.6

Mortgage Bankers Association of America (MBAA) weekly refinancing and ARMs indexes. This figure reports the logs of the refinance and ARMs indexes obtained from weekly surveys conducted by the MBAA

To examine the impacts of mortgage activities on LIBOR forward densities, we repeat the above regressions by including two additional explanatory variables that measure refinance and ARMs activities. Specifically, we refer to the top 20% of the observations of the refinance (ARMs) index as the high prepayment (ARMs) group. After obtaining a nonparametric forward density at a particular level of the two conditioning variables, we define two new variables “Refi” and “ARMs,” which measure the percentages of observations used in estimating the forward density that belong to the high prepayment and ARMs groups, respectively. These two variables allow us to test whether the forward densities behave differently when prepayment/ARMs activities are high. To control for potential collinearity among the explanatory variables, we have orthogonalized any new explanatory variable with respect to existing ones.
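The construction of these variables can be sketched as follows. The code flags the top 20% of an index, computes the share of flagged observations among those used for one density estimate, and orthogonalizes a new regressor by taking the residual from an OLS regression on the existing regressors and a constant. The residual-based orthogonalization is an assumption about one common way to do this, since the exact procedure is not spelled out here, and all data below are simulated placeholders.

```python
import numpy as np

def high_activity_flags(index, top_share=0.20):
    """True for observations in the top `top_share` of the index."""
    threshold = np.quantile(index, 1.0 - top_share)
    return index > threshold

def share_in_high_group(flags, used):
    """Share of the observations entering an estimate that are flagged."""
    return float(flags[used].mean())

def orthogonalize(new_var, existing):
    """Residual of new_var after OLS on a constant and the existing regressors."""
    X = np.column_stack([np.ones(len(new_var)), existing])
    coef, *_ = np.linalg.lstsq(X, new_var, rcond=None)
    return new_var - X @ coef

rng = np.random.default_rng(0)

# Simulated weekly refinance index (placeholder for the MBAA series).
refi_index = rng.lognormal(size=500)
refi_flags = high_activity_flags(refi_index)

# Hypothetical mask of observations used for one density estimate.
used = rng.random(500) < 0.3
refi_share = share_in_high_group(refi_flags, used)

# Simulated regressors at 40 (slope, volatility) levels.
slope = rng.normal(size=40)
vol = rng.normal(size=40)
existing = np.column_stack([slope, vol, slope * vol])
refi_raw = 0.5 * slope + rng.normal(size=40)
refi_orth = orthogonalize(refi_raw, existing)
print(refi_share, existing.T @ refi_orth)
```

After this step the new regressor carries only the variation not already explained by the slope, volatility, and interaction terms.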

Figure 47.7 contains the new regression results with “Refi” and “ARMs” for the four maturities. The coefficients of the slope, volatility, and the interaction term exhibit patterns similar to those in Fig. 47.4.Footnote 26

Fig. 47.7

Impacts of refinance and ARMs activities on LIBOR forward densities. In this figure, for each quantile of LIBOR forward densities at 2, 3, 4, and 5 year maturities, we report regression coefficients of the quantile on (1) the slope and volatility factors and their interaction term as in Equation (47.46); and (2) refinance and ARMs activities

The strongest impacts of ARMs on the forward densities occur at 2 year maturity, as shown in Panel A of Fig. 47.7. There, high ARMs origination shifts the median and the right tail of the forward densities at 2 year maturity toward the right. This finding is consistent with the notion that hedging demand from ARMs lenders for the caps they have shorted might increase the price of OTM caps. One possible reason that the effects of ARMs are more pronounced at 2 year maturity than at 3, 4, and 5 year maturities is that most ARMs are reset within the first 2 years. While high ARMs activities shift the forward density at 2 year maturity to the right, high refinance activities shift the forward densities at 3, 4, and 5 year maturities to the left. We see that the coefficients of Refi at the left tail are significantly negative. While the coefficients are also significantly negative for the middle of the distribution (40–70% quantiles), their magnitudes are much smaller. These patterns can be seen in Panels B, C, and D of Fig. 47.7. Therefore, high prepayment activities lead to much more negatively skewed forward densities. This result is consistent with the notion that investors in MBS might demand OTM floors to hedge their potential losses from prepayments. The coefficients of Refi are more significant at 4 and 5 year maturities because the durations of most MBS are close to 5 years.

Our results confirm and extend the findings of Duarte (2008) by showing that mortgage activities affect the entire forward density and consequently the pricing of interest rate options across moneyness. While prepayment activities affect the left tail of the forward densities at intermediate maturities, ARMs activities affect the right tail of the forward densities at short maturity. Our findings hold even after controlling for the slope and volatility factors and suggest that part of the USV factors could be driven by activities in the mortgage markets.

5 Conclusion

The unspanned stochastic volatility puzzle is one of the most important topics in current term structure modeling. Similar to stochastic volatility in the equity options literature, the existence of USV challenges the benchmark in the current term structure literature, the dynamic term structure models. But it also in part explains why practitioners generally apply HJM-type models to interest rate derivatives, where sometimes the models are applied in an inconsistent manner across securities. Unlike in the equity options literature, where the underlying follows a univariate process, it is more difficult to argue that the stochastic volatilities of yields are not spanned by the existing yield curve factors. In this paper we review the current literature, which mostly supports USV using either bond data or both bond and derivatives data. We present the results in Li and Zhao (2006) that the DTSMs have serious difficulty in hedging interest rate caps. We also present the results from Li and Zhao (2008), who show nonparametrically that both the actual volatility of interest rates and the liquidity component of the implied volatility affect derivative prices after controlling for the yield curve factors. This paper also presents the model developed in Jarrow, Li and Zhao (2007), which is parametrically rich enough to capture a spectrum of derivative prices. We expect that USV will have an effect on the interest rate derivatives literature similar to that of stochastic volatility on the equity options literature, with many more issues to be addressed in the future.

6 Appendix 47A The Derivation for QTSMs

To guarantee the stationarity of the state variables, we assume that ξ permits the following eigenvalue decomposition,

$$\xi = U\Lambda {U}^{-1}$$

where Λ is the diagonal matrix of the eigenvalues that take negative values, \(\Lambda \equiv \mathrm{diag}{\left [{\lambda}_{i}\right ]}_{N}\), and U is the matrix of the eigenvectors of ξ, \(U \equiv \left [{u}_{1}\ {u}_{2}\ \cdots \ {u}_{N}\right ]\). The conditional distribution of the state variables \({X}_{t}\) is multivariate Gaussian with conditional mean

$$\begin{array}{rcl} E\left [{X}_{t+\Delta t}\vert {X}_{t}\right ]& =& U{\Lambda}^{-1}\left [\Phi - {I}_{N}\right ]{U}^{-1}\mu \\ & & +U\Phi {U}^{-1}{X}_{t}\end{array}$$
(47.47)

and conditional variance

$$\mathit{var}\left [{X}_{t+\Delta t}\vert {X}_{t}\right ] = U\Theta {U}^{{\prime}}$$
(47.48)

where Φ is a diagonal matrix with elements exp(λ i Δt) for i = 1, …, N, and Θ is an N-by-N matrix with elements

$$\left [ \frac{{v}_{ij}} {{\lambda}_{i} + {\lambda}_{j}}\left ({e}^{\Delta t\left ({\lambda}_{i}+{\lambda}_{j}\right )} - 1\right )\right ],$$

where \({\left [{v}_{ij}\right ]}_{N\times N} = {U}^{-1}\Sigma {\Sigma}^{{\prime}}{U}^{{\prime}-1}\).
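The conditional moments can be computed directly from the eigendecomposition. The sketch below assumes physical dynamics of the form \(d{X}_{t} = \left [\mu + \xi {X}_{t}\right ]dt + \Sigma d{W}_{t}\) with real, distinct, negative eigenvalues, for which the conditional mean is \({e}^{\xi \Delta t}{X}_{t} + {\xi}^{-1}\left ({e}^{\xi \Delta t} - {I}_{N}\right )\mu\), and it cross-checks the closed forms against a fine Euler integration of the moment equations. The parameter values and function names are hypothetical.

```python
import numpy as np

def ou_conditional_moments(mu, xi, Sigma, x_t, dt):
    """Closed-form conditional mean and variance via xi = U Lambda U^{-1}."""
    lam, U = np.linalg.eig(xi)
    U_inv = np.linalg.inv(U)
    Phi = np.diag(np.exp(lam * dt))
    I = np.eye(len(lam))
    mean = (U @ np.diag(1.0 / lam) @ (Phi - I) @ U_inv @ mu
            + U @ Phi @ U_inv @ x_t)
    v = U_inv @ Sigma @ Sigma.T @ U_inv.T
    lam_sum = lam[:, None] + lam[None, :]
    Theta = v / lam_sum * (np.exp(dt * lam_sum) - 1.0)
    var = U @ Theta @ U.T
    return np.real(mean), np.real(var)

def euler_moments(mu, xi, Sigma, x_t, dt, n_steps=20_000):
    """Cross-check: integrate dm/dt = mu + xi m and dV/dt = xi V + V xi' + Sigma Sigma'."""
    h = dt / n_steps
    m = x_t.astype(float).copy()
    V = np.zeros((len(x_t), len(x_t)))
    SS = Sigma @ Sigma.T
    for _ in range(n_steps):
        m, V = m + h * (mu + xi @ m), V + h * (xi @ V + V @ xi.T + SS)
    return m, V

# Hypothetical two-factor example with eigenvalues -0.5 and -1.0.
mu = np.array([0.02, 0.01])
xi = np.array([[-0.5, 0.1], [0.0, -1.0]])
Sigma = np.array([[0.1, 0.0], [0.05, 0.2]])
x_t = np.array([0.03, -0.01])
dt = 0.5

mean_closed, var_closed = ou_conditional_moments(mu, xi, Sigma, x_t, dt)
mean_euler, var_euler = euler_moments(mu, xi, Sigma, x_t, dt)
print(mean_closed, mean_euler)
```

The closed forms avoid any time discretization, which matters when the moments are evaluated repeatedly inside a likelihood.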

With the specification of market price of risk, we can relate the risk-neutral measure Q to the physical one P as follows,

$$E\left [\frac{dQ} {dP}\vert {\mathcal{F}}_{t}\right ] =\exp \left [-{\int \nolimits \nolimits}_{0}^{t}\zeta {({X}_{s})}^{{\prime}}d{W}_{s} -\frac{1} {2}{\int \nolimits \nolimits}_{0}^{t}\zeta {({X}_{s})}^{{\prime}}\zeta ({X}_{s})ds\right ],\text{for}t \leq T.$$
(47.49)

Applying Girsanov’s theorem, we obtain the risk-neutral dynamics of the state variables

$$d{X}_{t} = \left [\delta + \gamma {X}_{t}\right ]dt + \Sigma d{W}_{t}^{Q}$$

where \(\delta = \mu - \Sigma {\eta}_{0},\gamma = \xi - \Sigma {\eta}_{1},\) and \({W}_{t}^{Q}\) is an N-dimensional standard Brownian motion under measure Q.

Under the above assumptions, a large class of fixed-income securities can be priced in (essentially) closed form (see Leippold and Wu 2002). We discuss the pricing of zero-coupon bonds first and then the pricing of caps. Let V(t, τ) be the time-t value of a zero-coupon bond that pays 1 dollar at time T (τ = T − t). In the absence of arbitrage, the discounted value process \(\exp \left (-{\int \nolimits \nolimits}_{0}^{t}r\left ({X}_{s}\right )ds\right )V (t,\tau )\) is a Q-martingale. Thus the value function must satisfy the fundamental PDE, which requires that the bond’s instantaneous expected return under Q equal the risk-free rate,

$$\begin{array}{rcl} & & \frac{1} {2}tr\left (\Sigma {\Sigma}^{{\prime}}\frac{{\partial}^{2}V (t,\tau )} {\partial {X}_{t}\partial {X}_{t}^{{\prime}}}\right ) + \frac{\partial V (t,\tau )} {\partial {X}_{t}^{{\prime}}} \left (\delta + \gamma {X}_{t}\right ) \\ & & \quad + \frac{\partial V (t,\tau )} {\partial t} = {r}_{t}V (t,\tau ) \end{array}$$
(47.50)

with the terminal condition V (t, 0) = 1. The solution takes the form

$$V (t,\tau ) =\exp \left [-{X}_{t}^{{\prime}}A(\tau ){X}_{t} - b{(\tau )}^{{\prime}}{X}_{t} - c(\tau )\right ],$$

where A(τ), b(τ) and c(τ) satisfy the following system of ordinary differential equations (ODEs),

$$\begin{array}{rcl} \frac{\partial A\left (\tau \right )} {\partial \tau} & =& \Psi + A(\tau )\gamma + {\gamma}^{{\prime}}A(\tau ) - 2A(\tau )\Sigma {\Sigma}^{{\prime}}A(\tau ); \\ \frac{\partial b\left (\tau \right )} {\partial \tau} & =& \beta + 2A(\tau )\delta + {\gamma}^{{\prime}}b(\tau ) - 2A(\tau )\Sigma {\Sigma}^{{\prime}}b\left (\tau \right ); \\ \frac{\partial c\left (\tau \right )} {\partial \tau} & =& \alpha + b{(\tau )}^{{\prime}}\delta -\frac{1} {2}b{(\tau )}^{{\prime}}\Sigma {\Sigma}^{{\prime}}b\left (\tau \right ) \\ & & +\mathit{tr}\left [\Sigma {\Sigma}^{{\prime}}A(\tau )\right ]; \\ \text{with}A(0)& =& {0}_{N\times N};b(0) = {0}_{N};c(0) =0. \end{array}$$

Consequently, the yield-to-maturity, y(t, τ), is a quadratic function of the state variables

$$y(t,\tau ) = \frac{1} {\tau}\left [{X}_{t}^{{\prime}}A(\tau ){X}_{t} + b{(\tau )}^{{\prime}}{X}_{t} + c(\tau )\right ].$$
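The ODE system above generally has to be integrated numerically. The following Python sketch does so with a classical fourth-order Runge–Kutta scheme and then evaluates the quadratic yield formula. It assumes the short rate is specified as \(r({X}_{t}) = {X}_{t}^{{\prime}}\Psi {X}_{t} + {\beta}^{{\prime}}{X}_{t} + \alpha\), consistent with the coefficients appearing in the ODEs; the function names, the step count and the one-factor parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def qtsm_rhs(A, b, c, Psi, beta, alpha, delta, gamma, Sigma):
    """Right-hand side of the ODE system for A(tau), b(tau), c(tau)."""
    S = Sigma @ Sigma.T
    dA = Psi + A @ gamma + gamma.T @ A - 2.0 * A @ S @ A
    db = beta + 2.0 * A @ delta + gamma.T @ b - 2.0 * A @ S @ b
    dc = alpha + b @ delta - 0.5 * b @ S @ b + np.trace(S @ A)
    return dA, db, dc

def solve_qtsm(tau, Psi, beta, alpha, delta, gamma, Sigma, n_steps=400):
    """Integrate from A(0)=0, b(0)=0, c(0)=0 out to maturity tau (RK4)."""
    n = len(beta)
    y = (np.zeros((n, n)), np.zeros(n), 0.0)
    h = tau / n_steps
    p = (Psi, beta, alpha, delta, gamma, Sigma)

    def shift(state, slope, step):
        # state + step * slope, component by component
        return tuple(s + step * k for s, k in zip(state, slope))

    for _ in range(n_steps):
        k1 = qtsm_rhs(*y, *p)
        k2 = qtsm_rhs(*shift(y, k1, 0.5 * h), *p)
        k3 = qtsm_rhs(*shift(y, k2, 0.5 * h), *p)
        k4 = qtsm_rhs(*shift(y, k3, h), *p)
        y = tuple(s + h / 6.0 * (a1 + 2.0 * a2 + 2.0 * a3 + a4)
                  for s, a1, a2, a3, a4 in zip(y, k1, k2, k3, k4))
    return y

def qtsm_yield(X, tau, *params):
    """Yield-to-maturity as a quadratic function of the state vector X."""
    A, b, c = solve_qtsm(tau, *params)
    return (X @ A @ X + b @ X + c) / tau

# Hypothetical one-factor example: r = X^2 + 0.01
params = (np.array([[1.0]]), np.array([0.0]), 0.01,
          np.array([0.05]), np.array([[-0.3]]), np.array([[0.2]]))
y5 = qtsm_yield(np.array([0.1]), 5.0, *params)
```

Setting Ψ = 0 is a convenient sanity check: A(τ) then stays at zero and, in the one-factor case, b(τ) reduces to the familiar Gaussian loading β(e^{γτ} − 1)/γ.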

In contrast, in the ATSMs the yields are linear in the state variables, and therefore the correlations among the yields are solely determined by the correlations of the state variables. Although the state variables in the QTSMs follow a multivariate Gaussian process, the quadratic form of the yields helps to model the time-varying volatility and correlation of bond yields.
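To see why the quadratic form produces time-varying volatility, a short application of Itô’s lemma to the yield formula (a sketch, assuming A(τ) is symmetric and holding τ fixed) gives the diffusion term

$$dy(t,\tau ) = \left (\cdots \right )dt + \frac{1} {\tau}{\left [2A(\tau ){X}_{t} + b(\tau )\right ]}^{{\prime}}\Sigma d{W}_{t},$$

so that the instantaneous variance of the yield is

$$\frac{1} {{\tau}^{2}}{\left [2A(\tau ){X}_{t} + b(\tau )\right ]}^{{\prime}}\Sigma {\Sigma}^{{\prime}}\left [2A(\tau ){X}_{t} + b(\tau )\right ].$$

This variance moves with \({X}_{t}\) even though Σ is constant. With A(τ) = 0, as for yields that are linear in Gaussian state variables, it collapses to the constant \(b{(\tau )}^{{\prime}}\Sigma {\Sigma}^{{\prime}}b(\tau )/{\tau}^{2}\).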

Leippold and Wu (2002) show that a large class of fixed-income securities can be priced in closed-form in the QTSMs using the transform analysis of Duffie et al. (2001). They show that the time-t value of a contract that has an exponential quadratic payoff structure at terminal time T, i.e.,

$$\exp \left (-q({X}_{T})\right ) =\exp \left (-{X}_{T}^{{\prime}}\overline{A}{X}_{T} -{\overline{b}}^{{\prime}}{X}_{T} -\overline{c}\right )$$

has the following form

$$\begin{array}{rcl} \psi \left (q,{X}_{t},t,T\right )& =& {E}_{Q}\left ({e}^{-{\int \nolimits \nolimits}_{t}^{T}r\left ({X}_{s}\right )ds}{e}^{-q({X}_{T})}\vert {\mathcal{F}}_{t}\right ) \\ & =& \exp \left [-{X}_{t}^{{\prime}}A(T - t){X}_{t} - b{(T - t)}^{{\prime}}{X}_{t} - c(T - t)\right ],\end{array}$$
(47.51)

where A(⋅), b(⋅) and c(⋅) satisfy the same system of ODEs as the bond-pricing coefficients above, with the initial conditions \(A(0) = \overline{A}\), \(b(0) = \overline{b}\) and \(c(0) = \overline{c}.\)

The time-t price of a call option with payoff \({\left ({e}^{-q({X}_{T})} - y\right )}^{+}\) at T = t + τ equals

$$\begin{array}{rcl} C\left (q,y,{X}_{t},\tau \right )& =& {E}_{Q}\left ({e}^{-{\int \nolimits \nolimits}_{t}^{T}r\left ({X}_{s}\right )ds}{\left ({e}^{-q({X}_{T})} - y\right )}^{+}\vert {\mathcal{F}}_{t}\right ) \\ & =& {E}_{Q}\left ({e}^{-{\int \nolimits \nolimits}_{t}^{T}r\left ({X}_{s}\right )ds}\left ({e}^{-q({X}_{T})} - y\right ){\mathbf{1}}_{\left \{-q({X}_{T})\geq \ln \left (y\right )\right \}}\vert {\mathcal{F}}_{t}\right ) \\ & =& {G}_{q,q}\left (-\ln \left (y\right ),{X}_{t},\tau \right ) - y{G}_{0,q}\left (-\ln \left (y\right ),{X}_{t},\tau \right ), \\ \end{array}$$

where \({G}_{{q}_{1},{q}_{2}}\left (y,{X}_{t},\tau \right ) =\) \({E}_{Q}\left [{e}^{-{\int \nolimits \nolimits}_{t}^{T}r\left ({X}_{s}\right )ds}{e}^{-{q}_{1}({X}_{T})}{\mathbf{1}}_{\left \{{q}_{2}({X}_{T})\leq y\right \}}\vert {\mathcal{F}}_{t}\right ]\) and can be computed by the inversion formula,

$${G}_{{q}_{1},{q}_{2}}\left (y,{X}_{t},\tau \right ) = \frac{\psi \left ({q}_{1},{X}_{t},t,T\right )} {2} -\frac{1} {\pi}{\int \nolimits \nolimits}_{0}^{\infty}\frac{{e}^{\mathit{ivy}}\psi \left ({q}_{1} +{\mathit{ivq}}_{2}\right ) - {e}^{-\mathit{ivy}}\psi \left ({q}_{1} -{\mathit{ivq}}_{2}\right )} {\mathit{iv}} \mathit{dv}.$$
(47.52)

Similarly, the price of a put option is

$$P\left (q,y,\tau ,{X}_{t}\right ) = y{G}_{0,-q}\left (\ln \left (y\right ),{X}_{t},\tau \right ) - {G}_{q,-q}\left (\ln \left (y\right ),{X}_{t},\tau \right ).$$

We are interested in pricing a cap, which is a portfolio of European call options on future interest rates with a fixed strike price. For simplicity, we assume the face value is 1 and the strike price is \(\overline{r}\). At time 0, let τ, 2τ, …, nτ be the fixed dates for future interest payments. At each fixed date kτ, the \(\overline{r}\)-capped interest payment is given by \(\tau {\left (\mathcal{R}\left (\left (k - 1\right )\tau ,k\tau \right ) -\overline{r}\right )}^{+},\) where \(\mathcal{R}\left (\left (k - 1\right )\tau ,k\tau \right )\) is the τ-year floating interest rate at time (k − 1)τ, defined by

$$\frac{1} {1 + \tau \mathcal{R}\left (\left (k - 1\right )\tau ,k\tau \right )} = \varrho \left (\left (k - 1\right )\tau ,k\tau \right ) = {E}^{Q}\left (\exp \left (-{\int \nolimits \nolimits}_{\left (k-1\right )\tau}^{k\tau}r\left ({X}_{s}\right )ds\right )\vert {\mathcal{F}}_{\left (k-1\right )\tau}\right ).$$

The market value at time 0 of the caplet paying at date kτ can be expressed as

$$\begin{array}{rcl} \mathit{Caplet}\left (k\right )& =& {E}^{Q}\left [\exp \left (-{\int \nolimits \nolimits}_{0}^{k\tau}r\left ({X}_{s}\right )ds\right )\tau {\left (\mathcal{R}\left (\left (k - 1\right )\tau ,k\tau \right ) -\overline{r}\right )}^{+}\right ] \\ & =& \left (1 + \tau \overline{r}\right ){E}^{Q}\left [\exp \left (-{\int \nolimits \nolimits}_{0}^{\left (k-1\right )\tau}r\left ({X}_{s}\right )ds\right ){\left ( \frac{1} {\left (1 + \tau \overline{r}\right )} - \varrho \left (\left (k - 1\right )\tau ,k\tau \right )\right )}^{+}\right ].\end{array}$$

Hence, the pricing of the kth caplet is equivalent to the pricing of a (k − 1)τ-for-τ put (a put expiring at (k − 1)τ on a zero-coupon bond with time to maturity τ) struck at \(K = \frac{1} {\left (1+\tau \overline{r}\right )}.\) Therefore,

$$\begin{array}{rcl} \mathit{Caplet}(k)& =& {G}_{0,-{q}_{\tau}}\left (\ln K,{X}_{\left (k-1\right )\tau},\left (k - 1\right )\tau \right ) \\ & & -\frac{1} {K}{G}_{{q}_{\tau},-{q}_{\tau}}\left (\ln K,{X}_{\left (k-1\right )\tau},\left (k - 1\right )\tau \right ). \\ & & \end{array}$$
(47.53)

Similarly for the kth floorlet

$$\begin{array}{rcl} \mathit{Floorlet}(k)& =& -{G}_{0,{q}_{\tau}}\left (-\ln K,{X}_{\left (k-1\right )\tau},\left (k - 1\right )\tau \right ) \\ & & + \frac{1} {K}{G}_{{q}_{\tau},{q}_{\tau}}\left (-\ln K,{X}_{\left (k-1\right )\tau},\left (k - 1\right )\tau \right ). \\ & & \end{array}$$
(47.54)
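The reduction of a caplet to a scaled put on a zero-coupon bond rests on a payoff identity that is easy to verify numerically. The Python sketch below compares the caplet payoff, discounted from kτ back to (k − 1)τ with the bond price ϱ, against \(\left (1 + \tau \overline{r}\right ){\left (K - \varrho \right )}^{+}\). The function names and the sample values of τ, \(\overline{r}\) and ϱ are assumptions chosen only for illustration.

```python
def discounted_caplet_payoff(bond_price, tau, r_bar):
    """Caplet payoff tau*(R - r_bar)^+ paid at k*tau, valued at (k-1)*tau."""
    floating_rate = (1.0 / bond_price - 1.0) / tau   # from 1/(1 + tau*R) = bond_price
    return bond_price * tau * max(floating_rate - r_bar, 0.0)

def scaled_put_payoff(bond_price, tau, r_bar):
    """(1 + tau*r_bar) puts on the bond, struck at K = 1/(1 + tau*r_bar)."""
    strike = 1.0 / (1.0 + tau * r_bar)
    return (1.0 + tau * r_bar) * max(strike - bond_price, 0.0)

# Hypothetical inputs: quarterly accrual, 5% cap rate, a few bond prices
tau, r_bar = 0.25, 0.05
gaps = [abs(discounted_caplet_payoff(p, tau, r_bar) - scaled_put_payoff(p, tau, r_bar))
        for p in (0.95, 0.97, 0.9877, 0.99, 1.0)]
```

Every entry of `gaps` should be zero up to rounding, both when the caplet is in the money (bond price below K) and when it is out of the money.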

7 Appendix 47B The Implementation of the Kalman Filter

To implement the extended Kalman filter, we first recast the QTSMs into a state-space representation. Suppose we have a time series of observations of yields of L zero-coupon bonds with maturities \(\Gamma = ({\tau}_{1},{\tau}_{2},\ldots ,{\tau}_{L})\). Let Ξ be the set of parameters for the QTSMs, and let \({Y}_{k} = f({X}_{k},\Gamma ;\Xi )\) be the vector of the L observed yields at the discrete time points kΔt, for k = 1, 2, …, K, where Δt is the sampling interval (one day in our case). After the following change of variable,

$${Z}_{k} = {U}^{-1}({\xi}^{-1}\mu + {X}_{k}),$$

we have the state equation:

$${Z}_{k} = \Phi {Z}_{k-1} + {w}_{k},\;\;\;\;\;\;{w}_{k} \sim N(0,\Theta )$$

where Φ and Θ are first introduced in (47.47) and (47.48), and the measurement equation:

$${Y}_{k} = {d}_{k} + {H}_{k}{Z}_{k} + {v}_{k},\;\;{v}_{k} \sim N(0,{R}^{v})$$

where the innovations in the state and measurement equations, \({w}_{k}\) and \({v}_{k}\), follow serially independent Gaussian processes and are independent of each other. The time-varying coefficients of the measurement equation, \({d}_{k}\) and \({H}_{k}\), are determined at the ex ante forecast of the state variables,

$$\begin{array}{rcl}{H}_{k}& =& \frac{\partial f(Uz - {\xi}^{-1}\mu ,\Gamma )} {\partial z}{\mid}_{z={Z}_{k\vert k-1}} \\ {d}_{k}& =& f(U{Z}_{k\vert k-1} - {\xi}^{-1}\mu ,\Gamma ) - {H}_{k}{Z}_{k\vert k-1} + {B}_{k}, \\ \end{array}$$

where \({Z}_{k\vert k-1} = \Phi {Z}_{k-1}\).

In the QTSMs, the transition density of the state variables is multivariate Gaussian under the physical measure. Thus the transition equation in the Kalman filter is exact. The only source of approximation error is the linearization of the quadratic measurement equation. As our estimation uses daily data, the approximation error, which is proportional to the one-day-ahead forecast error, is likely to be minor. Furthermore, we can minimize the approximation error by introducing the correction term \({B}_{k}\) (see footnote 27). The Kalman filter starts with the initial state variable \({Z}_{0} = E({Z}_{0})\) and the variance–covariance matrix \({P}_{0}^{Z}\),

$${P}_{0}^{Z} = E\left [\left ({Z}_{0} - E({Z}_{0})\right ){\left ({Z}_{0} - E({Z}_{0})\right )}^{{\prime}}\right ].$$

The unconditional mean and variance have closed-form expressions that can be derived from Equations (47.47) and (47.48) by letting Δt go to infinity. Given the set of filtering parameters, \(\left \{\Xi ,{R}^{v}\right \}\), we can write down the log-likelihood of the observations based on the Kalman filter

$$\begin{array}{rcl} \log \mathcal{L}\left (Y ;\Xi \right )& =& {\sum \nolimits}_{k=1}^{K}\log f({Y}_{k};{\mathcal{Y}}_{k-1},\left \{\Xi ,{R}^{v}\right \}) \\ & =& -\frac{LK} {2} \log \left (2\pi \right ) -\frac{1} {2}{\sum \nolimits}_{k=1}^{K}\log \left \vert {P}_{k\vert k-1}^{Y}\right \vert -\frac{1} {2}{\sum \nolimits}_{k=1}^{K}\left [{\left ({Y}_{k} -\hat{{Y}}_{k\vert k-1}\right )}^{{\prime}}{\left ({P}_{k\vert k-1}^{Y}\right )}^{-1}\left ({Y}_{k} -\hat{{Y}}_{k\vert k-1}\right )\right ]\\ \end{array}$$

where \({\mathcal{Y}}_{k-1}\) is the information set at time (k − 1)Δt, and \({P}_{k\vert k-1}^{Y}\) is the time-(k − 1)Δt conditional variance of \({Y}_{k}\),

$$\begin{array}{rcl}{P}_{k\vert k-1}^{Y}& =& {H}_{k}{P}_{k\vert k-1}^{Z}{H}_{k}^{{\prime}} + {R}^{v}; \\ {P}_{k\vert k-1}^{Z}& =& \Phi {P}_{k-1}^{Z}{\Phi}^{{\prime}} + \Theta. \end{array}$$
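One recursion of the filter described above can be sketched in Python as follows. This is a minimal illustration, not the authors’ implementation: the function name `ekf_step` is hypothetical, the measurement function `f` and its Jacobian `jac` are assumed to be supplied by the user already expressed in terms of z (that is, with the change of variable applied), and the correction term \({B}_{k}\) is set to zero for simplicity.

```python
import numpy as np

def ekf_step(z_prev, P_prev, y, Phi, Theta, f, jac, R):
    """One predict/update step; returns (z_filt, P_filt, loglik_k)."""
    # Prediction from the exact Gaussian state equation
    z_pred = Phi @ z_prev
    P_pred = Phi @ P_prev @ Phi.T + Theta

    # Linearize the measurement equation at the ex ante forecast
    H = jac(z_pred)
    y_hat = f(z_pred)                  # equals d_k + H_k z_pred when B_k = 0
    P_y = H @ P_pred @ H.T + R         # conditional variance of Y_k

    # Update with the forecast error
    innov = y - y_hat
    gain = P_pred @ H.T @ np.linalg.inv(P_y)
    z_filt = z_pred + gain @ innov
    P_filt = P_pred - gain @ H @ P_pred

    # Contribution of observation k to the Gaussian log-likelihood
    _, logdet = np.linalg.slogdet(P_y)
    loglik = -0.5 * (len(y) * np.log(2.0 * np.pi) + logdet
                     + innov @ np.linalg.solve(P_y, innov))
    return z_filt, P_filt, loglik

# Hypothetical scalar example with a quadratic measurement y = z^2 + noise
f = lambda z: np.array([z[0] ** 2])
jac = lambda z: np.array([[2.0 * z[0]]])
z1, P1, ll1 = ekf_step(np.array([1.0]), np.array([[0.1]]), np.array([1.0]),
                       np.array([[0.9]]), np.array([[0.04]]), f, jac,
                       np.array([[0.01]]))
```

Summing `loglik` over k = 1, …, K gives the log-likelihood to be maximized over the parameters; in practice one would also guard against an ill-conditioned \({P}_{k\vert k-1}^{Y}\).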

8 Appendix 47C Derivation of the Characteristic Function

The characteristic function of \(\log {L}_{k}({T}_{k})\) has the solution

$$\psi \left ({u}_{0},{Y}_{t},t,{T}_{k}\right ) =\exp \left [a(s) + {u}_{0}\log ({L}_{k}\left (t\right )) + B{(s)}^{{\prime}}{V}_{t}\right ],$$

where a(s) and B(s), \(0 \leq s \leq {T}_{k}\), satisfy the following system of Riccati equations:

$$\begin{array}{rcl} \frac{d{B}_{j}(s)} {ds} & =& -{\kappa}_{j}^{k+1}{B}_{j}(s) + \frac{1} {2}{B}_{j}^{2}(s){\xi}_{j}^{2} \\ & & +\frac{1} {2}\left [{u}_{0}^{2} - {u}_{0}\right ]{U}_{s,j}^{2},\ \ \ \ 1 \leq j \leq N, \\ \end{array}$$
$$\begin{array}{rcl} \frac{da(s)} {ds} & =& {\sum \nolimits}_{j=1}^{N}{\kappa}_{j}^{k+1}{\theta}_{j}^{k+1}{B}_{j}(s) \\ & & +{\lambda}_{J}\left [\Gamma \left ({u}_{0}\right ) - 1 - {u}_{0}\left (\Gamma (1) - 1\right )\right ], \\ \end{array}$$

where the function Γ is

$$\Gamma (c) =\exp \left ({\mu}_{J}^{k+1}c + \frac{1} {2}{\sigma}_{J}^{2}{c}^{2}\right ).$$

The initial conditions are \(B(0) = {0}_{N\times 1}\), a(0) = 0, and \({\kappa}_{j}^{k+1}\) and \({\theta}_{j}^{k+1}\) are the parameters of the \({V}_{j}(t)\) process under \({\mathbb{Q}}^{k+1}.\)

For any l < k, given that \(B({T}_{l}) = {B}_{0}\) and \(a({T}_{l}) = {a}_{0}\), we have closed-form solutions for \(B({T}_{l+1})\) and \(a({T}_{l+1})\). Define the constants \(p = \left [{u}_{0}^{2} - {u}_{0}\right ]{U}_{s,j}^{2}\), \(q = \sqrt{{\left ({\kappa}_{j}^{k+1} \right )}^{2} + p{\xi}_{j}^{2}},\) \(c = \frac{p} {q-{\kappa}_{j}^{k+1}}\) and \(d = \frac{p} {q+{\kappa}_{j}^{k+1}}.\) Then we have

$$\begin{array}{rcl}{B}_{j}({T}_{l+1})& =& c - \frac{\left (c + d\right )(c - {B}_{j0})} {\left (d + {B}_{j0}\right )\exp (-q\delta ) + \left (c - {B}_{j0}\right )},1 \leq j \leq N, \\ a({T}_{l+1})& =& {a}_{0} -{\sum \nolimits}_{j=1}^{N}\left [{\kappa}_{j}^{k+1}{\theta}_{j}^{k+1}\left (d\delta + \frac{2} {{\xi}_{j}^{2}}\ln \left (\frac{\left (d + {B}_{j0}\right )\exp (-q\delta ) + \left (c - {B}_{j0}\right )} {c + d} \right )\right )\right ] \\ & & +{\lambda}_{J}\delta \left [\Gamma \left ({u}_{0}\right ) - 1 - {u}_{0}\left (\Gamma (1) - 1\right )\right ], \\ \end{array}$$

if p ≠ 0, and \({B}_{j}({T}_{l+1}) = {B}_{j0}\), \(a({T}_{l+1}) = {a}_{0}\) otherwise. \(B({T}_{k})\) and \(a({T}_{k})\) can then be computed by iteration.