1 Introduction

A fundamental question in commodity futures markets is whether volatility drives higher expected excess returns. To study the dynamic relation between expected returns and volatility in the term structure of commodity futures, a precise measure of the time-varying volatility and an accurate account of the term structure of expected returns are needed. In this paper, we employ a discrete-time term structure model with volatility components to examine the risk-return relationship in commodity futures markets.

Our model is built on the discrete-time stochastic volatility model of Treasury yields (see Le et al. 2010; Ghysels et al. 2014). Not only does the model retain the tractability of affine term structure models, but it also accurately captures the time variation in conditional variances. In the spirit of Engle and Lee (1999), we allow the conditional variance of futures prices to be driven by a long-run component and a short-run component, each of which follows its own GARCH-type process with a different degree of persistence. Moreover, following the ARCH-in-mean literature (Engle et al. 1987), we allow the conditional variance to affect the futures prices and expected excess returns.

The model can be applied to any futures market. We focus on crude oil for three reasons. First, as of April 2020, crude oil accounted for \(43.72\%\) of the Standard and Poor's Goldman Sachs Commodity Index (S&P GSCI) by dollar value, and West Texas Intermediate (WTI) crude oil futures alone accounted for \(25.31\%\), making WTI the largest single commodity in the index by dollar value. Second, crude oil is the most liquid commodity traded in derivatives markets. Third, many studies show that crude oil has a distinguishable impact on non-energy commodity futures, the stock market, and the economy, so that this market influences a variety of other markets.Footnote 1 We therefore use data from the crude oil futures market to test our model empirically.

In our empirical implementation, we adopt three state variables, characterized as the level, slope, and curvature of futures prices. To keep the model parsimonious, we endow only the volatility of the level factor with the component GARCH dynamics. We estimate the model using WTI crude oil futures prices with maturities of one to twelve months from January 1990 to July 2016. We identify a structural break in June 2005 within this sample period; therefore, we examine the risk-return relationship in the two subsamples in addition to the full sample. We document a statistically significant positive relationship between volatility and expected excess returns (risk premiums) for all maturities in the first subsample, but a significant negative relation for all maturities in the second subsample.Footnote 2

A negative relation between volatility and the risk premium on futures contracts seems counterintuitive; however, this may not be surprising given the increasingly speculative activities in commodity markets. Hamilton and Wu (2014) find negative risk premia in the crude oil market after 2005. Other studies (see, for example, Li 2018; Heath 2019) document similar results and ascribe them to the financialization of commodity markets. The increase in financial trading activity coincided with lower risk premiums in commodity markets, since financial traders are willing to accept a lower or even negative risk premium in exchange for the diversification benefits of commodities. Brunetti and Reiffen (2014) and Cheng and Xiong (2014) suggest that financialization brings investors to the long side of commodity trading, which mitigates the hedging pressure of short hedgers and leads to lower risk premiums.

The proposed model allows us to differentiate the contributions of long- and short-run volatility components to futures risk premiums. We decompose the predicted one-month excess returns of futures and find that the short-run component is as important as the long-run component in explaining futures risk premia in the full sample. However, in the two subsamples, we find that the short-run volatility component is the most important contributor. Notably, in the first subsample, January 1990-May 2005, the short-run volatility component itself, on average, accounts for about \(96\%\) of the predictive component of futures risk premiums. In the second subsample, June 2005-July 2016, the short-run volatility component explains on average about \(64\%\) of the futures risk premiums. These results suggest that it is the short-run component of volatility, not the long-run component, which matters more for return predictability through exposure to level risk. Our findings also provide a caution against using a single volatility factor to forecast expected returns in the crude oil futures market.Footnote 3

Our model adequately captures the cross-sectional and time-series properties of the level and volatility of log futures prices. In the full sample, the average in-sample root mean square error (RMSE) of log futures prices is about 24 basis points across 12 maturities. The average in-sample RMSE of the one-month conditional volatilities of log futures prices is about ten basis points across 12 maturities in the full sample. The fits of conditional volatilities are better for futures with intermediate maturities (5 to 8 months). The proposed model fits the high-volatility periods of the early 1990s, early 2000s, and 2008–2009 well. These findings also hold in the two subsamples.

Our study is related to the literature that uses continuous-time stochastic volatility models to price commodity derivatives. Trolle and Schwartz (2009) develop a model with a stochastic cost of carry for pricing commodity derivatives in the presence of unspanned stochastic volatility. Chiarella et al. (2016) use a more elaborate stochastic volatility model with multiple state variables to accommodate a more flexible volatility term structure. They use both futures and options data to improve the fit, and they implement a filtering technique in the estimation because the state variables and the volatility process are unobservable. In contrast, our model is simpler yet remains flexible in fitting the term structure of futures prices and volatilities using only futures data. Our model is set in discrete time with three observed state variables, so its estimation is simplified substantially because no filtering is necessary. More importantly, we incorporate GARCH-type volatility into the term structure model and distinguish the long-run and short-run components of volatility; therefore, the model's fit to the volatilities of futures prices can be improved substantially.

Our results are complementary to the literature that examines the predictability of commodity futures returns (see, for example, Bessembinder 1992; De Roon et al. 2000; Erb and Harvey 2006; Gorton et al. 2013; Szymanowska et al. 2014; Hong and Yogo 2012), but we focus on studying how different volatility components affect expected futures returns of crude oil. We also contribute to the literature that documents the realized risk-return relation in commodity futures markets. For example, Baur and Dimpfl (2018) and Chen and Mu (2020) study the relationship between return shocks and subsequent variance and relate it to the asymmetric volatility effect, while Chiarella et al. (2016) focus on the contemporaneous relationship between futures returns and innovations in the volatility and justify it using the convenience yield effect. We investigate the relationship between volatility and expected futures returns, and find a change in the relationship, which we attribute to the financialization of commodity markets.

The paper proceeds as follows: Section 2 presents the term structure model with long-run and short-run GARCH volatility components. Section 3 describes the estimation method and data. Section 4 discusses the modeling choice and the selection of samples in the empirical implementation. Section 5 provides parameter estimates and analyzes the long-run and short-run components. Section 6 investigates the model’s implications on risk-return relation. Section 7 shows the in-sample performance of the model. Section 8 concludes.

2 Model

In this section, we first present a discrete-time term structure model with long-run and short-run GARCH-type volatility components for the oil futures prices. Subsequently, we discuss the model’s implications on oil futures risk premia.

2.1 The futures pricing with volatility components

We assume that the first M principal components, \(PC_{t}\), capture the term structure of the log futures prices in an affine functionFootnote 4

$$\begin{aligned} f_{t}^{n}=A_{n}+B_{n}PC_{t}, \end{aligned}$$
(1)

where \(f_{t}^{n}\) is the n-maturity log futures price. \(A_{n}\) and \(B_{n}\) are free parameters: \(A_{n}\) is a scalar, and \(B_{n}\) is a \(1\times M\) matrix.Footnote 5

We assume the state variables, \(PC_{t}\), follow affine dynamics with conditionally Gaussian innovations

$$\begin{aligned} PC_{t+1}=K_{0}+K_{1}PC_{t}+K_{V}V_{t}+\varepsilon _{t+1}, \end{aligned}$$
(2)

where \(K_{0}\) and \(\varepsilon _{t+1}\) are \(M\times 1\) vectors, and \(K_{1}\) is an \(M\times M\) diagonal matrix. \(V_{t}\) is an \(M_{V}\times 1\) GARCH-in-mean vector that captures the volatility information relevant for predicting the state variables, \(PC_{t}\), where \(M_{V}\) is the total number of volatility factors for all M principal components. \(K_{V}\) is an \(M\times M_{V}\) matrix. \(\varepsilon _{t}\) is assumed to be distributed \(N(0,\Sigma _{t})\). We now turn to the specifications of the conditional variance \(\Sigma _{t}\) and the GARCH-in-mean term \(V_{t}\).

The conditional variance matrix \(\Sigma _{t}\) is an \(M\times M\) diagonal matrix whose diagonal elements, \(\sigma _{i,t}^{2}\), \(i=1,...,M\), are time-varying. Building on Engle and Lee (1999), we decompose the total variance \(\sigma _{i,t}^{2}\) into a long-run component, \(L_{i,t}\), and a short-run component, \(S_{i,t}\). Each component follows its own GARCH-type process with different persistence, captured by \(\rho _{i,L}\) and \(\rho _{i,S}\):

$$\begin{aligned} \sigma _{i,t}^{2}= & {} L_{i,t}+S_{i,t}, \nonumber \\ L_{i,t}= & {} \sigma _{i}^{2}(1-\rho _{i,L})+\rho _{i,L}L_{i,t-1}+\phi _{i}\left( \varepsilon _{i,t}^{2}-\sigma _{i,t-1}^{2}\right) , \nonumber \\ S_{i,t}= & {} \rho _{i,S}S_{i,t-1}+\varphi _{i}\left( \varepsilon _{i,t}^{2}-\sigma _{i,t-1}^{2}\right) , \end{aligned}$$
(3)

where \(\varepsilon _{i,t}\) is the ith element of the \(M\times 1\) vector \(\varepsilon _{t}\). \(\rho _{i,L}\), \(\rho _{i,S}\), \(\phi _{i}\), and \(\varphi _{i}\) are all positive scalars. Without loss of generality, we impose the restriction that \(\rho _{i,L}>\rho _{i,S}\). \(L_{i,t}\) is a low-frequency trending component of \(\sigma _{i,t}^{2}\), whereas \(S_{i,t}\) is a high-frequency transitory component with a population mean of zero. \(\sigma _{i}^{2}\) is the constant unconditional mean of the conditional variance of the ith principal component. Note that \(E[\sigma _{i,t}^{2}]=E[L_{i,t}]= \frac{\sigma _{i}^{2}\left( 1-\rho _{i,L}\right) }{1-\rho _{i,L}}=\sigma _{i}^{2}\) as long as \(\rho _{i,L}<1\). The term \(\varepsilon _{i,t}^{2}-\sigma _{i,t-1}^{2}\) represents the volatility innovation. We also impose the following restriction, as in Engle and Lee (1999), to guarantee that \(\sigma _{i,t}^{2}\) and \(L_{i,t}\) are strictly positive:

$$\begin{aligned} \phi _{i}+\varphi _{i}<\rho _{i,S}<\rho _{i,L}<1. \end{aligned}$$
(4)

This decomposition allows us to differentiate the impact of recent volatilities on futures prices and risk premia from that of distant volatility information.Footnote 6
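As an illustration, the recursion in Eq. (3) and the restriction in Eq. (4) can be sketched in Python for a single factor. This is a minimal sketch under stated assumptions rather than the estimation code used in the paper: the function names, the starting values \(L_{i,0}=\sigma _{i}^{2}\) and \(S_{i,0}=0\), and any value chosen for \(\sigma _{i}^{2}\) are hypothetical choices made for illustration.

```python
import numpy as np


def satisfies_restriction(phi, varphi, rho_s, rho_l):
    """Check Eq. (4): phi + varphi < rho_S < rho_L < 1, all parameters positive."""
    positive = min(phi, varphi, rho_s, rho_l) > 0
    return positive and (phi + varphi < rho_s < rho_l < 1)


def simulate_components(T, sigma2, rho_l, rho_s, phi, varphi, seed=0):
    """Simulate Eq. (3) for one factor.

    Returns arrays (L, S, eps) of length T + 1. The starting values
    L[0] = sigma2 and S[0] = 0 are an assumption (the unconditional means).
    """
    rng = np.random.default_rng(seed)
    L = np.empty(T + 1)
    S = np.empty(T + 1)
    eps = np.zeros(T + 1)
    L[0], S[0] = sigma2, 0.0
    for t in range(1, T + 1):
        var_prev = L[t - 1] + S[t - 1]            # sigma^2_{i,t-1}
        eps[t] = np.sqrt(var_prev) * rng.standard_normal()
        shock = eps[t] ** 2 - var_prev            # volatility innovation
        L[t] = sigma2 * (1 - rho_l) + rho_l * L[t - 1] + phi * shock
        S[t] = rho_s * S[t - 1] + varphi * shock
    return L, S, eps
```

For example, with the full-sample level-factor estimates reported in Sect. 5 (\(\rho _{1,L}=0.9974\), \(\rho _{1,S}=0.1495\), \(\phi _{1}=0.0094\), \(\varphi _{1}=0.0805\)), `satisfies_restriction(0.0094, 0.0805, 0.1495, 0.9974)` holds because \(0.0899<0.1495<0.9974<1\).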

Following the ARCH-in-mean literature pioneered by Engle et al. (1987), we allow the conditional variance to affect oil futures prices. \(V_{t}\) in Eq. (2) is an \(M_{V}\times 1\) vector consisting of the long-run and short-run volatility components of the M principal components.Footnote 7

$$\begin{aligned} V_{t}=\left[ \begin{array}{c} \sqrt{L_{1,t}} \\ \sigma _{1,t}-\sqrt{L_{1,t}} \\ ... \\ \sqrt{L_{M,t}} \\ \sigma _{M,t}-\sqrt{L_{M,t}} \end{array} \right] . \end{aligned}$$
(5)

For a model with M principal components as state variables, there are M long-run volatility components, \(L_{i,t}\), and M short-run volatility components, \(S_{i,t}\). Therefore \(V_{t}\) is a \(2M\times 1\) vector (\(M_{V}=2M\)). The volatility components affect the conditional means of the state variables, \(PC_{t}\), through \(K_{V}\), an \(M\times M_{V}\) block diagonal matrix

$$\begin{aligned} K_{V}=diag\left( \begin{array}{ccc} \left[ K_{V1}\right]&...&\left[ K_{VM}\right] \end{array} \right) , \end{aligned}$$
(6)

where \(K_{Vi}\), \(i=1,...,M\), is a \(1\times 2\) vector. In particular, note that \(L_{i,t}\) and \(S_{i,t}\) are known as of time t, given the history of the ith principal component \(PC_{i}\) until t and the initial values \(L_{i,0}\) and \(S_{i,0}\), as follows

$$\begin{aligned} L_{i,t}= & {} \sigma _{i}^{2}(1-\rho _{i,L})+\rho _{i,L}L_{i,t-1}+\phi _{i} \nonumber \\&\times \left( \left( PC_{i,t}-K_{0i}-K_{1i}PC_{i,t-1}-K_{Vi}\left[ \begin{array}{c} \sqrt{L_{i,t-1}} \\ \sigma _{i,t-1}-\sqrt{L_{i,t-1}} \end{array} \right] \right) ^{2}-\left( L_{i,t-1}+S_{i,t-1}\right) \right) , \end{aligned}$$
(7)
$$\begin{aligned} S_{i,t}= & {} \rho _{i,S}S_{i,t-1}+\varphi _{i} \nonumber \\&\times \left( \left( PC_{i,t}-K_{0i}-K_{1i}PC_{i,t-1}-K_{Vi}\left[ \begin{array}{c} \sqrt{L_{i,t-1}} \\ \sigma _{i,t-1}-\sqrt{L_{i,t-1}} \end{array} \right] \right) ^{2}-\left( L_{i,t-1}+S_{i,t-1}\right) \right) , \end{aligned}$$
(8)

where \(K_{0i}\) is the ith element in \(K_{0}\) and \(K_{1i}\) is the ith diagonal term in \(K_{1}\).
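Given a parameter vector, Eqs. (7) and (8) can be evaluated recursively over the observed factor, which is why no filtering of latent states is required. The following Python sketch does this for one principal component. The function name, the argument layout, and the default starting values (\(L_{i,0}=\sigma _{i}^{2}\), \(S_{i,0}=0\)) are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def filter_components(pc, k0, k1, kv, sigma2, rho_l, rho_s, phi, varphi,
                      L0=None, S0=0.0):
    """Evaluate Eqs. (7)-(8) for one observed factor series pc[0..T-1].

    kv is the 1 x 2 vector K_Vi (long-run loading first, short-run second).
    Returns (L, S, eps), where eps[t] is the residual implied by Eq. (2).
    """
    pc = np.asarray(pc, dtype=float)
    kv = np.asarray(kv, dtype=float)
    T = len(pc)
    L = np.empty(T)
    S = np.empty(T)
    eps = np.zeros(T)
    L[0] = sigma2 if L0 is None else L0   # assumed initial long-run variance
    S[0] = S0                             # assumed initial short-run variance
    for t in range(1, T):
        var_prev = L[t - 1] + S[t - 1]                    # sigma^2_{i,t-1}
        sqrt_l = np.sqrt(L[t - 1])
        v_prev = np.array([sqrt_l, np.sqrt(var_prev) - sqrt_l])
        eps[t] = pc[t] - k0 - k1 * pc[t - 1] - kv @ v_prev
        shock = eps[t] ** 2 - var_prev
        L[t] = sigma2 * (1 - rho_l) + rho_l * L[t - 1] + phi * shock
        S[t] = rho_s * S[t - 1] + varphi * shock
    return L, S, eps
```

Because the residual at each date depends only on observed factors and on variance components computed one step earlier, the whole path of \(L_{i,t}\) and \(S_{i,t}\) is a deterministic function of the data and the parameters.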

To summarize, our model is fully characterized by Eqs. (1)–(8). The full parameter set is given by \(\Theta =\{A, B, K_{0}, K_{1}, K_{V}, \rho _{i,L}, \rho _{i,S}, \phi _{i}, \varphi _{i}, \sigma _{i}^{2}\}\), where A is an \(N\times 1\) vector and B is an \(N\times M\) matrix, with \(A_{n}\) the element of A and \(B_{n}\) the row of B that correspond to the n-maturity contract. N denotes the total number of available maturities in the sample.

2.2 Implications on futures risk premia

Using Eq. (1), we can write the model’s prediction of the one-period ahead n-maturity futures price as

$$\begin{aligned} f_{t+1|t}^{n}=A_{n}+B_{n}PC_{t+1|t}. \end{aligned}$$
(9)

From Eq. (2), the predicted one-period ahead state variables, PC, are given by

$$\begin{aligned} PC_{t+1|t}=K_{0}+K_{1}PC_{t}+K_{V}V_{t}. \end{aligned}$$
(10)

Following Szymanowska et al. (2014) and Heath (2019), we define the one-period excess futures return as the continuously compounded return from holding an n-maturity futures contract for one period

$$\begin{aligned} rf_{t+1}^{n}=f_{t+1}^{n-1}-f_{t}^{n}. \end{aligned}$$
(11)
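Equation (11) amounts to a shifted difference of the price panel. A minimal Python sketch follows, assuming the log prices are stored as a \(T\times N\) array whose columns are ordered by maturity from 1 to N periods; this layout and the function name are choices made here for illustration.

```python
import numpy as np


def excess_returns(log_prices):
    """One-period excess returns from Eq. (11).

    log_prices: T x N array, column j holds the (j + 1)-period maturity.
    Returns a (T - 1) x (N - 1) array whose column j holds
    rf^{n}_{t+1} = f^{n-1}_{t+1} - f^{n}_{t} for n = j + 2.
    """
    f = np.asarray(log_prices, dtype=float)
    # Next period's price of the (n-1)-maturity contract minus
    # today's price of the n-maturity contract.
    return f[1:, :-1] - f[:-1, 1:]
```

With 12 maturities, the result has 11 columns, corresponding to the 2-month to 12-month contracts.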

The model-implied risk premium for n-maturity futures is the conditional expectation of the excess holding return. Using Eqs. (9)–(11) and (1), we have

$$\begin{aligned} rf_{t+1|t}^{n}= & {} f_{t+1|t}^{n-1}-f_{t}^{n} \nonumber \\= & {} A_{n-1}+B_{n-1}PC_{t+1|t}-(A_{n}+B_{n}PC_{t}) \nonumber \\= & {} A_{n-1}+B_{n-1}(K_{0}+K_{1}PC_{t}+K_{V}V_{t})-(A_{n}+B_{n}PC_{t}) \nonumber \\= & {} A_{n-1}+B_{n-1}K_{0}-A_{n}+(B_{n-1}K_{1}-B_{n})PC_{t}+B_{n-1}(K_{V}V_{t}) \nonumber \\= & {} \text {constant}+(B_{n-1}K_{1}-B_{n})PC_{t}+B_{n-1}(K_{V}V_{t}). \end{aligned}$$
(12)

For each futures contract with n periods to maturity, the effect of the volatility components, \(V_{t}\), on its risk premium, \(rf_{t+1|t}^{n}\), is the product of the exposure of the log futures prices to each principal component, \(B_{n-1}\), and the effect of the volatility component on the forecast of future principal components, \(K_{V}\). Since \(K_{V}\) is a block diagonal matrix, \(K_{Vi}\) (\(i=1,...,M\)) determines the relationship between the volatility components of the ith principal component and futures risk premium for a given price loading \(B_{n-1}\).
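The last line of Eq. (12) can be evaluated directly once the loadings and dynamics are known. The sketch below, in Python, returns the total risk premium together with its \(PC_{t}\)-related and \(V_{t}\)-related parts. The function name and argument layout are assumptions made for illustration, and any numerical inputs other than estimates quoted in the text are hypothetical.

```python
import numpy as np


def risk_premium(A_prev, A_n, B_prev, B_n, K0, K1, KV, pc_t, v_t):
    """Model-implied one-period risk premium of Eq. (12).

    A_prev, B_prev: loadings of the (n-1)-maturity contract (A_{n-1}, B_{n-1}).
    A_n, B_n:       loadings of the n-maturity contract.
    Returns (total, pc_part, vol_part).
    """
    B_prev = np.asarray(B_prev, dtype=float)
    B_n = np.asarray(B_n, dtype=float)
    const = A_prev + B_prev @ K0 - A_n
    pc_part = (B_prev @ K1 - B_n) @ pc_t        # (B_{n-1} K_1 - B_n) PC_t
    vol_part = B_prev @ (KV @ v_t)              # B_{n-1} (K_V V_t)
    return const + pc_part + vol_part, pc_part, vol_part
```

In the parsimonious specification where only the level factor has time-varying variance, `KV` has a single non-zero row, so `vol_part` reduces to the level loading of the \((n-1)\)-maturity contract times \(K_{V1}V_{t}\).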

Furthermore, Eq. (12) provides a term structure of risk-return relations across the maturity spectrum as n varies. We can decompose the predictive component of the one-period return for n-maturity futures into a \(PC_{t}\)-related component and a \(V_{t}\)-related component. We calculate the following fraction as a proxy for the contribution of the \(PC_{t}\)-related component to futures risk premia

$$\begin{aligned} \frac{Var\left[ (B_{n-1}K_{1}-B_{n})PC_{t}\right] }{Var\left[ (B_{n-1}K_{1}-B_{n})PC_{t}+B_{n-1}(K_{V}V_{t})\right] }. \end{aligned}$$
(13)

The contribution of the volatility component \(V_{t}\) to futures risk premia can be computed in the same way,

$$\begin{aligned} \frac{Var\left[ B_{n-1}(K_{V}V_{t})\right] }{Var\left[ (B_{n-1}K_{1}-B_{n})PC_{t}+B_{n-1}(K_{V}V_{t})\right] }. \end{aligned}$$
(14)

The contribution of the long-run volatility component of the ith principal component is

$$\begin{aligned} \frac{Var\left[ B_{n-1}(i)(K_{Vi}(1)\sqrt{L_{i,t}})\right] }{Var\left[ (B_{n-1}K_{1}-B_{n})PC_{t}+B_{n-1}(K_{V}V_{t})\right] }, \end{aligned}$$
(15)

where \(K_{Vi}(1)\) is the first element of \(K_{Vi}\) and corresponds to the long-run component. \(B_{n-1}(i)\) is the ith element of \(B_{n-1}\). The contribution of the short-run volatility component of the ith principal component is

$$\begin{aligned} \frac{Var\left[ B_{n-1}(i)(K_{Vi}(2)(\sigma _{i,t}-\sqrt{L_{i,t}}))\right] }{ Var\left[ (B_{n-1}K_{1}-B_{n})PC_{t}+B_{n-1}(K_{V}V_{t})\right] }, \end{aligned}$$
(16)

where \(K_{Vi}(2)\) is the second element of \(K_{Vi}\) and corresponds to the short-run component.
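The ratios in Eqs. (13)–(16) are sample variance ratios of the fitted components and can be computed from the time series of \(PC_{t}\) and \(V_{t}\). The Python sketch below is one way to do so; the function name, the returned dictionary, and the use of the population variance (`ddof=0`) are assumptions made for illustration.

```python
import numpy as np


def contribution_shares(B_prev, B_n, K1, KV, pc, v):
    """Variance ratios of Eqs. (13)-(16) for one maturity n.

    pc: T x M array of state variables; v: T x M_V array of volatility terms
    ordered as in Eq. (5). Returns the PC share (13), the total volatility
    share (14), and one share per element of V_t (15)-(16).
    """
    B_prev = np.asarray(B_prev, dtype=float)
    B_n = np.asarray(B_n, dtype=float)
    pc_part = np.asarray(pc) @ (B_prev @ K1 - B_n)     # length-T series
    # Column j is B_{n-1} K_V[:, j] * V_t[j]; with block-diagonal K_V this is
    # B_{n-1}(i) K_Vi(1) sqrt(L_it) or B_{n-1}(i) K_Vi(2) (sigma_it - sqrt(L_it)).
    vol_terms = np.asarray(v) * (B_prev @ KV)
    vol_part = vol_terms.sum(axis=1)
    denom = np.var(pc_part + vol_part)
    return {
        "pc": np.var(pc_part) / denom,
        "vol": np.var(vol_part) / denom,
        "components": np.var(vol_terms, axis=0) / denom,
    }
```

Because the components are generally correlated, these ratios need not sum to one; they are proxies for relative importance, as stated in the text.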

3 Estimation method and data

3.1 Estimation method

We assume that futures prices are measured with errors \(e_{t}\), a vector of measurement errors that is assumed to be i.i.d. normal. We further assume that the errors on each maturity have equal variance \(\sigma _{e}^{2}\), so that the likelihood gives equal weight to all maturities in fitting the term structure of futures prices.

The conditional likelihood function of futures prices is

$$\begin{aligned} \Gamma (f_{t}|f_{t-1},\Theta )=\Gamma (f_{t}|PC_{t},\Theta _{1},\sigma _{e}^{2})\times \Gamma (PC_{t}|PC_{t-1},\Theta _{2}), \end{aligned}$$
(17)

where \(\Theta _{1}=\{A,\) \(B\}\) and \(\Theta _{2}=\{K_{0},K_{1},K_{V},\rho _{i,L},\rho _{i,S},\phi _{i},\varphi _{i},\sigma _{i}^{2}\}\). The first term captures the cross-sectional dependence of futures prices on risk factors \(PC_{t}\), and its logarithm is given by

$$\begin{aligned} \log \Gamma (f_{t}|PC_{t},\Theta _{1},\sigma _{e}^{2})=\text {constant} -0.5N\log (\sigma _{e}^{2})-0.5\frac{\left\| e_{t}\right\| ^{2}}{ \sigma _{e}^{2}}. \end{aligned}$$
(18)

Recall that N is the number of maturities in the term structure of futures prices. \(\left\| e_{t}\right\|\) denotes the Euclidean norm of the vector of measurement errors. The second term of Eq. (17) captures the time-series dynamics of the risk factors. It corresponds to the likelihood of a conditionally Gaussian VAR process. The logarithm is given by

$$\begin{aligned}&\log \Gamma (PC_{t}|PC_{t-1},\Theta _{2}) \nonumber \\&\quad =\text {constant}-0.5\log (\det (\Sigma _{t}))-0.5\left\| \Sigma _{t}^{-0.5}(PC_{t}-K_{0}-K_{1}PC_{t-1}-K_{V}V_{t-1})\right\| ^{2}. \end{aligned}$$
(19)
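The two log-likelihood terms in Eqs. (18) and (19) can be sketched in Python as follows. This is an illustrative sketch, not the authors' estimation code. The function names are hypothetical, the "constant" terms are taken to be the usual Gaussian constants, and the caller is assumed to supply residuals and conditional variances that are already aligned in time (for example, from a recursion such as Eqs. (7)–(8)).

```python
import numpy as np


def measurement_loglik(errors, sigma_e2):
    """Sum over t of Eq. (18).

    errors: T x N array of measurement errors e_t; sigma_e2: common variance.
    The constant is taken to be -0.5 * N * log(2 * pi) per period.
    """
    e = np.asarray(errors, dtype=float)
    T, N = e.shape
    return (-0.5 * T * N * np.log(2 * np.pi)
            - 0.5 * T * N * np.log(sigma_e2)
            - 0.5 * (e ** 2).sum() / sigma_e2)


def transition_loglik(resid, variances):
    """Sum over t of Eq. (19) for a diagonal conditional variance matrix.

    resid:     T x M array of PC_t - K_0 - K_1 PC_{t-1} - K_V V_{t-1}.
    variances: T x M array holding the diagonal of the matching Sigma.
    With a diagonal matrix, log det is the sum of log variances.
    """
    r = np.asarray(resid, dtype=float)
    s2 = np.asarray(variances, dtype=float)
    return (-0.5 * np.log(2 * np.pi) - 0.5 * np.log(s2) - 0.5 * r ** 2 / s2).sum()
```

The total log-likelihood implied by Eq. (17) is the sum of the two terms, which could then be passed (with its sign flipped) to a numerical optimizer.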

3.2 Data

We use the Chicago Mercantile Exchange (CME) WTI crude oil futures settlement prices on the last business day of each month from January 1990 to July 2016 for the nearest 12 maturities.Footnote 8 Panel A of Fig. 1 plots the log futures prices for the full sample. The log futures prices with different maturities comove closely with each other. The log futures prices are, on average, increasing over time for all maturities. Panel A of Table 1a presents the summary statistics of the log futures prices using the full sample. On average, the term structure of log futures prices is flat, and the volatility of log futures prices is relatively higher for longer maturities. The prices for all maturities are highly persistent, with slightly higher autocorrelation for long-term futures than for short-term futures. The log futures prices exhibit excess kurtosis and positive skewness for all maturities.

Fig. 1

Log futures prices and returns. Notes to Figure: Panels A and B of the figure plot the log futures prices with 1-month to 12-month maturities \(f^{n}_{t}\) and the one-month excess holding returns with 2-month to 12-month maturities \(rf^{n}_{t+1}\). Panel C plots the differences of log futures prices between 12-month and 1-month maturities \(f^{12}_{t}-f^{1}_{t}\). The sample period is from 1990:01 to 2016:07 (color figure online)

Table 1 Summary statistics
Table 2 Parameter estimates

Panel B of Fig. 1 plots the one-month excess futures returns for the full sample. The futures returns fluctuate significantly in our sample for all maturities, and the movements across different maturities are highly correlated with each other. The returns dropped to around \(-40\%\) during the financial crisis. Two other major drops occurred in January 1991 (after the U.S. ordered a drawdown of the Strategic Petroleum Reserve and Kuwaiti oil facilities were destroyed by Iraq) and in November 1998 (during the 1998 oil price crisis, when the price hit a 25-year low on November 30, 1998).

Panel B of Table 1a presents the summary statistics of futures returns using the full sample. The average returns are all positive ranging from \(0.0404\%\) (2-month) to \(0.4730\%\) (8-month). On average, the returns are higher for long maturities than short maturities. The volatility of one-month returns is monotonically decreasing with maturity. The returns are much less persistent than the log futures prices. The one-month excess returns exhibit excess kurtosis and negative skewness for all maturities.

4 Modeling choice and sample discussion

4.1 Modeling choice

To show that the cross-section of log futures prices can be well described by a low-dimensional term structure model, we conduct a principal component analysis on the data. Figure 2 plots the loadings of the first three principal components for the levels and changes in log oil futures prices by maturity. The first three principal components are well characterized as the level, slope, and curvature of the log futures prices. The figure also displays the fraction of the variance that is accounted for by the principal components. We find that the first three principal components together account for more than 99.99% of the variance. We conclude that three state variables are needed in the term structure model to capture both the levels and changes in the log prices, and that they are sufficient to account for the variation of the log futures prices. This conclusion is consistent with Schwartz (1997) and Casassus and Collin-Dufresne (2005).Footnote 9 We, therefore, adopt a model with the observed first three principal components of oil futures prices as the state variables: \(M=3\) for the model specified in Sect. 2.
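A principal component analysis of this kind can be sketched in Python with an eigendecomposition of the sample covariance matrix of the price panel. The sketch is illustrative only: the function name is hypothetical, and demeaning the panel before extracting components is an assumption, since the text does not state how the components are normalized.

```python
import numpy as np


def principal_components(log_prices, m=3):
    """Extract the first m principal components of a T x N price panel.

    Returns (scores, loadings, explained), where scores is T x m, loadings is
    N x m, and explained holds each component's share of total variance.
    """
    f = np.asarray(log_prices, dtype=float)
    centered = f - f.mean(axis=0)               # assumed: demeaned panel
    cov = np.cov(centered, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1]            # re-sort to descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    loadings = eigvec[:, :m]
    scores = centered @ loadings
    explained = eigval[:m] / eigval.sum()
    return scores, loadings, explained
```

Applied to the \(T\times 12\) panel of log futures prices, `explained` would correspond to the variance fractions reported in Fig. 2, and the columns of `scores` would play the role of \(PC_{t}\) up to sign and scaling conventions.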

Our next step in the estimation is to diagnose the conditional variance \(\sigma _{i,t}^{2}\) of the first three principal components. We estimate the most flexible version of the model as specified in Sect. 2 using the full sample, and plot the estimated time-varying variance \(\sigma _{i,t}^{2}\) and its long- and short-run components, \(L_{i,t}\) and \(S_{i,t}\), for all three principal components in Fig. 3. We find that the total conditional variance of the first principal component is much larger and more volatile than those of the other two principal components. This observation also holds for the long-run and short-run variance components. The conditional variance and the variance components for the second and third principal components are comparatively small and nearly time-invariant.

Fig. 2

Principal component analysis of levels and changes in log futures prices. Notes to Figure: This figure plots the loadings (y-axis) of the first three principal components for the term structure of log futures prices \(f_{t}\) (Panel A) and the term structure of changes in log futures prices \(f_{t+1}-f_{t}\) (Panel B) by maturity. The diamond line (red) is for the first principal component. The cross line (blue) is for the second principal component. The square line (green) is for the third principal component. The percentage numbers reported in the top right corner of the figure represent the fraction of variance that is accounted for by the principal components. The sample period is from 1990:01 to 2016:07 (color figure online)

The principal component analysis in Fig. 2 indicates that \(99.76\%\) of the variance of the oil futures curve can be explained by the first principal component. We therefore may not need to specify time-varying variance for all three factors. Including time-varying variance and the variance components for all risk factors would make our model over-parameterized. Based on the evidence in Figs. 2 and 3, we focus on a parsimonious model in which only the first (level) factor has a time-varying variance for the remainder of the paper. In this specification, \(\rho _{i,L}\), \(\rho _{i,S}\), \(\phi _{i}\), and \(\varphi _{i}\) are set to zero for \(i=2\) and 3 in Eq. (3). \(V_{t}\) is a \(2\times 1\) vector and \(K_{V}\) is a \(3\times 2\) matrix in Eq. (2). The first row of \(K_{V}\) is \(K_{V1}\), a \(1\times 2\) vector, and the remaining rows in \(K_{V}\) are zeros. According to Eqs. (9)–(12), the volatility components at t can affect the futures prices and risk premium at \(t+1\), but only through the level factor. \(K_{V1}\) determines the relationship between volatility components and futures risk premium for a given price loading \(B_{n-1}\).

Fig. 3

Variance and variance components for the three state variables. Notes to Figure: This figure plots the estimated total conditional variance \(\sigma ^{2}_{i,t}\) (Panel A) and its long-run component \(L_{i,t}\) (Panel B) and short-run component \(S_{i,t}\) (Panel C) for the first three principal components using the full sample. The solid line (blue) is for the first principal component. The dotted line (red) is for the second principal component, and the dashed line (green) is for the third principal component. The full sample period is from 1990:01 to 2016:07 (color figure online)

4.2 Sample discussion

Our full sample is long, covering 26 years of data. Estimation over such a long sample may be subject to bias due to structural breaks. As indicated in Panel A of Fig. 1, the log futures prices are markedly higher in the second half of the sample than in the first half. We also plot the basis (the difference between 12-month futures prices and 1-month futures prices) in Panel C of Fig. 1. The basis is on average negative in the first half of the sample, suggesting the crude oil futures market was in backwardation most of the time. However, after 2005, the basis becomes positive on average, implying the crude oil futures market was in contango most of the time. Because futures prices converge to the spot price at expiration, a market in backwardation (contango) implies that the futures price is expected to increase (decrease), leading to a positive (negative) risk premium in the futures market (Gorton and Rouwenhorst 2006). These observations are consistent with the literature documenting the structural change (from backwardation to contango) and the low (or even negative) risk premiums since 2005, the so-called financialization and electronification period (see, for example, Hamilton and Wu 2014; Brunetti and Reiffen 2014; Cheng and Xiong 2014; Heath 2019). The Chow (1960) structural break test on the log futures prices with one month to expiry suggests a breakpoint on June 6, 2005 in our sample. We therefore conduct our analyses for the full sample and also for two subsamples, the first covering January 1990 to May 2005 and the second June 2005 to July 2016, in the remainder of the paper.Footnote 10

Table 1b and c show the summary statistics of log futures prices and one-month excess returns for the two subsamples. The futures prices on average are higher in the second subsample than in the first subsample for all maturities. Futures prices exhibit negative skewness for all maturities in the second subsample, but positive skewness for all maturities in the first subsample. The volatility of log futures prices is lower in each of the two subsamples than in the full sample for all maturities. On average, the log futures prices are more persistent in the full sample than in the two subsamples. In terms of the one-month excess return, the differences between the two subsamples are pronounced. The returns are negative for all maturities in the second subsample, while they are positive in the first subsample. Moreover, the one-month excess returns exhibit negative skewness for all maturities in the second subsample, but positive skewness for all maturities in the first subsample.

5 Parameter estimates and variance components

Table 2 reports the parameter estimates for the full sample (Panel A) and the two subsamples (Panels B and C). The level factor is the most persistent, and the third variable is strongly mean-reverting for all three samples. For example, the estimated \(K_{1}\) is 0.9930 for the level factor and 0.6718 for the third variable in the full sample. The estimated constant variance (\(\sigma _{i}^{2},\) \(i=1,2,3\)) is similar across different samples, with the first variable having a much larger variance than the second and third variables.

The long-run variance component \(L_{1,t}\) is extremely persistent, with the persistence parameter, \(\rho _{1,L}\), estimated to be 0.9974 for the full sample as shown in Table 2. The short-run variance component is much less persistent. The estimated \(\rho _{1,S}\) is 0.1495, suggesting a half-life of about five months. These findings are robust to the two subsamples. The estimated \(\phi _{1}\) and \(\varphi _{1}\) are 0.0094 and 0.0805, respectively, for the full sample. This implies that about \(9\%\) of the total variance (combining the long- and short-run components) shock in each month enters next month’s conditional variance \(\sigma _{1,t}^{2}\). The estimated \(\phi _{1}\) and \(\varphi _{1}\) are smaller in the two subsamples than in the full sample. All estimates associated with the volatility process are statistically significant.

We also examine the time-series properties of the filtered long-run and short-run variance components. Figure 4 plots the estimated time-varying variance \(\sigma _{1,t}^{2}\) and its long- and short-run components, \(L_{1,t}\) and \(S_{1,t}\), for the full sample and the two subsamples. There is a close correspondence between the total variance and the long-run variance component for all three samples. The mean zero short-run component adds high-frequency noise to the long-run component, which results in a variance dynamic with more spikes in all three samples. In terms of magnitude, the short-run component is much smaller than the long-run component, and by design the short-run component can be negative.

Fig. 4

Variance and variance components for the first state variable by samples. Notes to Figure: This figure plots the estimated total conditional variance \(\sigma ^{2}_{1,t}\) and its long-run \(L_{1,t}\) and short-run components \(S_{1,t}\) for the first principal component using the full sample (Panel A) and two subsamples (Panels B and C). The solid line (blue) is the estimated total variance of the first principal component. The dotted line (red) is the estimated long-run component, and the dashed line (green) is the estimated short-run component. The full sample period is from 1990:01 to 2016:07. The first subsample is from 1990:01 to 2005:05. The second subsample is from 2005:06 to 2016:07 (color figure online)

6 Economic implications of the model

In this section, we discuss the economic implications of the term structure model with different volatility components. We first examine the model’s implications for the risk-return relation in the crude oil futures market. We then investigate the relative contributions of the long-run and short-run volatility components in forecasting the term structure of futures risk premiums.

6.1 Risk-return relation

As shown in Eq. (12), the volatility component, \(V_{t}\), summarizes the volatility information relevant for forecasting futures prices and risk premia. Therefore, the estimated \(K_{V}\) indicates the relationship between volatility and futures risk premia. Recall that in the model with only the first (level) factor having a time-varying variance, \(K_{V}\) is a \(3\times 2\) matrix. The first row of the matrix is \(K_{V1}\), and the second and third rows are zeros. The volatility component therefore affects the futures risk premium only through \(K_{V1}\).

Table 2 shows that the estimates of \(K_{V1}\) are negative for both the long- and short-run volatility components in the full sample. The estimate associated with the long-run volatility is \(-0.1690\) and that for the short-run volatility is \(-0.3189\), but neither estimate is statistically significant in the full sample. When we examine the estimates in the two subsamples, we find a significantly negative \(K_{V1}\) in the second subsample, June 2005 to July 2016. In the first subsample, January 1990 to May 2005, we reach the opposite conclusion: a significantly positive relationship between volatility risk and expected excess returns for all maturities. In both subsamples, the short-run volatility component has a much larger impact on the level factor next month than the long-run volatility component. For example, the estimates of \(K_{V1}\) in the first subsample are 0.4206 for the long-run volatility component and 4.3284 for the short-run volatility component. In the second subsample, the short-run volatility component has a more negative estimate (\(-3.8606\)) than the long-run volatility component (\(-1.7947\)), suggesting that the short-run volatility component has a greater negative impact on the level factor next month than the long-run volatility component.

With a negative \(K_{V1}\), higher volatility predicts a lower level factor next month, as shown in Eq. (10). A lower level factor is associated with a lower excess holding return. As explained in Eq. (12), the effect of volatility on the risk premium is the product of the exposure of the log futures prices to each risk factor, \(B_{n-1}\), and the effect that volatility has on the forecast of future risk factors, \(K_{V}\). The price loadings on the level factor, \(B_{n-1}\), are positive for all maturities. Therefore, a negative and significant \(K_{V1}\) indicates a significantly negative risk-return relation in the second subsample of the crude oil futures market. The significantly negative relationship holds for both the long- and short-run volatility risks.
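The sign argument can be illustrated with a short numerical sketch. It assumes that the volatility-related part of the expected excess return takes the form \(B_{n-1}^{\prime }K_{V}V_{t}\), as the description of Eq. (12) suggests. The \(K_{V1}\) values are the second-subsample estimates quoted above; the loadings `B_n_minus_1` and the volatility state `V_t` are hypothetical numbers chosen only to respect the positive sign of the level loading.

```python
# K_V is 3x2: only the level factor (first row) loads on the two volatility components.
K_V1 = [-1.7947, -3.8606]             # second-subsample estimates (long-run, short-run)
K_V = [K_V1, [0.0, 0.0], [0.0, 0.0]]

# Hypothetical loadings on (level, slope, curvature); only the positive
# sign of the level loading is taken from the text.
B_n_minus_1 = [0.30, -0.10, 0.05]

# Hypothetical volatility state V_t = (long-run, short-run) components.
V_t = [0.08, 0.01]

def volatility_premium(B, K, V):
    """Return B' K V, the assumed volatility-related part of the expected excess return."""
    KV = [sum(k * v for k, v in zip(row, V)) for row in K]  # effect on each factor forecast
    return sum(b * x for b, x in zip(B, KV))

effect = volatility_premium(B_n_minus_1, K_V, V_t)
print(effect < 0)  # a positive level loading times a negative K_V1 gives a negative effect
```

Because the second and third rows of \(K_{V}\) are zero, the slope and curvature loadings do not enter the result; only the sign of the level loading and of \(K_{V1}\) matter.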

Fig. 5

Model-Implied Excess Holding Return. Notes to Figure: This figure plots the three-month averages of the model-implied one-month excess holding return \(rf_{t+1|t}^{n}\) for futures with different maturities n=2, 5, 9 and 12 months. We plot the results for the two subsamples using their corresponding estimated parameters respectively. The first subsample is from 1990:01 to 2005:05. The second subsample is from 2005:06 to 2016:07. The solid line (blue) represents the average return using the first subsample. The dotted line (red) represents the average return using the second subsample (color figure online)

Figure 5 plots the three-month averages of the expected one-month risk premia, \(rf_{t+1|t}^{n}\), for futures with different maturities. To conserve space, we limit ourselves to 2-, 5-, 9- and 12-month futures. We plot the results for the two subsamples using their corresponding estimated parameters. Comparing these with the volatility shown in Figure 4, we observe positive spikes in risk premia in the first subsample when volatility was relatively high (around 1991-1992), whereas in the second subsample we observe negative spikes when volatility was high (around 2008-2009). This is consistent with the results in Table 2, which show a positive estimate of \(K_{V1}\) in the first subsample and a negative one in the second subsample, implying that the sign of the relationship between volatility and expected excess returns changes over time.

A negative relationship between volatility and expected returns seems to run counter to classic asset pricing theory in the equity market. However, commodity markets play a different role in the economy through the demand and supply of commodities and risk-shifting among different traders. Producers wish to minimize their future price risks; therefore, they are willing to pay a premium for their hedging positions. This is the so-called "normal backwardation" proposed by Keynes (1930), under which the risk premium on average accrues to the buyers of the futures contracts. However, the risk premium could in theory accrue to either buyers or sellers, given that large consumers hedge too. Therefore, the sign of the risk premium in commodity markets could be positive or negative, resulting in a lower average risk premium than that in the equity market.

As seen in Panel B of Table 1a, the average return of the crude oil futures market is about \(4.64\%\) \((0.3869\%\times 12)\) per year, which is much lower than the historical average in the equity market. Recall that in Panel C of Fig. 1, the crude oil futures market was in backwardation most of the time during the first subsample and in contango most of the time during the second subsample. As the convenience yield effect in Pindyck (2001) and Chiarella et al. (2016) implies, a contango commodity futures market gives rise to a negative risk premium–volatility relation. Therefore, the relationship between risk premium and volatility for the first subsample should be positive while it should be negative for the second subsample.

There is a large literature on financialization, which features the increasing participation of financial traders, especially commodity index traders. For example, Tang and Xiong (2012) show that financial traders’ portfolio rebalancing can spill price volatility over from outside markets to commodity markets. On the other hand, the literature argues that financial traders invest in commodities for portfolio diversification purposes and are willing to accept a lower or even a negative risk premium. Long-only commodity index traders could also exert hedging pressure on the buy side and therefore lower the risk premium (Brunetti and Reiffen 2014; Cheng and Xiong 2014). Acharya et al. (2013) document a negative relationship between the risk premium associated with hedging demand and speculative activities in the commodity markets. Hamilton and Wu (2014) and Li (2018) use different methodologies, and both find negative risk premiums during financialization. This is consistent with our findings of negative excess returns in Panel B of Table 1c and negative average risk premiums in Fig. 5. In addition to confirming the negative sign of the risk premium, our results show that a significantly negative relationship exists between volatility and expected return after 2005.

Overall, these findings suggest that the risk-return relation in the crude oil futures market depends on market conditions, such as market volatility, backwardation and contango, market participants’ positions, and speculation levels. We find a significantly positive relation before financialization and a significantly negative relation after financialization. In both subsamples, we find that the short-run volatility component has a greater impact on the level factor next month than the long-run volatility component. Therefore, the short-run volatility affects the futures risk premium more than the long-run volatility through exposure to level risk. This finding cautions against using a single volatility factor to capture the risk-return relation in the oil futures market. Unless the single volatility factor is predominantly short-run, the risk-return relation cannot be accurately characterized by a single volatility factor.

6.2 Risk premia decomposition

Following Eqs. (13)–(14), we decompose the predictive component of the one-month excess return of n-maturity futures into the contributions of the three principal components and the two volatility components. The results for the full sample and the two subsamples are reported in Table 3. The contributions of the three principal components (\(PC_{t}\)-related component) are reported in the first four rows of each panel in Table 3. Each column corresponds to a given futures maturity, while the last column is the average across all maturities. The first three rows in each panel report the individual contribution of each of the first three principal components.

Table 3 Risk premia decomposition

The slope factor makes the most important contribution to futures risk premiums among the three principal components in all three samples. In the futures market, the slope factor corresponds to the basis between futures and spot prices; therefore, this finding is consistent with the literature showing that the futures risk premium is related to the futures basis (e.g., Erb and Harvey 2006; Gorton and Rouwenhorst 2006; Liu and Tang 2011). We find that the slope factor explains on average about \(48.24\%\) of the predictive component of futures risk premia in the full sample (Panel A of Table 3). This contribution increases with maturity, reaching its peak of about \(57.54\%\) for the 12-month futures in the full sample. The level factor is also important in explaining the expected one-month excess return in the full sample, especially for short-dated futures. We observe that about \(36.75\%\) of the predictive component of the excess return for 2-month futures is explained by the level factor. The curvature factor explains a small proportion of futures risk premia for all maturities, especially for intermediate maturities. Because the principal components are uncorrelated with each other, summing up the first three rows in each panel gives the overall contribution of all three principal components to futures risk premiums. The three principal components explain on average about \(78.32\%\) of the futures risk premia in the full sample.

The last three rows in each panel of Table 3 report the contributions of the long-run, short-run, and total volatility components. The total volatility component (\(V_{t}\)-related component) contributes on average about \(11.36\%\) of the predictive component of futures risk premia in the full sample, a much smaller proportion than that of the three principal components (\(PC_{t}\)-related component). This is because the estimates of \(K_{V1}\) are small and insignificant in the full sample, as discussed in Sect. (6.1). On average about \(5.81\%\) of the predictive component of futures risk premia can be attributed to the long-run volatility component. This fraction reaches its maximum of about \(6.84\%\) for the 8-month futures and is slightly lower for very short- and long-dated futures. The contribution of the short-run volatility component is similar in magnitude to that of the long-run volatility component. On average the short-run volatility component explains about \(5.89\%\) of the futures risk premia. The largest contribution of the short-run volatility component is also for the intermediate-maturity futures (8-month). The short-run volatility component explains as much as the long-run volatility component does because the estimated \(K_{V1}\) associated with the short-run volatility component is larger, as discussed in Sect. (6.1). Even though the magnitude of the short-run volatility component, \(\sigma _{1,t}-\sqrt{ L_{1,t}}\), is smaller than that of the long-run volatility component, \(\sqrt{L_{1,t}}\), as indicated in Fig. 4, the contributions of the two volatility components are similar in the full sample.

Note that the sum of the contributions of the long-run and short-run volatility components can be different from the contribution of the total volatility component, because of the correlation between the long-run and short-run volatility components. As presented in Panel A of Table 4, the unconditional correlation between the two volatility components is \(-0.03\) in the full sample, \(-0.14\) in the first subsample, and 0.07 in the second subsample. For example, in the full sample, the sum of the contributions of the two volatility components for 2-month futures is \(4.34\%+4.40\%=8.74\%\), which is larger than the contribution of the total volatility component, \(8.49\%\). This is due to the negative correlation between the two components. However, the difference is very small since the correlation is very small. This suggests that the two volatility components represent mostly independent channels through which risk premiums can be determined.

Also note that the sum of the contributions of the three principal components (\(PC_{t}\)-related component) and the total volatility (\(V_{t}\)-related component) can be different from \(100\%\). The sum ranges from \(88.36\%\) to \(100.25\%\) for all maturities in all three samples, suggesting that the average correlation between the \(PC_{t}\)-related and \(V_{t}\)-related components is positive. Panel B of Table 4 presents the unconditional correlation between these two components. The average correlation is 0.18 in the full sample, 0.04 in the first subsample, and 0.09 in the second subsample. Since the correlation is not large, especially in the two subsamples, we can treat the principal components and the total volatility component as largely independent channels affecting the futures risk premiums.
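The link between these gaps and the reported correlations can be checked with a back-of-the-envelope calculation. The sketch below assumes that each contribution is a variance share, so that \(var(a+b)=var(a)+var(b)+2\rho \sqrt{var(a)\,var(b)}\); this reading is an assumption about the decomposition rather than something stated in this section, and the function name `implied_correlation` is illustrative. The inputs are the full-sample percentages quoted in the text.

```python
import math

def implied_correlation(share_a, share_b, share_total):
    """Correlation implied by var(a+b) = var(a) + var(b) + 2*rho*sd(a)*sd(b), treating shares as variances."""
    return (share_total - share_a - share_b) / (2.0 * math.sqrt(share_a * share_b))

# Long- and short-run volatility components, 2-month futures, full sample (percent).
rho_vol = implied_correlation(4.34, 4.40, 8.49)

# PC-related and V-related components, full-sample averages; jointly they make up 100%.
rho_pc_vol = implied_correlation(78.32, 11.36, 100.0)

print(f"{rho_vol:.3f}, {rho_pc_vol:.3f}")
```

Under this assumption the implied correlations are roughly \(-0.03\) and 0.17, in line with the full-sample values of \(-0.03\) and 0.18 reported in Table 4. The match is only approximate because the second calculation uses shares averaged across maturities.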

Table 4 Unconditional correlation between different components

As discussed in Sect. (6.1), the risk-return relations in the two subsamples are quite different from that in the full sample. We find consistent results in the risk premia decomposition analysis. Panels B and C of Table 3 show that the total volatility (\(V_{t}\)-related component) explains a large proportion of the futures risk premia in the two subsamples. It accounts for about \(95\%\) of the predictive component of futures risk premia on average across all maturities in both subsamples. By contrast, the contributions of the three principal components (\(PC_{t}\)-related component) are marginal in both subsamples. The slope factor is still the most important of the three principal components in explaining futures risk premiums for all maturities. The total volatility component dominates and contributes more than the three principal components in the two subsamples, because the estimates of \(K_{V1}\) are large (in absolute value) and significant in the two subsamples.

Moreover, the short-run volatility component makes the most important contribution in explaining the futures risk premiums for all maturities in both subsamples. In the first subsample, January 1990 to May 2005, the short-run volatility component by itself accounts on average for about \(96.31\%\) of the predictive component of futures risk premiums. In the second subsample, June 2005 to July 2016, the short-run volatility component explains on average about \(64.28\%\). This is consistent with the finding in Sect. (6.1): the short-run volatility affects the futures risk premium more than the long-run volatility through exposure to level risk, as indicated by the estimates of \(K_{V1}\). As shown in Panel B of Table 2, the estimate of \(K_{V1}\) for the short-run volatility component (4.3284) is roughly ten times as large as the estimate for its long-run counterpart (0.4206).

In summary, the short-run volatility component is at least as important as the long-run volatility component in explaining the predictive component of futures risk premia in all samples. In particular, the short-run volatility component is the most important contributor among the three principal components and the two volatility components in the two subsamples. These findings are consistent with those in Sect. (6.1) and suggest that it is important to distinguish between long-run and short-run volatility components. A single volatility factor is not able to capture the explanatory power of the high-frequency transitory component in the volatility of the crude oil futures market.

7 Model performance

While our main objective is to study the risk-return relation in the term structure of oil futures, it is also important that our model adequately captures the stylized facts in the data. We therefore examine the model’s in-sample performance in this section. We investigate the model’s ability to simultaneously fit the first and second moments of log futures prices.

7.1 Fit of futures prices

Table 5 reports the in-sample RMSEs of the term structure of log futures prices for the full sample and the two subsamples. In the full sample, the average RMSE across 12 maturities is about 24 basis points. In the two subsamples, the average RMSEs are about 24 and 13 basis points, respectively. Our model provides a good fit for the term structure of log futures prices. This finding is robust to different sample periods.
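For concreteness, the sketch below shows one way such an RMSE could be computed for a single maturity. It assumes that one basis point corresponds to 0.0001 in log-price units, which is an interpretation rather than a definition given in the text. The observed and fitted series are hypothetical, and the function name `rmse_basis_points` is illustrative.

```python
import math

def rmse_basis_points(observed, fitted):
    """Root-mean-squared fitting error of log prices, in basis points (assuming 1 bp = 0.0001)."""
    if len(observed) != len(fitted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mse = sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed)
    return 1e4 * math.sqrt(mse)

# Hypothetical observed and model-implied log futures prices for one maturity.
observed = [3.0000, 3.0100, 3.0200, 3.0300]
fitted = [3.0020, 3.0080, 3.0220, 3.0280]

rmse = rmse_basis_points(observed, fitted)
print(f"RMSE: {rmse:.1f} bp")
```

Averaging this quantity over the 12 maturities would give a summary figure comparable in form to the averages quoted above.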

Table 5 Fits of log oil prices

7.2 Conditional volatility of futures prices

Using Eq. (9), the model-implied one-month ahead conditional variance of the n-maturity futures is given by

$$\begin{aligned} var_{t}(f_{t+1}^{n})=B_{n}^{^{\prime }}var_{t}(PC_{t+1})B_{n}+\sigma _{e}^{2}, \end{aligned}$$
(20)

where \(\sigma _{e}^{2}\) is the variance of the pricing errors. We use an EGARCH(1,1) model as a benchmark to measure the "true" conditional volatility of log futures prices. The model is estimated by assuming that the conditional mean of changes in monthly log futures prices is generated by an AR(1) process. Table 6 presents the in-sample RMSEs of the term structure of conditional volatilities for the full sample and the two subsamples. On average, the RMSE is about 10 basis points for the full sample, and about 11 and 7 basis points, respectively, for the two subsamples. The fits are better for futures with intermediate maturities (5- to 8-month) in all three samples.
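Eq. (20) is a quadratic form in the loadings plus the pricing-error variance, which the following sketch evaluates for one maturity. All numerical inputs are hypothetical, and the conditional covariance matrix of the principal components is taken to be diagonal purely for illustration; the function name `conditional_variance` is likewise an assumption.

```python
def conditional_variance(B_n, var_pc, sigma_e2):
    """Evaluate B_n' var_t(PC_{t+1}) B_n + sigma_e^2 for one maturity, as in Eq. (20)."""
    k = len(B_n)
    quad = sum(B_n[i] * var_pc[i][j] * B_n[j] for i in range(k) for j in range(k))
    return quad + sigma_e2

# Hypothetical loadings on (level, slope, curvature) for maturity n.
B_n = [0.30, -0.10, 0.05]

# Hypothetical conditional covariance matrix of the three principal components
# (diagonal here for simplicity).
var_pc = [
    [0.04, 0.0, 0.0],
    [0.0, 0.01, 0.0],
    [0.0, 0.0, 0.0025],
]

sigma_e2 = 1e-6  # hypothetical pricing-error variance

var_f = conditional_variance(B_n, var_pc, sigma_e2)
print(f"conditional variance: {var_f:.8f}, volatility: {var_f ** 0.5:.4f}")
```

With these inputs the level factor accounts for almost all of the quadratic form, since it carries both the largest loading and the largest variance.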

We present the model-implied one-month conditional volatilities together with the EGARCH(1, 1) volatilities for all three samples in Fig. 6. The estimates from the term structure model with long- and short-run volatility components comove closely with the EGARCH volatilities for all maturities and in all three samples. The proposed model fits the high volatility periods of the early 1990s, early 2000s, and 2008-2009 well in the sample. To further assess the quality of the estimated conditional volatilities, we report the unconditional correlation between model-implied volatilities and EGARCH(1, 1) volatilities in Table 7. We find that the average unconditional correlation is as high as \(97\%\) for the full sample. For futures with intermediate maturities (4- to 8-month), the unconditional correlation is as high as \(99\%\). In the two subsamples, the average unconditional correlation is \(93\%\). The unconditional correlations are slightly higher for futures with longer maturities in the two subsamples. Overall, the term structure model with long- and short-run GARCH volatility components appears to trace well the movements in the conditional second moment of the log futures prices. Our model inherits the ability of GARCH models to capture the time-varying volatilities of futures prices.
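The unconditional correlation reported in Table 7 is read here as the sample Pearson correlation between the two volatility series for each maturity; that reading is an assumption. A minimal sketch with hypothetical series follows, where the function name `pearson_correlation` is illustrative.

```python
import math

def pearson_correlation(x, y):
    """Sample Pearson correlation between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least two")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical monthly volatilities for one maturity.
model_vol = [0.08, 0.10, 0.15, 0.09, 0.07]
egarch_vol = [0.09, 0.11, 0.14, 0.10, 0.08]

corr = pearson_correlation(model_vol, egarch_vol)
print(f"correlation: {corr:.2f}")
```

A high correlation indicates that the two series move together; it does not by itself show that their levels agree, which is why the RMSEs in Table 6 are reported alongside it.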

Table 6 Fits of conditional volatilities
Fig. 6

Model-implied conditional volatility. Notes to Figure: This figure plots the model-implied one-month conditional volatility of log futures prices and the EGARCH(1,1) volatility for 1-month to 12-month maturities using the full sample and two subsamples. For each maturity, the solid line (blue) represents the EGARCH(1,1) estimated volatility of changes in log futures prices. The EGARCH(1,1) is estimated assuming that the conditional mean of changes in monthly log futures prices is generated by an AR(1) process. The dotted line (red) represents the implied volatility from the term structure model with long-run and short-run volatility components. The full sample period is from 1990:01 to 2016:07. The first subsample is from 1990:01 to 2005:05. The second subsample is from 2005:06 to 2016:07 (color figure online)

Table 7 Unconditional correlation between model-implied volatility and EGARCH volatility

8 Conclusion

In this paper, we study the risk-return relation in the crude oil futures market through a discrete-time term structure model with long-run and short-run GARCH-type volatility components. The model combines the tractability of affine term structure models with the ability of GARCH models to deliver an accurate measure of futures volatility. Moreover, this model allows us to differentiate the contributions of the long-run and short-run volatility components to the risk premiums in the term structure of futures prices.

Using WTI oil futures prices from 1990 to 2016, we find a significantly positive relationship between risk and return through May 2005, but a significantly negative relationship afterward. With the financialization of commodity markets after 2005, speculators who treat commodities as alternative investments are willing to accept a lower or even negative risk premium. Notably, it is the short-run component of volatility that matters more for return predictability than the long-run component, especially in the two subsamples. The risk-return relation may be misrepresented by models with a single volatility factor.