Understanding the Tracking Errors of Commodity Leveraged ETFs

Guo, Kevin; Leung, Tim

doi:10.1007/978-1-4939-2733-3_2

Part of the book series: Fields Institute Communications ((FIC,volume 74))

Abstract

Commodity exchange-traded funds (ETFs) are a significant part of the rapidly growing ETF market. They have become popular in recent years as they provide investors access to a great variety of commodities, ranging from precious metals to building materials, and from oil and gas to agricultural products. In this article, we analyze the tracking performance of commodity leveraged ETFs and discuss the associated trading strategies. It is known that leveraged ETF returns typically deviate from their tracking target over longer holding horizons due to the so-called volatility decay. This motivates us to construct a benchmark process that accounts for the volatility decay, and use it to examine the tracking performance of commodity leveraged ETFs. From empirical data, we find that many commodity leveraged ETFs underperform significantly against the benchmark, and we quantify such a discrepancy via the novel idea of realized effective fee. Finally, we consider a number of trading strategies and examine their performance by backtesting with historical price data.

1 Introduction

The advent of commodity exchange-traded funds (ETFs) has provided both institutional and retail investors with new ways to gain exposure to a wide array of commodities, including precious metals, agricultural products, and oil and gas. All commodity ETFs are traded on exchanges like stocks, and many have very high liquidity. For example, the SPDR Gold Trust ETF (GLD), which tracks the daily London gold spot price, is the most traded commodity ETF with an average trading volume of 8 million shares and market capitalization of US $31 billion in 2013.^{Footnote 1}

Within the commodity ETF market, some funds are designed to track a constant multiple of the daily returns of a reference index or asset. These are called leveraged ETFs (LETFs). An LETF maintains a constant leverage ratio by holding a variable portfolio of assets and/or derivatives, such as futures and swaps, based on the reference index. For example, the Dow Jones U.S. Oil & Gas Index (DJUSEN) or the Dow Jones U.S. Basic Materials Index (DJUSBM) and their associated ETFs track the stocks of a basket of commodities producers, as opposed to the physical commodity prices. On the other hand, most LETFs are based on total return swaps and commodity futures. The most common leverage ratios are ±2 and ±3, and LETFs typically charge an expense fee. Major issuers include ProShares, iShares, VelocityShares and PowerShares (see Table 1). For example, the ProShares Ultra Long Gold (UGL) seeks to return 2x the daily return of the London gold spot price minus a small expense fee. One can also take a bearish position by buying shares of an LETF with a negative leverage ratio. The ProShares Ultra Short Gold (GLL) is an inverse LETF that tracks −2x the daily return of the London gold fixing price. LETFs are highly accessible and liquid, which makes them attractive to traders who wish to gain leveraged exposure to a commodity without borrowing money or using derivatives.

Table 1 A summary of the 23 LETFs studied in this paper, arranged by commodity type and then leverage

For a long LETF, with a leverage ratio β > 0, the fund must add to a winning position in a bull market to maintain a constant leverage ratio. On the other hand, during a bear market, the fund must sell its losing positions to maintain the same leverage ratio. Similar arguments can be made for short (or inverse) LETFs (β < 0). As a consequence, LETFs can potentially outperform β times their reference during periods of market trending. However, should the reference exhibit high volatility but no significant movement in price over a period of time, the constant daily re-balancing would cause the fund to decline in value. Therefore, LETFs can be viewed as long momentum but short volatility, and the value erosion due to realized variance of the reference is called volatility decay (see [2–4]). This raises the important question of how well LETFs perform over a long horizon.

Since their introduction to the market, LETFs have drawn a number of criticisms from both practitioners and regulators.^{Footnote 2} Some are concerned that the returns of LETFs exhibit some discrepancies from the goals stated in their prospectuses. In fact, some issuers provide warnings that LETFs are unsuitable for long-term buy-and-hold investors.

Many existing studies focus on equity-based ETFs and their leveraged counterparts. For example, Avellaneda and Zhang [2] study the price behavior and discuss the volatility decay of equity LETFs in different sectors. They find minimal 1-day tracking errors among the most liquid equity ETFs. They explain that an equity LETF can replicate the leveraged returns of its reference through a dynamic portfolio consisting of the component equities.

In contrast, commodities are unique because the physical assets cannot be stored easily. As such, ETF issuers are required to replicate through either warehousing,^{Footnote 3} which is very costly, and thus uncommon except for precious metals such as silver and gold, or trading futures with multiple counterparties (see [5]). Since the reference indices may represent the spot prices of physical commodities, futures-based commodity ETFs may fail to track their reference indices perfectly and their tracking performance is subject to the fluctuation and term structure of futures prices. On top of that, most commodity LETFs use over-the-counter (OTC) total return swaps with multiple counterparties to generate the required leverage ratios. The lower liquidity of OTC contracts and counterparty risk can contribute to additional tracking errors. As we show in this paper, tracking errors can seriously affect the long-term fund performance of LETFs.

In a related work, Murphy and Wright [12] perform a t-test based on 1-day returns to determine if any commodity LETF has a non-zero tracking error. They conclude that all LETFs have a very good daily tracking performance. However, they do not conduct the analysis over a longer horizon, or account for the volatility decay. There is also no discussion of trading strategies there. On the other hand, Guedj et al. [5] discuss the difficulties faced by an ETF provider in replicating a commodity index using futures. In particular, they point out that the term structure of futures may lead to large deviations between the ETF price and the spot price of a commodity.

In this paper, we analyze the tracking performance of commodity leveraged ETFs. Through a series of regression analyses, we illustrate how the returns of commodity LETFs deviate from the reference returns multiplied by the leverage ratio over different holding periods. In particular, the average tracking error tends to turn more negative over a longer horizon and for higher leveraged ETFs. Bearing in mind that the realized variance of the reference can erode the LETF value, we examine the over/under-performance of LETFs with respect to a benchmark that incorporates the effect of volatility decay. From empirical data, we find that many commodity leveraged ETFs in our study underperform significantly against the benchmark, and we quantify such a discrepancy by introducing the realized effective fee. Finally, we consider a static trading strategy that involves shorting two LETFs with leverage ratios of different signs, and study its performance and dependence on the realized variance of the reference. We find that the resulting portfolio is always long realized variance both theoretically and empirically, but is also exposed to the tracking errors associated with the two LETFs. We also backtest the strategy by examining its empirical returns over rolling periods.

The rest of the paper is organized as follows. In Sect. 2, we analyze the returns of commodity LETFs over different holding periods and illustrate horizon dependence of tracking errors. In Sect. 3, we use a benchmark process that incorporates the realized variance of the reference to study the over/under-performance of each LETF. In Sect. 4, we discuss a static trading strategy and backtest using historical data. Section 5 concludes the paper and points out a number of directions for future research.

2 Analysis of Tracking Error

We first compare the returns of LETFs and their reference indices. For every ETF, we obtain its closing prices and reference index values from Bloomberg for the period Dec 2008–May 2013. We then calculate the n-day returns for n ∈ {1, 2, …, 30} using disjoint successive periods (e.g. for 30-day returns, the return over days 1–30, then the return over days 31–60, and so on). Let L_t be the price of an LETF and S_t be the reference index value at time t. For a given leverage ratio β, we compare the log-returns of the LETF to β times the log-returns of the corresponding reference index. This leads us to define the n-day tracking error at time t by

$$\displaystyle{ Y _{t}^{(n)} =\ln \frac{L_{t+n\varDelta t}} {L_{t}} -\beta \ln \frac{S_{t+n\varDelta t}} {S_{t}}, }$$

(1)

where Δ t represents one trading day. We explore the empirical distribution of the n-day tracking error, and then analyze the effect of holding horizon on the magnitude of tracking errors. We remark there are alternative ways to define tracking errors for ETFs. For example, one can consider the difference in relative returns as opposed to log-returns, or the root mean square of the daily differences (see [10]).
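As a rough illustration of definition (1), the following Python sketch computes n-day tracking errors over disjoint successive periods. The function name `tracking_errors` and the synthetic price arrays are assumptions made for this example only; they are not the data or code used in the study.

```python
import numpy as np

def tracking_errors(letf_prices, ref_prices, beta, n):
    """n-day tracking errors as in (1), over disjoint successive n-day periods."""
    L = np.asarray(letf_prices, dtype=float)
    S = np.asarray(ref_prices, dtype=float)
    if L.shape != S.shape:
        raise ValueError("price series must have the same length")
    # Keep every n-th close so that consecutive periods do not overlap.
    letf_log_ret = np.diff(np.log(L[::n]))
    ref_log_ret = np.diff(np.log(S[::n]))
    return letf_log_ret - beta * ref_log_ret

# Synthetic illustration (not market data): an idealized 2x fund without fees
# that delivers exactly twice the daily simple return of the reference.
ref = np.array([100.0, 101.0, 99.0, 102.0, 103.0])
daily = ref[1:] / ref[:-1] - 1.0
letf = 50.0 * np.concatenate(([1.0], np.cumprod(1.0 + 2.0 * daily)))

y1 = tracking_errors(letf, ref, beta=2.0, n=1)  # four 1-day errors
y2 = tracking_errors(letf, ref, beta=2.0, n=2)  # two disjoint 2-day errors
```

Even this idealized fund shows a small negative log-return tracking error on every day with a non-zero reference return, and the errors add up across days because log-returns are additive.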

2.1 Regression of Empirical Returns

We conduct a regression between log-returns of the LETF and its reference index based on the linear model:

$$\displaystyle{ \ln \frac{L_{t}} {L_{0}} =\hat{\beta }\ln \frac{S_{t}} {S_{0}} +\hat{ c}+\epsilon, }$$

(2)

where ε ∼ N(0, σ²) is independent of the reference index value S_t, $\forall t \geq 0$. In other words, we run an ordinary least squares single-variable regression between the log-returns for every fixed horizon of n days. Then, we increase the holding period from 1 to 30 days, and observe how the regression coefficients vary.

We display the regression results in Figs. 1, 2, 3, and 4 for log-returns over periods of 1, 5, 10, and 20 days. To avoid dependence among returns, we use disjoint time intervals to calculate returns. For example, we use $\frac{S_{20}} {S_{0}}, \frac{S_{40}} {S_{20}} \ldots$ and $\frac{L_{20}} {L_{0}}, \frac{L_{40}} {L_{20}} \ldots$ for 20-day log-returns as the inputs for the regression.
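The regression (2) on disjoint n-day log-returns can be sketched in Python as follows. This is only a minimal illustration on simulated prices; the helper name `regress_letf_on_reference`, the random seed, and the 2 % daily volatility are assumptions for the example and do not reproduce Figs. 1, 2, 3, and 4.

```python
import numpy as np

def regress_letf_on_reference(letf_prices, ref_prices, n):
    """OLS fit of n-day LETF log-returns on n-day reference log-returns.

    Returns (beta_hat, c_hat, r_squared), using disjoint n-day periods.
    """
    L = np.asarray(letf_prices, dtype=float)[::n]
    S = np.asarray(ref_prices, dtype=float)[::n]
    y = np.diff(np.log(L))
    x = np.diff(np.log(S))
    X = np.column_stack([x, np.ones_like(x)])   # slope and intercept columns
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef[0], coef[1], r2

# Simulated data (an assumption, not Bloomberg data): an ideal 2x fund.
rng = np.random.default_rng(0)
daily = rng.normal(0.0, 0.02, size=600)
ref = 100.0 * np.concatenate(([1.0], np.cumprod(1.0 + daily)))
letf = 50.0 * np.concatenate(([1.0], np.cumprod(1.0 + 2.0 * daily)))

results = {n: regress_letf_on_reference(letf, ref, n) for n in (1, 5, 20)}
```

Comparing the entries of `results` across horizons mirrors the exercise in the text: one tracks how the estimated slope, intercept, and R² change as n grows.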

In Fig. 1, the regression coefficient $\hat{\beta }$ for DIG (β = 2, oil & gas) increases from 2 to 2.1 as the holding period lengthens from 1 to 20 days. Although the coefficient of determination R² is close to 99 % for up to 20 days, it is highest for 1-day returns. In Fig. 2 for DUG ($\beta = -2$, oil & gas), one again observes $\hat{\beta }$ increasing and R² decreasing: as n varies from 1 to 20, $\hat{\beta }$ increases from −2 to −1.66. This implies that DIG (β = 2, oil & gas) effectively gains leverage as the holding time increases, while DUG ($\beta = -2$, oil & gas) loses leverage compared to the advertised fund β.

On the other hand, UGL (β = 2, gold) and GLL ($\beta = -2$, gold) exhibit very different return behaviors. In Fig. 3, the R² for UGL (β = 2, gold) is, surprisingly, lowest for the shortest holding period of 1 day, whereas it increases to 95 % over a holding period of 20 days. In Fig. 4 for GLL ($\beta = -2$, gold), the R² increases from 35 % to 96 % as the holding period lengthens from 1 to 20 days. Furthermore, the estimators $\hat{\beta }$ for UGL (β = 2, gold) and GLL ($\beta = -2$, gold) both slowly approach their advertised β = ±2. The variation of $\hat{\beta }$ for DIG (β = 2, oil & gas) and UGL (β = 2, gold) over different holding periods is summarized in Fig. 5.

We observe that LETFs that track an illiquid reference, such as the gold bullion index GOLDLNPM, tend to have larger tracking errors than those tracking a liquid index, such as the oil & gas index DJUSEN. The oil & gas commodity LETFs involve exchange-traded futures, which are a liquid proxy for the spot price. The gold and silver bullion LETFs consist of OTC total return swaps. The difficulty and higher cost of replication using swaps, as well as the infrequent (typically daily) update of the swaps’ mark-to-market values, can weaken the fund’s tracking ability. For example, the 1-day regressions of UGL and GLL (β = ±2, gold) yield R² values less than 40 %, while DIG and DUG (β = ±2, oil & gas) have 1-day R² values of over 90 %. On the other hand, full physical replication yields the greatest R², with examples of the non-leveraged gold and silver ETFs, GLD and SLV, respectively. Hence, the replication strategy can significantly affect a fund’s tracking errors. A more precise understanding of the effectiveness of swaps, futures, and other replication strategies requires the full holdings history from the ETF provider, which is not publicly available at all times.^{Footnote 4}

In addition, the LETFs we studied have an increasingly negative constant coefficient $\hat{c}$ as the holding time increases. For example, over a holding period of 20 days, DUG ($\beta = -2$, oil & gas) has a 3 % decay on returns compared to β times its reference index. This phenomenon is to be expected, since the LETF needs to buy high and sell low when rebalancing, while an investor in the reference would simply hold the securities. Therefore, the longer the LETF is held, the more likely the fund will underperform against β times the reference index. As we will see in Sect. 3, the constant coefficient $\hat{c}$ depends on two factors: the expense fee charged by the issuer and the realized variance of the reference index.

Hence, with this simple linear model for LETF prices, we have observed that although LETFs safely replicate β times the reference over short holding periods, they begin to exhibit negative tracking error and deviations in their leverage ratios β as the holding time increases. Furthermore, we see that LETFs which attempt to track illiquid spot prices perform much more poorly than expected. We conclude that more factors must be considered when modeling LETF returns.

2.2 Distribution of Tracking Errors

As defined in (1), the tracking error is the difference between the LETF’s log-return and the corresponding multiple of its reference index’s log-return. In this section, we examine the distribution of the tracking error. This provides a picture of the LETF’s efficiency in its stated goal of replicating the leveraged return of a reference index.

For the 23 LETFs in Table 2, we compute the mean μ and standard deviation σ for the tracking errors using available price data during the period Dec 2008 to May 2013. For all these funds, the mean 1-day tracking error has μ ≈ 0, ranging from 0 % to −0.27 %. Therefore, all these LETFs on average successfully replicate the stated multiple β of the daily reference return, with a slight negative bias. In fact, many LETFs even continued to replicate returns over periods as long as 10 days. However, as the holding time increases, the average tracking error grows more negative, so that the LETF in fact underperforms its intended goal over longer holding periods (see Fig. 6).

Table 2 Mean μ and standard deviation σ of the 1-day tracking error by commodity

Interestingly, the tracking errors for the silver and gold LETFs (AGQ, ZSL (β = ±2, silver); UGL, GLL (β = ±2, gold)) in Table 2 have σ several orders of magnitude higher than μ. For example, AGQ (β = 2, silver) has a tracking error σ of 5 % compared to a μ of 0.01 %. In other words, although these four LETFs might track their references well on average, they may exhibit large positive and negative deviations over 1-day holding periods. These observations are consistent with the regressions in Figs. 3 and 4, where UGL and GLL (β = ±2, gold) show significant 1-day tracking errors. On the other hand, the non-leveraged gold and silver bullion ETFs, GLD and SLV, have almost no tracking error (σ ≈ 0), because they hold the underlying bullion according to their prospectuses. Since many investors use these ETFs to gain leveraged exposure to commodities, they should be aware of the large variance of the associated tracking errors.

In Fig. 6, we show the histogram for the tracking error for each ETF along with a quantile-quantile plot to illustrate the distribution. For DIG and DUG (β = ±2, oil & gas), the quantile-quantile plot shows that the tracking error distribution is not quite normal, and has a large negative tail, so that the commodity LETF tracking error is negatively biased even for the shortest possible holding period of 1 day. On the other hand, for UGL, GLL (β = ±2, gold) the distribution appears to be normal with R ² close to 98 %. However, as noted in Table 2, the tracking errors for UGL and GLL (β = ±2, gold) also have a very large variance.

Next, we examine the horizon effect of tracking errors. Figure 7 indicates that higher leveraged ETFs tend to have more negative average tracking errors, which appear to be decreasing linearly over longer holding periods. In addition, negative leveraged LETFs have a more negative average tracking error than their positive counterparts. For example, in Fig. 7, GLL ($\beta = -2$, gold) has a lower slope than UGL (β = 2, gold) even though they have the same absolute value of leverage ratio |β|. Furthermore, with few exceptions, the average tracking error is most negative when $\beta = -3$ followed by $\beta = 3,-2,2,-1,1$. Thus, holding short LETFs carries a heavier horizon penalty than holding long LETFs.

Our analysis of the tracking error distribution reveals several characteristics of the tracking error defined in (1). Over a very short holding period, most LETFs perform close to their objectives stated in their prospectuses. Nevertheless, the realized tracking error varies over time, and can be positive or negative. For gold and silver LETFs, the tracking error is more volatile. Moreover, the magnitude of the mean tracking error depends heavily on the β of the LETF, with bear LETFs suffering a higher penalty than bull LETFs.

3 Incorporating Realized Variance into Tracking Error Measurement

As is well known in the industry (see [2, 3]), the price dynamics of an LETF depends on the realized variance of the reference index. This leads us to incorporate the realized variance in measuring the performance of an LETF. We run a regression analysis based on empirical LETF and reference prices that incorporates the realized variance as an independent variable. We then derive a realized effective fee associated with each LETF and analyze the realized price behavior relative to a theoretical benchmark to better quantify the over/under-performance.

3.1 Model for the LETF Price

Let S _t be the price of the reference index, and L _t be the price of the LETF at time t. Also denote f as the expense rate, r as the interest rate and β as the leverage ratio. Assume the reference asset follows the SDE

$$\displaystyle{ \frac{dS_{t}} {S_{t}} =\mu _{t}dt +\sigma _{t}dW_{t},\quad t \geq 0, }$$

(3)

with stochastic drift $(\mu _{t})_{t\geq 0}$ and volatility $(\sigma _{t})_{t\geq 0}$. For our analysis herein, we assume a general diffusion framework, but do not need to specify a parametric model. Many well-known models, including the CEV, Heston, and exponential Ornstein-Uhlenbeck models, fit within the above framework.

A long β-LETF L can be constructed through a dynamic portfolio. Specifically, the portfolio at time t consists of the cash amount βL_t invested in the reference index S_t, while the amount (β − 1)L_t is borrowed at the positive risk-free rate r. As a result, the LETF satisfies the SDE

$$\displaystyle{ dL_{t} = L_{t}\beta \frac{dS_{t}} {S_{t}} - L_{t}((\beta -1)r + f)dt. }$$

(4)

Solving the SDE, the log-return of the LETF is given by

$$\displaystyle{ \ln \frac{L_{t}} {L_{0}} =\beta \ln \frac{S_{t}} {S_{0}} + \frac{\beta -\beta ^{2}} {2} V _{t} + ((1-\beta )r - f)t, }$$

(5)

where

$$\displaystyle{ V _{t} =\int _{ 0}^{t}\sigma _{ s}^{2}ds }$$

(6)

is the realized variance of S accumulated up to time t. Therefore, under this general diffusion model, the log-return of the LETF is linear in the log-return of the reference index, with coefficient β, and also linear in the realized variance, with coefficient $\frac{\beta -\beta ^{2}} {2}$. The latter coefficient is negative if $\beta \notin [0,1]$, which is true for every LETF traded on the market. Also, the expense fee f reduces the return of the LETF.
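For completeness, the step from (4) to (5) can be sketched with Itô's formula, assuming for this sketch that r and f are constant over [0, t]. Applying Itô's formula to ln L_t and to ln S_t under (3) and (4) gives

$$\displaystyle\begin{array}{rcl} d\ln L_{t}& =& \frac{dL_{t}} {L_{t}} -\frac{\beta ^{2}} {2} \sigma _{t}^{2}dt =\beta \frac{dS_{t}} {S_{t}} - ((\beta -1)r + f)dt -\frac{\beta ^{2}} {2} \sigma _{t}^{2}dt, \\ \frac{dS_{t}} {S_{t}} & =& d\ln S_{t} + \frac{1} {2}\sigma _{t}^{2}dt, \\ d\ln L_{t}& =& \beta \,d\ln S_{t} + \frac{\beta -\beta ^{2}} {2} \sigma _{t}^{2}dt + ((1-\beta )r - f)dt.\end{array}$$

Integrating the last line over [0, t] and using the definition of V_t in (6) recovers (5).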

Our regression analysis will focus on testing the functional form (5). We observe from (5) that the functional form of L_t in terms of S_t and V_t holds for any parametric model within the diffusion framework in (3). Considering the daily LETF returns, we set $\varDelta t = \frac{1} {252}$ as one trading day. Let $R_{t}^{S}$ be the daily return of the reference index at time t, and let $V _{t}^{(n)}$ denote the realized variance over n days starting at time t. At any time t, the n-day log-return of an LETF follows

$$\displaystyle{ \ln \frac{L_{t+n\varDelta t}} {L_{t}} =\beta \ln \frac{S_{t+n\varDelta t}} {S_{t}} + \frac{\beta -\beta ^{2}} {2} V _{t}^{(n)} + ((1-\beta )r - f)n\varDelta t, }$$

(7)

$$\displaystyle{ V _{t}^{(n)} =\sum _{ i=0}^{n-1}(R_{ t+i\varDelta t}^{S} -\bar{ R_{ t}}^{S})^{2},\quad \bar{R_{ t}}^{S} = \frac{1} {n}\sum _{i=0}^{n-1}R_{ t+i\varDelta t}^{S}. }$$

(8)

This serves as a benchmark process for our subsequent analysis.
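A minimal Python sketch of the benchmark in (7) and (8) is given below. The function names are hypothetical, and the sketch assumes that the daily return R^S in (8) is a simple (arithmetic) return, which the text does not state explicitly.

```python
import numpy as np

def realized_variance(ref_prices):
    """Sum of squared deviations of daily returns from their mean, as in (8).

    Assumption: daily returns are simple returns S[i+1] / S[i] - 1.
    """
    S = np.asarray(ref_prices, dtype=float)
    daily = S[1:] / S[:-1] - 1.0
    return float(np.sum((daily - daily.mean()) ** 2))

def benchmark_log_return(ref_prices, beta, r, f, dt=1.0 / 252.0):
    """Benchmark n-day LETF log-return from (7) for a path of n + 1 closes."""
    S = np.asarray(ref_prices, dtype=float)
    n = len(S) - 1
    V = realized_variance(S)
    return float(beta * np.log(S[-1] / S[0])
                 + 0.5 * (beta - beta ** 2) * V
                 + ((1.0 - beta) * r - f) * n * dt)

# Synthetic choppy path that ends where it started (not market data).
path = [100.0, 110.0, 100.0, 110.0, 100.0]
b_plus2 = benchmark_log_return(path, beta=2.0, r=0.0, f=0.0)
b_minus2 = benchmark_log_return(path, beta=-2.0, r=0.0, f=0.0)
b_plus1 = benchmark_log_return(path, beta=1.0, r=0.0, f=0.0)
```

With zero rates and fees, the round-trip path leaves the unleveraged benchmark flat, while the β = 2 and β = −2 benchmarks lose V and 3V respectively, which is the volatility decay discussed in the text.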

3.2 Regression of Empirical Returns

The log-return equation (7) suggests a regression with two predictors: the log-returns and the realized variance of the reference over n-days. This results in the linear model

$$\displaystyle{ \ln \frac{L_{t}} {L_{0}} =\hat{\beta }\ln \frac{S_{t}} {S_{0}} +\hat{\theta } V _{t} +\hat{ c}+\epsilon, }$$

(9)

where $\hat{c}$ is a constant intercept to be determined, and $\varepsilon \sim N(0,\sigma ^{2})$ is independent of $(S_{t})_{t\geq 0}$.

In Table 3, we summarize the estimated $\hat{\theta }$ from our regression with holding periods of 30 days. Again, we use price data from disjoint periods to calculate returns. The realized variance is calculated from the daily returns within each 30-day period. The choice of 30-day periods gives us sufficient points to compute the realized variance while providing enough disjoint periods during the period Dec 2008–May 2013 to perform a regression. A longer price history would certainly have helped in balancing this tradeoff, but all these commodity LETFs were introduced only in the past 5 years.

Table 3 $\hat{\theta }$ vs. θ, estimated from 30-day multi-variable regression of returns, with a partial correlation table

Our empirical analysis confirms several aspects of our theoretical model in (5) and provides explanations in cases where there is discrepancy. The theoretical value of θ according to (5) is given by $\frac{\beta -\beta ^{2}} {2}$. Table 3 shows that the estimator $\hat{\theta }$ is typically in the neighborhood of θ, its theoretical value. For example, SCO ($\beta = -2$, crude oil) has $\hat{\theta }= -2.93$ versus a theoretical θ of −3. In addition, the non-leveraged ETFs all have $\hat{\theta }$ close to 0, suggesting that realized variance does not play an important role in their price processes, as predicted. However, some LETFs have $\hat{\theta }$ diverging significantly from θ. For example, the $\hat{\theta }$ for UGL (β = 2, gold) differs from its theoretical value by a factor of 114 % even with a regression R² of 99 %.

We attribute the deviation of $\hat{\theta }$ from θ in our regression to the collinearity effect of the two predictors ($\ln \frac{S_{t}} {S_{0}}$ and V _t). Of course $\ln \frac{S_{t}} {S_{0}}$ and V _t cannot be independent observations, since V _t depends on the price path process of S _t, the reference index. In general, the reference returns and the realized variance are negatively correlated. When the realized variance is high, it is likely the reference has suddenly dropped in value. When the realized variance is low, it usually implies a period of steady positive growth for the reference. Thus, the multi-collinearity effect is responsible for shifting predictive power among the different predictor variables. In order to measure the magnitude of the collinearity effect and the contribution of each correlated predictor variable, we compute the coefficients of partial determination for our regression model.

The factor $r_{y\vert x}^{2}$ measures the marginal predictive power of adding the realized variance into the model. As $r_{y\vert x}^{2}$ increases, $\hat{\theta }$ becomes closer to θ, suggesting a larger dependence of LETF returns on realized variance during holding periods of high volatility. For example, the three LETFs DIG (β = 2, oil & gas), SCO ($\beta = -2$, crude oil), and UYM (β = 2, building materials) all have $r_{y\vert x}^{2}$ over 90 %. Their estimated $\hat{\theta }$ is correspondingly very close to the theoretical θ, never differing by more than 10 %. However, for non-leveraged ETFs, the realized variance has minimal added predictive power in the model. For those ETFs, we observe $\hat{\theta }\approx 0$. For example, SLV (β = 1, silver), GLD (β = 1, gold), and DBO (β = 1, crude oil) all have $r_{y\vert x}^{2} \approx 0$, and they subsequently have $\hat{\theta }\approx 0$. In addition, $r_{x\vert y}^{2}$, which is the marginal predictive power of adding the log-returns of the reference into our regression model, is always very high, indicating that the log-returns of the reference affect the LETF prices the most, but that the realized variance is still important for predictive power, especially when the leverage and the holding period are high.
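One common way to compute a coefficient of partial determination is to compare the residual sum of squares of the model with and without the added predictor. The Python sketch below follows that textbook definition; it is an assumption that this matches the exact computation behind Table 3, and the simulated inputs and function names are illustrative only.

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares of an OLS fit of y on X plus an intercept."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def partial_r2(added, given, y):
    """Share of the variation left unexplained by `given` that `added` removes."""
    added = np.asarray(added, dtype=float)
    given = np.asarray(given, dtype=float)
    y = np.asarray(y, dtype=float)
    reduced = sse(given.reshape(-1, 1), y)          # model with `given` only
    full = sse(np.column_stack([given, added]), y)  # model with both predictors
    return (reduced - full) / reduced

# Simulated 30-day observations (assumptions, not the data behind Table 3).
rng = np.random.default_rng(1)
ref_log_ret = rng.normal(0.0, 0.05, size=60)
real_var = rng.uniform(0.0, 0.02, size=60)
noise = rng.normal(0.0, 1e-4, size=60)
letf_2x = 2.0 * ref_log_ret - 1.0 * real_var + noise   # theta = (2 - 4) / 2
etf_1x = 1.0 * ref_log_ret + noise                     # theta = 0

r2_var_2x = partial_r2(real_var, ref_log_ret, letf_2x)
r2_var_1x = partial_r2(real_var, ref_log_ret, etf_1x)
r2_ret_2x = partial_r2(ref_log_ret, real_var, letf_2x)
```

In this simulation the realized variance adds almost all of the remaining explanatory power for the 2x series and almost none for the unleveraged one, which is the qualitative pattern described above.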

3.3 Realized Effective Fee

In Fig. 8, we show three empirical price paths: the LETF log-returns, the benchmark process defined in (5), and β times the reference index log-returns. As we can see, the value erosion due to realized variance (volatility decay) starts to play a significant role in determining LETF prices as the holding time increases. The path associated with β times the reference log-returns dominates the LETF log-returns after about 1 month of holding. After about 1 year, the benchmark which incorporates volatility decay more closely models the empirical LETF log-returns. For example, after 6 months of holding, SCO ($\beta = -2$, crude oil) diverges from β times the reference, illustrating the effects of volatility decay.

However, there are also some strong deviations from the predictions given by the benchmark, which compound as the holding time increases. This causes the LETF to underperform even after the volatility decay is accounted for. For example, DUG’s ($\beta = -2$, oil & gas) empirical returns begin to trail its benchmark significantly around 2009. Therefore, the volatility decay cannot explain all the LETF underperformance.

We are therefore motivated to quantify the over/under-performance of the LETFs after observing deviations from the benchmark in Fig. 8. We introduce the concept of realized effective fee (REF) as the effective deduction rate charged by the LETF provider over the frictionless dynamic portfolio from which the LETF is constructed in Sect. 3.1. For a holding interval [0, t], the corresponding REF is defined by

$$\displaystyle{ \widehat{f_{t}} = (1-\beta )r -\frac{\ln \frac{L_{t}} {L_{0}} -\beta \ln \frac{S_{t}} {S_{0}} -\frac{\beta -\beta ^{2}} {2} V _{t}} {t}. }$$

(10)

Since for each LETF, L _t, S _t, V _t, β, and r are all known, we can calculate the REF $\widehat{f_{t}}$ for any LETF over a given holding period [0, t] using historical prices. We remark that the REF, which is indexed by time t, depends on the selected holding horizon.
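The calculation in (10) can be sketched in Python as follows. The function name, the use of simple daily returns for the realized variance, and the synthetic inputs are assumptions for illustration; the example builds an LETF endpoint that follows (5) exactly, so the REF should recover the fee that was put in.

```python
import numpy as np

def realized_effective_fee(letf_prices, ref_prices, beta, r, dt=1.0 / 252.0):
    """Realized effective fee from (10) over the whole sample [0, t]."""
    L = np.asarray(letf_prices, dtype=float)
    S = np.asarray(ref_prices, dtype=float)
    t = (len(S) - 1) * dt
    daily = S[1:] / S[:-1] - 1.0                 # assumed simple daily returns
    V = np.sum((daily - daily.mean()) ** 2)      # realized variance, as in (8)
    excess = (np.log(L[-1] / L[0])
              - beta * np.log(S[-1] / S[0])
              - 0.5 * (beta - beta ** 2) * V)
    return float((1.0 - beta) * r - excess / t)

# Synthetic check (not market data): only the first and last LETF prices
# enter (10), so the intermediate LETF prices are set to placeholders.
ref = np.array([100.0, 102.0, 99.0, 101.0, 104.0, 103.0])
beta, r, fee, dt = -2.0, 0.01, 0.0095, 1.0 / 252.0
t = (len(ref) - 1) * dt
daily = ref[1:] / ref[:-1] - 1.0
V = np.sum((daily - daily.mean()) ** 2)
log_ret = (beta * np.log(ref[-1] / ref[0])
           + 0.5 * (beta - beta ** 2) * V
           + ((1.0 - beta) * r - fee) * t)
letf = np.full(len(ref), 50.0)
letf[-1] = 50.0 * np.exp(log_ret)

ref_fee = realized_effective_fee(letf, ref, beta=beta, r=r)
ref_fee_bps = 1e4 * ref_fee
```

An REF above the advertised fee on real data would then indicate underperformance beyond what volatility decay alone explains.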

In many cases, the REF is seen to be much larger than the fund’s advertised fee, indicating significant underperformance. Out of the 23 commodity LETFs, 2 have negative implied costs, so that the fund overperforms by the end of the 5-year period Dec 2008 to May 2013. If the REF exceeds the advertised fee, then the investor effectively pays an extra price for the opportunity to invest in the LETF. As a general trend, the bear LETFs tend to charge higher REFs than bull LETFs with the same magnitude of leverage |β|. For example, USLV (β = 2, silver) has an REF of 93 bps, while DSLV ($\beta = -2$, silver) has an REF of 504 bps over the period Dec 2008–May 2013. The two highest REFs correspond to DUG ($\beta = -2$, oil & gas) and SMN ($\beta = -2$, building materials), whose REFs are 1,134 bps and 1,625 bps respectively. Figure 8 illustrates that DUG ($\beta = -2$, oil & gas) drastically underperforms the benchmark, thereby realizing a high REF. Notice, however, that in both cases the bull counterparts of DUG and SMN, namely DIG (β = 2, oil & gas) and UYM (β = 2, building materials), display a negative REF, indicating overperformance during the same period. It is possible that as the reference trends upwards for a long period of time, the bear LETF will underperform, while the bull LETF will overperform (Table 4).

Table 4 Comparison of the official fee for the LETF charged on the fund prospectus and the REF calculated using 5 years of price data (Dec 2008–May 2013) for the LETF and reference (see (10))

4 A Static LETF Portfolio

Taking advantage of the volatility decay, a well-known trading strategy used by practitioners involves shorting a ±β pair of LETFs with the same reference, as discussed in [2, 7, 9, 11]. Since the LETFs have opposite daily returns on the same reference index, the portfolio has very little exposure to the reference as long as the holding period is sufficiently short. With this strategy, the volatility decay can help generate profit, which is the intuition of many practitioners. However, the portfolio is exposed to risk during periods of low volatility and strong trending, as well as to tracking errors. In this section, we describe an extension of this trading strategy by allowing the positive and negative leverage ratios to differ. We determine the portfolio weights to approximately eliminate the dependence on the reference. We show that the resulting portfolio is long volatility. For a number of LETF pairs, we find from empirical data that the strategy is profitable on average but carries enormous tail risk.

We now construct a weighted portfolio which is short the LETF with leverage ratio β₊ > 0 and short another LETF with leverage ratio β₋ < 0. We emphasize that both LETFs have the same reference, but that β₊ and |β₋| may differ. We hold a fraction ω ∈ (0, 1) of the portfolio in the β₊-LETF and 1 − ω of the portfolio in the β₋-LETF. At time T, the normalized return from this strategy is

$$\displaystyle{ \mathcal{R}_{T} = 1 -\omega \frac{L_{T}^{+}} {L_{0}^{+}} - (1-\omega )\frac{L_{T}^{-}} {L_{0}^{-}}. }$$

(11)

Applying (5), $\mathcal{R}_{T}$ admits the expression

$$\displaystyle\begin{array}{rcl} \mathcal{R}_{T}& =& 1 -\omega \left (\frac{S_{T}} {S_{0}} \right )^{\beta _{+}}\exp (\varGamma _{T}^{+}) - (1-\omega )\left (\frac{S_{T}} {S_{0}} \right )^{\beta _{-}}\exp (\varGamma _{T}^{-}),{}\end{array}$$

(12)

where

$$\displaystyle{ \varGamma _{T}^{\pm } = \frac{\beta _{\pm }-\beta _{\pm }^{2}} {2} V _{T} + ((1 -\beta _{\pm })r - f_{\pm })T, }$$

(13)

Here, β _± and f _± are the respective leverage ratios and fees of the two LETFs in the portfolio defined in (11). Over a short holding period such that $\frac{L_{T}} {L_{0}} \approx 1$, one can pick an appropriate weight ω ^∗ to approximately remove the dependence of $\mathcal{R}_{T}$ on S _T.

Proposition 1.

Select the portfolio weight $\omega ^{{\ast}} = \frac{-\beta _{-}} {\beta _{+}-\beta _{-}}$ . For $\frac{L_{T}} {L_{0}} \approx 1$ , the return from this strategy is given by

$$\displaystyle{ \mathcal{R}_{T} = \frac{-\beta _{-}\beta _{+}} {2} V _{T} - \frac{\beta _{-}} {\beta _{+} -\beta _{-}}(f_{+} - f_{-})T + (f_{-}- r)T. }$$

(14)

Proof.

For $\frac{L_{T}} {L_{0}} \approx 1$, we can replace each ratio $\frac{L_{T}} {L_{0}}$ in (11) with its first-order approximation $\ln \frac{L_{T}} {L_{0}} + 1$. Then, we set ω = ω ^∗ and apply (5) to conclude (14).
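
The steps of the proof can be written out explicitly. The following is a sketch, assuming that (5) is the representation $\frac{L_{T}} {L_{0}} = \left (\frac{S_{T}} {S_{0}} \right )^{\beta }\exp (\varGamma _{T})$ that underlies (12) and (13), applied to each LETF. Substituting the first-order approximation into (11) gives

$$\displaystyle{ \mathcal{R}_{T} \approx -\omega \ln \frac{L_{T}^{+}} {L_{0}^{+}} - (1-\omega )\ln \frac{L_{T}^{-}} {L_{0}^{-}} = -\left (\omega \beta _{+} + (1-\omega )\beta _{-}\right )\ln \frac{S_{T}} {S_{0}} -\omega \varGamma _{T}^{+} - (1-\omega )\varGamma _{T}^{-}. }$$

With $\omega = \omega ^{{\ast}} = \frac{-\beta _{-}} {\beta _{+}-\beta _{-}}$ and $1 -\omega ^{{\ast}} = \frac{\beta _{+}} {\beta _{+}-\beta _{-}}$, two identities hold:

$$\displaystyle{ \omega ^{{\ast}}\beta _{+} + (1 -\omega ^{{\ast}})\beta _{-} = 0,\qquad \omega ^{{\ast}}\beta _{+}^{2} + (1 -\omega ^{{\ast}})\beta _{-}^{2} = -\beta _{+}\beta _{-}. }$$

The first identity removes the $\ln \frac{S_{T}} {S_{0}}$ term. Inserting (13) and using both identities, the remaining terms become

$$\displaystyle{ -\omega ^{{\ast}}\varGamma _{T}^{+} - (1 -\omega ^{{\ast}})\varGamma _{T}^{-} = \frac{-\beta _{-}\beta _{+}} {2} V _{T} + \left (\omega ^{{\ast}}(f_{+} - f_{-}) + f_{-}- r\right )T, }$$

which is (14) once $\omega ^{{\ast}}$ is written out in the fee term.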

The return (14) corresponding to portfolio weight ω ^∗ reflects a linear dependence on the realized variance. In particular, the coefficient $\frac{-\beta _{-}\beta _{+}} {2}$ is strictly positive, so the strategy is effectively long volatility (V _T). Also, as it does not depend on S _T, the ω ^∗ portfolio is Δ-neutral as long as the reference does not move significantly. In Table 5, we summarize the coefficient of V _T and the weighted portfolio $(\omega ^{{\ast}},1 -\omega ^{{\ast}})$ for different combinations of leverage ratios. Note that whenever $\beta _{+} = -\beta _{-}$, we end up with the portfolio weight $\omega ^{{\ast}} = \frac{1} {2}$. Also, the coefficient $\frac{-\beta _{-}\beta _{+}} {2}$ is greater than or equal to 1 except for the pair $(\beta _{+},\beta _{-}) = (1,-1)$, and it is largest for the pair $(\beta _{+},\beta _{-}) = (3,-3)$.

Table 5 For each $(\beta _{+},\beta _{-})$ pair: the weight ω ^∗ of the β ₊-LETF in the portfolio, and the coefficient $\frac{-\beta _{-}\beta _{+}} {2}$ of V _T in the strategy return (see Proposition 1)
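
The weight and the variance coefficient follow directly from the formulas in Proposition 1. Below is a minimal sketch that evaluates them, assuming a grid of leverage ratios β ₊ ∈ {1, 2, 3} and β ₋ ∈ {−1, −2, −3}. The grid and the function names are illustrative choices and are not taken from the table itself.

```python
from fractions import Fraction
from itertools import product


def neutral_weight(beta_plus, beta_minus):
    """Weight omega* = -beta_- / (beta_+ - beta_-) from Proposition 1."""
    return Fraction(-beta_minus, beta_plus - beta_minus)


def variance_coefficient(beta_plus, beta_minus):
    """Coefficient -beta_- * beta_+ / 2 multiplying V_T in (14)."""
    return Fraction(-beta_minus * beta_plus, 2)


# Assumed grid of integer leverage ratios (illustrative only).
pairs = list(product([1, 2, 3], [-1, -2, -3]))

for bp, bm in pairs:
    w = neutral_weight(bp, bm)
    c = variance_coefficient(bp, bm)
    print(f"beta+={bp}, beta-={bm}: omega*={str(w):>4}, "
          f"1-omega*={str(1 - w):>4}, coefficient={str(c):>4}")
```

Exact fractions are used so that the symmetric pairs return a weight of exactly 1/2 rather than a rounded float.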


We now backtest the ω ^∗ strategy from Proposition 1 as follows. For each LETF pair, we short $0.5 of the β ₊-LETF and $0.5 of the β ₋-LETF with $\beta _{+} = -\beta _{-} = 2$ and hold the position for some time T. The normalized return $\mathcal{R}_{T}$ depends on the relative weights on the long/short-LETFs but not the absolute cash amounts. More generally, one can also test the strategy with different β _± and ω ^∗.

Dividing the price data from Dec 2008 to May 2013 into n-day rolling (overlapping) periods, we calculate the returns from the strategy over each period. We then compare every n-day return against the realized variance over the same period. This is illustrated in Fig. 9. As a theoretical benchmark, we also plot $\mathcal{R}_{T}$ in (14) as a linear function. Each dot on the plots represents a 5-day return; because the periods overlap, these returns are not independent. For this reason, the lines in Fig. 9 are not generated by regression but taken directly from (14). We choose (14) as a benchmark because it is expected to hold pathwise as long as $\frac{L_{T}} {L_{0}} \approx 1$ with negligible tracking error.
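
The rolling-window calculation can be sketched as follows. This is an illustration only: the price series is a hand-built three-day example rather than market data, the LETFs are idealized (daily return equal to β times the reference daily return, with zero fees and zero interest rate), and the realized variance is taken to be the sum of squared daily log returns, which is an assumption about the definition of V _T. The function names are hypothetical.

```python
import numpy as np


def letf_path(ref, beta):
    """Idealized LETF value path: daily return = beta * reference daily return.

    Assumes perfect tracking, zero fees and zero interest rate.
    """
    ref = np.asarray(ref, dtype=float)
    daily = ref[1:] / ref[:-1] - 1.0
    return np.concatenate(([1.0], np.cumprod(1.0 + beta * daily)))


def rolling_double_short(ref, long_letf, short_letf, n, omega=0.5):
    """Realized variance and double-short return over every overlapping n-day window."""
    ref = np.asarray(ref, dtype=float)
    lp = np.asarray(long_letf, dtype=float)
    lm = np.asarray(short_letf, dtype=float)
    # Normalized return (11) over each window [t, t + n].
    ret = 1.0 - omega * lp[n:] / lp[:-n] - (1.0 - omega) * lm[n:] / lm[:-n]
    # Assumed variance proxy: sum of squared daily log returns in the window.
    sq = np.diff(np.log(ref)) ** 2
    csum = np.concatenate(([0.0], np.cumsum(sq)))
    var = csum[n:] - csum[:-n]
    return var, ret


# Hand-built round trip of the reference: up 10%, then back to the start.
ref = np.array([100.0, 110.0, 100.0])
var, ret = rolling_double_short(ref, letf_path(ref, 2), letf_path(ref, -2), n=2)

# With beta = +/-2 and zero fees and rate, the benchmark (14) reduces to 2 * V_T.
print(ret[0], 2.0 * var[0])
```

In this round-trip example the reference ends where it started, so the realized return (about 0.0364) is close to the benchmark value 2V _T (about 0.0363). When the reference ends far from its starting level, the two can differ noticeably.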

We can observe from Fig. 9 that the returns exhibit positive dependence on the realized variance (V _T). In particular, for the energy pairs (DIG-DUG (β = ±2, oil & gas) and UCO-SCO (β = ±2, crude oil)), the returns tend to be very positive when the realized variance is high. This is because the strategy captures the volatility decay as profit. Nevertheless, there is also a visible amount of noise in the returns deviating from the linear dependence on V _T, especially for the gold and silver pairs (UGL-GLL (β = ±2, gold) and AGQ-ZSL (β = ±2, silver), respectively). This can be partly attributed to tracking errors from both LETFs in the portfolio. Also, the ω ^∗-strategy loses its Δ-neutrality if the reference moves significantly.

While this portfolio is expected to be Δ-neutral (with respect to the reference index) for small reference movements, in reality the strategy is also short-Γ. One way to see this is through Fig. 10, which plots the returns against the reference index returns. Common to all four LETF pairs, when the reference return is either very positive or very negative, the return of the ω ^∗-strategy tends to be negative. As a theoretical benchmark, we also plot the normalized return equation (12), which applies even for large reference movements.
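
The short-Γ profile can be seen by evaluating (12) and (13) directly. The sketch below assumes β = ±2, ω = 1/2, zero fees, a zero interest rate, and illustrative values V _T = 0.01 and T = 5/252. These parameter values and the function names are assumptions made for the example, not estimates from the data.

```python
import math


def gamma_term(beta, V, T, r=0.0, f=0.0):
    """Exponent Gamma_T from (13) for one LETF."""
    return 0.5 * (beta - beta ** 2) * V + ((1.0 - beta) * r - f) * T


def exact_return(x, V, T, beta_plus=2.0, beta_minus=-2.0, omega=0.5,
                 r=0.0, f_plus=0.0, f_minus=0.0):
    """Normalized return (12) of the double-short portfolio, with x = S_T / S_0."""
    gp = gamma_term(beta_plus, V, T, r, f_plus)
    gm = gamma_term(beta_minus, V, T, r, f_minus)
    return (1.0
            - omega * x ** beta_plus * math.exp(gp)
            - (1.0 - omega) * x ** beta_minus * math.exp(gm))


V, T = 0.01, 5.0 / 252.0  # illustrative realized variance and holding time
for x in (0.8, 0.9, 1.0, 1.1, 1.2):
    print(f"S_T/S_0 = {x:.1f}: return = {exact_return(x, V, T):+.4f}")
```

Under these assumptions the return is close to 2V _T when the reference is unchanged and turns negative once the reference moves by roughly 20% in either direction, which is the concave, short-Γ shape described above.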

In contrast to the energy pairs, the gold and silver pairs yield very noisy returns. This is consistent with our earlier observations from our regressions in Figs. 3 and 4. For instance, both UGL and GLL (β = ±2, gold) show substantial tracking errors over short periods such as 5 days, and their regressed leverage ratios differ from the stated ones. On the other hand, the DIG and DUG (β = ±2, oil & gas) regressions in Figs. 1 and 2 reflect much smaller tracking errors.

Furthermore, Fig. 11 shows that as the holding time increases, the returns from the ω ^∗ strategy increase as well. The performance is best for the energy pairs UCO-SCO (β = ±2, crude oil) and DIG-DUG (β = ±2, oil & gas), but more subdued for the bullion pairs UGL-GLL (β = ±2, gold) and AGQ-ZSL (β = ±2, silver). However, over longer holding periods, the ω ^∗ portfolio may lose its Δ-neutral status, thereby generating more risk as well. Although average returns from the ω ^∗ strategy are positive, one is subject to enormous tail risk, which increases with the holding time of the static portfolio. To avoid excessive tail risk, one should not only be confident of a high-volatility environment but also adjust the holding time to account for the additional risk associated with a longer return horizon.

Figure 12 gives another perspective of the ω ^∗ strategy’s dependence on realized variance. It shows the time series of the 30-day rolling returns along with the realized variance of the reference index from Dec 2008 to May 2013. We see that when the realized variance increases sharply, the strategy returns also spike sharply. For example, when the realized variance of the DJUSEN index spikes, the DIG-DUG (β = ±2, oil & gas) trading pair accumulates a 30% return over a single 30-day holding period. However, when realized variance is subdued over a period of time, the ω ^∗ returns may turn quite negative as well.

In summary, the double-short trading strategy studied herein is profitable on average, but it is commodity specific and subject to enormous tail risk, as seen from empirical prices. The strategy’s profitability depends strongly on high volatility in the reference index. Although longer holding times tend to enhance the average return, they also enormously increase the horizon risk. According to these findings, this strategy appears to be appealing only during times of high volatility in the reference index.

5 Concluding Remarks

The ETF market has continued to grow in quantity and diversity, especially in the past 5 years. For both investors and regulators, it is very important to understand and quantify the risks involved with various ETFs. In this paper, we have focused on commodity ETFs and their leveraged counterparts. We find that the LETF returns tend to deviate significantly from the corresponding multiple of the reference returns as the holding horizon lengthens. To study the performance of an LETF, we have applied a new benchmark process that accounts for the realized variance of the underlying. We find that many commodity LETFs still diverge, typically negatively, from this benchmark over time. These empirical observations motivate us to illustrate the over/under-performance of an LETF via the concept of realized effective fee. Based on the funds and the time periods we have studied, most commodity LETFs effectively charge significantly higher expense fees than stated on their prospectuses.

In view of LETFs’ common pattern of value erosion over time, one well-known trading strategy in the industry involves statically shorting both long and short LETFs in order to capture the volatility decay as profit. We systematically study an extension of this strategy that is applicable to LETF pairs with asymmetric leverage ratios. We analytically derive the specific weights in the LETFs so that the resulting portfolio is approximately Δ-neutral, but short-Γ as well. This strategy can potentially be quite profitable, but its return can be negatively impacted by tracking errors generated by the LETFs and by large movements of the reference index. These two factors both depend on the holding horizon. This should motivate future research on the horizon risk for LETF strategies. To this end, Leung and Santoli [7] study the admissible holding horizon and leverage ratio given a risk constraint. The recent papers [6, 13, 14] examine the dynamics of price spreads between ETF pairs, for example, gold vs. silver.

Our analysis herein does not assume a parametric stochastic volatility model for the underlying. It is of practical interest to investigate the price behavior of LETFs under a number of well-known stochastic volatility models, such as the Heston and SABR models. On top of LETFs, there are also options written on these funds. This gives rise to the question of consistent pricing of LETF options across leverage ratios (see [1, 8]). Finally, models that capture the connection between LETFs and the broader financial market would be very useful for not only traders and investors, but also regulators.

Notes

1. According to ETF Database website (http://www.etfdb.com/compare/volume).
2. In 2009, the SEC and FINRA issued an alert on the risk of leveraged ETFs on http://www.sec.gov/investor/pubs/leveragedetfs-alert.htm.
3. For more details on the issue of storage cost for commodity ETFs, we refer to the Morningstar Report: “An Ugly Side to Some Commodity ETFs” by Bradley Kay, August 19, 2009.
4. For a detailed snapshot of the holdings for a ProShares ETF, please see http://www.proshares.com/funds/XYZ_daily_holdings.html where XYZ is the ETF ticker.

References

1. Ahn, A., Haugh, M., Jain, A.: Consistent pricing of options on leveraged ETFs. Working Paper, Columbia University (2012)
2. Avellaneda, M., Zhang, S.: Path-dependence of leveraged ETF returns. SIAM J. Financ. Math. 1, 586–603 (2010)
3. Cheng, M., Madhavan, A.: The dynamics of leveraged and inverse exchange-traded funds. J. Invest. Manag. 7(4), 43–62 (2009)
4. Dobi, D., Avellaneda, M.: Structural slippage of leveraged ETFs. Working Paper (August 2012)
5. Guedj, I., Li, G., McCann, C.: Futures-based commodities ETFs. J. Index Invest. 2(1), 14–24 (2011)
6. Leung, T., Li, X.: Optimal mean reversion trading with transaction costs and stop-loss exit. J. Theoretical & Applied Finance 18(3), 1550020 (2015)
7. Leung, T., Santoli, M.: Leveraged exchange-traded funds: admissible leverage and risk horizon. J. Invest. Strateg. 2, 39–61 (2013)
8. Leung, T., Sircar, R.: Implied volatility of leveraged ETF options. Applied Mathematical Finance 22(2), 162–188 (2015). http://www.tandfonline.com/eprint/jt8MFtBFhkjMIPDYiS9E/full
9. Mackintosh, P., Lin, V.: Longer term plays on leveraged ETFs. Credit Suisse: Portfolio Strategy, pp. 1–6 (April 2010)
10. Mackintosh, P., Lin, V.: Tracking down the truth. Credit Suisse: Portfolio Strategy, pp. 1–10 (February 2010)
11. Mason, C., Omprakash, A., Arouna, B.: Few strategies around leveraged ETFs. BNP Paribas Equities Derivatives Strategy, pp. 1–6 (April 2010)
12. Murphy, R., Wright, C.: An empirical investigation of the performance of commodity-based leveraged ETFs. J. Index Invest. 1(3), 14–23 (2010)
13. Naylor, M., Wongchoti, U., Gianotti, C.: Abnormal returns in gold and silver exchange traded funds. J. Index Invest. 2(2), 1–34 (2011)
14. Triantafyllopoulos, K., Montana, G.: Dynamic modeling of mean-reverting spreads for statistical arbitrage. Comput. Manag. Sci. 8, 23–49 (2009)


Acknowledgements

The authors would like to thank Scott Weiner of VelocityShares, and the participants of the 2013 Focus Program on Commodities, Energy and Environmental Finance held at the Fields Institute and the 2014 Joint Mathematics Meetings in Baltimore for their helpful suggestions and comments.

Author information

Authors and Affiliations

Industrial Engineering & Operations Research (IEOR) Department, Columbia University, New York, NY, 10027, USA
Kevin Guo & Tim Leung

Authors

Kevin Guo
Tim Leung

Corresponding author

Correspondence to Tim Leung.

Editor information

Editors and Affiliations

EDF R&D, Clamart Cedex, France
René Aïd
Department of Statistics and Applied Probability, University of California Santa Barbara, USA
Michael Ludkovski
ORFE Department, Princeton University, Princeton, New Jersey, USA
Ronnie Sircar


About this chapter

Cite this chapter

Guo, K., Leung, T. (2015). Understanding the Tracking Errors of Commodity Leveraged ETFs. In: Aïd, R., Ludkovski, M., Sircar, R. (eds) Commodities, Energy and Environmental Finance. Fields Institute Communications, vol 74. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2733-3_2


DOI: https://doi.org/10.1007/978-1-4939-2733-3_2
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2732-6
Online ISBN: 978-1-4939-2733-3
eBook Packages: Mathematics and Statistics (R0)
