1 Introduction

Returns in a portfolio arise from movements in the market prices of assets, viz. stocks, commodities, market indices etc. Asset returns cannot be predicted perfectly, and the distribution of returns is unknown (see Ruppert (2004), pages 78 and 79). Modeling the return distribution is an important problem in finance. Tolikas and Gettinby (2009) studied the suitability of the Generalized Extreme Value (GEV), Generalized Pareto (GP) and Generalized Logistic (GL) distributions for modeling the distribution of extreme daily share returns on the Singapore Stock Exchange over the period 1973 to 2005. Kassberger and Kiesel (2006) investigated a multivariate extension of the Normal Inverse Gaussian (NIG) distribution for capturing the distributional features of hedge fund returns. The extreme quantiles of the return distribution are important in the context of market risk estimation and also have important applications in portfolio optimization. See for instance Ruppert (2004) and Allen et al. (2013).

Value-at-Risk (VaR) and Median Shortfall (MS) are two well known measures of market risk which are used to determine regulatory capital requirements. See for instance, Cont (2001), Danielsson (2002), Ruppert (2004), So and Wong (2012), and Santomila et al. (2018) and the references therein. Under the Solvency II standard model, VaR is the prescribed risk measure and is widely reported in financial markets (see Santomila et al. (2018)). Solvency II is a revision of the standards for evaluating the financial situation of European insurers, intended to improve risk measurement and control (see Santomila et al. (2018)). For any 0 < p < 1 and m > 0, Goh et al. (2012) interpreted the 100p percent VaR of a portfolio during (t, t + m] as the least amount of capital or cash that must be added to the portfolio at time t + m to ensure that the augmented return (portfolio return plus the cash added) is positive with probability at least p. In this sense, VaR is a measure of the capital adequacy of a portfolio over a period of length m at a given confidence level p. The VaR of a portfolio turns out to be an extreme quantile of the return distribution. See So and Wong (2012) and Dutta and Biswas (2017) and the references therein. However, the VaR has several demerits. It is not a coherent risk measure, as it is not subadditive. Also, VaR does not provide any information about the size of the potential loss when the loss exceeds the VaR level. To address these issues, the median shortfall (MS) was introduced. It is the median of the conditional loss distribution, given the event that the loss exceeds the VaR (see So and Wong (2012)). Therefore, estimation of the VaR and MS essentially reduces to the problem of estimating extreme quantiles of the return distribution (see Dutta and Biswas (2017)).

Let Pt be the price of a financial asset at time t. Let

$$ X_{t, m }=\log\left( \frac{P_{t+m}}{P_{t}}\right). $$

Xt, m is referred to as the return (in log-scale) during a time period (t, t + m]. The number m > 0 is referred to as the time scale and it can be measured in weeks, days, hours or minutes. Xt, m is widely used in finance to represent the revenue earned from investing in an asset over a time period of length m. See for instance, Ruppert (2004), Cont (2001), and Goh et al. (2012) and the references therein.

The distribution of Xt, m is in general unknown. There seems to be no single probability model that provides the best fit to all types of asset return data (values of Xt, m) across different time scales (i.e. different choices of m). See Cont (2001). The statistical properties of short term (i.e. small m) and long term (i.e. large m) return data are different. For instance, Cont (2001) reported that for a wide variety of assets the data on price fluctuations over very small time scales are highly leptokurtic and negatively skewed. The kurtosis of the marginal distribution seems to be very high for m between 5 and 30 minutes (see Cont (2001, page 226)). However, as the time scale \(m \rightarrow \infty \), the empirical distribution of asset returns resembles a normal distribution. Cont (2001) referred to this phenomenon as Aggregational Gaussianity. However, no proof of this observation seems to be available.

Most of the existing research papers seem to focus on modeling short term returns (e.g. daily returns) or on estimation of short term VaR or MS (for instance daily VaR and MS). Long-term return refers to the return generated after holding an asset for a substantial period of time. For example, in India an equity holding period exceeding one year is considered long-term. In the 1990s JP Morgan developed a VaR estimation method which was effective in measuring short term risk in the banking industry. Dowd et al. (2004) and Fedor (2007) discuss various problems with JP Morgan’s method in the context of measuring longer-term risks. These studies show that the long-term VaR is more difficult to estimate than the short-term VaR. Our model enables estimation of the long term VaR and MS.

Dowd et al. (2004) have discussed the demerits of the “square-root rule” of computing the VaR over m days (say VaR(m)) by multiplying the one day VaR by \(\sqrt {m}\). The authors argued that the formula VaR(m)=\(\sqrt {m}\)VaR(1) leads to overestimation of the VaR over m days. Under the assumption that the daily returns follow a log-normal distribution with parameters μ and σ, Dowd, Blake, and Cairns obtained the following formula for the 100p percent VaR of the absolute returns over m days.

$$ \begin{array}{@{}rcl@{}} VaR(m)&=& P-\exp\left( \mu m+\alpha_{1-p}\sigma\sqrt{m}+\ln P\right)\\ &=&P\left( 1-\exp{\left( \mu m+{\alpha_{1-p}}\sigma\sqrt{m}\right)}\right), \end{array} $$
(1.1)

where P is the current price of the portfolio or asset and α1−p is the (1 − p)th quantile of the N(0,1) distribution, 0 < p < 1. Dowd et al. (2004) obtained the formula Eq. 1.1 for VaR(m) under the assumption that the one day returns follow a log-normal distribution. But the distribution of daily returns is in general unknown. Obtaining formulae for VaR(m) under other probability distributions that can fit daily return data, such as the Student’s-t, GEV, GP and GL distributions, appears to be quite challenging. Our model for long term returns yields asymptotic approximations to the m-period VaR and MS for large m, under very general conditions on the short term 1-period returns.
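As a quick illustration, Eq. 1.1 is a one-line computation. The following Python sketch (the function name `dowd_var` and the parameter values are ours, chosen only for illustration) evaluates it:

```python
import numpy as np
from scipy.stats import norm

def dowd_var(P, mu, sigma, m, p):
    """Eq. 1.1: m-day 100p percent VaR of absolute returns, assuming
    one-day log-normal returns with parameters mu and sigma."""
    alpha = norm.ppf(1 - p)  # alpha_{1-p}, the (1-p)th quantile of N(0,1)
    return P * (1 - np.exp(mu * m + alpha * sigma * np.sqrt(m)))

# Hypothetical inputs: daily drift 0.0004 and daily volatility 0.015
print(dowd_var(P=1.0, mu=0.0004, sigma=0.015, m=250, p=0.95))
```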

We provide a probabilistic model which theoretically justifies the Gaussianity of Xt, m for large m. We shall refer to the asymptotic distribution of Xt, m for large m as the long term return distribution. To obtain this distribution, we partition the interval (t, t + m] into m non-overlapping subintervals {(t + i − 1, t + i]}i= 1,2,⋯ , m, each of length 1. Since log-returns are additive, we have the following equation

$$ \begin{array}{@{}rcl@{}} &&\ X_{t,m}=\log\left( \frac{P_{t+m}}{P_{t}}\right)=\sum\limits_{i=1}^{m}\ X_{t+i}\\ \text{where}\ &&\ X_{t+i} =\log\left( \frac{P_{t+i}}{P_{t+(i-1)}}\right),\ i=1,\cdots, m. \end{array} $$
(1.2)

Cont (2001) referred to returns over small time scales m as fine, and to returns over large m as coarse. Since Xt, m is a sum of m fine returns, a suitable Central Limit Theorem can be used to approximate the distribution of the centered and scaled Xt, m as \(m\rightarrow \infty \), provided the fine returns satisfy some common properties. Our asset return model thus consists of appropriate assumptions on the fine returns {Xt+i}i= 1, 2, ⋯, in line with the empirical properties of fine returns observed by Cont (2001).

Our Lemma 1 explains Cont’s Aggregational Gaussianity observation, i.e. that the distribution of long term returns can be approximated by the normal distribution. The normal approximation naturally leads to an approximation of the quantiles of Xt, m for large m; see, for instance, the quantile estimator Eq. 1.4. In Lemma 2 we show that the distribution of Xt, m can also be approximated by the classical i.i.d. bootstrap for large m.
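The aggregation effect behind Lemma 1 is easy to see numerically. The sketch below (our illustration; the i.i.d. heavy-tailed fine returns and the seed are arbitrary choices, not the paper’s model) sums m fine returns and compares the excess kurtosis of the fine and the coarse returns; by Lemma 1 the latter should be near zero, as for a normal distribution.

```python
import numpy as np
from scipy.stats import t, kurtosis

rng = np.random.default_rng(0)
n, m = 2000, 250                            # n coarse returns, each a sum of m fine returns
fine = 0.01 * t.rvs(df=5, size=n * m, random_state=rng)   # heavy-tailed fine returns
coarse = fine.reshape(n, m).sum(axis=1)                   # X_{t,m} = sum of m fine returns

print("excess kurtosis, fine returns  :", kurtosis(fine))    # well above 0 (about 6)
print("excess kurtosis, coarse returns:", kurtosis(coarse))  # close to 0 (Gaussian-like)
```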

This paper is divided into five sections. In Section 1, we introduce a model (viz. Eq. 1.2 and Assumption 1) for the long term return distribution, and we state and prove Lemmas 1 and 2, which provide the mathematical basis for the normal approximation and the i.i.d. bootstrap approximation of the long-term return distribution and its quantiles. In Section 2, we discuss the problem of estimating the long term VaR and MS. Equations 2.2 and 2.5 are the proposed estimators of the VaR and MS over a long time period. We also propose bootstrap based VaR and MS estimators, given by Eqs. 2.12 and 2.13, and we describe seven other VaR and MS estimators, including the “square-root of time rule (SRTR)” based estimator. In Section 3, using Monte Carlo simulations, we compare the mean squared errors (MSE) of the nine VaR and MS estimators for different choices of n (sample size of long term returns), m (duration of long term returns) and for three different time series models for the fine returns data. The simulation results are reported in Tables 1, 2, 3 and 4. The results suggest that the proposed estimators Eqs. 2.2, 2.5, 2.12 and 2.13 outperform the other estimators of the 95 percent VaR and MS for almost all choices of n, m and the time series model for the fine returns. The SRTR based estimator performs well for fine returns data generated by a GARCH(1,1) process. In the SRTR method, a probability distribution is fitted to the fine return data to estimate the short term VaR, which is then multiplied by \(\sqrt {m}\) to estimate the m-period VaR (see Spadafora et al. (2014)). If the marginal distribution of the fine returns differs from the probability distribution fitted to the fine return data, the SRTR VaR estimator performs poorly. The performance of the SRTR also deteriorates in the presence of dependence in the fine returns data. The extreme value theory based VaR estimator proposed by Drees (2003), the Sfakianakis and Verginis (2008) VaR estimator and the kernel based VaR estimators perform poorly in comparison to the proposed central limit theorem based VaR estimator Eq. 2.2 and the bootstrap based VaR estimator Eq. 2.12 for m ≥ 250. The MSEs of the proposed estimators Eqs. 2.2, 2.5, 2.12 and 2.13 do not seem to fluctuate widely under different time series models. In contrast, the performance of the other estimators seems to be sensitive to the underlying model for the fine return data.

In Section 4, we describe the unconditional backtest of Kupiec (1995). We use the proposed estimators Eqs. 2.2, 2.5, 2.12 and 2.13 and a number of other estimators, viz. the sample quantile, the Sfakianakis and Verginis estimator, the extreme value theory based estimator and the SRTR based estimator, to estimate the 95 percent annual VaR and MS of the Nifty 50 index based on the real data reported by Dutta and Das (2018). We also use these estimators to estimate the 95 percent annual VaR and MS of crude oil and gold prices based on the historical data available on the Yahoo Finance website. Our analysis suggests that while the VaR and MS of crude oil annual returns are comparable to those of the NIFTY 50 annual returns, the annual VaR and MS of gold returns are much smaller than those of the NIFTY 50 index. This indicates that gold exhibits the least market risk over a duration of one year in comparison to crude oil and the NIFTY 50 index. In the Appendix we report Tables 1–10 containing the simulation results, the real data, and the VaR and MS estimates based on the real data.

1.1 Long Term Return Distribution and Aggregational Gaussianity

As \(m\rightarrow \infty \), Eq. 1.2 represents a model for the long term return of an asset whose return history has been recorded at a large number of time points in the past. This model is incomplete without assumptions on the fine returns {Xt+(i− 1)}i= 1,2,3,⋯. Before stating our assumptions, we note the following empirical observations reported by Cont (2001):

  (i) (linear) auto-correlations of asset returns are often insignificant, except for very small intra-day time scales (say 20 minutes);

  (ii) different measures of volatility, such as absolute and squared daily returns, display a positive auto-correlation over several days; this is known as volatility clustering;

  (iii) the marginal distribution of the fine returns exhibits Pareto-like tails or properties similar to the Student’s-t distribution with four degrees of freedom; the marginal distribution of the fine returns seems to have finite variance but infinite fourth moment.

An exact specification of the marginal distribution of the fine returns is not possible, so we make assumptions in line with the above observations. We refer to {Xt+i− 1}i= 1,2,⋯ as the fine return process and make the following assumptions about it.

Assumption 1.

For any t > 0, {Xt+i− 1}i= 1,2,⋯ is a stationary strongly mixing process, with exponential mixing rate, satisfying

$$ \begin{array}{@{}rcl@{}}&&\textbf{a.}\ 0\le E\left[|X_{t+i-1}|^{2}\left|\log|X_{t+i-1}|\right|^{1+\delta}\right]<\infty,\ \text{for some}\ \delta>0,\\ &&\textbf{b.}\ Corr(X_{t+i-1},\ X_{t+k-1})=0\ \ \forall\ i\neq k,\ i,\ k=1,2,\cdots,\\ &&\textbf{c.}\ Corr(|X_{t+i}|,\ |X_{t+i-1}|)>0\ \text{and}\ Corr(X^{2}_{t+i},\ X^{2}_{t+i-1})>0\ \ \forall\ i=1,2,\cdots.\end{array} $$

Equation 1.2 and Assumption 1 represent our proposed model for the m-period return.

Under Assumption 1, E(Xt, m) = mE(Xt) and Var(Xt, m) = mVar(Xt). Thus the volatility of the m-period return increases with m. From Eq. 1.2 we get

$$ \begin{array}{@{}rcl@{}}X_{t+k,m} =\sum\limits_{i=1}^{m}\ X_{t+k+i-1}.\end{array} $$
(1.3)

Then under Assumption 1, {Xt+k, m}k= 0,1,2,.... are identically distributed with common marginal distribution function Fm. Xt+k, m denotes the return during (t + k, t + k + m].

Assumption 1 is supported by several real datasets. For instance, Dutta and Das (2018) published data on the daily log-returns of the Nifty 50 index on the National Stock Exchange (NSE) in India for the financial years (FY) 1995-96 to 2017-18. The Augmented Dickey-Fuller (ADF) test suggests that the NIFTY 50 data is stationary. The marginal variance of the 5692 observations in the data is less than 3. The auto-correlations of the daily log-returns of the Nifty 50 index seem to be insignificant, but the absolute and squared daily log returns exhibit significant auto-correlation.

The historical data on the daily closing prices of crude oil (per barrel) and gold (per troy ounce) in USD from FY 2001-02 to FY 2020-21 are obtained from the Yahoo Finance website (https://finance.yahoo.com/quote/CL%3DF/history?p=CL%3DF, https://finance.yahoo.com/quote/GC%3DF/history?p=GC%3DF). Based on the daily closing prices we obtain the log returns. The daily log returns of crude oil and gold prices exhibit empirical properties similar to those of the NIFTY 50 daily log returns. For instance, the datasets are stationary, and the auto-correlation of the daily log returns is insignificant while the squared daily log returns exhibit significant positive auto-correlation. Further, the crude oil log returns (5004 observations) and the gold log returns (5009 observations) seem to have finite marginal variance (less than 2). These observations support Assumption 1 for the fine return process.

A sequence of i.i.d. random variables with finite third moment satisfies conditions a. and b., but not c. Condition c. of Assumption 1 is in line with the phenomenon of volatility clustering observed by Cont (2001). The following processes are non-trivial examples of {Xt}t= 1,2,⋯ satisfying Assumption 1.

Example 1.

Let {Pt}t= 1,2,⋯ be a sequence of positive valued random variables such that \(\log (P_{t})\) follows a moving average process defined as follows

$$ \log(P_{t})=\theta\epsilon_{t-1}+\sqrt{1-\theta^{2}}\epsilon_{t},\ 0<\theta<1, $$

where {𝜖t}t= 1,2,⋯ is a sequence of i.i.d. N(0, 1) random variables. Further, let {Yt}t= 1,2,⋯ be a sequence of i.i.d. random variables, independent of {𝜖t}t= 1,2,⋯ (and hence of {Pt}t= 1,2,⋯), such that P(Yt = 1) = 0.5 and P(Yt = − 1) = 0.5 (for instance, one can view {Yt}t as the outcomes of a sequence of fair coin tosses). Define

$$X_{t}=Y_{t}P_{t}.$$

Then

$$ \begin{array}{@{}rcl@{}}&& Cov\left( X_{t+i},\ X_{t+k}\right)=0,\ \forall\ i\neq k,\ i,\ k=1,2,\cdots\\ && Cov\left( |X_{t}|,\ |X_{t-1}|\right)=Cov\left( P_{t},\ P_{t-1}\right)=e\left( e^{\theta\sqrt{1-\theta^{2}}}-1\right)>0,\\ &&Cov\left( |X_{t}|,\ |X_{t-k}|\right)=0,\ k\ge 2.\end{array} $$
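A short simulation (our sketch; θ, the seed and the sample size are arbitrary) can confirm these covariances: the returns are uncorrelated at lag one, while the absolute returns show the positive lag-one covariance computed above.

```python
import numpy as np

rng = np.random.default_rng(1)
theta, N = 0.6, 200_000
eps = rng.standard_normal(N + 1)
logP = theta * eps[:-1] + np.sqrt(1 - theta**2) * eps[1:]   # log(P_t), an MA(1)
X = rng.choice([-1.0, 1.0], size=N) * np.exp(logP)          # X_t = Y_t * P_t

lag1_cov = lambda z: np.cov(z[:-1], z[1:])[0, 1]
print("Cov(X_t, X_{t-1})    :", lag1_cov(X))                # approximately 0
print("Cov(|X_t|, |X_{t-1}|):", lag1_cov(np.abs(X)))        # positive
print("theoretical value    :", np.e * (np.exp(theta * np.sqrt(1 - theta**2)) - 1))
```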

Example 2.

Let {Pt}t= 1,2,⋯ be a sequence of positive valued random variables such that \(\log (P_{t})\) follows an autoregressive process defined as follows

$$ \log(P_{t})=\theta\log(P_{t-1})+\sqrt{1-\theta^{2}}\epsilon_{t},\ 0<\theta<1, $$

where {𝜖t}t= 1,2,⋯ is a sequence of i.i.d. N(0, 1) random variables and {Yt}t= 1,2,⋯ is a sequence of i.i.d. random variables as defined in Example 1, independent of {Pt}t= 1,2,⋯ and {𝜖t}t= 1,2,⋯. Define Xt = YtPt. Then {Xt+i− 1}i= 1,2,⋯ satisfies Assumption 1. For instance, it is easy to verify that for this example

$$ Cov\left( X_{t},\ X_{t-1}\right)=0,\ \ Cov\left( |X_{t}|,\ |X_{t-1}|\right)=Cov\left( P_{t},\ P_{t-1}\right)=e\left( e^{\theta}-1\right). $$

Example 3.

Let {Xt}t= 1,2,⋯ follow a stationary GARCH(1,1) process defined as follows

$$ \begin{array}{@{}rcl@{}}X_{t}&=&\sigma_{t}Z_{t},\\ {\sigma^{2}_{t}}&=& C+ \alpha X^{2}_{t-1}+\beta\sigma^{2}_{t-1}, \end{array} $$

where C, α, β > 0 and α + β < 1, and {Zt}t= 1,2,⋯ is a sequence of martingale differences with mean 0 and variance 1. Posedel (2005) has studied the properties of the GARCH(1,1) model in detail. Under the assumptions that α + β < 1 and β2 + 2αβ + 3α2 < 1, {Xt} is a stationary uncorrelated process with \(Var(X_{t})=\frac {C}{1-\alpha -\beta }\) and finite fourth moment, and \(\{{X^{2}_{t}}\}\) is an ARMA process with positive auto-correlation. Hence Assumption 1 is satisfied.
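The recursion in Example 3 is straightforward to simulate. The sketch below is ours; it uses Gaussian innovations (a special case of the martingale differences above) and a burn-in period, and checks the stationary variance formula \(\frac{C}{1-\alpha-\beta}\).

```python
import numpy as np

def simulate_garch11(C, alpha, beta, size, burn=1000, rng=None):
    """Simulate X_t = sigma_t Z_t with sigma_t^2 = C + alpha X_{t-1}^2 + beta sigma_{t-1}^2."""
    rng = rng or np.random.default_rng()
    x = np.empty(size + burn)
    sig2 = C / (1 - alpha - beta)            # start at the stationary variance
    for i in range(size + burn):
        x[i] = np.sqrt(sig2) * rng.standard_normal()
        sig2 = C + alpha * x[i] ** 2 + beta * sig2
    return x[burn:]                          # discard the burn-in segment

X = simulate_garch11(C=1e-4, alpha=0.4, beta=0.5, size=100_000)
print(X.var(), 1e-4 / (1 - 0.4 - 0.5))       # both close to 0.001
```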

Using the Central Limit Theorem (CLT) for strongly mixing sequences of random variables by Herrndorf (1985), we have the following result.

Lemma 1.

Let {Xt+i− 1}i= 1,2,⋯ be an α-mixing stationary process satisfying Assumption 1. Then

$$ \frac{X_{t,m}-mE(X_{t})}{\sqrt{mVar(X_{t})}}\rightarrow_{D} N(0,1)\ \text{as}\ m\rightarrow \infty, $$

where m is the number of fine (short-term) returns aggregated over the interval (t, t + m].

Remark 1.

1. Lemma 1 explains Cont’s Aggregational Gaussianity observation, i.e. the distribution of long term return can be approximated by the normal distribution.

2. Let Qm, p denote the 100p th percentile of the marginal distribution of Xt, m. Lemma 1 motivates the following estimator of Qm, p for large m

$$ \hat{Q}_{m,p}=mE(X_{t})+\sqrt{mVar(X_{t})}{\Phi}^{-1}(p),\ 0<p<1. $$
(1.4)

In practice, E(Xt) and Var(Xt) are approximated by the mean and the variance of the m observed fine returns.
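In code, the plug-in version of Eq. 1.4 takes only the observed fine returns. The following is a minimal sketch (the function name `q_hat` is hypothetical):

```python
import numpy as np
from scipy.stats import norm

def q_hat(fine_returns, m, p):
    """Plug-in estimator of Q_{m,p} from Eq. 1.4: the sample mean and variance
    of the observed fine returns stand in for E(X_t) and Var(X_t)."""
    mu = np.mean(fine_returns)
    var = np.var(fine_returns, ddof=1)
    return m * mu + np.sqrt(m * var) * norm.ppf(p)
```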

The following lemma ensures that one can also use the i.i.d. bootstrap method of Efron (1979) to approximate the sampling distribution of \(\frac {X_{t,m}-mE(X_{t})}{\sqrt {mVar(X_{t})}}\) for large m.

Lemma 2.

Let {Xt+i− 1}i= 1,2, ⋯ be a sequence of stationary random variables satisfying Assumption 1 such that \(E\left (|X_{t}|^{3}\right )<\infty \). Then as \(m\rightarrow \infty \),

$$\frac{X^{*}_{t, m }-X_{t,m}}{\sqrt{m}S_{m}}\rightarrow_{D} N(0,1),\ \text{almost surely},$$

where \(X^{*}_{t, m}={\sum }^{m}_{i=1}X^{*}_{t+i-1}\), with \(X^{*}_{t},\ X^{*}_{t+1},\cdots ,\ X^{*}_{t+m-1}\) i.i.d. draws from the empirical distribution of Xt, Xt+ 1, ⋯ , Xt+m− 1, and Sm is the sample standard deviation of Xt, Xt+ 1, ⋯ , Xt+m− 1.

Proof.

Using the arguments in the proof of Theorem 2.2 of Lahiri (2003, page 21), we see that it is enough to show that under Assumption 1 and the assumption that \(E\left (|X_{t}|^{3}\right )<\infty \), as \(m\rightarrow \infty \)

$$ {S^{2}_{m}}\rightarrow Var(X_{t}),\ \frac{1}{m^{3/2}}\sum\limits_{i=1}^{m}|X_{t+i-1}|^{3}\rightarrow 0,\ \text{almost surely}. $$
(1.5)

A stationary strongly mixing process is stationary ergodic (see Rieders (1993)). Hence, under Assumption 1, {Xt}t∈Z is a stationary ergodic process with finite marginal variance, and the Birkhoff ergodic theorem ensures that as \(m\rightarrow \infty \)

$$\frac{1}{m}\sum\limits_{i=1}^{m} X_{t+i-1}\rightarrow E(X_{t}),\ \frac{1}{m}\sum\limits_{i=1}^{m} X^{2}_{t+i-1}\rightarrow E({X^{2}_{t}}),\ \text{almost surely}$$

Applying the ergodic theorem to \(|X_{t+i-1}|^{3}\) (whose mean is finite by assumption) likewise gives \(\frac {1}{m}{\sum }_{i=1}^{m}|X_{t+i-1}|^{3}\rightarrow E(|X_{t}|^{3})\) almost surely, so that \(\frac {1}{m^{3/2}}{\sum }_{i=1}^{m}|X_{t+i-1}|^{3}=m^{-1/2}\cdot \frac {1}{m}{\sum }_{i=1}^{m}|X_{t+i-1}|^{3}\rightarrow 0\) almost surely. Consequently Eq. 1.5 holds, and this completes the proof. □

Lemma 2 implies that Φ− 1(p) in Eq. 1.4 can be approximated by the p th quantile of the distribution of \(\frac {X^{*}_{t, m }-X_{t,m}}{\sqrt {m}S_{m}}\) to obtain an estimate of Qm, p.

2 Estimation of VaR and MS

A risk measure ρ is a functional of a random variable representing the return of a portfolio over a certain holding period. In this paper, Xt, m is the random variable representing the m-period return. A law invariant risk measure is a functional of the marginal return distribution function. The Value at Risk (VaR) and the Median Shortfall (MS) are two well known law-invariant risk measures (see Dutta and Biswas (2017)). Goh et al. (2012) defined the VaR as follows.

Definition 1.

For 0 < p < 1, the 100p percent VaR during (t, t + m], denoted by VaRm, p, is a number satisfying

$$VaR_{m, p}=\inf\{x\ge 0:P\left( X_{t, m}+x\ge 0\right)\ge p\}.$$

Remark 2.

Since \(P\left (X_{t, m}+x\ge 0\right )=P\left (X_{t, m}\ge -x\right )\), it is easy to verify that

$$ VaR_{m, p}=-Q_{m, 1-p}, $$
(2.1)

where Qm, p is the p th quantile of the marginal distribution of Xt, m, i.e. \(Q_{m, p}=\inf \{y:F_{m}(y)\ge p\}\).

Under Assumption 1 and as \(m\rightarrow \infty \), using Eq. 1.4 we can approximate VaRm, p by the following estimator

$$ \widehat{VaR}_{m, p}=-\hat{Q}_{m, 1-p}=-mE(X_{t})-\sqrt{mVar(X_{t})}{\Phi}^{-1}(1-p). $$
(2.2)

E(Xt) and Var(Xt) are in general unknown, but under Assumption 1 these parameters can be estimated consistently from the observed fine returns; we estimate E(Xt) and Var(Xt) by the mean and variance of the observed fine returns. One of the reviewers suggested taking into account the empirically observed property that the conditional second moment of asset returns is time-varying while estimating the VaR. The conditional variance of Xt+i, given Xt, Xt+ 1, ⋯ , Xt+i− 1, can be modeled by a stationary GARCH model satisfying Assumption 1. For instance, the GARCH(1,1) model in Example 3 satisfies Assumption 1 and takes into account the time variation of the conditional variance of the fine returns. Under the GARCH(1,1) model in Example 3, \(Var(X_{t})=\frac {C}{1-\alpha -\beta }\), and C, α, β can be estimated by the maximum likelihood method based on the fine returns. However, if we replace Var(Xt) in Eq. 2.2 by this formula, the resulting VaR estimate is not robust, i.e. the mean squared error (MSE) of the resulting estimator varies widely across different time series models for the fine return process. For example, if we fit a GARCH(1,1) model to the time series in Example 1 and estimate Var(Xt) in Eq. 2.2 by \(\frac {C}{1-\alpha -\beta }\), the resulting VaR estimate performs very poorly (large MSE) in comparison to the proposed VaR estimator obtained by replacing E(Xt) and Var(Xt) in Eq. 2.2 by the sample mean and variance of the fine returns. Therefore, we do not recommend estimating Var(Xt) in Eq. 2.2 by fitting a specific model to the fine return time series, as the resulting VaR estimate is a function of the model parameters and is highly sensitive to the choice of the model for the fine returns. In contrast, the VaR estimator obtained by estimating E(Xt) and Var(Xt) in Eq. 2.2 by the sample mean and variance of the fine returns is a nonparametric estimator of the VaR and does not depend on the model generating the fine return process.

Remark 3.

Since \(X_{t,m}=\log \left (\frac {P_{t+m}}{P_{t}}\right )\) and Qm, p is the 100p th percentile of the distribution of Xt, m, the 100p th percentile of the distribution of the m-period absolute return \(\left (\frac {P_{t+m}}{P_{t}}-1\right )\) is equal to \(\exp \{Q_{m,p}\}-1\). Consequently, under Assumption 1 and for large m, we can approximate the 100p percent m-period VaR in absolute scale by the following estimator

$$ {-\left( \exp\{Q_{m,1-p}\}-1\right)=1-\exp{\{mE(X_{t})+\sqrt{mVar(X_{t})}{\Phi}^{-1}(1-p)\}}}, $$
(2.3)

which is in fact equal to the formula Eq. 1.1 obtained by Dowd et al. (2004) with P = 1, μ = E(Xt) and \(\sigma =\sqrt {Var(X_{t})}\). Dowd, Blake, and Cairns (Dowd et al. 2004) obtained Eq. 1.1 under the assumption that the daily (1-period) returns follow a log-normal distribution with parameters μ and σ. In contrast, Eq. 2.3 is obtained under Assumption 1 without requiring knowledge of the exact distribution of the fine returns.

One of the demerits of VaR is that it does not provide any information about the size of the potential loss over a time scale m when the loss exceeds the VaR level. The conditional loss distribution, given − Xt, m > VaRm, p, is in general unknown. Also, VaR is not a coherent risk measure (see So and Wong (2012)). To overcome these issues, So and Wong (2012) introduced another risk measure called the median shortfall (MS).

The distribution function Θp of the conditional loss distribution, given that the loss − Xt, m exceeds the VaR level, is defined as

$$ \begin{array}{@{}rcl@{}}{\Theta}_p(x)=\begin{cases}P\left( -X_{t,m}\leq x\ \middle|\ -X_{t,m} > VaR_{m,p}\right),\ \ \text{if}\ x> VaR_{m,p},\\ 0,\ \ \text{otherwise.}\end{cases} \end{array} $$

The median of this distribution Θp is called the median shortfall (MS). It is straightforward to verify that

$$ \begin{array}{@{}rcl@{}} {\Theta}_p(x) = \begin{cases}{1-\frac{F_m(-x)}{1-p}}, \ \ \text{if}\ x> VaR_{m,p}, \\ 0,\ \ \text{otherwise,} \end{cases} \end{array} $$
(2.4)

where Fm is the marginal distribution function of Xt, m. Fm is unknown, but Lemma 1 implies that as \(m\rightarrow \infty \), Fm can be approximated by a normal distribution under Assumption 1.

Definition 2.

The 100p percent MS, denoted by MSm, p, is defined as follows

$$MS_{m, p}=\inf\{x: {\Theta}_{p}(x)\ge 0.5\}.$$

By the above definition, the MS is the median loss when the loss exceeds the VaR (see So and Wong (2012)). Therefore, MS gives the median depreciation of the asset value during a time scale m, under the worst-case scenario quantified by the m − period VaR.

The following Lemma is a direct consequence of the definition of MS and Eq. 2.4.

Lemma 3.

Let Fm be a continuous distribution function. Then

$$MS_{m, p}=VaR_{m, 0.5+0.5p}.$$

Therefore for any 0 < p < 1, an estimator \(\hat {MS}_{m, p}\) of the 100p percent MS is defined as follows

$$ \begin{array}{@{}rcl@{}} \widehat{MS}_{m, p}&=&\widehat{VaR}_{m, 0.5+0.5p}=-\hat{Q}_{m,1-0.5(1+p)}=-mE(X_{t})\\ &&-\sqrt{mVar(X_{t})}{\Phi}^{-1}(1-0.5(1+p)). \end{array} $$
(2.5)
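Via Eq. 2.1 and Lemma 3, the CLT based VaR and MS estimators are simple transformations of the plug-in quantile sketched after Eq. 1.4. A minimal version (the function names are ours):

```python
import numpy as np
from scipy.stats import norm

def var_clt(fine_returns, m, p):
    """Eq. 2.2: the 100p percent m-period VaR estimate, -Q_hat_{m,1-p}."""
    mu, var = np.mean(fine_returns), np.var(fine_returns, ddof=1)
    return -m * mu - np.sqrt(m * var) * norm.ppf(1 - p)

def ms_clt(fine_returns, m, p):
    """Eq. 2.5 via Lemma 3: the 100p percent MS is the VaR at level 0.5 + 0.5p."""
    return var_clt(fine_returns, m, 0.5 + 0.5 * p)
```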

2.1 Other Non-Parametric VaR and MS Estimators

Let X1, m, X2, m,..., Xn, m be identically distributed copies of Xt, m with distribution function Fm, and let X(1), m, X(2), m,..., X(n), m denote the corresponding order statistics. The m-period 100p percent VaR is equal to − Qm,1−p (see Eq. 2.1), where 0 < p < 1. Equation 2.1 and Lemma 3 imply that the 100p percent MS is equal to − Qm,1 − 0.5(1+p). Therefore, estimating the m-period 100p percent VaR and MS essentially amounts to estimating − Qm,1−p and − Qm,1 − 0.5(1+p) based on X1, m, X2, m,..., Xn, m.

Dutta and Biswas (2017) have reviewed the performance of a number of non-parametric quantile estimators which can be used to estimate Qm, p. The following are some of the estimators which performed well in their simulation study.

Sample Quantile and Kernel Quantile Estimator

The p th sample quantile \(X_{\left (\left \lfloor np \right \rfloor +1 \right ),m}\) is a natural estimator of Qm, p. The asymptotic properties of the sample quantile are well known under the i.i.d. assumption (see Serfling (1980)). They have also been studied extensively under various dependence assumptions; see, for instance, Sun (2006), Wu (2005), and Wang et al. (2011) and the references therein. Under a strong mixing dependence assumption (with polynomial mixing rate), Wang et al. (2011) obtained a Bahadur representation of the sample quantile, which provides insight into its rate of strong convergence. Dutta and Biswas (2017) have reviewed these properties in detail. We denote the sample quantile estimator by SQp.

A kernel estimator of Qm, p is defined as follows

$$\widehat{Q}_{m, p}=\inf\left \{ x:\widehat{F}_{m}(x)\geq p \right \}$$

where \(\widehat {F}_{m}\) is a kernel distribution function estimator (see Dutta and Biswas (2017)) which is defined as follows

$$\widehat{F}_{m}(y)=\frac{1}{n}\sum\limits_{i=1}^{n}K\left( \frac{y-X_{i,m}}{h_{n}}\right),$$

where K is a distribution function known as the kernel and {hn} is a positive sequence referred to as the bandwidth. In kernel-based methods, the main challenge lies in the selection of the bandwidth hn. Polansky and Baker (2000), Chen and Tang (2005), and Alemany et al. (2013) provide some choices of hn. Using the “kerdiest” package in the R software one can compute the kernel distribution function estimate; the default bandwidth is the formula proposed by Polansky and Baker (2000), given by

$$h_{n}=\left (\frac{\rho (K)}{-n{\mu_{2}^{2}}(K)\widehat{\psi_{2}}(g_{2}) } \right )^{1/3},$$

where \(\rho (K)=2{\int \limits }_{-\infty }^{\infty }uw(u)G(u)du\), \(\mu _{2}(K)={\int \limits }_{-\infty }^{\infty }u^{2}w(u)du\) and \(\psi _{r}(g)=\frac {1}{n^{2}g^{r+1}}{\sum }_{i=1}^{n}{\sum }_{j=1}^{n}L^{(r)}\left (\frac {u_{i}-u_{j}}{g} \right )\), with r ≥ 2 an even integer and \(g_{2} = \left (\frac {2L^{(2)}(0)}{-n{\mu _{2}^{2}}(L)\psi _{4}}\!\right )^{1/5}\). Here L is a kernel function, not necessarily equal to the kernel w, and \(G(u)={\int \limits }_{-\infty }^{u}w(t)dt\).

We denote the Polansky and Baker quantile estimator by PBp.

Chen and Tang (2005) suggested the following choice for the optimal value of hn,

$$h_{n}=\left\{ \frac{2f^{3}\left (Q_{p} \right )b_{k}}{{\sigma_{k}^{4}}\left (f^{(1)}\left (Q_{p} \right ) \right )^{2}}\right\}^{1/3}n^{-1/3},$$

where \(b_{k}={\int \limits } uw(u)G(u)du\) and \({\sigma _{k}^{2}}={\int \limits } u^{2}w(u)du\), with w a probability density function with zero mean and finite variance, known as the kernel, and G(⋅) the distribution function of the distribution with density w. This hn involves the unknown quantities Qp, f and its derivative f(1) at Qp. Chen and Tang (2005) suggested approximating Qp in hn by the corresponding sample quantile, and f and f(1) by the density of the generalized Pareto distribution and its first derivative. We denote Chen and Tang’s quantile estimator by CTp.
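To make the kernel recipe concrete, the sketch below inverts a Gaussian-kernel distribution function estimate numerically. It deliberately uses a crude placeholder bandwidth; it does not implement the Polansky-Baker or Chen-Tang bandwidth formulas above, which require the pilot quantities just described.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def kernel_quantile(x, p, h=None):
    """Invert F_hat(y) = mean(K((y - X_i)/h)) with K the N(0,1) CDF.
    h is a simple placeholder bandwidth, not an optimal choice."""
    x = np.asarray(x, dtype=float)
    h = h or x.std(ddof=1) * len(x) ** (-1.0 / 3.0)
    F = lambda y: norm.cdf((y - x) / h).mean() - p
    lo, hi = x.min() - 5 * h, x.max() + 5 * h    # bracket containing the root
    return brentq(F, lo, hi)
```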

Harrell-Davis Estimator

Harrell and Davis (1982) introduced a quantile estimator (we call HDp) which is a weighted linear combination of order statistics and defined as follows:

$$ \begin{array}{@{}rcl@{}} HD_{p}&=& \sum\limits_{i=1}^{n}w_{(i)}X_{(i),m}, \end{array} $$
(2.6)
$$ \begin{array}{@{}rcl@{}} w_{(i)}&=&I_{i/n}\left (p\left (n+1 \right ),\left (1-p \right )\left (n+1 \right ) \right )\\ &&-I_{(i-1)/n}\left (p\left (n+1 \right ),\left (1-p \right )\left (n+1 \right ) \right ), i=1,2,..,n\end{array} $$
(2.7)

where Ix(a, b) denotes the incomplete beta function. It is available in the R software (see the hdquantile function in the Hmisc package). − HD1−p estimates the 100p percent VaR, and the 100p percent MS is the 100(0.5p + 0.5) percent VaR.
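Equations 2.6-2.7 translate directly into code via the incomplete beta function; a minimal sketch:

```python
import numpy as np
from scipy.special import betainc

def harrell_davis(x, p):
    """Harrell-Davis estimator HD_p of Eqs. 2.6-2.7."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    a, b = p * (n + 1), (1 - p) * (n + 1)
    grid = np.arange(n + 1) / n                # 0/n, 1/n, ..., n/n
    w = np.diff(betainc(a, b, grid))           # w_i = I_{i/n}(a,b) - I_{(i-1)/n}(a,b)
    return np.dot(w, x)
```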

Sfakianakis and Verginis estimator

Sfakianakis and Verginis (2008) introduced three L-statistic type estimators, SV1p, SV2p and SV3p (see Sfakianakis and Verginis (2008) for a detailed discussion). Among these, SV3p seems to be the most appropriate estimator of Qm, p, especially for 1 − p close to zero. It is defined as follows:

$$ SV3_{p}= \sum\limits_{i=1}^{n}B\left (i,n,p \right )X_{(i),m}+\left (2X_{(1),m}-X_{(2),m} \right )B(0,n,p) $$
(2.8)

where B(i, n, p) is the probability mass function of the Binomial distribution with parameters n and p, evaluated at i. − SV31−p estimates the 100p percent VaR, and the 100p percent MS is the 100(0.5p + 0.5) percent VaR.
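SV3p is likewise a short weighted sum of order statistics; a sketch using the binomial probability mass function:

```python
import numpy as np
from scipy.stats import binom

def sv3(x, p):
    """Sfakianakis-Verginis estimator SV3_p of Eq. 2.8."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    w = binom.pmf(np.arange(1, n + 1), n, p)    # B(i, n, p) for i = 1, ..., n
    return np.dot(w, x) + (2 * x[0] - x[1]) * binom.pmf(0, n, p)
```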

2.2 VaR and MS Estimation Based on Extreme Value Theory (EVT)

In this approach the idea is to use the extreme observations above some threshold in the observed data to estimate Qm, p for p close to 1, and hence the VaR and the MS (see Dutta and Biswas (2017), Drees (2003) and the references therein). From Definition 1, VaRm, p is equal to the p th quantile of the marginal distribution of Yt, m = −Xt, m. Since the extreme value theory based estimator is suited to estimating quantiles in the right tail of a distribution, we find the 100p percent VaR by estimating the p th quantile of Yt, m.

The Pickands-Balkema-de Haan theorem (see Balkema and de Haan (1974)) states that the conditional distribution of the exceedance over a threshold, given that a random variable exceeds the threshold, can be well approximated by the Generalized Pareto distribution (GPD), provided the distribution function of the random variable is in the domain of attraction of a Generalized Extreme Value (GEV) distribution. Based on this theorem, a GPD is fitted to the k largest observations in the sample to approximate the tail of the conditional distribution of Yt, m − u given Yt, m > u, where u is the threshold. Usually the threshold is chosen to be the (n − k)th order statistic, where n is the number of observed values of Yt, m in the data. Let Fm now denote the distribution function of Yt, m. The choice of k is not straightforward. For small k (i.e. a large threshold), the GPD approximation of the tail is more accurate, assuming Fm is in the domain of attraction of a GEV distribution, but fewer observations are available for fitting the GPD. In contrast, for large k more data are available for fitting the GPD, but the GPD approximation to the tail is biased.

Drees (2003) extended the extreme value theory for estimation of extreme quantiles of the marginal distribution of a stationary time series from the i.i.d. setting to β-mixing type dependence, which covers a broad class of time series models. The author assumed that the common distribution function Fm satisfies, as λ ↓ 0,

$$ \frac{F^{-1}_{m}(1-\lambda t)}{F^{-1}_{m}(1-\lambda)}\rightarrow \frac{1}{t^{\xi}},\ \forall t>0 $$

for some ξ > 0, where \(F^{-1}_{m}\) is the quantile function of Fm. Under the further assumptions that, as \(n\rightarrow \infty \), \(p\rightarrow 1\) and \(k_{n}\rightarrow \infty \) in such a way that n(1 − p) = O(1) and \(\frac {k_{n}}{n}=o(1)\), one can argue that

$$ Q_{m,p}\approx Q_{m,1-k_{n}/n} \left( \frac{k_{n}}{n(1-p)}\right)^{\xi}, $$

see equation (4) in Drees (2003). A suitable estimator of the tail index ξ is the Hill estimator

$$ \hat \xi=\frac{1}{k_{n}}\sum\limits_{i=1}^{k_{n}}\log\frac{Y_{(n-i+1), m}}{Y_{(n-k_{n}), m}}, $$

where Y(i), m, i = 1,⋯ , n are the n ordered observations (from smallest to largest) of Yt, m. The above approximation naturally leads to the following estimator

$$ {EVT_{p}= Y_{(n-k_{n}), m}\left( \frac{k_{n}}{n(1-p)}\right)^{\hat \xi},} $$
(2.9)

(see Drees (2003)). Therefore, in the EVT approach the 100p percent m-period VaR and MS estimators are EVTp and EVT0.5(1+p) respectively.
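A sketch of the Hill-based estimator in Eq. 2.9, for observed losses y and a user-supplied k (choosing k is the delicate point discussed above; the code assumes the (n − k)th ordered loss is positive):

```python
import numpy as np

def evt_quantile(y, p, k):
    """EVT_p of Eq. 2.9: Hill tail-index estimate from the k largest losses,
    extrapolated from the (n-k)th order statistic."""
    y = np.sort(np.asarray(y, dtype=float))     # ascending order statistics
    n = len(y)
    y_nk = y[n - k - 1]                         # Y_{(n-k),m} in 1-based notation
    xi_hat = np.mean(np.log(y[n - k:] / y_nk))  # Hill estimator over the top k losses
    return y_nk * (k / (n * (1 - p))) ** xi_hat
```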

2.3 VaR and MS Estimation Based on Square Root of Time Rule (SRTR)

Dowd et al. (2004) have stated the “square-root of time rule” based VaR estimator which is given by

$$VaR(m)=\sqrt{m}VaR(1),$$

where VaR(1) is the VaR over 1 day and VaR(m) is the VaR over m days. VaR(1) is estimated under suitable distributional assumptions and is multiplied by the square root of m to get the m day VaR.

Spadafora et al. (2014) have proposed a VaR scaling formula at confidence level 1 − α for a horizon of m days which is given by,

$$ VaR(m, \alpha)=\sqrt{m}\frac{F^{-1}(\alpha)}{ F^{-1}(0.01)}VaR(1,0.01), $$
(2.10)

where F− 1(α) is the α th quantile of the distribution of the short term (1 day) returns and VaR(1,0.01) is the estimated daily VaR at the 99 percent confidence level.

The authors considered three distributions, viz. the Normal (N), Student’s t (ST) and Variance-Gamma (VG) distributions, for fitting the short term (1 day) return distribution and found that the ST and VG distributions yield better fits. Cont (2001) observed that the marginal distributions of fine returns exhibit properties similar to the Student’s-t distribution. We fit a Student’s t distribution to the fine return data and estimate the 100p percent daily VaR by the p th quantile of the fitted distribution. From Eq. 2.10 we get the following formula for the 100p percent VaR over m days.

$$ \widehat{VaR}(m,p)=\sqrt{m}F_{ST}^{-1}(p), $$
(2.11)

where \(F_{ST}^{-1}(p)\) is the p th quantile of the Student’s t distribution fitted to the fine return data.
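The SRTR estimator in Eq. 2.11 only needs a Student’s t fit to the fine returns; a sketch (for a fitted distribution that is roughly centered at zero, the p th quantile approximates the daily loss quantile, as in the text):

```python
import numpy as np
from scipy.stats import t

def srtr_var(fine_returns, m, p):
    """Eq. 2.11: fit a Student's t to the fine returns and scale its
    p-th quantile by sqrt(m)."""
    df, loc, scale = t.fit(fine_returns)        # maximum likelihood fit
    return np.sqrt(m) * t.ppf(p, df, loc=loc, scale=scale)
```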

2.4 VaR and MS Estimation by Bootstrap Approach

Using Lemma 2, we can replace Φ− 1(1 − p) in Eq. 2.2 by the (1 − p)th quantile of the distribution of \(\frac {X^{*}_{t, m }-X_{t,m}}{\sqrt {m}S_{m}}\). We denote the resulting proposed estimator by \(\widehat {VaR}_{boot,p}\):

$$ \widehat{VaR}_{boot,p}=-mE(X_{t})-\sqrt{mVar(X_{t})} \times D_{1-p}, $$
(2.12)

where D1−p is the (1 − p)th quantile of the distribution of \(\frac {X^{*}_{t, m }-X_{t,m}}{\sqrt {m}S_{m}}\).

The 100p percent MS can be estimated by,

$$ \begin{array}{@{}rcl@{}} \widehat{MS}_{boot,p}&=&-mE(X_{t})-\sqrt{mVar(X_{t})} \times D_{1-0.5(1+p)}\\ &=&-mE(X_{t})-\sqrt{mVar(X_{t})} \times D_{0.5(1-p)} \end{array} $$
(2.13)
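Operationally, Eqs. 2.12 and 2.13 replace the normal quantile in Eqs. 2.2 and 2.5 by a resampled one. A vectorized sketch (ours; it takes the m observed fine returns as input):

```python
import numpy as np

def var_ms_boot(fine_returns, p, D=10_000, rng=None):
    """Eqs. 2.12-2.13: bootstrap the studentized sum of Lemma 2,
    (X*_{t,m} - X_{t,m}) / (sqrt(m) S_m), and plug its quantiles into Eq. 2.2."""
    rng = rng or np.random.default_rng()
    x = np.asarray(fine_returns, dtype=float)   # the m observed fine returns
    m = len(x)
    mu, var, s = x.mean(), x.var(ddof=1), x.std(ddof=1)
    idx = rng.integers(0, m, size=(D, m))       # i.i.d. resampling with replacement
    stats = (x[idx].sum(axis=1) - x.sum()) / (np.sqrt(m) * s)
    var_boot = -m * mu - np.sqrt(m * var) * np.quantile(stats, 1 - p)         # Eq. 2.12
    ms_boot = -m * mu - np.sqrt(m * var) * np.quantile(stats, 0.5 * (1 - p))  # Eq. 2.13
    return var_boot, ms_boot
```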

3 Simulation Study

The exact mean squared errors (MSE) of the above mentioned VaR and MS estimators are difficult to obtain in general. However, we can approximate and compare the MSEs of these estimators in a Monte-Carlo (MC) simulation study for specific data generating models satisfying Assumption 1. Given a data generating model, we generate B samples, each of size n. Based on the B values of a statistic, say T1, T2,⋯ , TB, the MC estimate of the MSE is \(\frac {1}{B}{\sum }^{B}_{i=1}\left (T_{i}-\theta \right )^{2}\), where 𝜃 is the parameter of interest. We use B = 10,000.

In general the stochastic process generating the observed data is not known. However, in an MC simulation study we can compute the MC estimate assuming some test distribution or data generating process. In this simulation study we consider the following three time series models, described in Examples 1-3, for the 1-period returns {Xt}.

(I) Xt = YtPt, where \(\left \{ P_{t} \right \}_{t=1,2,...}\) is a sequence of positive valued random variables such that \(\log (P_{t})\) follows the moving average process

$$\log(P_{t})=\theta \epsilon_{t-1}+\sqrt{1-\theta^{2}}\epsilon_{t},$$

where \(\left \{ \epsilon _{t} \right \}_{t=1,2...}\) is a sequence of i.i.d. N(0,1) random variables and \(\left \{ Y_{t} \right \}_{t=1,2,...}\) is a sequence of i.i.d. random variables, independent of \(\left \{ \epsilon _{t} \right \}_{t=1,2...}\) (and hence of \(\left \{ P_{t} \right \}_{t=1,2,...}\)), such that P(Yt = 1) = 0.5 and P(Yt = − 1) = 0.5.

(II) Xt = YtPt, where \(\left \{ P_{t} \right \}_{t=1,2,...}\) is a sequence of positive valued random variables such that \(\log (P_{t})\) follows the autoregressive process

$$\log(P_{t})=\theta \log(P_{t-1})+\sqrt{1-\theta^{2}}\epsilon_{t},$$

where 𝜖t and Yt are defined in the same way as in the first model.

(III) The GARCH(1,1) model Xt = σtZt, where \({\sigma _{t}^{2}}=0.0001+0.4X_{t-1}^{2}+0.5\sigma _{t-1}^{2}\) and {Zt}t= 1,2,… are i.i.d. standard normal random variables.

For each of the time series models (I) to (III), we generate n × m values of Xt, viz. {Xt+i}i= 1,⋯ , m, t = 1,⋯ , n. Consequently, \(X_{t,m}={\sum }^{m}_{i=1} X_{t+i},\ t=1,\cdots ,n\) are the n values of the m-period return Xt, m. The unknown parameters E(Xt) and Var(Xt) in our VaR and MS estimators Eqs. 2.2, 2.5, 2.12 and 2.13 are estimated by the mean and variance of the n × m values {Xt+i}i= 1,⋯ , m, t= 1,⋯ , n. We compute the other VaR and MS estimators based on Xt, m, t = 1,⋯ , n, which represent a sample of size n of long term returns of duration m. The process is repeated B = 10,000 times for each choice of n and m. The bootstrap for Eqs. 2.12 and 2.13 uses D = 10,000 resamples.

Let MSE1, MSE2 and MSE3 denote the MC estimates of the MSE of the sample quantile (SQp), our estimator \(\widehat {VaR}_{m, p}\) and our bootstrap based estimator \(\widehat {VaR}_{boot,p}\) of Eq. 2.12, respectively. MSE4 denotes the MC estimate of the MSE of the Harrell-Davis estimator (HDp). Let MSE5 and MSE6 denote the estimated MSEs of the S-V estimator (SV3p) and the EVT estimator (EVTp) respectively. MSE7 and MSE8 are the estimated MSEs of the kernel quantile estimators using the bandwidths of Polansky and Baker (PBp) and of Chen and Tang (CTp). MSE9 denotes the MC estimate of the MSE of the SRTR estimator (Eq. 2.11). In Tables 1 and 2 we report the ratios \(\frac {MSE2}{MSE1}\) to \(\frac {MSE9}{MSE1}\) for the eight 95 percent VaR estimators (other than the sample quantile) for the different time series models and different values of n, for m = 250 and 500 respectively. In Tables 3 and 4 we report the same ratios for the 95 percent MS estimators, which are essentially 97.5 percent VaR estimators.

We observe that for n = 20 and p = 0.95, kn = 2, and in that case the EVT estimator EVTp may not be defined for samples where Y(n), m and \(Y_{(n-k_{n}), m}\) have opposite signs. Therefore, for n = 20 the MC estimate of the MSE of the EVT estimator is not defined and is returned as NaN (not a number) by the R programming environment. The following are the main observations based on the ratios of MSEs reported in Tables 1–4 (see Section 5, Appendix).

1. Performance of our proposed estimators: a. The central limit theorem (CLT) based VaR and MS estimators Eqs. 2.2 and 2.5 exhibit the least MSE among all the 95 percent VaR and MS estimators for models (I) and (II) and for all choices of n and m. For the remaining time series model, viz. model (III), the proposed estimators Eqs. 2.2 and 2.5 outperform almost all the other estimators, viz. the sample quantile, HDp, SV3p, PBp and CTp, except the SRTR estimator Eq. 2.11. For data generated by the GARCH(1,1) model, the bootstrap based VaR estimator Eq. 2.12 outperforms the other VaR estimators (except SRTR) for sample sizes less than 50 and m = 250.

    b. Overall, the MSEs of our proposed CLT based estimators Eqs. 2.2 and 2.5 and the bootstrap based estimators Eqs. 2.12 and 2.13 do not seem to fluctuate widely under different time series models. For instance, \(\frac {MSE2}{MSE1}<0.8\) and \(\frac {MSE3}{MSE1}<0.8\) for all choices of n, m and for all the time series models, as can be seen in Tables 1–4. In contrast, the MSEs of the other estimators are much higher for certain time series models and certain choices of n and m. For example, the ratio \(\frac {MSE9}{MSE1}>1\) for models (I), (II) and n ≥ 50. For the GARCH(1,1) model, \(\frac {MSE8}{MSE1}, \frac {MSE5}{MSE1} >1\) for all choices of n and m. See Tables 1–4. The proposed CLT based estimators Eqs. 2.2 and 2.5 and the bootstrap based estimators Eqs. 2.12 and 2.13 seem to perform reliably in estimating the 95 percent VaR and MS for all choices of n and m, irrespective of the time series model generating the fine return data.

2. Performance of the SRTR estimator: The SRTR based 95 percent VaR and MS estimators seem to exhibit the least MSE for fine return data generated by model (III). However, for model (III), \(F_{ST}^{-1}(p)\) in Eq. 2.11 is replaced by the p th quantile of the N(0, σ2) distribution with σ2 equal to the sample variance, as this distribution provides a better fit to the data generated by the GARCH(1,1) model (III).

    However, the performance of the SRTR estimator seems to be sensitive to the choice of the model for the fine returns. For instance, under models (I) and (II) the MSEs of the SRTR based 95 percent VaR and MS estimators are much higher than under model (III). In fact, in Tables 1 and 2, \(\frac {MSE9}{MSE1}>1\) under models (I) and (II) for all choices of n, and the ratio \(\frac {MSE9}{MSE1}\) increases as n increases. This indicates that the SRTR estimator is outperformed by the sample quantile based 95 percent VaR and MS estimators for models (I) and (II).

3. Performance of the Sfakianakis and Verginis estimator: \(\frac {MSE5}{MSE1}>1\) for almost all choices of n, m and time series models; see the fifth column in Tables 1–4. This indicates that the Sfakianakis and Verginis estimator (SV3p) is outperformed by the sample quantile for estimation of the 95 percent VaR and MS for m = 250 and 500.

4. Performance of the extreme value theory based estimator: \(\frac {MSE2}{MSE1}<\frac {MSE6}{MSE1}\) and \(\frac {MSE3}{MSE1}<\frac {MSE6}{MSE1}\) for all choices of n, m and for all time series models in Tables 1–4. The estimator EVTp in Eq. 2.9 is uniformly outperformed by our proposed estimators Eqs. 2.2 and 2.5 of the VaR and MS respectively, for all the models. EVTp is also outperformed by our proposed bootstrap based VaR and MS estimators Eqs. 2.12 and 2.13. For model (I), \(\frac {MSE6}{MSE1}>1\) for n ≥ 50, which indicates that for this model the sample quantile based VaR and MS estimators outperform the EVT estimator.

5. Performance of the Harrell-Davis estimator: \(\frac {MSE2}{MSE1}<\frac {MSE4}{MSE1}\) and \(\frac {MSE3}{MSE1}<\frac {MSE4}{MSE1}\) for all choices of n, m and for all time series models in Tables 1–4. This indicates that the estimator HDp in Eq. 2.6 is also uniformly outperformed by our proposed estimators Eqs. 2.2, 2.5, 2.12 and 2.13 of the VaR and MS, for all the models. For n ≥ 50 in the GARCH(1,1) model, \(\frac {MSE4}{MSE1}>1\) in Tables 3 and 4, which indicates that for this model the sample quantile based MS estimator outperforms the Harrell-Davis estimator.

6. Performance of the kernel quantile estimators: The kernel quantile estimator PBp seems to have the least MSE for model (III) in Table 4 for n ≥ 50. Apart from that, PBp is outperformed by our proposed estimators, as can be seen in Tables 1–4, where \(\frac {MSE2}{MSE1}<\frac {MSE7}{MSE1}\) and \(\frac {MSE3}{MSE1}<\frac {MSE7}{MSE1}\).

    CTp is uniformly outperformed by our proposed estimators for all choices of n, m and for all time series models in Tables 1–4, since \(\frac {MSE2}{MSE1}\!<\!\frac {MSE8}{MSE1}\) and \(\frac {MSE3}{MSE1}\!<\!\frac {MSE8}{MSE1}\). Further, \(\frac {MSE8}{MSE1}>1\) for the GARCH(1,1) model for all n and m. This indicates that CTp performs poorly compared to the sample quantile based VaR and MS estimators for the GARCH(1,1) model and hence is not suitable for estimating the long term VaR and MS for data generated by GARCH(1,1).

Remark 4.

1. The above observations suggest that the proposed estimators \(\widehat {VaR}_{m,p}\) and \(\widehat {VaR}_{boot,p}\) perform well for all the time series models and m ≥ 250. The EVT based estimator, the Harrell-Davis estimator, the Sfakianakis and Verginis estimator and the kernel quantile estimator CTp perform poorly in comparison to the proposed estimators \(\widehat {VaR}_{m,p}\) and \(\widehat {VaR}_{boot,p}\) for n ≥ 20 and m ≥ 250. Therefore, these estimators are not recommended for estimating the long term VaR (m ≥ 250).

2. There are two reasons why the proposed estimators of the long term VaR and MS are reliable. The m-period VaR and MS are equal to the negatives of extreme left quantiles of the m-period return distribution (see Eqs. 2.2 and 2.5). For large m, our Lemma 1 and the empirical observations in Cont (2001) suggest that the m-period return distribution is well approximated by the normal distribution. Therefore the proposed m-period VaR and MS estimators, especially Eqs. 2.2 and 2.5, which approximate the quantiles of the m-period return distribution by the corresponding quantiles of the normal distribution, work well for estimation of the m-period VaR and MS for large m (m ≥ 250). The extreme value theory (EVT) based quantile estimator Eq. 2.9 is based on the assumption that as λ ↓ 0

$$\frac{F^{-1}_{m}(1-\lambda t)}{F^{-1}_{m}(1-\lambda)}\rightarrow \frac{1}{t^{\xi}},\ \forall t>0$$

for some ξ > 0, where \(F^{-1}_{m}\) is the quantile function of the m-period return distribution (see Section 2.2). Since for large m the m-period return distribution resembles a normal distribution, the above assumption on \(F^{-1}_{m}\) does not seem appropriate for large m. Therefore the proposed estimators Eqs. 2.2 and 2.5, based on the normal approximation of the long term return distribution, seem more appropriate than the EVT estimator Eq. 2.9 for estimation of the long term VaR and MS.

Moreover, for large m the number n of observed values of the m-period return Xt, m is small. For example, for m = 250 (i.e. a one-year period) there are n = 26 observations on Xt, m, i.e. on the annual return of the Nifty 50 index; see Table 5. All the other m-period VaR and MS estimators, viz. the sample quantile and the kernel based estimators in Section 2.1, the Harrell-Davis estimator Eq. 2.6, the Sfakianakis and Verginis estimator Eq. 2.8 and the EVT estimator Eq. 2.9, are based on the n observed values of Xt, m. The larger the m, the fewer the observations n on Xt, m available for computing these estimators. Hence these estimators are more suitable for estimation of the short term VaR and MS, i.e. for small m and large n. In contrast, our proposed estimators Eqs. 2.2 and 2.5 depend on estimates of E(Xt) and Var(Xt), for which n × m observations on the 1-period return Xt are available. In our proposed methodology, observations of short term returns are used to estimate the parameters of the long term VaR and MS formulae Eqs. 2.2 and 2.5.

4 Risk Estimation and Backtesting Based on Real Data

4.1 Backtesting VaR Estimates

Santomila et al. (2018) describe the unconditional backtest of a 100p percent VaR estimation model as a procedure of comparing the observed number of times the losses exceed the estimated VaR in a given period with the number of times the actual VaR is expected to be exceeded during the same period. If the observed number of exceedances is much higher than the expected number of exceedances of the actual 100p percent VaR, the VaR estimate is considered inadequate for regulatory purposes (see Santomila et al. (2018)).

Recall that Xt+k, m denotes the return during (t + k, t + k + m] for k = 0,1,2,..., where Xt+k, m is defined in Eq. 1.3. For any natural number n ≥ 2, let

$$Z_{n}=\sum\limits_{k=0}^{n-1}I\left( -X_{t+k, m}>VaR_{m, p}\right).$$

Zn is the number of times the m-period loss exceeds the 100p percent VaR level in the n successive time intervals (t + k, t + k + m], k = 0,1,2,..., n − 1. Under Assumption 1, {Xt+k, m}k= 0,1,2,... are identically distributed. Since VaRm, p = −Qm,1−p, the expected number of exceedances is equal to

$$E\left( Z_{n}\right)=n(1-p).$$

VaRm, p is unknown. Replacing VaRm, p by an estimator \(\widehat {VaR}_{m, p}\) in Zn, we get the observed number of exceedances (we call it \(\hat {Z}_{n}\)). Therefore

$$\widehat{Z}_{n}=\sum\limits_{k=0}^{n-1}I\left( -X_{t+k, m}>\widehat{VaR}_{m, p}\right).$$

However, \(E(\widehat {Z}_{n})\) is unknown (as \(\widehat {VaR}_{m, p}\) may not be equal to the negative of the (1 − p)th quantile of Xt, m).

We consider \(\widehat {VaR}_{m, p}\) to be an adequate estimator of VaRm, p if \(P\left (-X_{t,m}>\widehat {VaR}_{m, p}\right )\le 1-p\) and inadequate if \(P\left (-X_{t,m}>\widehat {VaR}_{m, p}\right )> 1-p\). Therefore we test

$$H_{0}:\ P\left( -X_{t, m}\!>\!\widehat{VaR}_{m, p}\right) = 1-p\ \ \text{against}\ H_{1}:\ P\left( - X_{t, m}\!>\!\widehat{VaR}_{m, p}\right)\!>\!1-p.$$

Under H0, \(E\left (\widehat {Z}_{n}\right )=n(1-p)=E\left (Z_{n}\right )\), the expected number of exceedances. We reject H0 at the 100α percent level of significance if \(\widehat {Z}_{n}>n(1-p)+z_{n,\ \alpha },\) where zn, α is the 100(1 − α) percentile of the distribution of \(\widehat {Z}_{n}-n(1-p)\) under H0.

The traditional unconditional backtest of Kupiec (1995) assumes that \(\widehat {Z}_{n}\) follows a Binomial(n, 1 − p) distribution under H0; see, for instance, Kupiec (1995) and Santomila et al. (2018). We use the unconditional backtest of Kupiec (1995) to test H0 against H1 based on the observed m-period return data.
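Under H0 the exceedance count follows a Binomial(n, 1 − p) distribution, so the backtest p-value is a binomial tail probability. A sketch (the function name is ours):

```python
import numpy as np
from scipy.stats import binom

def kupiec_backtest(losses, var_estimate, p):
    """Count exceedances of the VaR estimate among the n observed m-period
    losses and return P(Z >= z_hat) under Z ~ Binomial(n, 1-p)."""
    losses = np.asarray(losses, dtype=float)
    n = len(losses)
    z_hat = int(np.sum(losses > var_estimate))    # observed exceedances
    return z_hat, binom.sf(z_hat - 1, n, 1 - p)   # P(at least z_hat exceedances)
```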

4.2 Annual VaR and MS Estimation of the Nifty 50 Index, Crude Oil And Gold

The S&P CNX Nifty 50 is a well diversified index of 50 stocks covering twenty two sectors of the Indian economy. It is used for a variety of purposes, such as benchmarking fund portfolios (see www.nseindia.com for details). There are also a number of Nifty 50 index funds, which are passively managed mutual funds mirroring the portfolio composition of the Nifty 50 index. An obvious interest of investors is to measure the risk due to fluctuations in the value of the Nifty 50 index over a given period.

Crude oil and gold prices have important impacts on financial markets and the economy of a country. Oil and gold are two of the world’s most important commodities. They have received much attention recently due to the fluctuations in their prices and the increase in their economic applications. Crude oil is one of the most commonly traded commodities, and its price exhibits high volatility in the commodity market (see Regnier (2007)). Price fluctuations of gold lead to parallel movements in the prices of other precious metals (see Sari et al. (2010)). Gold is also an investment asset, commonly known as a “safe haven” from the increasing risks in financial markets. Estimation of the market risk of crude oil and gold is important for various stakeholders and participants such as producers and exporters (see https://www.mcxindia.com/products/energy/crude-oil and https://www.mcxindia.com/products/bullion/gold).

In India, a financial year (FY) refers to the period from 1st April of a year to 31st March of the next year. There are approximately 250 trading days in a financial year (the exact number of days on which the stock markets in India remain closed may vary from year to year). Dutta and Das (2018) published data on the daily log-returns of the Nifty 50 index on the National Stock Exchange (NSE) in India for the financial years 1995-96 to 2017-18 in the form of twenty three “csv” files in Mendeley (https://data.mendeley.com/datasets/tm2kzgf3gd/). These log returns are reported as percentages, i.e. the daily log returns multiplied by 100.

The historical data on gold and crude oil daily closing prices (in US dollars per troy ounce and per barrel respectively) from FY 2001-02 to FY 2020-21 are available on Yahoo Finance (https://finance.yahoo.com/quote/CL%3DF/history?p=CL%3DF, https://finance.yahoo.com/quote/GC%3DF/history?p=GC%3DF). The daily log returns are calculated as the logarithm of the ratio of closing prices on two consecutive days. The annual log return is the sum of the daily log returns within a financial year. In Tables 6 and 7 we report the annual log returns (in percent) of crude oil and gold prices respectively for the 20 financial years FY 2001-02 to FY 2020-21.

4.2.1 Data Analysis

1. NIFTY 50 index: In Table 5 we report the annual log returns of the Nifty 50 index for the 26 financial years FY 1995-96 to FY 2020-21. Each annual log-return is the sum of the daily log returns recorded between the first and the last trading day of the FY. Dutta and Das (2018) reported data up to FY 2017-18; the annual Nifty return data for FY 2018-19 to FY 2020-21 are obtained from the Yahoo Finance website https://in.finance.yahoo.com. The data in Table 5 are positively skewed and the moment coefficient of kurtosis is close to 3 (i.e. not heavy tailed).

We use the daily log return dataset of Dutta and Das (2018) and Eqs. 2.2, 2.5, 2.12 and 2.13 to estimate the 95 percent value at risk (VaR) and median shortfall (MS) of the Nifty 50 index over a period of one FY (i.e. 250 trading days starting from the first trading day in April). Here m = 250 days. E(Xt) and Var(Xt) in Eq. 2.2 are approximated by the mean and variance of the daily returns, i.e. the negatives of the numbers reported by Dutta and Das (2018). There are more than five thousand daily log return values for the twenty three years. The average daily log return of the Nifty 50 during 1995-96 to 2017-18 is 0.039, with standard deviation 1.511.

The 95 percent VaR and MS estimates using Eqs. 2.2 and 2.5 for the annual Nifty 50 loss (i.e. m = 250) are 29.460 and 36.992 percent respectively. These numbers imply that there is a five percent chance (one year in twenty) of the Nifty 50 value, in log-scale, depreciating by more than 29.46 percent in one financial year. In case the annual loss of the Nifty 50 value exceeds 29.46 percent, the median annual loss (in log scale) beyond the VaR level is estimated to be 36.992 percent.

We also estimate the 95 percent annual VaR and MS of the NIFTY 50 index using the proposed bootstrap based estimator (\(\widehat {VaR}_{boot,p}\)), the sample quantile estimator (SQp), the Sfakianakis and Verginis estimator (SV3p), the extreme value theory based estimator (EVTp) and the SRTR estimator. These estimates are based on the 26 annual log returns of the Nifty 50 index and are reported in Table 8. Using the Kupiec test described in Section 4.1, we test whether our proposed estimators \(\widehat {VaR}_{m ,p}\) in Eq. 2.2, \(\widehat {MS}_{m ,p}\) in Eq. 2.5, \(\widehat {VaR}_{boot,p}\) in Eq. 2.12 and \(\widehat {MS}_{boot,p}\) in Eq. 2.13 and the other four estimators, viz. SQp, SV3p, EVTp and SRTR, are adequate risk measures. The p-value is equal to the probability that at least \(\widehat {Z}_{n}\) out of the 26 annual losses of the Nifty 50 index exceed the estimated 95 percent VaR or MS, assuming H0 is true.

In Table 8 we report the annual 95 percent VaR and MS estimates of the NIFTY 50 index for each of the above estimators, along with the number of exceedances \(\widehat {Z}_{n}\) of the VaR and MS estimates for each method and the corresponding p-values based on the unconditional backtest of Kupiec (1995).

All the p-values exceed the 5 percent level of significance, indicating that all the estimates of the one year 95 percent VaR and MS of the Nifty 50 index are adequate (see Table 8). However, the SRTR based and SV3p based one year MS estimates exceed the magnitude of all 26 annual losses of the Nifty 50 index from FY 1995-96 to FY 2020-21 in Table 8. Clearly, the SRTR method and SV3p overestimate the annual MS of the Nifty 50 index. This is in line with the observation of Dowd et al. (2004). The proposed estimators Eqs. 2.2, 2.5, 2.12 and 2.13 and the two estimators SQp and EVTp seem to yield similar estimates of the one year 95 percent VaR and MS of the Nifty 50 index.

2. Crude oil and gold prices:

The 95 percent annual VaR and MS estimates for the crude oil and gold prices are reported in Tables 9 and 10 respectively.

From Tables 9 and 10 we observe that the p-values of the unconditional backtest of the proposed VaR estimators \(\widehat {VaR}_{m,p}\) and \(\widehat {VaR}_{boot,p}\) (see Eqs. 2.2 and 2.12) exceed the 5 percent level of significance. The same holds for the other estimators, viz. SQp, SV3p, EVTp and SRTR. Therefore, the proposed estimators and the estimators SQp, SV3p, EVTp and SRTR provide adequate estimates of the annual VaR and MS of crude oil and gold returns. The SV3p (Tables 9 and 10) and SRTR (Table 10) based VaR estimates seem exaggerated, as there are no observed exceedances of the resulting risk estimates in the historical annual returns of crude oil and gold.

Comparing the VaR and MS estimates of the NIFTY 50, crude oil and gold annual returns, we observe that gold exhibits the least market risk over a one-year horizon, while crude oil exhibits annual market risk similar to that of the NIFTY 50 index.