Abstract
Quantiles are recognized tools for risk management and can be seen as minimizers of an \(L^1\)-loss function, but they do not define coherent risk measures in general. Expectiles, meanwhile, are minimizers of an \(L^2\)-loss function and do define coherent risk measures; they have started to be considered as good alternatives to quantiles in insurance and finance. Quantiles and expectiles belong to the wider family of \(L^p\)-quantiles. We propose here to construct kernel estimators of extreme conditional \(L^p\)-quantiles. We study their asymptotic properties in the context of conditional heavy-tailed distributions, and we show through a simulation study that taking \(p \in (1,2)\) may allow one to recover extreme conditional quantiles and expectiles accurately. Our estimators are also showcased on a real insurance data set.
1 Introduction
The quantile, also called Value-at-Risk in actuarial and financial areas, is a widespread tool for risk measurement, due to its simplicity and interpretability: if Y is a random variable with a cumulative distribution function F, the quantile at level \(\alpha \in (0,1)\) is defined as \(q(\alpha )= \inf \left\{ y \in \mathbb {R} | F(y) \ge \alpha \right\} \). As pointed out in Koenker and Bassett (1978), quantiles may also be seen as a solution of the following minimization problem:
where \(\rho _{\alpha }^{(1)}(y)=|\alpha -\mathbbm {1}_{\{ y \le 0 \}}| |y|\) is the quantile check function. However, the quantile is not subadditive in general and so is not a coherent risk measure in the sense of Artzner et al. (1999). An alternative risk measure gaining popularity is the expectile, introduced in Newey and Powell (1987). This is the solution of (1), with the new loss function \(\rho _{\alpha }^{(2)}(y)=|\alpha -\mathbbm {1}_{\{ y \le 0 \}}| y^2\) in place of \(\rho _{\alpha }^{(1)}\). Expectiles larger than the mean are coherent risk measures, and have started to be used in actuarial and financial practice (see for instance Cai and Weng 2016). A pioneering paper for the estimation of extreme expectiles in heavy-tailed settings is Daouia et al. (2018).
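As a numerical illustration of this loss-minimization viewpoint (our own sketch; the Pareto sample and tuning below are illustrative choices, not taken from the paper), one can recover a quantile and an expectile by minimizing the corresponding empirical check losses:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def check_loss_minimizer(y, alpha, p):
    """Minimize the empirical check loss t -> mean(|alpha - 1{y - t <= 0}| |y - t|^p)."""
    def loss(t):
        r = y - t
        weight = np.where(r <= 0, 1.0 - alpha, alpha)  # |alpha - 1{r <= 0}|
        return np.mean(weight * np.abs(r) ** p)
    res = minimize_scalar(loss, bounds=(y.min(), y.max()), method="bounded")
    return res.x

rng = np.random.default_rng(0)
y = rng.pareto(2.0, size=20_000) + 1.0      # heavy-tailed sample with P(Y > t) = t^{-2}
q_hat = check_loss_minimizer(y, 0.9, p=1)   # close to the empirical 0.9-quantile
e_hat = check_loss_minimizer(y, 0.9, p=2)   # empirical 0.9-expectile
```

For \(p=1\) the empirical loss is piecewise linear and convex, so the optimizer returns a point of the minimizing interval, essentially the empirical quantile; \(p=2\) yields the empirical expectile.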
Quantiles and expectiles may be generalized by considering the family of \(L^p\)-quantiles. Introduced in Chen (1996), this class of risk measures is defined, for all \(p \ge 1\), by
where \(\rho _{\alpha }^{(p)}(y)=|\alpha -\mathbbm {1}_{\{ y \le 0 \}}| |y|^p\) is the \(L^p\)-quantile loss function; the case \(p=1\) leads to the quantile and \(p=2\) gives the expectile. Note that, for \(p>1\), using the formulation (2) and through the subtraction of the (at first sight unimportant) term \(\rho _{\alpha }^{(p)}(Y)\), it is a straightforward consequence of the mean value theorem applied to the function \(\rho _{\alpha }^{(p)}\) that the \(L^p\)-quantile \(q^{(p)}(\alpha )\) is well defined as soon as \(\mathbb {E}(|Y|^{p-1})<\infty \). While the expectile is the only coherent \(L^p\)-quantile (see Bellini et al. 2014), Daouia et al. (2019) showed that for extreme levels of quantiles or expectiles (\(\alpha \rightarrow 1\)), it may be better to estimate \(L^p\)-quantiles first (where typically p is between 1 and 2) and exploit an asymptotic proportionality relationship to estimate quantiles or expectiles. An overview of the potential applications of this kind of statistical assessment of extreme risk may for instance be found in Embrechts et al. (1997).
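For the reader's convenience, the minimization problem (2) referred to above can be written out as follows; this is our transcription of the standard formulation (Chen 1996; Daouia et al. 2019), with the subtracted term \(\rho _{\alpha }^{(p)}(Y)\) making the expectation finite whenever \(\mathbb {E}(|Y|^{p-1})<\infty \):

```latex
q^{(p)}(\alpha) \in \mathop{\arg\min}_{q \in \mathbb{R}}
  \mathbb{E}\left[ \rho_{\alpha}^{(p)}(Y - q) - \rho_{\alpha}^{(p)}(Y) \right].
```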
The contribution of this work is to propose a methodology to estimate extreme \(L^p\)-quantiles of \(Y|\mathbf {X}=\mathbf {x}\), where the random covariate vector \(\mathbf {X} \in \mathbb {R}^d\) is recorded alongside Y. In this context, the case \(p=1\) (quantile) has been considered in Daouia et al. (2011) and Daouia et al. (2013), and the case \(p=2\) (expectile) has recently been studied in Girard et al. (2021). For the general case \(p \ge 1\), only Usseglio-Carleve (2018) proposes an estimation procedure under the strong assumption that the vector \((\mathbf {X},Y)\) is elliptically distributed. The present paper avoids this modeling assumption by constructing a kernel estimator.
The paper is organized as follows. Section 2 introduces an estimator of conditional \(L^p\)-quantiles. Section 3 gives the asymptotic properties of the estimator previously introduced, at extreme levels. Finally, Sect. 4 proposes a simulation study in order to assess the accuracy of our estimator which is then showcased on a real insurance data set in Sect. 5. Proofs are postponed to the Appendix.
2 \(L^p\)-quantile Kernel Regression
Let \((\mathbf {X}_i,Y_i)\), \(i=1,...,n\) be independent realizations of a random vector \((\mathbf {X},Y) \in \mathbb {R}^d \times \mathbb {R}\). For the sake of simplicity we assume that \(Y\ge 0\) with probability 1. We denote by g the density function of \(\mathbf {X}\) and let, in the sequel, \(\mathbf {x}\) be a fixed point in \(\mathbb {R}^d\) such that \(g(\mathbf {x})>0\). We denote by \(\bar{F}^{(1)}(y|\mathbf {x})= \mathbb {P} \left( Y>y |\mathbf {X}=\mathbf {x} \right) \) the conditional survival function of Y given \(\mathbf {X}=\mathbf {x}\) and assume that this survival function is continuous and regularly varying with index \(-1/\gamma (\mathbf {x})\):
Such a distribution belongs to the Fréchet maximum domain of attraction (de Haan and Ferreira 2006). Note that for any \(k<1/\gamma (\mathbf {x})\), \(\mathbb {E}\left[ Y^k |\mathbf {X}=\mathbf {x} \right] < \infty \). Since the definition of \(L^p\)-quantiles in (2) requires \(\mathbb {E}\left[ |Y|^{p-1} |\mathbf {X}=\mathbf {x} \right] < \infty \), our minimal assumption will be that \(p-1<1/\gamma (\mathbf {x})\). From Eq. (2), \(L^p\)-quantiles of level \(\alpha \in (0,1)\) of Y given \(\mathbf {X}=\mathbf {x}\) may also be seen as the solution of the following equation:
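For concreteness, the first-order condition characterizing \(q=q^{(p)}(\alpha |\mathbf {x})\) for \(p>1\) can be written in the following ratio form (our transcription of the standard characterization, consistent with the survival function \(\bar{F}^{(p)}\) discussed next):

```latex
\frac{\mathbb{E}\left[ (Y - q)^{p-1} \mathbb{1}_{\{Y > q\}} \,\middle|\, \mathbf{X} = \mathbf{x} \right]}
     {\mathbb{E}\left[ |Y - q|^{p-1} \,\middle|\, \mathbf{X} = \mathbf{x} \right]} = 1 - \alpha .
```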
In other terms, as noticed in Jones (1994), (conditional) \(L^p\)-quantiles can be equivalently defined as quantiles
of the distribution associated with the survival function
where, for all \(k \ge 0,\)
Obviously, if \(p=1\), we get the survival function introduced above. The case \(p=2\) leads to the function introduced in Jones (1994) and used in Girard et al. (2021). To estimate \(\bar{F}^{(p)}(y|\mathbf {x})\), we let K be a probability density function on \(\mathbb {R}^d\) and we introduce the kernel estimators
Note that \(\hat{m}_n^{(0)}(0|\mathbf {x})\) is the kernel density estimator of \(g(\mathbf {x})\), and \(\hat{m}_n^{(1)}(0|\mathbf {x})/\hat{m}_n^{(0)}(0|\mathbf {x})\) is the standard kernel regression estimator (since the \(Y_i\) are nonnegative). The kernel estimators of \(\bar{F}^{(p)}(y|\mathbf {x})\) and \(q^{(p)}(\alpha |\mathbf {x})\) are then easily deduced:
The case \(p=1\) gives the kernel quantile estimator introduced in Daouia et al. (2013), while \(p=2\) leads to the conditional expectile estimator of Girard et al. (2021). We study here the asymptotic properties of \(\hat{q}_n^{(p)}(\alpha |\mathbf {x})\) for an arbitrary \(p\ge 1\), when \(\alpha =\alpha _n\rightarrow 1\).
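To fix ideas, the following minimal implementation (our own sketch, for \(d=1\), \(p>1\) and an Epanechnikov kernel; the paper's estimator (4), built from the kernel moments \(\hat{m}_n^{(k)}\), may differ in details) estimates \(\bar{F}^{(p)}(y|x)\) by a ratio of kernel-weighted moments and inverts it by root-finding:

```python
import numpy as np
from scipy.optimize import brentq, minimize_scalar

def epanechnikov(t):
    return np.where(np.abs(t) < 1.0, 0.75 * (1.0 - t**2), 0.0)

def Fbar_p_hat(y, x0, X, Y, h, p):
    """Kernel-weighted ratio of moments, one natural analogue of Fbar^(p)(y|x), p > 1."""
    w = epanechnikov((x0 - X) / h)
    num = np.sum(w * np.clip(Y - y, 0.0, None) ** (p - 1.0))  # E[(Y-y)_+^{p-1} | x]
    den = np.sum(w * np.abs(Y - y) ** (p - 1.0))              # E[|Y-y|^{p-1} | x]
    return num / den

def q_p_hat(alpha, x0, X, Y, h, p):
    """Conditional L^p-quantile estimate: root of Fbar_p_hat(. | x) = 1 - alpha."""
    return brentq(lambda y: Fbar_p_hat(y, x0, X, Y, h, p) - (1.0 - alpha),
                  Y.min(), Y.max())

rng = np.random.default_rng(1)
n = 2_000
X = rng.uniform(0.0, 1.0, n)
Y = rng.exponential(1.0, n)                     # Y independent of X, for checking only
e_hat = q_p_hat(0.9, 0.5, X, Y, h=0.3, p=2.0)   # conditional 0.9-expectile estimate
```

For \(p=2\), the root of this equation coincides with the minimizer of the kernel-weighted asymmetric squared loss, which provides a useful cross-check of an implementation.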
3 Main Results
We first make a standard assumption on the kernel. We fix a norm \(||\cdot ||\) on \(\mathbb {R}^d\).
\((\mathcal {K})\) The density function K is bounded and its support S is contained in the unit ball.
To be able to analyze extreme conditional \(L^p\)-quantiles in a reasonably simple way, we make a standard second-order regular variation assumption (for a survey of those conditions, see Sect. 2 in de Haan and Ferreira (2006)).
\(\mathcal {C}_2 \left( \gamma (\mathbf {x}), \rho (\mathbf {x}), A(.|\mathbf {x}) \right) \) There exist \(\gamma (\mathbf {x})>0\), \(\rho (\mathbf {x}) \le 0\) and a positive or negative function \(A(\cdot |\mathbf {x})\) converging to 0 such that
Our last assumption is a local Lipschitz condition which may be found for instance in Daouia et al. (2013); El Methni et al. (2014). We denote by \(B(\mathbf {x},r)\) the ball with center \(\mathbf {x}\) and radius r.
\((\mathcal {L})\) We have \(g(\mathbf {x})>0\) and there exist \(c, r>0\) such that
To be able to control the local oscillations of \((\mathbf {x},y)\mapsto \bar{F}^{(1)}(y|\mathbf {x})\), we let, for any nonnegative \(y_n \rightarrow \infty \),
The quantity \(\omega _{h_n}^{(1)}(y_n|\mathbf {x})\), discussed for instance in Girard et al. (2021), controls the oscillation of the conditional survival function with respect to \(\mathbf {x}\) in its right tail, while \(\omega _{h_n}^{(2)}(y_n|\mathbf {x})\) and \(\omega _{h_n}^{(3)}(y_n|\mathbf {x})\) are introduced to be able to deal with the case \(p\notin \{1,2\}\) specifically. Let us highlight that \(\omega _{h_n}^{(3)}(y_n|\mathbf {x})\) is again geared toward controlling an oscillation of the right tail of the conditional distribution; however, \(\omega _{h_n}^{(2)}(y_n|\mathbf {x})\) focuses on the oscillation of the center of the conditional distribution with respect to \(\mathbf {x}\). For \(p>1\), the introduction of a quantity such as \(\omega _{h_n}^{(2)}(y_n|\mathbf {x})\) is in some sense natural, since we will have to deal with the local oscillation of the conditional moment \(m^{(p-1)}(y|\mathbf {x})\), appearing in the denominator of \(\bar{F}^{(p)}(y|\mathbf {x})\), and this conditional moment indeed depends on the whole of the conditional distribution rather than merely on its right tail. Typically \(\omega _{h_n}^{(1)}(y_n|\mathbf {x})=O(h_n)\), \(\omega _{h_n}^{(2)}(y_n|\mathbf {x})=O(h_n)\) and \(\omega _{h_n}^{(3)}(y_n|\mathbf {x})=o(1)\) under reasonable assumptions; we give examples below.
Remark 1
Assume that \(Y|\mathbf {X}=\mathbf {x}\) has a Pareto distribution with tail index \(\gamma (\mathbf {x})>0\):
If \(\gamma \) is locally Lipschitz continuous, we clearly have \(\omega _{h_n}^{(1)}(y_n|\mathbf {x})=O(h_n)\). Furthermore, for any \(y\ge 1\), the mean value theorem yields
(Here and below \(\vee \) denotes the maximum operator.) Under this same local Lipschitz assumption, one then finds \(\omega _{h_n}^{(2)}(y_n|\mathbf {x})=O(h_n)\) as well. Finally, for any \(y,y'>1\),
by the mean value theorem again. This inequality yields \(\omega _{h_n}^{(3)}(y_n|\mathbf {x}) =o(1)\).
The same arguments, and asymptotic bounds on \(\omega _{h_n}^{(1)}(y_n|\mathbf {x})\), \(\omega _{h_n}^{(2)}(y_n|\mathbf {x})\) and \(\omega _{h_n}^{(3)}(y_n|\mathbf {x})\), apply to the conditional Fréchet model
Analogous results are easily obtained for the conditional Burr model
when \(\rho <0\) is assumed to be locally Lipschitz continuous, and the conditional mixture Pareto model
when \(\rho <0\) and \(c\in (0,1)\) are assumed to be locally Lipschitz continuous. \(\square \)
3.1 Intermediate \(L^p\)-quantile Regression
In this paragraph, we assume that \(\sigma _n^{-2}=n h_n^d (1-\alpha _n) \rightarrow \infty \). Such an assumption means that the \(L^p\)-quantile level \(\alpha _n\) tends to 1 slowly (by extreme value standards), hence the denominations intermediate sequence and intermediate \(L^p\)-quantiles. This assumption is widespread in the literature on risk measure regression: see, among others, Daouia et al. (2011, 2013); El Methni et al. (2014); Girard et al. (2021). Throughout, we let \(|| K ||_2^2= \int _S K(\mathbf {u})^2 d\mathbf {u}\) be the squared \(L^2\)-norm of K, \(\Psi (\cdot )\) denote the digamma function and \(IB(t,x,y)=\int _0^t u^{x-1}(1-u)^{y-1}du\) be the incomplete Beta function. Note that \(IB(1,x,y)=B(x,y)\) is the standard Beta function.
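These special functions are readily available in scientific libraries; as a sketch (using SciPy, where `betainc` is the regularized incomplete Beta function, so the unregularized IB used here requires multiplying by \(B(x,y)\)):

```python
from scipy.special import beta, betainc, digamma

def IB(t, x, y):
    """Incomplete Beta function IB(t, x, y) = int_0^t u^{x-1} (1-u)^{y-1} du."""
    return betainc(x, y, t) * beta(x, y)  # betainc is regularized, hence the product

# IB(1, x, y) recovers the standard Beta function B(x, y), here B(2, 3) = 1/12
print(IB(1.0, 2.0, 3.0), beta(2.0, 3.0))
psi1 = digamma(1.0)  # the digamma function Psi(.); Psi(1) = -Euler-Mascheroni constant
```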
We now give our first result on the joint asymptotic normality of a finite number J of empirical conditional quantiles with an empirical conditional \(L^p\)-quantile (\(p>1\)).
Theorem 1
Assume that \((\mathcal {K})\), \((\mathcal {L})\) and \(\mathcal {C}_2 \left( \gamma (\mathbf {x}), \rho (\mathbf {x}), A(.|\mathbf {x}) \right) \) hold. Let \(\alpha _n \rightarrow 1\), \(h_n \rightarrow 0\) and \(a_n=1-\tau (1-\alpha _n) (1+o(1))\), where \(\tau >0\). Assume further that \(\sigma _n^{-2}=n h_n^d (1-\alpha _n) \rightarrow \infty \), \(n h_n^{d+2} (1-\alpha _n) \rightarrow 0\), \(\sigma _n^{-1} A \left( (1-\alpha _n)^{-1}|\mathbf {x} \right) =O(1)\), \(\omega _{h_n}^{(3)}( q^{(1)}(\alpha _n|\mathbf {x})|\mathbf {x}) \rightarrow 0\) and there exists \(\delta \in (0,1)\) such that
where \(\theta =\left( \tau \gamma (\mathbf {x})/B\left( p,\gamma (\mathbf {x})^{-1}-p+1 \right) \right) ^{-\gamma (\mathbf {x})}\). Let further \(\alpha _{n,j}=1-\tau _j(1-\alpha _n)\), for \(0<\tau _1<\tau _2<\, ... \,<\tau _J \le 1\) such that
Then, for all \(p \in (1,\gamma (\mathbf {x})^{-1}/2+1)\), one has
where \({\boldsymbol{\Sigma }}(\mathbf {x})\) is the symmetric matrix having entries
Theorem 1, which will be useful to introduce estimators of the tail index \(\gamma (\mathbf {x})\) as part of our extrapolation methodology, generalizes and adapts to the conditional setup several results already found in the literature: see Theorem 1 in Daouia et al. (2013), Theorem 1 in Daouia et al. (2019) and Theorem 3 in Daouia et al. (2020b). Note however that, although they are in some sense related, Theorem 1 does not imply Theorem 1 of Girard et al. (2021), because the latter is stated under weaker regularity conditions warranted by the specific context \(p=2\) of extreme conditional expectile estimation. On the technical side, assumptions (5) and (6) ensure that the bias introduced by smoothing in the \(\mathbf {x}\) direction is negligible compared to the standard deviation \(\sigma _n\) of the estimator. The aim of the next paragraph is now to extrapolate our intermediate estimators to properly extreme levels.
3.2 Extreme \(L^p\)-quantile Regression
We consider here a level \(\beta _n \rightarrow 1\) such that \(n h_n^d (1-\beta _n) \rightarrow c <\infty \). The estimators previously introduced no longer work at such an extreme level. In order to overcome this problem, we first recall a result of Daouia et al. (2019) (see also Lemma 5 below):
In the sequel, we shall use the notation \(g_p(\gamma )=\gamma /B \left( p,\gamma ^{-1}-p+1 \right) \). A first consequence of this result is that the \(L^p\)-quantile function is regularly varying, i.e.,
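In this notation, the asymptotic proportionality between extreme conditional \(L^p\)-quantiles and quantiles can be recorded as follows (our restatement of the result of Daouia et al. 2019 in the conditional setting; compare with the constant \(\theta \) appearing in Theorem 1):

```latex
\frac{q^{(p)}(\alpha|\mathbf{x})}{q^{(1)}(\alpha|\mathbf{x})}
  \longrightarrow \left[ g_p(\gamma(\mathbf{x})) \right]^{-\gamma(\mathbf{x})}
  = \left( \frac{\gamma(\mathbf{x})}{B\left(p,\, \gamma(\mathbf{x})^{-1} - p + 1\right)} \right)^{-\gamma(\mathbf{x})}
  \quad \text{as } \alpha \to 1 .
```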
This suggests then that, by considering an intermediate sequence \((\alpha _n)\), our conditional extreme \(L^p\)-quantile may be approximated (and estimated) as follows:
Here, \(\hat{q}_n^{(p)}( \alpha _n |\mathbf {x})\) is the kernel estimator introduced in Eq. (4), and \(\hat{\gamma }_{\alpha _n}(\mathbf {x})\) is a consistent estimator of the conditional tail index \(\gamma (\mathbf {x})\). This is a class of Weissman-type estimators (see Weissman 1978) of which we give the asymptotic properties.
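The extrapolation step itself is elementary to code; the sketch below (our notation, with all estimates passed in as arguments) rescales an intermediate-level estimate to the extreme level:

```python
def weissman_extrapolate(q_alpha, gamma_hat, alpha_n, beta_n):
    """Weissman-type extrapolation of an intermediate estimate q_alpha of
    q^(p)(alpha_n | x) to the extreme level beta_n, given a tail index estimate."""
    return ((1.0 - alpha_n) / (1.0 - beta_n)) ** gamma_hat * q_alpha

# e.g. from level 0.99 to 0.999 with gamma_hat = 0.5: factor (0.01/0.001)^0.5 = 10^0.5
q_extreme = weissman_extrapolate(10.0, 0.5, 0.99, 0.999)
```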
Theorem 2
Assume that \((\mathcal {K})\), \((\mathcal {L})\) and \(\mathcal {C}_2(\gamma (\mathbf {x}),\rho (\mathbf {x}),A(\cdot |\mathbf {x}))\) hold with \(\rho (\mathbf {x})<0\). Let \(\alpha _n, \beta _n\rightarrow 1\), \(h_n\rightarrow 0\) be such that \( \sigma _n^{-2}=n h_n^d (1-\alpha _n) \rightarrow \infty \) and \(nh_n^d (1-\beta _n) \rightarrow c<\infty \). Assume further that \(n h_n^{d+2} (1-\alpha _n) \rightarrow 0\), \(\omega _{h_n}^{(3)}( q^{(1)}(\alpha _n|\mathbf {x})|\mathbf {x}) \rightarrow 0\) and
-
(i)
\(\sigma _n^{-1} A \left( (1-\alpha _n)^{-1}|\mathbf {x} \right) =O(1)\), \(\sigma _n^{-1} (1-\alpha _n)=O(1)\) and
\(\sigma _n^{-1} \mathbb {E} \left[ Y \mathbbm {1}_{\{ 0< Y < q^{(1)}(\alpha _n|\mathbf {x}) \}} | \mathbf {x} \right] q^{(1)}(\alpha _n|\mathbf {x})^{-1}=O(1)\),
-
(ii)
For some \(\delta \in (0,1)\), \(\sigma _n^{-1} \omega _{h_n}^{(1)}((1-\delta ) [g_p(\gamma (\mathbf {x}))]^{-\gamma (\mathbf {x})} q^{(1)}(\alpha _n|\mathbf {x})|\mathbf {x}) \log (1-\alpha _n) \rightarrow 0\) and \(\sigma _n^{-1} \omega _{h_n}^{(2)}((1+\delta ) q^{(1)}(\alpha _n|\mathbf {x})|\mathbf {x}) \rightarrow 0\),
-
(iii)
\(\sigma _n^{-1}/ \log \left( (1-\alpha _n)/(1-\beta _n) \right) \rightarrow \infty \).
Take \(p \in (1, \gamma (\mathbf {x})^{-1}/2+1)\). If in addition \( \sigma _n^{-1}( \hat{\gamma }_{\alpha _n}(\mathbf {x})-\gamma (\mathbf {x})) {\mathop { \longrightarrow }\limits ^{d}}\Gamma , \) then
We notice, as is classical in the analysis of heavy tails, that the asymptotic distribution of the extrapolated estimator \(\tilde{q}^{(p)}_{n,\alpha _n}( \beta _n |\mathbf {x} )\) is exactly that of the purely empirical estimator \(\hat{\gamma }_{\alpha _n}(\mathbf {x})\) with a slightly slower rate of convergence. Technically speaking, assumption (i) controls the bias due to the asymptotic approximation (9), while assumption (ii) is used to deal with the bias due to smoothing.
Our aim is now to propose some estimators of \(\gamma (\mathbf {x})\) solely based on intermediate \(L^p\)-quantiles, in order to carry out the extrapolation step.
3.3 \(L^p\)-quantile-Based Estimation of the Conditional Tail Index
The aim of this paragraph is to discuss the estimation of the conditional tail index \(\gamma (\mathbf {x})\). A local Pickands estimator is studied in Daouia et al. (2013, 2011). This estimator however has a large variance, which is why Daouia et al. (2011) propose a simplified, conditional, and local version of the Hill estimator:
They also mention that taking \(J=9\) is an optimal choice, leading to an asymptotic variance close to \(1.25 || K ||_2^2 \gamma (\mathbf {x})^2/g(\mathbf {x})\). Recently, Daouia et al. (2020a) and Girard et al. (2021) have shown that replacing the quantile by the expectile in tail index estimators can lead to a significant variance reduction. Our idea here is to propose an estimator based on \(L^p\)-quantiles rather than quantiles. In this context, we follow the approach of Girard et al. (2019) and exploit the asymptotic relationship (9) by introducing the following estimator, valid for all \(1<p<\gamma (\mathbf {x})^{-1}+1\):
This class of estimators is introduced in Girard et al. (2019) in an unconditional setting, and the (explicit) estimator \(\hat{\gamma }_{\alpha _n}^{(2)}(\mathbf {x})\) is introduced in Girard et al. (2021). Using the results previously obtained, we can give the asymptotic distribution of \(\hat{\gamma }_{\alpha _n}^{(p)}(\mathbf {x})\) for all \(1<p<\gamma (\mathbf {x})^{-1}/2+1\).
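Although the exact definition of \(\hat{\gamma }_{\alpha _n}^{(p)}(\mathbf {x})\) is not reproduced here, its driving idea, inverting the map \(\gamma \mapsto g_p(\gamma )\) on an empirical ratio, can be sketched as follows (the bracketing interval reflects the domain \(\gamma <1/(p-1)\); the numerical tolerances are our implementation choices):

```python
from scipy.optimize import brentq
from scipy.special import beta as B

def g(p, gamma):
    """g_p(gamma) = gamma / B(p, 1/gamma - p + 1), for 0 < gamma < 1/(p-1)."""
    return gamma / B(p, 1.0 / gamma - p + 1.0)

def gamma_from_ratio(p, ratio, eps=1e-6):
    """Recover gamma from an (estimated) value of g_p(gamma) by root-finding."""
    return brentq(lambda gam: g(p, gam) - ratio, eps, 1.0 / (p - 1.0) - eps)

# sanity checks: g_2(gamma) = 1/gamma - 1, and inversion recovers gamma
r = g(1.7, 0.3)
gamma_rec = gamma_from_ratio(1.7, r)
```

Since \(g_p(\gamma )=\Gamma (\gamma ^{-1})/(\Gamma (p)\Gamma (\gamma ^{-1}-p+1))\) is strictly decreasing in \(\gamma \) on this interval, the root is unique.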
Theorem 3
Assume that \((\mathcal {K})\), \((\mathcal {L})\) and \(\mathcal {C}_2(\gamma (\mathbf {x}),\rho (\mathbf {x}),A(\cdot |\mathbf {x}))\) hold with \(\gamma (\mathbf {x})<1\). Let \(\alpha _n \rightarrow 1\) and \(h_n \rightarrow 0\). Assume further that \(\sigma _n^{-2} = n h_n^d (1-\alpha _n) \rightarrow \infty \), \(n h_n^{d+2} (1-\alpha _n) \rightarrow 0\), \(\omega _{h_n}^{(3)}( q^{(1)}(\alpha _n|\mathbf {x}) |\mathbf {x}) \rightarrow 0\) and
-
(i)
\(\sigma _n^{-1} A \left( (1-\alpha _n)^{-1} |\mathbf {x} \right) \rightarrow 0 \),
-
(ii)
\(\sigma _n^{-1} q^{(1)}(\alpha _n |\mathbf {x})^{-1} \rightarrow \lambda \in \mathbb {R} \),
-
(iii)
For some \(\delta \in (0,1)\), \(\sigma _n^{-1} \omega _{h_n}^{(1)}((1-\delta ) \left( g_p(\gamma (\mathbf {x}))^{-\gamma (\mathbf {x})} q^{(1)}(\alpha _n|\mathbf {x}) \right) |\mathbf {x}) \log (1-\alpha _n) \rightarrow 0\) and \(\sigma _n^{-1} \omega _{h_n}^{(2)}((1+\delta ) \left( q^{(1)}(\alpha _n|\mathbf {x}) \right) |\mathbf {x}) \rightarrow 0\).
Then, for all \(p \in (1,\gamma (\mathbf {x})^{-1}/2+1)\), one has
where \({\boldsymbol{\Theta }}\) is a bivariate Gaussian distribution with mean vector \(\left( b_p(\mathbf {x}),0 \right) \) and covariance matrix \(|| K ||_2^2 \gamma (\mathbf {x})^2 g(\mathbf {x})^{-1} {\boldsymbol{\Omega }}(\mathbf {x})\) such that
Let us remark here that although Theorem 3 can be seen as a version of Theorem 4 of Girard et al. (2021), the latter is stated under weaker regularity assumptions and applies to further examples of estimators developed specifically in the conditional expectile setup.
Note that condition \(\gamma (\mathbf {x})<1\) entails \(\mathbb {E}[Y|\mathbf {X}=\mathbf {x}]<\infty \) and leads to a simple expression of the bias term \(b_p(\mathbf {x})\). A result dropping this assumption is available in the unconditional setting in Girard et al. (2019); here, our motivation for this condition is that we shall use extreme regression \(L^p\)-quantiles as a way to estimate extreme regression expectiles, for the existence of which a natural condition is that \(\mathbb {E}[|Y||\mathbf {X}=\mathbf {x}]<\infty \). The bias term \(b_p(\mathbf {x})\) is related to \(\gamma (\mathbf {x})\), \(q^{(1)}(\alpha _n|\mathbf {x})\) and \(\mathbb {E}[Y|\mathbf {X}=\mathbf {x}]\). All these quantities may be easily estimated (the latter two by kernel regression estimators) to construct a bias-reduced conditional tail index estimator as follows:
Under the conditions of Theorem 3, it is clear that \( \sigma _n^{-1} ( \tilde{\gamma }_{\alpha _n}^{(p)}(\mathbf {x})- \gamma (\mathbf {x}) ) \overset{d}{\longrightarrow } \mathcal {N} \left( 0, \Omega _{11}(\mathbf {x}) \right) \) where \(\Omega _{11}(\mathbf {x})\) is given in Eq. (14). This bias reduction improves significantly the numerical results, and is used in the finite-sample study below.
Even though \(L^p\)-quantiles with \(1<p<2\) are more widely estimable than expectiles and take the whole tail information into account, they are neither easy to interpret nor coherent as risk measures. Recent work in Daouia et al. (2019) has shown that extreme \(L^p\)-quantiles can be used as vehicles for extreme quantile and expectile estimation; see also Gardes et al. (2020) for an analogous study of the estimation of (a compromise between) Median Shortfall and Conditional Tail Expectation at extreme levels, using tail \(L^p\)-medians. Our focus in the following finite-sample study is to analyze the potential of extreme regression \(L^p\)-quantiles for the estimation of extreme regression quantiles and expectiles.
4 Simulation Study
We consider here a one-dimensional covariate (\(d=1\)), uniformly distributed on [0, 1], and a Burr-type distribution for Y given \(X=x\):
Such a distribution fulfills Assumption \(\mathcal {C}_2(\gamma (x),\rho (x),A(\cdot |x))\) with auxiliary function \(A(y|x)=\gamma (x) y^{\rho (x)}\). We simulate \(N=500\) independent samples, each consisting of \(n=1{,}000\) independent replications of (X, Y), and propose to estimate the conditional quantiles and expectiles of level \(\beta _n=1-1/n=0.999\) using our extreme regression \(L^p\)-quantile estimators. Note that the quantiles may be calculated explicitly:
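For reproducibility, a Burr sampler by inverse transform can be sketched as follows; the parameterization \(\bar{F}^{(1)}(y|x) = (1 + y^{-\rho (x)/\gamma (x)})^{1/\rho (x)}\), \(y>0\), is our assumption (it is the usual one yielding the auxiliary function \(A(y|x)=\gamma (x) y^{\rho (x)}\)):

```python
import numpy as np

def burr_quantile(u, gamma, rho):
    """Quantile function of the Burr law with survival (1 + y^{-rho/gamma})^{1/rho}:
    solving (1 + q^{-rho/gamma})^{1/rho} = 1 - u gives the closed form below."""
    return ((1.0 - u) ** rho - 1.0) ** (-gamma / rho)

def simulate_burr(n, gamma, rho, rng):
    """Inverse-transform sampling from the Burr distribution above."""
    return burr_quantile(rng.uniform(size=n), gamma, rho)

rng = np.random.default_rng(2)
sample = simulate_burr(100_000, 0.5, -1.0, rng)
```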
Expectiles have to be approximated numerically, since they do not have a simple closed form. In order to estimate these two quantities, we propose to compare different approaches (called either direct or indirect):
-
(i)
Use the conditional Weissman-type estimators, respectively, based on empirical quantiles and the estimator \(\hat{\gamma }_{\alpha _n}^{(H)}(x)\) (direct quantile estimator) and on empirical expectiles and \(\tilde{\gamma }_{\alpha _n}^{(2)}(x)\) (direct expectile estimator), i.e.
$$ \left( \frac{1-\alpha _n}{1-\beta _n} \right) ^{\hat{\gamma }_{\alpha _n}^{(H)}(x)} \hat{q}^{(1)}_n( \alpha _n |x) \text { , } \left( \frac{1-\alpha _n}{1-\beta _n} \right) ^{\tilde{\gamma }_{\alpha _n}^{(2)}(x)} \hat{q}^{(2)}_n( \alpha _n |x). $$ -
(ii)
Indirect quantile estimator: estimate first the conditional \(L^p\)-quantile using estimator (4), and exploit asymptotic relationship (9) to recover the extreme conditional quantile,
$$ \left( \frac{1-\alpha _n}{1-\beta _n} \right) ^{\tilde{\gamma }_{\alpha _n}^{(p)}(x)} \hat{q}^{(p)}_n( \alpha _n |x) \left( \frac{\tilde{\gamma }_{\alpha _n}^{(p)}(x)}{B \left( p,\tilde{\gamma }_{\alpha _n}^{(p)}(x)^{-1}-p+1 \right) } \right) ^{\tilde{\gamma }_{\alpha _n}^{(p)}(x)}. $$ -
(iii)
Indirect expectile estimator: use Eq. (9) to get a connection between \(L^p\)-quantile and quantile, and quantile and expectile, resulting in the extreme conditional expectile estimator
$$ \left( \frac{1-\alpha _n}{1-\beta _n} \right) ^{\tilde{\gamma }_{\alpha _n}^{(p)}(x)} \hat{q}^{(p)}_n( \alpha _n |x) \left( \frac{B \left( 2, \tilde{\gamma }_{\alpha _n}^{(p)}(x)^{-1}-1 \right) }{B \left( p, \tilde{\gamma }_{\alpha _n}^{(p)}(x)^{-1}-p+1 \right) } \right) ^{\tilde{\gamma }_{\alpha _n}^{(p)}(x)}. $$
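The displayed estimators translate directly into code; the sketch below (a transcription of the formulas above, with the intermediate estimates \(\hat{q}^{(p)}_n(\alpha _n|x)\) and \(\tilde{\gamma }_{\alpha _n}^{(p)}(x)\) passed in as arguments) implements the indirect quantile and expectile estimators of (ii) and (iii):

```python
from scipy.special import beta as B

def weissman_factor(alpha_n, beta_n, gamma_hat):
    return ((1.0 - alpha_n) / (1.0 - beta_n)) ** gamma_hat

def indirect_quantile(q_p, gamma_hat, p, alpha_n, beta_n):
    """Extreme conditional quantile from an intermediate L^p-quantile estimate q_p."""
    c = (gamma_hat / B(p, 1.0 / gamma_hat - p + 1.0)) ** gamma_hat
    return weissman_factor(alpha_n, beta_n, gamma_hat) * q_p * c

def indirect_expectile(q_p, gamma_hat, p, alpha_n, beta_n):
    """Extreme conditional expectile via the L^p-quantile/quantile/expectile links."""
    c = (B(2.0, 1.0 / gamma_hat - 1.0) / B(p, 1.0 / gamma_hat - p + 1.0)) ** gamma_hat
    return weissman_factor(alpha_n, beta_n, gamma_hat) * q_p * c
```

Note that for \(p=1\) the correction constant in `indirect_quantile` equals 1 (since \(B(1,\gamma ^{-1})=\gamma \)), so the estimator reduces to the direct Weissman form; the same happens for \(p=2\) in `indirect_expectile`.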
The choice of p is discussed in Girard et al. (2019) using the MSE of (the unconditional version of) \(\tilde{\gamma }_{\alpha _n}^{(p)}(x)\) as a criterion. Cross-validation choices of the bandwidth \(h_n\) and intermediate quantile level \(\alpha _n\), meanwhile, are discussed in Daouia et al. (2013); Girard et al. (2021). For the sake of simplicity, we choose here the common parameters \(p=1.7\) (following the guidelines of Girard et al. 2019), \(h_n=0.15\) and \(\alpha _n=1-1/\sqrt{n} \approx 0.968\) across all replications, and take K to be the Epanechnikov kernel defined by \(K(t)=0.75(1-t^2) \mathbbm {1}_{\{ |t|<1 \}}\). Results are shown in Fig. 1.
We can notice that an indirect estimation of extreme quantiles or expectiles with an \(L^p\)-quantile (with p between 1 and 2) leads to a trade-off between bias and variance: the indirect \(L^p\)-estimator of an extreme regression quantile is less variable than the direct estimator but slightly more biased, and the indirect \(L^p\)-estimator of an extreme regression expectile is more variable than the direct estimator but less biased. For conditional quantiles, an explanation is that using the asymptotic approximation (9) in the construction of the indirect estimator adds a source of bias, while the reduced variance stems from the use of \(p=1.7\) in the estimator \(\tilde{\gamma }_{\alpha _n}^{(p)}(x)\), which has lower variance than the simple Hill estimator in our case (see Girard et al. 2019). The case of conditional expectiles is less clear, although the increased variability observed for \(x\in [0,0.5]\) seems to originate in the use of the estimated constant \(B( 2, \tilde{\gamma }_{\alpha _n}^{(p)}(x)^{-1}-1)/B( p, \tilde{\gamma }_{\alpha _n}^{(p)}(x)^{-1}-p+1)\): when \(\tilde{\gamma }_{\alpha _n}^{(p)}(x)\) gets close to 1, which is sometimes the case in this zone where \(\gamma (x)\in [0.4,0.5]\), this estimated constant tends to explode, while the direct estimator is less affected. A similar observation, in the context of extreme Wang distortion risk measure estimation, is made by El Methni and Stupfler (2017).
5 Real Data Example
We study here a data set on motorcycle insurance, collected from the former Swedish insurance provider Wasa. The data covers motorcycle insurance policies and claims over the period 1994–1998; it is analyzed in Ohlsson and Johansson (2010) and available from www.math.su.se/GLMbook as well as the R packages insuranceData and CASdatasets. We concentrate here on the relationship between the claim severity Y (defined, for each policyholder, as the ratio of claim cost to number of claims) in Swedish kroner (SEK), and the number of years X of exposure of a policyholder. Data for \(X>3\) are very sparse, so we restrict our attention to the case \(Y>0\) and \(X\in [0,3]\), resulting in \(n = 593\) pairs \((X_i,Y_i)\).
Our goal in this section is to estimate extreme conditional quantiles and expectiles of Y given X, at a level \(\beta _n=1-3/n\approx 0.9949\). This level is slightly less extreme than the more standard \(\beta _n=1-1/n\approx 0.9985\), but is an appropriately extreme level in this conditional context where less data are available locally for the estimation. A preliminary diagnostic using a local version of the Hill estimator (which we do not show here) suggests that the data is indeed heavy-tailed with \(\gamma (x)\in [0.25,0.6]\). Following again the guidelines in Girard et al. (2019), we choose \(p=1.7\) for our indirect extreme conditional quantile and expectile estimators. These are, respectively, compared to
-
the estimator \(\widehat{q}_n^{W}(\beta _n|x)\) of Girard et al. (2021), calculated as in Sect. 5 therein, and our direct quantile estimator presented in Sect. 4 (i),
-
the estimator \(\widehat{e}_n^{W,BR}(\beta _n|x)\) of Girard et al. (2021), calculated as in Sect. 5 therein, and our direct expectile estimator presented in Sect. 4 (i).
For the direct and indirect estimators presented in Sect. 4 (i)–(iii), the parameters \(\alpha _n\) and \(h_n\) are chosen by a cross-validation procedure analogous to that of Girard et al. (2021). The Epanechnikov kernel is adopted. Results are given in Fig. 2. In each case, all three estimators reassuringly point to roughly the same results, with slight differences; in particular, for quantile estimation and when data is scarce, the direct estimator of Sect. 4 (i) appears to be more sensitive to the local shape of the tail than the indirect, \(L^p\)-quantile-based estimator of Sect. 4 (ii), resulting in less stable estimates.
6 Appendix
6.1 Preliminary Results
Lemma 1
Assume that \((\mathcal {L})\) and \(\mathcal {C}_2 \left( \gamma (\mathbf {x}), \rho (\mathbf {x}), A(.|\mathbf {x}) \right) \) hold, and let \(y_n \rightarrow \infty \) and \(h_n \rightarrow 0\) be such that \(\omega _{h_n}^{(1)}(y_n|\mathbf {x}) \log (y_n)\rightarrow 0\) and \(\omega _{h_n}^{(2)}(y_n|\mathbf {x}) \rightarrow 0\). Then for all \(0 \le k < \gamma (\mathbf {x})^{-1}\) we have, uniformly in \(\mathbf {x}' \in B(\mathbf {x},h_n)\),
In particular \(m^{(k)}(y_n|\mathbf {x}') = y_n^k g(\mathbf {x}) \left( 1+ o(1) \right) \) uniformly in \(\mathbf {x}' \in B(\mathbf {x},h_n)\).
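As a quick numerical illustration of this first-order behavior (our own Monte Carlo check, in the unconditional Pareto case with \(k=1<1/\gamma \)): for \(y_n\) large, \(\mathbb {E}|Y-y_n|/y_n\) should approach 1, since the bulk of the distribution contributes \(y_n - Y \approx y_n\).

```python
import numpy as np

rng = np.random.default_rng(42)
gamma = 1.0 / 3.0                                   # tail index, so 1/gamma = 3 > k = 1
Y = (1.0 - rng.uniform(size=200_000)) ** (-gamma)   # Pareto sample: P(Y > y) = y^{-3}
y_n = 50.0
ratio = np.mean(np.abs(Y - y_n)) / y_n              # Lemma-1-type ratio, close to 1
```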
Proof
Let us first write
By the arguments of the proof of Lemma 3 in Girard et al. (2021),
Besides, an integration by parts yields
It clearly follows that
Now
by the dominated convergence theorem, and
see for instance Lemma 1(i) in Daouia et al. (2019). The result follows from direct calculations.
Lemma 2
Assume that \((\mathcal {K})\), \((\mathcal {L})\) and \(\mathcal {C}_2 \left( \gamma (\mathbf {x}), \rho (\mathbf {x}), A(.|\mathbf {x}) \right) \) hold, and let \(y_n \rightarrow \infty \) and \(h_n \rightarrow 0\) be such that \(nh_n^d \rightarrow \infty \), \(\omega _{h_n}^{(1)}(y_n|\mathbf {x}) \log (y_n)\rightarrow 0\) and \(\omega _{h_n}^{(2)}(y_n|\mathbf {x}) \rightarrow 0\). Then for all \(0 \le k < \gamma (\mathbf {x})^{-1}/2\),
Proof
Note that \( \mathbb {E} \left[ \hat{m}_n^{(k)}(y_n|\mathbf {x}) \right] = \int _S m^{(k)}(y_n|\mathbf {x}-\mathbf {u} h_n) K(\mathbf {u}) d\mathbf {u} \) by Assumption \((\mathcal {K})\) and a change of variables, and use Lemma 1 to get the first result. The second result is obtained through similar calculations. \(\square \)
Lemma 3
Assume that \((\mathcal {K})\), \((\mathcal {L})\) and \(\mathcal {C}_2 \left( \gamma (\mathbf {x}), \rho (\mathbf {x}), A(.|\mathbf {x}) \right) \) hold. Let \(y_n \rightarrow \infty \), \(h_n \rightarrow 0\) be such that \(nh_n^d \rightarrow \infty \) and \(\omega _{h_n}^{(1)}(y_n|\mathbf {x}) \log (y_n) \rightarrow 0\). Then for all \(0 \le k < \gamma (\mathbf {x})^{-1}/2\),
Proof
See Lemma 5 of Girard et al. (2021).
Lemma 4
Assume that \(\mathcal {C}_2 \left( \gamma (\mathbf {x}), \rho (\mathbf {x}), A(.|\mathbf {x}) \right) \) holds. Let \(\lambda \ge 1\), \(y_n \rightarrow \infty \), \(y_n'=\lambda y_n(1+o(1))\) and \(0< k < \gamma (\mathbf {x})^{-1}\).
(i) Then the following asymptotic relationship holds:
(ii) Assume further that \(\omega _{h_n}^{(1)}(y_n \wedge y_n' | \mathbf {x}) \log (y_n) \rightarrow 0\) and \(\omega _{h_n}^{(3)}(y_n | \mathbf {x}) \rightarrow 0\). Then, uniformly in \(\mathbf {x}' \in B(\mathbf {x},h_n)\),
Proof
(i) Straightforward calculations entail
with \(y_n'=\lambda y_n (1+o(1))\). The result then comes directly from the regular variation property of \(\bar{F}^{(1)}(\cdot | \mathbf {x})\) and Lemma 1 in Daouia et al. (2019) with \(H(t)=(t-1)^k\) and \(b= \lambda \).
(ii) Note first that for n large enough
Write \((Y-y_n)^k = ((Y-y_n)^k - (\lambda -1)^k y_n^k) + (\lambda -1)^k y_n^k\). It then follows from the assumption \(\omega _{h_n}^{(3)}(y_n | \mathbf {x}) \rightarrow 0\) that, uniformly in \(\mathbf {x}' \in B(\mathbf {x},h_n)\),
Remark now \( \bar{F}^{(1)} \left( y_n'|\mathbf {x} \right) (y_n')^{-\omega _{h_n}^{(1)}(y_n'|\mathbf {x})} \le \bar{F}^{(1)} \left( y_n'|\mathbf {x}' \right) \le \bar{F}^{(1)} \left( y_n'|\mathbf {x} \right) (y_n')^{\omega _{h_n}^{(1)}(y_n'|\mathbf {x})}. \) Then condition \(\omega _{h_n}^{(1)}(y_n'|\mathbf {x}) \log (y_n) \rightarrow 0\) entails, uniformly in \(\mathbf {x}' \in B(\mathbf {x},h_n)\), \(\bar{F}^{(1)} \left( y_n'|\mathbf {x}' \right) = \bar{F}^{(1)} \left( y_n'|\mathbf {x} \right) (1+o(1)) = \bar{F}^{(1)} \left( \lambda y_n|\mathbf {x} \right) (1+o(1))\). Besides, for any \(z\ge \lambda y_n\ge y_n\), \( \bar{F}^{(1)} \left( z|\mathbf {x} \right) z^{-\omega _{h_n}^{(1)}(y_n|\mathbf {x})} \le \bar{F}^{(1)} \left( z|\mathbf {x}' \right) \le \bar{F}^{(1)} \left( z|\mathbf {x} \right) z^{\omega _{h_n}^{(1)}(y_n|\mathbf {x})}. \) Following the proof of Lemma 3 in Girard et al. (2021), we get, uniformly in \(\mathbf {x}' \in B(\mathbf {x},h_n)\),
Since \(\int _{\lambda y_n}^{\infty }(z-y_n)^{k-1} \bar{F}^{(1)}(z|\mathbf {x})dz\) is of order \(y_n^k \bar{F}^{(1)}(y_n|\mathbf {x})\) (by regular variation of \(\bar{F}^{(1)}(\cdot |\mathbf {x})\)), the conclusion follows.
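The order-of-magnitude claim just used can itself be illustrated numerically. This is only a sanity check under a hypothetical exact Pareto tail \(\bar{F}^{(1)}(z|\mathbf {x})=z^{-a}\) with \(a=1/\gamma (\mathbf {x})>k\) (toy values below, not from the paper); the substitution \(z=y u\) then shows the normalized integral is constant in \(y\).

```python
# Sanity check (hypothetical exact Pareto tail, F-bar(z) = z^(-a), a > k):
# the integral I(y) = int_{lam*y}^{inf} (z - y)^(k-1) z^(-a) dz equals
# C * y^k * y^(-a), with C = int_{lam}^{inf} (u - 1)^(k-1) u^(-a) du,
# so I(y) / (y^k * F-bar(y)) does not depend on y.
a, k, lam = 4.0, 2.0, 2.0   # toy values with k < a = 1/gamma

def normalized_integral(y, n=100_000, cut=1000.0):
    # crude trapezoidal quadrature of I(y), truncated at z = cut*y
    lo, hi = lam * y, cut * y
    h = (hi - lo) / n
    s = 0.0
    for i in range(n + 1):
        z = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        s += w * (z - y) ** (k - 1.0) * z ** (-a)
    return s * h / (y ** k * y ** (-a))

# closed form of C for these toy values: 1/(2*lam^2) - 1/(3*lam^3)
print(normalized_integral(10.0), normalized_integral(100.0))
```

Both printed values match the closed-form constant \(C=1/(2\lambda ^2)-1/(3\lambda ^3)\approx 0.0833\), illustrating that the integral is of exact order \(y^k \bar{F}^{(1)}(y|\mathbf {x})\) in this toy model.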
Lemma 5
Assume that \(\mathcal {C}_2 \left( \gamma (\mathbf {x}), \rho (\mathbf {x}), A(.|\mathbf {x}) \right) \) holds. For all \(1 \le p < \gamma (\mathbf {x})^{-1}+1\),
where \(C_1(\mathbf {x})\), \(C_2(\mathbf {x})\) and \(C_3(\mathbf {x})\) are constants such that
Similarly,
where \(D_1(\mathbf {x})\), \(D_2(\mathbf {x})\) and \(D_3(\mathbf {x})\) are constants such that
Proof
We start by focusing on the ratio \(\bar{F}^{(p)}(y|\mathbf {x})/\bar{F}^{(1)}(y|\mathbf {x})\). By Lemma 1 in Girard et al. (2019), the function \(\bar{F}^{(p)}(\cdot |\mathbf {x})\) is continuous and strictly decreasing on the support of Y given \(\mathbf {X}=\mathbf {x}\). It is therefore enough to show the announced formula for \(y=q^{(p)}(\alpha |\mathbf {x})\) with \(\alpha \rightarrow 1\); this, in turn, is a simple corollary of Proposition 2 in Daouia et al. (2019). To show the analogous formula on \(q^{(p)}(\alpha |\mathbf {x})/q^{(1)}(\alpha |\mathbf {x})\), we define \(U^{(1)}(t|\mathbf {x}) = q^{(1)}(1-t^{-1}|\mathbf {x})\); \(U^{(1)}(\cdot |\mathbf {x})\) also satisfies a (local uniform) second-order regular variation condition, see Theorem 2.3.9 p.48 in de Haan and Ferreira (2006). Consequently, we note that the asymptotic expansion on \(\bar{F}^{(p)}(y|\mathbf {x})/\bar{F}^{(1)}(y|\mathbf {x})\) entails a similar expansion on
as \(y\rightarrow \infty \), with different constants (here Lemma 1 in Daouia et al. (2020b) was used). Setting \(y=q^{(p)}(\alpha |\mathbf {x})\), with \(\alpha \rightarrow 1\), gives the announced result.
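To make the last step explicit, here is a schematic first-order version of the argument (our notation; \(c=c(p,\gamma (\mathbf {x}))\) stands for the limit of the ratio \(\bar{F}^{(p)}(y|\mathbf {x})/\bar{F}^{(1)}(y|\mathbf {x})\) as \(y\rightarrow \infty \), whose exact value is not needed for this sketch). Since \(\bar{F}^{(p)}(q^{(p)}(\alpha |\mathbf {x})|\mathbf {x})=1-\alpha \),

```latex
\begin{align*}
\bar{F}^{(1)}\!\left( q^{(p)}(\alpha|\mathbf{x}) \,\middle|\, \mathbf{x} \right)
  &= \frac{\bar{F}^{(1)}\!\left( q^{(p)}(\alpha|\mathbf{x}) \,\middle|\, \mathbf{x} \right)}
          {\bar{F}^{(p)}\!\left( q^{(p)}(\alpha|\mathbf{x}) \,\middle|\, \mathbf{x} \right)}
     \,(1-\alpha)
   = \frac{1-\alpha}{c}\,(1+o(1)), \qquad \alpha \to 1,
\intertext{so that, inverting through $U^{(1)}(\cdot|\mathbf{x})$ and using its regular variation with index $\gamma(\mathbf{x})$,}
\frac{q^{(p)}(\alpha|\mathbf{x})}{q^{(1)}(\alpha|\mathbf{x})}
  &= \frac{U^{(1)}\!\left( c\,(1-\alpha)^{-1}(1+o(1)) \,\middle|\, \mathbf{x} \right)}
          {U^{(1)}\!\left( (1-\alpha)^{-1} \,\middle|\, \mathbf{x} \right)}
   \longrightarrow c^{\gamma(\mathbf{x})}.
\end{align*}
```

The second-order refinement recorded in Lemma 5 quantifies the \(1+o(1)\) factors in this display.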
Lemma 6
Assume that \((\mathcal {K})\), \((\mathcal {L})\) and \(\mathcal {C}_2 \left( \gamma (\mathbf {x}), \rho (\mathbf {x}), A(.|\mathbf {x}) \right) \) hold. Let \(y_n \rightarrow \infty \), \(h_n \rightarrow 0\) and \(z_n=\theta y_n (1+o(1))\), where \(\theta >0\). Assume further that \(\epsilon _n^{-2}=n h_n^d \bar{F}^{(1)}(y_n|\mathbf {x}) \rightarrow \infty \), \(n h_n^{d+2} \bar{F}^{(1)}(y_n|\mathbf {x}) \rightarrow 0\), there exists \(\delta \in (0,1)\) such that \( \epsilon _n^{-1} \omega _{h_n}^{(1)}((1-\delta ) (\theta \wedge 1) y_n|\mathbf {x}) \log (y_n) \rightarrow 0\), and \(\omega _{h_n}^{(3)}(z_n|\mathbf {x}) \rightarrow 0\). Letting, for all \(j \in \{ 1,...,J \}\), \(y_{n,j}=\tau _{j}^{-\gamma (\mathbf {x})} y_n (1+o(1))\) with \(0<\tau _1<\tau _2<\, ... \,<\tau _J \le 1\), and \(p \in (1,\gamma (\mathbf {x})^{-1}/2+1)\), one has
where \({\boldsymbol{\Lambda }}(\mathbf {x})\) is a symmetric matrix having entries:
Proof
Let \(\mathbf {\beta }=\left( \beta _1, ... ,\beta _J,\beta _{J+1} \right) \in \mathbb {R}^{J+1}\). Set
Clearly \(\omega _{h_n}^{(1)}(y_{n,j}|\mathbf {x}) \le \omega _{h_n}^{(1)}((1-\delta )y_{n}|\mathbf {x}) \) and \(\omega _{h_n}^{(1)}(z_n|\mathbf {x}) \le \omega _{h_n}^{(1)}((1-\delta )\theta y_n|\mathbf {x})\) for n large enough. Lemma 3 then provides \(\mathbb {E}(\mathcal {Z}_n) = o(1)\). It thus remains to focus on the asymptotic distribution of the centered variable \(Z_n=\mathcal {Z}_n-\mathbb {E}(\mathcal {Z}_n)\). Note that \(\mathbb {V}\text {ar}[Z_n]=\epsilon _n^{-2} \mathbf {\beta }^{\top } {\boldsymbol{B}}^{(n)} \mathbf {\beta }\), where \({\boldsymbol{B}}^{(n)}\) is the symmetric matrix having entries:
We recall \(z_n=\theta y_n (1+o(1))\), hence \(\bar{F}^{(1)}(z_n|\mathbf {x})= \theta ^{-1/\gamma (\mathbf {x})} \bar{F}^{(1)}(y_n|\mathbf {x})(1+o(1))\) and Lemma 3 combined with Eq. (15) immediately gives
Straightforward calculations based on Lemma 3 and Eq. (15) then give, for the entries \(B_{j,\ell }^{(n)}(\mathbf {x})\),
The regular variation property of \(\bar{F}^{(1)}\) gives \( B_{j,\ell }^{(n)}(\mathbf {x})= \frac{|| K ||_2^2}{g(\mathbf {x})} (\tau _j \vee \tau _{\ell })^{-1} \epsilon _n^2 (1+o(1)) . \) It remains to calculate \(B_{j,J+1}^{(n)}(\mathbf {x})\). Using Eq. (15), with \(Q(\cdot )=K(\cdot )^2/|| K ||_2^2\) a kernel satisfying \((\mathcal {K})\), this term equals
Clearly, as a direct consequence of Lemma 3, the first term dominates. Note that \(z_n \vee y_{n,j} = (1 \vee \tau _j^{-\gamma (\mathbf {x})}/\theta )z_n (1+o(1))\); combining Assumption \((\mathcal {K})\), the results of Lemma 4 (with \(\lambda =(1 \vee \tau _j^{-\gamma (\mathbf {x})}/\theta )\)) and the regular variation property of \(\varphi ^{(k)}(\cdot )\) (see Eq. (15)) then shows that the numerator of this first term is asymptotically equivalent to
Finally, \(B_{j,J+1}^{(n)}(\mathbf {x})\) is asymptotically equivalent to
Therefore, \(\mathbb {V}\text {ar}[Z_n]=(|| K ||_2^2 / g(\mathbf {x}))\, \mathbf {\beta }^{\top } {\boldsymbol{\Lambda }}(\mathbf {x}) \mathbf {\beta }\, (1+o(1))\), where \({\boldsymbol{\Lambda }}(\mathbf {x})\) is given in Eq. (16). It remains to prove the asymptotic normality of \(Z_n\). For that purpose, we denote \(Z_n=\sum _{i=1}^n Z_{i,n}\), where
Taking \(\delta >0\) sufficiently small and arguing as in the closing stages of the proof of Lemma 6 in Girard et al. (2021), we find that \(n \mathbb {E}\left[ |Z_{1,n}|^{2+\delta } \right] =O \left( \epsilon _n^{\delta } \right) =o(1)\). Applying the classical Lyapunov central limit theorem concludes the proof. \(\square \)
Proposition 1
Assume that \((\mathcal {K})\), \((\mathcal {L})\) and \(\mathcal {C}_2 \left( \gamma (\mathbf {x}), \rho (\mathbf {x}), A(.|\mathbf {x}) \right) \) hold. Let \(y_n \rightarrow \infty \), \(h_n \rightarrow 0\) and \(z_n=\theta y_n (1+o(1))\), where \(\theta >0\). Assume further that \(\epsilon _n^{-2}=n h_n^d \bar{F}^{(1)}(y_n|\mathbf {x}) \rightarrow \infty \), \(n h_n^{d+2} \bar{F}^{(1)}(y_n|\mathbf {x}) \rightarrow 0\), \( \omega _{h_n}^{(3)}(y_n|\mathbf {x}) \rightarrow 0\) and there exists \(\delta \in (0,1)\) such that \( \epsilon _n^{-1} \omega _{h_n}^{(1)}((1-\delta ) (\theta \wedge 1) y_n|\mathbf {x}) \log (y_n) \rightarrow 0\). If, for all \(j \in \{ 1,...,J \}\), the \(y_{n,j}=\tau _{j}^{-\gamma (\mathbf {x})} y_n (1+o(1))\) with \(0<\tau _1<\tau _2<\, ... \,<\tau _J \le 1\) are such that \( \epsilon _n^{-1} \omega _{h_n}^{(2)}((1+\delta ) (\theta \vee \tau _1^{-\gamma (\mathbf {x})}) y_n|\mathbf {x}) \rightarrow 0 \), then, for all \(p \in (1,\gamma (\mathbf {x})^{-1}/2+1)\), one has
where \({\boldsymbol{\Lambda }}(\mathbf {x})\) is given in Eq. (16).
Proof
Notice that
Lemma 2 and the Chebyshev inequality ensure that for all \(p \in (1,\gamma (\mathbf {x})^{-1}/2+1)\) and \(u_n\in \{ y_{n,1},\ldots ,y_{n,J},z_n \}\), \(\hat{m}_n^{(p-1)}(u_n|\mathbf {x})/m^{(p-1)}(u_n|\mathbf {x}) - 1 =O_{\mathbb {P}}(1/\sqrt{nh_n^d})\), so that
Applying Lemma 6 concludes the proof. \(\square \)
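The Chebyshev step above delivers a classical \(O_{\mathbb {P}}(1/\sqrt{n h_n^d})\) rate for ratios of the form estimator/target. A generic toy illustration of this mechanism (unrelated to the kernels and data of the paper; uniform draws are a hypothetical stand-in): over many replications, the relative error of an empirical mean halves when the effective sample size is multiplied by four.

```python
import random
import statistics

random.seed(42)

# Toy illustration of the O_P(1/sqrt(n)) rate obtained via Chebyshev's
# inequality: over many replications, the mean absolute relative error
# of an empirical mean of n Uniform(0,1) draws scales like n^(-1/2).
def mean_abs_rel_err(n, reps=2000):
    mu = 0.5  # true mean of Uniform(0,1)
    errs = [abs(sum(random.random() for _ in range(n)) / n / mu - 1.0)
            for _ in range(reps)]
    return statistics.mean(errs)

e_100, e_400 = mean_abs_rel_err(100), mean_abs_rel_err(400)
print(e_100 / e_400)  # close to sqrt(400/100) = 2
```

In the conditional setting of the paper, \(n\) is replaced by the effective local sample size \(n h_n^d\), whence the \(1/\sqrt{n h_n^d}\) rate.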
6.2 Proofs of Main Results
Proof of Theorem 1. Let us denote \(\mathbf {t}=(t_1,...,t_J,t_{J+1})\) and focus on the probability
Set \(y_{n}=q^{(1)}(\alpha _{n}|\mathbf {x})\), \(y_{n,j}=q^{(1)}(\alpha _{n,j}|\mathbf {x}) \left( 1+ \sigma _n t_{j} \right) \) and \(z_n=q^{(p)}(a_{n}|\mathbf {x}) \left( 1+ \sigma _n t_{J+1} \right) \). The technique of proof of Proposition 1 in Girard et al. (2019) yields
Second-order regular variation arguments similar to those of the proof of Proposition 1 in Girard et al. (2019) give, for all \(j \in \{ 1, ... ,J \}\),
and similarly
Finally, notice that \(y_{n,j}=\tau _j^{-\gamma (\mathbf {x})} y_n (1+o(1))\) and \(z_n=\theta y_n(1+o(1))\) (see (9)). Moreover, for n large enough, \(\omega _{h_n}^{(1)}(y_{n,j}|\mathbf {x}) \le \omega _{h_n}^{(1)}\left( (1-\delta )q^{(1)}(\alpha _n|\mathbf {x}) |\mathbf {x} \right) \) and \(\omega _{h_n}^{(1)}(z_n|\mathbf {x}) \le \omega _{h_n}^{(1)}\left( (1-\delta ) \theta q^{(1)}(\alpha _n|\mathbf {x}) |\mathbf {x} \right) \). Similarly, \( \omega _{h_n}^{(2)}(y_{n,j}|\mathbf {x}) \le \omega _{h_n}^{(2)}\left( (1+\delta ) \tau _1^{-\gamma (\mathbf {x})} q^{(1)}(\alpha _n|\mathbf {x}) |\mathbf {x} \right) \) and \(\omega _{h_n}^{(2)}(z_n|\mathbf {x}) \le \omega _{h_n}^{(2)}\left( (1+\delta ) \theta q^{(1)}(\alpha _n|\mathbf {x}) |\mathbf {x} \right) \). Conclude using Proposition 1. \(\Box \)
Proof of Theorem 2. Recall that \(\sigma _n^{-2}=n h_n^d (1-\alpha _n)\). Write
The first term converges in distribution to \(\Gamma \). The second one converges to 0 in probability, by Theorem 1. To control the third one, write
In view of Theorem 4.3.8 in de Haan and Ferreira (2006) and its proof, \( \left( (1-\alpha _n)/(1-\beta _n) \right) ^{\gamma (\mathbf {x})} q^{(1)}(\alpha _n|\mathbf {x}) = q^{(1)}(\beta _n|\mathbf {x}) \left( 1+O\left( A\left( (1-\alpha _n)^{-1} |\mathbf {x} \right) \right) \right) = q^{(1)}(\beta _n|\mathbf {x}) \left( 1+O(\sigma _n) \right) .\) By Lemma 5 then,
The third term therefore converges to 0. Conclude using Slutsky’s lemma and the delta-method. \(\Box \)
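The extrapolation relation used here, \(q^{(1)}(\beta _n|\mathbf {x}) \approx \left( (1-\alpha _n)/(1-\beta _n) \right) ^{\gamma (\mathbf {x})} q^{(1)}(\alpha _n|\mathbf {x})\), is of Weissman (1978) type. As a quick illustration on a hypothetical Fréchet toy model with known quantiles (the levels and tail index below are illustrative), the extrapolation is already accurate at moderate levels:

```python
import math

# Toy Weissman-type extrapolation: estimate an extreme quantile q(beta)
# from an intermediate one q(alpha) through
#   q(beta) ~ ((1 - alpha)/(1 - beta))^gamma * q(alpha).
# Hypothetical model: Fréchet quantiles, q(a) = (-log a)^(-gamma).
gamma_x = 0.3  # hypothetical tail index gamma(x)

def frechet_quantile(a, gamma):
    return (-math.log(a)) ** (-gamma)

alpha, beta = 0.99, 0.9999  # intermediate and extreme levels
extrapolated = ((1 - alpha) / (1 - beta)) ** gamma_x * frechet_quantile(alpha, gamma_x)
exact = frechet_quantile(beta, gamma_x)
print(extrapolated, exact)  # the two agree to within a fraction of a percent
```

The small residual discrepancy reflects the slowly varying part of the Fréchet tail; it is exactly the \(O\left( A\left( (1-\alpha _n)^{-1}|\mathbf {x} \right) \right) \) term controlled in the proof.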
Proof of Theorem 3. This proof is similar to those of Theorem 4 in Girard et al. (2021) (where \(p=2\)) and Theorem 1 in Girard et al. (2019) (an unconditional version) and is thus left to the reader.
References
Artzner, P., Delbaen, F., Eber, J., & Heath, D. (1999). Coherent measures of risk. Mathematical Finance, 9(3), 203–228.
Bellini, F., Klar, B., Muller, A., & Gianin, E. R. (2014). Generalized quantiles as risk measures. Insurance: Mathematics and Economics, 54, 41–48.
Cai, J., & Weng, C. (2016). Optimal reinsurance with expectile. Scandinavian Actuarial Journal, 2016(7), 624–645.
Chen, Z. (1996). Conditional \(L_p\)-quantiles and their application to the testing of symmetry in non-parametric regression. Statistics & Probability Letters, 29(2), 107–115.
Daouia, A., Gardes, L., & Girard, S. (2013). On kernel smoothing for extremal quantile regression. Bernoulli, 19(5B), 2557–2589.
Daouia, A., Gardes, L., Girard, S., & Lekina, A. (2011). Kernel estimators of extreme level curves. TEST, 20(2), 311–333.
Daouia, A., Girard, S., & Stupfler, G. (2018). Estimation of tail risk based on extreme expectiles. Journal of the Royal Statistical Society: Series B, 80(2), 263–292.
Daouia, A., Girard, S., & Stupfler, G. (2019). Extreme M-quantiles as risk measures: from \(L^1\) to \(L^p\) optimization. Bernoulli, 25(1), 264–309.
Daouia, A., Girard, S., & Stupfler, G. (2020a). ExpectHill estimation, extreme risk and heavy tails. Journal of Econometrics, 221(1), 97–117.
Daouia, A., Girard, S., & Stupfler, G. (2020b). Tail expectile process and risk assessment. Bernoulli, 26(1), 531–556.
de Haan, L., & Ferreira, A. (2006). Extreme value theory: An introduction. New York: Springer.
El Methni, J., Gardes, L., & Girard, S. (2014). Non-parametric estimation of extreme risk measures from conditional heavy-tailed distributions. Scandinavian Journal of Statistics, 41(4), 988–1012.
El Methni, J., & Stupfler, G. (2017). Extreme versions of Wang risk measures and their estimation for heavy-tailed distributions. Statistica Sinica, 27(2), 907–930.
Embrechts, P., Kluppelberg, C., & Mikosch, T. (1997). Modelling extremal events. Berlin: Springer.
Gardes, L., Girard, S., & Stupfler, G. (2020). Beyond tail median and conditional tail expectation: extreme risk estimation using tail \(L^p\)-optimisation. Scandinavian Journal of Statistics, 47(3), 922–949.
Girard, S., Stupfler, G., & Usseglio-Carleve, A. (2019). An \(L^p\)-quantile methodology for estimating extreme expectiles, preprint. https://hal.inria.fr/hal-02311609v3/document.
Girard, S., Stupfler, G., & Usseglio-Carleve, A. (2021). Nonparametric extreme conditional expectile estimation, To appear in Scandinavian Journal of Statistics. https://hal.archives-ouvertes.fr/hal-02114255.
Jones, M. (1994). Expectiles and M-quantiles are quantiles. Statistics & Probability Letters, 20, 149–153.
Koenker, R., & Bassett, G. J. (1978). Regression quantiles. Econometrica, 46(1), 33–50.
Newey, W., & Powell, J. (1987). Asymmetric least squares estimation and testing. Econometrica, 55(4), 819–847.
Ohlsson, E., & Johansson, B. (2010). Non-life insurance pricing with generalized linear models. Berlin: Springer.
Usseglio-Carleve, A. (2018). Estimation of conditional extreme risk measures from heavy-tailed elliptical random vectors. Electronic Journal of Statistics, 12(2), 4057–4093.
Weissman, I. (1978). Estimation of parameters and large quantiles based on the \(k\) largest observations. Journal of the American Statistical Association, 73(364), 812–815.
Acknowledgements
This research was supported by the French National Research Agency under the grant ANR-19-CE40-0013/ExtremReg project. S. Girard gratefully acknowledges the support of the Chair Stress Test, Risk Management and Financial Steering, led by the French Ecole Polytechnique and its Foundation and sponsored by BNP Paribas, and the support of the French National Research Agency in the framework of the Investissements d’Avenir program (ANR-15-IDEX-02).
© 2021 Springer Nature Switzerland AG
Girard, S., Stupfler, G., Usseglio-Carleve, A. (2021). Extreme \(L^p\)-quantile Kernel Regression. In: Daouia, A., Ruiz-Gazen, A. (eds) Advances in Contemporary Statistics and Econometrics. Springer, Cham. https://doi.org/10.1007/978-3-030-73249-3_11
Print ISBN: 978-3-030-73248-6
Online ISBN: 978-3-030-73249-3