
1 Extreme Value Theory: A Brief Introduction

We use the notation γ for the extreme value index (EVI), the shape parameter in the extreme value distribution function (d.f.),

$$\displaystyle{ \mathit{EV }_{\gamma }(x) = \left \{\begin{array}{lll} \exp \{ - {(1 +\gamma x)}^{-1/\gamma }\},\ 1 +\gamma x > 0&\mbox{ if}&\gamma \not =0 \\ \exp \{ -\exp (-x)\},\ x \in \mathbb{R} & \mbox{ if}&\gamma = 0, \end{array} \right. }$$
(1)

and we consider models with a heavy right-tail. Note that in the area of statistics of extremes, and with the notation \(\mathit{RV}_{a}\) standing for the class of regularly varying functions at infinity with an index of regular variation equal to \(a \in \mathbb{R}\), i.e. positive measurable functions g(⋅) such that for any x > 0, \(g(tx)/g(t) \rightarrow {x}^{a}\), as \(t \rightarrow \infty\) (see [3] for details on regular variation), we usually say that a model F has a heavy right-tail \(\overline{F}:= 1 - F\) whenever \(\overline{F} \in \mathit{RV}_{-1/\gamma },\ \mbox{ for some}\ \gamma > 0.\) Then, as first proved in [14], F is in the domain of attraction for maxima of a Fréchet-type d.f., the \(\mathit{EV}_{\gamma }\) d.f. in (1), but with γ > 0, and we use the notation \(F \in \mathcal{D}_{M}(\mathit{EV}_{\gamma >0}) =: \mathcal{D}_{M}^{+}\). This means that, given a sequence \(\{X_{n}\}_{n\geq 1}\) of independent and identically distributed random variables (r.v.’s), it is possible to normalise the sequence of maximum values, \(\{X_{n:n}:=\max (X_{1},\ldots,X_{n})\}_{n\geq 1}\), so that it converges weakly to an r.v. with d.f. \(\mathit{EV}_{\gamma }\), with γ > 0.
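As a minimal numerical sketch of the d.f. in (1), the following Python fragment may help fix ideas (an illustration only; the function name ev_cdf and its vectorised form are ours, not part of the original text):

```python
import numpy as np

def ev_cdf(x, gamma):
    """Extreme value d.f. EV_gamma(x) in (1), evaluated pointwise."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    if gamma == 0.0:
        return np.exp(-np.exp(-x))              # Gumbel case
    z = 1.0 + gamma * x
    # outside the support, the d.f. is 0 (gamma > 0) or 1 (gamma < 0)
    out = np.zeros_like(x) if gamma > 0 else np.ones_like(x)
    out[z > 0] = np.exp(-z[z > 0] ** (-1.0 / gamma))
    return out

# Frechet-type case (gamma > 0), the heavy-tailed domain considered here
print(ev_cdf([0.0, 1.0, 10.0], gamma=0.5))
```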

In this same context of heavy right-tails, and with the notation \(U(t):= {F}^{\leftarrow }(1 - 1/t),\ t \geq 1,\) where \({F}^{\leftarrow }(y):=\inf \{x: F(x) \geq y\}\) denotes the generalised inverse function of F, we can further say that

$$\displaystyle{ F \in \mathcal{D}_{M}^{+}\quad \Longleftrightarrow\quad \overline{F} \in \mathit{RV }_{ -1/\gamma }\quad \Longleftrightarrow\quad U \in \mathit{RV }_{\gamma }, }$$
(2)

the so-called first-order condition. The second equivalence in (2), \(F \in \mathcal{D}_{M}^{+}\) if and only if \(U \in \mathit{RV}_{\gamma }\), was first derived in [7].

For a consistent semi-parametric EVI-estimation, in the whole \(\mathcal{D}_{M}^{+}\), we merely need to assume the validity of the first-order condition, in (2), and to work with adequate functionals, dependent on an intermediate tuning parameter k, related to the number of top order statistics involved in the estimation. To say that k is intermediate is equivalent to saying that

$$\displaystyle{ k = k_{n} \rightarrow \infty \quad \mbox{ and}\quad k_{n} = o(n),\ \mbox{ i.e.}\ \ k/n \rightarrow 0,\ \mbox{ as}\ n \rightarrow \infty. }$$
(3)

To obtain information on the non-degenerate asymptotic behaviour of semi-parametric EVI-estimators, we further need to work in \(\mathcal{D}_{M \vert 2}^{+}\), assuming a second-order condition, ruling the rate of convergence in the first-order condition, in (2). The second-order parameter ρ( ≤ 0) rules such a rate of convergence, and it is the parameter appearing in the limiting result,

$$\displaystyle{ \lim _{t\rightarrow \infty }\ \frac{\ln U(\mathit{tx}) -\ln U(t) -\gamma \ln x} {A(t)} = \left \{\begin{array}{lll} \frac{{x}^{\rho }-1} {\rho } & \mbox{ if}&\rho < 0\\ \ln x &\mbox{ if} &\rho = 0, \end{array} \right. }$$
(4)

which we often assume to hold for all x > 0, and where \(\vert A\vert\) must be in \(\mathit{RV}_{\rho }\) [13]. For technical simplicity, we usually further assume that ρ < 0, and use the parameterisation

$$\displaystyle{ A(t) =:\gamma \beta {t}^{\rho }. }$$
(5)

We are then working with a class of Pareto-type models, with a right-tail function

$$\displaystyle{ \overline{F}(x) = C{x}^{-1/\gamma }\Big(1 + D_{ 1}{x}^{\rho /\gamma } + o\big({x}^{\rho /\gamma }\big)\Big), }$$
(6)

as \(x \rightarrow \infty\), with C > 0, \(D_{1}\not =0\) and ρ < 0.

In order to obtain full information on the asymptotic bias of corrected-bias EVI-estimators, it is further necessary to work in \(\mathcal{D}_{M \vert 3}^{+}\), assuming a general third-order condition, which guarantees that, for all x > 0,

$$\displaystyle{ \lim _{t\rightarrow \infty }\ \frac{\frac{\ln U(\mathit{tx})-\ln U(t)-\gamma \ln x} {A(t)} -\frac{{x}^{\rho }-1} {\rho } } {B(t)} = \frac{{x}^{\rho {+\rho }^{{\prime}} }- 1} {\rho {+\rho }^{{\prime}}}, }$$
(7)

where \(\vert B\vert\) must then be in \(\mathit{RV }_{{\rho }^{{\prime}}}\). More restrictively, and equivalently to the aforementioned third-order condition, in (7), but with \(\rho {=\rho }^{{\prime}} < 0\), we often consider a Pareto third-order condition, i.e. a class of Pareto-type models, with a right-tail function

$$\displaystyle{\overline{F}(x) = C{x}^{-1/\gamma }\Big(1 + D_{ 1}{x}^{\rho /\gamma } + D_{ 2}{x}^{2\rho /\gamma } + o\big({x}^{2\rho /\gamma }\big)\Big),}$$

as \(x \rightarrow \infty\), with C > 0, \(D_{1},\ D_{2}\not =0\) and ρ < 0, a large sub-class of the classes of models in [26, 27]. We can then choose, in the general third-order condition in (7),

$$\displaystyle{ B(t) {=\beta }^{{\prime}}\ {t}^{\rho } = \frac{{\beta }^{{\prime}}A(t)} {\beta \gamma } =: \frac{\xi \ A(t)} {\gamma },\quad \beta {,\ \beta }^{{\prime}}\not =0,\quad \xi = \frac{{\beta }^{{\prime}}} {\beta }, }$$
(8)

with β and \({\beta }^{{\prime}}\) the “scale” second and third-order parameters, respectively.

2 EVI-Estimators Under Consideration

For models in \(\mathcal{D}_{M}^{+}\), the classical EVI-estimators are the Hill estimators [28], averages of the scaled log-spacings or of the log-excesses, given by

$$\displaystyle{U_{i}:= i\left \{\ln \frac{X_{n-i+1:n}} {X_{n-i:n}} \right \}\qquad \mbox{ and}\qquad V _{ik}:=\ln \frac{X_{n-i+1:n}} {X_{n-k:n}},\qquad 1 \leq i \leq k < n,}$$

respectively. We thus have

$$\displaystyle{ H(k) \equiv H_{n}(k):= \tfrac{1} {k}\sum _{i=1}^{k}U_{ i} = \tfrac{1} {k}\sum _{i=1}^{k}V _{ ik},\quad 1 \leq k < n. }$$
(9)
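For concreteness, a short Python sketch of the Hill estimator in (9) could read as follows (the name hill_estimator and the structure are ours, added purely for illustration):

```python
import numpy as np

def hill_estimator(sample, k):
    """Hill estimator H(k) in (9): mean of the log-excesses
    V_{ik} = ln(X_{n-i+1:n} / X_{n-k:n}), i = 1, ..., k."""
    x = np.sort(np.asarray(sample, dtype=float))   # ascending order statistics
    n = x.size
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    return np.mean(np.log(x[n - k:]) - np.log(x[n - k - 1]))
```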

But these EVI-estimators often have a strong asymptotic bias for moderate up to large values of k, of the order of A(n/k), with A(⋅) the function in (4). More precisely, for intermediate k, i.e. if (3) holds, and under the validity of the general second-order condition in (4), \(\sqrt{ k}\left (H(k)-\gamma \right )\) is asymptotically normal with variance \({\gamma }^{2}\) and a non-null mean value, equal to \(\lambda /(1-\rho )\), whenever \(\sqrt{k}\ A(n/k)\ \rightarrow \ \lambda \not =0\), finite, the type of k-values which lead to minimal mean square error (MSE). Indeed, it follows from the results in [8] that under the second-order condition in (4), and with the notation \(\mathcal{N}(\mu {,\sigma }^{2})\) standing for a normal r.v. with mean μ and variance \({\sigma }^{2}\),

$$\displaystyle{ \sqrt{k}\left (H(k)-\gamma \right )\ \stackrel{d}{=}\ \mathcal{N}(0,\sigma _{{H}}^{2}) + b_{{ H}}\sqrt{k}\ A(n/k) + o_{p}\Big(\sqrt{k}\ A(n/k)\Big), }$$

where \(\sigma_{{H}}^{2} = \gamma^{2}\), and the bias \(b_{{H}}\sqrt{k}\ A(n/k)\), equal to \(\gamma \ \beta \ \sqrt{k}\ {(n/k)}^{\rho }/(1-\rho )\) whenever (5) holds, can be very large, moderate or small (i.e. tend to infinity, to a constant or to zero) as \(n \rightarrow \infty\). This non-null asymptotic bias, together with a rate of convergence of the order of \(1/\sqrt{k}\), leads to sample paths with a high variance for small k, a high bias for large k, and a very sharp MSE pattern, as a function of k. The optimal k-value for the EVI-estimation through the Hill estimator, i.e. \(k_{0\vert H}:=\arg \min _{k}\mathrm{MSE}(H(k))\), is well approximated by \(k_{A\vert H}:=\arg \min _{k}\mathrm{AMSE}(H(k))\), with AMSE standing for asymptotic MSE, defined by

$$\displaystyle{\mathrm{AMSE}(H(k)) = \frac{{\gamma }^{2}} {k} + b_{{H}}^{2}{A}^{2}(n/k) =: \mathrm{AVAR}(k) +{ \mathrm{ABIAS}}^{2}(k),}$$

with AVAR and ABIAS standing for asymptotic variance and asymptotic bias, respectively. We can then easily see that \(k_{0\vert H}\) is of the order of \({n}^{-2\rho /(1-2\rho )}\), due to the fact that

$$\displaystyle{k_{A\vert H} =\mathop{ \arg \min }\limits _{k}\left \{\frac{1} {k} + b_{{H}}^{2}{\beta }^{2}{(n/k)}^{2\rho }\right \} = \left ( \frac{{n}^{-2\rho }} {{\beta }^{2}(-2\rho ){(1-\rho )}^{-2}}\right )^{1/(1-2\rho )}.}$$
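As a small worked example of this closed form (purely illustrative; the parameter values below are hypothetical), \(k_{A\vert H}\) can be evaluated directly:

```python
def k_amse_hill(n, beta, rho):
    """k_{A|H} = (n^(-2 rho) / (beta^2 (-2 rho) (1 - rho)^(-2)))^(1/(1 - 2 rho))."""
    return (n ** (-2.0 * rho)
            / (beta ** 2 * (-2.0 * rho) * (1.0 - rho) ** (-2.0))) ** (1.0 / (1.0 - 2.0 * rho))

# e.g. n = 1000 and (beta, rho) = (1, -0.5) give k_{A|H} of about 47
print(k_amse_hill(1000, 1.0, -0.5))
```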

The adequate accommodation of this bias has recently been extensively addressed. We mention the pioneering papers [1, 11, 18, 29], among others. In these papers, authors are led to second-order reduced-bias (SORB) EVI-estimators, with asymptotic variances larger than or equal to \({\left (\gamma \ (1-\rho )/\rho \right )}^{2}\), where ρ( < 0) is the aforementioned “shape” second-order parameter, in (4). Recently, the authors in [4, 19, 21] considered, in different ways, the problem of corrected-bias EVI-estimation, being able to reduce the bias without increasing the asymptotic variance, which was shown to be kept at \({\gamma }^{2}\), the asymptotic variance of Hill’s estimator. Those estimators, called minimum-variance reduced-bias (MVRB) EVI-estimators, are all based on an adequate “external” consistent estimation of the pair of second-order parameters, \((\beta,\rho ) \in (\mathbb{R}, {\mathbb{R}}^{-})\), done through estimators denoted by \((\hat{\beta },\hat{\rho })\). For algorithms related to such estimation, see [17]. The estimation of β has been done through the class of estimators in [15]. The estimation of ρ has usually been performed through the simplest class of estimators in [12].

We now consider the simplest class of MVRB EVI-estimators in [4], defined as

$$\displaystyle{ \overline{H}(k) \equiv \overline{H}_{\hat{\beta },\hat{\rho }}(k):= H(k)\Big(1 - \tfrac{\hat{\beta }} {1-\hat{\rho }}{\left (\tfrac{n} {k}\right )}^{\hat{\rho }}\Big). }$$
(10)
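A corresponding Python sketch of the MVRB estimator in (10), reusing hill_estimator from the sketch above and taking externally obtained estimates \((\hat{\beta },\hat{\rho })\) as inputs (again an illustration only), is:

```python
def mvrb_estimator(sample, k, beta_hat, rho_hat):
    """Corrected-Hill (MVRB) estimator in (10), given external estimates
    (beta_hat, rho_hat) of the second-order parameters (beta, rho)."""
    n = len(sample)
    return hill_estimator(sample, k) * (1.0 - beta_hat / (1.0 - rho_hat) * (n / k) ** rho_hat)
```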

Under the same conditions as before, i.e. if, as \(n \rightarrow \infty\), \(\sqrt{ k}\ A(n/k) \rightarrow \lambda\), finite, possibly non-null, \(\sqrt{k}\left (\overline{H}(k)-\gamma \right )\) is asymptotically normal with variance also equal to \({\gamma }^{2}\), but with a null mean value. Indeed, from the results in [4], we know that it is possible to adequately estimate the second-order parameters β and ρ, so that we get

$$\displaystyle{\sqrt{k}\left (\overline{H}(k)-\gamma \right )\stackrel{d}{=}\mathcal{N}(0{,\gamma }^{2}) + o_{ p}\Big(\sqrt{k}\ A(n/k)\Big).}$$

Consequently, \(\overline{H}(k)\) outperforms H(k) for all k. Indeed, under the validity of the aforementioned third-order condition related to the class of Pareto-type models, we can then adequately estimate the vector of second-order parameters, (β, ρ), and write [5],

$$\displaystyle{ \sqrt{k}\left (\overline{H}(k)-\gamma \right )\stackrel{d}{=}\mathcal{N}(0{,\gamma }^{2}) + b_{{ \overline{H}}}\sqrt{k}\ {A}^{2}(n/k) + o_{ p}\Big(\sqrt{k}\ {A}^{2}(n/k)\Big), }$$

where, with ξ defined in (8), \(b_{{\overline{H}}} =\big (\xi /(1 - 2\rho ) - 1/{(1-\rho )}^{2}\big)/\gamma.\)

In Fig. 1 we picture the comparative behaviour of the bias, variance and MSE of H and \(\overline{H}\), in (9) and (10), respectively.

Fig. 1  Typical patterns of variance, bias and MSE of H and \(\overline{H}\), as a function of the sample fraction \(r = k/n\)

Now, \(k_{0\vert \overline{H}}:=\arg \min _{k}\mathrm{MSE}(\overline{H}(k))\) can be asymptotically approximated by \(k_{A\vert \overline{H}} ={\big ({n}^{-4\rho }/\big({\beta }^{2}(-2\rho )\,b_{{\overline{H}}}^{2}\big)\big)}^{1/(1-4\rho )},\) i.e. \(k_{0\vert \overline{H}}\) is of the order of \({n}^{-4\rho /(1-4\rho )}\), and depends not only on (β, ρ), as does \(k_{0\vert H}\), but also on (γ, ξ). Recent reviews on extreme value theory and statistics of univariate extremes can be found in [2, 20, 31].

3 Resampling Methodologies

The use of resampling methodologies (see [10]) has proved to be promising in the estimation of the tuning parameter k, and in the reduction of bias of any estimator of a parameter of extreme events. For a recent review on the subject, see [30].

If we ask how to choose k in the EVI-estimation, either through H(k) or through \(\overline{H}(k)\), we usually consider the estimation of \(k_{0\vert H}:=\arg \min _{k}\mathrm{MSE}(H(k))\) or \(k_{0\vert \overline{H}} =\arg \min _{k}\mathrm{MSE}(\overline{H}(k))\). To obtain estimates of \(k_{0\vert H}\) and \(k_{0\vert \overline{H}}\), one can then use a double-bootstrap method applied to an adequate auxiliary statistic which tends to zero and has an asymptotic behaviour similar to either H(k) (see [6, 9, 16], among others) or \(\overline{H}(k)\) (see [22, 23], also among others). Such a double-bootstrap method will be sketched in Sect. 3.2.

But at such optimal levels, we still have a non-null asymptotic bias. If we further want to remove such a bias, we can make use of the generalised jackknife (GJ). It is then enough to consider an adequate pair of estimators of the parameter of extreme events under consideration and to build a reduced-bias affine combination of them. In [18], among others, we can find an application of this technique to the Hill estimator, H(k), in (9). In order to illustrate the use of these resampling methodologies in the field of univariate extremes, we shall consider, in Sect. 3.1 and just as in [24], the application of the GJ methodology to the MVRB estimators \(\overline{H}(k)\), in (10).

3.1 The Generalised Jackknife Methodology and Bias Reduction

The GJ-statistic was introduced in [25], and the main objective of the method is related to bias reduction. Let T n (1) and T n (2) be two biased estimators of γ, with similar bias properties, i.e. \(\mathrm{Bias}\big(T_{n}^{(i)}\big) =\phi (\gamma )d_{i}(n),\quad i = 1,2\). Then, if \(q = q_{n} = d_{1}(n)/d_{2}(n)\not =1\), the affine combination \(T_{n}^{\mathit{GJ}}:=\big (T_{n}^{(1)} - qT_{n}^{(2)}\big)/(1 - q)\) is an unbiased estimator of γ.

Given \(\overline{H}\), and with ⌊x⌋ denoting the integer part of x, the most natural GJ r.v. is the one associated with the random pair \(\left (\overline{H}(k),\overline{H}(\lfloor k/2\rfloor )\right )\), i.e.

$$\displaystyle{{ \overline{H}}^{\mathit{GJ(q)}}(k):= \frac{\overline{H}(k) - q\ \overline{H}(\lfloor k/2\rfloor )} {1 - q},\quad q > 0, }$$

with

$$\displaystyle{q = q_{n} = \frac{\mathrm{ABIAS}\left (\overline{H}(k)\right )} {\mathrm{ABIAS}\left (\overline{H}(\lfloor k/2\rfloor )\right )} = \frac{{A}^{2}(n/k)} {{A}^{2}(n/\lfloor k/2\rfloor )}\ \mathop{\longrightarrow }_{n/k\rightarrow \infty }{2}^{-2\rho }.}$$

It is thus sensible to consider \(q = {2}^{-2\rho }\), and, with \(\hat{\rho }\) a consistent estimator of ρ, the approximate GJ estimator,

$$\displaystyle{{ \overline{H}}^{\mathit{GJ}}(k):= \frac{{2}^{2\hat{\rho }}\ \overline{H}(k) -\overline{H}(\lfloor k/2\rfloor )} {{2}^{2\hat{\rho }} - 1}. }$$
(11)
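In the same illustrative spirit, and reusing mvrb_estimator from the previous sketch, the GJ estimator in (11) can be coded as (names and error handling are our own choices):

```python
def gj_estimator(sample, k, beta_hat, rho_hat):
    """Generalised-jackknife estimator in (11), built from the random pair
    (H_bar(k), H_bar(floor(k/2))), with q replaced by 2^(-2 rho_hat)."""
    if k < 2:
        raise ValueError("k must be at least 2, so that floor(k/2) >= 1")
    w = 2.0 ** (2.0 * rho_hat)                          # w = 2^(2 rho_hat) = 1/q
    return (w * mvrb_estimator(sample, k, beta_hat, rho_hat)
            - mvrb_estimator(sample, k // 2, beta_hat, rho_hat)) / (w - 1.0)
```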

Then, and provided that \(\hat{\rho }-\rho = o_{p}(1)\),

$$\displaystyle{\sqrt{k}\left ({\overline{H}}^{\mathit{GJ}}(k)-\gamma \right )\ \stackrel{d}{=}\ \mathcal{N}(0,\sigma _{{\mathit{ GJ}}}^{2}) + o_{ p}\big(\sqrt{k}\ {A}^{2}(n/k)\big),}$$

with \(\sigma _{{\mathit{ GJ}}}^{2} {=\gamma }^{2}\big(1 + 1/{({2}^{-2\rho } - 1)}^{2}\big).\) Further details on the estimators in (11) can be found in [24]. As expected, we have again a trade-off between variance and bias. The bias decreases, but the variance increases, and to try to solve such a trade-off, an adequate estimation of third-order parameters, still an almost open topic of research in the area of statistics of extremes, would be needed. Anyway, at optimal levels, \({\overline{H}}^{\mathit{GJ}}\) can outperform \(\overline{H}\), as theoretically illustrated in Fig. 2.

Fig. 2  Typical patterns of variance, bias and MSE of H, \(\overline{H}\) and \({\overline{H}}^{\mathit{GJ}}\), as a function of the sample fraction \(r = k/n\)

A Monte-Carlo simulation of the mean value (E) and the root MSE (RMSE) of the estimators under consideration has revealed similar patterns. On the basis of 5,000 runs, and for a Burr(γ, ρ) parent, with d.f. \(F(x) = 1 - {(1 + {x}^{-\rho /\gamma })}^{1/\rho }\), x ≥ 0, with γ = 1 and \(\rho = -0.5\), we present Fig. 3 as an illustration of the results obtained for different underlying parents and different sample sizes.
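A minimal sketch of how such Burr(γ, ρ) samples can be generated, by inversion of the above d.f., follows (the routine is ours, not the one used in the original Monte-Carlo study):

```python
import numpy as np

def burr_sample(n, gamma, rho, seed=None):
    """Draw n observations from the Burr(gamma, rho) d.f.
    F(x) = 1 - (1 + x^(-rho/gamma))^(1/rho), x >= 0, with gamma > 0, rho < 0,
    via the inverse transform X = ((1 - U)^rho - 1)^(-gamma/rho)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    return ((1.0 - u) ** rho - 1.0) ** (-gamma / rho)

x = burr_sample(1000, gamma=1.0, rho=-0.5, seed=42)    # one sample of size n = 1,000
```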

Fig. 3  Simulated mean values (left) and RMSEs (right) of the estimators under study, for a sample of size n = 1,000 from an underlying Burr(γ, ρ) model, with \((\gamma,\rho ) = (1,-0.5)\)

As usual, we define the relative efficiency of any EVI-estimator as the quotient between the simulated RMSE of the H-estimator and that of the estimator under study, both computed at their optimal levels, i.e. for any statistic T, consistent for the EVI-estimation,

$$\displaystyle{\mathrm{REFF}_{T_{0}\vert H_{0}}:= \frac{\mathrm{RMSE}(H_{0})} {\mathrm{RMSE}(T_{0})},}$$

with \(T_{0}:= T(k_{0\vert T})\) and \(k_{0\vert T}:=\arg \min _{k}\mathrm{MSE}(T(k))\).

The simulation of those efficiencies for the same Burr model is based on 20 × 5,000 replicates and, as shown in Fig. 4, the REFF-indicators, as a function of n, are always larger than one, both for \(\overline{H}\), in (10), and for \({\overline{H}}^{\mathit{GJ}}\), in (11). Moreover, \({\overline{H}}^{\mathit{GJ}}\), computed at its optimal level, in the sense of minimal MSE, just as mentioned above, attains the highest REFF for this Burr underlying parent, as well as for other simulated parents with ρ > −1, unless n is very large. Details on multi-sample Monte-Carlo simulation can be found in [16].

Fig. 4  Simulated REFF indicators, as a function of the sample size n, for the same Burr parent

Some General Comments:

  • The GJ-estimator always has a smaller bias than the original estimator.

  • Regarding MSE, we are able to go below the MSE of the MVRB \(\overline{H}\)-estimator for a large variety of underlying parents and small values of | ρ | , as was illustrated here and can be further seen in [24].

  • Apart from what happens for very small values of ρ, there is a high reduction in the MSE of the GJ-estimator, at optimal levels, compared with that of the original \(\overline{H}\)-estimator, despite the already nice properties of the \(\overline{H}\) EVI-estimator.

3.2 The Bootstrap Methodology for the Estimation of Sample Fractions

As already mentioned in Sect. 2,

$$\displaystyle\begin{array}{rcl} k_{A\vert \overline{H}}(n)& =& \arg \min _{k}\mathrm{AMSE}\big(\overline{H}(k)\big) =\arg \min _{k}\Big( \frac{{\gamma }^{2}} {k} + b_{{\overline{H}}}^{2}\ {A}^{4}(n/k)\Big) {}\\ & =& k_{0\vert \overline{H}}(n)(1 + o(1)). {}\\ \end{array}$$

The bootstrap methodology enables us to estimate the optimal sample fraction, \(k_{0\vert \overline{H}}(n)/n\) in a way similar to the one used for the classical EVI estimation, in [6, 9, 16], now through the use of any auxiliary statistic, such as

$$\displaystyle{T_{n}(k) \equiv T_{n}^{\overline{H}}(k):= \overline{H}(\lfloor k/2\rfloor ) -\overline{H}(k),\quad k = 2,\ldots,n - 1,}$$

which converges in probability to the known value zero, for intermediate k. Moreover, under the third-order framework, in (7), we get:

$$\displaystyle{T_{n}(k)\ \stackrel{d}{=}\ \ \frac{\gamma \ P_{k}} {\sqrt{k}} + b_{{\overline{H}}}\ ({2}^{2\rho } - 1)\ {A}^{2}(n/k) + O_{ p}\big(A(n/k)/\sqrt{k}\big),}$$

with \(P_{k}\) asymptotically standard normal. The AMSE of \(T_{n}(k)\) is thus minimal at a level k such that \(\sqrt{ k}\ {A}^{2}(n/k) \rightarrow \lambda _{{A}}^{{\prime}}\not =0\). Consequently, denoting

$$\displaystyle{k_{A\vert T}(n):=\arg \min _{k}\mathrm{AMSE}\big(T_{n}^{\overline{H}}(k)\big) = k_{ 0\vert T}(1 + o(1)),}$$

we have

$$\displaystyle{ k_{0\vert \overline{H}}(n) = k_{0\vert T}(n){\left (1 - {2}^{2\rho }\right )}^{ \frac{2} {1-4\rho } }(1 + o(1)). }$$
(12)

Note also that, with the adequate simple modifications, a similar comment applies to the GJ EVI-estimator \({\overline{H}}^{\mathit{GJ}}(k)\), in (11).

Given the sample \(\underline{X}_{n} = (X_{1},\ldots,X_{n})\) from an unknown model F, and the functional \(T_{n}(k) =:\phi _{k}(\underline{X}_{n})\), 1 ≤ k < n, consider for any \(n_{1} = O({n}^{1-\epsilon })\), 0 < ε < 1, the bootstrap sample \(\underline{X}_{n_{1}}^{{\ast}} = (X_{1}^{{\ast}},\ldots,X_{n_{1}}^{{\ast}}),\) from the model \(F_{n}^{{\ast}}(x) =\sum _{ i=1}^{n}I_{[X_{i}\leq x]}/n,\) the empirical d.f. associated with our sample \(\underline{X}_{n}\). Next, consider \(T_{n_{1}}^{{\ast}}(k_{1}):=\phi _{k_{1}}(\underline{X}_{n_{1}}^{{\ast}}),1 < k_{1} < n_{1}.\) Then, with \(k_{0\vert T}^{{\ast}}(n_{1}) =\arg \min _{k_{1}}\mathrm{MSE}\big(T_{n_{1}}^{{\ast}}(k_{1})\big)\),

$$\displaystyle{k_{0\vert T}^{{\ast}}(n_{ 1})/k_{0\vert T}(n) ={ \left (n_{1}/n\right )}^{ \frac{4\rho } {1-4\rho } }(1 + o(1)),\quad \mbox{ as}\ n \rightarrow \infty.\ }$$

To get a simpler way of computing \(k_{0\vert T}(n)\), it is then sensible to use a double bootstrap, based on another sample size \(n_{2}\). Then, for every α > 1,

$$\displaystyle{\frac{\big{(k_{0\vert T}^{{\ast}}(n_{1})\big)}^{\alpha }} {k_{0\vert T}^{{\ast}}(n_{2})} \ {\left (\frac{n_{1}^{\alpha }} {{n}^{\alpha }} \frac{n} {n_{2}}\right )}^{- \frac{4\rho } {1-4\rho } } ={ \left \{k_{0\vert T}(n)\right \}}^{\alpha -1}(1 + o(1)).}$$

It is then enough to choose \(n_{2} = \left \lfloor n{\left (\frac{n_{1}} {n} \right )}^{\alpha }\right \rfloor \), in order to have independence of ρ. If we put \(n_{2} = \lfloor n_{1}^{2}/n\rfloor \), i.e. α = 2, we have

$$\displaystyle{\big{(k_{0\vert T}^{{\ast}}(n_{ 1})\big)}^{2}/k_{ 0\vert T}^{{\ast}}(n_{ 2}) = k_{0\vert T}(n)(1 + o(1)),}$$

and the possibility of estimating \(k_{0\vert T}(n)\) on the basis of \(k_{0\vert T}^{{\ast}}(n_{1})\) and \(k_{0\vert T}^{{\ast}}(n_{2})\) only. We are next able to estimate \(k_{0\vert \overline{H}}(n)\), on the basis of (12) and any estimate \(\hat{\rho }\) of the second-order parameter ρ. Then, with \(\hat{k}_{0\vert T}^{{\ast}}\) denoting the sample counterpart of \(k_{0\vert T}\), we have the estimate

$$\displaystyle{\hat{k}_{0\vert \overline{H}}^{{\ast}}(n;n_{ 1}):=\min \left (n - 1,\ \left \lfloor \frac{c_{\hat{\rho }}\ {(\hat{k}_{0\vert T}^{{\ast}}(n_{1}))}^{2}} {\hat{k}_{0\vert T}^{{\ast}}(\lfloor n_{1}^{2}/n\rfloor + 1)}\right \rfloor + 1\right ),\quad c_{\hat{\rho }} ={ \left (1 - {2}^{2\hat{\rho }}\right )}^{ \frac{2} {1-4\hat{\rho }} }.}$$

The final estimate of γ is then given by \({\overline{H}}^{{\ast}}\equiv \overline{H}_{n,n_{1}\vert T}^{{\ast}}:= \overline{H}_{\hat{\beta },\hat{\rho }}(\hat{k}_{0\vert \overline{H}}(n;n_{1})).\) And a similar procedure can be used to estimate any other parameter of extreme events, as well as the EVI, either through H or through \({\overline{H}}^{\mathit{GJ}}\).
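The following Python sketch outlines the double-bootstrap scheme just described for the adaptive choice of k (an illustrative reconstruction under our own naming and implementation choices, not the authors' code; the second-order estimates \(\hat{\beta }\) and \(\hat{\rho }\) are taken as given):

```python
import numpy as np

def mvrb_path(sorted_sample, beta_hat, rho_hat):
    """H_bar(k), k = 1, ..., m-1, from an ascending-ordered sample (cf. (9)-(10))."""
    logs = np.log(sorted_sample)
    m = logs.size
    ks = np.arange(1, m)
    top_sums = np.cumsum(logs[::-1])[:-1]                 # sums of the k largest log-observations
    hill = top_sums / ks - logs[m - 1 - ks]               # H(k)
    return hill * (1.0 - beta_hat / (1.0 - rho_hat) * (m / ks) ** rho_hat)

def bootstrap_k0(sample, beta_hat, rho_hat, n1=None, B=250, seed=None):
    """Double-bootstrap estimate of k_{0|H_bar}(n), based on the auxiliary
    statistic T(k) = H_bar(floor(k/2)) - H_bar(k) and on (12)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(sample, dtype=float)
    n = x.size
    n1 = int(n ** 0.975) if n1 is None else n1            # first bootstrap sample size
    n2 = n1 ** 2 // n + 1                                 # second bootstrap sample size

    def k0_star(m):
        """k minimising the simulated MSE of T*_m(k), k = 2, ..., m-1."""
        ks = np.arange(2, m)
        mse = np.zeros(ks.size)
        for _ in range(B):
            xb = np.sort(rng.choice(x, size=m, replace=True))
            hbar = mvrb_path(xb, beta_hat, rho_hat)       # hbar[k-1] = H_bar(k)
            mse += (hbar[ks // 2 - 1] - hbar[ks - 1]) ** 2
        return ks[np.argmin(mse)]

    k1, k2 = k0_star(n1), k0_star(n2)
    c_rho = (1.0 - 2.0 ** (2.0 * rho_hat)) ** (2.0 / (1.0 - 4.0 * rho_hat))
    return min(n - 1, int(c_rho * k1 ** 2 / k2) + 1)

# e.g., with the Burr sample x of the earlier sketch and (beta_hat, rho_hat) = (1, -0.5):
# k0_hat = bootstrap_k0(x, beta_hat=1.0, rho_hat=-0.5, seed=0)
# gamma_hat = mvrb_estimator(x, k0_hat, beta_hat=1.0, rho_hat=-0.5)
```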

The application of the associated bootstrap algorithm, with \(n_{1} = {n}^{0.975}\) and B = 250 generations, to the first randomly generated Burr(γ, ρ) sample of size n = 1,000, with γ = 1 and \(\rho = -0.5\), led us to \(\hat{k}_{0\vert H}^{{\ast}} = 76\), \(\hat{k}_{0\vert \overline{H}}^{{\ast}} = 157\), and \(\hat{k}_{0\vert {\overline{H}}^{\mathit{GJ}}}^{{\ast}} = 790\). The bootstrap EVI-estimates were \({H}^{{\ast}} = 1.259\), \({\overline{H}}^{{\ast}} = 1.108\) and \({\overline{H}}^{\mathit{GJ{\ast}}} = 1.049\), a value indeed closer to the target value γ = 1. In Fig. 5 we present the sample paths of the EVI-estimators under study.

Fig. 5  Sample paths of the EVI-estimators under study and bootstrap estimates of the \(k_{0\vert \bullet }\)-values, for a Burr random sample with γ = 1 and \(\rho = -0.5\)

4 Concluding Remarks

A few practical questions and final remarks can now be raised.

  • How does the asymptotic method work for moderate sample sizes? Is the method strongly dependent on the choice of \(n_{1}\)? Although aware of the theoretical need of \(n_{1} = o(n)\), what happens if we choose \(n_{1} = n - 1\)? Answers to these questions have not yet been fully given for the class of GJ EVI-estimators, in (11), but will surely be similar to the ones given for classical estimation and for the MVRB estimation. Usually, the method does not depend strongly on \(n_{1}\), and in practice we can choose \(n_{1} = n - 1\). And here we can mention again the old controversy between theoreticians and practitioners: the value \(n_{1} = \lfloor {n}^{1-\epsilon }\rfloor \) can be equal to n − 1 for small ε and a large variety of finite values of n. Also, \(k_{n} = [c\ln n]\) is intermediate for every constant c, and if we take, for instance, \(c = 1/5\), we get \(k_{n} = 1\) for every n ≤ 22,026. And Hall’s formula for the asymptotically optimal level for the Hill EVI-estimation (see [26]), given by \(k_{0\vert H}(n) = \left \lfloor {\big({(1-\rho )}^{2}{n}^{-2\rho }/\big(-2\ \rho \ {\beta }^{2}\big)\big)}^{1/(1-2\rho )}\right \rfloor \) and valid for models in (6), may lead, for a fixed n and several choices of (β, ρ), to \(k_{0\vert H}(n)\) either equal to 1 or to n − 1, according as ρ is close to 0 or quite small, respectively.

  • Note that bootstrap confidence intervals as well as asymptotic confidence intervals are easily associated with the estimates presented, and the smallest size (with a high coverage probability) is usually related to the EVI-estimator \({\overline{H}}^{\mathit{GJ}}\), in (11), as expected.