
1 Introduction

Extreme value theory (EVT) provides tools to estimate the probability of events more extreme than any that have already been observed. The classical result in EVT states that if the suitably normalized maximum of an independent and identically distributed (i.i.d.) sequence of random variables (r.v.'s) converges to some nondegenerate d.f. G γ , then G γ must be the generalized extreme value (GEV) distribution,

$$\displaystyle G_{\gamma }(x) =\exp \big(-(1 +\gamma x)^{-1/\gamma }\big),\quad 1 +\gamma x > 0,\quad \gamma \in \mathbf{R},$$

with the usual continuity correction \(G_{0}(x) =\exp (-{e}^{-x})\). The shape parameter γ, known as the tail index, determines the tail behavior: if γ > 0 the tail is heavy (Fréchet max-domain of attraction), γ = 0 corresponds to an exponential tail (Gumbel max-domain of attraction), and γ < 0 indicates a short tail (Weibull max-domain of attraction).

The first results in EVT were developed under independence but, more recently, models for extreme values have been constructed under the more realistic assumption of temporal dependence.

MARMA (maximum autoregressive moving average) processes with Fréchet marginals, in particular the ARMAX [or MARMA(1,0)] process, given by

$$\displaystyle\begin{array}{rcl} X_{i} =\max (c\,X_{i-1},W_{i}),& & {}\\ \end{array}$$

with 0 < c < 1 and \(\{W_{i}\}_{i\geq 1}\) i.i.d., have been successfully applied to time series modeling as an alternative to classical linear heavy-tailed ARMA models (see [2] and references therein). Generalizations of MARMA processes and their applications to financial time series can be seen in, e.g., [13] and [4]. Here we shall focus on autoregressive Pareto processes, i.e., autoregressive processes whose marginal distributions are of the Pareto or generalized Pareto form. As Pareto observed [10], many economic variables have heavy-tailed distributions that are not well modeled by the normal curve. Instead, he proposed a model, subsequently called the Pareto distribution in his honor, whose tail function decays like a negative power of x, i.e., \(1 - F(x) \sim c\,{x}^{-\alpha }\) as \(x \rightarrow \infty \). Generalizations of Pareto's distribution have been proposed for modeling economic variables (for a survey, see [1]).
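As an illustration (not part of the original analysis), the ARMAX recursion above is straightforward to simulate. The minimal NumPy sketch below uses unit-scale Fréchet innovations obtained by inverse transform; the parameter values c = 0.5, α = 2 and the starting value are arbitrary assumptions for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_armax(n, c, alpha, x0=1.0):
    """Simulate X_i = max(c * X_{i-1}, W_i) with unit-scale Frechet(alpha) W_i."""
    # Inverse-transform sampling of Frechet(alpha): W = (-log U)^(-1/alpha)
    w = (-np.log(rng.uniform(size=n))) ** (-1.0 / alpha)
    x = np.empty(n)
    prev = x0
    for i in range(n):
        prev = max(c * prev, w[i])  # the ARMAX recursion
        x[i] = prev
    return x

x = simulate_armax(10_000, c=0.5, alpha=2.0)
```

By construction every sample path satisfies X_i ≥ c X_{i−1}, which is a cheap sanity check on the implementation.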

We consider autoregressive Pareto(III) processes, more precisely, the Yeh–Arnold–Robertson Pareto(III) [12], in short YARP(III)(1), given by

$$\displaystyle\begin{array}{rcl} X_{n} =\min \Big ({p}^{-1/\alpha }X_{ n-1}, \frac{1} {1 - U_{n}}\varepsilon _{n}\Big),& & {}\\ \end{array}$$

where the innovations \(\{\varepsilon _{n}\}_{n\geq 1}\) are i.i.d. r.v.'s with distribution Pareto(III)(0,σ,α), i.e., a generalized type III Pareto distribution such that

$$\displaystyle 1 - F_{\varepsilon }(x) =\Big [1 +\Big (\frac{x}{\sigma }\Big)^{\alpha }\Big]^{-1},\quad x > 0,$$

with σ, α > 0. The sequence \(\{U_{n}\}_{n\geq 1}\) consists of i.i.d. r.v.'s with a Bernoulli(p) distribution, independent of the innovations. We interpret 1∕0 as + ∞. By conditioning on U n , it is readily verified that the YARP(III)(1) process has a Pareto(III)(0,σ,α) stationary distribution and is completely stationary if the distribution of the starting r.v. X 0 is also Pareto(III)(0,σ,α).
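The YARP(III)(1) recursion is equally easy to simulate. The sketch below (the parameter values p = 0.5, σ = 1, α = 2 are illustrative assumptions) samples the Pareto(III) innovations by inverse transform and checks the stationary Pareto(III) marginal through its median, which for these parameters equals σ = 1:

```python
import numpy as np

rng = np.random.default_rng(42)

def rpareto3(size, sigma, alpha):
    """Pareto(III)(0, sigma, alpha) sampling: F(x) = 1 - [1 + (x/sigma)^alpha]^{-1}."""
    v = rng.uniform(size=size)
    return sigma * (v / (1.0 - v)) ** (1.0 / alpha)

def simulate_yarp(n, p, sigma, alpha):
    """X_n = min(p^{-1/alpha} X_{n-1}, eps_n / (1 - U_n)), with 1/0 read as +inf."""
    eps = rpareto3(n, sigma, alpha)
    u = rng.uniform(size=n) < p                  # Bernoulli(p) sequence U_n
    x = np.empty(n + 1)
    x[0] = rpareto3(1, sigma, alpha)[0]          # stationary start X_0
    for i in range(n):
        x[i + 1] = min(p ** (-1.0 / alpha) * x[i], np.inf if u[i] else eps[i])
    return x

x = simulate_yarp(100_000, p=0.5, sigma=1.0, alpha=2.0)
emp_median = float(np.median(x))  # stationary Pareto(III)(0,1,2) median is sigma = 1
```

The empirical median staying close to σ is a quick confirmation that the recursion preserves the Pareto(III) marginal.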

In this paper we analyze the dependence behavior of the YARP(III)(1) process in the right tail (the most relevant for applications). This process is almost unknown in the literature but has large potential, as it exhibits tail behavior quite similar to that of ARMAX while allowing more robust parameter estimation [3]. We characterize the lag-m tail dependence (m = 1, 2, …) by computing several coefficients considered in [5, 6], defined under a temporal approach. The lag-m tail dependence allows a characterization of the process in time, analogous to the role of the ACF for a linear time series. In addition, these measures are also important in applications, such as risk assessment in financial time series or, in engineering, to investigate how the best performer in a system is attracted by the worst one.

2 Measures of Tail Dependence

The tail-dependence coefficient (TDC), usually denoted λ, was the first tail dependence concept to appear in the literature, in a paper by Sibuya [11], who showed that, no matter how high we choose the correlation of normal random pairs, if we go far enough into the tail, extreme events tend to occur independently in each margin. It measures the probability that one r.v. takes an extreme value given that another one does too. More precisely,

$$\displaystyle\begin{array}{rcl} \lambda =\lim _{t\downarrow 0}P(F_{1}(X_{1}) > 1 - t\vert F_{2}(X_{2}) > 1 - t),& & {}\\ \end{array}$$

where F 1 and F 2 are the distribution functions (d.f.'s) of the r.v.'s X 1 and X 2, respectively. It characterizes the dependence in the tail of a random pair (X 1, X 2): λ > 0 corresponds to tail dependence, whose degree is measured by the value of λ, whereas λ = 0 means tail independence. Modern risk management is highly interested in assessing the amount of tail dependence. As an example, the Value-at-Risk at probability level 1 − t (VaR1−t ) of a random asset Z is the quantile function evaluated at 1 − t, \(F_{Z}^{-1}(1 - t) =\inf \{ x: F_{Z}(x) \geq 1 - t\}\), and its estimation is highly sensitive to the tail behavior and the tail dependence of the portfolio's asset-return distribution. Observe that the TDC can be formulated as

$$\displaystyle\begin{array}{rcl} \lambda =\lim _{t\downarrow 0}P(X_{1} > \mathit{VaR}_{1-t}(X_{1})\vert X_{2} > \mathit{VaR}_{1-t}(X_{2})).& & {}\\ \end{array}$$
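The VaR formulation suggests a simple rank-based empirical estimate of the TDC at a finite level t. The sketch below is a generic illustration (not a method from the text): for a comonotone pair the joint exceedances always coincide, giving exactly 1, while for a countermonotone pair they never coincide, giving 0:

```python
import numpy as np

def empirical_tdc(x1, x2, t):
    """Rank-based estimate of P(F1(X1) > 1-t | F2(X2) > 1-t) at a finite level t."""
    n = len(x1)
    r1 = np.argsort(np.argsort(x1)) + 1   # ranks 1..n, i.e. n * empirical F1(X1)
    r2 = np.argsort(np.argsort(x2)) + 1
    exc1, exc2 = r1 / n > 1 - t, r2 / n > 1 - t
    return float(np.mean(exc1 & exc2) / np.mean(exc2))

z = np.arange(1.0, 101.0)
lam_comonotone = empirical_tdc(z, z, t=0.1)   # exceedances always coincide -> 1
lam_counter = empirical_tdc(z, -z, t=0.1)     # exceedances never coincide  -> 0
```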

Generalizations of the TDC have been considered with several practical applications. In [6], for integers s and k such that 1 ≤ s < d − k + 1 ≤ d, the upper s,k-extremal coefficient of a random vector X = (X 1, …, X d ) was considered, defined by

$$\displaystyle\begin{array}{rcl} \lambda _{U}(X_{s:d}\vert X_{d-k+1:d})& \equiv & \lambda _{U}(U_{s:d}\vert U_{d-k+1:d}) {}\\ & =& \lim _{t\downarrow 0}P(U_{s:d} > 1 - t\vert U_{d-k+1:d} > 1 - t), {}\\ \end{array}$$

where \(U_{1:d} \leq \ldots \leq U_{d:d}\) are the order statistics of \((F_{1}(X_{1}),\ldots,F_{d}(X_{d}))\) and X i: d is the inverse probability integral transform of U i: d . In engineering, the coefficient \(\lambda _{U}(X_{s:d}\vert X_{d-k+1:d})\) can be interpreted as the limiting probability that the sth worst performer in a system is attracted by the kth best one, provided the latter has an extremely good performance. In mathematical finance, \(\lambda _{U}(X_{s:d}\vert X_{d-k+1:d})\) can be viewed as the limiting conditional probability that X s: d violates its value-at-risk at level 1 − t, given that X d−k+1: d has done so. If s = k = 1, we obtain the upper extremal dependence coefficient, \({\epsilon }^{U}\), considered in [7].

The study of systemic stability is also an important issue within the context of extreme risk dependence. The fragility of a system has been addressed through the Fragility Index (FI) introduced in [8]. More precisely, consider a random vector X = (X 1, …, X d ) with d.f. F and let \(N_{x}:=\sum _{ i=1}^{d}\mathbf{1}_{\{X_{i}>x\}}\) be the number of exceedances among \(X_{1},\ldots,X_{d}\) above a threshold x. The FI corresponding to X is the asymptotic conditional expected number of exceedances, given that there is at least one exceedance, i.e., \(FI =\lim _{x\rightarrow \infty }E(N_{x}\vert N_{x} > 0)\). The stochastic system {X 1, …, X d } is called fragile whenever FI > 1. In [5], a generalization of the FI was introduced that measures the stability of a stochastic system divided into blocks. More precisely, the block-FI of a random vector X = (X 1, …, X d ) relative to a partition \(\mathcal{D} =\{ I_{1},\ldots,I_{s}\}\) of D = { 1, …, d} is

$$\displaystyle\begin{array}{rcl} FI(\mathbf{X},\mathcal{D}) =\lim _{x\rightarrow \infty }E(N_{\mathbf{x}}\vert N_{\mathbf{x}} > 0),& & {}\\ \end{array}$$

where N x is the number of blocks in which at least one exceedance of x occurs, i.e.,

$$\displaystyle{N_{\mathbf{x}} =\sum _{ j=1}^{s}\mathbf{1}_{\{\mathbf{ X}_{I_{j}}\not\leq \mathbf{x}_{I_{j}}\}},}$$

and where \(\mathbf{X}_{I_{j}}\) is the sub-vector of X whose components have indices in I j , with j = 1, …, s (i.e., \(\mathbf{X}_{I_{j}}\) is the jth block of the random vector X), and \(\mathbf{x}_{I_{j}}\) is a vector of length | I j  | with all components equal to x ∈ R. Observe that if we consider the partition \({\mathcal{D}}^{{\ast}} =\{ I_{j} =\{ j\}: j = 1,\ldots,d\}\), then the coefficient \(FI(\mathbf{X},{\mathcal{D}}^{{\ast}})\) is the FI introduced in [8]. All operations and inequalities on vectors are meant componentwise.

Here we shall consider the abovementioned tail dependence coefficients from a time series perspective. More precisely, consider a stationary process {X i } i ≥ 1 with marginal d.f. F X . The lag-m TDC (m = 1, 2, …) is given by

$$\displaystyle\begin{array}{rcl} \lambda _{m} =\lim _{t\downarrow 0}P(F_{X}(X_{1+m}) > 1 - t\vert F_{X}(X_{1}) > 1 - t),& & {}\\ \end{array}$$

measuring the probability that one observation takes an extreme value given that another, separated in time by a lag of m, does too. Analogously, we define the lag-m upper s,k-extremal coefficient,

$$\displaystyle\begin{array}{rcl} \lambda _{U}(X_{s:m}\vert X_{m-k+1:m})& \equiv & \lambda _{U}(U_{s:m}\vert U_{m-k+1:m}) {}\\ & =& \lim _{t\downarrow 0}P(U_{s:m} > 1 - t\vert U_{m-k+1:m} > 1 - t), {}\\ \end{array}$$

a measure of the probability that, over a horizon of m successive time instants, the sth worst performer is attracted by the kth best one, provided the latter has an extremely good performance. If s = k = 1, we obtain the lag-m upper extremal dependence coefficient, \(\epsilon _{m}^{U}\). Finally, the lag-m block-FI relative to a partition \(\mathcal{D}_{m}\) of D m  = { 1, …, m} is

$$\displaystyle\begin{array}{rcl} FI(\mathbf{X},\mathcal{D}_{m}) =\lim _{x\rightarrow \infty }E(N_{\mathbf{x}}\vert N_{\mathbf{x}} > 0),& & {}\\ \end{array}$$

where, for a horizon of m successive time instants, N x is the number of blocks in which at least one exceedance of x occurs. Hence it measures the stability, within m successive time instants, of a stochastic process divided into blocks. Analogously, we define \(FI(\mathbf{X},\mathcal{D}_{m}^{{\ast}})\) for the partition \(\mathcal{D}_{m}^{{\ast}} =\{ I_{j} =\{ j\}: j = 1,\ldots,m\}\) as the lag-m FI version of [8].

3 Tail Dependence of YARP(III)(1)

In this section we present a characterization of the dependence structure and tail behavior of the YARP(III)(1) process. We first recall some existing results and then compute the abovementioned measures.

In order to determine the distribution of the maximum, \(M_{n} =\max _{0\leq i\leq n}X_{i}\), it is convenient to consider a family of level crossing processes {Z n (x)} indexed by x > 0, defined by

$$\displaystyle{Z_{n}(x) = \left \{\begin{array}{ll} 1&\mathrm{if}\;X_{n} > x \\ 0&\mathrm{if}\;X_{n} \leq x. \end{array} \right.}$$

For each x > 0, the process {Z n (x)} is itself a Markov chain, with transition matrix given by

$$\displaystyle{P = \Big(1 +\Big (\frac{x}{\sigma }\Big)^{\alpha }\Big)^{-1}\left [\begin{array}{cc} p +\big (\frac{x}{\sigma }\big)^{\alpha } & 1 - p \\ (1 - p)\big(\frac{x}{\sigma }\big)^{\alpha } & 1 + p\big(\frac{x}{\sigma }\big)^{\alpha } \end{array} \right ].}$$

Hence, we have

$$\displaystyle\begin{array}{l} F_{M_{n}}(x) = P(M_{n} \leq x) = P(Z_{0}(x) = 0,Z_{1}(x) = 0,\ldots,Z_{n}(x) = 0) \\ \quad = P(X_{0} \leq x)\,P(Z_{i}(x) = 0\vert Z_{i-1}(x) = 0)^{n} = \frac{(\frac{x}{\sigma })^{\alpha }}{1 + (\frac{x}{\sigma })^{\alpha }}\,\Big(\frac{p + (\frac{x}{\sigma })^{\alpha }}{1 + (\frac{x}{\sigma })^{\alpha }}\Big)^{n} \end{array}$$

and \(\frac{{n}^{-1/\alpha }} {\sigma } M_{n}\stackrel{d}{\rightarrow }\mathit{Fr\acute{e}chet}(0,{(1 - p)}^{-1},\alpha )\).

In [3] it was proved that the YARP(III)(1) process has a β-mixing dependence structure. Hence it satisfies Leadbetter's local dependence condition D(u n ) [9] for any real sequence {u n } n ≥ 1 and so, for each τ > 0 such that n(1 − F X (u n )) → τ as n → ∞, we have \(P(M_{n} \leq u_{n}) \rightarrow {e}^{-\theta \tau }\) as n → ∞, with θ = 1 − p (Proposition 2.2 of [3]). The parameter θ, known in the literature as the extremal index, is associated with the tendency of high levels to cluster: if θ < 1, large values tend to occur in clusters, i.e., near each other, and tail dependence takes place. Indeed, the YARP(III)(1) process presents tail dependence with lag-m TDC \(\lambda _{m} = {p}^{m}\) (see Proposition 2.8 of [3]).
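The value λ m  = p m can be checked by Monte Carlo: simulate a long stationary YARP(III)(1) path, fix a small level t, and compare the empirical conditional exceedance frequency at lag m with p m . The sketch below (the parameter values p = 0.6, σ = 1, α = 2 and the level t = 0.005 are illustrative assumptions) does this for m = 1, 2:

```python
import numpy as np

rng = np.random.default_rng(1)
p, sigma, alpha = 0.6, 1.0, 2.0                    # illustrative parameter values

def rpareto3(size):
    """Pareto(III)(0, sigma, alpha) sampling by inverse transform."""
    v = rng.uniform(size=size)
    return sigma * (v / (1.0 - v)) ** (1.0 / alpha)

def simulate_yarp(n):
    """YARP(III)(1) path of length n + 1 with a stationary Pareto(III) start."""
    eps, u = rpareto3(n), rng.uniform(size=n) < p  # innovations and Bernoulli(p)
    x = np.empty(n + 1)
    x[0] = rpareto3(1)[0]
    for i in range(n):
        x[i + 1] = min(p ** (-1.0 / alpha) * x[i], np.inf if u[i] else eps[i])
    return x

x = simulate_yarp(500_000)
t = 0.005
a_t = sigma * (1.0 / t - 1.0) ** (1.0 / alpha)     # quantile of F_X at 1 - t
exc = x > a_t
# empirical P(X_{1+m} > a_t | X_1 > a_t), to compare with p**m = 0.6, 0.36
lam_hat = {m: float(np.mean(exc[m:] & exc[:-m]) / np.mean(exc[:-m])) for m in (1, 2)}
```

With roughly 2,500 exceedances the empirical values land within a few percentage points of p and p², consistent with the proposition.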

The one-step transition probability function (tpf) of the YARP(III)(1) process is given by:

$$\displaystyle\begin{array}{rcl} \begin{array}{rcl} Q(x,]0,y])& =&P(X_{n} \leq y\vert X_{n-1} = x) = P(\min ({p}^{-1/\alpha }x, \frac{\varepsilon _{n}} {1 - U_{n}}) \leq y) \\ & =&\left \{\begin{array}{ll} 1 - P( \frac{\varepsilon _{n}} {1 - U_{n}} > y)&,\,x > y\,{p}^{1/\alpha } \\ 1 &,\,x \leq y\,{p}^{1/\alpha }\end{array} \right. = \left \{\begin{array}{ll} (1 - p)F_{\varepsilon }(y)&,\,x > y\,{p}^{1/\alpha } \\ 1 &,\,x \leq y\,{p}^{1/\alpha }. \end{array} \right. \end{array} & & {}\\ \end{array}$$

Similarly, we derive the m-step tpf:

$$\displaystyle\begin{array}{rcl} {Q}^{m}(x,]0,y]) = \left \{\begin{array}{ll} 1 -\prod _{j=0}^{m-1}[\overline{F}_{\varepsilon }({p}^{j/\alpha }y)(1 - p) + p]&,\,x > y\,{p}^{m/\alpha } \\ 1 &,\,x \leq y\,{p}^{m/\alpha }. \end{array} \right.& &{}\end{array}$$
(1)
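Equation (1) can be verified numerically: at m = 1 it reduces algebraically to the one-step tpf, and at m = 2 it can be compared with a Monte Carlo estimate obtained by iterating the recursion from a fixed starting state. The sketch below (the parameter values p = 0.4, σ = 1, α = 1.5 and the states x₀ = 2, y = 1.5 are illustrative assumptions) performs both checks:

```python
import numpy as np

rng = np.random.default_rng(7)
p, sigma, alpha = 0.4, 1.0, 1.5        # illustrative parameter values

def F_eps(y):
    """Pareto(III)(0, sigma, alpha) d.f. of the innovations."""
    return 1.0 - 1.0 / (1.0 + (y / sigma) ** alpha)

def Qm(x, y, m):
    """m-step tpf Q^m(x, ]0, y]) of Eq. (1)."""
    if x <= y * p ** (m / alpha):
        return 1.0
    prod = 1.0
    for j in range(m):
        prod *= (1.0 - F_eps(p ** (j / alpha) * y)) * (1 - p) + p
    return 1.0 - prod

def step(x, size):
    """One YARP(III)(1) transition applied to `size` replicates of the state x."""
    v = rng.uniform(size=size)
    eps = sigma * (v / (1.0 - v)) ** (1.0 / alpha)             # Pareto(III) innovation
    innov = np.where(rng.uniform(size=size) < p, np.inf, eps)  # 1/0 read as +inf
    return np.minimum(p ** (-1.0 / alpha) * x, innov)

x0, y, n = 2.0, 1.5, 400_000
two_steps = step(step(np.full(n, x0), n), n)
mc = float(np.mean(two_steps <= y))    # Monte Carlo estimate of Q^2(x0, ]0, y])
exact = Qm(x0, y, 2)
```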

In the sequel we shall denote by a t the quantile function evaluated at 1 − t, i.e.,

$$\displaystyle\begin{array}{rcl} a_{t} \equiv F_{X}^{-1}(1 - t) =\sigma {({t}^{-1} - 1)}^{1/\alpha }& &{}\end{array}$$
(2)

and, for a set A, α(A) and ζ(A) denote the maximum and the minimum of A, respectively.

Proposition 1.

The YARP(III)(1) process has lag-m upper s,k-extremal coefficient,

$$\displaystyle{\begin{array}{l} \lambda _{U}(X_{s:m}\vert X_{m-k+1:m}) \\ \quad = \frac{\displaystyle\sum _{i=0}^{s-1}\sum _{ I\in \mathcal{F}_{i}}\sum _{J\subset I}{(-1)}^{\vert J\vert }{p}^{\alpha (\overline{I}\cup J)-\zeta (\overline{I}\cup J)}} {-\displaystyle\sum _{\varnothing \not =J\subset D_{m}}{(-1)}^{\vert J\vert }{p}^{\alpha (J)-\zeta (J)} -\sum _{ i=1}^{k-1}\sum _{ I\in \mathcal{F}_{i}}\sum _{J\subset \overline{I}}{(-1)}^{\vert J\vert }{p}^{\alpha (I\cup J)-\zeta (I\cup J)}}, \end{array} }$$

where \(\mathcal{F}_{i}\) denotes the family of all subsets of D m ={ 1,…,m} with cardinality equal to i and \(\overline{I}\) the complement of \(I \in \mathcal{F}_{i}\) in D m .

Proof.

Consider the notation \(P_{A}(t) = P\big(\bigcap _{a\in A}\{F_{X}(X_{a}) > 1 - t\}\big)\), for any set A. From Propositions 2.1 and 2.9 in [6], we have

$$\displaystyle{\begin{array}{c} \lambda _{U}(X_{s:m}\vert X_{m-k+1:m}) = \lim _{t\downarrow 0} \frac{\displaystyle\sum _{i=0}^{s-1}\sum _{ I\in \mathcal{F}_{i}}\sum _{J\subset I}{(-1)}^{\vert J\vert }P_{ \,\overline{I}\cup J}(t)/t} {-\displaystyle\sum _{\varnothing \not =J\subset \{1,\ldots,m\}}{(-1)}^{\vert J\vert }P_{ J}(t)/t-\sum _{i=1}^{k-1}\sum _{ I\in \mathcal{F}_{i}}\sum _{J\subset \overline{I}}{(-1)}^{\vert J\vert }P_{ I\cup J}(t)/t}. \end{array} }$$

Now just observe that, for \(i_{1} < i_{2} < i_{3}\), we have successively

$$\displaystyle{\begin{array}{rl} P_{\{i_{1},i_{2},i_{3}\}}(t) =&\int _{a_{t}}^{\infty }P(X_{ i_{3}} > a_{t},X_{i_{2}} > a_{t}\vert X_{i_{1}} = u_{1})\mathit{dF}_{X}(u_{1}) \\ =&\int _{a_{t}}^{\infty }\int _{ a_{t}}^{\infty }P(X_{ i_{3}} > a_{t}\vert X_{i_{2}} = u_{2})Q(u_{1},\mathit{du}_{2})\mathit{dF}_{X}(u_{1}) \\ =&\int _{a_{t}}^{\infty }\int _{ a_{t}}^{\infty }[1 - {Q}^{i_{3}-i_{2} }(u_{2},]0,a_{t}])]{Q}^{i_{2}-i_{1} }(u_{1},\mathit{du}_{2})\mathit{dF}_{X}(u_{1}),\\ \end{array} }$$

where a t is given in (2). Applying (1), we obtain

$$\displaystyle{\begin{array}{rl} P_{\{i_{1},i_{2},i_{3}\}}(t) =&t[t + {p}^{i_{3}-i_{2} }(1 - t)]\int _{a_{t}}^{\infty }\int _{ a_{t}}^{\infty }{Q}^{i_{2}-i_{1} }(u_{1},\mathit{du}_{2})\mathit{dF}_{X}(u_{1}) \\ =&t[t + {p}^{i_{3}-i_{2} }(1 - t)]\int _{a_{t}}^{\infty }[1 - {Q}^{i_{2}-i_{1} }(u_{1},]0,a_{t}])]\mathit{dF}_{X}(u_{1}) \\ =&[t + {p}^{i_{3}-i_{2} }(1 - t)][t + {p}^{i_{2}-i_{1} }(1 - t)]t. \end{array} }$$

A similar reasoning leads, for \(i_{1} < i_{2} <\ldots < i_{k}\), to

$$\displaystyle\begin{array}{rcl} \begin{array}{rl} &P_{\{i_{1},\ldots,i_{k}\}}(t) \\ =&\int _{a_{t}}^{\infty }\ldots \int _{ a_{t}}^{\infty }\big(1 - {Q}^{i_{k}-i_{k-1} }\big(u_{i_{k-1}},]0,a_{t}]\big)\big)\prod _{j=2}^{k-1}{Q}^{i_{k-j}-i_{k-j+1} }(u_{i_{k-j}},\mathit{du}_{i_{k-j+1}})\mathit{dF}_{X}(u_{i_{1}}) \\ =&\prod _{j=2}^{k}(t + {p}^{i_{j}-i_{j-1}}(1 - t))t\,,\end{array} & & {}\\ \end{array}$$

and hence

$$\displaystyle\begin{array}{rcl} \lim _{t\downarrow 0}P_{\{i_{1},\ldots,i_{k}\}}(t)/t =\lim _{t\downarrow 0}\prod _{j=2}^{k}(t + {p}^{i_{j}-i_{j-1} }(1 - t)) = {p}^{i_{k}-i_{1} }\,.& &{}\end{array}$$
(3)

 □ 

Corollary 1.

The YARP(III)(1) process has lag-m upper extremal dependence coefficient,

$$\displaystyle{\epsilon _{m}^{U} = \frac{{p}^{m-1}} {m - (m - 1)p}.}$$

A positive \(\epsilon _{m}^{U}\) means the existence of extremal dependence on a time horizon of m time instants.
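Proposition 1 and Corollary 1 can be cross-checked numerically: a direct evaluation of the subset sums in Proposition 1 for s = k = 1 should reproduce the closed form of Corollary 1. The sketch below implements the formula verbatim with `itertools` (m = 4 and p = 0.5 are illustrative choices):

```python
from itertools import combinations

def subsets(s, size=None):
    """All subsets of s (optionally of a fixed cardinality), as Python sets."""
    s = list(s)
    sizes = range(len(s) + 1) if size is None else [size]
    for k in sizes:
        for c in combinations(s, k):
            yield set(c)

def lam_sk(m, s, k, p):
    """Lag-m upper (s,k)-extremal coefficient of YARP(III)(1), per Proposition 1."""
    D = set(range(1, m + 1))
    span = lambda A: p ** (max(A) - min(A))        # p^{alpha(A) - zeta(A)}
    num = 0.0
    for i in range(s):                             # i = 0, ..., s-1
        for I in subsets(D, i):
            for J in subsets(I):
                num += (-1) ** len(J) * span((D - I) | J)
    den = 0.0
    for J in subsets(D):
        if J:                                      # nonempty subsets of D_m
            den -= (-1) ** len(J) * span(J)
    for i in range(1, k):
        for I in subsets(D, i):
            for J in subsets(D - I):
                den -= (-1) ** len(J) * span(I | J)
    return num / den

p, m = 0.5, 4
eps_U = lam_sk(m, 1, 1, p)                         # epsilon_m^U from Proposition 1
closed = p ** (m - 1) / (m - (m - 1) * p)          # closed form of Corollary 1
```

The two values agree to machine precision, and the same holds across a range of m and p.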

Proposition 2.

The YARP(III)(1) process has lag-m block-FI, relative to a partition \(\mathcal{D}_{m}\) of D m ={ 1,…,m}, given by

$$\displaystyle{FI(\mathbf{X},\mathcal{D}_{m}) = \frac{\sum _{j=1}^{s}\sum _{k=1}^{\vert I_{j}\vert }{(-1)}^{k-1}\sum _{J\subset I_{j};\vert J\vert =k}{p}^{\alpha (J)-\zeta (J)}} {m - (m - 1)p}.}$$

Proof.

Based on Propositions 3.1 and 5.2 in [5], we have

$$\displaystyle\begin{array}{rcl} \begin{array}{rl} FI(\mathbf{X},\mathcal{D}_{m}) =&\lim _{t\downarrow 0} \frac{\sum _{j=1}^{s}P(\bigcup _{i\in I_{j}}\{F_{X}(X_{i}) > 1 - t\})} {1 - P(\bigcap _{i\in \{1,\ldots,m\}}\{F_{X}(X_{i}) \leq 1 - t\})} \\ =&\lim _{t\downarrow 0}\frac{\sum _{j=1}^{s}\sum _{k=1}^{\vert I_{j}\vert }{(-1)}^{k-1}\sum _{J\subset I_{j};\vert J\vert =k}P(\bigcap _{i\in J}\{F_{X}(X_{i}) > 1-t\})} {1 - F_{M_{m-1}}(a_{t})}. \end{array} & & {}\\ \end{array}$$

Now observe that, from (3), we have

$$\displaystyle{\lim _{t\downarrow 0}P(\cap _{i\in J}\{F_{X}(X_{i}) > 1 - t\})/t = {p}^{\alpha (J)-\zeta (J)}}$$

and from (1) and (2), we have

$$\displaystyle{\lim _{t\downarrow 0}(1-F_{M_{m-1}}(a_{t}))/t =\lim _{t\downarrow 0}\frac{1} {t}\bigg(1-\frac{{t}^{-1} - 1} {{t}^{-1}} \Big{(\frac{p + {t}^{-1} - 1} {{t}^{-1}} \Big)}^{m-1}\bigg) = m-(m-1)p\,.}$$

 □ 

Corollary 2.

The YARP(III)(1) process has lag-m FI,

$$\displaystyle{FI(\mathbf{X},\mathcal{D}_{m}^{{\ast}}) = \frac{m} {m - (m - 1)p}.}$$

Therefore, over a time horizon of m (m > 1) time instants the process is fragile, since FI > 1.
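Proposition 2 and Corollary 2 can likewise be checked by direct evaluation: for the singleton partition the block-FI reduces to m∕(m − (m − 1)p), and for the trivial one-block partition it equals 1 (a single block yields at most one block exceedance, so the conditional expectation is 1). The sketch below evaluates the inclusion-exclusion numerator of Proposition 2, summing over subset cardinalities within each block (m = 4 and p = 0.5 are illustrative choices):

```python
from itertools import combinations

def block_fi(m, partition, p):
    """Lag-m block-FI of YARP(III)(1): inclusion-exclusion numerator over each
    block I_j, divided by m - (m-1)p, per Proposition 2."""
    num = 0.0
    for block in partition:
        block = sorted(block)
        for k in range(1, len(block) + 1):                   # subset cardinalities
            for J in combinations(block, k):
                num += (-1) ** (k - 1) * p ** (max(J) - min(J))  # p^{alpha(J)-zeta(J)}
    return num / (m - (m - 1) * p)

p, m = 0.5, 4
fi_singletons = block_fi(m, [{j} for j in range(1, m + 1)], p)  # lag-m FI, Corollary 2
fi_one_block = block_fi(m, [set(range(1, m + 1))], p)           # single block => FI = 1
```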

We remark that the tail measures given above depend only on the parameter p of the YARP(III)(1) process and thus can be estimated through an estimator of p. For a survey on the estimation of p, see [3].