
1 Introduction

Let \(\boldsymbol{X}_{n} = (X_{1,n},\cdots \,,X_{d,n})\), n = 1, 2, ⋯, be independent and identically distributed (i.i.d.) random vectors with common distribution function (df) F. Define the component-wise maxima \(M_{i,n} := \vee _{j=1}^{n}X_{i,j}\) and minima \(m_{i,n} := \wedge _{j=1}^{n}X_{i,j}\), 1 ≤ i ≤ d. Here and hereafter ∨ ( ∧ ) denotes the maximum (minimum). This paper focuses on dependence comparison of the limiting distributions of properly normalized vectors of component-wise maxima \(\boldsymbol{M}_{n} := (M_{1,n},\ldots,M_{d,n})\) and of component-wise minima \(\boldsymbol{m}_{n} := (m_{1,n},\ldots,m_{d,n})\), as n → ∞. The comparison method is based on asymptotic comparisons of the upper tails of the df F of the underlying sample \((\boldsymbol{X}_{n},n \geq 1)\).

For any two vectors \(\boldsymbol{a},\boldsymbol{b} \in {\mathbb{R}}^{d}\), the sum \(\boldsymbol{a} +\boldsymbol{ b}\), product \(\boldsymbol{ab}\), quotient \(\boldsymbol{a}/\boldsymbol{b}\), and vector inequalities such as \(\boldsymbol{a} \leq \boldsymbol{ b}\) are all operated component-wise. Let G and H be dfs defined on \({\mathbb{R}}^{d}\) with nondegenerate margins. A df F is said to be in the domain of attraction of G for the maxima, denoted as \(F \in \mbox{ DA}_{\vee }(G)\), if there exist \({\mathbb{R}}^{d}\)-valued sequences \(\boldsymbol{a}_{n} = (a_{1,n},\cdots \,,a_{d,n})\) with \(a_{i,n} > 0\), 1 ≤ i ≤ d, and \(\boldsymbol{b}_{n} = (b_{1,n},\cdots \,,b_{d,n})\), n = 1, 2, ⋯, such that for any \(\boldsymbol{x} = (x_{1},\cdots \,,x_{d})\), as n → ∞,

$$\displaystyle\begin{array}{rcl} & & \mathrm{P}\left \{\frac{M_{1,n} - b_{1,n}} {a_{1,n}} \leq x_{1},\cdots \,, \frac{M_{d,n} - b_{d,n}} {a_{d,n}} \leq x_{d}\right \} \\ & =& {F}^{n}(\boldsymbol{a}_{n}\boldsymbol{x} +\boldsymbol{ b}_{n}) \rightarrow G(\boldsymbol{x}), {}\end{array}$$
(19.1.1)

and in this case, G is called a max multivariate extreme value (MEV) distribution. Similar definitions for min MEV distributions and their domain of attraction can be made. For minima, Eq. (19.1.1) is replaced by

$$\displaystyle\begin{array}{rcl} & & \mathrm{P}\left \{\frac{m_{1,n} - b_{1,n}} {a_{1,n}} > x_{1},\cdots \,, \frac{m_{d,n} - b_{d,n}} {a_{d,n}} > x_{d}\right \} \\ & =&{ \overline{F}}^{n}(\boldsymbol{a}_{n}\boldsymbol{x} +\boldsymbol{ b}_{n}) \rightarrow \overline{H}(\boldsymbol{x}), {}\end{array}$$
(19.1.2)

which is denoted by \(F \in \mbox{ DA}_{\wedge }(H)\). Here and hereafter bars on top of dfs denote (joint) survival functions. A key property of an MEV distribution G is that all positive powers of G are also distributions, and max MEV distributions coincide with the max-stable distributions, which form a subclass of the max-infinitely divisible distributions. Similarly, min MEV distributions coincide with the min-stable distributions, which form a subclass of the min-infinitely divisible distributions. One need only study the case of maxima, as the theory for minima is similar.

Let \(\boldsymbol{X} = (X_{1},\ldots,X_{d})\) be a generic random vector with distribution F and continuous, univariate margins \(F_{1},\ldots,F_{d}\). If \(F \in \mbox{ DA}_{\vee }(G)\), then G is closely related to the upper tail distribution of \(\boldsymbol{X}\), which often possesses the heavy tail property of regular variation. Without loss of generality, we may assume that \(\boldsymbol{X}\) is nonnegative component-wise. Consider the standard case in which the survival functions \(\overline{F}_{i}(x) := 1 - F_{i}(x)\), 1 ≤ i ≤ d, of the margins are right tail equivalent; that is,

$$\displaystyle{ \frac{\overline{F}_{i}(x)} {\overline{F}_{1}(x)} = \frac{1 - F_{i}(x)} {1 - F_{1}(x)} \rightarrow 1,\ \mbox{ as}\ x \rightarrow \infty,\ 1 \leq i \leq d. }$$
(19.1.3)

The distribution F or random vector \(\boldsymbol{X}\) is said to be multivariate regularly varying (MRV) at ∞ with intensity measure ν if there exist a scaling function b(t) → ∞ and a nonzero Radon measure ν( ⋅) such that, as t → ∞,

$$\displaystyle\begin{array}{rcl} t\,\mathrm{P}\left \{ \frac{\boldsymbol{X}} {b(t)} \in B\right \} \rightarrow \nu (B),& & \forall \ \mbox{ relatively compact sets}\ B \subset \overline{\mathbb{R}}_{+}^{d}\setminus \{0\}, \\ & & \quad \mbox{ with}\ \nu (\partial B) = 0, {}\end{array}$$
(19.1.4)

where \(\overline{\mathbb{R}}_{+}^{d} := {[0,\infty ]}^{d}\). The extremal dependence information of \(\boldsymbol{X}\) is encoded in the intensity measure ν, which satisfies \(\nu (tB) = {t}^{-\alpha }\nu (B)\) for all relatively compact subsets B that are bounded away from the origin, where α > 0 is known as the tail index. Since the set \(B_{1} =\{\boldsymbol{ x} \in {\mathbb{R}}^{d} : x_{1} > 1\}\) is relatively compact within the cone \(\overline{\mathbb{R}}_{+}^{d}\setminus \{0\}\) and ν(B 1) > 0 under Eq. (19.1.3), it follows from Eq. (19.1.4) that, after appropriately normalizing the intensity measure by ν(B 1), the scaling function b(t) can be chosen to satisfy \(\overline{F}_{1}(b(t)) = {t}^{-1}\), t > 0. That is, b(t) can be chosen as \(b(t) = \overline{F}_{1}^{\,-1}({t}^{-1}) = F_{1}^{-1}(1 - {t}^{-1})\) under the condition (19.1.3), and thus Eq. (19.1.4) can be expressed equivalently as

$$\displaystyle{ \lim _{t\rightarrow \infty }\frac{\mathrm{P}\{\boldsymbol{X} \in tB\}} {\mathrm{P}\{X_{1} > t\}} =\nu (B),\ \forall \ \mbox{ relatively compact sets}\ B \subset \overline{\mathbb{R}}_{+}^{d}\setminus \{0\}, }$$
(19.1.5)

satisfying ν(∂B) = 0. It follows from Eqs. (19.1.5) and (19.1.3) that for 1 ≤ i ≤ d,

$$\displaystyle{\lim _{t\rightarrow \infty }\frac{\mathrm{P}\{X_{i} > ts\}} {\mathrm{P}\{X_{i} > t\}} =\nu ((s,\infty ] \times \overline{\mathbb{R}}_{+}^{d-1}) = {s}^{-\alpha }\nu ((1,\infty ] \times \overline{\mathbb{R}}_{+}^{d-1}),\ \forall \ s > 0.}$$

That is, the univariate margins have regularly varying right tails. In general, a Borel-measurable function \(g : \mathbb{R}_{+} \rightarrow \mathbb{R}_{+}\) is regularly varying with exponent \(\rho \in \mathbb{R}\), denoted as \(g \in \mbox{ RV}_{\rho }\), if and only if

$$\displaystyle{ g(t) = {t}^{\rho }\ell(t),\ \mbox{ with $\ell(\cdot ) \geq 0$ satisfying that}\ \lim _{t\rightarrow \infty }\frac{\ell(ts)} {\ell(t)} = 1,\ \mbox{ for}\ s > 0. }$$
(19.1.6)

The function ℓ( ⋅) is known as a slowly varying function, denoted as \(\ell\in \mbox{ RV}_{0}\). Since \(\overline{F}_{1} \in \mbox{ RV}_{-\alpha }\), we have \(1/\overline{F}_{1} \in \mbox{ RV}_{\alpha }\), and thus, by Proposition 2.6 (v) of [387], the scaling function \(b \in \mbox{ RV}_{{\alpha }^{-1}}\).

Since all the margins are tail equivalent as assumed in Eq. (19.1.3), one has

$$\displaystyle{ \overline{F}_{i}(t) = {t}^{-\alpha }\ell_{ i}(t),\ \mbox{ where}\ \ell_{i} \in \mbox{ RV}_{0},\ \mbox{ and}\ \ell_{i}(t)/\ell_{j}(t) \rightarrow 1\ \mbox{ as}\ t \rightarrow \infty,\ \mbox{ for any}\ i\neq j, }$$
(19.1.7)

which, together with \(\overline{F}_{1}(b(t)) = {t}^{-1}\), imply that

$$\displaystyle{ \lim _{t\rightarrow \infty }t\,\mathrm{P}\{X_{i} > b(t)s\} =\lim _{t\rightarrow \infty }\frac{\mathrm{P}\{X_{i} > b(t)s\}} {\overline{F}_{i}(b(t))} \frac{\overline{F}_{i}(b(t))} {\overline{F}_{1}(b(t))} = {s}^{-\alpha },\ s > 0,\ 1 \leq i \leq d. }$$
(19.1.8)
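As a quick numerical check of Eq. (19.1.8), consider the following minimal Python sketch, assuming a pure Pareto margin \(\overline{F}_{1}(x) = {x}^{-\alpha }\), x ≥ 1, with α = 2 (an assumed example, for which \(b(t) = F_{1}^{-1}(1 - {t}^{-1}) = {t}^{1/\alpha }\)); it evaluates \(t\,\mathrm{P}\{X_{i} > b(t)s\}\) against the limit s − α:

```python
import numpy as np

# Assumed example: pure Pareto margin, bar F(x) = x^(-alpha) for x >= 1,
# so the scaling function is b(t) = F^{-1}(1 - 1/t) = t^(1/alpha).
alpha = 2.0

def b(t):
    return t ** (1.0 / alpha)

def surv(x):
    return np.minimum(1.0, np.asarray(x, dtype=float) ** (-alpha))

for t in [1e2, 1e4, 1e6]:
    for s in [0.5, 1.0, 2.0]:
        lhs = t * surv(b(t) * s)          # t P{X_i > b(t) s}
        print(t, s, lhs, s ** (-alpha))   # lhs should match s^(-alpha)
```

For a pure Pareto margin the two printed columns agree exactly once b(t)s ≥ 1; with a nontrivial slowly varying factor ℓ i the agreement would only be asymptotic as t → ∞.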

Detailed discussions of univariate and multivariate regular variation can be found in [61, 387]. The extension of MRV beyond the nonnegative orthant can be done by using the tail probability of \(\vert \vert \boldsymbol{X}\vert \vert \), where | | ⋅ | | denotes a norm on \({\mathbb{R}}^{d}\), in place of the marginal tail probability in Eq. (19.1.5) (see [387], Sect. 6.5.5). The case in which the limit in Eq. (19.1.3) is any nonzero constant can be easily converted into the standard tail equivalent case by properly rescaling the margins. If the limit in Eq. (19.1.3) is zero or infinity, then some margins have heavier tails than others. One way to overcome this problem is to standardize the margins via marginal monotone transforms (see Theorem 6.5 in [387]) or to use the copula method [281].

Theorem 19.1.1 (Marshall and Olkin [309])

Assume that Eq. (19.1.3) holds. Then there exist normalization vectors \(\boldsymbol{a}_{n} > \boldsymbol{0}\) and \(\boldsymbol{b}_{n}\) such that, as n →∞,

$$\displaystyle{\mathrm{P}\Big\{\frac{\boldsymbol{M}_{n} -\boldsymbol{ b}_{n}} {\boldsymbol{a}_{n}} \leq \boldsymbol{ x}\Big\} \rightarrow G(\boldsymbol{x}),\qquad \forall \ \boldsymbol{x} \in \mathbb{R}_{+}^{d},}$$

where G is a d-dimensional distribution with Fréchet margins \(G_{i}(s) =\exp \{ -{s}^{-\alpha }\}\) , 1 ≤ i ≤ d, if and only if F is MRV with intensity measure \(\nu ({[0,\boldsymbol{x}]}^{c}) := -\log G(\boldsymbol{x})\) .

In other words, \(F \in \mbox{ DA}_{\vee }(G)\), where G has Fréchet margins with tail index α, if and only if F is MRV with intensity measure \(\nu ({[0,\boldsymbol{x}]}^{c}) = -\log G(\boldsymbol{x})\).
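Theorem 19.1.1 can be visualized by simulation. The following minimal Python sketch (an assumed example: independent standard Pareto components with d = 2, using \(\boldsymbol{b}_{n} = \boldsymbol{0}\) and \(a_{i,n} = \overline{F}_{i}^{\,-1}(1/n)\) as in Remark 19.1.2 (1) below) compares the empirical df of the normalized maxima with the Fréchet limit \(G(x,x) =\exp \{ -2{x}^{-\alpha }\}\):

```python
import numpy as np

# Assumed example: X has independent Pareto(alpha) components, d = 2, so
# G(x1, x2) = exp(-x1^(-alpha) - x2^(-alpha)), a_n = n^(1/alpha), b_n = 0.
rng = np.random.default_rng(0)
alpha, n, reps = 2.0, 500, 5000
a_n = n ** (1.0 / alpha)                 # = bar F_i^{-1}(1/n) for Pareto(alpha)

U = rng.uniform(size=(reps, n, 2))
X = (1.0 - U) ** (-1.0 / alpha)          # i.i.d. Pareto(alpha) samples
M = X.max(axis=1) / a_n                  # normalized component-wise maxima

x = 1.5
emp = np.mean((M[:, 0] <= x) & (M[:, 1] <= x))
print(emp, np.exp(-2.0 * x ** (-alpha)))  # empirical vs. limiting G(x, x)
```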

Remark 19.1.2.

  1. 1.

The normalization vectors \(\boldsymbol{a}_{n} > \boldsymbol{0}\) and \(\boldsymbol{b}_{n}\) in Theorem 19.1.1 can be made precise so that \(\boldsymbol{b}_{n}=\boldsymbol{0}\) and \(\boldsymbol{a}_{n}=(\overline{F}_{1}^{\,-1}(1/n),\ldots,\overline{F}_{d}^{\,-1}(1/n))\), which depend only on the margins of F.

  2. 2.

    If Eq. (19.1.3) does not hold, Theorem 19.1.1 can still be established, but the nonstandard global regular variation with different scaling functions among various margins needs to be used in place of Eq. (19.1.5), which uses the same scaling function among different margins.

  3. 3.

The one-dimensional version of Theorem 19.1.1 is due to Gnedenko [184]. Note that the parametric feature enjoyed by univariate extremes is lost in the multivariate context.

  4. 4.

    Let \(\mathbb{S}_{+}^{d-1} =\{\boldsymbol{ a} :\boldsymbol{ a} = (a_{1},\ldots,a_{d}) \in \mathbb{R}_{+}^{d},\vert \vert \boldsymbol{a}\vert \vert = 1\}\), where | | ⋅ | | is a norm defined in \({\mathbb{R}}^{d}\). Using the polar coordinates, G can be expressed as follows:

$$\displaystyle{G(\boldsymbol{x}) =\exp \Big\{ -c\int _{\mathbb{S}_{+}^{d-1}}\max _{1\leq i\leq d}\{{(a_{i}/x_{i})}^{\alpha }\}\mathbb{Q}(\mathrm{d}\boldsymbol{a})\Big\},}$$

where c > 0 and \(\mathbb{Q}\) is a probability measure defined on \(\mathbb{S}_{+}^{d-1}\). This is known as the Pickands representation [374], and \(c\mathbb{Q}(\cdot )\) is known as the spectral or angular measure (a numerical sketch follows this remark).

  5. 5.

Note that the spectral measure is a finite measure that can be approximated by a sequence of discrete measures. Using this idea, Marshall and Olkin [309] showed that the MEV distribution G is positively associated. This implies that, for sufficiently large n, we have asymptotically

    $$\displaystyle{\mathrm{E}\Big[f\big(\boldsymbol{M}_{n}\big)g\big(\boldsymbol{M}_{n}\big)\Big] \geq \mathrm{ E}\Big[f\big(\boldsymbol{M}_{n}\big)\Big]\mathrm{E}\Big[g\big(\boldsymbol{M}_{n}\big)\Big]}$$

    for all nondecreasing functions \(f,g : {\mathbb{R}}^{d}\mapsto \mathbb{R}\). Observe that the sample vector \(\boldsymbol{X}_{n}\) could have any dependence structure, but the strong positive dependence emerges among multivariate extremes.

  6. 6.

    Since G is max-infinitely divisible, all bivariate margins of G are TP2, a positive dependence property that is even stronger than the positive association of bivariate margins (see Theorem 2.6 in [211]).
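To make the Pickands representation in Remark 19.1.2 (4) concrete, consider the following minimal Python sketch, assuming d = 2, the L1 norm, a three-atom spectral measure \(\mathbb{Q}\) with assumed weights, and c = 1 (the margins are not standardized here); it evaluates G for this discrete spectral measure:

```python
import numpy as np

# Assumed discrete spectral measure on the L1 simplex (d = 2): Q puts mass
# q[k] at atom a[k] = (a_k1, a_k2) with a_k1 + a_k2 = 1, and c = 1, so
# G(x) = exp(-c * sum_k q[k] * max_i (a_ki / x_i)^alpha).
alpha, c = 2.0, 1.0
atoms = np.array([[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]])
q = np.array([0.5, 0.25, 0.25])

def G(x):
    ratios = (atoms / np.asarray(x, dtype=float)) ** alpha  # (a_ki / x_i)^alpha
    return np.exp(-c * np.sum(q * ratios.max(axis=1)))

print(G([1.0, 1.0]), G([2.0, 1.0]))
```

Approximating a general spectral measure by such discrete measures is exactly the idea behind Remark 19.1.2 (5).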

Since the normalization vectors \(\boldsymbol{a}_{n} > 0\) and \(\boldsymbol{b}_{n}\) in Theorem 19.1.1 depend only on the margins, dependence comparison of G can be easily established using the orthant dependence order on sample vectors. Recall that a d-dimensional random vector \(\boldsymbol{X} = (X_{1},\ldots,X_{d})\) with df F is said to be smaller than another d-dimensional random vector \(\boldsymbol{X}^{\prime} = (X^{\prime}_{1},\ldots,X^{\prime}_{d})\) with df F ′ in the upper (lower) orthant order, denoted as \(\boldsymbol{X} \leq _{\mathrm{uo}}\boldsymbol{X}^{\prime}\) or \(F \leq _{\mathrm{uo}}F^{\prime}\) (\(\boldsymbol{X} \leq _{\mathrm{lo}}\boldsymbol{X}^{\prime}\) or \(F \leq _{\mathrm{lo}}F^{\prime}\)), if

$$\displaystyle{ \mathrm{P}\{X_{1} > x_{1},\ldots,X_{d} > x_{d}\} \leq \mathrm{ P}\{X^{\prime}_{1} > x_{1},\ldots,X^{\prime}_{d} > x_{d}\}, }$$
(19.1.9)
$$\displaystyle{ \mathrm{P}\{X_{1} \leq x_{1},\ldots,X_{d} \leq x_{d}\} \leq \mathrm{ P}\{X^{\prime}_{1} \leq x_{1},\ldots,X^{\prime}_{d} \leq x_{d}\}, }$$
(19.1.10)

for all \((x_{1},\ldots,x_{d}) \in {\mathbb{R}}^{d}\). If, in addition, their corresponding univariate margins are identical, then \(\boldsymbol{X}\) is said to be smaller than \(\boldsymbol{X}^{\prime}\) in the upper (lower) orthant dependence order, denoted as \(\boldsymbol{X} \leq _{\mathrm{uod}}\boldsymbol{X}^{\prime}\) or F ≤ uod F ′ (\(\boldsymbol{X} \leq _{\mathrm{lod}}\boldsymbol{X}^{\prime}\) or F ≤ lod F ′). Clearly \(\boldsymbol{X} \leq _{\mathrm{uod}}\boldsymbol{X}^{\prime}\) implies that \(\boldsymbol{X} \leq _{\mathrm{uo}}\boldsymbol{X}^{\prime}\), but the order ≤ uod focuses on comparing the scale-invariant dependence among components. Detailed discussions of these orders can be found in [335, 426]. The following result is immediate due to the fact that the orthant order is closed under weak convergence.

Proposition 19.1.3

Let \((\boldsymbol{X}_{n},n \geq 1)\) and \((\boldsymbol{X}^{\prime}_{n},n \geq 1)\) be two i.i.d. samples with dfs F and F′, respectively. If \(F \in \mbox{ DA}_{\vee }(G)\) and \(F^{\prime} \in \mbox{ DA}_{\vee }(G^{\prime})\) with Fréchet margins, then \(\boldsymbol{X}_{n} \leq _{\mathrm{lod}}\boldsymbol{X}^{\prime}_{n}\) implies that G ≤ lod G′.

Note, however, that the ordering \(\boldsymbol{X}_{n} \leq _{\mathrm{lod}}\boldsymbol{X}^{\prime}_{n}\) is strongly affected by the behavior at the center of the distribution and is often too strong to be valid. The fact that MRV is a tail property motivates us to focus on comparing only the upper tails of \(\boldsymbol{X}_{n}\) and \(\boldsymbol{X}^{\prime}_{n}\), leading to weaker notions of stochastic tail orders. In Sect. 19.2, we introduce a notion of stochastic tail order for random variables, establish related closure properties, and discuss its relation with other asymptotic orders that are already available in the literature. In Sect. 19.3, we extend the stochastic tail order to random vectors and show that the stochastic tail order of sample vectors implies the orthant dependence order of the corresponding MEV distributions.

2 Stochastic Tail Orders

Let X and Y be two \(\mathbb{R}_{+}\)-valued random variables. X is said to be smaller than Y in the sense of stochastic tail order, denoted as X ≤ sto Y, if there exists a threshold constant t 0 > 0 (usually large) such that

$$\displaystyle{ \mathrm{P}\{X > t\} \leq \mathrm{ P}\{Y > t\},\qquad \forall \ t > t_{0}. }$$
(19.2.1)

Remark 19.2.1.

  1. 1.

    The stochastic tail order ≤ sto is reflexive and transitive. ≤ sto is antisymmetric if tail identically distributed random variables are considered to be equivalent.

  2. 2.

If X is smaller than Y in the usual stochastic order (denoted as X ≤ st Y ; see Sect. 1.A in [426]); that is, \(\mathrm{P}\{X > t\} \leq \mathrm{ P}\{Y > t\}\) for all t, then X ≤ sto Y.

  3. 3.

X ≤ sto Y if and only if there exists a small open neighborhood of ∞ within which X is stochastically smaller than Y.

  4. 4.

X ≤ sto Y implies

    $$\displaystyle{ \limsup _{t\rightarrow \infty }\frac{\mathrm{P}\{X > t\}} {\mathrm{P}\{Y > t\}} \leq 1. }$$
    (19.2.2)

Stochastic tail orders based on limiting inequalities such as Eq. (19.2.2) have been introduced and studied in [43, 242, 243, 389–391] and more recently in [300]. Most of these tail orders, however, are based on limiting approaches rather than stochastic comparison theory.

  1. 1.

Mainik and Rüschendorf studied in [300] the following weak tail order: A random variable X is said to be smaller than another random variable Y in the asymptotic portfolio loss order, denoted as X ≤ apl Y, if the limiting inequality (19.2.2) holds. Observe that \(\sup \limits _{s>t}\frac{\mathrm{P}\{X>s\}} {\mathrm{P}\{Y >s\}}\) is decreasing in t, and as such, in the case of X ≤ apl Y with

    $$\displaystyle{\limsup _{t\rightarrow \infty }\frac{\mathrm{P}\{X > t\}} {\mathrm{P}\{Y > t\}} =\lim _{t\rightarrow \infty }\left [\sup _{s>t}\frac{\mathrm{P}\{X > s\}} {\mathrm{P}\{Y > s\}}\right ] = 1,}$$

one can find in any open neighborhood (c, ∞] of ∞ some s > c with P{X > s} ≥ P{Y > s}. That is, neither X nor Y dominates the other in any open neighborhood (c, ∞] of ∞, but asymptotically, the right tail of X decays at a rate that is bounded from above by the tail decay rate of Y.

  2. 2.

    Rojo introduced in [391] a stronger version of tail orders: Define X < sq Y if

    $$\displaystyle{\limsup _{u\rightarrow 1}\frac{{F}^{-1}(u)} {{G}^{-1}(u)} < 1}$$

where F − 1( ⋅) and G − 1( ⋅) denote the left-continuous inverses of the dfs of X and Y, respectively. Obviously, X < sq Y implies that X ≤ sto Y. Note, however, that < sq is not a partial ordering.

The stochastic tail orders via limiting inequalities resemble the idea of comparing asymptotic decay rates that is often employed in the theory of large (and small) deviations [284]. In contrast, the notion Eq. (19.2.1) compares random variables stochastically in a small open neighborhood of ∞, within which the theory of stochastic orders retains its full power. For example, coupling remains valid in a small open neighborhood of ∞.

Theorem 19.2.2

Let X and Y be two positive random variables with support [0,∞). X ≤ sto Y if and only if there exists a random variable Z defined on the probability space \((\Omega,\mathcal{F},\mathrm{P})\) with support [a,b], and nondecreasing functions ψ 1 and ψ 2 with \(\lim _{z\rightarrow b}\psi _{i}(z) = \infty \) , i = 1,2, such that \(X{ d \atop =} \psi _{1}(Z)\), \(Y{ d \atop =} \psi _{2}(Z)\) and \(\mathrm{P}\{\psi _{1}(Z) \leq \psi _{2}(Z)\mid Z \geq z_{0}\} = 1\) for some z 0 > 0.

Proof: Let X and Y have distributions F and G with support [0, ), respectively, and let F − 1( ⋅) and G − 1( ⋅) denote the corresponding left-continuous inverses. Recall that for any df H on \(\mathbb{R}\), the left-continuous inverse of H is defined as

$$\displaystyle{{H}^{-1}(u) :=\inf \{ s : H(s) \geq u\},\ 0 \leq u \leq 1.}$$

The left-continuous inverse has the following desirable properties:

  1. 1.

    H(H − 1(u)) ≥ u for all 0 ≤ u ≤ 1, and H − 1(H(x)) ≤ x for all \(x \in \mathbb{R}\).

  2. 2.

H − 1(u) ≤ x if and only if u ≤ H(x).

  3. 3.

    The set {s : H(s) ≥ u} is closed for each 0 ≤ u ≤ 1.

Necessity: Using Properties 1 and 2, X ≤ sto Y implies that \({F}^{-1}(u) \leq {G}^{-1}(u)\), ∀u > u 0 for some 0 < u 0 < 1. Let U be a random variable with standard uniform distribution, and thus \(\mathrm{P}\{{F}^{-1}(U) \leq {G}^{-1}(U)\mid U \geq u_{0}\} = 1\). Using Property 2, \(\mathrm{P}\{{F}^{-1}(U) \leq x\} =\mathrm{ P}\{U \leq F(x)\} = F(x)\). Similarly, \(\mathrm{P}\{{G}^{-1}(U) \leq x\} = G(x)\). Hence, taking Z = U, \(\psi _{1} = {F}^{-1}\), and \(\psi _{2} = {G}^{-1}\) (with z 0 = u 0 and b = 1) yields the desired coupling.

Sufficiency: For all t ≥ c, where c is a constant with \(c >\psi _{1}(z_{0})\),

$$\displaystyle\begin{array}{rcl} \mathrm{P}\{X > t\}& =& \mathrm{P}\{Z \geq z_{0}\}\mathrm{P}\{\psi _{1}(Z) > t\mid Z \geq z_{0}\} {}\\ & \leq & \mathrm{P}\{Z \geq z_{0}\}\mathrm{P}\{\psi _{2}(Z) > t\mid Z \geq z_{0}\} {}\\ & \leq & \mathrm{P}\{\psi _{2}(Z) > t\} {}\\ & =& \mathrm{P}\{Y > t\}. {}\\ \end{array}$$
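The coupling in the necessity part is easy to exhibit numerically. The following minimal Python sketch (an assumed example: X ∼ Exp(1) and Y ∼ Pareto Type II with tail index 2, as in Example 19.2.8 (2) below, with the assumed threshold u 0 = 0.95) takes Z = U uniform, \(\psi _{1} = {F}^{-1}\), and \(\psi _{2} = {G}^{-1}\):

```python
import numpy as np

# Assumed example: X ~ Exp(1), Y ~ Pareto II with tail index 2, coupled
# through a common uniform Z = U with psi_1 = F^{-1} and psi_2 = G^{-1}.
rng = np.random.default_rng(1)
U = rng.uniform(size=100_000)

psi1 = -np.log(1.0 - U)                  # Exp(1) quantile function
psi2 = (1.0 - U) ** (-0.5) - 1.0         # Pareto II(2) quantile function

u0 = 0.95                                # assumed threshold beyond which F^{-1} <= G^{-1}
mask = U >= u0
print(np.all(psi1[mask] <= psi2[mask]))  # coupled tails: psi_1(Z) <= psi_2(Z)
```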

The tail coupling presented in Theorem 19.2.2 enables us to establish desirable closure properties for the stochastic tail order. A Borel measurable function \(\psi : {\mathbb{R}}^{d} \rightarrow \mathbb{R}\) is called a Radon function if ψ is bounded on every compact subset of \({\mathbb{R}}^{d}\). Obviously, any nondecreasing function and any continuous function defined on \({\mathbb{R}}^{d}\) are Radon functions.

Definition 19.2.3.

A Borel measurable function \(\psi : \mathbb{R}_{+}^{d} \rightarrow \mathbb{R}\) is said to be eventually increasing if there exists a compact subset \(S \subset \mathbb{R}_{+}^{d}\) such that ψ is component-wise nondecreasing on S c with \(\lim _{x_{i}\rightarrow \infty }\psi (x_{1},\ldots,x_{i},\ldots,x_{d}) = \infty \).

Proposition 19.2.4

Let X and Y be two positive random variables with support [0,∞).

  1. 1.

    X ≤ sto Y implies g(X) ≤ sto g(Y ) for any Radon function g that is eventually increasing.

  2. 2.

    If X 1 ,X 2 are independent, and X 1 ′,X 2 ′ are independent, then \(X_{1} \leq _{\mathrm{sto}}X_{1}^{\prime}\) and \(X_{2} \leq _{\mathrm{sto}}X_{2}^{\prime}\) imply \(g(X_{1},X_{2}) \leq _{\mathrm{sto}}g(X_{1}^{\prime},X_{2}^{\prime})\) for any Radon function g that is eventually increasing.

Proof

  1. (1)

Since g is a Radon function that is eventually increasing, there exists a threshold x 0 > 0 such that g( ⋅) is increasing to ∞ on [x 0, ∞). By Theorem 19.2.2, there exists a random variable Z defined on the probability space \((\Omega,\mathcal{F},\mathrm{P})\) and nondecreasing functions ψ 1 and ψ 2 with \(\lim _{z\rightarrow b}\psi _{i}(z) = \infty \), i = 1, 2, such that \(X{ d \atop =} \psi _{1}(Z)\) and \(Y{ d \atop =} \psi _{2}(Z)\) and

    $$\displaystyle{\mathrm{P}\{\psi _{1}(Z) \leq \psi _{2}(Z)\mid Z \geq z_{0}\}=1\quad \mbox{ for some $z_{0}>0$ with $\psi _{1}(z_{0})>x_{0}$.}}$$

    Thus,

    $$\displaystyle{\mathrm{P}\{g(\psi _{1}(Z)) \leq g(\psi _{2}(Z))\mid Z \geq z_{0}\} = 1\quad \mbox{ for $z_{0} > 0$.}}$$

    Clearly, \(g(X){ d \atop =} \,\, g(\psi _{1}(Z))\) and \(g(Y ){ d \atop =}\,\, g(\psi _{2}(Z))\), and thus,

    $$\displaystyle\begin{array}{rcl} \mathrm{P}\{g(X) > t\}& =& \mathrm{P}\{Z \geq z_{0}\}\mathrm{P}\{g(\psi _{1}(Z)) > t\mid Z \geq z_{0}\} {}\\ & \leq &\mathrm{P}\{Z \geq z_{0}\}\mathrm{P}\{g(\psi _{2}(Z)) > t\mid Z \geq z_{0}\} {}\\ & \leq &\mathrm{P}\{g(\psi _{2}(Z)) > t\} =\mathrm{ P}\{g(Y ) > t\} {}\\ \end{array}$$

for any t ≥ c, where c is a constant with \(c > g(\psi _{1}(z_{0}))\).

  2. (2)

Without loss of generality, assume that (X 1, X 2) and (X 1 ′, X 2 ′) are independent. We only need to show that \(g(X_{1},X_{2}) \leq _{\mathrm{sto}}g(X_{1}^{\prime},X_{2})\). Since g( ⋅) is a Radon function that is eventually increasing, there exists (x 1, x 2) such that g( ⋅) is bounded on \([0,x_{1}] \times [0,x_{2}]\) and increasing on \({([0,x_{1}] \times [0,x_{2}])}^{c}\).

  1. 1.

By Theorem 19.2.2, there exists a random variable Z 1 defined on the probability space \((\Omega _{1},\mathcal{F}_{1},\mathrm{P}_{1})\) and nondecreasing functions ψ 1 and ψ 1 ′ such that \(X_{1}{ d \atop =} \psi _{1}(Z_{1})\) and \(X_{1}^{\prime}{ d \atop =} \psi _{1}^{\prime}(Z_{1})\) and \(\mathrm{P}_{1}\{\psi _{1}(Z_{1}) \leq \psi _{1}^{\prime}(Z_{1})\mid Z_{1} \geq z_{1}\} = 1\) for some z 1 > 0 with \(\psi _{1}(z_{1}) > x_{1}\).

  2. 2.

    Let \((\Omega _{2},\mathcal{F}_{2},\mathrm{P}_{2})\) denote the underlying probability space of X 2.

Construct a product probability space \((\Omega,\mathcal{F},\mathrm{P}) = (\Omega _{1} \times \Omega _{2},\mathcal{F},\mathrm{P}_{1} \times \mathrm{ P}_{2})\), where \(\mathcal{F}\) is the σ-field generated by \(\mathcal{F}_{1} \times \mathcal{F}_{2}\). On this enlarged product probability space, since \(\mathrm{P}\{\psi _{1}(Z_{1}) \leq \psi _{1}^{\prime}(Z_{1})\mid Z_{1} \geq z_{1}\} = 1\), and g( ⋅) is increasing on \({([0,x_{1}] \times [0,x_{2}])}^{c}\), we have that \(\mathrm{P}\{g(\psi _{1}(Z_{1}),X_{2}) \leq g(\psi _{1}^{\prime}(Z_{1}),X_{2})\mid Z_{1} \geq z_{1}\ \mbox{ or}\ X_{2} > x_{2}\} = 1\). Clearly, \(g(X_{1},X_{2}){ d \atop =} \,\,g(\psi _{1}(Z_{1}),X_{2})\) and \(g(X_{1}^{\prime},X_{2}){ d \atop =} \,\,g(\psi _{1}^{\prime}(Z_{1}),X_{2})\), and thus

$$\displaystyle\begin{array}{rcl} & & \mathrm{P}\{g(X_{1},X_{2}) > t\} {}\\ & =& \mathrm{P}\{Z_{1}\geq z_{1}\ \mbox{ or}\ X_{2}>x_{2}\}\mathrm{P}\{g(\psi _{1}(Z_{1}),X_{2}) > t\mid Z_{1} \geq z_{1}\ \mbox{ or}\ X_{2} > x_{2}\} {}\\ & \leq & \mathrm{P}\{Z_{1}\geq z_{1}\ \mbox{ or}\ X_{2}>x_{2}\}\mathrm{P}\{g(\psi _{1}^{\prime}(Z_{1}),X_{2}) > t\mid Z_{1} \geq z_{1}\ \mbox{ or}\ X_{2} > x_{2}\} {}\\ & \leq & \mathrm{P}\{g(\psi _{1}^{\prime}(Z_{1}),X_{2}) > t\} {}\\ & =& \mathrm{P}\{g(X_{1}^{\prime},X_{2}) > t\} {}\\ \end{array}$$

for any t ≥ c, where c is a constant with \(c > g(\psi _{1}(z_{1}),x_{2})\). That is, \(g(X_{1},X_{2}) \leq _{\mathrm{sto}}g(X_{1}^{\prime},X_{2})\). Similarly, \(g(X_{1}^{\prime},X_{2}) \leq _{\mathrm{sto}}g(X_{1}^{\prime},X_{2}^{\prime})\), and the assertion follows from the transitivity of ≤ sto.

Corollary 19.2.5

If X 1 ,X 2 are independent and Y 1 ,Y 2 are independent, then \(X_{1} \leq _{\mathrm{sto}}Y _{1}\) and \(X_{2} \leq _{\mathrm{sto}}Y _{2}\) imply

$$\displaystyle{X_{1}X_{2} \leq _{\mathrm{sto}}Y _{1}Y _{2},\qquad \qquad X_{1} + X_{2} \leq _{\mathrm{sto}}Y _{1} + Y _{2},}$$
$$\displaystyle{X_{1} \vee X_{2} \leq _{\mathrm{sto}}Y _{1} \vee Y _{2},\qquad \qquad X_{1} \wedge X_{2} \leq _{\mathrm{sto}}Y _{1} \wedge Y _{2}.}$$

In particular, R 1 ≤ sto R 2 implies that \(R_{1}V \leq _{\mathrm{sto}}R_{2}V\) for any nonnegative random variable V that is independent of R 1, R 2. Mainik and Rüschendorf obtained this inequality in [300] for bounded V using the ordering ≤ apl, and their proof is based on the method of mixture.
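A Monte Carlo sketch of these closure properties follows (assumed choices: X 1, X 2 with the Fréchet df with tail index 3 and Y 1, Y 2 with the Pareto Type II df with tail index 2, all independent, so that X i ≤ sto Y i by Example 19.2.8 (3) below):

```python
import numpy as np

# Assumed example: X1, X2 ~ Frechet(3) and Y1, Y2 ~ Pareto II(2), all
# independent, so X_i <=_sto Y_i; compare the tails of sums and maxima.
rng = np.random.default_rng(2)
n = 200_000
X1, X2 = (-np.log(rng.uniform(size=(2, n)))) ** (-1.0 / 3.0)  # Frechet(3)
Y1, Y2 = rng.uniform(size=(2, n)) ** (-1.0 / 2.0) - 1.0       # Pareto II(2)

for t in [5.0, 10.0, 20.0]:
    print(t,
          np.mean(X1 + X2 > t), np.mean(Y1 + Y2 > t),                        # sums
          np.mean(np.maximum(X1, X2) > t), np.mean(np.maximum(Y1, Y2) > t))  # maxima
```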

Proposition 19.2.6

Let X and Y be two positive random variables with support [0,∞) and Θ be a random variable with bounded support [θ L , θ U ]. Assume that one of the following holds:

  1. 1.

Θ is a discrete random variable with finitely many mass points.

  2. 2.

    Θ is a continuous random variable such that \(\mathrm{P}\{X > t\mid \Theta =\theta \}\) and \(\mathrm{P}\{Y > t\mid \Theta =\theta \}\) are continuous in θ.

If \([X\mid \Theta =\theta ] \leq _{\mathrm{sto}}[Y \mid \Theta =\theta ]\) for all θ in the support of Θ, then X ≤ sto Y.

Proof: Since \([X\mid \Theta =\theta ] \leq _{\mathrm{sto}}[Y \mid \Theta =\theta ]\) for all \(\theta \in [\theta _{L},\theta _{U}]\), there exists a threshold t θ that is given by

$$\displaystyle{ t_{\theta } :=\sup \big\{ s :\mathrm{ P}\{X > s\mid \Theta =\theta \}>\mathrm{ P}\{Y > s\mid \Theta =\theta \}\big\}, }$$
(19.2.3)

such that

$$\displaystyle{\mathrm{P}\{X > t\mid \Theta =\theta \}\leq \mathrm{ P}\{Y > t\mid \Theta =\theta \},\ \ \forall \ t > t_{\theta }.}$$

Notice that the threshold t θ depends on the mixing value θ. Consider the following two cases. Construct the threshold \(t_{[\theta _{L},\theta _{U}]}\) as follows:

  1. 1.

    If Θ is discrete with finite masses, then define

$$\displaystyle{t_{[\theta _{L},\theta _{U}]} :=\max \{ t_{\theta } :\theta \ \mbox{ in the support of}\ \Theta \} < \infty.}$$
  2. 2.

    If Θ is continuous, then t θ is continuous in θ due to the assumption that \(\mathrm{P}\{X > t\mid \Theta =\theta \}\) and \(\mathrm{P}\{Y > t\mid \Theta =\theta \}\) are continuous in θ. Define

    $$\displaystyle{t_{[\theta _{L},\theta _{U}]} :=\sup \{ t_{\theta } :\theta \in [\theta _{L},\theta _{U}]\},}$$

which is finite because of the continuity of t θ and the compactness of [θ L , θ U ].

In any case, for any \(\theta \in [\theta _{L},\theta _{U}]\), any \(t > t_{[\theta _{L},\theta _{U}]}\),

$$\displaystyle{\mathrm{P}\{X > t\mid \Theta =\theta \}\leq \mathrm{ P}\{Y > t\mid \Theta =\theta \}.}$$

Integrating both sides with respect to the distribution of Θ over [θ L , θ U ], we obtain \(\mathrm{P}\{X > t\} \leq \mathrm{ P}\{Y > t\}\) for any \(t > t_{[\theta _{L},\theta _{U}]}\).

Remark 19.2.7.

The closure property under mixture when the mixing variable has unbounded support, say [0, ∞), becomes more subtle. This is because the threshold t θ defined in Eq. (19.2.3) can approach infinity as θ goes to infinity. Our conjecture is that in the case of unbounded support, \([X\mid \Theta =\theta ] \leq _{\mathrm{sto}}[Y \mid \Theta =\theta ]\) for all θ in the support of Θ implies that \(\limsup _{t\rightarrow \infty }\frac{\mathrm{P}\{X>t\}} {\mathrm{P}\{Y >t\}} \leq 1\); that is, X ≤ apl Y.

In the examples to be discussed below, all involved random variables fail to satisfy the usual stochastic order.

Example 19.2.8.

 

  1. 1.

Let X have the Weibull distribution with unit scale parameter and shape parameter k and Y have the exponential distribution with unit (scale) parameter. If the shape parameter k > 1 (i.e., increasing hazard rate), then X ≤ sto Y. If the shape parameter k < 1 (i.e., decreasing hazard rate), then X ≥ sto Y. Note that both X and Y have exponentially decaying right tails (a numerical check follows this example).

  2. 2.

    Let X have the exponential distribution with unit (scale) parameter and Y have the distribution of Pareto Type II with tail index α = 2; that is,

    $$\displaystyle{ \mathrm{P}\{Y > t\} = {(1 + t)}^{-2},\ t \geq 0. }$$
    (19.2.4)

Then X ≤ sto Y. Note that Y has a regularly varying right tail as described in Eq. (19.1.6).

  3. 3.

If X has the Fréchet distribution with tail index α = 3 (see Theorem 19.1.1) and Y has the distribution Eq. (19.2.4) of Pareto Type II with tail index α = 2, then X ≤ sto Y. Note that X and Y are regularly varying with respective tail indexes 3 and 2, but Y has a heavier tail than that of X.

  4. 4.

Let X have the survival function of Pareto Type I defined as follows:

    $$\displaystyle{\mathrm{P}\{X > t\} ={ \left ( \frac{t} {0.5}\right )}^{-1},\ t \geq 0.5.}$$

    Let Y have the survival function of Pareto Type II with tail index 1; that is,

    $$\displaystyle{\mathrm{P}\{Y > t\} = {(1 + t)}^{-1},\ t \geq 0.}$$

Then X ≤ sto Y. Note that both X and Y have regularly varying right tails with the same tail index 1.

  5. 5.

Let R 1 and R 2 have regularly varying distributions with tail indexes α 1 and α 2, respectively. If α 1 > α 2, then R 1 ≤ sto R 2. That is, the random variable with the heavier regularly varying right tail is stochastically larger in the tail.

  6. 6.

Let R be regularly varying with tail index α, and let V 1 and V 2 be random variables with finite moments of any order, independent of R, such that \(\mathrm{E}[V _{1}^{\alpha }] <\mathrm{ E}[V _{2}^{\alpha }]\). By Breiman’s theorem (see [387], p. 232),

$$\displaystyle{\lim _{t\rightarrow \infty }\frac{\mathrm{P}\{RV _{1} > t\}} {\mathrm{P}\{R > t\}} =\mathrm{ E}[V _{1}^{\alpha }] <\mathrm{ E}[V _{ 2}^{\alpha }] =\lim _{ t\rightarrow \infty }\frac{\mathrm{P}\{RV _{2} > t\}} {\mathrm{P}\{R > t\}}.}$$

    Thus, for t > t 0 where t 0 is sufficiently large, \(\mathrm{P}\{RV _{1} > t\} <\mathrm{ P}\{RV _{2} > t\}\), implying that \(RV _{1} \leq _{\mathrm{sto}}RV _{2}\).
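The thresholds t 0 in items (1) and (2) can be located numerically; a minimal Python sketch (assumed parametrizations: Weibull shape k = 2 and Pareto tail index α = 2):

```python
import numpy as np

# Numerical check of Example 19.2.8 (1)-(2): find where the tail
# inequality (19.2.1) starts to hold (assumed parameters: k = 2, alpha = 2).
t = np.linspace(0.01, 20.0, 4000)

exp_tail = np.exp(-t)                # Exp(1):      P{X > t} = e^(-t)
weib_tail = np.exp(-t ** 2.0)        # Weibull k=2: P{X > t} = e^(-t^2)
pareto_tail = (1.0 + t) ** (-2.0)    # Pareto II:   P{Y > t} = (1 + t)^(-2)

print(t[np.argmax(weib_tail <= exp_tail)])    # ~1.0: Weibull <=_sto Exp from here
print(t[np.argmax(exp_tail <= pareto_tail)])  # ~2.6: Exp <=_sto Pareto II from here
```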

A multivariate extension of scale mixtures discussed in Example 19.2.8 (6) includes the multivariate elliptical distribution. A random vector \(\boldsymbol{X} \in {\mathbb{R}}^{d}\) is called elliptically distributed if \(\boldsymbol{X}\) has the representation:

$$\displaystyle{ \boldsymbol{X}{ d \atop =} \boldsymbol{\mu } + RA\boldsymbol{U} }$$
(19.2.5)

where \(\boldsymbol{\mu }\in {\mathbb{R}}^{d}\), \(A \in {\mathbb{R}}^{d\times d}\), \(\boldsymbol{U}\) is uniformly distributed on \(\mathbb{S}_{2}^{d-1} :=\{\boldsymbol{ x} \in {\mathbb{R}}^{d} : \vert \vert \boldsymbol{x}\vert \vert _{2} = 1\}\), and R ≥ 0 is independent of \(\boldsymbol{U}\). We denote this by \(\boldsymbol{X} \sim \mathcal{E}(\boldsymbol{\mu },\Sigma,R)\), where \(\Sigma = A{A}^{\top }\).
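The representation Eq. (19.2.5) also gives a direct sampling recipe. A minimal Python sketch (the dispersion matrix Σ, the regularly varying radius R, and the sample size are assumed choices):

```python
import numpy as np

# Sampling sketch for X = mu + R * A * U in Eq. (19.2.5), with U uniform
# on the Euclidean unit sphere and R >= 0 independent of U.
rng = np.random.default_rng(3)

def sample_elliptical(mu, A, R):
    Z = rng.normal(size=(len(R), len(mu)))
    U = Z / np.linalg.norm(Z, axis=1, keepdims=True)  # uniform on the sphere
    return mu + R[:, None] * (U @ A.T)

Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
A = np.linalg.cholesky(Sigma)             # Sigma = A A^T
R = rng.pareto(3.0, size=10_000) + 1.0    # a regularly varying radius (index 3)
X = sample_elliptical(np.zeros(2), A, R)
print(np.cov(X.T))                        # approximately proportional to Sigma
```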

Proposition 19.2.9

Let \(\boldsymbol{X}\sim \mathcal{E}(\boldsymbol{\mu }_{1},\Sigma _{1},R_{1})\) and \(\boldsymbol{Y }\sim \mathcal{E}(\boldsymbol{\mu }_{2},\Sigma _{2},R_{2})\) . If

$$\displaystyle{\boldsymbol{\mu }_{1}\leq \boldsymbol{\mu }_{2},\ R_{1}\leq _{\mathrm{sto}}R_{2},\ \mbox{ and}\ {\boldsymbol{\xi }}^{\top }\Sigma _{1}\boldsymbol{\xi } \leq {\boldsymbol{\xi }}^{\top }\Sigma _{2}\boldsymbol{\xi },\ \mbox{ for fixed}\ \boldsymbol{\xi }\in \mathbb{S}_{1}^{d-1}:=\{\boldsymbol{x}\in {\mathbb{R}}^{d} : \vert \vert \boldsymbol{x}\vert \vert _{1}=1\},}$$

then \({\vert \boldsymbol{\xi }^{\top }X}\vert \leq _{\mathrm{sto}}{\vert \boldsymbol{\xi }^{\top }Y }\vert \) .

Proof: Let \(a_{i} := {({\boldsymbol{\xi }}^{\top }\Sigma _{i}\boldsymbol{\xi })}^{1/2}\), i = 1, 2. Without loss of generality, we can assume that \(\boldsymbol{\mu }_{1} =\boldsymbol{\mu } _{2} =\boldsymbol{ 0}\) and a 1 > 0. Let

$$\displaystyle{\boldsymbol{v}_{i} := \frac{A_{i}^{\top }\boldsymbol{\xi }} {{({\boldsymbol{\xi }}^{\top }\Sigma _{i}\boldsymbol{\xi })}^{1/2}} = \frac{A_{i}^{\top }\boldsymbol{\xi }} {a_{i}},\ i = 1,2.}$$

Clearly \(\boldsymbol{v_{i}^{\top }v_{i}} = 1\), i = 1, 2, and thus, by symmetry, \(\boldsymbol{v_{1}^{\top }U}\) and \(\boldsymbol{v_{2}^{\top }U}\) have the same distribution. Let \(\Theta :=\boldsymbol{ v_{1}^{\top }U}\), and we have

$$\displaystyle{{\vert \boldsymbol{\xi }^{\top }X}\vert = R_{ 1}a_{1}\vert \boldsymbol{v_{1}^{\top }U}\vert { d \atop =} R_{1}a_{1}\vert \Theta \vert,\ {\vert \boldsymbol{\xi }^{\top }Y }\vert = R_{ 2}a_{2}\vert \boldsymbol{v_{2}^{\top }U}\vert { d \atop =} R_{2}a_{2}\vert \Theta \vert.}$$

The inequality then follows from Corollary 19.2.5 immediately.

This is our ≤ sto-version of a similar result that is obtained in [300] using the ≤ apl order.

Remark 19.2.10.

Anderson [12] and Fefferman, Jodeit, and Perlman [159] show that if \(\boldsymbol{\mu }_{1} =\boldsymbol{\mu } _{2}\), \(R_{1}{ d \atop =} R_{2}\), and

$$\displaystyle{{\boldsymbol{\xi }}^{\top }\Sigma _{1}\boldsymbol{\xi } \leq {\boldsymbol{\xi }}^{\top }\Sigma _{2}\boldsymbol{\xi },\ \forall \ \boldsymbol{\xi }\in {\mathbb{R}}^{d},}$$

then \(\mathrm{E}(\psi (\boldsymbol{X})) \leq \mathrm{ E}(\psi (\boldsymbol{Y }))\) for all symmetric and convex functions \(\psi : {\mathbb{R}}^{d}\mapsto \mathbb{R}\), such that the expectations exist. This is known as dilatation, which can be defined on any locally convex topological linear space \(\mathbb{V}\) (traced back to Karamata’s work in 1932; see [308], pp. 16–17). Dilatation provides various versions of continuous majorization [308].

3 Tail Orthant Orders

Let \(\boldsymbol{X} = (X_{1},\ldots,X_{d})\) and \(\boldsymbol{X}^{\prime} = (X^{\prime}_{1},\ldots,X^{\prime}_{d})\) be nonnegative random vectors with dfs F and F ′, respectively. Observe that \(\boldsymbol{X} \leq _{\mathrm{lo}}\boldsymbol{X}^{\prime}\) is equivalent to \(\max \limits _{1\leq i\leq d}\{X_{i}/w_{i}\} \geq _{\mathrm{st}}\max \limits _{1\leq i\leq d}\{X^{\prime}_{i}/w_{i}\}\) for all weight vectors \(\boldsymbol{w}\), and \(\boldsymbol{X} \leq _{\mathrm{uo}}\boldsymbol{X}^{\prime}\) is equivalent to \(\min \limits _{1\leq i\leq d}\{X_{i}/w_{i}\} \leq _{\mathrm{st}}\min \limits _{1\leq i\leq d}\{X^{\prime}_{i}/w_{i}\}\) for all weight vectors \(\boldsymbol{w}\). In comparing the orthant tails of these random vectors, we focus on the cones \(\mathbb{R}_{+}^{d}\) and \(\overline{\mathbb{R}}_{+}^{d} :=\{ (x_{1},\ldots,x_{d}) : x_{i} > 0,1 \leq i \leq d\}\). Note that \(\boldsymbol{x} \in \overline{\mathbb{R}}_{+}^{d}\) can have some components taking the value + ∞.

Definition 19.3.1.

  1. 1.

    \(\boldsymbol{X}\) is said to be smaller than \(\boldsymbol{X}^{\prime}\) in the sense of tail lower orthant order, denoted as \(\boldsymbol{X} \leq _{\mathrm{tlo}}\boldsymbol{X}^{\prime}\), if for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \overline{\mathbb{R}}_{+}^{d}\), \(\max \limits _{1\leq i\leq d}\{X_{i}/w_{i}\} \geq _{\mathrm{sto}}\max \limits _{1\leq i\leq d}\{X^{\prime}_{i}/w_{i}\}\).

  2. 2.

    \(\boldsymbol{X}\) is said to be smaller than \(\boldsymbol{X}^{\prime}\) in the sense of tail upper orthant order, denoted as \(\boldsymbol{X} \leq _{\mathrm{tuo}}\boldsymbol{X}^{\prime}\), if for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \mathbb{R}_{+}^{d}\), \(\min \limits _{1\leq i\leq d}\{X_{i}/w_{i}\} \leq _{\mathrm{sto}}\min \limits _{1\leq i\leq d}\{X^{\prime}_{i}/w_{i}\}\).

It follows from Eq. (19.2.1) that \(\boldsymbol{X} \leq _{\mathrm{tlo}}\boldsymbol{X}^{\prime}\) is equivalent to the statement that, for every \(\boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \overline{\mathbb{R}}_{+}^{d}\),

$$\displaystyle{ \mathrm{P}\{X_{1} \leq tw_{1},\ldots,X_{d} \leq tw_{d}\} \leq \mathrm{ P}\{X^{\prime}_{1} \leq tw_{1},\ldots,X^{\prime}_{d} \leq tw_{d}\} }$$
(19.3.1)

for all \(t > t_{\boldsymbol{w}}\), for some \(t_{\boldsymbol{w}} > 0\) that may depend on \(\boldsymbol{w}\). Similarly, \(\boldsymbol{X} \leq _{\mathrm{tuo}}\boldsymbol{X}^{\prime}\) is equivalent to the statement that, for every \(\boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \mathbb{R}_{+}^{d}\),

$$\displaystyle{ \mathrm{P}\{X_{1} > tw_{1},\ldots,X_{d} > tw_{d}\} \leq \mathrm{ P}\{X^{\prime}_{1} > tw_{1},\ldots,X^{\prime}_{d} > tw_{d}\} }$$
(19.3.2)

for all \(t > t_{\boldsymbol{w}}\), for some \(t_{\boldsymbol{w}} > 0\) that may depend on \(\boldsymbol{w}\).
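To see the comparison (19.3.2) in a concrete case, consider the following minimal Python sketch (assumed example: identical Fréchet margins with tail index 2, with \(\boldsymbol{X}\) having independent components and \(\boldsymbol{X}^{\prime}\) being comonotone); it compares joint exceedance probabilities:

```python
import numpy as np

# Assumed example: identical Frechet(2) margins; X has independent
# components while X' is comonotone, so X <=_tuo X' should show up in
# the joint exceedance probabilities of Eq. (19.3.2).
rng = np.random.default_rng(4)
n = 500_000
U = rng.uniform(size=(n, 2))

def F_inv(u):
    return (-np.log(u)) ** (-0.5)            # Frechet(2) quantile function

X = F_inv(U)                                  # independence copula
Xp = np.stack([F_inv(U[:, 0])] * 2, axis=1)   # comonotone copy

w = np.array([1.0, 2.0])
for t in [2.0, 5.0, 10.0]:
    print(t, np.mean((X > t * w).all(axis=1)), np.mean((Xp > t * w).all(axis=1)))
```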

In comparing tail dependence, however, we assume that all the margins of F and F ′ are tail equivalent. Since we need to compare upper interior orthant tails given fixed marginal tails, consider the two smaller cones:

  1. 1.

\(\mathbb{C}_{l} := \overline{\mathbb{R}}_{+}^{d}\setminus \cup _{j=1}^{d}\{t\boldsymbol{e}_{j}^{-1},t \geq 0\}\), where \(\boldsymbol{e}_{j}^{-1}\), 1 ≤ j ≤ d, denotes the vector with the j-th component being 1 and infinity otherwise.

  2. 2.

\(\mathbb{C}_{u} := \mathbb{R}_{+}^{d}\setminus \cup _{j=1}^{d}\{t\boldsymbol{e}_{j},t \geq 0\}\), where \(\boldsymbol{e}_{j}\), 1 ≤ j ≤ d, denotes the vector with the j-th component being 1 and zero otherwise.

That is, \(\mathbb{C}_{l}\) and \(\mathbb{C}_{u}\) are the subsets of \(\overline{\mathbb{R}}_{+}^{d}\) after eliminating all the axes that correspond to the margins of a distribution. Note that \(\boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \mathbb{C}_{l}\) if and only if at least two components of w are finite, and \(\boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \mathbb{C}_{u}\) if and only if at least two components of \(\boldsymbol{w}\) are positive.

Definition 19.3.2.

  1. 1.

    \(\boldsymbol{X}\) is said to be smaller than \(\boldsymbol{X}^{\prime}\) in the sense of tail lower orthant dependence order, denoted as \(\boldsymbol{X} \leq _{\mathrm{tlod}}\boldsymbol{X}^{\prime}\), if all the margins of F and F ′ are tail equivalent, and for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \mathbb{C}_{l}\),

    $$\displaystyle{\max _{1\leq i\leq d}\{X_{i}/w_{i}\} \geq _{\mathrm{sto}}\max _{1\leq i\leq d}\{X^{\prime}_{i}/w_{i}\}.}$$
  2. 2.

    \(\boldsymbol{X}\) is said to be smaller than \(\boldsymbol{X}^{\prime}\) in the sense of tail upper orthant dependence order, denoted as \(\boldsymbol{X} \leq _{\mathrm{tuod}}\boldsymbol{X}^{\prime}\), if all the margins of F and F ′ are tail equivalent, and for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \mathbb{C}_{u}\),

    $$\displaystyle{\min _{1\leq i\leq d}\{X_{i}/w_{i}\} \leq _{\mathrm{sto}}\min _{1\leq i\leq d}\{X^{\prime}_{i}/w_{i}\}.}$$

It follows from Eq. (19.2.1) that if all the margins of F and F ′ are tail equivalent, then \(\boldsymbol{X} \leq _{\mathrm{tlod}}\boldsymbol{X}^{\prime}\) is equivalent to Eq. (19.3.1) holding for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \mathbb{C}_{l}\), and \(\boldsymbol{X} \leq _{\mathrm{tuod}}\boldsymbol{X}^{\prime}\) is equivalent to Eq. (19.3.2) holding for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \mathbb{C}_{u}\).

Remark 19.3.3.

  1. 1.

    If

    $$\displaystyle{(w_{1},\ldots,w_{d})\in \bigcup _{j=1}^{d}\{t\boldsymbol{e}_{ j}^{-1},t\geq 0\}\quad \mbox{ or}\quad (w_{ 1},\ldots,w_{d})\in \bigcup _{j=1}^{d}\{t\boldsymbol{e}_{ j},t\geq 0\},}$$

then the inequalities in Definition 19.3.2 reduce to stochastic tail orders of the marginal distributions. Since the margins are only assumed to be tail equivalent, they need not be comparable in the stochastic tail order, and hence we eliminate the axes from consideration. On the other hand, with fixed marginal tails, what really matters in dependence comparison is the various interior orthant subsets of \(\mathbb{C}_{l}\) or \(\mathbb{C}_{u}\).

  2. 2.

    If some corresponding margins of F and F ′ are not tail equivalent, one can still define the tail orthant orders ≤ tlo and ≤ tuo to compare their tail behaviors in orthants. But all corresponding margins of F and F ′ have to be tail equivalent in order to compare their tail dependence.

  3. 3.

    If some margins of F (or F ′) are not tail equivalent, then one can still define the tail orthant dependence order, but scaling functions would be different among the components.

  4. 4.

    Another alternative is to convert all the margins of F and F ′ to standard Pareto margins, resulting in Pareto copulas [245], and then compare their Pareto copulas using the ≤ tlod and ≤ tuod orders.

The preservation properties under the ≤ tlod and ≤ tuod orders can be easily established using Definitions 19.3.1 and 19.3.2, and Propositions 19.2.4 and 19.2.6. In particular, we have the following.

Proposition 19.3.4

Let \(\boldsymbol{X} = (X_{1},\ldots,X_{d})\), \(\boldsymbol{X}^{\prime} = (X_{1}^{\prime},\ldots,X_{d}^{\prime})\) and \(\boldsymbol{Y } = (Y _{1},\ldots,Y _{d})\), \(\boldsymbol{Y }^{\prime} = (Y _{1}^{\prime},\ldots,Y _{d}^{\prime})\) be positive random vectors with support \(\mathbb{R}_{+}^{d}\) , and let Θ be a random variable with bounded support. Assume that \((\boldsymbol{X,X^{\prime}})\) and \((\boldsymbol{Y,Y ^{\prime}})\) are independent and that the regularity conditions of Proposition  19.2.6 are satisfied. Then:

  1. 1.

    \(\boldsymbol{X} \leq _{\mathrm{tlo}}\boldsymbol{X}^{\prime}\) and \(\boldsymbol{Y } \leq _{\mathrm{tlo}}\boldsymbol{Y }^{\prime}\) imply that \(\boldsymbol{X \vee Y } \leq _{\mathrm{tlo}}\boldsymbol{X^{\prime} \vee Y ^{\prime}}\) . \(\boldsymbol{X} \leq _{\mathrm{tlod}}\boldsymbol{X}^{\prime}\) and \(\boldsymbol{Y } \leq _{\mathrm{tlod}}\boldsymbol{Y }^{\prime}\) imply that \(\boldsymbol{X \vee Y } \leq _{\mathrm{tlod}}\boldsymbol{X^{\prime} \vee Y ^{\prime}}\) .

  2. 2.

    \(\boldsymbol{X} \leq _{\mathrm{tuo}}\boldsymbol{X}^{\prime}\) and \(\boldsymbol{Y } \leq _{\mathrm{tuo}}\boldsymbol{Y }^{\prime}\) imply that \(\boldsymbol{X \wedge Y } \leq _{\mathrm{tuo}}\boldsymbol{X^{\prime} \wedge Y ^{\prime}}\) . \(\boldsymbol{X} \leq _{\mathrm{tuod}}\boldsymbol{X}^{\prime}\) and \(\boldsymbol{Y } \leq _{\mathrm{tuod}}\boldsymbol{Y }^{\prime}\) imply that \(\boldsymbol{X \wedge Y } \leq _{\mathrm{tuod}}\boldsymbol{X^{\prime} \wedge Y ^{\prime}}\).

  3. 3.

    If \([\boldsymbol{X}\mid \Theta =\theta ] \leq _{\mathrm{tlo}}[\boldsymbol{X}^{\prime}\mid \Theta =\theta ]\) for all θ in the bounded support of Θ, then \(\boldsymbol{X} \leq _{\mathrm{tlo}}\boldsymbol{X}^{\prime}\) . If \([\boldsymbol{X}\mid \Theta =\theta ] \leq _{\mathrm{tlod}}[\boldsymbol{X}^{\prime}\mid \Theta =\theta ]\) for all θ in the bounded support of Θ, then \(\boldsymbol{X} \leq _{\mathrm{tlod}}\boldsymbol{X}^{\prime}\) .

  4. 4.

    If \([\boldsymbol{X}\mid \Theta =\theta ] \leq _{\mathrm{tuo}}[\boldsymbol{X}^{\prime}\mid \Theta =\theta ]\) for all θ in the bounded support of Θ, then \(\boldsymbol{X} \leq _{\mathrm{tuo}}\boldsymbol{X}^{\prime}\) . If \([\boldsymbol{X}\mid \Theta =\theta ] \leq _{\mathrm{tuod}}[\boldsymbol{X}^{\prime}\mid \Theta =\theta ]\) for all θ in the bounded support of Θ, then \(\boldsymbol{X} \leq _{\mathrm{tuod}}\boldsymbol{X}^{\prime}\).

Example 19.3.5.

Let \(\boldsymbol{X} \sim \mathcal{E}(\boldsymbol{0},\Sigma _{1},R_{1})\) and \(\boldsymbol{X}^{\prime} \sim \mathcal{E}(\boldsymbol{0},\Sigma _{2},R_{2})\) [see Eq. (19.2.5)], where \(\Sigma _{1} = A_{1}A_{1}^{\top } = (\sigma _{ij})\) and \(\Sigma _{2} = A_{2}A_{2}^{\top } = (\lambda _{ij})\). Consider \(\boldsymbol{X}_{+} =\boldsymbol{ X \vee 0}\) and \(\boldsymbol{X}^{\prime}_{+} =\boldsymbol{ X^{\prime} \vee 0}\):

  1. 1.

    Suppose that

    $$\displaystyle{R_{1} \leq _{\mathrm{sto}}R_{2},\ \Sigma _{1} \leq \Sigma _{2}\ \mbox{ component-wise with}\ \sigma _{ii} =\lambda _{ii},i = 1,\ldots,d.}$$

    It follows from Example 9.A.8 in [426] that \(\boldsymbol{X}_{+} \leq _{\mathrm{uo}}R_{1}(A_{2}\boldsymbol{U \vee 0})\), which implies that \(\boldsymbol{X}_{+} \leq _{\mathrm{tuo}}R_{1}(A_{2}\boldsymbol{U \vee 0})\). Clearly,

$$\displaystyle{(\underbrace{R_{1},\ldots,R_{1}}_{d}) \leq _{\mathrm{tuo}}(\underbrace{R_{2},\ldots,R_{2}}_{d}),}$$

    which, together with Proposition 19.3.4 (4) and the fact that \(A_{2}\boldsymbol{U}\) has a bounded support, imply that \(\boldsymbol{X}_{+} \leq _{\mathrm{tuo}}R_{1}(A_{2}\boldsymbol{U \vee 0}) \leq _{\mathrm{tuo}}R_{2}(A_{2}\boldsymbol{U \vee 0})\). Thus \(\boldsymbol{X}_{+} \leq _{\mathrm{tuo}}\boldsymbol{X}^{\prime}_{+}\).

  2. 2.

    Suppose that

    $$\displaystyle{R_{1} \geq _{\mathrm{sto}}R_{2},\ \Sigma _{1} \leq \Sigma _{2}\ \mbox{ component-wise with}\ \sigma _{ii} =\lambda _{ii},i = 1,\ldots,d.}$$

    It follows from Example 9.A.8 in [426] that \(\boldsymbol{X}_{+} \leq _{\mathrm{lo}}R_{1}(A_{2}\boldsymbol{U \vee 0})\), which implies that \(\boldsymbol{X}_{+} \leq _{\mathrm{tlo}}R_{1}(A_{2}\boldsymbol{U \vee 0})\). Clearly,

$$\displaystyle{(\underbrace{R_{1},\ldots,R_{1}}_{d}) \leq _{\mathrm{tlo}}(\underbrace{R_{2},\ldots,R_{2}}_{d}),}$$

    which, together with Proposition 19.3.4 (3) and the fact that \(A_{2}\boldsymbol{U}\) has a bounded support, imply that \(\boldsymbol{X}_{+} \leq _{\mathrm{tlo}}R_{1}(A_{2}\boldsymbol{U \vee 0}) \leq _{\mathrm{tlo}}R_{2}(A_{2}\boldsymbol{U \vee 0})\). Thus \(\boldsymbol{X}_{+} \leq _{\mathrm{tlo}}\boldsymbol{X}^{\prime}_{+}\).

To construct a wide class of examples involving the ≤ tlod and ≤ tuod orders, we employ the copula approach. A copula C is a multivariate distribution with standard uniformly distributed margins on [0, 1]. Sklar’s theorem (see, e.g., [211], Sect. 1.6) states that every multivariate distribution F with margins \(F_{1},\ldots,F_{d}\) can be written as \(F(x_{1},\ldots,x_{d}) = C(F_{1}(x_{1}),\ldots,F_{d}(x_{d}))\) for some d-dimensional copula C. In fact, in the case of continuous margins, C is unique and

$$\displaystyle{C(u_{1},\ldots,u_{d}) = F(F_{1}^{-1}(u_{ 1}),\ldots,F_{d}^{-1}(u_{ d}))}$$

where \(F_{i}^{-1}(u_{i})\) are the quantile functions of the i-th margin, 1 ≤ i ≤ d. Let \((U_{1},\ldots,U_{d})\) denote a random vector with df C, so that each U i , 1 ≤ i ≤ d, is uniformly distributed on [0, 1]. The survival copula \(\widehat{C}\) is defined as follows:

$$\displaystyle{ \widehat{C}(u_{1},\ldots,u_{d}) =\mathrm{ P}\{1 - U_{1} \leq u_{1},\ldots,1 - U_{d} \leq u_{d}\} = \overline{C}(1 - u_{1},\ldots,1 - u_{d}) }$$
(19.3.3)

where \(\overline{C}\) is the joint survival function of C. The upper exponent and upper tail dependence functions (see [207, 213, 244, 360]) are defined as follows,

$$\displaystyle\begin{array}{rcl} a(\boldsymbol{w};C)& :=& \lim _{u\rightarrow {0}^{+}} \frac{\mathrm{P}\{ \cup _{i=1}^{d}\{U_{i} > 1 - uw_{i}\}\}} {u}, \\ & & \forall \ \boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \mathbb{R}_{+}^{d}\setminus \{0\}{}\end{array}$$
(19.3.4)
$$\displaystyle\begin{array}{rcl} b(\boldsymbol{w};C)& :=& \lim _{u\rightarrow {0}^{+}} \frac{\mathrm{P}\{ \cap _{i=1}^{d}\{U_{i} > 1 - uw_{i}\}\}} {u}, \\ & & \forall \ \boldsymbol{w} = (w_{1},\ldots,w_{d}) \in \mathbb{R}_{+}^{d}\setminus \{0\}{}\end{array}$$
(19.3.5)

provided that the limits exist. Note that both \(a(\boldsymbol{w};C)\) and \(b(\boldsymbol{w};C)\) are homogeneous of order 1 in the sense that \(a(c\boldsymbol{w};C) = ca(\boldsymbol{w};C)\) and \(b(c\boldsymbol{w};C) = cb(\boldsymbol{w};C)\) for any c > 0.
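Both functions can be estimated empirically by replacing the limits in Eqs. (19.3.4) and (19.3.5) with a small fixed u. A minimal Python sketch (the two assumed boundary cases are independence, where a(w) = w 1 + w 2 and b(w) = 0, and comonotonicity, where a(w) = w 1 ∨ w 2 and b(w) = w 1 ∧ w 2):

```python
import numpy as np

# Empirical versions of (19.3.4)-(19.3.5) at a small u, for d = 2 and the
# assumed boundary cases of independence and comonotonicity.
rng = np.random.default_rng(6)
n, u, w = 10**6, 0.001, np.array([1.0, 2.0])

V = rng.uniform(size=n)
copulas = {"independent": rng.uniform(size=(n, 2)),
           "comonotone": np.stack([V, V], axis=1)}

for name, Umat in copulas.items():
    exceed = Umat > 1.0 - u * w
    print(name,
          np.mean(exceed.any(axis=1)) / u,   # estimates a(w; C)
          np.mean(exceed.all(axis=1)) / u)   # estimates b(w; C)
```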

Theorem 19.3.6

Let \(\boldsymbol{X} = (X_{1},\ldots,X_{d})\) and \(\boldsymbol{X}^{\prime} = (X_{1}^{\prime},\ldots,X_{d}^{\prime})\) be two d-dimensional random vectors with respective copulas C and C′ and their respective continuous margins \(F_{1},\ldots,F_{d}\) and \(F_{1}^{\prime},\ldots,F_{d}^{\prime}\).

  1. 1.

    If C = C′ and \(F_{i} \leq _{\mathrm{sto}}F_{i}^{\prime}\) , 1 ≤ i ≤ d, then \(\boldsymbol{X} \leq _{\mathrm{tuo}}\boldsymbol{X}^{\prime}\) .

  2. 2.

    Assume that the upper tail dependence functions b(⋅;C) and b(⋅;C′) exist, and \(\overline{F}_{i} \in \mbox{ RV}_{-\alpha _{i}}\) and \(\overline{F}_{i}^{\prime} \in \mbox{ RV}_{-\alpha _{i}^{\prime}}\), \(i = 1,\ldots,d\) . If F i and F i ′, 1 ≤ i ≤ d, are all tail equivalent, and \(b(\boldsymbol{w};C) < b(\boldsymbol{w};C^{\prime})\) for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d})\) with w i > 0, 1 ≤ i ≤ d, then \(\boldsymbol{X} \leq _{\mathrm{tuod}}\boldsymbol{X}^{\prime}\).

  3. 3.

    Assume that the upper tail dependence functions b(⋅;C) and b(⋅;C′) exist, and \(\overline{F}_{i} \in \mbox{ RV}_{-\alpha _{i}}\) and \(\overline{F}_{i}^{\prime} \in \mbox{ RV}_{-\alpha _{i}^{\prime}}\), \(i = 1,\ldots,d\) . If \(F_{i} \leq _{\mathrm{sto}}F_{i}^{\prime}\) , 1 ≤ i ≤ d, and \(b(\boldsymbol{w};C) < b(\boldsymbol{w};C^{\prime})\) for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d})\) with w i > 0, 1 ≤ i ≤ d, then \(\boldsymbol{X} \leq _{\mathrm{tuo}}\boldsymbol{X}^{\prime}\) .

Proof

  1. (1)

Since \(F_{i}(tw_{i}) \geq F_{i}^{\prime}(tw_{i})\), 1 ≤ i ≤ d, for all \(t > t_{w_{i}}\), we have, for all \(t >\max \{ t_{w_{i}} : 1 \leq i \leq d\}\),

    $$\displaystyle\begin{array}{rcl} & & \mathrm{P}\{X_{1} > tw_{1},\ldots,X_{d} > tw_{d}\} {}\\ & =& \mathrm{P}\{F_{1}(X_{1}) > F_{1}(tw_{1}),\ldots,F_{d}(X_{d}) > F_{d}(tw_{d})\} {}\\ & \leq &\mathrm{P}\{F_{1}(X_{1}) > F_{1}^{\prime}(tw_{1}),\ldots,F_{d}(X_{d}) > F_{d}^{\prime}(tw_{d})\} {}\\ & =& \mathrm{P}\{F_{1}^{\prime}(X_{1}^{\prime}) > F_{1}^{\prime}(tw_{1}),\ldots,F_{d}^{\prime}(X_{d}^{\prime}) > F_{d}^{\prime}(tw_{d})\} {}\\ & =& \mathrm{P}\{X_{1}^{\prime} > tw_{1},\ldots,X_{d}^{\prime} > tw_{d}\}. {}\\ \end{array}$$
  2. (2)

Write \(\overline{F}_{i}(t) = L_{i}(t)\,{t}^{-\alpha _{i}}\) and \(\overline{F^{\prime}}_{i}(t) = L_{i}^{\prime}(t)\,{t}^{-\alpha _{i}^{\prime}}\), 1 ≤ i ≤ d. Since all the margins are tail equivalent, we have

    $$\displaystyle{\alpha _{i} =\alpha _{i}^{\prime} =:\alpha,\ \mbox{ and}\ \lim _{t\rightarrow \infty }\frac{L_{i}(t)} {L_{1}(t)} =\lim _{t\rightarrow \infty }\frac{L_{i}^{\prime}(t)} {L_{1}(t)} = 1,\ 1 \leq i \leq d.}$$

In addition, since the functions L i ( ⋅) and L i ′( ⋅) are slowly varying, for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d})\) with w i > 0, 1 ≤ i ≤ d,

    $$\displaystyle{\lim _{t\rightarrow \infty }\frac{L_{i}(tw_{i})} {L_{1}(t)} =\lim _{t\rightarrow \infty }\frac{L_{i}^{\prime}(tw_{i})} {L_{1}(t)} = 1,\ 1 \leq i \leq d.}$$

That is, for any \(\varepsilon > 0\), there exists a sufficiently large \(t_{\boldsymbol{w}}\) such that, for 1 ≤ i ≤ d and all \(t > t_{\boldsymbol{w}}\),

    $$\displaystyle{(1-\varepsilon )L_{1}(t) \leq L_{i}(tw_{i}) \leq (1+\varepsilon )L_{1}(t),}$$
    $$\displaystyle{(1-\varepsilon )L_{1}(t) \leq L_{i}^{\prime}(tw_{i}) \leq (1+\varepsilon )L_{1}(t).}$$

    Observe that

    $$\displaystyle\begin{array}{rcl} & & \mathrm{P}\{X_{1} > tw_{1},\ldots,X_{d} > tw_{d}\} {}\\ & =& \mathrm{P}\big\{F_{i}(X_{i}) > 1 - L_{i}(tw_{i}){t}^{-\alpha }w_{ i}^{-\alpha },1 \leq i \leq d\big\} {}\\ & \leq &\mathrm{P}\big\{F_{i}(X_{i}) > 1 - L_{1}(t){t}^{-\alpha }(1+\varepsilon )w_{ i}^{-\alpha },1 \leq i \leq d\big\}, {}\\ \end{array}$$

    and thus

    $$\displaystyle\begin{array}{rcl} & & \limsup _{t\rightarrow \infty }\frac{\mathrm{P}\{X_{1} > tw_{1},\ldots,X_{d} > tw_{d}\}} {\overline{F}_{1}(t)} {}\\ & \leq &\lim _{t\rightarrow \infty }\frac{\mathrm{P}\big\{F_{i}(X_{i}) > 1 -\overline{F}_{1}(t)(1+\varepsilon )w_{i}^{-\alpha },1 \leq i \leq d\big\}} {\overline{F}_{1}(t)} {}\\ & =& b\big((1+\varepsilon )\boldsymbol{{w}}^{-\alpha };C\big) = (1+\varepsilon )b\big(\boldsymbol{{w}}^{-\alpha };C\big). {}\\ \end{array}$$

    Similarly,

    $$\displaystyle{\liminf _{t\rightarrow \infty }\frac{\mathrm{P}\{X_{1} > tw_{1},\ldots,X_{d} > tw_{d}\}} {\overline{F}_{1}(t)} \geq (1-\varepsilon )b\big(\boldsymbol{{w}}^{-\alpha };C\big).}$$

Letting \(\varepsilon \rightarrow 0\), we have

    $$\displaystyle{\lim _{t\rightarrow \infty }\frac{\mathrm{P}\{X_{1} > tw_{1},\ldots,X_{d} > tw_{d}\}} {\overline{F}_{1}(t)} = b\big(\boldsymbol{{w}}^{-\alpha };C\big).}$$

    For \(\boldsymbol{X}^{\prime}\) with copula C ′, we have

    $$\displaystyle{\lim _{t\rightarrow \infty }\frac{\mathrm{P}\{X^{\prime}_{1} > tw_{1},\ldots,X^{\prime}_{d} > tw_{d}\}} {\overline{F}_{1}(t)} = b\big(\boldsymbol{{w}}^{-\alpha };C^{\prime}\big).}$$

Since \(b\big(\boldsymbol{{w}}^{-\alpha };C\big) < b\big(\boldsymbol{{w}}^{-\alpha };C^{\prime}\big)\) for each \(\boldsymbol{w} = (w_{1},\ldots,w_{d})\) with w i > 0, 1 ≤ i ≤ d, there exists \(t_{\boldsymbol{w}}\) such that, for all \(t > t_{\boldsymbol{w}}\),

    $$\displaystyle{\mathrm{P}\{X_{1} > tw_{1},\ldots,X_{d} > tw_{d}\} \leq \mathrm{ P}\{X^{\prime}_{1} > tw_{1},\ldots,X^{\prime}_{d} > tw_{d}\}.}$$
  3. (3)

    The stochastic tail order follows from (1) and (2).

For the ≤ tlo order, we can establish a similar result.

Theorem 19.3.7

Let \(\boldsymbol{X} = (X_{1},\ldots,X_{d})\) and \(\boldsymbol{X}^{\prime} = (X_{1}^{\prime},\ldots,X_{d}^{\prime})\) be d-dimensional random vectors with respective copulas C and C′ and respective continuous margins \(F_{1},\ldots,F_{d}\) and \(F_{1}^{\prime},\ldots,F_{d}^{\prime}\).

  1. 1.

    If C = C′ and \(F_{i} \geq _{\mathrm{sto}}F_{i}^{\prime}\) , 1 ≤ i ≤ d, then \(\boldsymbol{X} \leq _{\mathrm{tlo}}\boldsymbol{X}^{\prime}\) .

  2. 2.

    Assume that the exponent functions a(⋅;C) and a(⋅;C′) exist, and \(\overline{F}_{i} \in \mbox{ RV}_{-\alpha _{i}}\) and \(\overline{F}_{i}^{\prime} \in \mbox{ RV}_{-\alpha _{i}^{\prime}}\), \(i = 1,\ldots,d\) . If F i and F i ′, 1 ≤ i ≤ d, are all tail equivalent, and \(a(\boldsymbol{w};C) > a(\boldsymbol{w};C^{\prime})\) for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d})\) with w i > 0, 1 ≤ i ≤ d, then \(\boldsymbol{X} \leq _{\mathrm{tlod}}\boldsymbol{X}^{\prime}\).

  3. 3.

Assume that the exponent functions a(⋅;C) and a(⋅;C′) exist, and \(\overline{F}_{i} \in \mbox{ RV}_{-\alpha _{i}}\) and \(\overline{F}_{i}^{\prime} \in \mbox{ RV}_{-\alpha _{i}^{\prime}}\), \(i = 1,\ldots,d\) . If \(F_{i} \geq _{\mathrm{sto}}F_{i}^{\prime}\) , 1 ≤ i ≤ d, and \(a(\boldsymbol{w};C) > a(\boldsymbol{w};C^{\prime})\) for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d})\) with w i > 0, 1 ≤ i ≤ d, then \(\boldsymbol{X} \leq _{\mathrm{tlo}}\boldsymbol{X}^{\prime}\) .

Example 19.3.8.

The tail dependence functions of Archimedean copulas were derived in [34, 84] (also see Propositions 2.5 and 3.3 in [213]). Let \(C_{\mbox{ Arch}}(\boldsymbol{u};\phi ) =\phi (\sum _{i=1}^{d}{\phi }^{-1}(u_{i}))\) be an Archimedean copula with strict generator \({\phi }^{-1}\), where ϕ is regularly varying at ∞ with tail index θ > 0. The upper tail dependence function of the survival copula \(\widehat{C}_{\mbox{ Arch}}\) is given by

$$\displaystyle{b(\boldsymbol{w};\widehat{C}_{\mbox{ Arch}}) =\Bigl (\sum _{j=1}^{d}w_{j}^{-1/\theta }\Bigr )^{-\theta }.}$$

Observe that, by the monotonicity of p-norms, \(b(\boldsymbol{w};\widehat{C}_{\mbox{ Arch}})\) is strictly decreasing in θ. For θ > θ ′, let C and C ′ be two copulas of the form \(\widehat{C}_{\mbox{ Arch}}\) with parameters θ and θ ′, respectively; thus \(b(\boldsymbol{w};C) < b(\boldsymbol{w};C^{\prime})\) for all \(\boldsymbol{w > 0}\). For 1 ≤ i ≤ d, let F i have the Fréchet df with tail index α (i.e., \(F_{i}(s) =\exp \{ -{s}^{-\alpha }\}\)) and F i ′ have the distribution of Pareto Type II with tail index α (i.e., \(F_{i}^{\prime}(s) = 1 - {(1 + s)}^{-\alpha }\)). Clearly, F i and F i ′ are tail equivalent. Let \(\boldsymbol{X}\) and \(\boldsymbol{X}^{\prime}\) have the dfs of

$$\displaystyle{C(F_{1}(x_{1}),\ldots,F_{d}(x_{d})),\ \mbox{ and}\ C^{\prime}(F_{1}^{\prime}(x_{1}),\ldots,F_{d}^{\prime}(x_{d})),}$$

respectively, and by Theorem 19.3.6, \(\boldsymbol{X} \leq _{\mathrm{tuod}}\boldsymbol{X}^{\prime}\).
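The closed form of \(b(\boldsymbol{w};\widehat{C}_{\mbox{ Arch}})\) can be checked by simulation. A minimal Python sketch (assumed instance: the Clayton copula with parameter δ = 2, whose generator \(\phi (s) = {(1 + s)}^{-1/\delta }\) has tail index θ = 1∕δ, so that \(b(\boldsymbol{w};\widehat{C}) = (w_{1}^{-\delta } + w_{2}^{-\delta })^{-1/\delta }\); since \((1 - V_{1},1 - V_{2})\) has df \(\widehat{C}\) when \((V_{1},V_{2})\) has df C, the joint lower tail of the Clayton copula estimates b):

```python
import numpy as np

# Assumed instance: Clayton copula with parameter delta; its generator has
# tail index theta = 1/delta, so b(w; C_hat) = (w1^-delta + w2^-delta)^(-1/delta).
rng = np.random.default_rng(5)
delta, n, u = 2.0, 10**6, 0.001

# Marshall-Olkin sampling of the Clayton copula:
# V_i = (1 + E_i/S)^(-1/delta), with S ~ Gamma(1/delta, 1), E_i i.i.d. Exp(1).
S = rng.gamma(1.0 / delta, size=n)
E = rng.exponential(size=(n, 2))
V = (1.0 + E / S[:, None]) ** (-1.0 / delta)

w = np.array([1.0, 2.0])
emp = np.mean((V < u * w).all(axis=1)) / u   # P{V_i < u w_i, i = 1, 2} / u
exact = (w[0] ** -delta + w[1] ** -delta) ** (-1.0 / delta)
print(emp, exact)
```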

Example 19.3.9.

The exponent functions of Archimedean copulas were derived in [34, 177] (also see Propositions 2.5 and 3.3 in [213]). Let \(C_{\mbox{ Arch}}(\boldsymbol{u};\phi ) =\phi (\sum _{i=1}^{d}{\phi }^{-1}(u_{i}))\) be an Archimedean copula, where the generator \({\phi }^{-1}\) is regularly varying at 1 with tail index β > 1. The upper exponent function of C Arch is given by

$$\displaystyle{a(\boldsymbol{w};C_{\mbox{ Arch}}) =\Bigl (\sum _{j=1}^{d}w_{j}^{\beta }\Bigr )^{1/\beta }.}$$

Observe that \(a(\boldsymbol{w};C_{\mbox{ Arch}})\) is strictly decreasing in β. For β < β ′, let C and C ′ be two copulas of the form C Arch with parameters β and β ′, respectively; thus \(a(\boldsymbol{w};C) > a(\boldsymbol{w};C^{\prime})\) for all w > 0. For 1 ≤ i ≤ d, let F i have the Fréchet df with tail index α (i.e., \(F_{i}(s) =\exp \{ -{s}^{-\alpha }\}\)) and F i ′ have the distribution of Pareto Type II with tail index α (i.e., \(F_{i}^{\prime}(s) = 1 - {(1 + s)}^{-\alpha }\)), such that F i and F i ′ are tail equivalent. Let \(\boldsymbol{X}\) and \(\boldsymbol{X}^{\prime}\) have the dfs of

$$\displaystyle{C(F_{1}(x_{1}),\ldots,F_{d}(x_{d}))\quad \mbox{ and}\quad C^{\prime}(F_{1}^{\prime}(x_{1}),\ldots,F_{d}^{\prime}(x_{d})),}$$

respectively, and by Theorem 19.3.7, \(\boldsymbol{X} \leq _{\mathrm{tlod}}\boldsymbol{X}^{\prime}\).

Remark 19.3.10.

Due to the homogeneity property, the conditions on tail dependence and exponent functions used in Theorem 19.3.6 (2) and (3) and in Theorem 19.3.7 (2) and (3) can be simplified. For example, it is sufficient in Theorem 19.3.6 (2) and (3) to verify that \(b(\boldsymbol{w};C) < b(\boldsymbol{w};C^{\prime})\) for all \(\boldsymbol{w} = (w_{1},\ldots,w_{d})\) with w i > 0, 1 ≤ id, and \(\vert \vert \boldsymbol{w}\vert \vert = 1\), where | | ⋅ | | denotes any norm on \(\mathbb{R}_{+}^{d}\).

Theorem 19.3.11

Let \((\boldsymbol{X}_{n},n \geq 1)\) and \((\boldsymbol{X}^{\prime}_{n},n \geq 1)\) be two i.i.d. samples with dfs F and F′, respectively. Assume that F ∈DA (G) and F′∈DA (G′) with tail equivalent Fréchet margins.

  1.

    If \(\boldsymbol{X}_{n} \leq _{\mathrm{tlod}}\boldsymbol{X}^{\prime}_{n}\), then \(G \leq _{\mathrm{lod}}G^{\prime}\).

  2.

    \(G \leq _{\mathrm{lod}}G^{\prime}\) if and only if \(G \leq _{\mathrm{tlod}}G^{\prime}\).

Proof: Let \(\boldsymbol{Y } = (Y _{1},\ldots,Y _{d})\) and \(\boldsymbol{Y }^{\prime} = (Y _{1}^{\prime},\ldots,Y _{d}^{\prime})\) denote two random vectors that have the same distributions as those of \(\boldsymbol{X}_{n}\) and \(\boldsymbol{X}_{n}^{\prime}\), respectively.

  (1)

    It follows from Theorem 19.1.1 and Remark 19.1.2 (1) that

    $$\displaystyle{{[\mathrm{P}\{\boldsymbol{X}_{k} \leq \boldsymbol{ a_{n}x}\}]}^{n} \rightarrow G(\boldsymbol{x}),\ \forall \ \boldsymbol{x} = (x_{ 1},\ldots,x_{d}) \in \mathbb{R}_{+}^{d},}$$

    where \(\boldsymbol{a}_{n} = (a_{1,n},\ldots,a_{d,n}) = (\overline{F}_{1}^{\,-1}(1/n),\ldots,\overline{F}_{d}^{\,-1}(1/n))\). Taking logarithms on both sides and using \(\log s \sim -(1 - s)\) as s → 1, we have, as n → ∞,

    $$\displaystyle{ n\log \mathrm{P}\{\boldsymbol{X}_{k} \leq \boldsymbol{ a_{n}x}\} \approx -n\mathrm{P}\big\{ \cup _{i=1}^{d}\{Y _{ i} > a_{i,n}x_{i}\}\big\} \rightarrow \log G(\boldsymbol{x}). }$$
    (19.3.6)

    Since the margins are tail equivalent, \(a_{i,n}/a_{1,n} \rightarrow 1\) as n → ∞. Thus, for any small \(\varepsilon > 0\), when n is sufficiently large,

    $$\displaystyle{ (1-\varepsilon )a_{1,n} \leq a_{i,n} \leq (1+\varepsilon )a_{1,n},\ i = 1,\ldots,d, }$$
    (19.3.7)

    which imply that

    $$\displaystyle\begin{array}{rcl} & & -n\mathrm{P}\big\{ \cup _{i=1}^{d}\{Y _{ i} > a_{1,n}x_{i}\}\big\} \geq -n\mathrm{P}\big\{ \cup _{i=1}^{d}\{Y _{ i} > a_{i,n}{(1+\varepsilon )}^{-1}x_{ i}\}\big\} {}\\ & & \quad \rightarrow \log G({(1+\varepsilon )}^{-1}\boldsymbol{x}). {}\\ \end{array}$$

    Observing that \(\log G(\boldsymbol{x})\) is homogeneous of order − α, we have

    $$\displaystyle{\liminf _{n\rightarrow \infty }\big[-n\mathrm{P}\big\{ \cup _{i=1}^{d}\{Y _{ i} > a_{1,n}x_{i}\}\big\}\big] \geq {(1+\varepsilon )}^{\alpha }\log G(\boldsymbol{x}).}$$

    Similarly,

    $$\displaystyle{\limsup _{n\rightarrow \infty }\big[-n\mathrm{P}\big\{ \cup _{i=1}^{d}\{Y _{ i} > a_{1,n}x_{i}\}\big\}\big] \leq {(1-\varepsilon )}^{\alpha }\log G(\boldsymbol{x}).}$$

    Letting \(\varepsilon \rightarrow 0\), we obtain

    $$\displaystyle{ \lim _{n\rightarrow \infty }\big[-n\mathrm{P}\big\{ \cup _{i=1}^{d}\{Y _{ i} > a_{1,n}x_{i}\}\big\}\big] =\log G(\boldsymbol{x}). }$$
    (19.3.8)

    That is, using the tail equivalence Eq. (19.3.7), we can rewrite the limit Eq. (19.3.6) in the form of Eq. (19.3.8), in which the scaling a 1, n is the same for all margins. Working on \(\boldsymbol{X}_{n}^{\prime}\) in the same way, we also obtain that

    $$\displaystyle{ \lim _{n\rightarrow \infty }\big[-n\mathrm{P}\big\{ \cup _{i=1}^{d}\{Y _{ i}^{\prime} > a_{1,n}^{\prime}x_{i}\}\big\}\big] =\log G^{\prime}(\boldsymbol{x}), }$$
    (19.3.9)

    where \(\boldsymbol{a}_{n}^{\prime} = (a_{1,n}^{\prime},\ldots,a_{d,n}^{\prime}) = ({\overline{F^{\prime}_{1}}}^{\,-1}(1/n),\ldots,{\overline{F^{\prime}_{d}}}^{\,-1}(1/n))\). Again, since the margins are tail equivalent, \(a^{\prime}_{1,n}/a_{1,n} \rightarrow 1\) as n → ∞. Using the same idea as in Eq. (19.3.7), the limit Eq. (19.3.9) is equivalent to

    $$\displaystyle{ \lim _{n\rightarrow \infty }\big[-n\mathrm{P}\big\{ \cup _{i=1}^{d}\{Y _{ i}^{\prime} > a_{1,n}x_{i}\}\big\}\big] =\log G^{\prime}(\boldsymbol{x}). }$$
    (19.3.10)

    Since \(\boldsymbol{X}_{n} \leq _{\mathrm{tlod}}\boldsymbol{X}^{\prime}_{n}\), via Eq. (19.3.1), we have

    $$\displaystyle{\mathrm{P}\big\{ \cup _{i=1}^{d}\{Y _{ i} > a_{1,n}x_{i}\}\big\} \geq \mathrm{ P}\big\{ \cup _{i=1}^{d}\{Y _{ i}^{\prime} > a_{1,n}x_{i}\}\big\}.}$$

    It follows from Eqs. (19.3.8) and (19.3.10) that \(G(\boldsymbol{x}) \leq G^{\prime}(\boldsymbol{x})\) for all \(\boldsymbol{x} \in \mathbb{R}_{+}^{d}\); a numerical illustration of the single-scaling limit Eq. (19.3.8) is sketched after the proof.

  (2)

    It follows from the Pickands representation [see Remark 19.1.2 (4)] that

    $$\displaystyle\begin{array}{rcl} G(\boldsymbol{x})& =& \exp \Big\{-c\int _{\mathbb{S}_{+}^{d-1}}\max _{1\leq i\leq d}\{{(a_{i}/x_{i})}^{\alpha }\}\mathbb{Q}(\mathrm{d}\boldsymbol{a})\Big\}, {}\end{array}$$
    (19.3.11)
    $$\displaystyle\begin{array}{rcl} G^{\prime}(\boldsymbol{x})& =& \exp \Big\{-c^{\prime}\int _{\mathbb{S}_{+}^{d-1}}\max _{1\leq i\leq d}\{{(a_{i}/x_{i})}^{\alpha }\}\mathbb{Q}^{\prime}(\mathrm{d}\boldsymbol{a})\Big\}, {}\end{array}$$
    (19.3.12)

    where c > 0, c′ > 0, and \(\mathbb{Q}\) and \(\mathbb{Q}^{\prime}\) are probability measures defined on the unit sphere \(\mathbb{S}_{+}^{d-1}\). Scaling the argument \(\boldsymbol{x}\) by t > 0 in both dfs, we have

    $$\displaystyle\begin{array}{rcl} G(t\boldsymbol{x})& =& \exp \Big\{-\frac{c}{{t}^{\alpha }}\int _{\mathbb{S}_{+}^{d-1}}\max _{1\leq i\leq d}\{{(a_{i}/x_{i})}^{\alpha }\}\mathbb{Q}(\mathrm{d}\boldsymbol{a})\Big\}, {}\\ G^{\prime}(t\boldsymbol{x})& =& \exp \Big\{-\frac{c^{\prime}}{{t}^{\alpha }}\int _{\mathbb{S}_{+}^{d-1}}\max _{1\leq i\leq d}\{{(a_{i}/x_{i})}^{\alpha }\}\mathbb{Q}^{\prime}(\mathrm{d}\boldsymbol{a})\Big\}. {}\\ \end{array}$$

    For each fixed \(\boldsymbol{x}\), as t → ∞,

    $$\displaystyle{\frac{1 - G(t\boldsymbol{x})} {1 - G^{\prime}(t\boldsymbol{x})} \sim \frac{c\int _{\mathbb{S}_{+}^{d-1}}\max _{1\leq i\leq d}\{{(a_{i}/x_{i})}^{\alpha }\}\mathbb{Q}(\mathrm{d}\boldsymbol{a})} {c^{\prime}\int _{\mathbb{S}_{+}^{d-1}}\max _{1\leq i\leq d}\{{(a_{i}/x_{i})}^{\alpha }\}\mathbb{Q}^{\prime}(\mathrm{d}\boldsymbol{a})}.}$$

    If \(G \leq _{\mathrm{tlod}}G^{\prime}\), then \(1 - G(t\boldsymbol{x}) \geq 1 - G^{\prime}(t\boldsymbol{x})\) for all \(t > t_{\boldsymbol{x}}\), where \(t_{\boldsymbol{x}}\) is sufficiently large. That is,

    $$\displaystyle{c\int _{\mathbb{S}_{+}^{d-1}}\max _{1\leq i\leq d}\{{(a_{i}/x_{i})}^{\alpha }\}\mathbb{Q}(\mathrm{d}\boldsymbol{a}) \geq c^{\prime}\int _{\mathbb{S}_{+}^{d-1}}\max _{1\leq i\leq d}\{{(a_{i}/x_{i})}^{\alpha }\}\mathbb{Q}^{\prime}(\mathrm{d}\boldsymbol{a}),}$$

    which, together with Eqs. (19.3.11) and (19.3.12), implies that \(G(\boldsymbol{x}) \leq G^{\prime}(\boldsymbol{x})\) for all \(\boldsymbol{x}\).

Conversely, it is trivial that Glod G ′ implies that Gtlod G ′.
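To see the single-scaling limit Eq. (19.3.8) concretely, the following sketch takes independent standard Fréchet margins, for which \(G(\boldsymbol{x}) =\exp \{ -\sum _{i=1}^{d}x_{i}^{-\alpha }\}\); the independence assumption and all numerical values are illustrative only and play no role in the proof.

```python
import numpy as np

# Illustration of Eq. (19.3.8) with independent Frechet margins
# F(s) = exp(-s^(-alpha)), for which -log G(x) = sum_i x_i^(-alpha).
alpha = 2.0
F = lambda s: np.exp(-s ** (-alpha))
Fbar_inv = lambda p: (-np.log(1.0 - p)) ** (-1.0 / alpha)  # (1 - F)^{-1}

x = np.array([0.8, 1.5, 3.0])
for n in [10**2, 10**4, 10**6]:
    a1n = Fbar_inv(1.0 / n)              # a_{1,n} = Fbar_1^{-1}(1/n)
    union = 1.0 - np.prod(F(a1n * x))    # P(union_i {Y_i > a_{1,n} x_i})
    print(n, n * union)                  # converges to -log G(x)
print("-log G(x):", np.sum(x ** (-alpha)))
```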

Using similar arguments, we can also establish the upper orthant dependence comparisons for MEV distributions.

Theorem 19.3.12

Let \((\boldsymbol{X}_{n},n \geq 1)\) and \((\boldsymbol{X}^{\prime}_{n},n \geq 1)\) be two i.i.d. samples with dfs F and F′, respectively. Assume that \(F \in \mbox{ DA}_{\wedge }(H)\) and \(F^{\prime} \in \mbox{ DA}_{\wedge }(H^{\prime})\) with tail equivalent, negative Fréchet margins (i.e., \(F_{i}(x) = F_{i}^{\prime}(x) = 1 -\exp \{-{(-x)}^{-\theta }\}\), x < 0).

  1.

    If \(\boldsymbol{X}_{n} \leq _{\mathrm{tuod}}\boldsymbol{X}^{\prime}_{n}\), then \(H \leq _{\mathrm{uod}}H^{\prime}\).

  2.

    \(H \leq _{\mathrm{uod}}H^{\prime}\) if and only if \(H \leq _{\mathrm{tuod}}H^{\prime}\).

As illustrated in Theorems 19.3.11 (2) and 19.3.12 (2), upper tail comparisons of MEV dfs trickle down to comparisons of the entire distributions, owing to the scaling property of homogeneity.
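The ratio argument in the proof of Theorem 19.3.11 (2) is easy to reproduce with discrete spectral measures. In the sketch below, \(\mathbb{Q}\) and \(\mathbb{Q}^{\prime}\) put masses \(q_{k}\) on points \(\boldsymbol{a}_{k}\) of the unit sphere; the points, masses, and constants c, c′, α are made-up assumptions chosen only to display the stabilizing tail ratio.

```python
import numpy as np

# Pickands-type representation with discrete spectral measures:
# G(x) = exp(-c * sum_k q_k * max_i (a_{k,i}/x_i)^alpha).
# All numbers below are made-up illustrative assumptions.
alpha, c, cp = 2.0, 1.0, 1.5
A,  q  = np.array([[0.6, 0.8], [1.0, 0.0]]), np.array([0.7, 0.3])  # Q
Ap, qp = np.array([[0.8, 0.6], [0.0, 1.0]]), np.array([0.5, 0.5])  # Q'

def spectral_int(x, A, q):
    """The integral  int max_i (a_i/x_i)^alpha Q(da)  for discrete Q."""
    return np.sum(q * np.max((A / x) ** alpha, axis=1))

x = np.array([1.0, 2.0])
G  = lambda t: np.exp(-c  * spectral_int(x, A,  q)  / t ** alpha)
Gp = lambda t: np.exp(-cp * spectral_int(x, Ap, qp) / t ** alpha)

# The tail ratio (1 - G(tx)) / (1 - G'(tx)) stabilizes as t grows,
# at (c * integral) / (c' * integral'), which drives the comparison
# of G and G' at every x.
for t in [1.0, 10.0, 100.0]:
    print(t, (1.0 - G(t)) / (1.0 - Gp(t)))
print("limit:", c * spectral_int(x, A, q) / (cp * spectral_int(x, Ap, qp)))
```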

We conclude this paper with an example to illustrate an idea for obtaining asymptotic Fréchet bounds.

Example 19.3.13.

Let \((T_{1},T_{2})\) denote a random vector with a Marshall-Olkin distribution [307], defined as follows:

$$\displaystyle{T_{1} = E_{1} \wedge E_{12},\ T_{2} = E_{2} \wedge E_{12},}$$

where \(E_{1},E_{2},E_{12}\) are i.i.d. exponentially distributed with unit mean, so that \(T_{1}\) and \(T_{2}\) are each exponentially distributed with rate 2. Clearly, for \(t_{1} > 0\) and \(t_{2} > 0\),

$$\displaystyle{\mathrm{P}\{T_{1} > t_{1},T_{2} > t_{2}\} = {e}^{-(t_{1}+t_{2})-t_{1}\vee t_{2}} < {e}^{-2(t_{1}\vee t_{2})} =\mathrm{ P}\{T_{1} > t_{1},T_{1} > t_{2}\}.}$$

Hence \((T_{1},T_{2}) \leq _{\mathrm{uod}}(T_{1},T_{1})\); the df of \((T_{1},T_{1})\) is the Fréchet upper bound for the class of dfs with these fixed exponential margins.
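A quick Monte Carlo sanity check of these two survival functions (the sample size, seed, and evaluation point are arbitrary choices):

```python
import numpy as np

# Monte Carlo check for the Marshall-Olkin comparison above; the
# sample size, seed, and the point (t1, t2) are arbitrary choices.
rng = np.random.default_rng(19)
E1, E2, E12 = rng.exponential(1.0, size=(3, 10**6))
T1, T2 = np.minimum(E1, E12), np.minimum(E2, E12)

t1, t2 = 0.5, 1.0
print(np.mean((T1 > t1) & (T2 > t2)),    # Marshall-Olkin survival
      np.exp(-(t1 + t2) - max(t1, t2)))  # e^{-(t1 + t2) - t1 v t2}
print(np.mean(T1 > max(t1, t2)),         # comonotone pair (T1, T1)
      np.exp(-2 * max(t1, t2)))          # e^{-2 (t1 v t2)}
```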

Let R 1 and R 2 denote two nonnegative random variables that are independent of \((T_{1},T_{2})\). Assume that the survival functions of \(R_{1}^{-1}\) and \(R_{2}^{-1}\) are tail equivalent, regularly varying with tail index − α [see Eq. (19.1.6)]. Since \(R_{1}^{-1}\) and \(R_{2}^{-1}\) are tail equivalent, the margins of \((R_{2}^{-1}T_{1},R_{2}^{-1}T_{2})\) and \((R_{1}^{-1}T_{1},R_{1}^{-1}T_{1})\) are all tail equivalent. It follows from Theorem 3.2 of [280] that the upper tail dependence functions of \((R_{2}^{-1}T_{1},R_{2}^{-1}T_{2})\) and \((R_{1}^{-1}T_{1},R_{1}^{-1}T_{1})\) are given by

$$\displaystyle\begin{array}{rcl} b(w_{1},w_{2})& =& {2}^{\alpha }{(w_{1} + w_{2} + w_{1} \vee w_{2})}^{-\alpha }, {}\\ b^{\prime}(w_{1},w_{2})& =& {2}^{\alpha }{(w_{1} \vee w_{2})}^{-\alpha }. {}\\ \end{array}$$

Clearly, \(b(w_{1},w_{2}) < b^{\prime}(w_{1},w_{2})\), and thus by Theorem 19.3.6 we have \((R_{2}^{-1}T_{1},R_{2}^{-1}T_{2}) \leq _{\mathrm{tuod}}(R_{1}^{-1}T_{1},R_{1}^{-1}T_{1})\). Note that the df of \((R_{1}^{-1}T_{1},R_{1}^{-1}T_{1})\) is viewed as an asymptotic Fréchet upper bound in the sense of tail upper orthant order, because the respective margins of \((R_{2}^{-1}T_{1},R_{2}^{-1}T_{2})\) and \((R_{1}^{-1}T_{1},R_{1}^{-1}T_{1})\) are only tail equivalent, rather than being identical as required in the case of Fréchet bounds.