Abstract
We establish two-sided bounds for expectations of order statistics (k-th maxima) of moduli of coordinates of centered log-concave random vectors with uncorrelated coordinates. Our bounds are exact up to multiplicative universal constants in the unconditional case for all k and in the isotropic case for \(k \leq n - cn^{5/6}\). We also derive two-sided estimates for expectations of sums of k largest moduli of coordinates for some classes of random vectors.
3.1 Introduction and Main Results
For a vector \(x\in {\mathbb R}^n\) let \(k{\text -}\max x_i\) (or \(k{\text -}\min x_i\)) denote its k-th maximum (respectively its k-th minimum), i.e. its k-th maximal (respectively k-th minimal) coordinate. For a random vector X = (X 1, …, X n), \(k{\text -}\min X_i\) is also called the k-th order statistic of X.
Let X = (X_1, …, X_n) be a random vector with finite first moment. In this note we try to estimate \({\mathbb E}\, k{\text -}\max _i|X_i|\) and \({\mathbb E}\max _{|I|=k}\sum _{i\in I}|X_i|\), the expected sum of the k largest moduli of the coordinates of X.
Order statistics play an important role in various statistical applications and there is an extensive literature on this subject (cf. [2, 5] and references therein).
We put special emphasis on the case of log-concave vectors, i.e. random vectors X satisfying the property \({\mathbb P}(X\in \lambda K+(1-\lambda )L)\geq {\mathbb P}(X\in K)^{\lambda }{\mathbb P}(X\in L)^{1-\lambda }\) for any λ ∈ [0, 1] and any nonempty compact sets K and L. By a result of Borell [3] a vector X with full dimensional support is log-concave if and only if it has a log-concave density, i.e. a density of the form \(e^{-h(x)}\), where h is convex with values in (−∞, ∞]. A typical example of a log-concave vector is a vector uniformly distributed over a convex body. In recent years the study of log-concave vectors has attracted the attention of many researchers, cf. the monographs [1, 4].
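For instance (a standard verification, added here for concreteness), both the standard Gaussian measure and the uniform measure on a convex body K fit Borell's characterization:
$$\displaystyle (2\pi)^{-n/2}e^{-|x|^2/2}=e^{-h(x)}\ \text{ with convex }\ h(x)=\frac{|x|^2}{2}+\frac{n}{2}\ln(2\pi),\qquad \frac{{\mathbf 1}_K(x)}{\operatorname{vol}(K)}=e^{-h(x)}\ \text{ with }\ h=\ln\operatorname{vol}(K)\ \text{on}\ K,\ h=+\infty\ \text{off}\ K,$$
the latter h being convex precisely because K is convex.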
To bound the sum of the k largest coordinates of X we define
$$\displaystyle t(k,X):=\inf\Big\{t>0:\ \sum_{i=1}^n{\mathbb E}|X_i|{\mathbf 1}_{\{|X_i|\geq t\}}\leq kt\Big\} \qquad (3.1)$$
and start with an easy upper bound.
Proposition 3.1
For any random vector X with finite first moment and any 1 ≤ k ≤ n we have
$$\displaystyle {\mathbb E}\max_{|I|=k}\sum_{i\in I}|X_i|\leq 2kt(k,X). \qquad (3.2)$$
Proof
For any t > 0 we have
$$\displaystyle \max_{|I|=k}\sum_{i\in I}|X_i|\leq kt+\sum_{i=1}^n|X_i|{\mathbf 1}_{\{|X_i|\geq t\}},$$
and taking expectations at t = t(k, X) bounds the right-hand side by kt + kt = 2kt.
□
It turns out that this bound may be reversed for vectors with independent coordinates or, more generally, for vectors satisfying the following condition:
$$\displaystyle {\mathbb P}(|X_i|\geq s,\ |X_j|\geq t)\leq \alpha\,{\mathbb P}(|X_i|\geq s)\,{\mathbb P}(|X_j|\geq t)\quad\text{for all }i\neq j\text{ and all }s,t. \qquad (3.3)$$
If α = 1 this means that moduli of coordinates of X are negatively correlated.
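In particular, if the coordinates of X are independent, then the joint tails factorize and condition (3.3) holds with α = 1:
$$\displaystyle {\mathbb P}(|X_i|\geq s,\ |X_j|\geq t)={\mathbb P}(|X_i|\geq s)\,{\mathbb P}(|X_j|\geq t)\qquad\text{for }i\neq j.$$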
Theorem 3.2
Suppose that a random vector X satisfies condition (3.3) with some α ≥ 1. Then there exists a constant c(α) > 0, depending only on α, such that for any 1 ≤ k ≤ n,
$$\displaystyle {\mathbb E}\max_{|I|=k}\sum_{i\in I}|X_i|\geq c(\alpha)\,kt(k,X).$$
We may take c(α) = (288(5 + 4α)(1 + 2α))−1.
In the case of i.i.d. coordinates, two-sided bounds for \({\mathbb E}\max _{|I|=k} \sum _{i\in I} | a_iX_i|\) in terms of an Orlicz norm (related to the distribution of X_i) of the vector (a_i)_{i≤n} were known before, see [7].
Log-concave vectors with diagonal covariance matrices behave in many aspects like vectors with independent coordinates. This is true also in our case.
Theorem 3.3
Let X be a log-concave random vector with uncorrelated coordinates (i.e. \({\operatorname {Cov}}(X_{i},X_{j})=0\) for i ≠ j). Then for any 1 ≤ k ≤ n,
$$\displaystyle ckt(k,X)\leq{\mathbb E}\max_{|I|=k}\sum_{i\in I}|X_i|\leq Ckt(k,X).$$
In the above statement and in the sequel c and C denote positive universal constants.
The next two examples show that the lower bound cannot hold when n ≫ k if only the marginal distributions of the X_i are log-concave, or if the coordinates of X are highly correlated.
Example 3.1
Let X = (ε 1g, ε 2g, …, ε ng), where ε 1, …, ε n, g are independent, \({\mathbb P}(\varepsilon _i=\pm 1)=1/2\) and g has the normal \({\mathcal N}(0,1)\) distribution. Then \({\operatorname {Cov}} X = {\operatorname {Id}}\) and it is not hard to check that \({\mathbb E}\max _{|I|=k}\sum _{i\in I}|X_i|=k \sqrt {2/\pi }\) and \(t(k, X)\sim \ln ^{1/2} (n/k)\) if k ≤ n∕2.
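Indeed, |X_i| = |g| for every i, so the sum of any k of the moduli equals k|g| and
$$\displaystyle {\mathbb E}\max_{|I|=k}\sum_{i\in I}|X_i|=k\,{\mathbb E}|g|=k\sqrt{2/\pi}.$$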
Example 3.2
Let X = (g, …, g), where \(g\sim {\mathcal N}(0,1)\). Then, as in the previous example, \({\mathbb E}\max _{|I|=k}\sum _{i\in I}|X_i|=k\sqrt {2/\pi }\) and \(t(k, X)\sim \ln ^{1/2} (n/k)\).
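The asymptotics of t(k, X) in both examples follow from the Gaussian tail identity (a short computation based on the definition (3.1) above):
$$\displaystyle \sum_{i=1}^n{\mathbb E}|X_i|{\mathbf 1}_{\{|X_i|\geq t\}}=n\sqrt{2/\pi}\,e^{-t^2/2},$$
so the crossing equation \(n\sqrt{2/\pi}\,e^{-t^2/2}=kt\) gives \(t(k,X)\sim\ln^{1/2}(n/k)\) for k ≤ n/2.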
Question 3.1
Let \(X^{\prime }=(X^{\prime }_1,X^{\prime }_2,\ldots ,X^{\prime }_n)\) be a decoupled version of X, i.e. \(X^{\prime }_i\) are independent and \(X^{\prime }_i\) has the same distribution as X_i. Due to Theorem 3.2 (applied to X′), the assertion of Theorem 3.3 may be stated equivalently as
$$\displaystyle {\mathbb E}\max_{|I|=k}\sum_{i\in I}|X_i|\sim{\mathbb E}\max_{|I|=k}\sum_{i\in I}|X^{\prime}_i|.$$
Is the more general fact true that for any symmetric norm ∥⋅∥ and any log-concave vector X with uncorrelated coordinates
$$\displaystyle {\mathbb E}\|X\|\sim{\mathbb E}\|X^{\prime}\|?$$
Maybe such an estimate holds at least in the case of unconditional log-concave vectors?
We turn our attention to bounding k-th maxima of |X_i|. This was investigated in [8] (under some strong assumptions on the function \(t\mapsto {\mathbb P}(|X_i|\ge t)\)) and in the weighted i.i.d. setting in [7, 9, 15]. We will give different bounds valid for log-concave vectors, in which we do not have to assume independence, nor any special conditions on the growth of the distribution functions of the coordinates of X. To this end we need to define another quantity:
$$\displaystyle t^*(p,X):=\inf\Big\{t\geq 0:\ \sum_{i=1}^n{\mathbb P}(|X_i|\geq t)\leq p\Big\},\qquad p>0.$$
Theorem 3.4
Let X be a mean zero log-concave n-dimensional random vector with uncorrelated coordinates and 1 ≤ k ≤ n. Then
Moreover, if X is additionally unconditional then
The next theorem provides an upper bound in the general log-concave case.
Theorem 3.5
Let X be a mean zero log-concave n-dimensional random vector with uncorrelated coordinates and 1 ≤ k ≤ n. Then
and
In the isotropic case (i.e. \({\mathbb E} X_i=0\), \({\operatorname {Cov}} X = {\operatorname {Id}}\)) one may show that \(t^*(k/2,X)\sim t^*(k,X)\sim t(k,X)\) for k ≤ n/2 and \(t^*(p,X)\sim \frac {n-p}{n}\) for p ≥ n/4 (see Lemma 3.24 below). In particular \(t^*(n-k+1-(n-k+1)^{5/6}/2,X)\sim k/n+n^{-1/6}\) for k ≤ n/2. This together with the two previous theorems implies the following corollary.
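To see the last equivalence, note that for \(p=n-k+1-(n-k+1)^{5/6}/2\) and k ≤ n/2 we have n − k + 1 ≥ n/2, hence (filling in the arithmetic)
$$\displaystyle t^*(p,X)\sim\frac{n-p}{n}=\frac{k-1+\frac{1}{2}(n-k+1)^{5/6}}{n}\sim\frac{k}{n}+n^{-1/6}.$$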
Corollary 3.6
Let X be an isotropic log-concave n-dimensional random vector and 1 ≤ k ≤ n∕2. Then
and
If X is additionally unconditional then
Question 3.2
Does the second part of Theorem 3.4 hold without the unconditionality assumption? In particular, is it true that in the isotropic log-concave case \({\mathbb E}\, k\text{-}\min _{i\leq n}|X_i|\sim k/n\) for 1 ≤ k ≤ n/2?
Notation
Throughout this paper the letters C, c denote universal positive constants and C(α), c(α) denote constants depending only on the parameter α. The values of the constants C, c, C(α), c(α) may differ at each occurrence. If we need to fix the value of a constant, we use the letters C_0, C_1, … or c_0, c_1, …. We write f ∼ g if cf ≤ g ≤ Cf. For a random variable Z we denote \(\|Z\|_p=({\mathbb E}|Z|^p)^{1/p}\). Recall that a random vector X is called isotropic if \({\mathbb E} X=0\) and \({\operatorname {Cov}} X={\operatorname {Id}}\).
This note is organised as follows. In Sect. 3.2 we provide a lower bound for the sum of k largest coordinates, which involves the Poincaré constant of a vector. In Sect. 3.3 we use this result to obtain Theorem 3.3. In Sect. 3.4 we prove Theorem 3.2 and provide its application to comparison of weak and strong moments. In Sect. 3.5 we prove the first part of Theorem 3.4 and in Sect. 3.6 we prove the second part of Theorems 3.4, 3.5, and Lemma 3.24.
3.2 Exponential Concentration
A probability measure μ on \({\mathbb R}^n\) satisfies exponential concentration with constant α > 0 if for any Borel set A with μ(A) ≥ 1/2,
$$\displaystyle \mu(A+tB_2^n)\geq 1-e^{-t/\alpha}\quad\text{for all }t>0.$$
We say that a random n-dimensional vector satisfies exponential concentration if its distribution has such a property.
It is well known that exponential concentration is implied by the Poincaré inequality
$$\displaystyle \operatorname{Var} f(X)\leq\beta\,{\mathbb E}|\nabla f(X)|^2\quad\text{for all locally Lipschitz }f\colon{\mathbb R}^n\to{\mathbb R},$$
with \(\alpha \leq 3\sqrt {\beta }\) (cf. [12, Corollary 3.2]).
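A classical one-dimensional illustration (the constants below are standard, cf. [12], and are not taken from this chapter): the symmetric exponential measure \(\nu(dx)=\frac12 e^{-|x|}\,dx\) satisfies the Poincaré inequality with β = 4, hence exponential concentration with constant α ≤ 6:
$$\displaystyle \operatorname{Var}_\nu f\leq 4\int_{{\mathbb R}}|f^{\prime}|^2\,d\nu,\qquad \nu(A+tB_2^1)\geq 1-e^{-t/6}\ \ \text{whenever}\ \nu(A)\geq\tfrac12.$$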
Obviously, the constant in the exponential concentration is not linearly invariant. Typically one assumes that the vector is isotropic. For our purposes a more natural normalization will be that all coordinates have L 1-norm equal to 1.
The next proposition states that bound (3.2) may be reversed under the assumption that X satisfies the exponential concentration.
Proposition 3.7
Assume that Y = (Y 1, …, Y n) satisfies the exponential concentration with constant α > 0 and \({\mathbb E} |Y_i|\geq 1\) for all i. Then for any sequence \(a=(a_i)_{i=1}^n\) of real numbers and X i := a iY i we have
where t(k, X) is given by (3.1).
We begin the proof with a few simple observations.
Lemma 3.8
For any real numbers z 1, …, z n and 1 ≤ k ≤ n we have
Proof
Without loss of generality we may assume that z 1 ≥ z 2 ≥… ≥ z n ≥ 0. Then
□
Fix a sequence (X_i)_{i≤n} and define for s ≥ 0,
$$\displaystyle N(s):=\sum_{i=1}^n{\mathbf 1}_{\{|X_i|\geq s\}}. \qquad (3.6)$$
Corollary 3.9
For any k = 1, …, n,
and for any t > 0,
In particular
Proof
We have
where the last equality follows by Lemma 3.8.
Moreover,
The last part of the assertion easily follows, since
□
Proof of Proposition 3.7
To shorten the notation put t k := t(k, X). Without loss of generality we may assume that a 1 ≥ a 2 ≥… ≥ a n ≥ 0 and a ⌈k∕4⌉ = 1. Observe first that
so we may assume that \(t_k\geq 16\alpha /\sqrt {k}\).
Let μ be the law of Y and
We have
so we may assume that μ(A) ≥ 1∕2.
Observe that if y ∈ A and z is a point such that \(\sum _{i=1}^n {\mathbf 1}_{\{|a_iz_i|\geq s\}}\geq l >k\) for some s ≥ t_k, then
Thus we have
Therefore
and
where to get the next-to-last inequality we used the fact that \(t_k\geq 16\alpha /\sqrt {k}\).
Hence Corollary 3.9 and the definition of t_k yield
so \({\mathbb E}\max _{|I|=k}\sum _{i\in I}|X_i|\geq \frac {1}{2}kt_k\). □
We finish this section with a simple fact that will be used in the sequel.
Lemma 3.10
Suppose that a measure μ satisfies exponential concentration with constant α. Then for any c ∈ (0, 1) and any Borel set A with μ(A) > c we have
$$\displaystyle \mu(A+sB_2^n)\geq 1-\frac{1}{c}\,e^{-s/\alpha}\quad\text{for all }s\geq 0.$$
Proof
Let \(D:={\mathbb R}^n\setminus (A+rB_2^n)\). Observe that \(D+rB_2^n\) has an empty intersection with A so if μ(D) ≥ 1∕2 then
and \(r<\alpha \ln (1/c)\). Hence \(\mu (A+\alpha \ln (1/c)B_2^n)\geq 1/2\), therefore for s ≥ 0,
and the assertion easily follows. □
3.3 Sums of Largest Coordinates of Log-Concave Vectors
We will use the regular growth of moments of norms of log-concave vectors multiple times. By [4, Theorem 2.4.6], if \(f:{\mathbb R}^n \to {\mathbb R}\) is a seminorm and X is log-concave, then
$$\displaystyle ({\mathbb E} f(X)^p)^{1/p}\leq C_1\frac{p}{q}\,({\mathbb E} f(X)^q)^{1/q}\quad\text{for }p\geq q\geq 1, \qquad (3.7)$$
where C_1 is a universal constant.
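In particular, taking q = 1 and p = 2 in (3.7) — the instance used repeatedly below — gives, for the seminorm f(x) = |x_i|,
$$\displaystyle \|X_i\|_2\leq 2C_1\,{\mathbb E}|X_i|,\qquad\text{i.e.}\qquad {\mathbb E}|X_i|\geq\frac{\|X_i\|_2}{2C_1}.$$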
We will also apply a few times the functional version of the Grünbaum inequality (see [14, Lemma 5.4]), which states that for any mean zero log-concave random variable Y,
$$\displaystyle \min\{{\mathbb P}(Y\geq 0),\,{\mathbb P}(Y\leq 0)\}\geq\frac{1}{e}. \qquad (3.8)$$
Let us start with a few technical lemmas. The first one will be used to reduce the proofs of Theorem 3.3 and of the lower bound in Theorem 3.4 to the symmetric case.
Lemma 3.11
Let X be a log-concave n-dimensional vector and X ′ be an independent copy of X. Then for any 1 ≤ k ≤ n,
and
Proof
The first estimate follows by the easy bound
To get the second bound we may and will assume that \({\mathbb E} |X_1|\geq {\mathbb E} |X_2|\geq \ldots \geq {\mathbb E} |X_n|\). Let us define \(Y:=X-{\mathbb E} X\), \(Y^{\prime }:=X^{\prime }-{\mathbb E} X\) and \(M:=\frac {1}{k}\sum _{i=1}^k{\mathbb E}|X_i|\geq \max _{i\geq k}{\mathbb E}|X_i|\). Obviously
We have \({\mathbb E} Y_i=0\), thus \({\mathbb P}(Y_i\leq 0)\geq 1/e\) by (3.8). Hence
for t ≥ 0. In the same way we show that
Therefore
We have
Together with (3.11) we get
and (3.9) easily follows.
In order to prove (3.10), note that for u > 0,
thus the last part of the assertion follows by the definition of parameters t ∗. □
Lemma 3.12
Suppose that V is a real symmetric log-concave random variable. Then for any t > 0 and λ ∈ (0, 1],
$$\displaystyle {\mathbb E}|V|{\mathbf 1}_{\{|V|\geq\lambda t\}}\geq\frac{\lambda}{4}\,{\mathbb P}(|V|\geq t)^{\lambda-1}\,{\mathbb E}|V|{\mathbf 1}_{\{|V|\geq t\}}.$$
Moreover, if \({\mathbb P}(|V|\ge t)\le 1/4\) , then \({\mathbb E} |V|{\mathbf 1}_{\{|V|\geq t\}} \leq 4t {\mathbb P}(|V|\geq t).\)
Proof
Without loss of generality we may assume that \({\mathbb P}(|V|\geq t)\leq 1/4\) (otherwise the first estimate is trivial).
Observe that \({\mathbb P}(|V|\geq s)=\exp (-N(s))\) where N : [0, ∞) → [0, ∞] is convex and N(0) = 0. In particular
and
We have
This implies the second part of the lemma.
To conclude the proof of the first bound it is enough to observe that
□
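As a sanity check of the second bound (a computation added here, not part of the original argument), take V symmetric exponential, so that \({\mathbb P}(|V|\geq s)=e^{-s}\); then \({\mathbb P}(|V|\geq t)\leq 1/4\) forces \(t\geq\ln 4>1/3\), and
$$\displaystyle {\mathbb E}|V|{\mathbf 1}_{\{|V|\geq t\}}=\int_t^\infty se^{-s}\,ds=(t+1)e^{-t}\leq 4te^{-t}=4t\,{\mathbb P}(|V|\geq t).$$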
Proof of Theorem 3.3
By Proposition 3.1 it is enough to show the lower bound. By Lemma 3.11 we may assume that X is symmetric. We may also obviously assume that \(\|X_i\|_2^2={\mathbb E} X_i^2>0\) for all i.
Let Z = (Z_1, …, Z_n), where Z_i = X_i/∥X_i∥_2. Then Z is log-concave, isotropic and, by (3.7), \({\mathbb E} |Z_i|\geq 1/ (2C_1)\) for all i. Set Y := 2C_1Z. Then X_i = a_iY_i and \({\mathbb E} |Y_i|\geq 1\). Moreover, since any m-dimensional projection of Z is a log-concave, isotropic m-dimensional vector, we know by the result of Lee and Vempala [13] that it satisfies exponential concentration with constant \(Cm^{1/4}\). (In fact an easy modification of the proof below shows that for our purposes it would be enough to have exponential concentration with constant \(Cm^{\gamma}\) for some γ < 1/2, so one may also use Eldan’s result [6], which gives such estimates for any γ > 1/3.) So any m-dimensional projection of Y satisfies exponential concentration with constant \(C_2m^{1/4}\).
Let us fix k and set t := t(k, X); then (since the X_i have no atoms)
$$\displaystyle \sum_{i=1}^n{\mathbb E}|X_i|{\mathbf 1}_{\{|X_i|\geq t\}}=kt. \qquad (3.12)$$
For l = 1, 2, … define
$$\displaystyle I_l:=\{i\leq n:\ \beta^{l}<{\mathbb P}(|X_i|\geq t)\leq\beta^{l-1}\},$$
where β = 2^{−8}. By (3.12) there exists l such that
$$\displaystyle \sum_{i\in I_l}{\mathbb E}|X_i|{\mathbf 1}_{\{|X_i|\geq t\}}\geq 2^{-l}kt.$$
Let us consider three cases.
(1) l = 1 and \(|I_1|\leq k\). Then
$$\displaystyle {\mathbb E}\max_{|I|=k}\sum_{i\in I}|X_i|\geq \sum_{i\in I_1}{\mathbb E} |X_i|{\mathbf 1}_{\{|X_i|\geq t\}}\geq \frac{1}{2}kt. $$
(2) l = 1 and \(|I_1|>k\). Choose J ⊂ I_1 of cardinality k. Then
$$\displaystyle {\mathbb E}\max_{|I|=k}\sum_{i\in I}|X_i|\geq \sum_{i\in J}{\mathbb E}|X_i|\geq \sum_{i\in J}t{\mathbb P}(|X_i|\geq t)\geq \beta kt. $$
(3) l > 1. By Lemma 3.12 (applied with λ = 1/8) we have
$$\displaystyle \sum_{i\in I_l}{\mathbb E} |X_i|{\mathbf 1}_{\{|X_i|\geq t/8\}}\geq \frac{1}{32}\beta^{-7(l-1)/8}\sum_{i\in I_l}{\mathbb E} |X_i|{\mathbf 1}_{\{|X_i|\geq t\}} \geq \frac{1}{32}\beta^{-7(l-1)/8}2^{-l}kt. \qquad (3.13)$$
Moreover, for i ∈ I_l we have \({\mathbb P}(|X_i|\geq t)\leq \beta ^{l-1}\leq 1/4\), so the second part of Lemma 3.12 yields
$$\displaystyle 4t|I_l|\beta^{l-1}\geq \sum_{i\in I_l}{\mathbb E} |X_i|{\mathbf 1}_{\{|X_i|\geq t\}}\geq kt2^{-l}, $$
and \(|I_l|\geq \beta^{1-l}2^{-l-2}k=2^{7l-10}k\geq k\).
Set \(k^{\prime}:=\beta^{-7l/8}2^{-l}k=2^{6l}k\). If k′ ≥ |I_l| then, using (3.13), we estimate
Otherwise set \(X^{\prime }=(X_i)_{i\in I_l}\) and \(Y^{\prime }=(Y_i)_{i\in I_l}\). By (3.12) we have
$$\displaystyle kt\geq \sum_{i\in I_l}{\mathbb E}|X_i|{\mathbf 1}_{\{|X_i|\geq t\}}\geq \sum_{i\in I_l}t\,{\mathbb P}(|X_i|\geq t)\geq t\beta^{l}|I_l|,$$
so \(|I_l|\leq k\beta^{-l}\) and Y′ satisfies exponential concentration with constant \(\alpha^{\prime}=C_2k^{1/4}\beta^{-l/4}\). Estimate (3.13) yields
$$\displaystyle \sum_{i\in I_l}{\mathbb E}|X_i|{\mathbf 1}_{\{|X_i|\geq t/8\}}\geq \frac{1}{32}2^{7(l-1)}2^{-l}kt=2^{-12}k^{\prime}t,$$
so \(t(k^{\prime},X^{\prime})\geq 2^{-12}t\). Moreover, by Proposition 3.7 we have (since \(k^{\prime}\leq |I_l|\))
To conclude observe that
and since k ′≥ k,
□
3.4 Vectors Satisfying Condition (3.3)
Proof of Theorem 3.2
By Proposition 3.1 we need to show only the lower bound. Assume first that variables X i have no atoms and k ≥ 4(1 + α).
Let t_k = t(k, X). Then \({\mathbb E} \sum _{i=1}^n |X_i| {\mathbf 1}_{\{|X_i| \ge t_k\}} = kt_k \). Note that (3.3) implies that for all i ≠ j we have
We may assume that \({\mathbb E} \max _{|I|=k} \sum _{i\in I} |X_i| \le \frac 16 kt_k\), because otherwise the lower bound holds trivially.
Let us define
Since
it suffices to bound below the probability that Y ≥ kt k∕2 by a constant depending only on α.
We have
Therefore \(A^2 \le (1+2\alpha )k^2t_k^2\) and for any l ≥ k∕2 we have
By Corollary 3.9 we have (recall definition (3.6))
Assumption (3.3) implies that
Moreover for s ≥ kt k we have
so
Thus
and
This together with (3.16) and the assumption that k ≥ 4(1 + α) implies
and
Therefore
This applied to (3.15) with l = (12 + 24α)k gives us \({\mathbb P}(Y\ge kt_k/2)\ge ( 144+288\alpha )^{-1}\) and in consequence
Since k ↦ kt(k, X) is non-decreasing, in the case \(k\leq\lceil 4(1+\alpha)\rceil=:k_0\) we have
The last step is to remove the assumption that the X_i have no atoms. Note that both assumption (3.3) and the lower bound depend only on \((|X_i|)_{i=1}^n\), so we may assume that the X_i are nonnegative almost surely. Consider \(X^{\varepsilon }:=(X_i +\varepsilon Y_i)_{i=1}^n\), where Y_1, …, Y_n are i.i.d. nonnegative random variables with \({\mathbb E} Y_i <\infty \) and a density g, independent of X. Then for every s, t > 0 we have (observe that (3.3) holds also for s < 0 or t < 0)
Thus \(X^{\varepsilon}\) satisfies assumption (3.3) and has a density for every ε > 0. Therefore for all natural k we have
Clearly, \({\mathbb E} \max _{|I|=k} \sum _{i\in I} X_i^{\varepsilon } \to {\mathbb E} \max _{|I|=k} \sum _{i\in I} X_i\) as ε → 0, so the lower bound holds in the case of arbitrary X satisfying (3.3). □
We may use Theorem 3.2 to obtain a comparison of weak and strong moments for the supremum norm:
Corollary 3.13
Let X be an n-dimensional centered random vector satisfying condition (3.3). Assume that
$$\displaystyle \|X_i\|_{2p}\leq\beta\|X_i\|_p\quad\text{for every }p\geq 1\text{ and }i\leq n. \qquad (3.17)$$
Then the following comparison of weak and strong moments for the supremum norm holds: for all \(a\in {\mathbb R}^n\) and all p ≥ 1,
$$\displaystyle \Big({\mathbb E}\max_{i\leq n}|a_iX_i|^p\Big)^{1/p}\leq C(\alpha,\beta)\Big({\mathbb E}\max_{i\leq n}|a_iX_i|+\sup_{i\leq n}({\mathbb E}|a_iX_i|^p)^{1/p}\Big),$$
where C(α, β) is a constant depending only on α and β.
Proof
Let \(X^{\prime }=(X_i^{\prime })_{i\leq n}\) be a decoupled version of X. For any p > 0 the random vector \((|a_iX_i|^p)_{i\leq n}\) satisfies condition (3.3), so by Theorem 3.2 (applied with k = 1)
$$\displaystyle {\mathbb E}\max_{i\leq n}|a_iX_i|^p\sim{\mathbb E}\max_{i\leq n}|a_iX^{\prime}_i|^p$$
for all p > 0, up to a constant depending only on α. The coordinates of X ′ are independent and satisfy condition (3.17), so due to [11, Theorem 1.1] the comparison of weak and strong moments of X ′ holds, i.e. for p ≥ 1,
where C(β) depends only on β. These two observations yield the assertion. □
3.5 Lower Estimates for Order Statistics
The next lemma shows the relation between t(k, X) and t ∗(k, X) for log-concave vectors X.
Lemma 3.14
Let X be a symmetric log-concave random vector in \({\mathbb R}^n\). For any 1 ≤ k ≤ n we have
$$\displaystyle \frac{1}{3}\Big(t^*(k,X)+\frac{1}{k}\max_{|I|=k}\sum_{i\in I}{\mathbb E}|X_i|\Big)\leq t(k,X)\leq 4\Big(t^*(k,X)+\frac{1}{k}\max_{|I|=k}\sum_{i\in I}{\mathbb E}|X_i|\Big).$$
Proof
Let t_k := t(k, X) and \(t_k^*:=t^*(k,X)\). We may assume that no X_i is identically equal to 0. Then \(\sum _{i=1}^n {\mathbb P}(|X_i|\ge t_k^{*})=k\) and \(\sum _{i=1}^n {\mathbb E}|X_i|{\mathbf 1}_{\{|X_i|\ge t_k\}}=kt_k\).
Obviously \(t_k^*\leq t_k\). Also for any |I| = k we have
$$\displaystyle \sum_{i\in I}{\mathbb E}|X_i|\leq kt_k+\sum_{i=1}^n{\mathbb E}|X_i|{\mathbf 1}_{\{|X_i|\geq t_k\}}=2kt_k,$$
which yields the lower bound.
To prove the upper bound set
$$\displaystyle I_1:=\{i\leq n:\ {\mathbb P}(|X_i|\geq t_k^*)\geq 1/4\}.$$
We have
$$\displaystyle k=\sum_{i=1}^n{\mathbb P}(|X_i|\geq t_k^*)\geq\frac{1}{4}|I_1|,$$
so \(|I_1|\leq 4k\). Hence
Moreover by the second part of Lemma 3.12 we get
so
Hence if \(s=4t_k^*+\frac {4}{k}\max _{|I|=k}\sum _{i\in I}{\mathbb E}|X_i|\) then
that is t k ≤ s. □
To derive bounds for order statistics we will also need a few facts about log-concave vectors.
Lemma 3.15
Assume that Z is an isotropic one- or two-dimensional log-concave random vector with a density g. Then g(t) ≤ C for all t. If Z is one-dimensional, then also g(t) ≥ c for all |t|≤ t 0, where t 0 > 0 is an absolute constant.
Proof
We will use a classical result (see [4, Theorem 2.2.2, Proposition 3.3.1, Proposition 3.3.2, and Proposition 2.5.9]): \(\|g\|_{\infty}\sim g(0)\sim 1\) (note that here we use the assumption that Z is isotropic, in particular that \({\mathbb E} Z=0\), and that the dimension of Z is 1 or 2). This implies the upper bound on g.
In order to get the lower bound in the one-dimensional case, it suffices to prove that g(u) ≥ c for \(|u|=\varepsilon {\mathbb E} |Z|\ge (2C_1)^{-1} \varepsilon \), where ε ∈ (0, 1/4) is fixed and its value will be chosen later (then by log-concavity \(g(u)^{s}g(0)^{1-s}\leq g(su)\) for all s ∈ (0, 1)). Since −Z is again isotropic we may assume that u ≥ 0.
If g(u) ≥ g(0)∕e, then we are done. Otherwise by log-concavity of g we get
On the other hand, Z has mean zero, so \({\mathbb E} |Z|=2{\mathbb E} Z_{+}\) and by the Paley–Zygmund inequality and (3.7) we have
For \(\varepsilon <c_0/C_0\) we get a contradiction. □
Lemma 3.16
Let Y be a mean zero log-concave random variable and let \({\mathbb P}(|Y|\geq t)\leq p\) for some p > 0. Then
Proof
By the Grünbaum inequality (3.8) we have \({\mathbb P}(Y\geq 0)\geq 1/e\), hence
Since − Y satisfies the same assumptions as Y we also have
□
Lemma 3.17
Let Y be a mean zero log-concave random variable and let \({\mathbb P}(|Y|\geq t)\geq p\) for some p > 0. Then there exists a universal constant C such that
Proof
Without loss of generality we may assume that \({\mathbb E} Y^2=1\). Then by Chebyshev’s inequality \(t\leq p^{-1/2}\). Let g be the density of Y. By Lemma 3.15 we know that \(\|g\|_{\infty}\leq C\) and g ≥ c on [−t_0, t_0], where c, C and \(t_0\in(0,1)\) are universal constants. Thus
and
□
Now we are ready to give a proof of the lower bound in Theorem 3.4. The next proposition is a key part of it.
Proposition 3.18
Let X be a mean zero log-concave n-dimensional random vector with uncorrelated coordinates and let α > 1/4. Suppose that
$$\displaystyle {\mathbb P}(|X_i|\geq t^*(\alpha,X))\leq\frac{1}{C_3}\quad\text{for all }i\leq n.$$
Then
$$\displaystyle {\mathbb E}\,\lfloor 4\alpha\rfloor{\text -}\max_{i\leq n}|X_i|\geq c\,t^*(\alpha,X).$$
Proof
Let \(t^*=t^*(\alpha,X)\), k := ⌊4α⌋ and \(L=\lfloor \frac {\sqrt {C_3}}{4 \sqrt {e}}\rfloor \). We will choose \(C_3\) in such a way that L is large; in particular we may assume that L ≥ 2. Observe also that \(\alpha = \sum _{i=1}^n {\mathbb P}(|X_i|\geq t^*(\alpha ,X))\leq nC_3^{-1}\), thus \(Lk\leq C_3^{1/2}e^{-1/2}\alpha \leq e^{-1/2}C_3^{-1/2}n\leq n\) if \(C_3\geq 1>\frac 1e\). Hence
Lemma 3.16 and the definition of t ∗(α, X) yield
This yields \(t(Lk,X)\geq t^*(Lk,X) \geq \frac {t^*}{2}\) and by Theorem 3.3 we have
Since for any norm \({\mathbb P}(\|X\|\leq t{\mathbb E} \|X\|)\leq Ct\) for t > 0 (see [10, Corollary 1]) we have
Let X′ be an independent copy of X. By the Paley–Zygmund inequality and (3.7), \({\mathbb P}(|X_i|\geq \frac {1}{2}{\mathbb E} |X_i|)\geq \frac {({\mathbb E}|X_i|)^2}{4{\mathbb E} |X_i|^2}> \frac {1}{C_3}\) if \(C_3 >16C_1^2\), so \(\frac {1}{2}{\mathbb E}|X_i|\leq t^*\). Moreover it is easy to verify that k = ⌊4α⌋ > α for α > 1/4, thus \(t^*(k,X)\leq t^*(\alpha,X)=t^*\). Hence Proposition 3.1, Lemma 3.14, and inequality (3.10) yield
Therefore
so it is enough to choose \(C_3\) in such a way that \(L\geq 1600/c_2\). □
Proof of the First Part of Theorem 3.4
Let \(t^*=t^*(k-1/2,X)\) and \(C_3\) be as in Proposition 3.18. It is enough to consider the case when t* > 0; then \({\mathbb P}(|X_i|=t^*)=0\) for all i and \(\sum _{i=1}^n {\mathbb P}(|X_i|\geq t^*) = k-1/2\). Define
$$\displaystyle I_1:=\Big\{i\leq n:\ {\mathbb P}(|X_i|\geq t^*)\leq \frac{1}{C_3}\Big\},\qquad \alpha :=\sum_{i\in I_1}{\mathbb P}(|X_i|\geq t^*),\qquad \beta :=\sum_{i\notin I_1}{\mathbb P}(|X_i|\geq t^*).$$
If β = 0 then α = k − 1/2 and I_1 = {1, …, n}, and the assertion immediately follows from Proposition 3.18, since 4α ≥ k.
Otherwise define
We have by Lemma 3.17 applied with \(p=1/C_3\)
Thus
Therefore
If α < 1∕2 then ⌈β⌉ = k and the assertion easily follows. Otherwise Proposition 3.18 yields
Observe that for α ≥ 1∕2 we have ⌊4α⌋ + ⌈β⌉ ≥ 4α − 1 + β ≥ α + 1∕2 + β = k, so
□
Remark 3.19
A modification of the proof above shows that under the assumptions of Theorem 3.4 for any p < 1 there exists c(p) > 0 such that
3.6 Upper Estimates for Order Statistics
We will need a few more facts concerning log-concave vectors.
Lemma 3.20
Suppose that X is a mean zero log-concave random vector with uncorrelated coordinates. Then for any i ≠ j and s > 0,
$$\displaystyle {\mathbb P}(|X_i|\leq s,\ |X_j|\leq s)\leq C_6\,{\mathbb P}(|X_i|\leq s)\,{\mathbb P}(|X_j|\leq s).$$
Proof
Let \(C_7\), \(c_3\) and \(t_0\) be the constants from Lemma 3.15. If \(s>t_0\|X_i\|_2\) then, by Lemma 3.15, \({\mathbb P}(|X_i|\leq s)\geq 2c_3t_0\) and the assertion is obvious (with any \(C_6\geq (2c_3t_0)^{-1}\)). Thus we will assume that \(s\leq t_0\min \{\|X_i\|_2,\|X_j\|_2\}\).
Let \(\widetilde {X}_i=X_i/\|X_i\|_2\) and let \(g_{ij}\) be the density of \((\widetilde {X}_i,\widetilde {X}_j)\). By Lemma 3.15 we know that \(\|g_{ij}\|_{\infty}\leq C_7\), so
On the other hand the second part of Lemma 3.15 yields
□
Lemma 3.21
Let Y be a log-concave random variable. Then
Proof
We may assume that Y is non-degenerate (otherwise the statement is obvious), in particular Y has no atoms. Log-concavity of Y yields
Hence
Since − Y satisfies the same assumptions as Y , we also have
Adding both estimates we get
□
Lemma 3.22
Suppose that Y is a log-concave random variable and \({\mathbb P}(|Y|\leq t)\leq \frac {1}{10}\). Then \({\mathbb P}(|Y|\leq 21t)\geq 5{\mathbb P}(|Y|\leq t)\).
Proof
Let \({\mathbb P}(|Y|\leq t)=p\). Then by Lemma 3.21
□
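A quick illustration of Lemma 3.22 (added as a sanity check): for Y uniform on [−a, a] and \(p={\mathbb P}(|Y|\leq t)=t/a\leq\frac{1}{10}\),
$$\displaystyle {\mathbb P}(|Y|\leq 21t)=\min\{21p,\,1\}\geq 5p,$$
since 21p ≥ 5p, and 1 ≥ 5p whenever p ≤ 1/10.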
Let us now prove (3.4) and see how it implies the second part of Theorem 3.4. Then we give a proof of (3.5).
Proof of (3.4)
Fix k and set t ∗ := t ∗(k − 1∕2, X). Then \(\sum _{i=1}^n {\mathbb P}(|X_i|\geq t^*)=k-1/2\). Define
Observe that for u > 3 and 1 ≤ l ≤|I 1| we have by Lemma 3.21
Consider two cases.
Case 1. \(\beta>|I_2|-1/2\). Then \(|I_2|<\beta+1/2\leq k\), so \(k-|I_2|\geq 1\) and
Therefore by (3.23)
Case 2. \(\beta\leq|I_2|-1/2\). Observe that for any disjoint sets J_1, J_2 and integers l, m such that l ≤ |J_1|, m ≤ |J_2| we have
Since
we have ⌈α⌉ + ⌈β⌉≤ k + 1 and, by (3.24),
Estimate (3.23) yields
To estimate \(\lceil \beta \rceil \text{-}\max _{i\in I_2}|X_i| =(|I_2|+1-\lceil \beta \rceil )\text{-}\min _{i\in I_2}|X_i|\) observe that by Lemma 3.22, the definition of I 2 and assumptions on β,
Set \(l:=|I_2|+1-\lceil\beta\rceil\) and
$$\displaystyle \tilde{N}(s):=\sum_{i\in I_2}{\mathbf 1}_{\{|X_i|\leq s\}}.$$
Note that we know already that \({\mathbb E}\tilde {N}(21 t^*) \ge 2l\). Thus the Paley-Zygmund inequality implies
However Lemma 3.20 yields
Therefore
for sufficiently large u. □
The unconditionality assumption plays a crucial role in the proof of the next lemma, which allows to derive the second part of Theorem 3.4 from estimate (3.4).
Lemma 3.23
Let X be an unconditional log-concave n-dimensional random vector. Then for any 1 ≤ k ≤ n, u ≥ 1 and t > 0,
$$\displaystyle {\mathbb P}\big(k{\text -}\max_{i\leq n}|X_i|\geq ut\big)\leq{\mathbb P}\big(k{\text -}\max_{i\leq n}|X_i|\geq t\big)^{u}.$$
Proof
Let ν be the law of (|X_1|, …, |X_n|). Then ν is log-concave on \({\mathbb R}^n_+\). Define for t > 0,
$$\displaystyle A_t:=\{x\in{\mathbb R}^n_+:\ k{\text -}\max_{i\leq n}x_i\geq t\}.$$
It is easy to check that \(\frac {1}{u}A_{ut}+(1-\frac {1}{u}){\mathbb R}_+^n\subset A_t\), hence
$$\displaystyle \nu(A_t)\geq\nu(A_{ut})^{1/u}\,\nu({\mathbb R}^n_+)^{1-1/u}=\nu(A_{ut})^{1/u},$$
which is the desired estimate.
□
Proof of the Second Part of Theorem 3.4
Estimate (3.4) together with Lemma 3.23 yields
and the assertion follows by integration by parts. □
Proof of (3.5)
Define I_1, I_2, α and β by (3.21) and (3.22), where this time \(t^*=t^*(k-k^{5/6}/2,X)\). Estimate (3.23) is still valid, so integration by parts yields
Set
Observe that
Hence \(\lceil\alpha\rceil+k_\beta\leq k+1\).
If \(k_\beta>|I_2|\), then \(k-|I_2|\geq\lceil\alpha\rceil+k_\beta-1-|I_2|\geq\lceil\alpha\rceil\), so
Therefore it suffices to consider only the case \(k_\beta\leq|I_2|\).
Since \(\lceil\alpha\rceil+k_\beta-1\leq k\) and \(k_\beta\leq|I_2|\), we have by (3.24),
Since \(\beta \leq k-\frac {1}{2}k^{5/6}\) and \(x\mapsto x-\frac {1}{2}x^{5/6}\) is increasing for x ≥ 1/2, we have
Therefore, considering \((X_{i})_{i\in I_2}\) instead of X and \(k_\beta\) instead of k, it is enough to show the following claim:
Let s > 0, n ≥ k and let X be an n-dimensional log-concave vector with uncorrelated coordinates. Suppose that
then
We will show the claim by induction on k. For k = 1 the statement is obvious (since the assumptions are contradictory). Suppose now that k ≥ 2 and the assertion holds for k − 1.
Case 1. \({\mathbb P}(|X_{i_0}|\geq s)\geq 1-\frac {5}{12}k^{-1/6}\) for some \(1\leq i_0\leq n\). Then
where to get the last inequality we used that \(x^{5/6}\) is concave on \({\mathbb R}_+\), so \((1-t)^{5/6}\leq 1-\frac {5}{6}t\) for t = 1/k. Therefore by the induction assumption applied to \((X_i)_{i\neq i_0}\),
Case 2. \({\mathbb P}(|X_{i}|\leq s)\geq \frac {5}{12}k^{-1/6}\) for all i. Applying Lemma 3.15 we get
so \(\max_i\|X_i\|_2\leq Ck^{1/6}s\). Moreover \(n\leq \frac {10}{9}k\). Therefore by the result of Lee and Vempala [13], X satisfies exponential concentration with \(\alpha\leq C_9k^{5/12}s\).
Let \(l=\lceil k-\frac {1}{2}(k^{5/6}-1)\rceil \); then \(s\geq t^*(l-1/2,X)\) and \(k-l+1\geq \frac {1}{2}(k^{5/6}-1) \geq \frac {1}{9}k^{5/6}\). Let
By (3.4) (applied with l instead of k) we have \({\mathbb P}(X\in A)\geq c_{4}\). Observe that
Therefore by Lemma 3.10 we get
Integration by parts yields
and the induction step is shown in this case provided that \(C_8\geq C_{10}+3C_9(1-\ln c_{4})\). □
To obtain Corollary 3.6 we used the following lemma.
Lemma 3.24
Assume that X is a symmetric isotropic log-concave vector in \({\mathbb R}^n\) . Then
and
Proof
Observe that
Thus Lemma 3.15 implies that for \(p\geq c_5n\) (with \(c_5\in (\frac 12,1)\)) we have \(t^*(p,X)\sim \frac {n-p}{n}\). Moreover, by the Markov inequality
so \(t^*(n/4,X)\leq 4\). Since \(p\mapsto t^*(p,X)\) is non-increasing, we know that \(t^*(p,X)\sim 1\) for \(n/4\leq p\leq c_5n\).
Now we will prove (3.26). We have
so it suffices to show that \(t^*(k,X)\geq ct(k,X)\). To this end we fix k ≤ n/2. By (3.25) we know that \(t:=C_{11}t^*(k,X)\geq C_{11}t^*(n/2,X)\geq e\), so the isotropicity of X and Markov’s inequality yield \({\mathbb P}(|X_i|\geq t)\leq e^{-2}\) for all i. We may also assume that \(t\geq t^*(k,X)\). Integration by parts and Lemma 3.21 yield
Therefore
so \(t(k,X)\leq 4C_{11}t^*(k,X)\). □
References
S. Artstein-Avidan, A. Giannopoulos, V.D. Milman, Asymptotic Geometric Analysis. Part I (American Mathematical Society, Providence, 2015)
N. Balakrishnan, A.C. Cohen, Order Statistics and Inference (Academic Press, New York, 1991)
C. Borell, Convex measures on locally convex spaces. Ark. Mat. 12, 239–252 (1974)
S. Brazitikos, A. Giannopoulos, P. Valettas, B.H. Vritsiou, Geometry of Isotropic Convex Bodies (American Mathematical Society, Providence, 2014)
H.A. David, H.N. Nagaraja, Order Statistics, 3rd edn. (Wiley-Interscience, Hoboken, 2003)
R. Eldan, Thin shell implies spectral gap up to polylog via a stochastic localization scheme. Geom. Funct. Anal. 23, 532–569 (2013)
Y. Gordon, A. Litvak, C. Schütt, E. Werner, Orlicz norms of sequences of random variables. Ann. Probab. 30(4), 1833–1853 (2002)
Y. Gordon, A. Litvak, C. Schütt, E. Werner, On the minimum of several random variables. Proc. Amer. Math. Soc. 134(12), 3665–3675 (2006)
Y. Gordon, A. Litvak, C. Schütt, E. Werner, Uniform estimates for order statistics and Orlicz functions. Positivity 16(1), 1–28 (2012)
R. Latała, On the equivalence between geometric and arithmetic means for log-concave measures, in Convex Geometric Analysis (Berkeley, 1996). Mathematical Sciences Research Institute Publications, vol. 34 (Cambridge University Press, Cambridge, 1999), pp. 123–127
R. Latała, M. Strzelecka, Comparison of weak and strong moments for vectors with independent coordinates. Mathematika 64(1), 211–229 (2018)
M. Ledoux, The Concentration of Measure Phenomenon (American Mathematical Society, Providence, 2001)
Y.T. Lee, S. Vempala, Eldan’s stochastic localization and the KLS hyperplane conjecture: an improved lower bound for expansion, in 58th Annual IEEE Symposium on Foundations of Computer Science – FOCS 2017 (IEEE Computer Society, Los Alamitos, 2017), pp. 998–1007
L. Lovász, S. Vempala, The geometry of logconcave functions and sampling algorithms. Random Struct. Algoritm. 30(3), 307–358 (2007)
J. Prochno, S. Riemer, On the maximum of random variables on product spaces. Houston J. Math. 39(4), 1301–1311 (2013)
Acknowledgements
The research of RL was supported by the National Science Centre, Poland grant 2015/18/A/ST1/00553 and of MS by the National Science Centre, Poland grants 2015/19/N/ST1/02661 and 2018/28/T/ST1/00001.
© 2020 Springer Nature Switzerland AG

Latała, R., Strzelecka, M. (2020). Two-Sided Estimates for Order Statistics of Log-Concave Random Vectors. In: Klartag, B., Milman, E. (eds) Geometric Aspects of Functional Analysis. Lecture Notes in Mathematics, vol 2266. Springer, Cham. https://doi.org/10.1007/978-3-030-46762-3_3