
5.1 Introduction

We say that a random variable ξ has sub-Gaussian moment bounded above by K > 0 if

$$\displaystyle \begin{aligned} {\mathbb P} \{|\xi|\geq t\} \leq \exp\big(1-t^2/(2K^2)\big), \quad t\geq 0. \end{aligned}$$
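
For instance, a standard Gaussian random variable ξ has sub-Gaussian moment bounded above by 1, since

$$\displaystyle \begin{aligned} {\mathbb P}\{|\xi|\geq t\}\leq 2\exp(-t^2/2)\leq \exp\big(1-t^2/2\big),\quad t\geq 0, \end{aligned}$$

and any random variable bounded in absolute value by M > 0 has sub-Gaussian moment bounded above by M (the left hand side vanishes for t > M, while for t ≤ M the right hand side is at least 1).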

Let A be an n × n random matrix with i.i.d. entries of zero mean, unit variance and sub-Gaussian moment bounded above by K, and denote by \(s_i(A)\), 1 ≤ i ≤ n, its singular values arranged in non-increasing order. We will write \(s_{\max }(A)\) and \(s_{\min }(A)\) for \(s_1(A)\) and \(s_n(A)\), respectively. Estimating the magnitude of the condition number,

$$\displaystyle \begin{aligned}\kappa(A)=s_{\max}(A)/s_{\min}(A),\end{aligned}$$

is a well studied problem, with connections to numerical analysis and computation of the limiting distribution of the matrix spectrum; we refer, in particular, to [20] for discussion. Since the largest singular value \(s_{\max }(A)\) is strongly concentrated (see the proof of Corollary 5.1.2 below), estimating κ(A) is essentially reduced to estimating \(s_{\min }(A)\) from above and below.

The main result of [12] provides small ball probability estimates for \(s_{\min }(A)\) of the form

$$\displaystyle \begin{aligned}{\mathbb P}\big\{s_{\min}(A)\leq t/\sqrt{n}\big\}\leq Ct+e^{-c n},\quad t\leq 1,\end{aligned}$$

for some C, c > 0 depending only on the sub-Gaussian moment. It seems natural to investigate the complementary regime—the large deviation estimates for \(s_{\min }(A)\). It was shown in [13] that

$$\displaystyle \begin{aligned} {\mathbb P}\big\{s_{\min}(A)\geq t/\sqrt{n}\big\}\leq \frac{C\ln t}{t}+e^{-c n},\quad t\geq 2 \end{aligned}$$

(see also [21] for an extension of this result to distributions with no assumptions on moments higher than 2). The probability estimate was improved in [10] to

$$\displaystyle \begin{aligned} {\mathbb P}\big\{s_{\min}(A)\geq t/\sqrt{n}\big\}\leq e^{-ct},\quad t\geq 2, \end{aligned}$$

for c > 0 depending only on the sub-Gaussian moment. The existing results on the distribution of the singular values of random Gaussian matrices [4, 18] suggest that the optimal dependence on t in the exponent on the right hand side is quadratic, i.e. the variable \(\sqrt {n}\,s_{\min }(A)\) is sub-Gaussian. Specifically, it is shown in [18] that \(s_{\min }(G)\) for the standard n × n Gaussian matrix G satisfies two-sided estimates

$$\displaystyle \begin{aligned} \exp(-C t^2)\leq {\mathbb P}\big\{s_{\min}(G)\geq t/\sqrt{n}\big\}\leq \exp(-c t^2),\quad t\geq C_1, \end{aligned}$$

where \(C, C_1, c > 0\) are some universal constants. The main result of our note provides a matching upper estimate for matrices with sub-Gaussian entries:

Theorem 5.1.1

Let A be an n × n random matrix with i.i.d. entries of zero mean, unit variance, and sub-Gaussian moment bounded above by K > 0. Then the smallest singular value \(s_{\min }(A)\) satisfies

$$\displaystyle \begin{aligned} {\mathbb P}\big\{s_{\min}(A)\geq t/\sqrt{n}\big\}\leq 2\exp(-c t^2),\quad t\geq 1, \end{aligned}$$

where c > 0 is a constant depending only on K.

As a simple corollary of the theorem, we obtain small ball probability estimates for the condition number:

Corollary 5.1.2

Let A be an n × n random matrix with i.i.d. entries of zero mean, unit variance, and sub-Gaussian moment bounded above by K > 0. Then the condition number κ(A) satisfies

$$\displaystyle \begin{aligned} {\mathbb P}\big\{\kappa(A)\leq n/t\big\}\leq 2\exp(-c t^2),\quad t\geq 1, \end{aligned}$$

where c > 0 is a constant depending only on K.

Theorem 5.1.1 is a consequence of the following theorem, which is of independent interest.

Theorem 5.1.3

Under the conditions of Theorem 5.1.1 one has

$$\displaystyle \begin{aligned} {\mathbb P}\big\{\|A^{-1}\|{}_{HS} \leq \min(n/t, \, \sqrt{n/t})\big\}\leq 2\exp(-c t^2),\quad t\geq 0, \end{aligned}$$

where c > 0 is a constant depending only on K.

The proof of Theorem 5.1.3 uses, as a main step, the estimates

$$\displaystyle \begin{aligned}{\mathbb P}\big\{s_{n-k+1}(A)\leq ck/\sqrt{n}\big\}\leq 2 \exp(-c k^2),\quad \quad 1\leq k\leq n,\end{aligned}$$

for the singular values of the matrix A. These estimates, based on the restricted invertibility of matrices and certain averaging arguments, were recently obtained by Nguyen [9] under some additional assumptions (which will be discussed in the next section).

5.2 Preliminaries

Given a matrix A, its singular values \(s_i = s_i(A)\), i ≥ 1, are the square roots of the eigenvalues of \(AA^\top \). We always assume that \(s_1\geq s_2\geq \ldots \) By ∥A∥ and \(\|A\|{}_{HS}\) we denote the operator \(\ell_2\to \ell_2\) norm of A (also called the spectral norm) and the Hilbert–Schmidt norm, respectively. Note that

$$\displaystyle \begin{aligned} \|A\| = s_1 \quad \quad \mbox{ and } \quad \quad \|A\|{}_{HS}^2=\sum_{i\geq 1} s_i^2. \end{aligned}$$
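
We will also use the standard inequalities

$$\displaystyle \begin{aligned} \|ST\|{}_{HS}\leq \|S\|\, \|T\|{}_{HS} \quad \quad \mbox{ and } \quad \quad \|ST\|{}_{HS}\leq \|S\|{}_{HS}\, \|T\|, \end{aligned}$$

valid whenever the product ST is defined.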

The columns and rows of A are denoted by \({\mathbf{C}}_i(A)\) and \({\mathbf{R}}_i(A)\), i ≥ 1, respectively. Given J ⊂ [m], the coordinate projection in \({\mathbb R}^m\) onto \({\mathbb R}^J\) is denoted by \(P_J\). For convenience, we often write \(A_J\) instead of \(AP_J\). Given ℓ ≥ 1, the identity operator \({\mathbb R}^\ell \to {\mathbb R}^\ell \) is denoted by \(I_\ell \). Given \(x, y\in {\mathbb {R}^n}\), by \(\left \langle x, \cdot \right \rangle y\) we denote the operator \(z\mapsto \left \langle x, z\right \rangle y\) (in the literature it is often denoted by x ⊗ y or \(yx^\top \)). The canonical Euclidean norm in \({\mathbb R}^m\) is denoted by \(\|\cdot \|{}_2\) and the unit Euclidean sphere by \(S^{m-1}\).

As the most important part of our argument, we will use the following result.

Theorem 5.2.1

Let A be an n × n random matrix with i.i.d. entries of zero mean, unit variance, and sub-Gaussian moment bounded above by K > 0. Then for any 1 ≤ k ≤ n one has

$$\displaystyle \begin{aligned}{\mathbb P}\big\{s_{n-k+1}(A)\leq ck/\sqrt{n}\big\}\leq 2 \exp(-c k^2),\end{aligned}$$

where c > 0 is a constant depending only on K.

The above theorem, up to some minor modifications, was proved by Nguyen in [9]. Specifically, in the case \(k\geq C\log n\), the theorem follows from [9, Theorem 1.7] (or [9, Corollary 1.8]) if one additionally assumes either that the entries of A are uniformly bounded by a constant, or that the distribution density of the entries is bounded. Removing these conditions requires a minor change of the proof in [9]. Further, in the case \(k\leq C\log n\), the above result (in fact, in a stronger form) is stated as formula (4) in [9, Theorem 1.4]. However, [9, Theorem 3.6], which is used to derive [9, formula (4)], provides a non-trivial probability estimate only for the event \(\{s_{n-k+1}(A)\leq c_\gamma k^{1-\gamma }/\sqrt {n}\}\) (for any given γ ∈ (0, 1) and c γ depending on γ), see [9, formula (31)]. Again, a minor update of the argument of [9] provides the result needed for our purposes. In view of the above and for the reader’s convenience, we provide a proof of Theorem 5.2.1 in the last section.

The following result was proved in [17] as an extension of the classical Bourgain–Tzafriri restricted invertibility theorem [2]. With worse dependence on ε, the theorem was earlier proved in [22]. See also recent papers [1, 8] for further improvements and discussions.

Theorem 5.2.2 ([17])

Let T be an n × n matrix. Then for any ε ∈ (0, 1) there is a set J ⊂ [n] such that

$$\displaystyle \begin{aligned} \ell:=|J|\geq \bigg\lfloor\frac{\varepsilon^2 \|T\|{}_{HS}^2}{\|T\|{}^2}\bigg\rfloor \quad \quad \mathit{\mbox{ and }} \quad \quad s_\ell (T_J)\geq \frac{(1-\varepsilon)\|T\|{}_{HS}}{\sqrt{n}}. \end{aligned}$$

We will use the following two results by Rudelson–Vershynin. The first one was one of the key ingredients in estimating the smallest singular value of rectangular matrices. The second one is an immediate consequence of the Hanson–Wright inequality [5, 23] generalized in [15].

Theorem 5.2.3 ([14, Theorem 4.1])

Let X be a vector in \({\mathbb R}^n\), whose coordinates are i.i.d. mean-zero, sub-Gaussian random variables with unit variance. Let F be a random subspace in \({\mathbb R}^n\) spanned by n − ℓ vectors, \(1\leq \ell \leq c^{\prime } n\), whose coordinates are i.i.d. mean-zero, sub-Gaussian random variables with unit variance, jointly independent with X. Then, for every ε > 0, one has

$$\displaystyle \begin{aligned} {\mathbb P}\big\{{\mathrm{dist}} (X, F)\leq \varepsilon \sqrt{\ell} \big\}\leq (C\varepsilon)^\ell + \exp(-cn), \end{aligned}$$

where \(C > 0\) and \(c, c^{\prime } \in (0, 1)\) are constants depending only on the sub-Gaussian moments.

Theorem 5.2.4 ([15, Corollary 3.1])

Let X be a vector in \({\mathbb R}^n\), whose coordinates are i.i.d. mean-zero random variables with unit variance and with sub-Gaussian moment bounded by K. Let F be a fixed subspace in \({\mathbb R}^n\) of dimension n − ℓ. Then, for every t > 0, one has

$$\displaystyle \begin{aligned} {\mathbb P}\big\{| {\mathrm{dist}} (X, F)-\sqrt{\ell}|\geq t \big\}\leq 2 \exp(-c t^2/K^4), \end{aligned}$$

where c > 0 is an absolute constant.
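
In particular, applying Theorem 5.2.4 with ℓ = n and F = {0} (so that \({\mathrm{dist}}(X, F)=\|X\|{}_2\)), we get

$$\displaystyle \begin{aligned} {\mathbb P}\big\{\big| \|X\|{}_2-\sqrt{n}\big|\geq t \big\}\leq 2 \exp(-c t^2/K^4),\quad t>0, \end{aligned}$$

which is the form used in the proof of Corollary 5.1.2 below.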

We will also need the following standard claim, which can be proved by integrating the indicator functions (see e.g., [9, Claim 3.4], cf. [7, Claim 4.9]).

Claim 5.2.5

Let α, p ∈ (0, 1). Let \({\mathcal E}\) be an event. Let Z be a finite index set, and \( \{{\mathcal E}_z \} _{z\in Z} \) be a collection of |Z| events satisfying \({\mathbb P}({\mathcal E}_z)\leq p\) for every z ∈ Z. Assume that at least α|Z| of the events \({\mathcal E}_z\) hold whenever the event \({\mathcal E}\) occurs. Then \({\mathbb P}({\mathcal E})\leq p/\alpha \).
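
For completeness, here is the short argument: by the assumption, \(\alpha |Z|\, {\mathbf{1}}_{{\mathcal E}}\leq \sum _{z\in Z} {\mathbf{1}}_{{\mathcal E}_z}\) everywhere, and taking expectations yields \(\alpha |Z|\, {\mathbb P}({\mathcal E})\leq \sum _{z\in Z} {\mathbb P}({\mathcal E}_z)\leq |Z|\, p\), whence \({\mathbb P}({\mathcal E})\leq p/\alpha \).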

5.3 Proofs of Main Results

Proof of Theorem 5.1.1

In the case t > n we have

$$\displaystyle \begin{aligned} {\mathbb P}\big\{ s_{\min}(A)\geq t/\sqrt{n} \big\} &= {\mathbb P}\Big\{s_1(A^{-1})\leq \sqrt{n}/t \Big\}\leq {\mathbb P}\Big\{\sum_{i=1}^n s_i(A^{-1})^2\leq n^2/t^2\Big\} \end{aligned} $$

and the result follows from Theorem 5.1.3.

Now we consider the case 1 ≤ t ≤ n. Let L ≥ 1 be a parameter which we will choose later. Then

$$\displaystyle \begin{aligned} {\mathbb P}\big\{s_{\min}(A)\geq t/\sqrt{n}\big\} &= {\mathbb P}\big\{s_{1}(A^{-1})\leq \sqrt{n}/t\big\} \\&\leq {\mathbb P}\Big\{s_1(A^{-1})^2\leq n/t^2\;\;\mbox{ and }\;\;\sum_{i\geq \lceil t\rceil} s_i(A^{-1})^2\geq L n/t\Big\}\\ & \quad +{\mathbb P}\Big\{s_1(A^{-1})^2\leq n/t^2\;\;\mbox{ and }\;\;\sum_{i\geq \lceil t\rceil} s_i(A^{-1})^2< L n/t\Big\}\\ &\leq {\mathbb P}\Big\{\sum_{i\geq \lceil t\rceil} s_i(A^{-1})^2\geq L n/t\Big\} \\ & \quad +{\mathbb P}\Big\{\sum_{i=1}^n s_i(A^{-1})^2\leq n/t+ L n/t\Big\}. \end{aligned} $$

For the first summand in the last expression, we apply Theorem 5.2.1. Since \(\sum _{i=\lceil t\rceil }^\infty \frac {1}{i^2}\leq \frac {2}{t}\), we obtain

$$\displaystyle \begin{aligned} {\mathbb P}\Big\{\sum_{i=\lceil t\rceil}^n s_i(A^{-1})^2\geq L n/t\Big\} &\leq \sum_{i=\lceil t\rceil}^n {\mathbb P}\big\{s_i(A^{-1})^2\geq L n/(2 i^2)\big\}\\ &= \sum_{i=\lceil t\rceil}^n {\mathbb P}\big\{s_{n-i+1}(A)\leq \sqrt{2} i/\sqrt{L n}\big\}. \end{aligned} $$

Choosing L so that \(\sqrt {2/L}\) is equal to the constant from Theorem 5.2.1, we get

$$\displaystyle \begin{aligned} \sum\limits_{i=\lceil t\rceil}^n {\mathbb P}\big\{s_{n-i+1}(A)\leq \sqrt{2} i/\sqrt{L n}\big\} \leq 2 \sum\limits_{i=\lceil t\rceil}^n \exp(-c i^2)\leq 3 \exp(-c^{\prime} t^2) \end{aligned}$$

for some \(c^{\prime } > 0\) depending only on K. The bound on the second summand follows from Theorem 5.1.3 applied with t∕(L + 1) instead of t. This completes the proof. □

Proof of Corollary 5.1.2

Theorem 5.2.4 (applied with F = {0}) implies that there exists a constant \(c_1 > 0\) depending only on K such that for every i ≤ n

$$\displaystyle \begin{aligned} {\mathbb P}(\|{\mathbf{C}}_i(A) \|{}_2\leq \sqrt{n}/2 )\leq \exp(- c_1 n) \end{aligned}$$

(this can be shown by direct calculations as well, see e.g. Fact 2.5 in [6]). Since \(\|A\|\geq \max_{i\leq n}\|{\mathbf{C}}_i(A)\|{}_2\) and the columns of A are independent, we obtain

$$\displaystyle \begin{aligned} {\mathbb P}(\|A \|\leq \sqrt{n}/2 )\leq \prod_{i=1}^n {\mathbb P}(\|{\mathbf{C}}_i(A) \|{}_2\leq \sqrt{n}/2 )\leq \exp(- c_1 n^2). \end{aligned}$$

Note that if \(\|A \|\geq \sqrt {n}/2\) and κ(A) ≤ n∕(2t) then \(s_n(A)= \|A\|/\kappa (A)\geq t/\sqrt {n}\). Therefore, by Theorem 5.1.1,

$$\displaystyle \begin{aligned} {\mathbb P}\{\kappa(A)\leq n/(2t)\} \leq 2\exp(-c t^2) + \exp(- c_1 n^2). \end{aligned}$$

By adjusting constants, this implies the conclusion for t ≤ n. Since κ(A) ≥ 1, the case t > n is trivial. □

Proof of Theorem 5.1.3

Adjusting the constant in the exponent if needed, without loss of generality, we assume that \(t \geq C_0\), where \(C_0 > 0\) is a large enough constant depending only on K. Denote

$$\displaystyle \begin{aligned} {\mathcal E}_0:=\bigg\{\sum_{i=1}^n s_i(A^{-1})^2\leq n/t\bigg\}. \end{aligned}$$

We first consider the case t ≤ n. Applying the negative second moment identity (see e.g. Exercise 2.7.3 in [19]),

$$\displaystyle \begin{aligned} \sum_{i=1}^n s_i(A^{-1})^2=\sum_{i=1}^n {\mathrm{dist}}\big({\mathbf{C}}_i(A),{\mathrm{span}}\{{\mathbf{C}}_j(A),\; j\neq i\}\big)^{-2}, \end{aligned}$$

we observe that on the event \({\mathcal E}_0\),

$$\displaystyle \begin{aligned}\big|\big\{i\leq n:\;{\mathrm{dist}}\big({\mathbf{C}}_i(A),{\mathrm{span}}\{{\mathbf{C}}_j(A),\; j\neq i\}\big)\geq \sqrt{t/2}\big\}\big|\geq n/2.\end{aligned}$$
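
Indeed, if fewer than n∕2 of the distances were at least \(\sqrt {t/2}\), then more than n∕2 of the summands in the identity above would exceed 2∕t, and the sum would be strictly larger than n∕t, contradicting the definition of \({\mathcal E}_0\).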

For each subset I ⊂ [n] of cardinality k ≤ n∕2 (the actual value of k will be defined later), let \({\mathbf{1}}_I\) be the indicator of the event

$$\displaystyle \begin{aligned} \big\{{\mathrm{dist}}\big({\mathbf{C}}_i(A),{\mathrm{span}}\{{\mathbf{C}}_j(A),\; j\in [n]\setminus I\}\big)\geq \sqrt{t/2} \mbox{ for all }i\in I\big\}. \end{aligned}$$

Then, in view of the above, everywhere on the event \({\mathcal E}_0\) we have

$$\displaystyle \begin{aligned} \sum_{I\subset[n],\;|I|=k} {\mathbf{1}}_{I}\geq {\lceil n/2\rceil \choose k}\geq \bigg(\frac{n}{2k}\bigg)^k \geq (2e)^{-k} {n\choose k}. \end{aligned}$$

Hence, by Markov’s inequality and permutation invariance of the matrix distribution,

$$\displaystyle \begin{aligned} {\mathbb P}({\mathcal E}_0) \leq (2e)^{k} \, {\mathbb E}\, {\mathbf{1}}_{[k]}. \end{aligned}$$
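
In more detail: the distribution of A is invariant under permutations of its columns, so \({\mathbb E}\, {\mathbf{1}}_I={\mathbb E}\, {\mathbf{1}}_{[k]}\) for every I ⊂ [n] of cardinality k, and Markov's inequality applied to the sum in the previous display gives

$$\displaystyle \begin{aligned} {\mathbb P}({\mathcal E}_0)\leq {\mathbb P}\Big\{\sum_{I\subset[n],\;|I|=k} {\mathbf{1}}_{I}\geq (2e)^{-k}{n\choose k}\Big\} \leq \frac{(2e)^{k}}{{n\choose k}}\sum_{I\subset[n],\;|I|=k} {\mathbb E}\, {\mathbf{1}}_{I} = (2e)^{k}\, {\mathbb E}\, {\mathbf{1}}_{[k]}. \end{aligned}$$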

As the last step of the proof, we estimate the expectation of \({\mathbf{1}}_{[k]}\) (with a suitable choice of k). In view of independence and equidistribution of the matrix columns, we have

$$\displaystyle \begin{aligned} {\mathbb E}\, {\mathbf{1}}_{[k]}=\left( {\mathbb P}\big\{ {\mathrm{dist}}\big({\mathbf{C}}_1(A),{\mathrm{span}}\{{\mathbf{C}}_j(A),\; j\in [n]\setminus [k]\}\big)\geq \sqrt{t/2} \big\}\right)^k. \end{aligned}$$

Choose k := ⌊t∕4⌋≤ n∕2 and denote

$$\displaystyle \begin{aligned}D:={\mathrm{dist}}\big({\mathbf{C}}_1(A),{\mathrm{span}}\{{\mathbf{C}}_j(A),\; j\in [n]\setminus [k]\}\big).\end{aligned}$$

Using independence of the columns of the matrix A and applying Theorem 5.2.4 with ℓ = k and \(F = {\mathrm{span}}\{{\mathbf{C}}_j(A),\; j\in [n]\setminus [k]\}\), we obtain

$$\displaystyle \begin{aligned} {\mathbb P} \Big\{ D \geq \sqrt{t/2} \Big\} \leq {\mathbb P}\Big\{ D - \sqrt{k} \geq (\sqrt{2}-1)\, \sqrt{t/4} \Big\} \leq 2\exp(-\bar c\, t) \end{aligned}$$

for some \(\bar c>0\) depending only on K. Hence,

$$\displaystyle \begin{aligned} {\mathbb P}({\mathcal E}_0) \leq (2e)^{k} \,2^k\,\exp(-\bar c\, t\,k)\leq \exp(-\bar c t^2/16), \end{aligned}$$

provided that t is larger than a certain constant depending only on K. This implies the desired result for t ≤ n.

In the case t > n we essentially repeat the argument along the same lines. Define

$$\displaystyle \begin{aligned} {\mathcal E}_0^{\prime}:=\bigg\{\sum_{i=1}^n s_i(A^{-1})^2\leq n^2/t^2\bigg\}. \end{aligned}$$

Observe that on the event \({\mathcal E}_0^{\prime }\),

$$\displaystyle \begin{aligned} \big|\big\{i\leq n:\;{\mathrm{dist}}\big({\mathbf{C}}_i(A),{\mathrm{span}}\{{\mathbf{C}}_j(A),\; j\neq i\} \big)\geq t/\sqrt{2n}\big\}\big|\geq n/2. \end{aligned}$$

Repeating the above computations with the same notation and with k = ⌊n∕4⌋ (note that, since t > n, we have \(\sqrt {k}\leq \sqrt {n}/2\leq t/(2\sqrt {n})\), whence \(t/\sqrt {2n}-\sqrt {k}\geq t/(5\sqrt {n})\)), we obtain

$$\displaystyle \begin{aligned} {\mathbb P} \Big\{ D \geq t/\sqrt{2n} \Big\} \leq {\mathbb P}\Big\{ D - \sqrt{k} \geq t/(5 \sqrt{n}) \Big\} \leq 2\exp(-\bar c\, t^2/n), \end{aligned}$$

which leads to

$$\displaystyle \begin{aligned} {\mathbb P}({\mathcal E}_0^{\prime}) \leq (2e)^{k} \,2^k\,\exp(-\bar c\, k t^2/n )\leq \exp(-\bar c t^2/16), \end{aligned}$$

provided that t > Cn for large enough C depending only on K. For n < t ≤ Cn the result follows by adjusting the constants. □

5.4 Small Ball Estimates for Singular Values

The goal of this section is to prove Theorem 5.2.1. As we have noted, the argument essentially reproduces that of [9]. An important part of the proof is the use of restricted invertibility (see also [3] and [11] for some recent applications of restricted invertibility in the context of random matrices).

We will use a construction from [9]. Given an integer k and an n × n matrix A define a k × n matrix Z = Z(A, k) in the following way. Consider the singular value decomposition \(A=\sum _{i=1}^n s_i \left \langle v_i, \cdot \right \rangle w_i\), where \(s_i = s_i(A)\) are the singular values of A (arranged in non-increasing order) and \(\{v_i\}_i\), \(\{w_i\}_i\) are two orthonormal systems in \({\mathbb {R}^n}\). For i ≤ k denote \(z_i = v_{n-i+1}\). Let Z be the matrix whose rows are \({\mathbf{R}}_i(Z) = z_i\). Clearly, the rows of Z are orthonormal and for every i ≤ k,

$$\displaystyle \begin{aligned} \|Az_i\|{}_2 = s_{n-i+1}\leq s_{n-k+1}. \end{aligned} $$
(5.1)

Moreover,

$$\displaystyle \begin{aligned} \|Z\|=1\quad \quad \mbox{ and } \quad \quad \|Z\|{}_{HS}=\sqrt{k}. \end{aligned}$$
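
Indeed, since the rows of Z are orthonormal, \(ZZ^\top = I_k\), so that \(\|Z\|=1\) and \(\|Z\|{}_{HS}^2={\mathrm{tr}}(ZZ^\top )=k\).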

The matrix Z is not uniquely defined when some of the k smallest singular values of A have non-trivial multiplicity; we will however assume that for each realization of A, a single admissible Z is chosen in such a way that Z is a (measurable) random matrix.

5.4.1 Proof of Theorem 5.2.1, the Case \(k\geq \ln n\)

Let C, c, \(c^{\prime }\) be the constants from Theorem 5.2.3. Let \(\gamma =\sqrt {c^{\prime }}\). Note that C, c, \(c^{\prime }\), γ depend only on K. Let Z = Z(A, k) be the k × n matrix constructed above. Applying Theorem 5.2.2 to Z (one can add zero rows to make it an n × n matrix), we find a set J ⊂ [n] such that

$$\displaystyle \begin{aligned} |J| = \ell:=\lfloor \gamma^2 k \rfloor\leq c^{\prime} k \quad \quad \mbox{ and } \quad \quad s_\ell (Z_J)\geq (1-\gamma ) \sqrt{k/n}. \end{aligned} $$

Fix a (small enough, depending on K) constant \(c_0 > 0\). Define the event

$$\displaystyle \begin{aligned} {\mathcal E}_k := \big\{s_{n-k+1}(A)\leq c_0 k/\sqrt{n}\big\}. \end{aligned} $$

Consider the n × k matrix \(B = AZ^\top \). Using property (5.1), on the event \({\mathcal E}_k\), we have for every i ≤ k,

$$\displaystyle \begin{aligned} \|{\mathbf{C}} _i (B) \|{}_2= \|A z_i\|{}_2\leq c_0 k/\sqrt{n}, \end{aligned} $$

hence \(\|B\|{}_{HS}\leq c_0 k^{3/2}/\sqrt {n}\). Now, since \(s_\ell (Z_J) > 0\), there exists a k × ℓ matrix M such that \(Z^\top _J M=I_\ell \). Then

$$\displaystyle \begin{aligned} \|M\| =1/s_\ell (Z_J)\leq (1-\gamma )^{-1} \sqrt{n/k}. \end{aligned} $$

Therefore,

$$\displaystyle \begin{aligned} \|B M\|{}_{HS}\leq \|B\|{}_{HS}\, \|M\|\leq c_0 (1-\gamma )^{-1} k. \end{aligned}$$

Writing \(B= A_J (Z_J)^\top + A_{J^c} (Z_{J^c})^\top \), we also have \(BM= A_J + A_{J^c} (Z_{J^c})^\top M\). Next denote

$$\displaystyle \begin{aligned} F=F(A, J):={\mathrm{span}}\{ {\mathbf{C}}_i(A_{J^c})\}_{i\in J^c}, \end{aligned}$$

and let P be the orthogonal projection onto \(F^\perp \). Then, on the event \({\mathcal E} _k\),

$$\displaystyle \begin{aligned} c_0^2 (1-\gamma )^{-2} k^2 &\geq \|P B M\|{}_{HS}^2 \geq \|P A_J\|{}_{HS}^2 = \sum _{i\in J} \|P \, {\mathbf{C}}_i(A_J)\|{}^2_2 \\ & = \sum _{i\in J} {\mathrm{dist}} ^2 ({\mathbf{C}}_i(A), F) . \end{aligned} $$

Therefore, for at least ℓ∕2 indices i ∈ J, one has

$$\displaystyle \begin{aligned} {\mathrm{dist}} ({\mathbf{C}}_i(A), F) \leq \sqrt{2} c_0 (1-\gamma )^{-1} k/\sqrt{\ell} \leq 2 c_0 \sqrt{\ell}/((1-\gamma )\gamma^2) . \end{aligned}$$
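
Indeed, if fewer than ℓ∕2 indices i ∈ J satisfied the first inequality above, then the sum in the previous display would exceed \((\ell /2)\cdot 2 c_0^2 (1-\gamma )^{-2} k^2/\ell = c_0^2 (1-\gamma )^{-2} k^2\), which is impossible.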

Note that the subspace F is spanned by n − ℓ random vectors, that it is independent of the columns \({\mathbf{C}}_i(A)\), i ∈ J, and that the columns of A are independent.

Choosing small enough c 0 and using \(k\geq \ln n\), we obtain \( {\mathbb P}({\mathcal E} _k) \leq \exp (-c_3 \ell ^2) , \) where c 3 > 0 depends only on K. By adjusting constants this proves the desired result for \(k\geq \ln n\). □

5.4.2 Proof of Theorem 5.2.1, the Case \(k\leq \ln n\)

Let A be as in Theorem 5.2.1. It is well known (see e.g. Fact 2.4 in [6]) that there is an absolute constant \(C_1 > 0\) such that

$$\displaystyle \begin{aligned} {\mathbb P}\big\{\|A\|\leq C_1 K\sqrt{n} \big\}\geq 1- e^{-n}. \end{aligned} $$
(5.2)

Let \({\mathcal E}_{bd}\) denote the event in (5.2). Further, from [16, Theorem 1.5] one infers that for any γ > 0 there are \(\gamma_1, \gamma_2, \gamma_3 > 0\) depending only on γ and K such that, denoting

$$\displaystyle \begin{aligned} {\mathcal E}_{inc}(\gamma) := \big\{&\forall x \in S^{n-1} \, \mbox{ with } \, \|Ax \|{}_2\leq \gamma_1 \sqrt{n}, \, \, \forall I\subset [n]\\ &\mbox{ with } \, |I|\geq \gamma n \, \mbox{ one has } \, \|P_I x\|{}_2\geq \gamma_2 \big\}, \end{aligned} $$

the event \({\mathcal E}_{inc}(\gamma )\) satisfies

$$\displaystyle \begin{aligned} {\mathbb P}({\mathcal E}_{inc}(\gamma))\geq 1-2 e^{-\gamma_3 n}. \end{aligned} $$
(5.3)

The following statement was proved by Nguyen ([9, Corollary 3.8]).

Proposition 5.4.1

For any K > 0 there are C, \(c_1\), \(c_2\), γ > 0 depending only on K with the following property. Let A be an n × n random matrix with i.i.d. entries of zero mean, unit variance, and sub-Gaussian moment bounded above by K. Let \(2\leq k\leq n/(C\ln n)\), and let the random k × n matrix Z = Z(A, k) be defined as above. Then everywhere on the event \(\big \{s_{n-k+1}(A)\leq c_1 k/\sqrt {n}\big \}\cap {\mathcal E}_{inc}(\gamma )\cap {\mathcal E}_{bd}\) one has

$$\displaystyle \begin{aligned} \big|\big\{J\subset[n]:\;|J|=\lfloor k/2\rfloor,\;s_{\lfloor k/2\rfloor}(Z_J)\geq c_1\sqrt{k/n}\big\}\big| \geq c_2^{k\ln k}\, n^{\lfloor k/2\rfloor}. \end{aligned}$$

Now assume that \(k\leq \ln n\). Without loss of generality we may also assume that k is bounded below by a large constant. Let C, c, \(c^{\prime }\) be the constants from Theorem 5.2.3 and \(c_1\), \(c_2\), γ the constants from Proposition 5.4.1. Fix for a moment any realization of A from the event \(\big \{s_{n-k+1}(A)\leq c_0 k/\sqrt {n}\big \}\cap {\mathcal E}_{inc}(\gamma )\cap {\mathcal E}_{bd}\), where \(c_0 \in (0, c_1]\) will be chosen later. Let ℓ := ⌊k∕2⌋ and

$$\displaystyle \begin{aligned} \mathcal{J} :=\big\{J\subset[n]:\;|J|=\lfloor k/2\rfloor,\;s_{\lfloor k/2\rfloor}(Z_J)\geq c_1\sqrt{k/n}\big\}. \end{aligned}$$

Fix \(J\in \mathcal {J}\) and repeat the procedure used in Sect. 5.4.1 with J and ℓ. We obtain that for at least ℓ∕2 indices i ∈ J, one has

$$\displaystyle \begin{aligned} {\mathrm{dist}} ({\mathbf{C}}_i(A), F) \leq \sqrt{2} c_0 k/(c_1 \sqrt{\ell}) \leq 4 c_0\sqrt{\ell}/c_1, \end{aligned} $$
(5.4)

where \(F={\mathrm {span}}\{ {\mathbf {C}}_i(A_{J^c})\}_{i\in J^c}\). For any fixed subset J ⊂ [n] of cardinality ℓ consider the event

$$\displaystyle \begin{aligned} {\mathcal E}_J:=\big\{\mbox{for at least }\ell/2\mbox{ indices }i\in J\mbox{ inequality (5.4) holds}\big\}. \end{aligned}$$

Applying Theorem 5.2.3 and the union bound we observe

$$\displaystyle \begin{aligned}{\mathbb P}({\mathcal E} _J)&\leq 2^\ell \, \left((4 c_0 C /c_1)^{\ell} + \exp(-cn) \right)^{\ell/2}\\ &\leq \left(8\, \max\left\{\left(4 c_0 C/c_1 \right)^\ell,\, \exp(-cn)\right\}\right)^{\ell/2}. \end{aligned} $$

Choosing \(c_0\) small enough we obtain that \({\mathbb P}({\mathcal E} _J)\leq \exp (-c_4 k^2)\), where \(c_4 > 0\) depends only on K. Combining this with Claim 5.2.5 and Proposition 5.4.1 we obtain

$$\displaystyle \begin{aligned} {\mathbb P}\big(\big\{s_{n-k+1}(A)\leq c_0 k/\sqrt{n}\big\}\cap {\mathcal E}_{inc}(\gamma)\cap {\mathcal E}_{bd}\big)\leq c_2^{- k\ln k } \exp(-c_4 k^2) \leq \exp(-c_5 k^2) \end{aligned}$$

provided that k ≥ \(C_2\), where \(C_2 \geq 1 \geq c_5 > 0\) are constants depending on K only. By Eqs. (5.2) and (5.3) this completes the proof in the case \(k\leq \ln n\). □