Abstract
We prove estimates for the expected value of operator norms of Gaussian random matrices with independent (but not necessarily identically distributed) centered entries, acting as operators from \(\ell_{p^{*}}^{n}\) to \(\ell_q^m\), \(1 \le p^* \le 2 \le q < \infty\).
1 Introduction and Main Results
Random matrices and their spectra have been studied intensively in Statistics since the work of Wishart [28] on sample covariance matrices, in Numerical Analysis since their introduction by von Neumann and Goldstine [25] in the 1940s, and in Physics since the 1950s as a consequence of Wigner's work [26, 27]. His Semicircle Law, a fundamental theorem in the spectral theory of large random matrices describing the limit of the empirical spectral measure for what are nowadays known as Wigner matrices, is among the most celebrated results of the theory.
In Banach Space Theory and Asymptotic Geometric Analysis, random matrices appeared as early as the 1970s (see e.g. [2, 3, 9]). In [2], the authors obtained asymptotic bounds for the expected value of the operator norm of a random matrix \(B = (b_{ij})_{i,j=1}^{m,n}\) with independent mean-zero entries satisfying \(|b_{ij}| \le 1\), acting from \(\ell_2^n\) to \(\ell_q^m\), \(2 \le q < \infty\). To be more precise, they proved that
$$ \mathbb{E}\,\big\|B:\ell_2^n\to\ell_q^m\big\| \le C_q \max\big(\sqrt{n},\, m^{1/q}\big), $$
where \(C_q\) depends only on q. This was then successfully used to characterize (p, q)-absolutely summing operators on Hilbert spaces. Since then, random matrices have been studied extensively, and Banach space methods have produced numerous new and deep results. In particular, many applications use the spectral properties of a Gaussian matrix whose entries are independent identically distributed (i.i.d.) standard Gaussian random variables. Seginer proved in [22] that for an m × n random matrix with i.i.d. symmetric entries, the expectation of its spectral norm (that is, the operator norm from \(\ell_2^n\) to \(\ell_2^m\)) is of the order of the expectation of the largest Euclidean norm of its rows and columns. He also obtained an optimal result in the case of random matrices with entries \(\varepsilon_{ij}a_{ij}\), where \(\varepsilon_{ij}\) are independent Rademacher random variables and \(a_{ij}\) are fixed numbers. We refer the interested reader to the surveys [6, 7] and references therein.
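In symbols, Seginer's result described above reads as follows (our transcription of the statement, with \(X = (x_{ij})_{i,j=1}^{m,n}\) denoting such a matrix):

```latex
\mathbb{E}\,\big\|X:\ell_2^n\to\ell_2^m\big\|
\;\approx\;
\mathbb{E}\max_{i\le m}\Big(\sum_{j=1}^n x_{ij}^2\Big)^{1/2}
\;+\;
\mathbb{E}\max_{j\le n}\Big(\sum_{i=1}^m x_{ij}^2\Big)^{1/2}.
```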
It is natural to ask similar questions about general random matrices, in particular about Gaussian matrices whose entries are still independent centered Gaussian random variables, but with different variances. In this structured case, where the assumption of identical distributions is dropped, very little is known. It is conjectured that the expected spectral norm of such a Gaussian matrix behaves as in Seginer's result, that is, is of the order of the expectation of the largest Euclidean norm of its rows and columns. A big step toward the solution was made by Latała in [15], who proved a bound involving fourth moments; it is of the right order \(\max(\sqrt{m},\sqrt{n})\) in the i.i.d. setting, but does not capture the right behavior in the case of, for instance, diagonal matrices. On the one hand, as mentioned in [15], the presence of fourth moments is not surprising in view of the classical Bai-Yin theorem; on the other hand, they are not needed if the conjecture is true.
Later, in [20], Riemer and Schütt proved the conjecture up to a \(\log n\) factor. The two results are incomparable: depending on the choice of variances, one or the other gives a better bound. The Riemer-Schütt estimate was used recently in [21].
We would also like to mention that the non-commutative Khintchine inequality can be used to show that the expected spectral norm is bounded from above by the largest Euclidean norm of the rows and columns times a factor \(\sqrt{\log n}\) (see e.g. (4.9) in [23]).
Another big step toward the solution was made a short while ago by Bandeira and Van Handel [1]. In particular, they proved that
$$ \mathbb{E}\,\big\|(a_{ij}g_{ij})_{i,j=1}^{m,n}\big\| \le C\Big(\left\vert\left\vert\left\vert A\right\vert\right\vert\right\vert + \sqrt{\log\min(m,n)}\,\max_{i,j}\vert a_{ij}\vert\Big), \qquad (1) $$
where \(\left \vert \left \vert \left \vert A\right \vert \right \vert \right \vert\) denotes the largest Euclidean norm of the rows and columns of \((a_{ij})\), C > 0 is a universal constant, and \(g_{ij}\) are independent standard Gaussian random variables (see [1, Theorem 3.1]). Under mild structural assumptions, the bound (1) is already optimal. Further progress was made by Van Handel [24], who verified the conjecture up to a \(\sqrt{\log\log n}\) factor. In fact, more was proved in [24]: he computed precisely the expectation of the largest Euclidean norm of the rows and columns using Gaussian concentration. Moreover, while the moment method is at the heart of the proofs in [22] and [1], he proposed a very nice approach based on the comparison of Gaussian processes to improve the result of Latała. His approach can also be used in our setting; we comment on this in Sect. 4.
The purpose of this work is to provide bounds for operator norms of such structured Gaussian random matrices, considered as operators from \(\ell_{p^{*}}^{n}\) to \(\ell_q^m\).
In what follows, by \(g_i\), \(g_{ij}\), i ≥ 1, j ≥ 1, we always denote independent standard Gaussian random variables. Let \(n,m \in \mathbb{N}\) and \(A = (a_{ij})_{i,j=1}^{m,n} \in \mathbb{R}^{m\times n}\). We write \(G = G_A = (a_{ij}g_{ij})_{i,j=1}^{m,n}\). For r ≥ 1, we denote by \(\gamma _{r} \approx \sqrt{r}\) the \(L_r\)-norm of a standard Gaussian random variable. The notation \(f \approx h\) means that there are two absolute positive constants c and C (that is, independent of any parameters) such that \(cf \le h \le Cf\), and \(f \approx_{p,q} h\) means that there are two positive constants c(p, q) and C(p, q), depending only on the parameters p and q, such that \(c(p,q)f \le h \le C(p,q)f\).
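For concreteness, the constant \(\gamma_r\) admits the following closed form (a standard computation, recorded here as an aside and not taken from the text):

```latex
\gamma_r
\;=\; \big(\mathbb{E}\,|g|^r\big)^{1/r}
\;=\; \sqrt{2}\,\left(\frac{\Gamma\big(\tfrac{r+1}{2}\big)}{\Gamma\big(\tfrac{1}{2}\big)}\right)^{1/r}
\;\approx\; \sqrt{r},
```

where the last equivalence, with absolute constants, follows from Stirling's formula.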
Our main result is the following theorem.
Theorem 1.1
For every \(1 < p^* \le 2 \le q < \infty\) one has
where C is a positive absolute constant.
We conjecture the following bound.
Conjecture 1.2
For every \(1 \le p^* \le 2 \le q \le \infty\) one has
$$ \mathbb{E}\,\big\|G:\ell_{p^*}^n\to\ell_q^m\big\| \;\approx_{p,q}\; \mathbb{E}\max_{j\le n}\big\|(a_{ij}g_{ij})_{i\le m}\big\|_q \;+\; \mathbb{E}\max_{i\le m}\big\|(a_{ij}g_{ij})_{j\le n}\big\|_p . $$
Here, as usual, p is defined via the relation \(1/p + 1/p^* = 1\). This conjecture extends the corresponding conjecture for the case p = q = 2 and m = n. In this case, Bandeira and Van Handel proved in [1] an estimate with \(\sqrt{ \log \min (m,n)}\max \vert a_{ij}\vert\) instead of \(\mathbb{E}\max \vert a_{ij}g_{ij}\vert\) (see Eq. (1)), while in [24] the corresponding bound is proved with \(\sqrt{\log \log n}\) in front of the right-hand side.
Remark 1.3
The lower bound in the conjecture is almost immediate and follows from standard estimates. Thus the upper bound is the only difficulty.
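A sketch of those standard estimates (our reconstruction): since \(\|e_j\|_{p^*} = 1\) for every j ≤ n, and since \(G\) and its adjoint \(G^{T}:\ell_{q^*}^m\to\ell_p^n\) have equal norms,

```latex
\mathbb{E}\,\|G\| \;\ge\; \mathbb{E}\max_{j\le n}\|Ge_j\|_q
\;=\; \mathbb{E}\max_{j\le n}\big\|(a_{ij}g_{ij})_{i\le m}\big\|_q,
\qquad
\mathbb{E}\,\|G\| \;=\; \mathbb{E}\,\|G^{T}\| \;\ge\; \mathbb{E}\max_{i\le m}\big\|(a_{ij}g_{ij})_{j\le n}\big\|_p .
```

In particular, \(\mathbb{E}\,\|G\|\) dominates, up to a factor 2, the sum of the two maxima.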
Remark 1.4
In the case \(p^* = 1\) and q ≥ 2, a direct computation along the lines of Lemma 3.2 below shows that
Remark 1.5
Note that if \(1 \le p^* \le 2 \le q \le \infty\), in the case of matrices of tensor structure, that is, \((a_{ij})_{i,j=1}^{n} = x \otimes y = (x_j y_i)_{i,j=1}^{n}\) with \(x,y \in \mathbb{R}^{n}\), Chevet's theorem [3, 4] and a direct computation show that
If the matrix is diagonal, that is, \((a_{ij})_{i,j=1}^{n} =\mathop{ \mathrm{diag}}\nolimits (a_{11},\ldots,a_{nn})\), then we immediately obtain
where \((a_{ii}^{*})_{i\le n}\) is the decreasing rearrangement of \((\vert a_{ii}\vert)_{i\le n}\) and \(M_g\) is the Orlicz function given by
(see Lemma 2.2 below and [11, Lemma 5.2] for the Orlicz norm expression).
Slightly different estimates, but of the same flavour, can also be obtained in the case \(1 \le q \le 2 \le p^* \le \infty\).
2 Notation and Preliminaries
By \(c, C, C_1,\ldots\) we always denote positive absolute constants, whose values may change from line to line, and we write \(c_p, C_p,\ldots\) if the constants depend on some parameter p.
Given \(p \in [1,\infty]\), \(p^*\) denotes its conjugate, given by the relation \(1/p + 1/p^* = 1\). For \(x = (x_{i})_{i\leq n} \in \mathbb{R}^{n}\), \(\|x\|_p\) denotes its \(\ell_p\)-norm, that is, \(\|x\|_\infty = \max_{i\le n}\vert x_i\vert\) and, for p < ∞,
The corresponding space \((\mathbb{R}^{n},\|\cdot \|_{p})\) is denoted by \(\ell_p^n\), its unit ball by \(B_p^n\).
If E is a normed space, then \(E^*\) denotes its dual space and \(B_E\) its closed unit ball. The modulus of convexity of E is defined for any \(\varepsilon \in (0,2)\) by
We say that E has modulus of convexity of power type 2 if there exists a positive constant c such that \(\delta_E(\varepsilon) \ge c\varepsilon^2\) for all \(\varepsilon \in (0,2)\). It is well known (see e.g. [8] or [18, Proposition 2.4]) that this property is equivalent to the fact that
holds for all x, y ∈ E, where λ > 0 is a constant depending only on c. In that case, we say that E has modulus of convexity of power type 2 with constant λ. We clearly have \(\delta_E(\varepsilon) \ge \varepsilon^2/(2\lambda^2)\).
Recall that a Banach space E is of Rademacher type r for some 1 ≤ r ≤ 2 if there is C > 0 such that for all \(n \in \mathbb{N}\) and for all \(x_1,\ldots,x_n \in E\),
$$ \Big(\mathbb{E}\Big\|\sum_{i=1}^{n}\varepsilon_i x_i\Big\|^{2}\Big)^{1/2} \le C\Big(\sum_{i=1}^{n}\|x_i\|^{r}\Big)^{1/r}, $$
where \((\varepsilon_i)_{i=1}^{\infty}\) is a sequence of independent random variables defined on some probability space \((\Omega, \mathbb{P})\) such that \(\mathbb{P}(\varepsilon _{i} = 1) = \mathbb{P}(\varepsilon _{i} = -1) = \frac{1} {2}\) for every \(i \in \mathbb{N}\). The smallest such C is called the type-r constant of E, denoted by \(T_r(E)\). This concept was introduced into Banach space theory by Hoffmann-Jørgensen [14] in the early 1970s, and the basic theory was developed by Maurey and Pisier [17].
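The fact that \(T_2(\ell_p^n)\) is of order \(\sqrt{p}\) for p ≥ 2, used in Sect. 3, follows from a standard Khintchine-inequality computation, which we sketch here (this derivation is ours, not part of the original text). For \(x_1,\ldots,x_n \in \ell_p\),

```latex
\mathbb{E}\Big\|\sum_{i=1}^{n}\varepsilon_i x_i\Big\|_p^p
= \sum_{k}\mathbb{E}\Big|\sum_{i=1}^{n}\varepsilon_i x_i(k)\Big|^p
\le (C\sqrt{p})^p \sum_{k}\Big(\sum_{i=1}^{n} x_i(k)^2\Big)^{p/2}
= (C\sqrt{p})^p \,\Big\|\Big(\sum_{i=1}^{n} x_i^2\Big)^{1/2}\Big\|_p^p ,
```

where we used Khintchine's inequality with constant \(C\sqrt{p}\) coordinatewise; the triangle inequality in \(L_{p/2}\) then gives \(\|(\sum_i x_i^2)^{1/2}\|_p^2 = \|\sum_i x_i^2\|_{p/2} \le \sum_i \|x_i\|_p^2\), so \(T_2(\ell_p^n) \le C\sqrt{p}\).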
We will need the following theorem.
Theorem 2.1
Let E be a Banach space with modulus of convexity of power type 2 with constant λ. Let \(X_1,\ldots,X_m \in E^{*}\) be independent random vectors, let q ≥ 2, and define
and
Then
The proof follows the argument "proof of condition (H)" of [13], in combination with the improvement on covering numbers established in [12, Lemma 2]. Indeed, in [12] the argument is only carried out in the simpler case q = 2, but it extends verbatim to the case q ≥ 2.
We also recall known facts about Gaussian random variables. The next lemma is well-known (see e.g. Lemmas 2.3, 2.4 in [24]).
Lemma 2.2
Let \(a = (a_{i})_{i\leq n} \in \mathbb{R}^{n}\) and \((a_i^*)_{i\le n}\) be the decreasing rearrangement of \((|a_i|)_{i\le n}\). Then
$$ \mathbb{E}\max_{i\le n}\,\vert a_i g_i\vert \;\approx\; \max_{i\le n}\, a_i^{*}\sqrt{\log(i+1)} . $$
Note that, in general, the maximum of i.i.d. random variables weighted by the coordinates of a vector a is equivalent to a certain Orlicz norm \(\|a\|_M\), where the function M depends only on the distribution of the random variables (see [10, Corollary 2] and Lemma 5.2 in [11]).
The following theorem is the classical Gaussian concentration inequality (see e.g. [5] or inequality (2.35) and Proposition 2.18 in [16]).
Theorem 2.3
Let \(n \in \mathbb{N}\) and \((Y,\left \Vert \cdot \right \Vert _{Y })\) be a Banach space. Let \(y_1,\ldots,y_n \in Y\) and \(X = \sum_{i=1}^{n} g_i y_i\). Then, for every t > 0,
where \(\sigma _{Y }(X) =\sup _{\|\xi \|_{Y^{{\ast}}}=1}\left (\sum _{i=1}^{n}\left \vert \xi (y_{i})\right \vert ^{2}\right )^{1/2}\) .
Remark 2.4
Let p ≥ 2. Let \(a = (a_{j})_{j\leq n} \in \mathbb{R}^{n}\) and \(X = (a_jg_j)_{j\le n}\). Then we clearly have \(\sigma_{\ell_p^n}(X) = \|a\|_\infty\). Thus, Theorem 2.3 implies for \(X = (a_jg_j)_{j\le n}\) and every t > 0,
$$ \mathbb{P}\Big(\big\vert\, \|X\|_p - \mathbb{E}\|X\|_p \,\big\vert \ge t\Big) \le 2\exp\big(-t^2/(2\|a\|_\infty^2)\big). \qquad (3) $$
Note also that
$$ \mathbb{E}\,\|X\|_p \le \big(\mathbb{E}\,\|X\|_p^p\big)^{1/p} = \gamma_p\,\|a\|_p . \qquad (4) $$
3 Proof of the Main Result
We will apply Theorem 2.1 with \(E =\ell_{ p^{{\ast}}}^{n}\), \(1 < p^* \le 2\), and \(X_1,\ldots,X_m\) being the rows of the matrix \(G = (a_{ij}g_{ij})_{i,j=1}^{m,n}\). We start with two lemmas in which we estimate the quantity σ and the expectation appearing in that theorem.
Lemma 3.1
Let \(m,n \in \mathbb{N}\), \(1 < p^* \le 2 \le q\), and for i ≤ m let \(X_i = (a_{ij}g_{ij})_{j=1}^{n}\). Then
Proof
For every i ≤ m, \(\langle X_{i},y\rangle =\sum _{ j=1}^{n}a_{ij}y_{j}g_{ij}\) is a Gaussian random variable with standard deviation \(\|(a_{ij}y_{j})_{j=1}^{n}\|_{2}\). Hence,
Since \(p^* \le 2 \le q\), the function
is a convex function on the simplex \(S =\{ z \in \mathbb{R}^{n}\,\vert \,\sum _{j=1}^{n} z_j \leq 1,\,\forall j:\, z_{j} \geq 0\}\). Therefore, it attains its maximum on extreme points, that is, on vectors of the canonical unit basis \(e_1,\ldots,e_n\) of \(\mathbb{R}^{n}\). Thus,
which completes the proof. □
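The convexity step in the proof above can be spelled out as follows (our reconstruction of the computation, writing \(z_j = \vert y_j\vert^{p^*}\) and \(\sigma\) for the quantity estimated in the lemma, as we read it):

```latex
\sigma
\;=\; \sup_{y\in B_{p^*}^n}\Big(\sum_{i=1}^m \mathbb{E}\,\vert\langle X_i,y\rangle\vert^q\Big)^{1/q}
\;=\; \gamma_q\,\sup_{z\in S}\Big(\sum_{i=1}^m\Big(\sum_{j=1}^n a_{ij}^2\,z_j^{2/p^*}\Big)^{q/2}\Big)^{1/q}
\;\le\; \gamma_q\,\max_{j\le n}\Big(\sum_{i=1}^m \vert a_{ij}\vert^q\Big)^{1/q},
```

since \(2/p^* \ge 1\) and \(q/2 \ge 1\) make the expression inside the supremum a convex function of \(z \in S\), so that the supremum is attained at some vertex \(e_j\).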
Now we estimate the expectation in Theorem 2.1. The proof is based on the Gaussian concentration, Theorem 2.3, and is similar to Theorem 2.1 and Remark 2.2 in [24].
Lemma 3.2
Let \(m,n \in \mathbb{N}\), \(1 < p^* \le 2 \le q\), and for i ≤ m let \(X_i = (a_{ij}g_{ij})_{j=1}^{n}\). Then
where C is a positive absolute constant.
Proof
We have
For all i ≤ m and t > 0, by (3) we have
By permuting the rows of \((a_{ij})_{i,j=1}^{m,n}\), we can assume that
For each i ≤ m, choose j(i) ≤ n such that \(\vert a_{ij(i)}\vert = \max_{j\le n}\vert a_{ij}\vert\). Clearly,
and hence, by independence of the \(g_{ij}\)'s and Lemma 2.2,
where the latter inequality follows since \(\vert a_{1j(1)}\vert \ge \ldots \ge \vert a_{mj(m)}\vert\). Thus, for i ≤ m,
By (5) we observe for every t > 0,
whenever \(ct^2/b^2 \ge 4\). Integrating the tail inequality proves that
By the triangle inequality, we obtain the first desired inequality, the second one follows by (4). □
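The tail-integration step used above is standard; we sketch it here (our notation: b, c are the quantities from the tail estimate in the proof, \(Z_i\) denotes the recentered norm of the i-th row, and we assume each \(Z_i\) has the Gaussian tail \(e^{-ct^2/b^2}\) established above):

```latex
\mathbb{E}\max_{i\le m} Z_i
\;\le\; u + \int_u^{\infty}\mathbb{P}\Big(\max_{i\le m}Z_i > t\Big)\,dt
\;\le\; u + m\int_u^{\infty} e^{-c t^2/b^2}\,dt
\;\le\; u + \frac{m\,b^2}{2\,c\,u}\, e^{-c u^2/b^2},
```

using the union bound over the m rows and the Gaussian tail estimate \(\int_u^\infty e^{-\alpha t^2}\,dt \le e^{-\alpha u^2}/(2\alpha u)\). For m ≥ 2, choosing \(u = 2b\sqrt{(\log m)/c}\) makes \(e^{-cu^2/b^2} = m^{-4}\) and yields a bound of order \(b\sqrt{\log m}\).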
We are now ready to present the proof of the main theorem.
Proof of Theorem 1.1
First observe that
$$ \mathbb{E}\,\big\|G:\ell_{p^*}^n\to\ell_q^m\big\| = \mathbb{E}\sup_{y\in B_{p^*}^n}\Big(\sum_{i=1}^m \vert\langle X_i,y\rangle\vert^q\Big)^{1/q}, $$
where \(X_1,\ldots,X_m\) are the rows of G.
We have
Hence, Theorem 2.1 applied with \(E =\ell_{ p^{{\ast}}}^{n}\) implies
where B and σ are defined in that theorem. Therefore,
Now, recall that \(T_{2}(\ell_{p}^{n}) \approx \sqrt{p}\) and that \(B_{p^{{\ast}}}^{n}\) has modulus of convexity of power type 2 with \(\lambda^{-2} \approx 1/p\) (see, e.g., [19, Theorem 5.3]). Therefore,
Applying Lemma 3.1, we obtain
The desired bound follows now from Lemma 3.2. □
Remark 3.3
This proof can be extended to the case of random matrices whose rows are centered independent vectors with multivariate Gaussian distributions. We leave the details to the interested reader.
4 Concluding Remarks
In this section, we briefly outline what can be obtained using the approach of [24]. We use a standard trick to pass to a symmetric matrix. The matrix \(G_A\) being given, define S as the symmetric \((m+n)\times(m+n)\) matrix
$$ S = \begin{pmatrix} 0 & G_A^{T} \\ G_A & 0 \end{pmatrix}. $$
Then, S is a random symmetric matrix and
$$ \big\|G_A:\ell_{p^*}^n\to\ell_q^m\big\| = \tfrac{1}{2}\,\sup_{w}\,\langle Sw,w\rangle, $$
where the supremum in w is taken over all vectors of the form \((u,v)^{T}\) with \(u \in B_{p^{{\ast}}}^{n}\) and \(v \in B_{q^{{\ast}}}^{m}\). Repeating verbatim the proof of Theorem 4.1 in [24], one gets
where Y ∼ N(0, A −) and A − is a positive definite matrix whose diagonal elements are bounded by
However, the bounds obtained here and in Theorem 1.1 are incomparable. Depending on the situation one may be better than the other.
References
A. Bandeira, R. van Handel, Sharp nonasymptotic bounds on the norm of random matrices with independent entries. Ann. Probab. 44, 2479–2506 (2016)
G. Bennett, V. Goodman, C.M. Newman, Norms of random matrices. Pac. J. Math. 59 (2), 359–365 (1975)
Y. Benyamini, Y. Gordon, Random factorization of operators between Banach spaces. J. Anal. Math. 39, 45–74 (1981)
S. Chevet, Séries de variables aléatoires gaussiennes à valeurs dans \(E\hat{ \otimes }_{\varepsilon }F\). Application aux produits d’espaces de Wiener abstraits, in Séminaire sur la Géométrie des Espaces de Banach (1977–1978), pages Exp. No. 19, 15 (École Polytech., Palaiseau, 1978)
B.S. Cirel’son, I.A. Ibragimov, V.N. Sudakov, Norms of Gaussian sample functions, in Proceedings of the Third Japan-USSR Symposium on Probability Theory (Tashkent, 1975). Lecture Notes in Mathematics, vol. 550 (Springer, Berlin, 1976), pp. 20–41
K.R. Davidson, S.J. Szarek, Addenda and corrigenda to: “Local operator theory, random matrices and Banach spaces”, in Handbook of the Geometry of Banach Spaces, vol. 2 (North-Holland, Amsterdam, 2003)
K.R. Davidson, S.J. Szarek, Local operator theory, random matrices and Banach spaces, in Handbook of the Geometry of Banach Spaces, vol. 1 (North-Holland, Amsterdam, 2003)
T. Figiel, On the moduli of convexity and smoothness. Stud. Math. 56 (2), 121–155 (1976)
Y. Gordon, Some inequalities for Gaussian processes and applications. Isr. J. Math. 50, 265–289 (1985)
Y. Gordon, A.E. Litvak, C. Schütt, E. Werner, Orlicz norms of sequences of random variables. Ann. Probab. 30 (4), 1833–1853 (2002)
Y. Gordon, A.E. Litvak, C. Schütt, E. Werner, Uniform estimates for order statistics and Orlicz functions. Positivity 16 (1), 1–28 (2012)
O. Guédon, S. Mendelson, A. Pajor, N. Tomczak-Jaegermann, Majorizing measures and proportional subsets of bounded orthonormal systems. Rev. Mat. Iberoam. 24 (3), 1075–1095 (2008)
O. Guédon, M. Rudelson, Moments of random vectors via majorizing measures. Adv. Math. 208 (2), 798–823 (2007)
J. Hoffmann-Jørgensen, Sums of independent Banach space valued random variables. Stud. Math. 52, 159–186 (1974)
R. Latała, Some estimates of norms of random matrices. Proc. Am. Math. Soc. 133 (5), 1273–1282 (electronic) (2005)
M. Ledoux, The Concentration of Measure Phenomenon. Mathematical Surveys and Monographs, vol. 89 (American Mathematical Society, Providence, RI, 2001)
B. Maurey, G. Pisier, Séries de variables aléatoires vectorielles indépendantes et propriétés géométriques des espaces de Banach. Stud. Math. 58 (1), 45–90 (1976)
G. Pisier, Martingales with values in uniformly convex spaces. Isr. J. Math. 20 (3–4), 326–350 (1975)
G. Pisier, Q. Xu, Non-commutative L p-spaces, in Handbook of the Geometry of Banach Spaces, vol. 2 (North-Holland, Amsterdam, 2003), pp. 1459–1517
S. Riemer, C. Schütt, On the expectation of the norm of random matrices with non-identically distributed entries. Electron. J. Probab. 18 (29), 1–13 (2013)
M. Rudelson, O. Zeitouni, Singular values of Gaussian matrices and permanent estimators. Random Struct. Algorithm. 48, 183–212 (2016)
Y. Seginer, The expected norm of random matrices. Combin. Probab. Comput. 9, 149–166 (2000)
J.A. Tropp, User-friendly tail bounds for sums of random matrices. Found. Comput. Math. 12 (4), 389–434 (2012)
R. Van Handel, On the spectral norm of Gaussian random matrices. Trans. Am. Math. Soc. (to appear)
J. von Neumann, H.H. Goldstine, Numerical inverting of matrices of high order. Bull. Am. Math. Soc. 53 (11), 1021–1099 (1947)
E.P. Wigner, Characteristic vectors of bordered matrices with infinite dimensions. Ann. Math. 62 (3), 548–564 (1955)
E.P. Wigner, On the distribution of the roots of certain symmetric matrices. Ann. Math. 67 (2), 325–327 (1958)
J. Wishart, The generalised product moment distribution in samples from a normal multivariate population. Biometrika 20A (1/2), 32–52 (1928)
Acknowledgements
Part of this work was done while Alexander E. Litvak visited Joscha Prochno at the Johannes Kepler University in Linz (supported by FWFM 1628000). Alexander E. Litvak thanks the Bézout Research Foundation (Labex Bézout) for its support and for the invitation to the University Marne la Vallée (France).
We would also like to thank our colleague R. Adamczak for helpful comments. We are grateful to the anonymous referee for many useful comments and remarks helping us to improve the presentation as well as for showing the argument outlined in the last section.
J. Prochno was supported in parts by the Austrian Science Fund, FWFM 1628000.
© 2017 Springer International Publishing AG
Guédon, O., Hinrichs, A., Litvak, A.E., Prochno, J. (2017). On the Expectation of Operator Norms of Random Matrices. In: Klartag, B., Milman, E. (eds) Geometric Aspects of Functional Analysis. Lecture Notes in Mathematics, vol 2169. Springer, Cham. https://doi.org/10.1007/978-3-319-45282-1_10