1 Introduction and Main Results

Random matrices and their spectra have been under intensive study in Statistics since the work of Wishart [28] on sample covariance matrices, in Numerical Analysis since their introduction by von Neumann and Goldstine [25] in the 1940s, and in Physics since Wigner’s work [26, 27] in the 1950s. His Semicircle Law, a fundamental theorem in the spectral theory of large random matrices describing the limit of the empirical spectral measure for what are nowadays known as Wigner matrices, is among the most celebrated results of the theory.

In Banach Space Theory and Asymptotic Geometric Analysis, random matrices appeared already in the 70s (see e.g. [2, 3, 9]). In [2], the authors obtained asymptotic bounds for the expected value of the operator norm of a random matrix \(B = (b_{ij})_{i,j=1}^{m,n}\) with independent mean-zero entries satisfying \(\vert b_{ij}\vert \leq 1\), acting from \(\ell_{2}^{n}\) to \(\ell_{q}^{m}\), 2 ≤ q < ∞. To be more precise, they proved that

$$\displaystyle{\mathbb{E}\,\big\|B:\,\ell_{ 2}^{n} \rightarrow \ell_{ q}^{m}\big\| \leq C_{ q} \cdot \max \big (m^{1/q},\sqrt{n}\big),}$$

where \(C_{q}\) depends only on q. This was then successfully used to characterize (p, q)-absolutely summing operators on Hilbert spaces. Ever since, random matrices have been studied extensively, and Banach space methods have produced numerous deep and new results. In particular, many applications use the spectral properties of a Gaussian matrix, whose entries are independent identically distributed (i.i.d.) standard Gaussian random variables. Seginer proved in [22] that for an m × n random matrix with i.i.d. symmetric entries the expectation of its spectral norm (that is, the operator norm from \(\ell_{2}^{n}\) to \(\ell_{2}^{m}\)) is of the order of the expectation of the largest Euclidean norm of its rows and columns. He also obtained an optimal result in the case of random matrices with entries \(\varepsilon_{ij}a_{ij}\), where \(\varepsilon_{ij}\) are independent Rademacher random variables and \(a_{ij}\) are fixed numbers. We refer the interested reader to the surveys [6, 7] and references therein.

It is natural to ask similar questions about more general random matrices, in particular about Gaussian matrices whose entries are still independent centered Gaussian random variables, but with different variances. In this structured case, where we drop the assumption of identical distributions, very little is known. It is conjectured that the expected spectral norm of such a Gaussian matrix behaves as in Seginer’s result, that is, it is of the order of the expectation of the largest Euclidean norm of its rows and columns. A big step toward the solution was made by Latała in [15], who proved a bound involving fourth moments, which is of the right order \(\max (\sqrt{m},\sqrt{n})\) in the i.i.d. setting, but does not capture the right behavior in the case of, for instance, diagonal matrices. On the one hand, as mentioned in [15], in view of the classical Bai-Yin theorem the presence of fourth moments is not surprising; on the other hand, they are not needed if the conjecture is true.

Later, Riemer and Schütt [20] proved the conjecture up to a \(\log n\) factor. The two results are incomparable: depending on the choice of variances, one or the other gives a better bound. The Riemer-Schütt estimate was used recently in [21].

We would also like to mention that the non-commutative Khintchine inequality can be used to show that the expected spectral norm is bounded from above by the largest Euclidean norm of the rows and columns of the matrix times a factor \(\sqrt{\log n}\) (see e.g. (4.9) in [23]).

Another big step toward the solution was made a short while ago by Bandeira and Van Handel [1]. In particular, they proved that

$$\displaystyle\begin{array}{rcl} \mathbb{E}\,\big\|(a_{ij}g_{ij}):\ell_{ 2}^{n} \rightarrow \ell_{ 2}^{m}\big\| \leq C\Big(\left \vert \left \vert \left \vert A\right \vert \right \vert \right \vert + \sqrt{\log \min (n, m)} \cdot \max _{ ij}\vert a_{ij}\vert \Big),& &{}\end{array}$$
(1)

where \(\left \vert \left \vert \left \vert A\right \vert \right \vert \right \vert\) denotes the largest Euclidean norm of the rows and columns of \((a_{ij})\), C > 0 is a universal constant, and \(g_{ij}\) are independent standard Gaussian random variables (see [1, Theorem 3.1]). Under mild structural assumptions, the bound (1) is already optimal. Further progress was made by Van Handel [24], who verified the conjecture up to a \(\sqrt{\log \log n}\) factor. In fact, more was proved in [24]: he computed precisely the expectation of the largest Euclidean norm of the rows and columns using Gaussian concentration. And, while the moment method is at the heart of the proofs in [22] and [1], he proposed a very nice approach based on the comparison of Gaussian processes to improve the result of Latała. His approach can also be used in our setting. We comment on this in Sect. 4.

The purpose of this work is to provide bounds for operator norms of such structured Gaussian random matrices considered as operators from \(\ell_{p^{{\ast}}}^{n}\) to \(\ell_{q}^{m}\).

In what follows, by \(g_{i}\), \(g_{ij}\), i ≥ 1, j ≥ 1 we always denote independent standard Gaussian random variables. Let \(n,m \in \mathbb{N}\) and \(A = (a_{ij})_{i,j=1}^{m,n} \in \mathbb{R}^{m\times n}\). We write \(G = G_{A} = (a_{ij}g_{ij})_{i,j=1}^{m,n}\). For r ≥ 1, we denote by \(\gamma _{r} \approx \sqrt{r}\) the \(L_{r}\)-norm of a standard Gaussian random variable. The notation f ≈ h means that there are two absolute positive constants c and C (that is, independent of any parameters) such that cf ≤ h ≤ Cf, and \(f \approx_{p,q} h\) means that there are two positive constants c(p, q) and C(p, q), which depend only on the parameters p and q, such that c(p, q)f ≤ h ≤ C(p, q)f.
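To fix the notation numerically, here is a minimal Python sketch (ours, not part of the paper): it evaluates \(\gamma_r\) from the exact moment formula for a standard Gaussian, checks that it is of order \(\sqrt{r}\), and builds a structured matrix \(G_A=(a_{ij}g_{ij})\) from a variance profile A. The helper name gamma_r and the chosen dimensions are illustrative only.

```python
# A minimal sketch (ours, not from the paper): gamma_r is the L_r-norm of a
# standard Gaussian, of order sqrt(r), and G_A = (a_ij g_ij) is a structured
# Gaussian matrix built from a fixed variance profile A.
import math
import numpy as np

def gamma_r(r):
    # E|g|^r = 2^(r/2) * Gamma((r+1)/2) / sqrt(pi) for a standard Gaussian g
    return (2 ** (r / 2) * math.gamma((r + 1) / 2) / math.sqrt(math.pi)) ** (1 / r)

for r in (2, 4, 8, 16):
    print(r, gamma_r(r), math.sqrt(r))   # gamma_r grows like sqrt(r)

rng = np.random.default_rng(0)
m, n = 50, 80
A = rng.uniform(0.0, 1.0, size=(m, n))   # fixed variance profile (a_ij)
G = A * rng.standard_normal((m, n))      # the structured matrix G_A = (a_ij g_ij)
print(G.shape)
```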

Our main result is the following theorem.

Theorem 1.1

For every 1 < p* ≤ 2 ≤ q < ∞ one has

$$\displaystyle\begin{array}{rcl} \mathbb{E}\,\big\|G:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{m}\big\|& \leq & \Big(\mathbb{E}\,\big\|G:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{m}\big\|^{q}\Big)^{1/q} {}\\ & \leq & C\,p^{5/q}\,(\log m)^{1/q}\,\bigg[\,\gamma _{ p}\,\max _{i\leq m}\|(a_{ij})_{j=1}^{n}\|_{ p} +\gamma _{q}\,\mathbb{E}\max _{{ i\leq m \atop j\leq n} }\vert a_{ij}g_{ij}\vert \,\bigg] {}\\ & & +2^{1/q}\,\gamma _{ q}\,\max _{j\leq n}\|(a_{ij})_{i=1}^{m}\|_{ q}, {}\\ \end{array}$$

where C is a positive absolute constant.

We conjecture the following bound.

Conjecture 1.2

For every 1 ≤ p* ≤ 2 ≤ q ≤ ∞ one has

$$\displaystyle\begin{array}{rcl} \mathbb{E}\,\big\|G:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{m}\big\| \approx \max _{ i\leq m}\|(a_{ij})_{j=1}^{n}\|_{ p} +\max _{j\leq n}\|(a_{ij})_{i=1}^{m}\|_{ q} + \mathbb{E}\max _{{ i\leq m \atop j\leq n} }\vert a_{ij}g_{ij}\vert.& & {}\\ \end{array}$$

Here, as usual, p* is defined via the relation 1∕p + 1∕p* = 1. This conjecture extends the corresponding conjecture for the case p = q = 2 and m = n. In this case, Bandeira and Van Handel proved in [1] an estimate with \(\sqrt{ \log \min (m,n)}\max \vert a_{ij}\vert\) instead of \(\mathbb{E}\max \vert a_{ij}g_{ij}\vert\) (see Eq. (1)), while in [24] the corresponding bound is proved with an additional factor \(\sqrt{\log \log n}\) on the right-hand side.
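As a purely numerical illustration of the quantities in Conjecture 1.2 (ours, not from the paper), the sketch below compares the three terms on the right-hand side with the operator norm in a case where the norm is easy to compute, namely p = q = 2 (so p* = 2 and the norm is the spectral norm); the dimensions and the variance profile are arbitrary choices.

```python
# Monte Carlo sketch (ours) of Conjecture 1.2 in the case p = q = 2:
# lhs ~ E ||G : l_2 -> l_2|| (spectral norm), rhs = sum of the three terms.
import numpy as np

rng = np.random.default_rng(1)
m = n = 200
A = rng.uniform(0.0, 1.0, size=(m, n))
p = q = 2.0

samples = [A * rng.standard_normal((m, n)) for _ in range(20)]
lhs = np.mean([np.linalg.norm(G, 2) for G in samples])          # E ||G||, spectral norm
row_term = np.max(np.linalg.norm(A, ord=p, axis=1))             # max_i ||(a_ij)_j||_p
col_term = np.max(np.linalg.norm(A, ord=q, axis=0))             # max_j ||(a_ij)_i||_q
entry_term = np.mean([np.max(np.abs(G)) for G in samples])      # E max |a_ij g_ij|

print(lhs, row_term + col_term + entry_term)   # comparable up to absolute constants
```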

Remark 1.3

The lower bound in the conjecture is almost immediate and follows from standard estimates. Thus the upper bound is the only difficulty.

Remark 1.4

In the case p* = 1 and q ≥ 2, a direct computation along the lines of Lemma 3.2 below shows that

$$\displaystyle{\mathbb{E}\,\big\|G:\ell_{ 1}^{n} \rightarrow \ell_{ q}^{m}\big\|\lesssim \gamma _{ q}\max _{j\leq n}\|(a_{ij})_{i=1}^{m}\|_{ q} + \mathbb{E}\max _{{ i\leq m \atop j\leq n} }\vert a_{ij}g_{ij}\vert.}$$

Remark 1.5

Note that if 1 ≤ p* ≤ 2 ≤ q ≤ ∞, then in the case of matrices of tensor structure, that is, \((a_{ij})_{i,j=1}^{n} = x\otimes y = (x_{j}\cdot y_{i})_{i,j=1}^{n}\) with \(x,y \in \mathbb{R}^{n}\), Chevet’s theorem [3, 4] and a direct computation show that

$$\displaystyle{\mathbb{E}\,\big\|G:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{n}\big\| \approx _{ p,q}\|y\|_{q}\|x\|_{\infty } +\| y\|_{\infty }\|x\|_{p}.}$$

If the matrix is diagonal, that is, \((a_{ij})_{i,j=1}^{n} =\mathop{ \mathrm{diag}}\nolimits (a_{11},\ldots,a_{nn})\), then we immediately obtain

$$\displaystyle{\mathbb{E}\,\big\|G:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{n}\big\| = \mathbb{E}\,\|(a_{ ii}g_{ii})_{i=1}^{n}\|_{ \infty }\approx \max _{i\leq n}\sqrt{\ln (i + 3)}\, \cdot a_{ii}^{{\ast}}\approx \| (a_{ ii})_{i=1}^{n}\|_{ M_{g}},}$$

where \((a_{ii}^{{\ast}})_{i\leq n}\) is the decreasing rearrangement of \((\vert a_{ii}\vert )_{i\leq n}\) and \(M_{g}\) is the Orlicz function given by

$$\displaystyle{M_{g}(s) = \sqrt{\frac{2} {\pi }} \int _{0}^{s}e^{- \frac{1} {2t^{2}} }\,dt}$$

(see Lemma 2.2 below and [11, Lemma  5.2] for the Orlicz norm expression).

Slightly different estimates, but of the same flavour, can also be obtained in the case 1 ≤ q ≤ 2 ≤ p* ≤ ∞.

2 Notation and Preliminaries

By \(c, C, C_{1},\ldots\) we always denote positive absolute constants, whose values may change from line to line, and we write \(c_{p}, C_{p},\ldots\) if the constants depend on some parameter p.

Given p ∈ [1, ∞], p* denotes its conjugate, given by the relation 1∕p + 1∕p* = 1. For \(x = (x_{i})_{i\leq n} \in \mathbb{R}^{n}\), \(\|x\|_{p}\) denotes its \(\ell_{p}\)-norm, that is, \(\|x\|_{\infty } =\max _{i\leq n}\vert x_{i}\vert\) and, for p < ∞,

$$\displaystyle{ \|x\|_{p} =\Big (\sum _{i=1}^{n}\vert x_{ i}\vert ^{p}\Big)^{1/p}. }$$

The corresponding space \((\mathbb{R}^{n},\|\cdot \|_{p})\) is denoted by \(\ell_{p}^{n}\), its unit ball by \(B_{p}^{n}\).

If E is a normed space, then \(E^{{\ast}}\) denotes its dual space and \(B_{E}\) its closed unit ball. The modulus of convexity of E is defined for any ɛ ∈ (0, 2) by

$$\displaystyle{\delta _{E}(\varepsilon ):=\inf \Big\{ 1 -\Big\|\frac{x + y} {2} \Big\|_{E}\,:\,\| x\|_{E} = 1,\ \|y\|_{E} = 1,\ \|x - y\|_{E}>\varepsilon \Big\}.}$$

We say that E has modulus of convexity of power type 2 if there exists a positive constant c such that for all ɛ ∈ (0, 2), \(\delta _{E}(\varepsilon ) \geq c\varepsilon ^{2}\). It is well known (see e.g. [8] or [18, Proposition 2.4]) that this property is equivalent to the fact that

$$\displaystyle{\Big\|\frac{x + y} {2} \Big\|_{E}^{2} +\lambda ^{-2}\Big\|\frac{x - y} {2} \Big\|_{E}^{2} \leq \frac{\|x\|_{E}^{2} +\| y\|_{ E}^{2}} {2} }$$

holds for all x, y ∈ E, where λ > 0 is a constant depending only on c. In that case, we say that E has modulus of convexity of power type 2 with constant λ. We clearly have \(\delta _{E}(\varepsilon ) \geq \varepsilon ^{2}/(2\lambda ^{2})\).

Recall that a Banach space E is of Rademacher type r for some 1 ≤ r ≤ 2 if there is C > 0 such that for all \(n \in \mathbb{N}\) and for all \(x_{1},\ldots,x_{n} \in E\),

$$\displaystyle{\bigg(\mathbb{E}_{\varepsilon }\Big\|\sum _{i=1}^{n}\varepsilon _{ i}x_{i}\Big\|^{2}\bigg)^{1/2} \leq C\left (\sum _{ i=1}^{n}\|x_{ i}\|^{r}\right )^{1/r},}$$

where \((\varepsilon _{i})_{i=1}^{\infty }\) is a sequence of independent random variables defined on some probability space \((\Omega, \mathbb{P})\) such that \(\mathbb{P}(\varepsilon _{i} = 1) = \mathbb{P}(\varepsilon _{i} = -1) = \frac{1} {2}\) for every \(i \in \mathbb{N}\). The smallest such C is called the type-r constant of E and is denoted by \(T_{r}(E)\). This concept was introduced into Banach space theory by Hoffmann-Jørgensen [14] in the early 1970s, and the basic theory was developed by Maurey and Pisier [17].

We will need the following theorem.

Theorem 2.1

Let E be a Banach space with modulus of convexity of power type 2 with constant λ. Let \(X_{1},\ldots,X_{m} \in E^{{\ast}}\) be independent random vectors, let q ≥ 2, and define

$$\displaystyle{B:= C\lambda ^{4}T_{ 2}(E^{{\ast}})\sqrt{\frac{\log m} {m}}\Big(\mathbb{E}\max _{i\leq m}\|X_{i}\|_{E^{{\ast}}}^{q}\Big)^{1/2},}$$

and

$$\displaystyle{\sigma:=\sup _{y\in B_{E}}\left ( \frac{1} {m}\sum _{i=1}^{m}\mathbb{E}\vert \langle X_{ i},y\rangle \vert ^{q}\right )^{1/q}.}$$

Then

$$\displaystyle\begin{array}{rcl} \mathbb{E}\sup _{y\in B_{E}}\bigg\vert \frac{1} {m}\sum _{i=1}^{m}\vert \langle X_{ i},y\rangle \vert ^{q} - \mathbb{E}\vert \langle X_{ i},y\rangle \vert ^{q}\bigg\vert & \leq B^{2} + B \cdot \sigma ^{q/2}.& {}\\ \end{array}$$

Its proof follows the argument “proof of condition (H)” of [13] in combination with the improvement on covering numbers established in [12, Lemma 2]. Indeed, in [12] the argument is only carried out in the simpler case q = 2, but it extends verbatim to the case q ≥ 2.

We also recall known facts about Gaussian random variables. The next lemma is well-known (see e.g. Lemmas 2.3, 2.4 in [24]).

Lemma 2.2

Let \(a = (a_{i})_{i\leq n} \in \mathbb{R}^{n}\) and let \((a_{i}^{{\ast}})_{i\leq n}\) be the decreasing rearrangement of \((\vert a_{i}\vert )_{i\leq n}\). Then

$$\displaystyle{\mathbb{E}\,\max _{i\leq n}\vert a_{i}g_{i}\vert \approx \max _{i\leq n}\sqrt{\ln (i + 3)}\, \cdot a_{i}^{{\ast}}.}$$

Note that, in general, the maximum of i.i.d. random variables weighted by the coordinates of a vector a is equivalent to a certain Orlicz norm \(\|a\|_{M}\), where the function M depends only on the distribution of the random variables (see [10, Corollary 2] and Lemma 5.2 in [11]).
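The following Monte Carlo sketch (ours) illustrates Lemma 2.2 for a concrete weight vector; the sample size and the choice of weights are arbitrary.

```python
# Monte Carlo illustration (ours) of Lemma 2.2: E max_i |a_i g_i| is comparable
# to max_i sqrt(ln(i+3)) * a_i^*, where (a_i^*) is the decreasing rearrangement
# of (|a_i|).
import numpy as np

rng = np.random.default_rng(2)
n = 1000
a = rng.uniform(0.0, 1.0, size=n)

lhs = np.mean([np.max(np.abs(a * rng.standard_normal(n))) for _ in range(2000)])
a_star = np.sort(np.abs(a))[::-1]                                  # decreasing rearrangement
rhs = np.max(np.sqrt(np.log(np.arange(1, n + 1) + 3)) * a_star)

print(lhs, rhs)   # equal up to absolute constants
```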

The following theorem is the classical Gaussian concentration inequality (see e.g. [5] or inequality (2.35) and Proposition 2.18 in [16]).

Theorem 2.3

Let \(n \in \mathbb{N}\) and let \((Y,\left \Vert \cdot \right \Vert _{Y })\) be a Banach space. Let \(y_{1},\ldots,y_{n} \in Y\) and \(X =\sum _{i=1}^{n}g_{i}y_{i}\). Then, for every t > 0,

$$\displaystyle{ \mathbb{P}\Big(\big\vert \left \Vert X\right \Vert _{Y } - \mathbb{E}\left \Vert X\right \Vert _{Y }\big\vert \geq t\Big) \leq 2\exp \left (- \frac{t^{2}} {2\sigma _{Y }(X)^{2}}\right ), }$$
(2)

where \(\sigma _{Y }(X) =\sup _{\|\xi \|_{Y^{{\ast}}}=1}\left (\sum _{i=1}^{n}\left \vert \xi (y_{i})\right \vert ^{2}\right )^{1/2}\) .

Remark 2.4

Let p ≥ 2, let \(a = (a_{j})_{j\leq n} \in \mathbb{R}^{n}\), and let \(X = (a_{j}g_{j})_{j\leq n}\). Then we clearly have

$$\displaystyle{\sigma _{\ell_{p}^{n}}(X) =\max _{j\leq n}\vert a_{j}\vert.}$$

Thus, Theorem 2.3 implies for \(X = (a_{j}g_{j})_{j\leq n}\)

$$\displaystyle\begin{array}{rcl} \mathbb{P}\Big(\big\vert \|X\|_{p} - \mathbb{E}\|X\|_{p}\big\vert> t\Big) \leq 2\,\exp \bigg(- \frac{t^{2}} {2\max _{j\leq n}\vert a_{j}\vert ^{2}}\bigg).& &{}\end{array}$$
(3)

Note also that

$$\displaystyle\begin{array}{rcl} \mathbb{E}\|X\|_{p} \leq \bigg (\sum _{j=1}^{n}\vert a_{ j}\vert ^{p}\,\mathbb{E}\vert g_{ j}\vert ^{p}\bigg)^{1/p} =\gamma _{ p}\|a\|_{p}.& &{}\end{array}$$
(4)
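A quick numerical sanity check (ours) of inequality (4), with arbitrary choices of n, p and a:

```python
# Sanity check (ours) of (4): for X = (a_j g_j)_{j<=n} and p >= 2 one has
# E||X||_p <= gamma_p * ||a||_p, where gamma_p = (E|g|^p)^(1/p).
import math
import numpy as np

rng = np.random.default_rng(3)
n, p = 500, 4.0
a = rng.uniform(0.0, 1.0, size=n)

gamma_p = (2 ** (p / 2) * math.gamma((p + 1) / 2) / math.sqrt(math.pi)) ** (1 / p)
lhs = np.mean([np.linalg.norm(a * rng.standard_normal(n), ord=p) for _ in range(2000)])
rhs = gamma_p * np.linalg.norm(a, ord=p)

print(lhs, rhs)   # lhs <= rhs, by Jensen's inequality applied to t -> t^(1/p)
```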

3 Proof of the Main Result

We will apply Theorem 2.1 with \(E =\ell_{ p^{{\ast}}}^{n}\), 1 < p* ≤ 2, and \(X_{1},\ldots,X_{m}\) being the rows of the matrix \(G = (a_{ij}g_{ij})_{i,j=1}^{m,n}\). We start with two lemmas in which we estimate the quantity σ and the expectation appearing in that theorem.

Lemma 3.1

Let \(m,n \in \mathbb{N}\), 1 < p* ≤ 2 ≤ q, and for i ≤ m let \(X_{i} = (a_{ij}g_{ij})_{j=1}^{n}\). Then

$$\displaystyle\begin{array}{rcl} \sigma & =\sup _{y\in B_{p^{{\ast}}}^{n}}\bigg( \frac{1} {m}\sum _{i=1}^{m}\mathbb{E}\big\vert \langle X_{ i},y\rangle \big\vert ^{q}\bigg)^{1/q} = \frac{\gamma _{q}} {m^{1/q}}\,\max _{j\leq n}\|(a_{ij})_{i=1}^{m}\|_{q}.& {}\\ \end{array}$$

Proof

For every i ≤ m, \(\langle X_{i},y\rangle =\sum _{ j=1}^{n}a_{ij}y_{j}g_{ij}\) is a centered Gaussian random variable with variance \(\|(a_{ij}y_{j})_{j=1}^{n}\|_{2}^{2}\). Hence,

$$\displaystyle\begin{array}{rcl} \sigma ^{q} =\sup _{ y\in B_{p^{{\ast}}}^{n}} \frac{1} {m}\sum _{i=1}^{m}\mathbb{E}\vert \langle X_{ i},y\rangle \vert ^{q} = \frac{\gamma _{q}^{q}} {m}\sup _{y\in B_{p^{{\ast}}}^{n}}\sum _{i=1}^{m}\bigg(\sum _{ j=1}^{n}\vert a_{ ij}y_{j}\vert ^{2}\bigg)^{q/2}.& & {}\\ \end{array}$$

Since p* ≤ 2 ≤ q, the function

$$\displaystyle{ \phi (z) =\sum _{ i=1}^{m}\bigg(\sum _{ j=1}^{n}\vert a_{ ij}\vert ^{2}\vert z_{ j}\vert ^{2/p^{{\ast}} }\bigg)^{q/2} }$$

is a convex function of z (because 2∕p* ≥ 1 and q∕2 ≥ 1) on the simplex \(S =\{ z \in \mathbb{R}^{n}\,\vert \,\sum _{j=1}^{n}z_{j} \leq 1,\,\forall j:\, z_{j} \geq 0\}\). Therefore, it attains its maximum at an extreme point of S; since φ(0) ≤ φ(e_k) for every k, the maximum is attained at one of the canonical unit basis vectors \(e_{1},\ldots,e_{n}\) of \(\mathbb{R}^{n}\). Substituting \(z_{j} = \vert y_{j}\vert ^{p^{{\ast}}}\), we thus obtain

$$\displaystyle{ \sup _{y\in B_{p^{{\ast}}}^{n}}\sum _{i=1}^{m}\bigg(\sum _{ j=1}^{n}\vert a_{ ij}y_{j}\vert ^{2}\bigg)^{q/2} =\sup _{ z\in S}\phi (z) =\sup _{k\leq n}\phi (e_{k}) =\max _{j\leq n}\|(a_{ij})_{i=1}^{m}\|_{ q}^{q}, }$$

which completes the proof. □ 
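The convexity argument can also be checked numerically. In the sketch below (ours, with arbitrary dimensions and exponents), random points on the sphere of \(B_{p^{{\ast}}}^{n}\) never exceed the value of the function \(\psi(y)=\sum_{i}\big(\sum_{j}\vert a_{ij}y_{j}\vert^{2}\big)^{q/2}\) (our name for the quantity inside the supremum above) at the best canonical basis vector.

```python
# Numerical sketch (ours) of the extreme-point argument in Lemma 3.1: the
# supremum of psi over B_{p*}^n is attained at a canonical basis vector, so
# random points of the unit sphere of l_{p*}^n never beat the best coordinate.
import numpy as np

rng = np.random.default_rng(4)
m, n, p, q = 30, 20, 4.0, 3.0
p_star = p / (p - 1.0)
A = rng.uniform(0.0, 1.0, size=(m, n))

def psi(y):
    # sum_i ( sum_j |a_ij y_j|^2 )^(q/2)
    return np.sum(np.sum((A * y) ** 2, axis=1) ** (q / 2))

best_basis = max(psi(e) for e in np.eye(n))       # value at canonical basis vectors
best_random = 0.0
for _ in range(5000):
    y = rng.standard_normal(n)
    y /= np.linalg.norm(y, ord=p_star)            # normalize in the l_{p*} norm
    best_random = max(best_random, psi(y))

print(best_basis, best_random)   # best_basis >= best_random
```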

Now we estimate the expectation appearing in Theorem 2.1. The proof is based on Gaussian concentration (Theorem 2.3) and is similar to the proofs of Theorem 2.1 and Remark 2.2 in [24].

Lemma 3.2

Let \(m,n \in \mathbb{N}\), 1 < p* ≤ 2 ≤ q, and for i ≤ m let \(X_{i} = (a_{ij}g_{ij})_{j=1}^{n}\). Then

$$\displaystyle\begin{array}{rcl} \Big(\mathbb{E}\max _{i\leq m}\|X_{i}\|_{p}^{q}\Big)^{1/q}& \leq & \max _{ i\leq m}\mathbb{E}\|X_{i}\|_{p} + C\,\gamma _{q}\,\mathbb{E}\max _{{ i\leq m \atop j\leq n} }\vert a_{ij}g_{ij}\vert {}\\ &\leq & \gamma _{p}\,\max _{i\leq m}\|(a_{ij})_{j=1}^{n}\|_{ p} + C\,\gamma _{q}\,\mathbb{E}\max _{{ i\leq m \atop j\leq n} }\vert a_{ij}g_{ij}\vert, {}\\ \end{array}$$

where C is a positive absolute constant.

Proof

We have

$$\displaystyle\begin{array}{rcl} \Big(\mathbb{E}\max _{i\leq m}\|X_{i}\|_{p}^{q}\Big)^{1/q}& \leq & \big\|\max _{ i\leq m}\big\vert \|X_{i}\|_{p} - \mathbb{E}\|X_{i}\|_{p}\big\vert +\max _{i\leq m}\mathbb{E}\|X_{i}\|_{p}\big\|_{L_{q}} {}\\ & \leq & \Big(\mathbb{E}\max _{i\leq m}\big\vert \|X_{i}\|_{p} - \mathbb{E}\|X_{i}\|_{p}\big\vert ^{q}\Big)^{1/q} +\max _{ i\leq m}\mathbb{E}\|X_{i}\|_{p}. {}\\ \end{array}$$

For all i ≤ m and t > 0 by (3) we have

$$\displaystyle{ \mathbb{P}\Big(\big\vert \|X_{i}\|_{p} - \mathbb{E}\|X_{i}\|_{p}\big\vert> t\Big) \leq 2\,\exp \bigg(- \frac{t^{2}} {2\max _{j\leq n}\vert a_{ij}\vert ^{2}}\bigg). }$$
(5)

By permuting the rows of \((a_{ij})_{i,j=1}^{m,n}\), we can assume that

$$\displaystyle{ \max _{j\leq n}\vert a_{1j}\vert \geq \ldots \geq \max _{j\leq n}\vert a_{mj}\vert. }$$

For each i ≤ m, choose j(i) ≤ n such that \(\vert a_{ij(i)}\vert =\max _{j\leq n}\vert a_{ij}\vert\). Clearly,

$$\displaystyle{\max _{{ i\leq m \atop j\leq n} }\vert a_{ij}g_{ij}\vert \geq \max _{i\leq m}\vert a_{ij(i)}\vert \cdot \vert g_{ij(i)}\vert }$$

and hence, by the independence of the \(g_{ij}\)’s and Lemma 2.2,

$$\displaystyle\begin{array}{rcl} b:= \mathbb{E}\max _{{ i\leq m \atop j\leq n} }\vert a_{ij}g_{ij}\vert \geq \mathbb{E}\max _{i\leq m}\vert a_{ij(i)}\vert \cdot \vert g_{i}\vert \geq c\max _{i\leq m}\sqrt{\log (i + 3)} \cdot \vert a_{ij(i)}\vert,& & {}\\ \end{array}$$

where the latter inequality follows since \(\vert a_{1j(1)}\vert \geq \ldots \geq \vert a_{mj(m)}\vert\). Thus, for i ≤ m,

$$\displaystyle\begin{array}{rcl} \max _{j\leq n}\vert a_{ij}\vert ^{2} = a_{ ij(i)}^{2} \leq \frac{b^{2}} {c\log (i + 3)}.& & {}\\ \end{array}$$

By (5) we observe that, for every t > 0,

$$\displaystyle\begin{array}{rcl} \mathbb{P}\Big(\max _{i\leq m}\big\vert \|X_{i}\|_{p} - \mathbb{E}\|X_{i}\|_{p}\big\vert> t\Big)& \leq & 2\,\sum _{i=1}^{m}\exp \bigg(-\frac{ct^{2}\log (i + 3)} {2b^{2}} \bigg) {}\\ & =& 2\,\sum _{i=1}^{m}\bigg( \frac{1} {i + 3}\bigg)^{ct^{2}/2b^{2} } \leq 2\,\int _{3}^{\infty }x^{-ct^{2}/2b^{2} }\,dx {}\\ & \leq & 6 \cdot 3^{-ct^{2}/2b^{2} }, {}\\ \end{array}$$

whenever \(ct^{2}/b^{2} \geq 4\). Integrating the tail inequality proves that

$$\displaystyle{\bigg(\mathbb{E}\max _{i\leq m}\Big\vert \|X_{i}\|_{p} - \mathbb{E}\|X_{i}\|_{p}\Big\vert ^{q}\bigg)^{1/q} \leq C_{ 1}\sqrt{q}\,b \leq C_{2}\,\gamma _{q}\,\,\mathbb{E}\max _{{ i\leq m \atop j\leq n} }\vert a_{ij}g_{ij}\vert.}$$

By the triangle inequality, we obtain the first desired inequality; the second one follows by (4). □

We are now ready to present the proof of the main theorem.

Proof of Theorem 1.1

First observe that

$$\displaystyle{\mathbb{E}\,\big\|G:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{m}\big\| \leq \Big (\mathbb{E}\,\big\|G:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{m}\big\|^{q}\Big)^{1/q} =\bigg (\mathbb{E}\sup _{ y\in B_{p^{{\ast}}}^{n}}\sum _{i=1}^{m}\big\vert \langle X_{ i},y\rangle \big\vert ^{q}\bigg)^{1/q}.}$$

We have

$$\displaystyle\begin{array}{rcl} \mathbb{E}\sup _{y\in B_{p^{{\ast}}}^{n}}\sum _{i=1}^{m}\big\vert \langle X_{ i},y\rangle \big\vert ^{q}& \leq & \mathbb{E}\sup _{ y\in B_{p^{{\ast}}}^{n}}\left [\sum _{i=1}^{m}\big\vert \langle X_{ i},y\rangle \big\vert ^{q} - \mathbb{E}\big\vert \langle X_{ i},y\rangle \big\vert ^{q}\right ]+\sup _{ y\in B_{p^{{\ast}}}^{n}}\sum _{i=1}^{m}\mathbb{E}\big\vert \langle X_{ i},y\rangle \big\vert ^{q} {}\\ & =& m \cdot \mathbb{E}\sup _{y\in B_{p^{{\ast}}}^{n}}\left [ \frac{1} {m}\sum _{i=1}^{m}\big\vert \langle X_{ i},y\rangle \big\vert ^{q} - \mathbb{E}\big\vert \langle X_{ i},y\rangle \big\vert ^{q}\right ]+m \cdot \sigma ^{q}. {}\\ \end{array}$$

Hence, Theorem 2.1 applied with \(E =\ell_{ p^{{\ast}}}^{n}\) implies

$$\displaystyle\begin{array}{rcl} \mathbb{E}\,\big\|G:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{m}\big\|^{q} \leq m \cdot \big [B^{2} + B\sigma ^{q/2}\big] + m \cdot \sigma ^{q} \leq 2m\,\big(B^{2} +\sigma ^{q}\big),& & {}\\ \end{array}$$

where B and σ are defined in that theorem. Therefore,

$$\displaystyle\begin{array}{rcl} \Big(\mathbb{E}\,\big\|G:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{m}\big\|^{q}\Big)^{1/q} \leq 2^{1/q}m^{1/q}\,\left (B^{2/q}+\sigma \right ).& & {}\\ \end{array}$$

Now, recall that \(T_{2}(\ell_{p}^{n}) \approx \sqrt{p}\) and that \(\ell_{p^{{\ast}}}^{n}\) has modulus of convexity of power type 2 with \(\lambda ^{-2} \approx 1/p\) (see, e.g., [19, Theorem 5.3]). Therefore,

$$\displaystyle\begin{array}{rcl} B^{2/q}& =& C^{2/q}\lambda ^{8/q}\,T_{ 2}^{2/q}(\ell_{ p}^{n})\left (\frac{\log m} {m}\right )^{1/q}\Big(\mathbb{E}\max _{ i\leq m}\|X_{i}\|_{p}^{q}\Big)^{1/q} {}\\ & =& C^{2/q}p^{5/q}(\log m)^{1/q}m^{-1/q}\Big(\mathbb{E}\max _{ i\leq m}\|X_{i}\|_{p}^{q}\Big)^{1/q}. {}\\ \end{array}$$

Applying Lemma 3.1, we obtain

$$\displaystyle\begin{array}{rcl} & & \Big(\mathbb{E}\,\big\|G:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{m}\big\|^{q}\Big)^{1/q} {}\\ & & \quad \leq (2C^{2})^{1/q} \cdot p^{5/q} \cdot (\log m)^{1/q}\Big(\mathbb{E}\max _{ i\leq m}\|X_{i}\|_{p}^{q}\Big)^{1/q} {}\\ & & \qquad + 2^{1/q}\gamma _{ q} \cdot \max _{j\leq n}\|(a_{ij})_{i=1}^{m}\|_{ q}. {}\\ \end{array}$$

The desired bound follows now from Lemma 3.2. □ 

Remark 3.3

This proof can be extended to the case of random matrices whose rows are centered independent vectors with multivariate Gaussian distributions. We leave the details to the interested reader.

4 Concluding Remarks

In this section, we briefly outline what can be obtained using the approach of [24]. We use a standard trick to pass to a symmetric matrix. Given the matrix \(G_{A}\), define S by

$$\displaystyle{S = \frac{1} {2}\left (\begin{array}{cc} 0 & G_{A}^{T} \\ G_{A} & 0 \end{array} \right ).}$$

Then, S is a random symmetric matrix and

$$\displaystyle{\sup _{w}\langle Sw,w\rangle =\sup _{u\in B_{p^{{\ast}}}^{n}}\sup _{v\in B_{q^{{\ast}}}^{m}}\langle G_{A}u,v\rangle =\big\| G_{A}:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{m}\big\|,}$$

where the supremum in w is taken over all vectors of the form \((u,v)^{T}\) with \(u \in B_{p^{{\ast}}}^{n}\) and \(v \in B_{q^{{\ast}}}^{m}\). Repeating verbatim the proof of Theorem 4.1 in [24] one gets

$$\displaystyle\begin{array}{rcl} \mathbb{E}\,\big\|G_{A}:\ell_{ p^{{\ast}}}^{n} \rightarrow \ell_{ q}^{m}\big\|\quad & \lesssim _{ p,q}& \mathbb{E}\max _{i\leq m}\bigg(\sum _{j=1}^{n}\vert g_{ j}\vert ^{p}\vert a_{ ij}\vert ^{p}\bigg)^{1/p} {}\\ & & \quad + \mathbb{E}\max _{j\leq n}\bigg(\sum _{i=1}^{m}\vert g_{ i}\vert ^{q}\vert a_{ ij}\vert ^{q}\bigg)^{1/q} + \mathbb{E}\max _{ i}Y _{i}, {}\\ \end{array}$$

where Y is a centered Gaussian random vector whose covariance matrix is positive definite with diagonal elements bounded by

$$\displaystyle{ \max \Bigg(\max _{i\leq m}\sqrt{\sum _{j } a_{ij }^{4}}\,,\,\max _{j\leq n}\sqrt{\sum _{i } a_{ij }^{4}}\,\Bigg). }$$

However, the bound obtained here and the one in Theorem 1.1 are incomparable; depending on the situation, one may be better than the other.
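For completeness, a short numerical sketch (ours) of the dilation identity used above, namely that \(\langle Sw,w\rangle = \langle G_{A}u,v\rangle\) for w = (u, v); the dimensions are arbitrary.

```python
# Sketch (ours) of the symmetrization trick: for w = (u, v), the quadratic form
# <Sw, w> of the dilation S = (1/2) [[0, G^T], [G, 0]] equals <G u, v>, so its
# supremum over the appropriate w is the operator norm of G from l_{p*}^n to l_q^m.
import numpy as np

rng = np.random.default_rng(5)
m, n = 7, 5
G = rng.standard_normal((m, n))
S = 0.5 * np.block([[np.zeros((n, n)), G.T], [G, np.zeros((m, m))]])

u = rng.standard_normal(n)
v = rng.standard_normal(m)
w = np.concatenate([u, v])

print(w @ S @ w, G @ u @ v)   # the two numbers coincide
```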