
1 Introduction

The study of random matrices started in the 1920s with the seminal work of Wishart [16], whose basic motivation was the statistical analysis of data. Later, Wigner used the eigenvalues of random matrices to model the spectra of heavy nuclei [15]. Nowadays, random matrix theory is a field with many applications, ranging from telecommunications to random graphs, and with many interesting and surprising results.

A central role in the study of random matrices with growing dimension is played by their eigenvalues. To introduce them let, for any \(n \in \mathbb{N}\), \(\left \{a_{n}(p,q),1 \leq p \leq q \leq n\right \}\) be a real-valued random field. Define the symmetric random n ×n matrix X n by

$$\displaystyle{ \mathbf{X}_{n}(q,p) = \mathbf{X}_{n}(p,q) = \frac{1} {\sqrt{n}}a_{n}(p,q),\qquad 1 \leq p \leq q \leq n. }$$

We will denote the (real) eigenvalues of X n by \(\lambda _{1}^{(n)} \leq \lambda _{2}^{(n)} \leq \ldots \leq \lambda _{n}^{(n)}\). Let μ n be the empirical eigenvalue distribution, i.e.

$$\displaystyle{ \mu _{n} = \frac{1} {n}\sum _{k=1}^{n}\delta _{ \lambda _{k}^{(n)}}. }$$

Wigner proved in his fundamental work [15] that, if the a n (p, q) are independent and normally distributed with mean 0, with variance 1 for the off-diagonal elements and variance 2 on the diagonal, the empirical eigenvalue distribution μ n converges weakly (in probability) to the so-called semicircle distribution (or law), i.e. the probability distribution ν on \(\mathbb{R}\) with density

$$\displaystyle{ \frac{1} {2\pi }\ \sqrt{4 - {x}^{2}}\ \mathbf{1}_{[-2,2]}(x)\ dx. }$$

An important step towards the universality of this result was taken by Arnold [1], who verified that the convergence to the semicircle law remains true if one replaces the Gaussian random variables by independent and identically distributed (i.i.d.) random variables with a finite fourth moment. The assumption of identical distribution may also be replaced by other conditions (see e.g. [8]). There are various ways to prove such a result. Among others, large deviations techniques as developed in [4] can be applied, as well as Stieltjes transforms [2] (the latter method can also be used to obtain results on the speed of convergence, see [12]). A very powerful instrument is still the moment method, originally employed by Wigner. To this end, it is useful to notice that, if Y is a random variable distributed according to the semicircle distribution, its moments are given by

$$\displaystyle{ \mathbb{E}[{Y }^{k}] = \left \{\begin{array}{@{}l@{\quad }l@{}} 0, \quad &\text{if}\ k\ \text{is odd},\\ C_{\frac{ k} {2} },\quad &\text{if}\ k\ \text{is even}, \end{array} \right. }$$

where C k , \(k \in \mathbb{N}\), are the Catalan numbers defined by C k  = (2k)! ∕ (k! (k + 1)! ).
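As a quick numerical sanity check (not part of the original argument), one can integrate x^k against the semicircle density and compare the result with the Catalan numbers; a minimal sketch in Python:

```python
import numpy as np
from math import factorial

def catalan(k):
    # C_k = (2k)! / (k! (k+1)!)
    return factorial(2 * k) // (factorial(k) * factorial(k + 1))

# Riemann-sum approximation of the k-th moment of the semicircle density
x = np.linspace(-2.0, 2.0, 400001)
dx = x[1] - x[0]
density = np.sqrt(4.0 - x**2) / (2.0 * np.pi)
for k in (2, 4, 6):
    moment = np.sum(x**k * density) * dx
    print(k, round(moment, 4), catalan(k // 2))
```

The even moments come out as 1, 2 and 5, matching C 1, C 2 and C 3, while all odd moments vanish by symmetry of the density.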

Recently, it was observed by Erdős et al. [9] that the convergence of the spectral measure towards the semicircle law holds in a local sense. More precisely, it can be proved on intervals whose width goes to zero sufficiently slowly.

However, the assumption of independent entries cannot be dropped without being replaced by something else. Bryc, Dembo and Jiang [5] studied random symmetric Toeplitz matrices and obtained a different limiting distribution. To be more precise, they considered a family \(\{X_{i},0 \leq i \leq n - 1\}\), \(n \in \mathbb{N}\), of independent and identically distributed real-valued random variables, and assumed that \(\mathrm{Var}(X_{1}) = 1\). Then, the scaled symmetric Toeplitz matrix T n was defined by \(\mathbf{T}_{n}(i,j) = 1/\sqrt{n}\ X_{\vert i-j\vert }\), 1 ≤ i, j ≤ n, i.e.

$$\displaystyle{ \mathbf{T}_{n} = \frac{1} {\sqrt{n}}\left (\begin{array}{lllllll} &X_{0} & X_{1} & X_{2} & \ldots & X_{n-2} & X_{n-1} \\ & X_{1} & X_{0} & X_{1} & \ldots & X_{n-3} & X_{n-2}\\ & & \ddots & \ddots & \ddots & & \\ & & & \ddots & \ddots & \ddots & \\ & X_{n-2} & X_{n-3} & X_{n-4} & \ldots & X_{0} & X_{1} \\ & X_{n-1} & X_{n-2} & X_{n-3} & \ldots & X_{1} & X_{0} \end{array} \right ). }$$
(1)

In this situation the empirical spectral distribution of T n converges weakly almost surely to some non-random probability measure γ T as n → ∞. This measure does not depend on the distribution of X 1. Moreover, it has finite moments of all orders, is symmetric, and has unbounded support. The aim of the present note is to investigate the borderline between convergence to the semicircle law and convergence to γ T . To this end we will study random matrices with independent diagonals, where the elements on the diagonals may be correlated. If they are independent, we are, of course, back in the Wigner case, while for complete correlation the matrix is a random Toeplitz matrix.
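The Toeplitz limit is easy to observe numerically. The following sketch (a simulation of (1), with Gaussian entries chosen purely for convenience) builds one scaled Toeplitz matrix and computes the moments of its empirical spectral distribution; the fourth moment lands near \(\sum _{\pi \in \mathcal{P}\mathcal{P}(4)}p_{T}(\pi ) = 8/3\) rather than the semicircle value C 2 = 2:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
X = rng.standard_normal(n)                            # i.i.d. entries X_0, ..., X_{n-1}
i = np.arange(n)
T = X[np.abs(i[:, None] - i[None, :])] / np.sqrt(n)   # T_n(i, j) = X_{|i-j|} / sqrt(n)
eigs = np.linalg.eigvalsh(T)
print(round(np.mean(eigs**2), 2), round(np.mean(eigs**4), 2))
```

For n = 1000 the second moment is close to 1, while the fourth moment is visibly larger than 2, in line with the unbounded support of γ T .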

We will see that depending on the strength of the correlation, the empirical spectral distribution either converges to the semicircle law, or to some mixture of γ T and the semicircle distribution. We hence have a sort of phase transition. Similar results were obtained in [10] for the case of weak correlations and in [11] for stronger correlations. A particularly nice example is borrowed from statistical mechanics, where the Curie-Weiss model is the simplest model of a ferromagnet. Here a magnetic substance consists of atoms that carry a magnetic spin, which is either + 1 or  − 1. These spins interact in a cooperative way, the strength of the interaction being governed by a parameter, the so-called inverse temperature. The model exhibits a phase transition from paramagnetic to ferromagnetic behavior (the standard reference for the Curie-Weiss model is [7]). We will see that this phase transition can be recovered on the level of the limiting spectral distribution of random matrices, if we fill their diagonals independently with the spins of Curie-Weiss models. For a small interaction parameter, this limiting spectral distribution is the semicircle law, while for a large interaction parameter we obtain a distribution which shows the influence of γ T .

The rest of this article is organized in the following way. In the two following sections we fix our notation. Section 2 contains a description of the measure γ T introduced above, while Sect. 3 describes the kind of matrices we will deal with in a general framework. From here we follow two paths. Section 4 contains our results on convergence towards the semicircle law, while Sect. 5 is devoted to the case of strong correlations along the diagonals. The basic ideas of the proofs, however, are so similar that we can treat them in a unified way. This is done in Sect. 6.

2 The Measure γ T

The limiting spectral distribution γ T can be defined by its moments which are described with the help of Toeplitz volumes.

Compared to [5], we will use a slightly different notation. This will make it easier to understand the arguments of the following sections. Thus, denote by \(\mathcal{P}\mathcal{P}(k)\), \(k \in \mathbb{N}\), the set of all pair partitions of \(\{1,\ldots,k\}\). For any \(\pi \in \mathcal{P}\mathcal{P}(k)\), we write i ∼  π j if i and j are in the same block of π. To introduce Toeplitz volumes, we associate to any \(\pi \in \mathcal{P}\mathcal{P}(k)\) the following system of equations in unknowns x 0 , …, x k :

$$\displaystyle{ \begin{array}{ll} x_{1} - x_{0} + x_{l_{1}} - x_{l_{1}-1} & = 0,\quad \text{if}\ 1 \sim _{\pi }l_{1}, \\ x_{2} - x_{1} + x_{l_{2}} - x_{l_{2}-1} & = 0,\quad \text{if}\ 2 \sim _{\pi }l_{2},\\ &\vdots \\ x_{i} - x_{i-1} + x_{l_{i}} - x_{l_{i}-1} & = 0,\quad \text{if}\ i \sim _{\pi }l_{i},\\ &\vdots \\ x_{k} - x_{k-1} + x_{l_{k}} - x_{l_{k}-1} & = 0,\quad \text{if}\ k \sim _{\pi }l_{k}.\end{array} }$$
(2)

Since π is a pair partition, we in fact have only k ∕ 2 equations although we have listed k. However, we have k + 1 variables. If \(\pi =\{\{ i_{1},j_{1}\},\ldots,\{i_{k/2},j_{k/2}\}\}\) with i l  < j l for any l = 1, …, k ∕ 2, we solve (2) for \(x_{j_{1}},\ldots,x_{j_{k/2}}\), and leave the remaining variables undetermined. We further impose the condition that all variables x 0 , …, x k lie in the interval I = [0, 1]. Solving the equations above in this way determines a cross section of the cube \(I^{k/2+1}\). The volume of this set will be denoted by p T (π). To give an example, consider the partition \(\pi =\{\{ 1,3\},\{2,4\}\}\). Solving (2) for \(x_{3} = x_{0} - x_{1} + x_{2}\) and \(x_{4} = x_{1} - x_{2} + x_{3} = x_{0}\), we obtain a cross section of I 3 given by

$$\displaystyle{ \{x_{0} - x_{1} + x_{2} \in I\} \cap \{ x_{0} \in I\}. }$$

This set has the volume p T (π) = 2 ∕ 3.
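The value p T (π) = 2∕3 from this example is straightforward to confirm by Monte Carlo integration; a small sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 1_000_000
# the free variables of the cross section, uniform on I = [0, 1]
x0, x1, x2 = rng.random((3, N))
x3 = x0 - x1 + x2            # solved from (2); x4 = x0 lies in I automatically
p_T = np.mean((x3 >= 0.0) & (x3 <= 1.0))
print(round(p_T, 3))         # close to 2/3
```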

Returning to the measure γ T , it is shown in [5] that all odd moments are zero, and for any even \(k \in \mathbb{N}\), the k-th moment is given by

$$\displaystyle{ \int {x}^{k}d\gamma _{ T}(x) =\sum _{\pi \in \mathcal{P}\mathcal{P}(k)}p_{T}(\pi ). }$$

Since | p T (π) | ≤ 1 for any \(\pi \in \mathcal{P}\mathcal{P}(k)\) and \(\#\mathcal{P}\mathcal{P}(k) = (k - 1)!!\), we have

$$\displaystyle{ \left \vert \int {x}^{k}d\gamma _{ T}(x)\right \vert \leq (k - 1)!!. }$$

In particular, Carleman’s condition holds, implying that γ T is uniquely determined by its moments. The results for the independent as well as the Toeplitz case will follow directly from Theorems 1 and 2, provided we assume uniform boundedness of the moments of all orders.

3 Matrices with Independent Processes on the Diagonals

We want to study two different models of symmetric matrices with dependent entries. Both models have the common property that entries from different diagonals are independent, while on each diagonal we have a stochastic process with a given covariance structure. To this end, consider for any \(n \in \mathbb{N}\) a family \(\{a_{n}(p,q),1 \leq p \leq q \leq n\}\) of real-valued random variables. Introduce the symmetric random n ×n matrix X n with

$$\displaystyle{ \mathbf{X}_{n}(p,q) = \mathbf{X}_{n}(q,p) = \frac{1} {\sqrt{n}}a_{n}(p,q),\quad 1 \leq p \leq q \leq n. }$$

Put a n (p, q) = a n (q, p) if 1 ≤ q < p ≤ n. Since we will resort to the method of moments, we first of all want to assume that

  1. (A1)

    \(\mathbb{E}\left [a_{n}(p,q)\right ] = 0\), \(\mathbb{E}\left [a_{n}{(p,q)}^{2}\right ] = 1\), and

    $$\displaystyle{ m_{k}:=\sup _{n\in \mathbb{N}}\max _{1\leq p\leq q\leq n}\mathbb{E}\left [{\left \vert a_{n}(p,q)\right \vert }^{k}\right ] <\infty,\quad k \in \mathbb{N}. }$$
    (3)

Note that the assumption of centered entries can be made without loss of generality if the family \(\{a_{n}(p,q),1 \leq p \leq q \leq n\}\) consists of identically distributed random variables. Indeed, assuming \(\mathbb{E}\left [a_{n}(p,q)\right ] = b_{n}\) for any 1 ≤ p ≤ q ≤ n, \(n \in \mathbb{N}\), and some sequence \((b_{n})_{n\in \mathbb{N}}\) such that b n  = o(n) yields the same limiting spectral distribution as in the centered case, if it exists. This follows from the rank inequality for Hermitian matrices (cf. [3], Lemma 2.2). Changing the variance, however, provides a different limit, which is a scaled version of the one we obtain under assumption (A1). To make the condition of independent diagonals more precise, we suppose that

  1. (A2)

    For any \(n \in \mathbb{N}\), \(j \in \{ 1,\ldots,n\}\), and distinct integers \(r_{1},\ldots,r_{j} \in \{ 0,\ldots,n - 1\}\), the families \(\{a_{n}(p,p + r_{1}),1 \leq p \leq n - r_{1}\},\ldots,\{a_{n}(p,p + r_{j}),1 \leq p \leq n - r_{j}\}\) are independent.

So far, we know that if we also have independence among the entries on the same diagonal, we obtain the semicircle law as the limiting spectral distribution. Although we will violate this assumption in our first model, the quickly decaying dependency structure will ensure that we nevertheless obtain the same limiting distribution. In our second model, we will basically assume that the covariance is the same for any two entries on the same diagonal. If it is equal to the variance, it is not surprising that the resulting limit is the same as in the Toeplitz case. In general, we will find that we obtain a combination of the Toeplitz distribution and the semicircle law.

4 Quickly Decaying Covariances

In our first model, the dependency structure within the diagonals is determined by the conditions

  1. (A3)

    The covariance of two entries on the same diagonal can be bounded by some constant depending only on their distance, i.e. for any \(n \in \mathbb{N}\) and 0 ≤ τ ≤ n − 1, there is a constant c n (τ) ≥ 0 such that

    $$\displaystyle{ \vert \mathrm{Cov}(a_{n}(p,q),a_{n}(p+\tau,q+\tau ))\vert \leq c_{n}(\tau ),\quad 1 \leq p \leq q \leq n-\tau, }$$
  2. (A4)

    The entries on the diagonals have a quickly decaying dependency structure, which will be expressed in terms of the condition

    $$\displaystyle{ \sum _{\tau =0}^{n-1}c_{ n}(\tau ) = o(n). }$$

Theorem 1.

Assume that the symmetric random matrix X n satisfies the conditions (A1)–(A4). Then, with probability 1, the empirical spectral distribution of X n converges weakly to the standard semicircle distribution.

Remark 1.

Note that in order for the semicircle law to hold, condition (A4) cannot simply be dropped. To understand this, consider a Toeplitz matrix. We clearly have \(c_{n}(\tau ) = \mathcal{O}(1)\), and indeed, the empirical spectral distribution of a sequence of Toeplitz matrices tends with probability 1 to a non-random probability measure with unbounded support.

4.1 Examples

We want to give some examples of processes that satisfy the assumptions of Theorem 1. Obviously, this is the case if the entries \(\{a_{n}(p,q),1 \leq p \leq q \leq n\}\) are independent satisfying (A1). The following three examples deal with finite Markov chains, Gaussian Markov processes and m-dependent processes.

  1. (i)

    Assume that \(\left \{x(p),p \in \mathbb{N}\right \}\) is a stationary Markov chain on a finite state space \(S =\{ s_{1},\ldots,s_{N}\}\), N ≥ 2. Denote by \(\varrho = (\varrho (1),\ldots,\varrho (N))\) the stationary distribution, and suppose that

    $$\displaystyle{ \mathbb{E}[x(p)] =\sum _{ j=1}^{N}s_{ j}\varrho (j) = 0,\quad \mathbb{E}[x{(p)}^{2}] =\sum _{ j=1}^{N}s_{ j}^{2}\varrho (j) = 1. }$$

    If \(\left \{x(p),p \in \mathbb{N}\right \}\) is aperiodic and irreducible, we have that for some constant C > 0 and some α ∈ (0, 1),

    $$\displaystyle{ \max _{i,j\in \{1,\ldots,N\}}\vert \mathbb{P}(x(p) = s_{i}\ \vert \ x(1) = s_{j}) -\varrho (i)\vert \leq {C\alpha }^{p-1},\quad p \in \mathbb{N}. }$$

    For more details, see [13], Theorem 4.9. In particular, we obtain

    $$\displaystyle{ \vert \mathrm{Cov}(x(p),x(1))\vert = \left \vert \sum _{i,j=1}^{N}s_{ i}s_{j}\Big(\mathbb{P}(x(p) = s_{i}\ \vert \ x(1) = s_{j}) -\varrho (i)\Big)\varrho (j)\right \vert \leq {C\alpha }^{p-1}. }$$

    Now assume that the processes \(\{a(p,p + r),p \in \mathbb{N}\}\), \(r \in \mathbb{N}_{0}\), are independent copies of \(\{x(p),p \in \mathbb{N}\}\), and put a n (p, q): = a(p, q) for any \(n \in \mathbb{N}\), 1 ≤ p ≤ q ≤ n. Condition (A2) then holds by definition, and the uniform moment bound in (A1) holds since the state space is bounded. Furthermore,

    $$\displaystyle{ \vert \mathrm{Cov}(a_{n}(p,q),a_{n}(p+\tau,q+\tau ))\vert \leq c_{n}(\tau ), }$$

    where \(c_{n}(\tau ) = c(\tau ) = {C\alpha }^{\tau }\). This is assumption (A3). Finally, (A4) follows since \(\sum _{\tau =0}^{\infty }c(\tau ) <\infty\), implying \(\sum _{\tau =0}^{n-1}c(\tau ) = o(n)\).

  2. (ii)

    Let \(\{y(p),p \in \mathbb{N}\}\) be a stationary Gaussian Markov process with mean 0 and variance 1. In addition to this, assume that the process is non-degenerate in the sense that \(\mathbb{E}\left [y(p)\vert y(q),q \leq p - 1\right ]\neq y(p)\). In this case, we can represent y(p) as

    $$\displaystyle{ y(p) = b_{p}\sum _{j=1}^{p}d_{ j}\xi _{j}, }$$

    where \(\{\xi _{j},j \in \mathbb{N}\}\) is a family of independent standard Gaussian variables and \(b_{p},d_{1},\ldots,d_{p} \in \mathbb{R}\setminus \{0\}\). We can now calculate

    $$\displaystyle{ \bar{c}(\tau ):=\mathrm{ Cov}(y(p+\tau ),y(p)) = b_{p+\tau }b_{p}\sum _{i=1}^{p+\tau }\sum _{ j=1}^{p}d_{ i}d_{j}\mathbb{E}[\xi _{i}\xi _{j}] = b_{p+\tau }b_{p}\sum _{j=1}^{p}d_{ j}^{2}. }$$

    Note that \(1 = \mathbb{E}[y{(p)}^{2}] = b_{p}^{2}\sum _{j=1}^{p}d_{j}^{2}\). As a consequence, we have

    $$\displaystyle{ \bar{c}(\tau ) = \frac{b_{p+\tau }} {b_{p}} = \frac{b_{p+\tau }} {b_{p+\tau -1}} \frac{b_{p+\tau -1}} {b_{p+\tau -2}}\cdots \frac{b_{p+1}} {b_{p}} =\bar{ c}{(1)}^{\tau }. }$$

    To see that \(\vert \bar{c}(\tau )\vert <1\), we first compute \(\bar{c}(1)y(1) = b_{2}b_{1}^{2}d_{1}^{3}\xi _{1} = b_{2}d_{1}\xi _{1}\), implying that \(y(2) = b_{2}d_{2}\xi _{2} +\bar{ c}(1)y(1)\). Using this identity to calculate the variance, we can take account of the independence of ξ 2 and y(1) to obtain

    $$\displaystyle{ 1 = \mathbb{E}[y{(2)}^{2}] = b_{ 2}^{2}d_{ 2}^{2} +\bar{ c}{(1)}^{2}. }$$

    Since b 2, d 2≠0, we conclude that \(\vert \bar{c}(1)\vert <1\). In analogy to the first example, we assume that the processes \(\{a(p,p + r),p \in \mathbb{N}\}\), \(r \in \mathbb{N}_{0}\), are independent copies of \(\{y(p),p \in \mathbb{N}\}\), and put a n (p, q): = a(p, q) for any \(n \in \mathbb{N}\), 1 ≤ p ≤ q ≤ n. Conditions (A1) and (A2) obviously hold. Defining \(c(\tau ):= \vert \bar{c}(\tau )\vert\), we further obtain

    $$\displaystyle{ \vert \mathrm{Cov}(a_{n}(p,q),a_{n}(p+\tau,q+\tau ))\vert \leq c(\tau ). }$$

    Since \(\vert \bar{c}(1)\vert <1\), we have \(\sum _{\tau =0}^{\infty }c(\tau ) <\infty\), implying \(\sum _{\tau =0}^{n-1}c(\tau ) = o(n)\). Thus assumptions (A3) and (A4) are satisfied.

  3. (iii)

    Assume that \(\left \{z(p),p \in \mathbb{N}\right \}\) is a stationary process of m-dependent random variables, i.e. z(p) and z(q) are stochastically independent whenever | p − q | > m. Moreover, suppose that z(1) is centered with unit variance and has finite moments of all orders. Define

    $$\displaystyle{ c(\tau ):= \vert \mathrm{Cov}(z(1),z(\tau +1))\vert,\quad \tau \in \mathbb{N}_{0}. }$$

    Then, c(τ) = 0 for any τ > m. Thus, \(\sum _{\tau =0}^{n-1}c(\tau ) =\sum _{ \tau =0}^{m}c(\tau ) = o(n)\) for any n ≥ m + 1. Let \(\{a(p,p + r),p \in \mathbb{N}\}\), \(r \in \mathbb{N}_{0}\), be independent copies of \(\{z(p),p \in \mathbb{N}\}\), and put \(a_{n}(p,q):= a(p,q)\) for any \(n \in \mathbb{N}\), \(1 \leq p \leq q \leq n\). Then, (A1)–(A4) are satisfied.
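The examples above are easy to test numerically. The following sketch (our own simulation, using a stationary Gaussian AR(1) process as in example (ii), with correlation parameter ρ = 0.5 chosen for illustration) fills each diagonal of X n with an independent copy of the process and compares the empirical eigenvalue moments with the semicircle moments 1 and C 2 = 2 predicted by Theorem 1:

```python
import numpy as np

rng = np.random.default_rng(3)
n, rho = 500, 0.5

def ar1(length):
    # stationary Gaussian AR(1): mean 0, variance 1, Cov(y(p+tau), y(p)) = rho**tau
    y = np.empty(length)
    y[0] = rng.standard_normal()
    for p in range(1, length):
        y[p] = rho * y[p - 1] + np.sqrt(1.0 - rho**2) * rng.standard_normal()
    return y

X = np.zeros((n, n))
for r in range(n):                    # independent process on each diagonal
    d = ar1(n - r)
    idx = np.arange(n - r)
    X[idx, idx + r] = d
    X[idx + r, idx] = d
X /= np.sqrt(n)

eigs = np.linalg.eigvalsh(X)
print(round(np.mean(eigs**2), 2), round(np.mean(eigs**4), 2))
```

Both moments should land close to the semicircle values 1 and 2, in agreement with Theorem 1, since here \(c(\tau ) {=\rho }^{\tau }\) is summable and (A4) holds.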

5 Constant Covariances

For our second model, we assume that

  1. (A3′)

    The covariance of two distinct entries on the same diagonal depends only on n, i.e. for any 1 ≤ τ ≤ n − 1 and 1 ≤ p, q ≤ n − τ, we can define

    $$\displaystyle{ c_{n}:=\mathrm{ Cov}(a_{n}(p,q),a_{n}(p+\tau,q+\tau )), }$$
  2. (A4′)

    The limit \(c:=\lim _{n\rightarrow \infty }c_{n}\) exists.

To describe the limiting spectral distribution in this case, we want to resort to pair partitions. However, we need a further notion which proved to be useful in [5] when considering the limiting spectral distribution of Markov matrices.

Definition 1.

Let \(k \in \mathbb{N}\) be even, and fix \(\pi \in \mathcal{P}\mathcal{P}(k)\). The height h(π) of π is the number of pairs i ∼  π j, i < j, such that either j = i + 1 or the restriction of π to \(\{i + 1,\ldots,j - 1\}\) is a pair partition.

Note that the property that the restriction of π to \(\{i + 1,\ldots,j - 1\}\) is a pair partition in particular requires that the distance j − i − 1 ≥ 1 is even. To give an example how to calculate the height of a partition, take \(\pi =\{\{ 1,6\},\{2,4\},\{3,5\}\}\). Considering the block \(\{1,6\}\), we see that the restriction of π to \(\{2,3,4,5\}\) is a pair partition, namely \(\{\{2,4\},\{3,5\}\}\). However, this is not true for either of the two remaining blocks. Hence, h(π) = 1.

In the following, we will say that a pair partition \(\pi \in \mathcal{P}\mathcal{P}(k)\) is crossing if there are i < j < l < m such that i ∼  π l and j ∼  π m. Otherwise, we call the pair partition non-crossing. The set of all crossing pair partitions of \(\{1,\ldots,k\}\) is denoted by \(\mathcal{C}\mathcal{P}\mathcal{P}(k)\), and the set of all non-crossing pair partitions by \(\mathcal{N}\mathcal{P}\mathcal{P}(k)\). Recall that for even \(k \in \mathbb{N}\), the Catalan number C k ∕ 2 is given by \(C_{k/2} = \#\mathcal{N}\mathcal{P}\mathcal{P}(k)\).
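The combinatorial notions of this section are easy to implement; a small sketch (our own helper functions, not taken from [5]) that enumerates pair partitions, verifies \(\#\mathcal{P}\mathcal{P}(k) = (k - 1)!!\) and \(\#\mathcal{N}\mathcal{P}\mathcal{P}(k) = C_{k/2}\), and computes the height of the example partition above:

```python
from math import factorial

def pair_partitions(elems):
    # recursively pair the smallest remaining element with every other one
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for idx, j in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for pp in pair_partitions(remaining):
            yield [(first, j)] + pp

def is_crossing(pp):
    # crossing: two blocks (a, b), (c, d) with a < c < b < d
    return any(a < c < b < d for (a, b) in pp for (c, d) in pp)

def height(pp):
    # h(pi): pairs (i, j) with j = i + 1, or pi restricted to {i+1, ..., j-1}
    # a pair partition, i.e. every other block entirely inside or outside that set
    h = 0
    for (i, j) in pp:
        inner = set(range(i + 1, j))
        if all(b <= inner or not (b & inner)
               for b in map(set, pp) if b != {i, j}):
            h += 1
    return h

for k in (2, 4, 6, 8):
    pps = list(pair_partitions(tuple(range(1, k + 1))))
    catalan = factorial(k) // (factorial(k // 2) * factorial(k // 2 + 1))
    dfact = 1
    for m in range(k - 1, 1, -2):
        dfact *= m
    assert len(pps) == dfact                                     # (k-1)!!
    assert sum(1 for p in pps if not is_crossing(p)) == catalan  # C_{k/2}

print(height([(1, 6), (2, 4), (3, 5)]))  # 1, as computed in the text
```

One can also check with these functions that h(π) = k∕2 exactly for the non-crossing pair partitions.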

We can now state the main result of this section.

Theorem 2.

Assume that the symmetric random matrix X n satisfies the conditions (A1), (A2), (A3′), and (A4′). Then, with probability 1, the empirical spectral distribution of X n converges weakly to a deterministic probability distribution ν c with k-th moment

$$\displaystyle{ \int {x}^{k}d\nu _{ c}(x) = \left \{\begin{array}{llll} &C_{\frac{k} {2} } +\sum _{\pi \in \mathcal{C}\mathcal{P}\mathcal{P}(k)}p_{T}(\pi ){c}^{\frac{k} {2} -h(\pi )},&&\text{if}\ k\ \text{is even}, \\ &0, &&\text{if}\ k\ \text{is odd}. \end{array} \right. }$$

If k is even, we can also write \(\int {x}^{k}d\nu _{c}(x) =\sum _{\pi \in \mathcal{P}\mathcal{P}(k)}p_{T}(\pi ){c}^{k/2-h(\pi )}\) .
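To illustrate the formula (a computation easily checked from the definitions, not taken from [5]), take k = 4. The only crossing pair partition of \(\{1,2,3,4\}\) is \(\pi =\{\{ 1,3\},\{2,4\}\}\), which has h(π) = 0 and, as computed in Sect. 2, p T (π) = 2 ∕ 3. Hence

$$\displaystyle{ \int {x}^{4}d\nu _{c}(x) = C_{2} + \frac{2} {3}\,{c}^{2} = 2 + \frac{2} {3}\,{c}^{2}, }$$

which interpolates between the semicircle moment 2 at c = 0 and the Toeplitz moment 8 ∕ 3 at c = 1.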

Remark 2.

As for the limiting distribution in the Toeplitz case, we can verify the Carleman condition to see that ν c is uniquely determined by its moments.

Remark 3.

If c = 0, Theorem 2 states that the limiting distribution is the semicircle law since h(π) < k ∕ 2 for any crossing partition \(\pi \in \mathcal{C}\mathcal{P}\mathcal{P}(k)\). This result can also be deduced from Theorem 1. Indeed, choose c n (τ) = | c n  | for any τ ≥ 1, and c n (0) = 1. We then have for any \(1 \leq p \leq q \leq n-\tau\),

$$\displaystyle{ \vert \mathrm{Cov}(a_{n}(p,q),a_{n}(p+\tau,q+\tau ))\vert = c_{n}(\tau ). }$$

Furthermore, we obtain \(\sum _{\tau =0}^{n-1}c_{n}(\tau ) = 1 + (n - 1)\vert c_{n}\vert = o(n)\) since \(\lim _{n\rightarrow \infty }c_{n} = c = 0\). Consequently, (A3) and (A4) are satisfied.

5.1 Examples

We want to give some examples of processes satisfying the assumptions of Theorem 2.

  1. (i)

    Consider a symmetric Toeplitz matrix as in (1). The limiting spectral distribution can also be deduced from Theorem 2. Indeed, assuming that the entries are centered with unit variance and have finite moments of all orders, we see that all conditions are satisfied with c = c n  = 1. Thus, we get

    $$\displaystyle{ \int {x}^{k}d\nu _{ 1}(x) = \left \{\begin{array}{llll} &C_{\frac{k} {2} } +\sum _{\pi \in \mathcal{C}\mathcal{P}\mathcal{P}(k)}p_{T}(\pi ) =\sum _{\pi \in \mathcal{P}\mathcal{P}(k)}p_{T}(\pi ),&&\text{if}\ k\ \text{is even}, \\ &0, &&\text{if}\ k\ \text{is odd},\end{array} \right. }$$

    as proven in [5].

  2. (ii)

    Suppose that for any \(n \in \mathbb{N}\), \(\left \{x_{n}(p),1 \leq p \leq n\right \}\) is a family of exchangeable random variables, i.e. the distribution of the vector (x n (1), …, x n (n)) is the same as that of (x n (σ(1)), …, x n (σ(n))) for any permutation σ of \(\{1,\ldots,n\}\). In this case, we can conclude that for any 1 ≤ p < q ≤ n, we have

    $$\displaystyle{ \mathrm{Cov}(x_{n}(p),x_{n}(q)) =\mathrm{ Cov}(x_{n}(1),x_{n}(2)) =: c_{n}. }$$

    Now assume that \(c_{n} \rightarrow c \in \mathbb{R}\) as \(n \rightarrow \infty\). Define for any \(n \in \mathbb{N}\), \(r \in \{ 0,\ldots,n - 1\}\), the process \(\{a_{n}(p,p + r),1 \leq p \leq n - r\}\) to be an independent copy of \(\{x_{n}(p),1 \leq p \leq n - r\}\). Then, all conditions of Theorem 2 are satisfied if we ensure that the moment condition (A1) holds. The resulting limiting distribution for different choices of c is depicted in Fig. 1.

    Fig. 1 Histograms of the empirical spectral distribution of 100 realizations of 1,000 × 1,000 matrices X 1,000 with standard Gaussian entries. (a) c = 0.25. (b) c = 0.5. (c) c = 0.75

    An example of a process with exchangeable variables is the Curie-Weiss model with inverse temperature β > 0. Here, the vector \(x_{n}\,=\,(x_{n}(1),\ldots,x_{n}(n))\) takes values in \(\{-1,{1\}}^{n}\), and for any \(\omega = (\omega (1),\ldots,\omega (n)) \in \{-1,{1\}}^{n}\), we have

    $$\displaystyle{ \mathbb{P}(x_{n} =\omega ) = \frac{1} {Z_{n,\beta }}\exp \left ( \frac{\beta } {2n}{\left (\sum _{i=1}^{n}\omega (i)\right )}^{2}\right ), }$$

    where Z n, β is the normalizing constant. Since \(\mathbb{P}(x_{n}(1) = -1) = \mathbb{P}(x_{n}(1) = 1) = 1/2\), we obtain \(\mathbb{E}[x_{n}(1)] = 0\). Further, we clearly have \(\mathbb{E}[x_{n}{(1)}^{2}] = 1\). It remains to determine \(c =\lim _{n\rightarrow \infty }c_{n}\). To this end, we make use of the identity

    $$\displaystyle{ c_{n} =\mathrm{ Cov}(x_{n}(1),x_{n}(2)) = \mathbb{E}[x_{n}(1)x_{n}(2)] = \frac{n} {n - 1}\mathbb{E}[m_{n}^{2}] - \frac{1} {n - 1}, }$$

    where \(m_{n}:= 1/n\ \sum _{i=1}^{n}x_{n}(i)\) is the so-called magnetization of the system. Since | m n  | ≤ 1, we see that m n is uniformly integrable. Thus, m n converges in \({\mathcal{L}}^{2}\) to some random variable m if and only if m n  → m in probability. In [6], it was verified that m n  → 0 in probability if β ≤ 1, and m n  → m with \(m \sim 1/2\ \delta _{m(\beta )} + 1/2\ \delta _{-m(\beta )}\) for some m(β) > 0 if β > 1. The function m(β) is monotonically increasing on (1, ∞), and satisfies m(β) → 0 as β ↘ 1 and m(β) → 1 as β → ∞. We now obtain

    $$\displaystyle{ c =\lim _{n\rightarrow \infty }c_{n} = \left \{\begin{array}{llll} &0, &&\text{if}\ \beta \leq 1, \\ &m{(\beta )}^{2},&&\text{if}\ \beta> 1.\end{array} \right. }$$

    Thus, the limiting spectral distribution of X n is the semicircle law if β ≤ 1, and approximately the Toeplitz limit if β is large. This is not surprising insofar as the different sites in the Curie-Weiss model show little interaction, i.e. behave almost independently, if the temperature is high, or, in other words, β is small. However, if the temperature is low, i.e. β is large, the spins at different sites depend strongly on one another. The phase transition at the critical inverse temperature β = 1 in the Curie-Weiss model is thus reflected in the limiting spectral distribution of X n as well.
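For small n, the covariance c n in the Curie-Weiss model can be computed exactly by enumerating all 2^n spin configurations; a sketch of this (our own illustration, with n = 12 chosen so the enumeration stays cheap):

```python
import numpy as np
from itertools import product

def curie_weiss_cov(n, beta):
    # exact c_n = Cov(x_n(1), x_n(2)) via c_n = n/(n-1) E[m_n^2] - 1/(n-1)
    total, second = 0.0, 0.0
    for omega in product((-1, 1), repeat=n):
        s = sum(omega)
        w = np.exp(beta * s**2 / (2 * n))
        total += w
        second += w * (s / n) ** 2
    return n / (n - 1) * (second / total) - 1 / (n - 1)

# subcritical beta gives a small covariance, supercritical a large one
print(round(curie_weiss_cov(12, 0.5), 3), round(curie_weiss_cov(12, 3.0), 3))
```

At β = 0 the spins are i.i.d. and c n  = 0 exactly; already at n = 12 the contrast between the subcritical and the supercritical regime is clearly visible.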

6 Proof of Theorems 1 and 2

The proofs of both theorems start with the same idea. We need to distinguish them only at the point where the covariances have to be calculated. The main technique we apply is the method of moments. The idea is to first determine the weak limit of the expected empirical distribution. Afterwards, concentration inequalities can be used to obtain almost sure convergence.

6.1 The Expected Empirical Spectral Distribution

To determine the limit of the k-th moment of the expected empirical spectral distribution μ n of X n , we write

$$\displaystyle\begin{array}{rcl} \mathbb{E}\left [\int {x}^{k}d\mu _{ n}(x)\right ]& =& \frac{1} {n}\mathbb{E}\Big[\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\Big] {}\\ & =& \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{p_{1},\ldots,p_{k}=1}^{n}\mathbb{E}\left [a(p_{ 1},p_{2})a(p_{2},p_{3})\cdots a(p_{k-1},p_{k})a(p_{k},p_{1})\right ]. {}\\ \end{array}$$

The main task is now to compute the expectations on the right hand side. However, we have to face the problem that some of the entries involved are independent and some are not. To be more precise, \(a(p_{1},q_{1}),\ldots,a(p_{j},q_{j})\) are independent whenever they can be found on different diagonals of X n , i.e. whenever the distances \(\vert p_{1} - q_{1}\vert,\ldots,\vert p_{j} - q_{j}\vert\) are distinct. Hence, a first step in our proof is to consider the expectation \(\mathbb{E}\left [a(p_{1},p_{2})a(p_{2},p_{3})\cdots a(p_{k-1},p_{k})a(p_{k},p_{1})\right ]\), and to identify entries with the same distance between their indices. To this end, we adapt some concepts of [14] and [5] to our situation.
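The first equality in the display above rests on the elementary identity \(\mathrm{tr}(\mathbf{X}_{n}^{k}) =\sum _{j}{(\lambda _{j}^{(n)})}^{k}\), which a short numerical sketch (with Gaussian Wigner entries, chosen purely for convenience) confirms:

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 200, 4
A = rng.standard_normal((n, n))
X = (A + A.T) / np.sqrt(2 * n)        # symmetric; off-diagonal entries have variance 1/n
eigs = np.linalg.eigvalsh(X)
trace_moment = np.trace(np.linalg.matrix_power(X, k)) / n
eig_moment = np.mean(eigs**k)
print(abs(trace_moment - eig_moment) < 1e-10)   # the two agree up to rounding
```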

To start with, fix \(k \in \mathbb{N}\), and define \(\mathcal{T}_{n}(k)\) to be the set of k-tuples of consistent pairs, that is multi-indices \(\left (P_{1},\ldots,P_{k}\right )\) satisfying for any j = 1, …, k,

  1. 1.

    \(P_{j} = (p_{j},q_{j}) \in {\left \{1,\ldots,n\right \}}^{2}\),

  2. 2.

    \(q_{j} = p_{j+1}\), where k + 1 is cyclically identified with 1.

With this notation, we find that

$$\displaystyle{ \frac{1} {n}\mathbb{E}\Big[\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\Big] = \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\left (P_{1},\ldots,P_{k}\right )\in \mathcal{T}_{n}(k)}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ]. }$$

To reflect the dependency structure among the entries \(a_{n}(P_{1})\ldots a_{n}(P_{k})\), we want to make use of the set \(\mathcal{P}(k)\) of partitions of \(\{1,\ldots,k\}\). Thus, take \(\pi \in \mathcal{P}(k)\). We say that an element \(\left (P_{1},\ldots,P_{k}\right ) \in \mathcal{T}_{n}(k)\) is a π -consistent sequence if

$$\displaystyle{ \left \vert p_{i} - q_{i}\right \vert = \left \vert p_{j} - q_{j}\right \vert \quad \Longleftrightarrow\quad i \sim _{\pi }j. }$$

According to condition (A2), this implies that \(a_{n}(P_{i_{1}}),\ldots,a_{n}(P_{i_{l}})\) are stochastically independent if i 1 , …, i l belong to l different blocks of π. The set of all π-consistent sequences \(\left (P_{1},\ldots,P_{k}\right ) \in \mathcal{T}_{n}(k)\) is denoted by S n (π). Note that the sets S n (π), \(\pi \in \mathcal{P}(k)\), are pairwise disjoint, and \(\bigcup _{\pi \in \mathcal{P}(k)}S_{n}(\pi ) = \mathcal{T}_{n}(k)\). Consequently, we can write

$$\displaystyle{ \frac{1} {n}\mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\right ] = \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\pi \in \mathcal{P}(k)}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ]. }$$
(4)

In a next step, we want to exclude partitions that do not contribute to (4) as n → ∞. These are the partitions satisfying either # π > k ∕ 2 or # π < k ∕ 2, where # π denotes the number of blocks of π. We treat the two cases separately.

First case: # π > k ∕ 2. Since π is a partition of \(\left \{1,\ldots,k\right \}\) with more than k ∕ 2 blocks, there is at least one singleton, i.e. a block containing only one element, say l. Consequently, a n (P l ) is independent of \(\{a_{n}(P_{j}),j\neq l\}\) if \(\left (P_{1},\ldots,P_{k}\right ) \in S_{n}(\pi )\). Since we assumed the entries to be centered, we obtain

$$\displaystyle{ \mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = \mathbb{E}\Big[\prod _{i\neq l}a_{n}(P_{i})\Big]\mathbb{E}\left [a_{n}(P_{l})\right ] = 0. }$$

This yields

$$\displaystyle{ \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = 0. }$$

Second case: r: = # π < k ∕ 2. Here, we want to argue that π gives vanishing contribution to (4) as n → ∞ by bounding # S n (π). To fix an element \((P_{1},\ldots,P_{k}) \in S_{n}(\pi )\), we first choose the pair \(P_{1} = (p_{1},q_{1})\). There are at most n possibilities to assign a value to p 1 , and another n possibilities for q 1 . To fix \(P_{2} = (p_{2},q_{2})\), note that the consistency of the pairs implies p 2  = q 1 . If now 1 ∼  π 2, the condition \(\left \vert p_{1} - q_{1}\right \vert = \left \vert p_{2} - q_{2}\right \vert\) allows at most two choices for q 2 . Otherwise, if \(1\not\sim _{\pi }2\), we have at most n possibilities. We now proceed sequentially to determine the remaining pairs. When arriving at some index i, we check whether i is in the same block as some preceding index 1, …, i − 1. If this is the case, then we have at most two choices for P i ; otherwise, we have at most n. Since there are exactly r = # π different blocks, we can conclude that

$$\displaystyle{ \#S_{n}(\pi ) \leq {n}^{2}{n}^{r-1}{2}^{k-r} \leq C\ {n}^{r+1} }$$
(5)

with a constant C = C(r, k) depending on r and k.

Now the uniform boundedness of the moments (3) and the Hölder inequality together imply that for any sequence (P 1 , …, P k ),

$$\displaystyle{ \left \vert \mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ]\right \vert \leq {\left [\mathbb{E}{\left \vert a_{n}(P_{1})\right \vert }^{k}\right ]}^{\frac{1} {k} }\cdots {\left [\mathbb{E}{\left \vert a_{n}(P_{k})\right \vert }^{k}\right ]}^{\frac{1} {k} } \leq m_{k}. }$$
(6)

Consequently, taking into account the relation r < k ∕ 2, we get

$$\displaystyle{ \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}(\pi )}\left \vert \mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ]\right \vert \leq C\ \frac{\#S_{n}(\pi )} {{n}^{\frac{k} {2} +1}} \leq C\ \frac{1} {{n}^{\frac{k} {2} -r}} = o(1). }$$

Combining the calculations in the first and the second case, we can conclude that

$$\displaystyle{ \frac{1} {n}\mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\right ] = \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{ \begin{array}{c}\pi \in \mathcal{P}(k), \\ \#\pi =\frac{k} {2} \end{array}}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] + o(1). }$$

Now assume that k is odd. Then the condition # π = k ∕ 2 cannot be satisfied, and the considerations above immediately yield

$$\displaystyle{ \lim _{n\rightarrow \infty }\frac{1} {n}\mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\right ] = 0. }$$
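The vanishing of the odd moments can also be observed numerically. The sketch below is our own illustration, assuming i.i.d. standard normal entries (the diagonal variance differs from Wigner's original normalization, which does not affect the limit); it estimates \(\frac{1}{n}\mathbb{E}[\mathrm{tr}(\mathbf{X}_{n}^{k})]\) by Monte Carlo:

```python
import numpy as np

rng = np.random.default_rng(0)

def trace_moment(n, k, reps=50):
    """Monte Carlo estimate of E[tr(X_n^k)] / n for a symmetric matrix
    with i.i.d. N(0,1) entries scaled by 1/sqrt(n)."""
    vals = []
    for _ in range(reps):
        a = rng.standard_normal((n, n))
        x = np.triu(a) / np.sqrt(n)
        x = x + np.triu(x, 1).T          # symmetrize
        vals.append(np.trace(np.linalg.matrix_power(x, k)) / n)
    return float(np.mean(vals))

# odd moments tend to 0; even moments tend to Catalan numbers (C_2 = 2)
print(trace_moment(200, 3), trace_moment(200, 4))
```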

It remains to determine the even moments. Thus, let \(k \in \mathbb{N}\) be even. Recall that we denoted by \(\mathcal{P}\mathcal{P}(k) \subset \mathcal{P}(k)\) the set of all pair partitions of {1, …, k}. In particular, # π = k ∕ 2 for any \(\pi \in \mathcal{P}\mathcal{P}(k)\). On the other hand, if # π = k ∕ 2 but \(\pi \notin \mathcal{P}\mathcal{P}(k)\), then π has at least one singleton and hence, as in the first case above, the expectation corresponding to the π-consistent sequences vanishes. Consequently,

$$\displaystyle{ \frac{1} {n}\mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\right ] = \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\pi \in \mathcal{P}\mathcal{P}(k)}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] + o(1).\quad }$$
(7)

We have now reduced the original set \(\mathcal{P}(k)\) to the subset \(\mathcal{P}\mathcal{P}(k)\). Next we want to fix a \(\pi \in \mathcal{P}\mathcal{P}(k)\) and concentrate on the set S n (π). The following lemma will help us to calculate that part of (7) which involves non-crossing partitions.

Lemma 1 (cf. [5], Proposition 4.4). 

Let \(S_{n}^{{\ast}}(\pi ) \subseteq S_{n}(\pi )\) denote the set of π-consistent sequences \((P_{1},\ldots,P_{k})\) satisfying

$$\displaystyle{ i \sim _{\pi }j\quad \Longrightarrow\quad q_{i} - p_{i} = p_{j} - q_{j} }$$

for all i≠j. Then, we have

$$\displaystyle{ \#\left (S_{n}(\pi )\setminus S_{n}^{{\ast}}(\pi )\right ) = o\left ({n}^{1+\frac{k} {2} }\right ). }$$

Proof.

We call a pair \((P_{i},P_{j})\) with i ∼  π j, i ≠ j, positive if \(q_{i} - p_{i} = q_{j} - p_{j} > 0\) and negative if \(q_{i} - p_{i} = q_{j} - p_{j} < 0\). Since \(\sum _{i=1}^{k}(q_{i} - p_{i}) = 0\) by consistency, the existence of a negative pair implies the existence of a positive one. Thus, we can assume that any \((P_{1},\ldots,P_{k}) \in S_{n}(\pi )\setminus S_{n}^{{\ast}}(\pi )\) contains a positive pair (P l , P m ). To fix such a sequence, we first determine the positions of l and m, and then fix the signs of the remaining differences q i  − p i . The number of possibilities to accomplish this depends only on k and not on n. Now we choose one of n possible values for p l , and continue with assigning values to the differences | q i  − p i  | for all P i except for P l and P m . Since π is a pair partition, we have at most \({n}^{k/2-1}\) possibilities for that. Then, \(\sum _{i=1}^{k}(q_{i} - p_{i}) = 0\) implies that

$$\displaystyle{ 0 <2(q_{l} - p_{l}) = q_{l} - p_{l} + q_{m} - p_{m} =\sum _{\begin{array}{c}i\in \{1,\ldots,k\}, \\ i\neq l,m \end{array}}(p_{i} - q_{i}). }$$

Since we have already chosen the signs of the differences q i  − p i , i ≠ l, m, as well as their absolute values, we know the value of the sum on the right hand side. Hence, the difference \(q_{l} - p_{l} = q_{m} - p_{m}\) is fixed. We now know the index p l , all differences \(\left \vert q_{i} - p_{i}\right \vert,i \in \left \{1,\ldots,k\right \}\), and their signs. Thus, we can start at P l and go systematically through the whole sequence (P 1, …, P k ) to see that it is uniquely determined. Consequently, our considerations lead to

$$\displaystyle{ \#\left (S_{n}(\pi )\setminus S_{n}^{{\ast}}(\pi )\right ) \leq C{n}^{\frac{k} {2} } = o\left ({n}^{1+\frac{k} {2} }\right ). }$$

 □ 

A consequence of Lemma 1 and relation (6) is the identity

$$\displaystyle{ \frac{1} {n}\mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\right ] = \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\pi \in \mathcal{P}\mathcal{P}(k)}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}^{{\ast}}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] + o(1). }$$
(8)

As already mentioned, the sets S n  ∗ (π) help us to deal with the set \(\mathcal{N}\mathcal{P}\mathcal{P}(k)\) of non-crossing pair partitions.

Lemma 2.

Let \(\pi \in \mathcal{N}\mathcal{P}\mathcal{P}(k)\) . For any \(\left (P_{1},\ldots,P_{k}\right ) \in S_{n}^{{\ast}}(\pi )\) , we have

$$\displaystyle{ \mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = 1. }$$

Proof.

Let l < m with l ∼  π m. Since π is non-crossing, the number m − l − 1 of elements between l and m must be even. If it is positive, there are l < i < j < m with i ∼  π j and j = i + 1. By the properties of S n  ∗ (π), we have \(a_{n}(P_{i}) = a_{n}(P_{j})\), and the sequence \(\left (P_{1},\ldots,P_{l},\ldots,P_{i-1},P_{i+2},\ldots,P_{m},\ldots,P_{k}\right )\) is still consistent. Applying this argument successively, all pairs between l and m can be eliminated and we see that the sequence \(\left (P_{1},\ldots,P_{l},P_{m},\ldots,P_{k}\right )\) is consistent, that is q l  = p m . Then, the identity p l  = q m also holds. In particular, \(a_{n}(P_{l}) = a_{n}(P_{m})\). Since this argument applies for arbitrary l ∼  π m, we obtain

$$\displaystyle{ \mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] =\prod _{\begin{array}{c}l<m, \\ l\sim _{\pi }m \end{array}}\mathbb{E}\left [a_{n}(P_{l})a_{n}(P_{m})\right ] = 1. }$$

 □ 

By Lemma 2, we can conclude that

$$\displaystyle{ \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\pi \in \mathcal{N}\mathcal{P}\mathcal{P}(k)}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}^{{\ast}}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\pi \in \mathcal{N}\mathcal{P}\mathcal{P}(k)}\#S_{n}^{{\ast}}(\pi ). }$$

The following lemma allows us to finally calculate the term on the right hand side.

Lemma 3.

For any \(\pi \in \mathcal{N}\mathcal{P}\mathcal{P}(k)\) , we have

$$\displaystyle{ \lim _{n\rightarrow \infty }\frac{\#S_{n}^{{\ast}}(\pi )} {{n}^{\frac{k} {2} +1}} = 1. }$$

Proof.

Since π is non-crossing, we can find a nearest neighbor pair i ∼  π i + 1. Now fix \((P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi )\), and write \(P_{l} = (p_{l},p_{l+1})\), l = 1, …, k, where k + 1 is identified with 1. Then the properties of S n  ∗ (π) ensure that \((p_{i},p_{i+1}) = (p_{i+2},p_{i+1})\). Hence, we can eliminate the pairs P i , P i + 1 to obtain a sequence \((P_{1}^{(1)},\ldots,P_{k-2}^{(1)}):= (P_{1},\ldots,P_{i-1},P_{i+2},\ldots,P_{k})\) which is still consistent. Denote by π ′ the partition obtained from π by deleting the block \(\{i,i + 1\}\), and relabeling any l ≥ i + 2 to l − 2. Since π is non-crossing, we have \(\pi \prime \in \mathcal{N}\mathcal{P}\mathcal{P}(k - 2)\). Moreover, \((P_{1}^{(1)},\ldots,P_{k-2}^{(1)}) \in S_{n}^{{\ast}}(\pi \prime)\). Thus we see that any \((P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi )\) can be reconstructed from a tuple \((P_{1}^{(1)},\ldots,P_{k-2}^{(1)}) \in S_{n}^{{\ast}}(\pi \prime)\) and a choice of p i + 1. The latter admits n − (k − 2) ∕ 2 possibilities since \(\{i,i + 1\}\) forms a block on its own in π. Consequently,

$$\displaystyle{ \frac{\#S_{n}^{{\ast}}(\pi )} {{n}^{\frac{k} {2} +1}} = \frac{\#S_{n}^{{\ast}}(\pi \prime)} {{n}^{\frac{k} {2} }} + o(1). }$$
(9)

Now if k = 2, we get \(S_{n}^{{\ast}}(\pi ) =\{ ((p,q),(q,p)): p,q \in \{ 1,\ldots,n\}\}\), implying \(\#S_{n}^{{\ast}}(\pi )/{n}^{2} = 1\). For arbitrary even \(k \in \mathbb{N}\), the statement of Lemma 3 then follows by induction using the identity in (9). □ 
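Lemmas 2 and 3 can be verified by exhaustive enumeration for k = 4. The following sketch is our own code; the encoding of \(S_{n}^{{\ast}}(\pi )\) follows the definitions above. It takes the nested non-crossing pair partition \(\{\{1,4\},\{2,3\}\}\), checks that the matrix entries coincide within each block, and checks that \(\#S_{n}^{{\ast}}(\pi )/{n}^{k/2+1}\) is close to 1:

```python
from itertools import product

def s_star(n, pi, k):
    """Enumerate S_n^*(pi): cyclically consistent sequences of index pairs
    whose absolute differences realize exactly the blocks of pi, with
    q_i - p_i = p_j - q_j inside each block (the defining S_n^* condition)."""
    seqs = []
    for p in product(range(1, n + 1), repeat=k):
        P = [(p[i], p[(i + 1) % k]) for i in range(k)]
        d = [b - a for a, b in P]
        ok = True
        for i in range(k):
            for j in range(i + 1, k):
                same = any({i + 1, j + 1} <= b for b in pi)
                if same != (abs(d[i]) == abs(d[j])) or (same and d[i] != -d[j]):
                    ok = False
        if ok:
            seqs.append(P)
    return seqs

k, n = 4, 20
pi = [{1, 4}, {2, 3}]            # the nested non-crossing pair partition
seqs = s_star(n, pi, k)
# Lemma 2: within each block the two matrix entries coincide
for P in seqs:
    for b in pi:
        i, j = sorted(b)
        assert set(P[i - 1]) == set(P[j - 1])
# Lemma 3: #S_n^*(pi) / n^(k/2 + 1) approaches 1
ratio = len(seqs) / n ** (k // 2 + 1)
assert 0.85 < ratio <= 1.0
```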

Taking into account the relation \(\#\mathcal{N}\mathcal{P}\mathcal{P}(k) = C_{k/2}\), we now arrive at

$$\displaystyle\begin{array}{rcl} & & \frac{1} {n}\mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\right ] \\ & & = C_{\frac{k} {2} } + \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\pi \in \mathcal{C}\mathcal{P}\mathcal{P}(k)}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}^{{\ast}}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] + o(1),\quad {}\end{array}$$
(10)

with \(\mathcal{C}\mathcal{P}\mathcal{P}(k)\) being the set of all crossing pair partitions of \(\{1,\ldots,k\}\). At this point, we have to distinguish between Theorems 1 and 2. Indeed, to obtain the semicircle law, we need to show that the sum over all crossing partitions is negligible in the limit. However, the limiting distribution in Theorem 2 indicates that we do have a contribution.
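The count \(\#\mathcal{N}\mathcal{P}\mathcal{P}(k) = C_{k/2}\) used above is easy to confirm computationally. A small sketch (our own helper names), which enumerates all pair partitions and filters out the crossing ones:

```python
from math import comb

def pair_partitions(elems):
    """Yield all pair partitions of a list with an even number of elements."""
    if not elems:
        yield []
        return
    first, rest = elems[0], elems[1:]
    for idx, partner in enumerate(rest):
        for sub in pair_partitions(rest[:idx] + rest[idx + 1:]):
            yield [(first, partner)] + sub

def is_crossing(pp):
    """A pair partition is crossing iff two blocks {a,b}, {c,d} interleave."""
    return any(a < c < b < d for a, b in pp for c, d in pp)

for k in (2, 4, 6, 8):
    pps = list(pair_partitions(list(range(1, k + 1))))
    non_crossing = [p for p in pps if not is_crossing(p)]
    catalan = comb(k, k // 2) // (k // 2 + 1)    # C_{k/2}
    assert len(non_crossing) == catalan
```

The total number of pair partitions is (k − 1)!!, of which only the Catalan number \(C_{k/2}\) are non-crossing; the difference is exactly the set \(\mathcal{C}\mathcal{P}\mathcal{P}(k)\) appearing in (10).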

6.2 Convergence of the Expected Empirical Spectral Distribution in Theorem 1

The convergence of the expected empirical spectral distribution to the semicircle distribution follows directly from relation (10) and

Lemma 4.

For any crossing \(\pi \in \mathcal{C}\mathcal{P}\mathcal{P}(k)\) , we have

$$\displaystyle{ \sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}^{{\ast}}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = o\left ({n}^{\frac{k} {2} +1}\right ). }$$

Proof.

Let π be crossing and consider a sequence \((P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi )\). Write \(P_{l} = (p_{l},p_{l+1})\). Note that if there is an \(l \in \{ 1,\ldots,k\}\) with \(l \sim _{\pi }l + 1\), where k + 1 is identified with 1, we immediately have \(a_{n}(P_{l}) = a_{n}(P_{l+1})\). In particular,

$$\displaystyle{ \mathbb{E}\left [a_{n}(P_{l})a_{n}(P_{l+1})\right ] = 1. }$$

The sequence \((P_{1}^{(1)},\ldots,P_{k-2}^{(1)}):= (P_{1},\ldots,P_{l-1},P_{l+2},\ldots,P_{k})\) is still consistent, and

$$\displaystyle{ \mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = \mathbb{E}\left [a_{n}(P_{1}^{(1)})\cdots a_{ n}(P_{k-2}^{(1)})\right ]. }$$

Define \({\pi }^{(1)} \in \mathcal{C}\mathcal{P}\mathcal{P}(k - 2)\) to be the pair partition induced by π after eliminating the indices l and l + 1. In particular, \((P_{1}^{(1)},\ldots,P_{k-2}^{(1)}) \in S_{n}^{{\ast}}{(\pi }^{(1)})\). Since there are at most n choices for p l + 1 when \((P_{1}^{(1)},\ldots,P_{k-2}^{(1)})\) is fixed, we have for any \((Q_{1},\ldots,Q_{k-2}) \in S_{n}^{{\ast}}{(\pi }^{(1)})\),

$$\displaystyle{ \#\{(P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi ): (P_{ 1}^{(1)},\ldots,P_{ k-2}^{(1)}) = (Q_{ 1},\ldots,Q_{k-2})\} \leq n. }$$

Let r denote the maximum number of pairs of indices that can be eliminated in this way. Since π is crossing, there are at least two pairs left and hence, r ≤ k ∕ 2 − 2. Define \({\pi }^{(r)} \in \mathcal{C}\mathcal{P}\mathcal{P}(k - 2r)\) and \((P_{1}^{(r)},\ldots,P_{k-2r}^{(r)}) \in S_{n}^{{\ast}}{(\pi }^{(r)})\) to be the partition and the sequence left after this elimination. By induction, we conclude that for any \((Q_{1},\ldots,Q_{k-2r}) \in S_{n}^{{\ast}}{(\pi }^{(r)})\), we have the estimate

$$\displaystyle{ \#\{(P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi ): (P_{ 1}^{(r)},\ldots,P_{ k-2r}^{(r)}) = (Q_{ 1},\ldots,Q_{k-2r})\} \leq {n}^{r}. }$$

Since \(\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = \mathbb{E}[a_{n}(P_{1}^{(r)})\cdots a_{n}(P_{k-2r}^{(r)})]\), we obtain

$$\displaystyle\begin{array}{rcl} & & \sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}^{{\ast}}(\pi )}\left \vert \mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ]\right \vert \\ & &\qquad \qquad \qquad \leq {n}^{r}\sum _{ (Q_{1},\ldots,Q_{k-2r})\in S_{n}^{{\ast}}{(\pi }^{(r)})}\left \vert \mathbb{E}\left [a_{n}(Q_{1})\cdots a_{n}(Q_{k-2r})\right ]\right \vert.\qquad {}\end{array}$$
(11)

Choose \(i \sim _{{\pi }^{(r)}}i + j\) such that j is minimal. For any sequence \((Q_{1},\ldots,Q_{k-2r}) \in S_{n}^{{\ast}}{(\pi }^{(r)})\), put \(Q_{l} = (q_{l},q_{l+1})\), l = 1, …, k − 2r. We want to count the number of such sequences given that q i and q i + j + 1 are fixed. To this end, we start by choosing one of n possible values for q i + 1. Then, the fact that i is equivalent to i + j ensures that we can also deduce the value of

$$\displaystyle{ q_{i+j} = q_{i+1} - q_{i} + q_{i+j+1}. }$$

Hence, Q i and Q i + j are fixed. Since j is minimal, any element in \(\{i + 1,\ldots,i + j - 1\}\) is equivalent to some element outside the set \(\{i,\ldots,i + j\}\). There are n possibilities to fix \(Q_{i+1} = (q_{i+1},q_{i+2})\) because \(q_{i+1}\) is already fixed. Proceeding sequentially, we have n possibilities for the choice of any pair Q l with \(l \in \{ i + 2,\ldots,i + j - 2\}\), and there is only one choice for Q i + j − 1 since q i + j is already chosen. We thus made \({n}^{j-2}\) choices to fix all pairs Q l , \(l \in \{ i + 1,\ldots,i + j - 1\}\). For any Q l with \(l \in \{ 1,\ldots,k - 2r\}\setminus \{i,\ldots,i + j\}\), there are at most n possibilities if it is not equivalent to one pair that has already been chosen. Otherwise, there is only one possibility. Since there are k ∕ 2 − r − j new equivalence classes left, we have at most \({n}^{k/2-r-j}\) choices for those pairs. Hence, assuming that the elements q i and q i + j + 1 are fixed, we have at most

$$\displaystyle{ n{n}^{j-2}{n}^{\frac{k} {2} -r-j} = {n}^{\frac{k} {2} -r-1} }$$

possibilities to choose the rest of the sequence \((Q_{1},\ldots,Q_{k-2r}) \in S_{n}^{{\ast}}{(\pi }^{(r)})\). Note that \(\vert \mathbb{E}[a_{n}(Q_{l})a_{n}(Q_{m})]\vert \leq {(\mathbb{E}[a_{n}{(Q_{l})}^{2}])}^{1/2}{(\mathbb{E}[a_{n}{(Q_{m})}^{2}])}^{1/2} = 1\). Since π (r) is a pair partition, we thus get

$$\displaystyle{ \left \vert \mathbb{E}\left [a_{n}(Q_{1})\cdots a_{n}(Q_{k-2r})\right ]\right \vert \leq \vert \mathbb{E}[a_{n}(Q_{i})a_{n}(Q_{i+j})]\vert. }$$

By assumption (A3), the expectation on the right hand side depends only on the absolute value of the difference

$$\displaystyle{ \min \{q_{i},q_{i+1}\} -\min \{ q_{i+j},q_{i+j+1}\} =\max \{ q_{i},q_{i+1}\} -\max \{ q_{i+j},q_{i+j+1}\}. }$$

Now the definition of \(S_{n}^{{\ast}}{(\pi }^{(r)})\) ensures that \(q_{i} - q_{i+1} = q_{i+j+1} - q_{i+j}\). In particular, \(\min \{q_{i},q_{i+1}\} -\min \{ q_{i+j},q_{i+j+1}\} = q_{i} - q_{i+j+1} = q_{i+j} - q_{i+1}\), and

$$\displaystyle{ \mathbb{E}[a_{n}(Q_{i})a_{n}(Q_{i+j})] = c_{n}(\vert q_{i} - q_{i+j+1}\vert ). }$$

Consequently, estimating the term in (11) further, we obtain

$$\displaystyle\begin{array}{rcl} \sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}^{{\ast}}(\pi )}\left \vert \mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ]\right \vert & \leq & {n}^{\frac{k} {2} -1}\sum _{q_{ i},q_{i+j+1}=1}^{n}c_{ n}(\vert q_{i+j+1} - q_{i}\vert ) {}\\ & \leq & C\ {n}^{\frac{k} {2} }\sum _{\tau =0}^{n-1}c_{n}(\tau ) = o\left ({n}^{\frac{k} {2} +1}\right ), {}\\ \end{array}$$

since \(\sum _{\tau =0}^{n-1}c_{n}(\tau ) = o(n)\) by condition (A4). □ 

6.3 Convergence of the Expected Empirical Spectral Distribution in Theorem 2

We again start with the identity in (10). Since we consider only pair partitions, we know that the expectation on the right hand side is of the form

$$\displaystyle{ \mathbb{E}\left [a_{n}(p_{1},q_{1})a_{n}(p_{1} +\tau _{1},q_{1} +\tau _{1})\right ]\cdots \mathbb{E}\left [a_{n}(p_{r},q_{r})a_{n}(p_{r} +\tau _{r},q_{r} +\tau _{r})\right ], }$$

for r: = k ∕ 2 and some choices of \(p_{1},q_{1},\tau _{1},\ldots,p_{r},q_{r},\tau _{r} \in \mathbb{N}\). In order to calculate this expectation, assumption (A3) indicates that we only need to distinguish for any i = 1, …, r, whether we have τ i  = 0 or not. In the first case, we get the identity \(\mathbb{E}\left [a_{n}(p_{i},q_{i})a_{n}(p_{i} +\tau _{i},q_{i} +\tau _{i})\right ] = 1\), in the second we can conclude that \(\mathbb{E}\left [a_{n}(p_{i},q_{i})a_{n}(p_{i} +\tau _{i},q_{i} +\tau _{i})\right ] = c_{n}\). Fix some pair partition \(\pi \in \mathcal{P}\mathcal{P}(k)\), and take \((P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi )\). Motivated by these considerations, we put \(P_{i} = (p_{i},q_{i})\), and define

$$\displaystyle\begin{array}{rcl} m\left (P_{1},\ldots,P_{k}\right ):& =& \#\{1 \leq i <j \leq k: a_{n}(P_{i}) = a_{n}(P_{j})\} {}\\ & =& \#\{1 \leq i <j \leq k: (p_{i},q_{i}) = (q_{j},p_{j})\}. {}\\ \end{array}$$

Obviously, we have \(0 \leq m\left (P_{1},\ldots,P_{k}\right ) \leq k/2\). With this notation, we find that

$$\displaystyle{ \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}^{{\ast}}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{l=0}^{k/2}c_{ n}^{\frac{k} {2} -l}\#A_{n}^{(l)}\left (\pi \right ), }$$
(12)

where

$$\displaystyle{ A_{n}^{(l)}\left (\pi \right ):=\{ \left (P_{ 1},\ldots,P_{k}\right ) \in S_{n}^{{\ast}}(\pi ): m\left (P_{ 1},\ldots,P_{k}\right ) = l\}. }$$

The following lemma states that if a pair P i , P j contributes to m(P 1, , P k ), then we can assume that the block \(\{i,j\}\) in π is not crossed by any other block.

Lemma 5.

Let \(\pi \in \mathcal{P}\mathcal{P}(k)\) and fix i ∼ π j, i < j. Define

$$\displaystyle{ S_{n}^{{\ast}}(\pi;i,j):=\{ \left (P_{ 1},\ldots,P_{k}\right ) \in S_{n}^{{\ast}}(\pi ): P_{ i} = (p_{i},q_{i}),P_{j} = (p_{j},q_{j}),p_{i} = q_{j},q_{i} = p_{j}\}. }$$

Assume that there is some i′∼ π j′ such that i < i′ < j, and either j′ < i or j < j′. Then,

$$\displaystyle{ \#S_{n}^{{\ast}}(\pi;i,j) = o\left ({n}^{\frac{k} {2} +1}\right ). }$$

Proof.

To fix some \(\left (P_{1},\ldots,P_{k}\right ) \in S_{n}^{{\ast}}(\pi;i,j)\), we first choose a value for p i  = q j and q i  = p j . This allows for at most \({n}^{2}\) possibilities. Hence, P i and P j are fixed. Now consider the pairs P i + 1, …, P i′ − 1. The element p i + 1 is uniquely determined by consistency. For q i + 1, there are at most n choices. Then, p i + 2 = q i + 1. If i + 2 ∼  π i + 1, we have one choice for q i + 2. Otherwise, there are at most n. Proceeding in the same way, we see that we have n possibilities whenever we start a new equivalence class. Similarly, starting from the fixed pair P j , we can assign values to the pairs \(P_{j-1},\ldots,P_{i\prime+1}\) in this order. Now P i′ is determined by consistency. When fixing \(P_{i-1},\ldots,P_{1},P_{k},\ldots,P_{j+1}\), we again have n choices for any new equivalence class. To sum up, we are left with at most

$$\displaystyle{ {n}^{2}{n}^{\frac{k} {2} -2} = {n}^{\frac{k} {2} } }$$

possible values for an element in S n  ∗ (π; i, j). □ 

Recall Definition 1 where we introduced the notion of the height h(π) of a pair partition π. Lemma 5 in particular implies that only those \(\left (P_{1},\ldots,P_{k}\right ) \in S_{n}^{{\ast}}(\pi )\) with

$$\displaystyle{ 0 \leq m\left (P_{1},\ldots,P_{k}\right ) \leq h(\pi ) }$$

contribute to the limit of (12). Indeed, if \(m(P_{1},\ldots,P_{k})> h(\pi )\), we can find some i ∼  π j, i < j, such that \((P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi;i,j)\) and neither j = i + 1 nor is the restriction of π to \(\{i + 1,\ldots,j - 1\}\) a pair partition. Hence, the crossing property in Lemma 5 is satisfied, and (P 1, , P k ) is contained in a set that is negligible in the limit. The identity in (12) thus becomes

$$\displaystyle{ \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}^{{\ast}}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{l=0}^{h(\pi )}c_{ n}^{\frac{k} {2} -l}\#B_{n}^{(l)}(\pi ) + o(1), }$$

where

$$\displaystyle\begin{array}{rcl} & & B_{n}^{(l)}(\pi ):= \left \{\left (P_{ 1},\ldots,P_{k}\right ) \in S_{n}^{{\ast}}(\pi ): m\left (P_{ 1},\ldots,P_{k}\right ) = l;\right. {}\\ & & \qquad \qquad \left.a_{n}(P_{i}) = a_{n}(P_{j}),i <j\ \Rightarrow \ j = i + 1\ \text{or}\ \pi \vert _{\{i+1,\ldots,j-1\}}\ \text{is a pair partition}\right \}. {}\\ \end{array}$$

In the next step, we want to simplify the expression above further by showing that \(B_{n}^{(l)}(\pi ) = \varnothing\) whenever \(0 \leq l <h(\pi )\). This is ensured by

Lemma 6.

Let \(\pi \in \mathcal{P}\mathcal{P}(k)\) . For any \((P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi )\) , we have

$$\displaystyle{ m(P_{1},\ldots,P_{k}) \geq h(\pi ). }$$

Proof.

If h(π) = 0, there is nothing to prove. Thus, suppose that h(π) ≥ 1 and take some i ∼  π j, i < j, such that either j = i + 1 or j − i − 1 ≥ 2 is even and the restriction of π to \(\{i + 1,\ldots,j - 1\}\) is a pair partition. Fix \((P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi )\), and write \(P_{l} = (p_{l},p_{l+1})\) for any l = 1, …, k. We need to verify that \(p_{i+1} = p_{j}\). If we achieve this, the definition of S n  ∗ (π) will also ensure that p i  = p j + 1. As a consequence, the π-block \(\{i,j\}\) will contribute to \(m(P_{1},\ldots,P_{k})\). Since there are h(π) such blocks, we will obtain \(m(P_{1},\ldots,P_{k}) \geq h(\pi )\) for any choice of \((P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi )\).

If j = i + 1, we immediately obtain p i + 1 = p j . To show this property in the second case, note that the sequence \((P_{i+1},\ldots,P_{j-1})\) solves the following system of equations:

$$\displaystyle\begin{array}{rcl} p_{i+2} - p_{i+1} + p_{l_{1}+1} - p_{l_{1}}& = & 0,\quad \text{if}\ i + 1 \sim _{\pi }l_{1}, {}\\ p_{i+3} - p_{i+2} + p_{l_{2}+1} - p_{l_{2}}& = & 0,\quad \text{if}\ i + 2 \sim _{\pi }l_{2}, {}\\ & \vdots & {}\\ p_{i+m+1} - p_{i+m} + p_{l_{m}+1} - p_{l_{m}}& = & 0,\quad \text{if}\ i + m \sim _{\pi }l_{m}, {}\\ & \vdots & {}\\ p_{j} - p_{j-1} + p_{l_{j-i-1}+1} - p_{l_{j-i-1}}& = & 0,\quad \text{if}\ j - 1 \sim _{\pi }l_{j-i-1}. {}\\ \end{array}$$

Start with solving the first equation for p i + 2 which yields

$$\displaystyle{ p_{i+2} = p_{i+1} - p_{l_{1}+1} + p_{l_{1}}. }$$

Then, insert this in the second equation, and solve it for p i + 3 to obtain

$$\displaystyle{ p_{i+3} = p_{i+1} - p_{l_{1}+1} + p_{l_{1}} - p_{l_{2}+1} + p_{l_{2}}. }$$

In the (j − i − 1)-th step, we substitute \(p_{j-1} = p_{i+(j-i-1)}\) in the (j − i − 1)-th equation, and solve it for \(p_{j} = p_{i+(j-i-1)+1}\). We then have

$$\displaystyle{ p_{j} = p_{i+1} -\sum _{m=1}^{j-i-1}(p_{ l_{m}+1} - p_{l_{m}}). }$$

Since the restriction of π to \(\{i + 1,\ldots,j - 1\}\) is a pair partition, we can conclude that the sets \(\{l_{1},\ldots,l_{j-i-1}\}\) and \(\{i + 1,\ldots,j - 1\}\) are equal. Hence, we obtain \(\sum _{m=1}^{j-i-1}(p_{l_{m}+1} - p_{l_{m}}) = p_{j} - p_{i+1}\), implying p j  = p i + 1. □ 

With the help of Lemma 6, we thus arrive at

$$\displaystyle{ \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{(P_{1},\ldots,P_{k})\in S_{n}^{{\ast}}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = \frac{\#B_{n}^{(h(\pi ))}(\pi )} {{n}^{\frac{k} {2} +1}} \ c_{n}^{\frac{k} {2} -h(\pi )} + o(1). }$$

Note that any element \((P_{1},\ldots,P_{k}) \in S_{n}^{{\ast}}(\pi )\) satisfying the condition

$$\displaystyle{ a_{n}(P_{i}) = a_{n}(P_{j}),\ i <j\quad \Rightarrow \quad j = i + 1\ \text{or}\ \pi \vert _{\{i+1,\ldots,j-1\}}\ \text{is a pair partition}, }$$
(13)

fulfills the condition \(m(P_{1},\ldots,P_{k}) = h(\pi )\) as well. Indeed, (13) guarantees that \(m(P_{1},\ldots,P_{k}) \leq h(\pi )\), and Lemma 6 ensures that \(m(P_{1},\ldots,P_{k}) \geq h(\pi )\). Thus, we can write

$$\displaystyle\begin{array}{rcl} & & B_{n}^{(h(\pi ))}(\pi ) = \left \{\left (P_{ 1},\ldots,P_{k}\right ) \in S_{n}^{{\ast}}(\pi ):\right. {}\\ & & \quad \left.a_{n}(P_{i}) = a_{n}(P_{j}),i <j\ \Rightarrow \ j = i + 1\ \text{or}\ \pi \vert _{\{i+1,\ldots,j-1\}}\ \text{is a pair partition}\right \}. {}\\ \end{array}$$

Now any element in the complement of \(B_{n}^{(h(\pi ))}(\pi )\) satisfies for some i ∼  π j the crossing assumption in Lemma 5. This yields

$$\displaystyle{ \frac{\#{\left (B_{n}^{(h(\pi ))}(\pi )\right )}^{c}} {{n}^{\frac{k} {2} +1}} = o(1). }$$

Since \(B_{n}^{(h(\pi ))}(\pi ) \cup {\left (B_{n}^{(h(\pi ))}(\pi )\right )}^{c} = S_{n}^{{\ast}}(\pi )\), we obtain that

$$\displaystyle{ \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}^{{\ast}}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = \frac{\#S_{n}^{{\ast}}(\pi )} {{n}^{\frac{k} {2} +1}} \ c_{n}^{\frac{k} {2} -h(\pi )} + o(1). }$$
(14)

To calculate the limit on the right-hand side, we have

Lemma 7 (cf. [5], Lemma 4.6). 

For any \(\pi \in \mathcal{P}\mathcal{P}(k)\) , it holds that

$$\displaystyle{ \lim _{n\rightarrow \infty }\frac{\#S_{n}^{{\ast}}(\pi )} {{n}^{\frac{k} {2} +1}} = p_{T}(\pi ), }$$

where p T (π) is the Toeplitz volume defined by solving the system of equations (2).

Proof.

Fix \(\pi \in \mathcal{P}\mathcal{P}(k)\). Note that if \(P =\{ (p_{i},p_{i+1}),i = 1,\ldots,k\} \in S_{n}^{{\ast}}(\pi )\), then \(x_{0},x_{1},\ldots,x_{k}\) with \(x_{i} = p_{i+1}/n\) is a solution of the system of equations (2). On the other hand, if \(x_{0},x_{1},\ldots,x_{k} \in \{ 1/n,2/n,\ldots,1\}\) is a solution of (2) and \(p_{i+1} = nx_{i}\), then either \(\{(p_{i},p_{i+1}),i = 1,\ldots,k\} \in S_{n}^{{\ast}}(\pi )\) or \(\{(p_{i},p_{i+1}),i = 1,\ldots,k\} \in S_{n}(\eta )\) for some partition \(\eta \in \mathcal{P}(k)\) such that \(i \sim _{\pi }j \Rightarrow i \sim _{\eta }j\), but # η < # π.

In (2), we have k + 1 variables and only k ∕ 2 equations. Denote the k ∕ 2 + 1 undetermined variables by y 1, …, y k ∕ 2 + 1. We thus need to assign values from the set \(\{1/n,2/n,\ldots,1\}\) to y 1, …, y k ∕ 2 + 1, and then to calculate the remaining k ∕ 2 variables from the equations. Since the latter are also supposed to be in the range \(\{1/n,2/n,\ldots,1\}\), it might happen that not all values for the undetermined variables are admissible. Let p n (π) denote the admissible fraction of the \({n}^{k/2+1}\) choices for \(y_{1},\ldots,y_{k/2+1}\). By our remark at the beginning of the proof and estimate (5), we have that

$$\displaystyle{ \lim _{n\rightarrow \infty }\frac{\#S_{n}^{{\ast}}(\pi )} {{n}^{\frac{k} {2} +1}} =\lim _{n\rightarrow \infty }p_{n}(\pi ), }$$

if the limits exist. Now we can interpret \(y_{1},\ldots,y_{k/2+1}\) as independent random variables with a uniform distribution on \(\{1/n,2/n,\ldots,1\}\). Then, p n (π) is the probability that the computed values stay within the interval (0, 1]. As n → ∞, \(y_{1},\ldots,y_{k/2+1}\) converge in law to independent random variables uniformly distributed on [0, 1]. Hence, p n (π) → p T (π). □ 
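For k = 4 the only crossing pair partition is \(\{\{1,3\},\{2,4\}\}\), and the limit in Lemma 7 can be estimated by exactly the randomization used in the proof. In the sketch below (our own code; we assume, consistently with the system displayed in the proof of Lemma 6, that the equations in (2) take the form \(x_{i} - x_{i-1} + x_{j} - x_{j-1} = 0\) for i ∼  π j), the free variables are x 0 , x 1 , x 2 , and the only constraint that can fail is \(x_{3} = x_{0} - x_{1} + x_{2} \in [0,1]\):

```python
import random

random.seed(1)

def toeplitz_volume_crossing_k4(samples=200_000):
    """Monte Carlo estimate of p_T(pi) for the crossing pair partition
    {{1,3},{2,4}} of {1,...,4}.  Under the assumed form of system (2),
    the determined variables are x_3 = x_0 - x_1 + x_2 and x_4 = x_0,
    so p_T is the probability that x_3 lands in [0, 1] when the free
    variables are i.i.d. uniform on [0, 1]."""
    hits = 0
    for _ in range(samples):
        x0, x1, x2 = (random.random() for _ in range(3))
        if 0.0 <= x0 - x1 + x2 <= 1.0:
            hits += 1
    return hits / samples

est = toeplitz_volume_crossing_k4()
assert abs(est - 2 / 3) < 0.01   # the exact Toeplitz volume here is 2/3
```

The value 2∕3 matches the known fourth moment 8∕3 of the limiting Toeplitz distribution: \(C_{2} + 2/3 = 8/3\).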

Applying Lemma 7 and assumption (A4) to Eq. (14), we arrive at

$$\displaystyle{ \lim _{n\rightarrow \infty } \frac{1} {{n}^{\frac{k} {2} +1}}\sum _{\left (P_{1},\ldots,P_{k}\right )\in S_{n}^{{\ast}}(\pi )}\mathbb{E}\left [a_{n}(P_{1})\cdots a_{n}(P_{k})\right ] = p_{T}(\pi ){c}^{\frac{k} {2} -h(\pi )}. }$$

Substituting this result in (10), we find that for any even \(k \in \mathbb{N}\),

$$\displaystyle{ \lim _{n\rightarrow \infty }\frac{1} {n}\mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\right ] = C_{\frac{ k} {2} } +\sum _{\pi \in \mathcal{C}\mathcal{P}\mathcal{P}(k)}p_{T}(\pi ){c}^{\frac{k} {2} -h(\pi )}. }$$

To obtain the alternative expression for the even moments in Theorem 2, note that the considerations above were not restricted to crossing partitions. In particular, we can start from identity (8) instead of (10) to see that

$$\displaystyle{ \lim _{n\rightarrow \infty }\frac{1} {n}\mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\right ] =\lim _{ n\rightarrow \infty }\sum _{\pi \in \mathcal{P}\mathcal{P}(k)}\frac{\#S_{n}^{{\ast}}(\pi )} {{n}^{\frac{k} {2} +1}} \ c_{n}^{\frac{k} {2} -h(\pi )} =\sum _{\pi \in \mathcal{P}\mathcal{P}(k)}p_{T}(\pi ){c}^{\frac{k} {2} -h(\pi )}. }$$

6.4 Almost Sure Convergence

The aim of this part of the proof is to show almost sure convergence in both situations, that of Theorem 1 and that of Theorem 2. To this end, we follow the ideas used in [5], Proposition 4.9, to verify the same for Toeplitz matrices. This is possible since the arguments only rely on the fact that the diagonals are independent; the particular dependency structure within the diagonals plays no role. We start by proving a concentration inequality for the moments of the spectral measure. The resulting bound will enable us to apply the Borel-Cantelli lemma.

Lemma 8.

Suppose that conditions (A1) and (A2) hold. Then, for any \(k,n \in \mathbb{N}\),

$$\displaystyle{ \mathbb{E}\left [{\left (\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right ) - \mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{ n}^{k}\right )\right ]\right )}^{4}\right ] \leq C\ {n}^{2}. }$$

Proof.

Fix \(k,n \in \mathbb{N}\). Using the notation

$$\displaystyle{ P = (P_{1},\ldots,P_{k}) = ((p_{1},q_{1}),\ldots,(p_{k},q_{k})),\qquad a_{n}(P) = a_{n}(P_{1})\cdots a_{n}(P_{k}), }$$

we have that

$$\displaystyle\begin{array}{rcl} & & \mathbb{E}\left [{\left (\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right ) - \mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{ n}^{k}\right )\right ]\right )}^{4}\right ] \\ & & \quad = \frac{1} {{n}^{2k}}\sum _{{\pi }^{(1)},\ldots {,\pi }^{(4)}\in \mathcal{P}(k)}\sum _{ \begin{array}{c}{P}^{(i)}\in S_{ n}\left( {\pi }^{(i)}\right ), \\ i=1,\ldots,4 \end{array}}\mathbb{E}\Big[\prod _{j=1}^{4}\left (a_{ n}({P}^{(j)}) - \mathbb{E}\left [a_{ n}({P}^{(j)})\right ]\right )\Big].\quad {}\end{array}$$
(15)

Now consider a partition \(\boldsymbol{\pi }\) of \(\left \{1,\ldots,4k\right \}\). We say that a sequence \(({P}^{(1)},\ldots,{P}^{(4)})\) is π-consistent if each P (i), i = 1, …, 4, is a consistent sequence and

$$\displaystyle{ \big\vert q_{l}^{(i)} - p_{ l}^{(i)}\big\vert \ =\ \left \vert q_{ m}^{(j)} - p_{ m}^{(j)}\right \vert \quad \Longleftrightarrow\quad l + (i - 1)k\ \sim _{\boldsymbol{\pi }}\ m + (j - 1)k. }$$

Let \(\mathcal{S}_{n}(\pi )\) denote the set of all π-consistent sequences with entries in \(\left \{1,\ldots,n\right \}\). Then, (15) becomes

$$\displaystyle\begin{array}{rcl} & & \mathbb{E}\left [{\left (\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right ) - \mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{ n}^{k}\right )\right ]\right )}^{4}\right ] \\ & & \qquad = \frac{1} {{n}^{2k}}\sum _{\pi \in \mathcal{P}(4k)}\sum _{({P}^{(1)},\ldots,{P}^{(4)})\in \mathcal{S}_{n}\left (\pi \right )}\mathbb{E}\Big[\prod _{j=1}^{4}\left (a_{ n}({P}^{(j)}) - \mathbb{E}\left [a_{ n}({P}^{(j)})\right ]\right )\Big].\quad {}\end{array}$$
(16)

We want to analyze the expectation on the right hand side. Therefore, fix a \(\pi \in \mathcal{P}(4k)\). We call π a matched partition if

  (i) Any block of π contains at least two elements,

  (ii) For any \(i \in \{ 1,\ldots,4\}\), there are j ≠ i and \(l,m \in \{ 1,\ldots,k\}\) with

    $$\displaystyle{ l + (i - 1)k\ \sim _{\boldsymbol{\pi }}\ m + (j - 1)k. }$$

In case π does not satisfy (i), we have a singleton \(\{l + (i - 1)k\}\), implying that \(\mathbb{E}[a_{n}({P}^{(i)})] = 0\). As a consequence,

$$\displaystyle{ \sum _{({P}^{(1)},\ldots,{P}^{(4)})\in \mathcal{S}_{n}(\pi )}\mathbb{E}\Big[\prod _{j=1}^{4}\left (a_{ n}({P}^{(j)}) - \mathbb{E}\left [a_{ n}({P}^{(j)})\right ]\right )\Big] = 0. }$$
(17)

If π does not satisfy (ii), we can conclude that for some \(i \in \{ 1,\ldots,4\}\), the term \(a_{n}({P}^{(i)}) - \mathbb{E}\left [a_{n}({P}^{(i)})\right ]\) is independent of \(a_{n}({P}^{(j)}) - \mathbb{E}\left [a_{n}({P}^{(j)})\right ]\), j ≠ i. Thus, (17) holds in this case as well. To sum up, we only have to consider matched partitions to evaluate the sum in (16). Let \(\boldsymbol{\pi }\) be such a partition and denote by r = # π the number of blocks of π. Note that condition (i) implies r ≤ 2k. We want to count all π-consistent sequences \(({P}^{(1)},\ldots,{P}^{(4)})\). To this end, first choose one of at most \({n}^{r}\) possibilities to fix the r different equivalence classes. Afterwards, we fix the elements \(p_{1}^{(1)},\ldots,p_{1}^{(4)}\), which can be done in \({n}^{4}\) ways. Since now the differences \(\vert q_{l}^{(i)} - p_{l}^{(i)}\vert\) are uniquely determined by the choice of the corresponding equivalence classes, we can proceed sequentially to see that there are at most two choices left for any pair P l (i). To sum up, we have at most

$$\displaystyle{ {2}^{4k}{n}^{4}{n}^{r} = C\ {n}^{r+4} }$$

possibilities to choose \(({P}^{(1)},\ldots,{P}^{(4)})\). If now r ≤ 2k − 2, we can conclude that

$$\displaystyle{ \#\mathcal{S}_{n}(\pi ) \leq C\ {n}^{2k+2}. }$$
(18)

It remains to consider the cases r = 2k − 1 and r = 2k. To begin with, let r = 2k − 1. Then, either two equivalence classes contain three elements each, or one equivalence class contains four. Since π is matched, there must exist an \(i \in \left \{1,\ldots,4\right \}\) and an \(l \in \left \{1,\ldots,k\right \}\) such that \(P_{l}^{(i)}\) is not equivalent to any other pair in the sequence \({P}^{(i)}\). Without loss of generality, we can assume that i = 1. In contrast to the construction of \(({P}^{(1)},\ldots,{P}^{(4)})\) above, we now alter our procedure as follows: We fix all equivalence classes except the one that \(P_{l}^{(1)}\) belongs to. There are \({n}^{r-1}\) possibilities to accomplish that. Now we choose again one of \({n}^{4}\) possible values for \(p_{1}^{(1)},\ldots,p_{1}^{(4)}\). Hereafter, we fix \(q_{m}^{(1)}\), \(m = 1,\ldots,l - 1\), and then start from \(q_{k}^{(1)} = p_{1}^{(1)}\) to go backwards and obtain the values of \(p_{k}^{(1)},\ldots,p_{l+1}^{(1)}\). Each of these steps leaves at most two choices to us, that is at most \({2}^{k-1}\) choices in total. But now, \(P_{l}^{(1)}\) is uniquely determined since \(p_{l}^{(1)} = q_{l-1}^{(1)}\) and \(q_{l}^{(1)} = p_{l+1}^{(1)}\) by consistency. Thus, we had to make one choice less than before, implying (18).

Now, let r = 2k. In this case, each equivalence class has exactly two elements. Since we consider a matched partition, we can find here as well an \(l \in \left \{1,\ldots,k\right \}\) such that \(P_{l}^{(1)}\) is not equivalent to any other pair in the sequence \({P}^{(1)}\). In addition, we also have an \(m \in \left \{1,\ldots,k\right \}\) such that, possibly after relabeling, \(P_{m}^{(2)}\) is neither equivalent to any pair in \({P}^{(1)}\) nor to any other pair in \({P}^{(2)}\). Thus, we can use the same argument as before to see that this time, the number of choices can be reduced to at most \(C\ {n}^{r+2} = C\ {n}^{2k+2}\). In conclusion, (18) holds for any matched partition π. Summing up our results, we obtain that

$$\displaystyle\begin{array}{rcl} & & \mathbb{E}\left [{\left (\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right ) - \mathbb{E}\left [\mathrm{tr}\left (\mathbf{X}_{ n}^{k}\right )\right ]\right )}^{4}\right ] {}\\ & & \quad = \frac{1} {{n}^{2k}}\sum _{\stackrel{\pi \in \mathcal{P}(4k),}{\pi \ \text{matched}}}\sum _{({P}^{(1)},\ldots,{P}^{(4)})\in \mathcal{S}_{n}(\pi )}\mathbb{E}\Big[\prod _{j=1}^{4}\left (a_{ n}({P}^{(j)}) - \mathbb{E}\left [a_{ n}({P}^{(j)})\right ]\right )\Big] \leq C\ {n}^{2},{}\\ \end{array}$$

which is the statement of Lemma 8. □ 
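As a plausibility check (not part of the proof), the bound of Lemma 8 can be probed numerically for Gaussian entries. The matrix size, power, sample count and the generous constant 100 below are arbitrary illustrative choices, and the diagonal entries are given variance 1 rather than 2, which does not affect the order of the bound.

```python
# Plausibility check (not a proof) of Lemma 8: for Gaussian entries,
# the fourth central moment of tr(X_n^k) should be O(n^2).
# n, k, the sample count and the constant 100 are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)

def trace_power(n, k):
    # Build a symmetric Wigner-type matrix X_n = A / sqrt(n).
    # (Diagonal variance 1 instead of 2; immaterial for the order of the bound.)
    a = rng.standard_normal((n, n))
    x = (np.triu(a) + np.triu(a, 1).T) / np.sqrt(n)
    return np.trace(np.linalg.matrix_power(x, k))

n, k, samples = 60, 4, 400
t = np.array([trace_power(n, k) for _ in range(samples)])
fourth_central = float(np.mean((t - t.mean()) ** 4))
print(fourth_central, 100 * n**2)  # the estimate stays below C * n^2
```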

From Lemma 8 and Chebyshev’s inequality, we can now conclude that for any \(\varepsilon> 0\) and any \(k,n \in \mathbb{N}\),

$$\displaystyle{ \mathbb{P}\left (\left \vert \frac{1} {n}\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right ) - \mathbb{E}\left [ \frac{1} {n}\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\right ]\right \vert>\varepsilon \right ) \leq { \frac{C} {\varepsilon ^{4}{n}^{2}}}. }$$

Applying the Borel-Cantelli lemma, we see that

$$\displaystyle{ \frac{1} {n}\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right ) - \mathbb{E}\left [ \frac{1} {n}\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right )\right ] \rightarrow 0,\quad \text{a.s.} }$$
(19)
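Explicitly, the Borel-Cantelli step works because the Chebyshev bound above is summable in n: for each fixed \(\varepsilon > 0\) and \(k\),

```latex
% Summability of the fourth-moment tail bound justifies Borel-Cantelli:
\sum_{n=1}^{\infty}\mathbb{P}\left(\left\vert\frac{1}{n}\mathrm{tr}\left(\mathbf{X}_{n}^{k}\right)
  -\mathbb{E}\left[\frac{1}{n}\mathrm{tr}\left(\mathbf{X}_{n}^{k}\right)\right]\right\vert
  >\varepsilon\right)
\leq\sum_{n=1}^{\infty}\frac{C}{\varepsilon^{4}\,n^{2}}<\infty,
```

so almost surely the event in question occurs for only finitely many n. Intersecting these almost sure events over \(\varepsilon = 1/m\), \(m \in \mathbb{N}\), yields (19).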

Let Y be a random variable distributed according to the semicircle law or according to the distribution given by the moments in Theorem 2, depending on whether we want to prove Theorem 1 or Theorem 2. The convergence of the moments of the expected empirical distributions together with relation (19) yields

$$\displaystyle{ \frac{1} {n}\mathrm{tr}\left (\mathbf{X}_{n}^{k}\right ) \rightarrow \mathbb{E}[{Y }^{k}],\quad \text{a.s.} }$$

Since the distribution of Y is uniquely determined by its moments, we obtain almost sure weak convergence of the empirical spectral distribution of X n to the distribution of Y.
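The moment convergence just established is easy to observe numerically. The sketch below uses Gaussian entries and an illustrative matrix size; it compares the normalized traces \(\frac{1}{n}\mathrm{tr}(\mathbf{X}_n^k)\) of a single matrix with the semicircle moments, which are the Catalan numbers for even k and 0 for odd k.

```python
# Illustration: the normalized traces (1/n) tr(X_n^k) of a single large
# Wigner-type matrix are close to the semicircle moments E[Y^k]
# (Catalan numbers for even k, 0 for odd k). n is an illustrative choice.
import numpy as np
from math import comb

rng = np.random.default_rng(1)
n = 1000
a = rng.standard_normal((n, n))
x = (np.triu(a) + np.triu(a, 1).T) / np.sqrt(n)  # symmetric, X = A / sqrt(n)

def semicircle_moment(k):
    # E[Y^k] = C_{k/2} = binom(k, k/2) / (k/2 + 1) for even k, 0 for odd k
    return comb(k, k // 2) // (k // 2 + 1) if k % 2 == 0 else 0

moments = {}
xp = np.eye(n)
for k in range(1, 7):
    xp = xp @ x  # xp now holds X^k
    moments[k] = np.trace(xp) / n
    print(k, moments[k], semicircle_moment(k))  # empirical vs. exact moment
```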