1 Introduction and Main Result

Let \(\mathcal L_1\) be the space of functions \(a:[0,1]^2\times [-\pi ,\pi ]\rightarrow \mathbb C\) such that \(a(x,y,\cdot )\in L^1([-\pi ,\pi ])\) for all \((x,y)\in [0,1]^2\). Every \(a\in \mathcal L_1\) can be formally represented by its Fourier series in the last variable,

$$\begin{aligned} a(x,y,\theta )=\sum _{k\in \mathbb Z}\hat{a}_k(x,y)\mathrm{e}^{\mathrm{i}k\theta }, \end{aligned}$$

where

$$\begin{aligned} \hat{a}_k(x,y):=\frac{1}{2\pi }\int _{-\pi }^\pi a(x,y,\theta )\mathrm{e}^{-\mathrm{i}k\theta }d\theta ,\quad k\in \mathbb Z, \end{aligned}$$

are the Fourier coefficients of \(a(x,y,\cdot )\). For \(n\ge 2\), the \(n\times n\) variable-coefficient Toeplitz matrix associated with a is defined as

$$\begin{aligned} A_n(a):=\left[ \hat{a}_{i-j}\Bigl (\frac{i-1}{n-1},\frac{j-1}{n-1}\Bigr )\right] _{i,j=1}^n. \end{aligned}$$

Throughout this paper, by a matrix-sequence (or sequence of matrices) we mean a sequence of the form \(\{A_n\}_n\), in which the n-th term \(A_n\) is an \(n\times n\) matrix. The specific matrix-sequence \(\{A_n(a)\}_n\) is referred to as the variable-coefficient Toeplitz sequence associated with a, which in turn is called the generating function of \(\{A_n(a)\}_n\).
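As an illustration only, the definition of \(A_n(a)\) can be transcribed into a few lines of Python. The sketch below assumes that the Fourier coefficients are available in closed form through a user-supplied function; the names `variable_toeplitz` and `example_coeff`, as well as the sample generating function \(a(x,y,\theta )=(x+y)\,2\cos \theta \), are hypothetical choices made for this example and do not come from the paper.

```python
def variable_toeplitz(n, coeff):
    """Return A_n(a) as a list of rows.

    coeff(k, x, y) is assumed to return the k-th Fourier coefficient of
    a(x, y, .) evaluated at the point (x, y).
    """
    if n < 2:
        raise ValueError("the definition requires n >= 2")
    # Grid points (i-1)/(n-1) for i = 1..n, stored with 0-based indices.
    grid = [i / (n - 1) for i in range(n)]
    return [[coeff(i - j, grid[i], grid[j]) for j in range(n)] for i in range(n)]


def example_coeff(k, x, y):
    # Hypothetical example: a(x, y, theta) = (x + y) * 2*cos(theta), whose only
    # nonzero Fourier coefficients are those with k = 1 and k = -1, both x + y.
    return x + y if abs(k) == 1 else 0.0


A3 = variable_toeplitz(3, example_coeff)
```

When `coeff` does not depend on `x` and `y`, the same routine returns the ordinary Toeplitz matrix, in agreement with the remark that \(A_n(a)\) reduces to \(T_n(a)\) in that case.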

Variable-coefficient Toeplitz matrices are also known as generalized convolutions and appear in many different contexts. As the literature shows, matrices of this kind have received some attention in recent years; see, e.g., [3, 4, 5, 10, 17, 18, 19, 21, 22]. We also refer the reader to [6, 8, 9] for numerically oriented literature on orthogonal polynomials with varying recurrence coefficients: the associated Jacobi matrices can be interpreted as tridiagonal symmetric Toeplitz matrices with variable coefficients.

In this paper we show that, under suitable assumptions on a, \(\{A_n(a)\}_n\) is a generalized locally Toeplitz (GLT) sequence with symbol \(a(x,x,\theta )\). This property, in combination with the theory of GLT sequences [7], allows one to derive many spectral distribution (Szegő-type) results for various matrix-sequences, including those obtained from algebraic and non-algebraic operations on variable-coefficient Toeplitz sequences. Let us formulate our main result in a precise way. For any \(\varepsilon >0\), define the strip

$$\begin{aligned} S_\varepsilon :=\{(x,y)\in [0,1]^2:\ |x-y|\le \varepsilon \}. \end{aligned}$$

Set

$$\begin{aligned} \mathcal W&:=\biggl \{a\in \mathcal L_1:\quad \sum _{k\in \mathbb Z}\sup _{(x,y)\in [0,1]^2}|\hat{a}_k(x,y)|<\infty ,\\&\qquad \text {for all }k\in \mathbb Z \,\text {there is}\,\varepsilon (k)>0\,\text {such that}\,\hat{a}_k(\cdot ,\cdot )\,\text{ is } \text{ continuous } \text{ on }\, S_{\varepsilon (k)}\biggr \}, \end{aligned}$$
$$\begin{aligned} \mathcal X:= & {} \biggl \{\,\sum _{r=1}^q\alpha _r(x,y)\beta _r(\theta ):\quad \alpha _r\in C([0,1]^2) \text{ and } \beta _r\in L^2([-\pi ,\pi ])\\&\text{ for } \text{ all } \, r=1,\ldots ,q,\quad q\in \mathbb N\biggr \}, \end{aligned}$$
$$\begin{aligned} \mathcal Y:= & {} \biggl \{\,\sum _{r=1}^q\alpha _r(x)\beta _r(y)\gamma _r(\theta ): \quad \alpha _r,\beta _r\in C([0,1]) \text{ and } \gamma _r\in L^1([-\pi ,\pi ])\\&\text{ for } \text{ all } r=1,\ldots ,q,\quad q\in \mathbb N\biggr \}. \end{aligned}$$

Note that \(\mathcal W\) contains every continuous function \(a\in C([0,1]^2\times [-\pi ,\pi ])\) satisfying the Wiener-type condition

$$\begin{aligned} \sum _{k\in \mathbb Z}\sup _{(x,y)\in [0,1]^2}|\hat{a}_k(x,y)|<\infty . \end{aligned}$$
(1)

Our main result is the following.

Theorem 1

If \(a\in \mathcal W\cup \mathcal X\cup \mathcal Y\) then \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\), i.e., \(\{A_n(a)\}_n\) is a GLT sequence with symbol \(a(x,x,\theta )\).

The meaning of this statement will become clear in Sect. 2, where we present a full overview of the theory of GLT sequences. This is done because, in addition to the proof of the main result and its corollaries, a further purpose of this paper is to provide the reader with a complete and concise picture of the theory of GLT sequences. In Sect. 3 we prove the main result, and in Sect. 4 we investigate some of its consequences. Finally, in Sect. 5 we make some concluding remarks and illustrate possible future lines of research.

2 GLT Sequences

In this section we present the theory of GLT sequences, which goes back to the pioneering works by Tilli [20] and the second author [14, 15], and is developed in detail in [7]. Despite its conciseness, our presentation contains all the necessary ingredients for a full understanding of this theory. All the content of this section can be found in [7].

Singular value and eigenvalue distribution of a matrix-sequence. Let \(\mu _k\) be the Lebesgue measure in \(\mathbb R^k\) and let \(C_c(\mathbb C)\) be the space of continuous complex-valued functions with bounded support defined on \(\mathbb C\). If A is an \(n\times n\) matrix, the singular values and the eigenvalues of A are denoted by \(\sigma _1(A),\ldots ,\sigma _n(A)\) and \(\lambda _1(A),\ldots ,\lambda _n(A)\), respectively.

Definition 1

Let \(\{A_n\}_n\) be a matrix-sequence and let \(f:D\subset \mathbb R^k\rightarrow \mathbb C\) be a measurable function defined on a set D with \(0<\mu _k(D)<\infty \).

  • We say that \(\{A_n\}_n\) has a singular value distribution described by f, and we write \(\{A_n\}_n\sim _\sigma f\), if

    $$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=1}^nF(\sigma _i(A_n))=\frac{1}{\mu _k(D)}\int _DF(|f(y_1,\ldots ,y_k)|)dy_1\ldots dy_k \end{aligned}$$

    for all \(F\in C_c(\mathbb C)\).

  • We say that \(\{A_n\}_n\) has an eigenvalue distribution described by f, and we write \(\{A_n\}_n\sim _\lambda f\), if

    $$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=1}^nF(\lambda _i(A_n))=\frac{1}{\mu _k(D)}\int _DF(f(y_1,\ldots ,y_k))dy_1\ldots dy_k \end{aligned}$$

    for all \(F\in C_c(\mathbb C)\).
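To make the meaning of \(\{A_n\}_n\sim _\lambda f\) concrete, the following sketch compares the two sides of the defining limit for the classical tridiagonal matrix with 2 on the main diagonal and \(-1\) on the two adjacent diagonals, which is the Toeplitz matrix generated by \(f(\theta )=2-2\cos \theta \). It relies on the well-known closed form \(2-2\cos (j\pi /(n+1))\), \(j=1,\ldots ,n\), for the eigenvalues of this matrix. The test function `F` and the function names are hypothetical choices made for this illustration; the code is a numerical sanity check, not a proof.

```python
import math


def F(z):
    # A continuous test function with bounded support ("tent" of radius 2).
    return max(0.0, 1.0 - abs(z) / 2.0)


def spectral_average(n):
    # Left-hand side: (1/n) * sum of F over the eigenvalues
    # lambda_j = 2 - 2*cos(j*pi/(n+1)), j = 1..n, of tridiag(-1, 2, -1).
    eigs = [2.0 - 2.0 * math.cos(j * math.pi / (n + 1)) for j in range(1, n + 1)]
    return sum(F(lam) for lam in eigs) / n


def symbol_average(samples=100000):
    # Right-hand side: midpoint rule for
    # (1/(2*pi)) * integral over [-pi, pi] of F(2 - 2*cos(theta)).
    h = 2.0 * math.pi / samples
    total = 0.0
    for s in range(samples):
        theta = -math.pi + (s + 0.5) * h
        total += F(2.0 - 2.0 * math.cos(theta))
    return total * h / (2.0 * math.pi)


lhs = spectral_average(2000)
rhs = symbol_average()
```

For this particular `F` the right-hand side can be computed by hand: \(F(2-2\cos \theta )=\max (0,\cos \theta )\), whose average over \([-\pi ,\pi ]\) is \(1/\pi \).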

Toeplitz matrices. If \(n\in \mathbb N\) and \(f:[-\pi ,\pi ]\rightarrow \mathbb C\) is a function in \(L^1([-\pi ,\pi ])\), the Toeplitz matrix \(T_n(f)\) associated with f is the \(n\times n\) matrix defined as

$$\begin{aligned} T_n(f):=\bigl [\hat{f}_{i-j}\bigr ]_{i,j=1}^n, \end{aligned}$$

where the numbers \(\hat{f}_k\) are the Fourier coefficients of f,

$$\begin{aligned} \hat{f}_k:=\frac{1}{2\pi }\int _{-\pi }^\pi f(\theta )\mathrm{e}^{-\mathrm{i} k\theta }d\theta ,\qquad k\in \mathbb Z. \end{aligned}$$

It is not difficult to see that all the matrices \(T_n(f)\) are Hermitian when f is real almost everywhere (a.e.). Moreover, the variable-coefficient Toeplitz matrix \(A_n(a)\) coincides with the Toeplitz matrix \(T_n(a)\) whenever a is independent of x and y. The next proposition is proved in [13, 16] and contains important inequalities involving Toeplitz matrices and the so-called Schatten p-norms. If A is an \(n\times n\) matrix and \(1\le p\le \infty \), the Schatten p-norm of A is denoted by \(\Vert A\Vert _p\) and is defined as the p-norm of the vector \((\sigma _1(A),\ldots ,\sigma _n(A))\) formed by the singular values of A; see [1]. The Schatten \(\infty \)-norm \(\Vert A\Vert _\infty \) is the largest singular value of A and coincides with the spectral norm \(\Vert A\Vert \). The Schatten 1-norm \(\Vert A\Vert _1\) is the sum of all the singular values of A and is often referred to as the trace-norm of A. The Schatten 2-norm \(\Vert A\Vert _2\) equals \((\sum _{i,j=1}^n|a_{ij}|^2)^{1/2}\), i.e., it is just the Frobenius norm of A. The Schatten p-norms satisfy the Hölder-type inequality

$$\begin{aligned} \Vert AB\Vert _1\le \Vert A\Vert _p\Vert B\Vert _q, \quad A,B\in \mathbb C^{n\times n}, \end{aligned}$$
(2)

where \(1\le p,q\le \infty \) are conjugate exponents, i.e., \(1/p+1/q=1\); see [1, Problem III.6.2 and Corollary IV.2.6]. For convenience, throughout this paper we use the natural convention \(1/\infty =0\).

Proposition 1

Let \(f\in L^p([-\pi ,\pi ])\), \(n\in \mathbb N\) and \(1\le p\le \infty \). Then

$$\begin{aligned} \Vert T_n(f)\Vert _p\le \frac{n^{1/p}}{(2\pi )^{1/p}}\Vert f\Vert _{L^p}. \end{aligned}$$
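For \(p=2\) and a trigonometric polynomial f, the inequality of Proposition 1 can be checked by hand, because the Schatten 2-norm is the Frobenius norm and Parseval's identity gives \(\Vert f\Vert _{L^2}^2=2\pi \sum _k|\hat{f}_k|^2\). The sketch below carries out this check numerically; the function names and the sample coefficients are hypothetical and serve only as an illustration of the case \(p=2\).

```python
import math


def toeplitz_frobenius(n, coeffs):
    """Frobenius norm of T_n(f) for f(theta) = sum_k coeffs[k] * e^{i k theta}.

    For |k| < n the k-th diagonal of T_n(f) contains coeffs[k] exactly
    n - |k| times; coefficients with |k| >= n do not appear in T_n(f).
    """
    total = sum((n - abs(k)) * abs(c) ** 2 for k, c in coeffs.items() if abs(k) < n)
    return math.sqrt(total)


def proposition1_bound_p2(n, coeffs):
    """Right-hand side n^{1/2} (2*pi)^{-1/2} ||f||_{L^2} of Proposition 1 for p = 2."""
    # Parseval: ||f||_{L^2}^2 = 2*pi * sum_k |coeffs[k]|^2.
    l2_norm = math.sqrt(2.0 * math.pi * sum(abs(c) ** 2 for c in coeffs.values()))
    return math.sqrt(n / (2.0 * math.pi)) * l2_norm


# Hypothetical Fourier coefficients of a trigonometric polynomial f.
coeffs = {-2: 0.5j, -1: -1.0, 0: 2.0, 1: -1.0, 2: -0.5j}
checks = [(toeplitz_frobenius(n, coeffs), proposition1_bound_p2(n, coeffs))
          for n in (1, 2, 5, 50)]
```

In this special case the inequality reduces to \(\sum _k(n-|k|)|\hat{f}_k|^2\le n\sum _k|\hat{f}_k|^2\), which is why each pair in `checks` has its first entry bounded by the second.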

Diagonal sampling matrices. If \(n\in \mathbb N\) and a is a function from [0, 1] to \(\mathbb C\), the diagonal sampling matrix \(D_n(a)\) associated with a is the \(n\times n\) diagonal matrix defined as

$$\begin{aligned} D_n(a):=\mathop {\mathrm{diag}}\limits _{i=1,\ldots ,n}a\Bigl (\frac{i}{n}\Bigr ). \end{aligned}$$

Zero-distributed sequences. Any matrix-sequence \(\{Z_n\}_n\) such that \(\{Z_n\}_n\sim _\sigma 0\) is referred to as a zero-distributed sequence. In other words, \(\{Z_n\}_n\) is zero-distributed if and only if \(\lim _{n\rightarrow \infty }\frac{1}{n}\sum _{i=1}^nF(\sigma _i(Z_n))=F(0)\) for all \(F\in C_c(\mathbb C)\). An important characterization of zero-distributed sequences is stated in Proposition 2 together with a useful criterion to understand whether a given matrix-sequence is zero-distributed.

Proposition 2

The following properties hold.

  • \(\{Z_n\}_n\) is zero-distributed if and only if \(Z_n=R_n+N_n\) with \(\displaystyle \lim _{n\rightarrow \infty }\frac{\mathrm{rank}(R_n)}{n}=\lim _{n\rightarrow \infty }\Vert N_n\Vert =0\).

  • \(\{Z_n\}_n\) is zero-distributed if \(\displaystyle \lim _{n\rightarrow \infty }\frac{\Vert Z_n\Vert _p}{n^{1/p}}=0\) for some \(p\in [1,\infty ]\).
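As a hedged illustration of the second item with \(p=2\), consider the Hilbert matrices \(H_n=[1/(i+j-1)]_{i,j=1}^n\), an example that is not taken from the paper. Since at most s pairs (i, j) satisfy \(i+j-1=s\), the squared Frobenius norm of \(H_n\) is bounded by \(\sum _{s=1}^{2n-1}1/s\), which grows only logarithmically, so \(\Vert H_n\Vert _2/n^{1/2}\rightarrow 0\). The sketch below simply evaluates this ratio for a few sizes; the function name is hypothetical.

```python
import math


def hilbert_frobenius_ratio(n):
    """Return ||H_n||_2 / n^{1/2} for the Hilbert matrix H_n = [1/(i+j-1)].

    ||.||_2 denotes the Schatten 2-norm, i.e. the Frobenius norm.
    """
    squared = sum(1.0 / (i + j - 1) ** 2
                  for i in range(1, n + 1)
                  for j in range(1, n + 1))
    return math.sqrt(squared / n)


ratios = [hilbert_frobenius_ratio(n) for n in (10, 100, 400)]
```

The computed ratios decrease with n, consistent with the criterion; the numerical values are of course only indicative of the limit.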

Approximating classes of sequences. The notion of approximating classes of sequences (a.c.s.es) is due to the second author [12] and is studied in depth in [7]. It is the fundamental concept on which the theory of GLT sequences is based.

Definition 2

Let \(\{A_n\}_n\) be a matrix-sequence. A sequence of matrix-sequences \(\{\{B_{n,m}\}_n\}_m\) is said to be an approximating class of sequences (a.c.s.) for \(\{A_n\}_n\) if the following condition is met: for every m there exists \(n_m\) such that, for \(n\ge n_m\),

$$\begin{aligned} A_n=B_{n,m}+R_{n,m}+N_{n,m},\qquad \mathrm{rank}(R_{n,m})\le c(m)n,\qquad \Vert N_{n,m}\Vert \le \omega (m), \end{aligned}$$

where the quantities \(n_m,\,c(m),\,\omega (m)\) depend only on m, and

$$\begin{aligned} \lim _{m\rightarrow \infty }c(m)=\lim _{m\rightarrow \infty }\omega (m)=0. \end{aligned}$$

Roughly speaking, \(\{\{B_{n,m}\}_n\}_m\) is an a.c.s. for \(\{A_n\}_n\) if, for large m, the sequence \(\{B_{n,m}\}_n\) approximates \(\{A_n\}_n\) in the sense that \(A_n\) is eventually equal to \(B_{n,m}\) plus a small-rank matrix (with respect to the matrix size n) plus a small-norm matrix. It turns out that the notion of a.c.s.es is a notion of convergence in the space of matrix-sequences

$$\begin{aligned} \mathscr {E}:=\bigl \{\{A_n\}_n:\ \{A_n\}_n \text{ is } \text{ a } \text{ matrix-sequence }\bigr \}. \end{aligned}$$
(3)

More precisely, let \(p(\cdot )\) be the function defined on \(\mathscr {E}\) by

$$\begin{aligned} p(\{A_n\}_n):= & {} \inf \left\{ \limsup _{n\rightarrow \infty }\biggl (\frac{\mathrm{rank}(R_n)}{n}+\Vert N_n\Vert \biggr ):\quad \{A_n\}_n=\{R_n+N_n\}_n,\right. \nonumber \\&~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\left. \quad \{R_n\}_n,\{N_n\}_n\in \mathscr {E}\right\} , \end{aligned}$$
(4)

and set

$$\begin{aligned} d_\mathrm{a.c.s.}(\{A_n\}_n,\{B_n\}_n):=p(\{A_n-B_n\}_n),\qquad \{A_n\}_n,\{B_n\}_n\in \mathscr {E}. \end{aligned}$$
(5)

Then \(d_\mathrm{a.c.s.}\) is a distance on \(\mathscr {E}\) which turns \(\mathscr {E}\) into a topological (pseudometric) space \((\mathscr {E},\tau _\mathrm{a.c.s.})\) where the statement “\(\{\{B_{n,m}\}_n\}_m\) converges to \(\{A_n\}_n\)” is equivalent to “\(\{\{B_{n,m}\}_n\}_m\) is an a.c.s. for \(\{A_n\}_n\)”. In particular, we can reformulate Definition 2 in the following way: a sequence of matrix-sequences \(\{\{B_{n,m}\}_n\}_m\) is said to be an a.c.s. for \(\{A_n\}_n\) if \(\{B_{n,m}\}_n\) converges to \(\{A_n\}_n\) in \((\mathscr {E},\tau _\mathrm{a.c.s.})\) as \(m\rightarrow \infty \), i.e., if \(d_\mathrm{a.c.s.}(\{B_{n,m}\}_n,\{A_n\}_n)\rightarrow 0\) as \(m\rightarrow \infty \). The theory of a.c.s.es may then be interpreted as an approximation theory for matrix-sequences, and for this reason we will use the convergence notation

$$\begin{aligned} \{B_{n,m}\}_n\mathop {\longrightarrow }\limits ^\mathrm{a.c.s.}\{A_n\}_n \end{aligned}$$

to indicate that \(\{\{B_{n,m}\}_n\}_m\) is an a.c.s. for \(\{A_n\}_n\). A useful criterion to identify a.c.s.es is provided in the next proposition.

Proposition 3

Let \(\{A_n\}_n\) be a matrix-sequence, let \(\{\{B_{n,m}\}_n\}_m\) be a sequence of matrix-sequences, and let \(p\in [1,\infty ]\). Suppose that for every m there exists \(n_m\) such that, for \(n\ge n_m\),

$$\begin{aligned} \Vert A_n-B_{n,m}\Vert _p\le \varepsilon (m,n)n^{1/p}, \end{aligned}$$

where \(\displaystyle \lim _{m\rightarrow \infty }\limsup _{n\rightarrow \infty }\varepsilon (m,n)=0\). Then \(\{B_{n,m}\}_n\mathop {\longrightarrow }\limits ^\mathrm{a.c.s.}\{A_n\}_n\).

Remark 1

One sometimes encounters a.c.s.es parameterized by a positive \(\varepsilon \rightarrow 0\) rather than an integer \(m\rightarrow \infty \). A.c.s.es parameterized by a positive \(\varepsilon \rightarrow 0\) appear, for example, in the definition of GLT sequences [7, Definition 8.1]. Thanks to the above topological interpretation, it is easy to guess what an a.c.s. of this kind should be. Here is the definition: a class of matrix-sequences \(\{\{B_{n,\varepsilon }\}_n\}_{\varepsilon >0}\) is said to be an a.c.s. for \(\{A_n\}_n\) as \(\varepsilon \rightarrow 0\) if \(\{B_{n,\varepsilon }\}_n\) converges to \(\{A_n\}_n\) in \((\mathscr {E},\tau _\mathrm{a.c.s.})\) as \(\varepsilon \rightarrow 0\), i.e., if \(d_\mathrm{a.c.s.}(\{B_{n,\varepsilon }\}_n,\{A_n\}_n)\rightarrow 0\) as \(\varepsilon \rightarrow 0\).

GLT sequences. A GLT sequence \(\{A_n\}_n\) is a special matrix-sequence equipped with a measurable function \(\kappa :[0,1]\times [-\pi ,\pi ]\rightarrow \mathbb C\), the so-called symbol (or kernel). We use the notation \(\{A_n\}_n\sim _\mathrm{GLT}\kappa \) to indicate that \(\{A_n\}_n\) is a GLT sequence with symbol \(\kappa \). The symbol of a GLT sequence is unique, in the sense that if \(\{A_n\}_n\sim _\mathrm{GLT}\kappa \) and \(\{A_n\}_n\sim _\mathrm{GLT}\xi \) then \(\kappa =\xi \) a.e. in \([0,1]\times [-\pi ,\pi ]\).

To formulate the precise definition of GLT sequences, we introduce the set of measurable functions

$$\begin{aligned} \mathcal M:=\{\kappa :[0,1]\times [-\pi ,\pi ]\rightarrow \mathbb C:\ \kappa \text{ is } \text{ measurable }\}. \end{aligned}$$

This is a *-algebra with respect to the natural operations, and it is also a topological (pseudometric) space with respect to the distance

$$\begin{aligned} d_\mathrm{measure}(\kappa ,\xi )=\int _{[0,1]\times [-\pi ,\pi ]}\frac{|\kappa -\xi |}{1+|\kappa -\xi |}, \end{aligned}$$

which induces the topology \(\tau _\mathrm{measure}\) of convergence in measure [2, 11]. Consider the product space \(\mathscr {E}\times \mathcal M\) equipped with the product topology \(\tau _\mathrm{a.c.s.}\times \tau _\mathrm{measure}\) and with the structure of product *-algebra:

$$\begin{aligned} (\{A_n\}_n,\kappa )^*&:=(\{A_n^*\}_n,\overline{\kappa }),\\ \alpha (\{A_n\}_n,\kappa )+\beta (\{B_n\}_n,\xi )&:=(\{\alpha A_n+\beta B_n\}_n,\alpha \kappa +\beta \xi ),\\ (\{A_n\}_n,\kappa )(\{B_n\}_n,\xi )&:=(\{A_nB_n\}_n,\kappa \xi ). \end{aligned}$$

The set of “GLT pairs” is defined as the subset of \(\mathscr {E}\times \mathcal M\) given by

$$\begin{aligned} \bigl \{(\{A_n\}_n,\kappa )\in \mathscr {E}\times \mathcal M:\ \{A_n\}_n\sim _\mathrm{GLT}\kappa \bigr \}. \end{aligned}$$
(6)

Let \(C_\mathrm{a.e.}([0,1])\) be the space of functions \(a:[0,1]\rightarrow \mathbb C\) that are continuous a.e. on [0, 1], i.e., the space of functions a from [0, 1] to \(\mathbb C\) such that a is continuous at x for almost every \(x\in [0,1]\), and consider the specific subset of \(\mathscr {E}\times \mathcal M\) given by

$$\begin{aligned}&\bigl \{(\{T_n(f)\}_n,f):\ f\in L^1([-\pi ,\pi ])\bigr \}\cup \bigl \{(\{D_n(a)\}_n,a):\ a\in C_\mathrm{a.e.}([0,1])\bigr \}\nonumber \\&\cup \,\bigl \{(\{Z_n\}_n,0):\ \{Z_n\}_n\sim _\sigma 0\bigr \}. \end{aligned}$$
(7)

Definition 3

The set of GLT pairs (6) is the closed *-subalgebra of \(\mathscr {E}\times \mathcal M\) generated by the pairs (7), i.e., the smallest closed *-subalgebra of \(\mathscr {E}\times \mathcal M\) containing (7).

Definition 3 looks quite different from the traditional one [7, Definition 8.1]. However, as proved in [7, Sect. 8.5], the two definitions are equivalent. Moreover, Definition 3 is much easier to formulate and this is why we preferred it to [7, Definition 8.1].

The theory of GLT sequences is summarized in the following list of properties. Throughout this paper, the Moore–Penrose pseudoinverse of a matrix A is denoted by \(A^\dag \); recall that \(A^\dag =A^{-1}\) whenever A is invertible.

  • GLT 1. If \(\{A_n\}_n\sim _\mathrm{GLT}\kappa \) then \(\{A_n\}_n\sim _\sigma \kappa \). If, moreover, the matrices \(A_n\) are Hermitian, then \(\{A_n\}_n\sim _\lambda \kappa \).

  • GLT 2. If \(\{A_n\}_n\sim _\mathrm{GLT}\kappa \) and \(A_n=X_n+Y_n\), where

    • \(\Vert X_n\Vert ,\,\Vert Y_n\Vert \le C\) for some constant C independent of n,

    • every \(X_n\) is Hermitian,

    • \(\displaystyle \lim _{n\rightarrow \infty }\frac{\Vert Y_n\Vert _1}{n}=0\),

    then \(\{A_n\}_n\sim _\lambda \kappa \).

  • GLT 3. We have:

    • \(\{T_n(f)\}_n\sim _\mathrm{GLT}\kappa (x,\theta ):=f(\theta )\) if \(f\in L^1([-\pi ,\pi ])\);

    • \(\{D_n(a)\}_n\sim _\mathrm{GLT}\kappa (x,\theta ):=a(x)\) if \(a\in C_\mathrm{a.e.}([0,1])\);

    • \(\{Z_n\}_n\sim _\mathrm{GLT}\kappa (x,\theta ):=0\) if and only if \(\{Z_n\}_n\sim _\sigma 0\).

  • GLT 4. If \(\{A_n\}_n\sim _\mathrm{GLT}\kappa \) then \(\{A_n^*\}_n\sim _\mathrm{GLT}\overline{\kappa }\).

  • GLT 5. If \(A_n=\sum _{i=1}^r\alpha _i\prod _{j=1}^{q_i}A_n^{(i,j)}\), with \(r,q_1,\ldots ,q_r\in \mathbb N\), \(\alpha _1,\ldots ,\alpha _r\in \mathbb C\), and \(\{A_n^{(i,j)}\}_n\sim _\mathrm{GLT}\kappa _{ij}\), then \(\{A_n\}_n\sim _\mathrm{GLT}\kappa :=\sum _{i=1}^r\alpha _i\prod _{j=1}^{q_i}\kappa _{ij}\).

  • GLT 6. If \(\{A_n\}_n\sim _\mathrm{GLT}\kappa \) and \(\kappa \ne 0\) a.e., then \(\{A_n^\dag \}_n\sim _\mathrm{GLT}\kappa ^{-1}\).

  • GLT 7. If \(\{A_n\}_n\sim _\mathrm{GLT}\kappa \) and each \(A_n\) is Hermitian, then \(\{f(A_n)\}_n\sim _\mathrm{GLT}f(\kappa )\) for all continuous functions \(f:\mathbb C\rightarrow \mathbb C\).

  • GLT 8. \(\{A_n\}_n\sim _\mathrm{GLT}\kappa \) if and only if there exist GLT sequences \(\{B_{n,m}\}_n\sim _\mathrm{GLT}\kappa _m\) such that \(\{B_{n,m}\}_n\mathop {\longrightarrow }\limits ^\mathrm{a.c.s.}\{A_n\}_n\) and \(\kappa _m\rightarrow \kappa \) in measure over \([0,1]\times [-\pi ,\pi ]\).

The first properties GLT 1 – GLT 2 contain the main spectral distribution results for GLT sequences. GLT 3 lists the fundamental examples of GLT sequences, from which one can construct, via GLT 5, many other GLT sequences. Note that GLT 1 and GLT 3 imply the famous Szegő first limit theorem and the Avram–Parter theorem for Toeplitz sequences \(\{T_n(f)\}_n\) with \(f\in L^1([-\pi ,\pi ])\). From GLT 4 – GLT 5 we see that GLT sequences form a *-algebra. More specifically, let \(\mathscr {E}\) be the space of matrix-sequences defined in (3), and note that \(\mathscr {E}\) is a *-algebra over \(\mathbb C\) with respect to the natural (pointwise) operations of Hermitian transposition, linear combination and product of matrix-sequences:

$$\begin{aligned} \{A_n\}_n^*&:=\{A_n^*\}_n,\\ \alpha \{A_n\}_n+\beta \{B_n\}_n&:=\{\alpha A_n+\beta B_n\}_n,\\ \{A_n\}_n\{B_n\}_n&:=\{A_nB_n\}_n. \end{aligned}$$

Then, the set

$$\begin{aligned} \mathscr {A}:=\bigl \{\{A_n\}_n:\ \{A_n\}_n\sim _\mathrm{GLT}\kappa \text{ for } \text{ some } \text{ measurable } \kappa :[0,1]\times [-\pi ,\pi ]\rightarrow {\mathbb C}\bigr \} \end{aligned}$$

is a *-subalgebra of \(\mathscr {E}\), which is referred to as the GLT algebra. In view of GLT 3, the GLT algebra contains the algebra generated by zero-distributed sequences, Toeplitz sequences, and sequences of diagonal sampling matrices associated with a.e. continuous functions. To be precise, let

$$\begin{aligned} \mathscr {B}:= & {} \bigl \{\{T_n(f)\}_n:\ f\in L^1([-\pi ,\pi ])\bigr \}\cup \bigl \{\{D_n(a)\}_n:\ a\in C_\mathrm{a.e.}([0,1])\bigr \}\\&\cup \,\bigl \{\{Z_n\}_n:\ \{Z_n\}_n\sim _\sigma 0\bigr \} \end{aligned}$$

and let \(\mathrm{algebra}(\mathscr {B})\) be the subalgebra of \(\mathscr {E}\) generated by \(\mathscr {B}\), i.e., the smallest subalgebra of \(\mathscr {E}\) containing \(\mathscr {B}\). Then,

$$\begin{aligned} \mathrm{algebra}(\mathscr {B})= & {} \left\{ \left\{ \sum _{i=1}^r\prod _{j=1}^{q_i}A_n^{(i,j)}\right\} _n:\ r,q_1,\ldots ,q_r\in \mathbb N,\ \right. \\&~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\left. \{A_n^{(i,j)}\}_n\in \mathscr {B} \text { for all } i=1,\ldots ,r \text { and } j=1,\ldots ,q_i\right\} \subseteq \mathscr {A}. \end{aligned}$$

GLT 6 – GLT 7 show that the GLT algebra is closed under more complex operations than sums and products, such as pseudoinversions and matrix functions. GLT 8 can be informally rephrased as follows: if a sequence of GLT sequences \(\{B_{n,m}\}_n\) converges to a matrix-sequence \(\{A_n\}_n\) (in the a.c.s. sense), and if the corresponding sequence of symbols \(\kappa _m\) converges to a function \(\kappa \) (in measure), then \(\{A_n\}_n\) is a GLT sequence with symbol \(\kappa \). This is then the closure property of GLT pairs expressed in Definition 3.

3 Proof of the Main Result

This section is devoted to the proof of Theorem 1. If \(\alpha :[0,1]^2\rightarrow \mathbb C\), \(n\ge 2\) and \(k\in \mathbb Z\), we define the \(n\times n\) diagonal matrix

$$\begin{aligned} D_{n,k}(\alpha ):=\mathop {\mathrm{diag}}\limits _{h=1,\ldots ,n}\alpha \Bigl (\frac{(h-1+k)\,\mathrm{mod}\,n}{n-1},\frac{h-1}{n-1}\Bigr ). \end{aligned}$$

Lemma 1

For every \(a\in \mathcal L_1\) and \(n\ge 2\), we have

$$\begin{aligned} A_n(a)=\sum _{k=-(n-1)}^{n-1}T_n(\mathrm{e}^{\mathrm{i}k\theta })D_{n,k}(\hat{a}_k). \end{aligned}$$

Proof

The result of this lemma is already known in the literature; see, e.g., [10, p. 6]. Nevertheless, we include the details of the proof for the reader’s convenience. For every \(n\ge 2\) and \(k\in \mathbb Z\), we have

$$\begin{aligned} (T_n(\mathrm{e}^{\mathrm{i}k\theta }))_{ij}=\delta _{i-j,k}:=\left\{ \begin{array}{ll}1, &{} \mathrm{if}\ i-j=k,\\ 0, &{} \mathrm{otherwise}, \end{array}\right. \qquad i,j=1,\ldots ,n. \end{aligned}$$

Hence, for all \(i,j=1,\ldots ,n\),

$$\begin{aligned}&\left( \sum _{k=-(n-1)}^{n-1}T_n(\mathrm{e}^{\mathrm{i}k\theta })D_{n,k}(\hat{a}_k)\right) _{ij} =\sum _{k=-(n-1)}^{n-1}\delta _{i-j,k}\,\hat{a}_k\Bigl (\frac{(j-1+k)\,\mathrm{mod}\,n}{n-1},\frac{j-1}{n-1}\Bigr )\nonumber \\&=\hat{a}_{i-j}\Bigl (\frac{i-1}{n-1},\frac{j-1}{n-1}\Bigr )=(A_n(a))_{ij}, \end{aligned}$$

and the lemma is proved. \(\square \)
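The identity of Lemma 1 can also be checked numerically for small n. The following sketch builds both sides entry by entry and compares them; it is meant only as a sanity check of the index bookkeeping (in particular of the role of the \(\mathrm{mod}\) operation in \(D_{n,k}\)). All function names and the sample coefficient function `sample_coeff` are hypothetical.

```python
def shift_matrix(n, k):
    """T_n(e^{i k theta}): entry (i, j) equals 1 if i - j = k and 0 otherwise."""
    return [[1.0 if i - j == k else 0.0 for j in range(n)] for i in range(n)]


def d_nk_diagonal(n, k, coeff):
    """Diagonal of D_{n,k}(a_k), using 0-based h, i.e. h here plays the role of h-1.

    Python's % returns a non-negative result for a positive modulus, which
    matches the mathematical 'mod n' used in the definition.
    """
    return [coeff(k, ((h + k) % n) / (n - 1), h / (n - 1)) for h in range(n)]


def lemma1_sum(n, coeff):
    """Sum over |k| <= n-1 of T_n(e^{i k theta}) D_{n,k}(a_k)."""
    total = [[0.0] * n for _ in range(n)]
    for k in range(-(n - 1), n):
        T = shift_matrix(n, k)
        d = d_nk_diagonal(n, k, coeff)
        for i in range(n):
            for j in range(n):
                # Right-multiplication by a diagonal matrix scales column j by d[j].
                total[i][j] += T[i][j] * d[j]
    return total


def direct_matrix(n, coeff):
    """A_n(a) straight from the definition."""
    return [[coeff(i - j, i / (n - 1), j / (n - 1)) for j in range(n)] for i in range(n)]


def sample_coeff(k, x, y):
    # Hypothetical coefficients, chosen only to exercise the identity.
    return (x + 2.0 * y + k) / (1.0 + k * k)


n = 6
left = lemma1_sum(n, sample_coeff)
right = direct_matrix(n, sample_coeff)
max_gap = max(abs(u - v) for ru, rv in zip(left, right) for u, v in zip(ru, rv))
```

The two matrices agree because, whenever \(i-j=k\), the index \(j-1+k=i-1\) already lies in \(\{0,\ldots ,n-1\}\), so the reduction modulo n has no effect on the entries that actually contribute.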

Lemma 2

If \(\alpha :[0,1]^2\rightarrow \mathbb C\) is continuous on the strip \(S_\varepsilon \) for some \(\varepsilon >0\) and \(k\in \mathbb Z\), then

$$\begin{aligned} \{D_{n,k}(\alpha )\}_n\sim _\mathrm{GLT}\alpha (x,x). \end{aligned}$$

Proof

We show that, for all \(n\ge 2\) and \(k\in \mathbb Z\),

$$\begin{aligned} D_{n,k}(\alpha ) = D_n(\alpha (x,x))+R_{n,k}+N_{n,k}, \end{aligned}$$
(8)

where

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{\mathrm{rank}(R_{n,k})}{n}=\lim _{n\rightarrow \infty }\Vert N_{n,k}\Vert =0. \end{aligned}$$
(9)

This implies that the sequence \(\{Z_{n,k}:=R_{n,k}+N_{n,k}\}_n\) is zero-distributed (see Proposition 2), and the claim follows from (8) in combination with GLT 3 and GLT 5.

Let \(\omega _{\alpha ,\varepsilon }(\cdot )\) be the modulus of continuity of \(\alpha \) over the strip \(S_\varepsilon \),

$$\begin{aligned} \omega _{\alpha ,\varepsilon }(\delta ):=\max _{\begin{array}{c} (x,y),(x',y')\in S_\varepsilon \\ |x-x'|,|y-y'|\le \delta \end{array}}|\alpha (x,y)-\alpha (x',y')|,\quad \delta >0. \end{aligned}$$

If \(k\ge 0\) and \(n>\frac{k}{\varepsilon }+1\), for \(h=1,\ldots ,n-k\) we have

$$\begin{aligned} \left| (D_{n,k}(\alpha ))_{hh}-(D_n(\alpha (x,x)))_{hh}\right|&= \left| \alpha \Bigg (\frac{h-1+k}{n-1},\frac{h-1}{n-1}\Bigg )- \alpha \Bigg (\frac{h}{n},\frac{h}{n}\Bigg )\right| \nonumber \\&\le \omega _{\alpha ,\varepsilon } \Bigg (\frac{k+1}{n-1}\Bigg ), \end{aligned}$$
(10)

which tends to 0 as \(n\rightarrow \infty \). Write

$$\begin{aligned} D_{n,k}(\alpha )-D_n(\alpha (x,x)) = N_{n,k}+R_{n,k}, \end{aligned}$$

where \(N_{n,k}\) (resp., \(R_{n,k}\)) is the matrix obtained from \(D_{n,k}(\alpha )-D_n(\alpha (x,x))\) by setting to 0 all the diagonal elements corresponding to indices \(h>n-k\) (resp., \(h\le n\,{-}\,k\)). The decomposition (8)–(9) follows from (10) and from the (obvious) inequality \(\mathrm{rank}(R_{n,k})\le k\).

If \(k<0\) and \(n>\frac{|k|}{\varepsilon }+1\), for \(h=|k|+1,\ldots ,n\) we have

$$\begin{aligned} \left| (D_{n,k}(\alpha ))_{hh}-(D_n(\alpha (x,x)))_{hh}\right|&= \left| \alpha \Bigg (\frac{h-1+k}{n-1},\frac{h-1}{n-1}\Bigg )-\alpha \Bigg (\frac{h}{n},\frac{h}{n}\Bigg )\right| \nonumber \\&\le \omega _{\alpha ,\varepsilon } \Bigg (\frac{|k|+1}{n-1}\Bigg ), \end{aligned}$$
(11)

which tends to 0 as \(n\rightarrow \infty \). Write

$$\begin{aligned} D_{n,k}(\alpha )-D_n(\alpha (x,x)) = N_{n,k}+R_{n,k}, \end{aligned}$$

where \(N_{n,k}\) (resp., \(R_{n,k}\)) is the matrix obtained from \(D_{n,k}(\alpha )-D_n(\alpha (x,x))\) by setting to 0 all the diagonal elements corresponding to indices \(h<|k|+1\) (resp., \(h\ge |k|+1\)). The decomposition (8)–(9) follows from (11) and from the (obvious) inequality \(\mathrm{rank}(R_{n,k})\le |k|\). \(\square \)
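The estimate (10) can be observed numerically on a simple example. The sketch below assumes the hypothetical function \(\alpha (x,y)=x+2y\), which is continuous on the whole square and satisfies \(\omega _{\alpha ,\varepsilon }(\delta )\le 3\delta \), and compares the diagonal entries of \(D_{n,k}(\alpha )\) and \(D_n(\alpha (x,x))\) for \(k\ge 0\) and \(h=1,\ldots ,n-k\). The function names are illustrative only.

```python
def alpha(x, y):
    # Hypothetical example with |alpha(x,y) - alpha(x',y')| <= |x-x'| + 2|y-y'|,
    # hence modulus of continuity at most 3*delta.
    return x + 2.0 * y


def max_deviation(n, k):
    """Largest |(D_{n,k}(alpha))_hh - (D_n(alpha(x,x)))_hh| over h = 1..n-k, k >= 0."""
    worst = 0.0
    for h in range(1, n - k + 1):
        # For these h no wrap-around occurs, so the 'mod n' can be dropped.
        shifted = alpha((h - 1 + k) / (n - 1), (h - 1) / (n - 1))
        sampled = alpha(h / n, h / n)
        worst = max(worst, abs(shifted - sampled))
    return worst


k = 3
# Each entry: (n, observed deviation, bound 3*(k+1)/(n-1) from (10)).
report = [(n, max_deviation(n, k), 3.0 * (k + 1) / (n - 1)) for n in (10, 100, 1000)]
```

The remaining k diagonal entries, those with \(h>n-k\), are the ones collected in \(R_{n,k}\); they need not be small, but their number does not grow with n.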

We are now ready to prove the main result.

Proof of Theorem 1

The proof consists of three steps. In the first step we show that \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) whenever \(a\in \mathcal W\). In the second step we show that \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) whenever \(a\in \mathcal X\). Finally, in the third step we show that \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) whenever \(a\in \mathcal Y\).

Step 1. Suppose \(a\in \mathcal W\). For \(n\ge 2\) and \(m\in \mathbb N\), consider the matrix

$$\begin{aligned} A_{n,m}(a) := \sum _{k=-m}^mT_n(\mathrm{e}^{\mathrm{i}k\theta })D_{n,k}(\hat{a}_k). \end{aligned}$$

Note that \(A_{n,m}(a)\) “resembles” \(A_n(a)\) because, by Lemma 1, the only difference between these two matrices is the range of indices where k varies. By Lemma 1 we also have \(A_{n,m}(a)=A_n(a_m)\) for \(n>m\), where

$$\begin{aligned} a_m(x,y,\theta ):=\sum _{k=-m}^m\hat{a}_k(x,y)\mathrm{e}^{\mathrm{i}k\theta } \end{aligned}$$

is the m-th Fourier sum of \(a(x,y,\theta )\). We are going to show that:

  i. \(\{A_{n,m}(a)\}_n\sim _\mathrm{GLT}a_m(x,x,\theta )\) for every \(m\in \mathbb N\);

  ii. \(a_m(x,x,\theta )\rightarrow a(x,x,\theta )\) in measure over \([0,1]\times [-\pi ,\pi ]\);

  iii. \(\{A_{n,m}(a)\}_n\mathop {\longrightarrow }\limits ^\mathrm{a.c.s.}\{A_n(a)\}_n\).

Once this is done, the GLT relation \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) follows from GLT 8. By the hypothesis on a, each \(\hat{a}_k\) is continuous on \(S_{\varepsilon (k)}\) for some \(\varepsilon (k)>0\), and so \(\{D_{n,k}(\hat{a}_k)\}_n\sim _\mathrm{GLT}\hat{a}_k(x,x)\) by Lemma 2. Thus, by GLT 3 and GLT 5,

$$\begin{aligned} \{A_{n,m}(a)\}_n\sim _\mathrm{GLT}\sum _{k=-m}^m\mathrm{e}^{\mathrm{i}k\theta }\hat{a}_k(x,x)=a_m(x,x,\theta ), \end{aligned}$$

and item i is proved. Since a satisfies the Wiener-type condition (1), \(a_m(x,y,\theta )\rightarrow a(x,y,\theta )\) uniformly on \([-\pi ,\pi ]\) for each fixed \((x,y)\in [0,1]^2\). In particular, the sequence of continuous functions \(a_m(x,x,\theta )\) converges pointwise to \(a(x,x,\theta )\) over \([0,1]\times [-\pi ,\pi ]\). Since the pointwise convergence on a set of finite measure implies the convergence in measure [2, 11], item ii is proved. Finally, by Lemma 1 and by the equality

$$\begin{aligned} \Vert T_n(\mathrm{e}^{\mathrm{i}k\theta })\Vert =1,\qquad |k|<n, \end{aligned}$$

for every \(n>m\) we have

$$\begin{aligned} \Vert A_n(a)-A_{n,m}(a)\Vert&=\left\| \sum _{n>|k|>m}T_n(\mathrm{e}^{\mathrm{i}k\theta })D_{n,k}(\hat{a}_k)\right\| \\&\le \sum _{n>|k|>m}\Vert T_n(\mathrm{e}^{\mathrm{i}k\theta })\Vert \,\Vert D_{n,k}(\hat{a}_k)\Vert \\&\le \sum _{n>|k|>m}\sup _{x,y\in [0,1]}|\hat{a}_k(x,y)|\\&=:\varepsilon (m,n). \end{aligned}$$

Recalling that a satisfies the Wiener-type condition (1), we have

$$\begin{aligned} \lim _{m\rightarrow \infty }\limsup _{n\rightarrow \infty }\varepsilon (m,n)=0, \end{aligned}$$

and Proposition 3 implies that \(\{A_{n,m}(a)\}_n\mathop {\longrightarrow }\limits ^\mathrm{a.c.s.}\{A_n(a)\}_n\). We then conclude that item iii holds.
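To see how the quantity \(\varepsilon (m,n)\) behaves, it may help to evaluate it in a case where the suprema of the Fourier coefficients are known. The sketch below assumes, purely for illustration, that \(\sup _{(x,y)}|\hat{a}_k(x,y)|=2^{-|k|}\); under this assumption \(\varepsilon (m,n)=2(2^{-m}-2^{-(n-1)})\), which tends to \(2^{1-m}\) as \(n\rightarrow \infty \) and then to 0 as \(m\rightarrow \infty \). The names `tail` and `sup_coeff` are hypothetical.

```python
def tail(m, n, sup_coeff):
    """epsilon(m, n): sum over m < |k| < n of sup_{x,y} |a_k(x, y)|."""
    return sum(sup_coeff(k) + sup_coeff(-k) for k in range(m + 1, n))


def sup_coeff(k):
    # Hypothetical geometric decay of the sup-norms of the Fourier coefficients.
    return 2.0 ** (-abs(k))


# n = 200 stands in for "n large"; the terms beyond it are negligible here.
tails = {m: tail(m, 200, sup_coeff) for m in (1, 5, 10, 20)}
```

Any summable sequence of suprema would show the same qualitative behaviour, since \(\varepsilon (m,n)\) is bounded by the tail \(\sum _{|k|>m}\) of the convergent series in (1).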

Step 2. Suppose \(a\in \mathcal X\). In view of GLT 5 and the linearity of the map \(A_n(\cdot )\), it suffices to prove that \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) for any a of the form \(a(x,y,\theta )=\alpha (x,y)\beta (\theta )\), with \(\alpha \in C([0,1]^2)\) and \(\beta \in L^2([-\pi ,\pi ])\). Fix then \(a(x,y,\theta )=\alpha (x,y)\beta (\theta )\) with \(\alpha \in C([0,1]^2)\) and \(\beta \in L^2([-\pi ,\pi ])\). Let

$$\begin{aligned} a_m(x,y,\theta )=\alpha (x,y)p_m(\theta ), \end{aligned}$$

where \(p_m\) is a sequence of trigonometric polynomials such that \(p_m\rightarrow \beta \) in \(L^2([-\pi ,\pi ])\). As in Step 1, we show that:

  i. \(\{A_n(a_m)\}_n\sim _\mathrm{GLT}a_m(x,x,\theta )\) for every \(m\in \mathbb N\);

  ii. \(a_m(x,x,\theta )\rightarrow a(x,x,\theta )\) in measure over \([0,1]\times [-\pi ,\pi ]\);

  iii. \(\{A_n(a_m)\}_n\mathop {\longrightarrow }\limits ^\mathrm{a.c.s.}\{A_n(a)\}_n\).

Once this is done, the GLT relation \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) follows from GLT 8. Since \(a_m\) belongs to \(\mathcal W\) (note that only finitely many Fourier coefficients of \(a_m(x,y,\cdot )\) are nonzero, and each of them is a constant multiple of the continuous function \(\alpha \)), item i follows immediately from the result of Step 1. Since \(p_m\rightarrow \beta \) in \(L^2([-\pi ,\pi ])\), \(a_m(x,x,\theta )=\alpha (x,x)p_m(\theta )\rightarrow \alpha (x,x)\beta (\theta )=a(x,x,\theta )\) in \(L^2([0,1]\times [-\pi ,\pi ])\), and so \(a_m(x,x,\theta )\rightarrow a(x,x,\theta )\) in measure. This proves item ii. To prove item iii, we note that

$$\begin{aligned} A_n(a)-A_n(a_m)=A_n(a-a_m)=\left[ \alpha \Bigg (\frac{i-1}{n-1}, \frac{j-1}{n-1}\Bigg )\right] _{i,j=1}^n\circ \ T_n(\beta -p_m), \end{aligned}$$

where \(\circ \) denotes the componentwise (or Hadamard) product of matrices. The Frobenius norm of \(A_n(a)-A_n(a_m)\) can then be estimated by means of Proposition 1:

$$\begin{aligned} \Vert A_n(a)-A_n(a_m)\Vert _2^2 \le \Vert \alpha \Vert _{\infty }^2\Vert T_n(\beta -p_m)\Vert _2^2\le \Vert \alpha \Vert _{\infty }^2 \Vert \beta -p_m\Vert _{L^2}^2n. \end{aligned}$$
(12)

Item iii now follows from Proposition 3, taking into account that \(\Vert \beta -p_m\Vert _{L^2}\rightarrow 0\).
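The first inequality in (12) is an entrywise estimate: multiplying each entry of a matrix by a number of modulus at most \(\Vert \alpha \Vert _\infty \) cannot increase the Frobenius norm by more than that factor. The sketch below illustrates this on a small example. The function `alpha`, the dictionary `diff_coeffs` (standing in for the Fourier coefficients of \(\beta -p_m\)) and all helper names are hypothetical.

```python
import math


def frobenius(M):
    """Frobenius norm (Schatten 2-norm) of a matrix given as a list of rows."""
    return math.sqrt(sum(abs(v) ** 2 for row in M for v in row))


def toeplitz(n, coeffs):
    """T_n(f) for f with Fourier coefficients coeffs[k] (missing keys count as 0)."""
    return [[coeffs.get(i - j, 0.0) for j in range(n)] for i in range(n)]


def sampling(n, alpha):
    """The matrix [alpha((i-1)/(n-1), (j-1)/(n-1))] for i, j = 1..n."""
    return [[alpha(i / (n - 1), j / (n - 1)) for j in range(n)] for i in range(n)]


def hadamard(P, Q):
    """Componentwise (Hadamard) product of two matrices of equal size."""
    return [[p * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


def alpha(x, y):
    # Hypothetical continuous function with 0 < alpha <= 2 on [0,1]^2.
    return math.cos(x - y) + x * y


diff_coeffs = {-1: 0.3, 0: -0.1, 2: 0.2j}  # hypothetical coefficients of beta - p_m
n = 8
E = hadamard(sampling(n, alpha), toeplitz(n, diff_coeffs))
lhs = frobenius(E)                                  # ||A_n(a) - A_n(a_m)||_2
rhs = 2.0 * frobenius(toeplitz(n, diff_coeffs))     # sup|alpha| * ||T_n(beta - p_m)||_2
```

The second inequality in (12) then follows from Proposition 1 with \(p=2\), which bounds \(\Vert T_n(\beta -p_m)\Vert _2^2\) by \(\frac{n}{2\pi }\Vert \beta -p_m\Vert _{L^2}^2\le n\Vert \beta -p_m\Vert _{L^2}^2\).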

Step 3. Suppose \(a\in \mathcal Y\). In view of GLT 5 and the linearity of the map \(A_n(\cdot )\), it suffices to prove that \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) for any a of the form \(a(x,y,\theta )=\alpha (x)\beta (y)\gamma (\theta )\), with \(\alpha ,\beta \in C([0,1])\) and \(\gamma \in L^1([-\pi ,\pi ])\). Fix then \(a(x,y,\theta )=\alpha (x)\beta (y)\gamma (\theta )\) with \(\alpha ,\beta \in C([0,1])\) and \(\gamma \in L^1([-\pi ,\pi ])\). Let

$$\begin{aligned} a_m(x,y,\theta )=\alpha (x)\beta (y)p_m(\theta ), \end{aligned}$$

where \(p_m\) is a sequence of trigonometric polynomials such that \(p_m\rightarrow \gamma \) in \(L^1([-\pi ,\pi ])\). As in Steps 1–2, we show that:

  1. i.

    \(\{A_n(a_m)\}_n\sim _\mathrm{GLT}a_m(x,x,\theta )\) for every \(m\in \mathbb N\);

  2. ii.

    \(a_m(x,x,\theta )\rightarrow a(x,x,\theta )\) in measure over \([0,1]\times [-\pi ,\pi ]\);

  3. iii.

    \(\{A_n(a_m)\}_n\mathop {\longrightarrow }\limits ^\mathrm{a.c.s.}\{A_n(a)\}_n\).

Once this is done, the GLT relation \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) follows from GLT 8. Since \(a_m\) belongs to \(\mathcal W\) (note that only a finite number of Fourier coefficients of \(a_m(x,y,\cdot )\) is nonzero), item i follows immediately from the result of Step 1. Since \(p_m\rightarrow \gamma \) in \(L^1([-\pi ,\pi ])\), \(a_m(x,x,\theta )=\alpha (x)\beta (x)p_m(\theta )\rightarrow \alpha (x)\beta (x)\gamma (\theta )=a(x,x,\theta )\) in \(L^1([0,1]\times [-\pi ,\pi ])\), and so \(a_m(x,x,\theta )\rightarrow a(x,x,\theta )\) in measure. This proves item ii. To prove item iii, let

$$\begin{aligned} \Delta _n(g)=\mathop {\mathrm{diag}}\limits _{i=1,\ldots ,n}g\Bigl (\frac{i-1}{n-1}\Bigr ),\qquad g:[0,1]\rightarrow \mathbb C,\qquad n\ge 2. \end{aligned}$$

Note that

$$\begin{aligned} A_n(a)-A_n(a_m)=A_n(a-a_m)=\Delta _n(\alpha )T_n(\gamma -p_m)\Delta _n(\beta ). \end{aligned}$$

The trace-norm of \(A_n(a)-A_n(a_m)\) can then be estimated by means of the Hölder-type inequality (2) and Proposition 1:

$$\begin{aligned} \Vert A_n(a)-A_n(a_m)\Vert _1\le & {} \Vert \Delta _n(\alpha )\Vert \,\Vert \Delta _n(\beta )\Vert \,\Vert T_n(\gamma -p_m)\Vert _1\nonumber \\\le & {} \Vert \alpha \Vert _\infty \Vert \beta \Vert _\infty \Vert \gamma -p_m\Vert _{L^1}n. \end{aligned}$$
(13)

Item iii now follows from Proposition 3, taking into account that \(\Vert \gamma -p_m\Vert _{L^1}\rightarrow 0\). \(\square \)
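Both the factorization \(A_n(a)=\Delta _n(\alpha )T_n(\gamma )\Delta _n(\beta )\) for separable \(a\) and the trace-norm estimate (13) lend themselves to a quick numerical check. The Python sketch below is an illustration only: the function names, the sample \(\alpha ,\beta \) and the sample coefficients are our assumptions, the trigonometric polynomial `coeffs` stands in for \(\gamma -p_m\), and its (unnormalized) \(L^1\) norm is approximated by a Riemann sum.

```python
import numpy as np

def toeplitz_from_coeffs(coeffs, n):
    """T_n(f) = [f_{i-j}] for a trigonometric polynomial with coefficients coeffs[k]."""
    T = np.zeros((n, n), dtype=complex)
    for k, c in coeffs.items():
        T += c * np.eye(n, k=-k)       # i - j = k  <=>  diagonal offset -k
    return T

def entrywise_A_n(alpha, beta, coeffs, n):
    """A_n(a) built from the definition for a = alpha(x) beta(y) gamma(theta)."""
    x = np.linspace(0.0, 1.0, n)
    A = np.zeros((n, n), dtype=complex)
    for i in range(n):
        for j in range(n):
            A[i, j] = alpha(x[i]) * beta(x[j]) * coeffs.get(i - j, 0.0)
    return A

def factorized_A_n(alpha, beta, coeffs, n):
    """Delta_n(alpha) T_n(gamma) Delta_n(beta)."""
    x = np.linspace(0.0, 1.0, n)
    return np.diag(alpha(x)) @ toeplitz_from_coeffs(coeffs, n) @ np.diag(beta(x))

def trace_norm(A):
    """Sum of the singular values of A."""
    return np.linalg.svd(A, compute_uv=False).sum()

def l1_norm(coeffs, samples=4096):
    """Riemann-sum approximation of the integral of |f| over [-pi, pi]."""
    theta = -np.pi + 2 * np.pi * np.arange(samples) / samples
    vals = sum(c * np.exp(1j * k * theta) for k, c in coeffs.items())
    return 2 * np.pi * np.mean(np.abs(vals))

def trace_norm_bound(alpha, beta, coeffs, n):
    """Right-hand side of (13), with sup norms replaced by grid maxima."""
    x = np.linspace(0.0, 1.0, n)
    return np.max(np.abs(alpha(x))) * np.max(np.abs(beta(x))) * l1_norm(coeffs) * n

if __name__ == "__main__":
    alpha = lambda x: 1.0 + x ** 2
    beta = np.cos
    coeffs = {-1: 0.5, 0: 1.0, 2: -0.25j}
    A = entrywise_A_n(alpha, beta, coeffs, 30)
    print(np.allclose(A, factorized_A_n(alpha, beta, coeffs, 30)))
    print(trace_norm(A), trace_norm_bound(alpha, beta, coeffs, 30))
```

The first printed value compares the two constructions of \(A_n(a)\); the second line prints the two sides of (13).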

4 Consequences of the Main Result

Consequences of Theorem 1 (and of the theory of GLT sequences) are all the results that can be deduced from the list of properties GLT 1 – GLT 8 in which GLT 3 is extended to include the result of Theorem 1. For the sake of clarity, we use a different notation for the extended version of GLT 3 and we give it the label \(\overline{\mathbf{GLT\,3}}\). Property \(\overline{\mathbf{GLT\,3}}\) is then the following.

\(\overline{\mathbf{GLT\,3 }}.\) We have:

  • \(\{T_n(f)\}_n\sim _\mathrm{GLT}\kappa (x,\theta ):=f(\theta )\) if \(f\in L^1([-\pi ,\pi ])\);

  • \(\{D_n(a)\}_n\sim _\mathrm{GLT}\kappa (x,\theta ):=a(x)\) if \(a\in C_\mathrm{a.e.}([0,1])\);

  • \(\{Z_n\}_n\sim _\mathrm{GLT}\kappa (x,\theta ):=0\) if and only if \(\{Z_n\}_n\sim _\sigma 0\);

  • \(\{A_n(a)\}_n\sim _\mathrm{GLT}\kappa (x,\theta ):=a(x,x,\theta )\) if \(a\in \mathcal W\cup \mathcal X\cup \mathcal Y\).

In this section we discuss some consequences of GLT 1 – GLT 2, \(\overline{\mathbf{GLT\,3}}\), GLT 4 – GLT 8.

Spectral distribution results on the algebra generated by variable-coefficient Toeplitz sequences. Let \(\mathscr {C}\) denote the *-algebra generated by the variable-coefficient Toeplitz sequences \(\{A_n(a)\}_n\) with \(a\in \mathcal W\cup \mathcal X\cup \mathcal Y\). It is not difficult to see that

$$\begin{aligned} \mathscr {C}= & {} \left\{ \left\{ \sum _{i=1}^r\prod _{j=1}^{q_i}A_n(a_{ij})\right\} _n:\ r,q_1,\ldots ,q_r\in \mathbb N,\right. \\&\left. a_{ij}\in \mathcal W\cup \mathcal X\cup \mathcal Y \text{ for } \text{ all }\, i=1,\ldots ,r \text{ and } j=1,\ldots ,q_i\right\} . \end{aligned}$$

By \(\overline{\mathbf{GLT\,3}}\) and GLT 5, \(\mathscr {C}\) is a subalgebra of the GLT algebra \(\mathscr {A}\), and for the generic element of \(\mathscr {C}\) we have

$$\begin{aligned} \left\{ \sum _{i=1}^r\prod _{j=1}^{q_i}A_n(a_{ij})\right\} _n\sim _\mathrm{GLT}\sum _{i=1}^r\prod _{j=1}^{q_i}a_{ij}(x,x,\theta ). \end{aligned}$$

Hence, by GLT 1,

$$\begin{aligned} \left\{ \sum _{i=1}^r\prod _{j=1}^{q_i}A_n(a_{ij})\right\} _n\sim _\sigma \sum _{i=1}^r\prod _{j=1}^{q_i}a_{ij}(x,x,\theta ) \end{aligned}$$
(14)

and, if the matrices \(\sum _{i=1}^r\prod _{j=1}^{q_i}A_n(a_{ij})\) are Hermitian,

$$\begin{aligned} \left\{ \sum _{i=1}^r\prod _{j=1}^{q_i}A_n(a_{ij})\right\} _n\sim _\lambda \sum _{i=1}^r\prod _{j=1}^{q_i}a_{ij}(x,x,\theta ). \end{aligned}$$
(15)
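As a sanity check of the eigenvalue distribution (15) in the simplest case \(r=q_1=1\), one can compare spectral moments with the corresponding moments of the symbol. The Python sketch below is an illustration under our own assumptions: the symbol \(a(x,y,\theta )=(1+xy)(2-2\cos \theta )\) is a sample choice (a trigonometric polynomial in \(\theta \) with continuous coefficients, which we assume to belong to \(\mathcal W\)), the function names are ours, and the test function \(F(t)=t^2\) is not compactly supported; since the eigenvalues of this particular \(A_n(a)\) are uniformly bounded, \(F\) can be modified outside a bounded interval without changing either side.

```python
import numpy as np

def A_n(n):
    """A_n(a) for the sample symbol a(x, y, theta) = (1 + x*y) * (2 - 2*cos(theta)).

    The Fourier coefficients in theta are a_0 = 2(1 + x*y), a_{+-1} = -(1 + x*y).
    """
    x = np.linspace(0.0, 1.0, n)                 # grid points (i-1)/(n-1)
    weight = 1.0 + np.outer(x, x)                # 1 + x_i * x_j
    T = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return weight * T                            # entrywise product

def eigenvalue_moment(n, power):
    """(1/n) * sum_k lambda_k(A_n(a))**power; A_n(a) is real symmetric here."""
    lam = np.linalg.eigvalsh(A_n(n))
    return np.mean(lam ** power)

def symbol_moment(power, grid=1000):
    """Midpoint-rule value of (1/(2*pi)) * int int a(x, x, theta)**power dx dtheta."""
    x = (np.arange(grid) + 0.5) / grid
    theta = -np.pi + 2 * np.pi * (np.arange(grid) + 0.5) / grid
    vals = (1.0 + x[:, None] ** 2) * (2.0 - 2.0 * np.cos(theta[None, :]))
    return np.mean(vals ** power)

if __name__ == "__main__":
    for n in (50, 100, 400):
        print(n, eigenvalue_moment(n, 2), symbol_moment(2))
```

For this symbol the second moment of \(a(x,x,\theta )\) can be computed by hand: \(\int _0^1(1+x^2)^2dx\cdot \frac{1}{2\pi }\int _{-\pi }^\pi (2-2\cos \theta )^2d\theta =\frac{28}{15}\cdot 6=11.2\), and the printed eigenvalue moments should approach this value as \(n\) grows.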

A result analogous to (14)–(15) was obtained by Silbermann and Zabroda in [18, Theorem 7.2]. In particular, in view of [18, pp. 185–186], a consequence of [18, Theorem 7.2] is that (15) holds whenever the functions \(a_{ij}\) belong to

$$\begin{aligned}&C^\infty _\theta ([0,1]^2\times [-\pi ,\pi ]):=\Bigl \{a\in C([0,1]^2\times [-\pi ,\pi ]): \\&\mathrm{the}\,2\pi \hbox {-}\mathrm{periodic\,extension\,of}\,a(x,y,\cdot )\,\mathrm{from}\, [-\pi ,\pi )\, \mathrm{to} \, \mathbb R\\&\text{ belongs } \text{ to }\, C^\infty (\mathbb R)\, \text{ for } \text{ all }\,(x,y)\in [0,1]^2,\\&\text{ all } \text{ the } \text{ derivatives } \dfrac{\partial ^ra(x,y,\theta )}{\partial \theta ^r} \text{ belong } \text{ to }\, C([0,1]^2\times [-\pi ,\pi ])\Bigr \}. \end{aligned}$$

Since \(C^\infty _\theta ([0,1]^2\times [-\pi ,\pi ])\) is a subset of \(\mathcal W\) (as already observed in [18, pp. 171–172]), the relation (15) seems to add something to [18, Theorem 7.2], though it should be noted that the latter theorem implies more than the eigenvalue distribution (15) for functions \(a_{ij}\in C^\infty _\theta ([0,1]^2\times [-\pi ,\pi ])\). Concerning the singular value distribution (14), one of its consequences is that, if \(\sigma _1(A_n(a))\le \ldots \le \sigma _n(A_n(a))\) are the singular values of \(A_n(a)\) arranged in non-decreasing order, and if a is a function in \(\mathcal W\cup \mathcal X\cup \mathcal Y\) such that \(\mathrm{ess\,inf}_{(x,\theta )\in [0,1]\times [-\pi ,\pi ]}a(x,x,\theta )=0\), then

$$\begin{aligned} \lim _{n\rightarrow \infty }\sigma _i(A_n(a))=0\ \text{ for } \text{ each } \text{ fixed }\,i\ge 1; \end{aligned}$$
(16)

cf. [10, Theorem 1]. To give a formal proof of (16) based on (14), suppose by contradiction that there exists an index \(i\ge 1\) such that \(\sigma _i(A_n(a))\) does not converge to 0 as \(n\rightarrow \infty \). Then, there exist \(\varepsilon >0\) and a subsequence \(\{\sigma _i(A_{n_j}(a))\}_j\) such that \(\sigma _i(A_{n_j}(a))\ge \varepsilon \) for all j. This implies that \(\sigma _k(A_{n_j}(a))\ge \varepsilon \) for all \(k\ge i\) and all j (recall that we are assuming the singular values are arranged in non-decreasing order). Take a function \(F\in C_c(\mathbb C)\) such that \(F=1\) over \([0,\varepsilon /2]\), \(F=0\) over \([\varepsilon ,\infty )\), and \(0\le F\le 1\) over \([0,\infty )\). For this function we have

$$\begin{aligned} \lim _{j\rightarrow \infty }\frac{1}{n_j}\sum _{k=1}^{n_j}F(\sigma _k(A_{n_j}(a)))= \lim _{j\rightarrow \infty }\frac{1}{n_j}\sum _{k=1}^{i-1}F(\sigma _k(A_{n_j}(a)))\le \lim _{j\rightarrow \infty }\frac{i-1}{n_j}=0. \end{aligned}$$

On the other hand, since we are assuming that \(\mathrm{ess\,inf}_{(x,\theta )\in [0,1]\times [-\pi ,\pi ]}a(x,x,\theta )=0\), the measure of the set \(S=\{(x,\theta )\in [0,1]\times [-\pi ,\pi ]:\ 0\le a(x,x,\theta )\le \varepsilon /2\}\) is positive and the singular value distribution (14) yields

$$\begin{aligned}&\lim _{n\rightarrow \infty }\frac{1}{n}\sum _{k=1}^nF(\sigma _k(A_n(a)))=\frac{1}{2\pi } \int _{-\pi }^\pi \int _0^1F(a(x,x,\theta ))dxd\theta \\&\ge \frac{1}{2\pi } \int _SF(a(x,x,\theta ))dxd\theta =\frac{\mu _2(S)}{2\pi }>0, \end{aligned}$$

a contradiction.
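The limit relation (16) can be observed on a concrete example. The Python sketch below is an illustration only: the symbol \(a(x,y,\theta )=(1+xy)(2-2\cos \theta )\) is our sample choice (we assume it belongs to \(\mathcal W\)), for which \(a(x,x,\theta )=(1+x^2)(2-2\cos \theta )\ge 0\) has essential infimum 0, and the function names are ours.

```python
import numpy as np

def A_n(n):
    """A_n(a) for the sample symbol a(x, y, theta) = (1 + x*y) * (2 - 2*cos(theta))."""
    x = np.linspace(0.0, 1.0, n)                 # grid points (i-1)/(n-1)
    weight = 1.0 + np.outer(x, x)                # 1 + x_i * x_j
    T = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return weight * T                            # entrywise product

def smallest_singular_values(n, count=3):
    """The `count` smallest singular values of A_n(a), in non-decreasing order."""
    s = np.linalg.svd(A_n(n), compute_uv=False)  # returned in non-increasing order
    return s[::-1][:count]

if __name__ == "__main__":
    for n in (20, 40, 80, 160):
        print(n, smallest_singular_values(n))
```

For each fixed position \(i\), the printed values \(\sigma _i(A_n(a))\) should decrease toward 0 as \(n\) grows, in agreement with (16).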

Spectral distribution results beyond the algebra generated by variable-coefficient Toeplitz sequences. It is clear that the relations (14)–(15) are far from exhausting the spectral distribution results that can be derived from GLT 1 – GLT 2, \(\overline{\mathbf{GLT\,3}}\), GLT 4 – GLT 8. In particular, GLT 1, \(\overline{\mathbf{GLT\,3}}\) and GLT 5 – GLT 7 allow one to compute the singular value and eigenvalue distribution of matrix-sequences that are obtained not only through sums and products of variable-coefficient Toeplitz sequences, but also through more complex operations involving all the GLT sequences listed in \(\overline{\mathbf{GLT\,3}}\). For example, let \(a\in \mathcal W\cup \mathcal X\cup \mathcal Y\). If \(a(x,x,\theta )\ne 0\) a.e., then GLT 1, \(\overline{\mathbf{GLT\,3}}\), GLT 6 yield

$$\begin{aligned}&\{A_n(a)^\dag \}_n\sim _\mathrm{GLT}\frac{1}{a(x,x,\theta )},\\&\{A_n(a)^\dag \}_n\sim _\sigma \frac{1}{a(x,x,\theta )}. \end{aligned}$$

If in addition the matrices \(A_n(a)\) are Hermitian for all n, then GLT 1, \(\overline{\mathbf{GLT\,3}}\), GLT 6 also yield

$$\begin{aligned}&\{A_n(a)^\dag \}_n\sim _\lambda \frac{1}{a(x,x,\theta )}. \end{aligned}$$

If the matrices \(A_n(a)\) are Hermitian for all n, then GLT 1, \(\overline{\mathbf{GLT\,3}}\), GLT 7 give

$$\begin{aligned}&\{\sin A_n(a)\}_n\sim _\mathrm{GLT}\sin (a(x,x,\theta )),\\&\{\sin A_n(a)\}_n\sim _\sigma \sin (a(x,x,\theta )),\\&\{\sin A_n(a)\}_n\sim _\lambda \sin (a(x,x,\theta )). \end{aligned}$$

If \(a(x,x,\theta )\ne 0\) a.e. and the matrices \(A_n(a)\) are Hermitian for all n, then GLT 1, \(\overline{\mathbf{GLT\,3}}\), GLT 5 – GLT 7 yield

$$\begin{aligned}&\{T_n(|\theta |^{-1/2})A_n(a)^\dag \mathrm{e}^{A_n(a)}A_n(a)^\dag T_n(|\theta |^{-1/2})+8D_n(\log x)A_n(a)^\dag D_n(\log x)\}_n\\&\sim _\mathrm{GLT}\frac{\mathrm{e}^{a(x,x,\theta )}}{|\theta |a(x,x,\theta )^2}+\frac{8\log ^2x}{a(x,x,\theta )},\\&\{T_n(|\theta |^{-1/2})A_n(a)^\dag \mathrm{e}^{A_n(a)}A_n(a)^\dag T_n(|\theta |^{-1/2})+8D_n(\log x)A_n(a)^\dag D_n(\log x)\}_n\\&\sim _\sigma \frac{\mathrm{e}^{a(x,x,\theta )}}{|\theta |a(x,x,\theta )^2}+\frac{8\log ^2x}{a(x,x,\theta )},\\&\{T_n(|\theta |^{-1/2})A_n(a)^\dag \mathrm{e}^{A_n(a)}A_n(a)^\dag T_n(|\theta |^{-1/2})+8D_n(\log x)A_n(a)^\dag D_n(\log x)\}_n\\&\sim _\lambda \frac{\mathrm{e}^{a(x,x,\theta )}}{|\theta |a(x,x,\theta )^2}+\frac{8\log ^2x}{a(x,x,\theta )}. \end{aligned}$$

We could continue with this game indefinitely...

5 Final Remarks and Future Perspectives

In [18, Theorem 9.1] the anonymous reviewer of [18] formulated a result which extends Theorem 1. If this result were true, the GLT relation \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) would be satisfied for any function \(a\in C([0,1]^2\times [-\pi ,\pi ])\) (and for more general functions as well). As a consequence, we could replace the fourth item in \(\overline{\mathbf{GLT\,3}}\) by the following,

  • \(\{A_n(a)\}_n\sim _\mathrm{GLT}\kappa (x,\theta ):=a(x,x,\theta )\) if \(a\in \mathcal W\cup \mathcal X\cup \mathcal Y\cup C([0,1]^2\times [-\pi ,\pi ])\),

thus obtaining a property \(\overline{\mathbf{GLT\,3}}\) stronger than the one considered in Sect. 4, and we could then generalize the results of Sect. 4 accordingly. It should be noted, however, that the proof of [18, Theorem 9.1] provided in [18, Sect. 10] is unfortunately wrong. The flaw lies in [18, p. 195], where it is asserted that a bound on the Frobenius norm such as \(\Vert A_n-A_{n,\varepsilon }\Vert _2\le \varepsilon n\) implies that \(\{\{A_{n,\varepsilon }\}_n\}_{\varepsilon >0}\) is an a.c.s. for \(\{A_n\}_n\) as \(\varepsilon \rightarrow 0\). This is not true, as shown by the following example.

Example

Let \(\{A_n\}_n\) be a matrix-sequence and let \(A_{n,\varepsilon }:=A_n+c_{n,\varepsilon }I_n\), where \(I_n\) is the \(n\times n\) identity matrix and \(c_{n,\varepsilon }\) is a coefficient depending on both n and \(\varepsilon \). It is clear that

$$\begin{aligned} \Vert A_n-A_{n,\varepsilon }\Vert _2 = |c_{n,\varepsilon }|\sqrt{n}. \end{aligned}$$

If we choose \(c_{n,\varepsilon }\) so that

$$\begin{aligned} 0\le c_{n,\varepsilon }\le \varepsilon \sqrt{n}, \end{aligned}$$
(17)

then the Frobenius norm \(\Vert A_n-A_{n,\varepsilon }\Vert _2\) is bounded by \(\varepsilon n\) for all n and \(\varepsilon \). However, there are many choices for the coefficient \(c_{n,\varepsilon }\) such that (17) is satisfied but \(\{\{A_{n,\varepsilon }\}_n\}_{\varepsilon >0}\) is not an a.c.s. for \(\{A_n\}_n\) as \(\varepsilon \rightarrow 0\). For example, take any \(c_{n,\varepsilon }\) satisfying (17) and such that, for every \(\varepsilon >0\),

$$\begin{aligned} \lim _{n\rightarrow \infty }c_{n,\varepsilon }=c_\varepsilon \ge c>0, \end{aligned}$$
(18)

with c independent of \(\varepsilon \). A possible choice is \(c_{n,\varepsilon }=\varepsilon \sqrt{n}\). Then \(\{\{A_{n,\varepsilon }\}_n\}_{\varepsilon >0}\) is not an a.c.s. for \(\{A_n\}_n\) as \(\varepsilon \rightarrow 0\). Let us prove it formally. If \(R_{n,\varepsilon }\) and \(N_{n,\varepsilon }\) are any two matrices such that

$$\begin{aligned} -c_{n,\varepsilon }I_n=A_n-A_{n,\varepsilon }=R_{n,\varepsilon }+N_{n,\varepsilon }, \end{aligned}$$

then, by the minimax principle for singular values [1, Problem III.6.1],

$$\begin{aligned} c_{n,\varepsilon }&=\sigma _i(R_{n,\varepsilon }+N_{n,\varepsilon })\nonumber \\&=\max _{\begin{array}{c} V\text{ subspace } \text{ of }\,\mathbb C^n\\ \dim V=i \end{array}}\min _{\begin{array}{c} \mathbf x\in V\\ \Vert \mathbf x\Vert =1 \end{array}}\Vert R_{n,\varepsilon }\mathbf x+N_{n,\varepsilon }\mathbf x\Vert \nonumber \\&\le \max _{\begin{array}{c} V\text{ subspace } \text{ of }\,\mathbb C^n\\ \dim V=i \end{array}}\min _{\begin{array}{c} \mathbf x\in V\\ \Vert \mathbf x\Vert =1 \end{array}}\Vert R_{n,\varepsilon }\mathbf x\Vert +\Vert N_{n,\varepsilon }\Vert \nonumber \\&=\sigma _i(R_{n,\varepsilon })+\Vert N_{n,\varepsilon }\Vert , \end{aligned}$$
(19)

for all \(i=1,\ldots ,n\). If \(\mathrm{rank}(R_{n,\varepsilon })<n\), then at least one singular value \(\sigma _i(R_{n,\varepsilon })\) is zero, and by (18)–(19) we have \(\Vert N_{n,\varepsilon }\Vert \ge c_{n,\varepsilon }\rightarrow c_\varepsilon \ge c>0\). In view of (4)–(5), this implies that \(d_\mathrm{a.c.s.}(\{A_n\}_n,\{A_{n,\varepsilon }\}_n)\ge \min (c,1)>0\) for all \(\varepsilon >0\) and, consequently, \(\{\{A_{n,\varepsilon }\}_n\}_{\varepsilon >0}\) is not an a.c.s. for \(\{A_n\}_n\) as \(\varepsilon \rightarrow 0\).
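The mechanism behind the Example can also be seen numerically. The Python sketch below is an illustration under our own assumptions: it uses the choice \(c_{n,\varepsilon }=\varepsilon \sqrt{n}\) mentioned above, the function name `acs_counterexample` is ours, and it tests only one particular splitting of the perturbation \(c_{n,\varepsilon }I_n\) into a low-rank part plus a remainder (the one given by a truncated singular value decomposition), whereas the argument based on (19) covers every splitting.

```python
import numpy as np

def acs_counterexample(n, eps, rank):
    """Return (Frobenius norm of E, spectral norm of the remainder N, c).

    E = c * I_n is the perturbation with c = eps * sqrt(n); R is the
    rank-`rank` truncated SVD of E and N = E - R is what is left.
    """
    c = eps * np.sqrt(n)
    E = c * np.eye(n)
    fro = np.linalg.norm(E, "fro")               # equals c * sqrt(n) = eps * n
    U, s, Vh = np.linalg.svd(E)
    R = (U[:, :rank] * s[:rank]) @ Vh[:rank, :]  # low-rank part
    N = E - R                                    # remainder
    return fro, np.linalg.norm(N, 2), c

if __name__ == "__main__":
    for n in (25, 100, 400):
        print(n, acs_counterexample(n, eps=0.1, rank=n // 2))
```

In each printed triple the Frobenius norm equals \(\varepsilon n\), yet the spectral norm of the remainder coincides with \(c_{n,\varepsilon }\) and therefore does not become small as \(n\) grows.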

We conclude this work by mentioning two possible ways to extend Theorem 1, i.e., to prove the GLT relation \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) for a space of functions a larger than \(\mathcal W\cup \mathcal X\cup \mathcal Y\).

  1. 1.

    The first way originates from the observation that, when proving items i–iii in Step 1 of Sect. 3, we actually proved more than needed. We emphasize in particular the following aspect. When proving item iii, we showed that

    $$\begin{aligned} \Vert A_n(a)-A_{n,m}(a)\Vert \le \varepsilon (m,n) \end{aligned}$$
    (20)

    for some \(\varepsilon (m,n)\) satisfying

    $$\begin{aligned} \lim _{m\rightarrow \infty }\limsup _{n\rightarrow \infty }\varepsilon (m,n)=0. \end{aligned}$$
    (21)

    On the other hand, it would have been enough to show that \(\{A_{n,m}(a)\}_n\mathop {\longrightarrow }\limits ^\mathrm{a.c.s.}\{A_n(a)\}_n\), by proving for example that

    $$\begin{aligned} \Vert A_n(a)-A_{n,m}(a)\Vert _p\le \varepsilon (m,n)n^{1/p} \end{aligned}$$
    (22)

    for some \(p\in [1,\infty )\) and some \(\varepsilon (m,n)\) satisfying (21); see Proposition 3. Note that (22) is a condition weaker than (20), because for all \(p\in [1,\infty )\) we have

    $$\begin{aligned} \frac{\Vert A\Vert _p}{n^{1/p}}\le \Vert A\Vert , \end{aligned}$$

    with the equality holding if and only if all the singular values of A are equal. In view of these considerations, there is room to refine the argument used for the proof of Theorem 1 (Step 1), so as to weaken the hypotheses on a and, consequently, to enlarge the space of functions a such that \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\).

  2. 2.

    Suppose that, for all functions a belonging to a certain class \(\mathcal C\subset \mathcal L_1\), we can construct a sequence \(\{a_m\}_m\subset \mathcal L_1\) such that

    1. i.

      the relation \(\{A_n(a_m)\}_n\sim _\mathrm{GLT}a_m(x,x,\theta )\) holds for all m,

    2. ii.

      \(a_m\) converges to a in some way which ensures that \(a_m(x,x,\theta )\rightarrow a(x,x,\theta )\) in measure,

    3. iii.

      \(\{A_n(a_m)\}_n\mathop {\longrightarrow }\limits ^\mathrm{a.c.s.}\{A_n(a)\}_n\).

    Then \(\{A_n(a)\}_n\sim _\mathrm{GLT}a(x,x,\theta )\) for all \(a\in \mathcal C\) (by GLT 8). This technique is the second possible way to extend Theorem 1, and it was already used in the proof of Theorem 1 with \(\mathcal C=\mathcal W\) (Step 1), \(\mathcal C=\mathcal X\) (Step 2) and \(\mathcal C=\mathcal Y\) (Step 3). What is important to point out is that, now that we have proved Theorem 1, we are allowed to take any sequence in \(\mathcal W\cup \mathcal X\cup \mathcal Y\) as the sequence \(\{a_m\}_m\), because item i is automatically satisfied and item ii is satisfied as well, provided we choose an \(a_m\) converging to a in a suitable way. In Steps 2–3 we precisely used the strategy outlined here by taking \(\{a_m\}_m\subset \mathcal W\) such that \(a_m\rightarrow a\) in \(L^p\). Moreover, the strategy worked successfully because the \(L^p\) convergence implied the a.c.s. convergence \(\{A_n(a_m)\}_n\mathop {\longrightarrow }\limits ^\mathrm{a.c.s.}\{A_n(a)\}_n\), thanks to the estimates (12)–(13), that is, to the specific “separable” structure of the functions \(a_m,a\) and to the crucial inequality \(\Vert T_n(f)\Vert _p\le n^{1/p}\Vert f\Vert _{L^p}\). It would be interesting to see whether this same strategy also works for classes \(\mathcal C\) other than \(\mathcal X\) and \(\mathcal Y\). In particular, does it work if \(\mathcal C\) is one of the Hölder classes considered in [3, 4, 17]?
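The inequality \(\Vert A\Vert _p/n^{1/p}\le \Vert A\Vert \) recalled in item 1, together with its equality case, can be checked numerically. The Python sketch below is an illustration only; the function names are ours, and the test matrices (a random Gaussian matrix and a multiple of an orthogonal matrix, whose singular values are all equal) are sample choices.

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm: the l^p norm of the vector of singular values of A."""
    s = np.linalg.svd(A, compute_uv=False)
    return np.sum(s ** p) ** (1.0 / p)

def normalized_pair(A, p):
    """Return (||A||_p / n^(1/p), ||A||), the two sides of the inequality."""
    n = A.shape[0]
    return schatten_norm(A, p) / n ** (1.0 / p), np.linalg.norm(A, 2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 50
    G = rng.standard_normal((n, n))              # generic matrix: strict inequality expected
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    for p in (1, 2, 4):
        print(p, normalized_pair(G, p), normalized_pair(3.0 * Q, p))
```

For the generic matrix the first entry of each pair is smaller than the second, which shows how much weaker a bound of the form (22) is than (20); for the multiple of an orthogonal matrix the two entries coincide up to rounding.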

The extension of Theorem 1 along the lines suggested in items 1–2 above may be the subject of future research.