Geometric means of quasi-Toeplitz matrices

Bini, Dario A.; Iannazzo, Bruno; Meng, Jie

doi:10.1007/s10543-023-00962-2

Geometric means of quasi-Toeplitz matrices

Published: 08 March 2023

Volume 63, article number 20, (2023)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

BIT Numerical Mathematics Aims and scope Submit manuscript

Geometric means of quasi-Toeplitz matrices

Download PDF

Dario A. Bini¹,
Bruno Iannazzo² &
Jie Meng³

264 Accesses
2 Citations
Explore all metrics

Abstract

We study means of geometric type of quasi-Toeplitz matrices, that are semi-infinite matrices $A=(a_{i,j})_{i,j=1,2,\ldots }$ of the form $A=T(a)+E$, where E represents a compact operator, and T(a) is a semi-infinite Toeplitz matrix associated with the function a, with Fourier series $\sum _{k=-\infty }^{\infty } a_k e^{{{\mathfrak {i}}}k t}$, in the sense that $(T(a))_{i,j}=a_{j-i}$. If a is real valued and essentially bounded, then these matrices represent bounded self-adjoint operators on $\ell ^2$. We prove that if $a_1,\ldots ,a_p$ are continuous and positive functions, or are in the Wiener algebra with some further conditions, then matrix geometric means, such as the ALM, the NBMP and the weighted mean of quasi-Toeplitz positive definite matrices associated with $a_1,\ldots ,a_p$, are quasi-Toeplitz matrices associated with the function $(a_1\cdots a_p)^{\frac{1}{p}}$, which differ only by the compact correction. We introduce numerical algorithms for their computation and show by numerical tests that these operator means can be effectively approximated numerically.

Algorithms for Approximating Means of Semi-infinite Quasi-Toeplitz Matrices

Matrix-Valued Truncated Toeplitz Operators: Unbounded Symbols, Kernels and Equivalence After Extension

Article Open access 29 January 2022

On the exponential of semi-infinite quasi-Toeplitz matrices

Article 31 October 2018

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

In this paper, we examine the concept of matrix geometric mean applied to a set of semi-infinite Toeplitz matrices, provide a theoretical analysis of the main properties, and perform the design and analysis of numerical algorithms for the computation of matrix geometric means like the ALM, the NBMP, and the weighted mean.

1.1 Matrix geometric means

The concept of matrix geometric mean of a set of positive definite matrices, together with its analysis and computation, has received an increasing interest in the past years due to its rich and elegant theoretical properties, the nontrivial algorithmic issues related to its computation, and also the important role played in several applications.

A fruitful approach to get a matrix geometric mean has been to identify and impose the right properties required by this function. These properties, known as Ando-Li-Mathias axioms, are listed in the seminal work [1]. They are satisfied, in particular, by the ALM mean [1], the NBMP mean, independently introduced in [11] and [35], and by the Riemannian center of mass, the so-called Karcher mean, identified as a matrix geometric mean in [33]. We refer the reader to the book [4] for an introduction to matrix geometric means with historical remarks. These means have been generalized in a natural way to the infinite dimensional case. The Karcher mean has been extended to self-adjoint positive definite operators and to self-adjoint elements of a $C^*$-algebras in [30] and [28], respectively. Another natural generalization relies on the concept of weighted mean: a generalization of the NBMP mean is presented in [29].

Matrix geometric means have played an important role in several applications such as radar detection [27, 44], image processing [38], elasticity [34]. More recent applications include machine learning [20, 42], brain-computer interface [45, 47], and network analysis [15]. This has lead to the design and analysis of numerical methods for matrix geometric means. See in particular [5, 6, 11, 19, 21, 23, 46], and the literature cited therein. In certain applications, the matrices to be averaged have further structures originated by the peculiar features of the physical model that they describe. In the radar detection application, they are Toeplitz. A (possibly infinite) matrix $T=(t_{i,j})$ is said to be Toeplitz if $ t_{i,j}=a_{j-i}$ for some $\{a_k\}_{k\in {\mathbb {Z}}}\subset {\mathbb {C}}$. The sequence $\{a_k\}_{k\in {\mathbb {Z}}}$ may arise as the set of Fourier coefficients of a function a defined on the unit circle ${\mathbb {T}}=\{z\in {\mathbb {C}}:\,|z|=1\}$, with Fourier series $\sum _{\ell =-\infty }^\infty a_\ell z^\ell $. The function a is said to be the symbol associated with $T=(t_{i,j})$. In the case where i, j range in the set ${\mathbb {Z}}^+$ of positive integers, T is semi-infinite and is denoted by T(a).

The problem of averaging finite-dimensional Toeplitz matrices has been treated in some recent papers, see for instance [7, 22, 36, 37], with the aim to provide a definition of matrix geometric mean $G=G(A_1,\ldots ,A_p)$ which preserves the matrix structure of the input matrices $A_k$, for $k=1,\ldots ,p$.

1.2 Theoretical contributions

We consider the problem of analyzing the structure of the matrix G when $A_k=T(a_k)$, for $k=1,\ldots ,p$, is a self-adjoint positive definite Toeplitz matrix and G is either the ALM, NBMP, and the weighted mean. These results can be complemented with similar results for the Karcher mean provided in [8] We start by considering the case where the symbols $a_k(z)$, $k=1,\ldots ,p$, are continuous real valued positive functions.

We show that if G is the ALM, NBMP, or a weighted mean of the self-adjoint positive definite operators $T(a_1),\ldots ,T(a_p)$, then, even though the Toeplitz structure is lost in G, a hidden structure is preserved. In fact, we show that G is a quasi-Toeplitz matrix, that is, it can be written as $G=T(g)+K_G$, where $K_G$ is a self-adjoint compact operator on the set $\ell ^2$ formed by infinite vectors $v=(v_i)_{i\in {\mathbb {Z}}^+}$, $v_i\in {\mathbb {C}}$ such that $\Vert v\Vert _2:=(\sum _{i=1}^\infty |v_i|^2)^\frac{1}{2}<\infty $. An interesting feature is that the function g(z) turns out to be $(a_1(z)a_2(z)\cdots a_p(z))^{\frac{1}{p}}$, as expected. These results are extended to the more general case $A_k=T(a_k)+K_k$, where $K_k$ is a compact operator.

Among the side results that we have introduced to prove the main properties of the geometric mean, it is interesting to point out Theorem 2.6 which states that for any continuous real valued function a(z) and for any continuous function f(t) defined on the spectrum of T(a), the difference $f(T(a))-T(f(a))$ is a compact operator.

Algebras of quasi-Toeplitz matrices are also known in the literature as Toeplitz algebras. See for instance [14] where in Example 2.8 the case of continuous symbols is considered, or in Section 7.3, where the smallest closed subalgebra of $\mathcal B(\ell ^2)$ containing the matrices of the type T(a), where a has an absolute convergent Fourier series, is considered. Here, ${\mathcal {B}}(\ell ^2)$ denotes the set of bounded operators on the set $\ell ^2$.

After analyzing the case of continuous symbols, we identify additional regularity assumptions on the functions $a_1(z),\ldots ,a_p(z)$, such that the summation $\sum _{\ell =-\infty }^{+\infty }|g_\ell |$ is finite, where $\sum _{\ell =-\infty }^{+\infty }g_\ell z^\ell $ is the Fourier series of $g(z)=(a_1(z)\cdots a_p(z))^{\frac{1}{p}}$. This property is meaningful since it implies that $T(a_k)$ as well as G are bounded operators on the set $\ell ^\infty $ of infinite vectors with uniformly bounded components. We will apply these ideas also in the case where the matrices $A_1,\ldots ,A_p$, are of finite size n and show that, for a sufficiently large value of n, the matrix G is numerically well approximated by a Toeplitz matrix plus a matrix correction that is nonzero in the entries close to the top-left corner and to the bottom-right corner of the matrix.

1.3 Algorithmic contributions

Regarding computational issues, we examine classical algorithms for computing the square root and the pth root of a matrix as Cyclic Reduction and Newton’s iteration, and analyze their convergence properties when applied to Toeplitz matrices given in terms of the associated symbols. Then we deal with the problem of computing G for the ALM, NBMP, and the weighted geometric means. We introduce and analyze algorithms for computing the Toeplitz part T(g) and the correction part $K_G$ in a very efficient way, both in the case of infinite and of finite Toeplitz matrices. A set of numerical experiments shows that the geometric means of infinite quasi-Toeplitz matrices can be easily and accurately computed by relying on simple MATLAB implementations, based on the CQT-Toolbox of [10]. An extension of the CQT-Toolbox is provided by introducing the functions qt_alm, qt_nbmp, and qt_karcher for computing the corresponding means.

1.4 Paper organization

The paper is organized as follows: in the next section, after providing some preliminary results on Banach algebras and $C^*$-algebras, we recall the most common matrix geometric means, introduce the class of quasi-Toeplitz matrices associated with a continuous symbol and study functions of matrices in this class. Moreover, we analyze the case where the symbol of quasi-Toeplitz matrices is in the Wiener class, that is, the associated Fourier series is absolutely convergent. In Sect. 3 we prove that the most common definitions of geometric mean of operators preserve the quasi-Toeplitz matrix structure, and that the symbol associated with the geometric mean is the geometric mean of the symbols associated with the input matrices. In the same section, we give regularity conditions on the symbols in order that the convergence of the sequence of symbols generated by the ALM recursion holds in the Wiener norm, this implies the convergence of the operator sequences in the infinity norm. In Sect. 4 we discuss some algorithmic and numerical issues related to the computation of the pth root and of the geometric mean of quasi-Toeplitz matrices, in Sect. 5 we describe the geometric means of finite quasi-Toeplitz matrices, while in Sect. 6 we report the results of some numerical experiments. Finally, Sect. 7 draws some conclusions and raises some open problems.

2 Preliminaries and notations

Let ${{\mathfrak {i}}}$ be the complex unit, and denote by $\mathbb T=\{z\in {\mathbb {C}}\,:\, |z|=1\}$ the unit circle in the complex plane. Given a function $f(z):{\mathbb {T}}\rightarrow {\mathbb {C}}$ the composition $\widetilde{f}(t)=f(e^{{{\mathfrak {i}}}t})$ is a $2\pi $-periodic function from ${\mathbb {R}}$ to ${\mathbb {C}}$ which can be restricted to $[0,2\pi ]$. For the sake of simplicity, we keep the notation f(t) to denote $\widetilde{f}(t)$, where it should be understood that the t variable is real and ranges in the set $[0,2\pi ]$, while the z variable is complex and ranges in the set ${\mathbb {T}}$.

For a given subset ${\mathcal {S}}\subset {\mathbb {R}}$, and $1\le p<\infty $, we denote by $L^p({\mathcal {S}})$ the set of functions $f(t):{\mathcal {S}}\rightarrow {\mathbb {C}}$, such that $\Vert f\Vert _{p,\mathcal S}=(\int _{\mathcal {S}} |f(t)|^pdt)^\frac{1}{p}<\infty $. We denote $L^\infty ({\mathcal {S}})$ the set of measurable functions defined over ${\mathcal {S}}$ with finite essential supremum and we set $\Vert f\Vert _{\infty ,{\mathcal {S}}}=\text{ ess }\sup _{t\in {\mathcal {S}}}|f(t)|$. If the set ${\mathcal {S}}$ is clear from the context we write $\Vert f\Vert _{L^p}$ in place of $\Vert f\Vert _{p,{\mathcal {S}}}$ for $1\le p<\infty $ and $\Vert f\Vert _{\infty }$ instead of $\Vert f\Vert _{\infty ,\mathcal S}$.

We denote by $C({\mathcal {S}})$ the set of continuous functions $f:{\mathcal {S}}\rightarrow {\mathbb {C}}$. The subset of $f\in C([0,2\pi ])$ such that $f(0)=f(2\pi )$ is denoted by $C_{2\pi }$, while $L^p_{2\pi } =L^p([0,2\pi ))$.

Recall that any linear operator $A:\ell ^2\rightarrow \ell ^2$ can be represented by a semi-infinite matrix $(a_{i,j})_{i,j\in \mathbb Z^+}$, which we denote with the same symbol A. The set of bounded operators on $\ell ^2$ is denoted by ${\mathcal {B}}(\ell ^2)$, or more simply by ${\mathcal {B}}$. We recall that the set $\mathcal K\subset {\mathcal {B}}$ of compact operators is the closure in $\mathcal B$ of the set of bounded operators with finite rank. Given the matrix (operator) $A=(a_{i,j})$, we define $A^*=({\overline{a}}_{j,i})$ the conjugate transpose of A, where ${\overline{x}}$ denotes the complex conjugate of $x\in {\mathbb {C}}$. A matrix (operator) A is Hermitian or self-adjoint if $A=A^*$, moreover, is positive if in addition $v^*Av=\sum _i {\overline{v}}_i(Av)_i>0$ for any $v\in \mathbb C^n{\setminus }\{0\}$ ($v\in \ell ^2{\setminus }\{0\}$). We say that $A\in \ell ^2$ is positive definite if there exists a constant $\gamma >0$ such that $v^*Av\ge \gamma \Vert v\Vert _2^2$ for any $v\in \ell ^2$.

2.1 Banach algebras and $C^*$-algebras

A Banach algebra ${\mathfrak {B}}$ is a Banach space with a product such that $\Vert ab\Vert \le \Vert a\Vert \Vert b\Vert $, for $a,b\in {\mathfrak {B}}$, where $\Vert \cdot \Vert $ is the norm of ${\mathfrak {B}}$. Examples of Banach algebras that we will use are: the set of essentially bounded functions $L^\infty (S)$, with $S\subset {\mathbb {R}}$, with the norm $\Vert f\Vert _{\infty ,S}$ and pointwise multiplication (as a special case we have $L_{2\pi }^\infty $); the set C(K) of continuous functions in a compact $K\subset {\mathbb {C}}^n$ with pointwise multiplication; and $\mathcal B(\ell ^2)$ with operator composition. If $a^{(k)}\in {\mathfrak {B}}$ is a sequence such that $\lim _k\Vert a^{(k)}-a\Vert =0$ for some $a\in \mathfrak B$, we write $a^{(k)}\rightarrow _{{\mathfrak {B}}} a$.

For two sequences $\{a_1^{(k)}\}_k, \{a_2^{(k)}\}_k$ in a Banach algebra ${\mathfrak {B}}$, it can be easily shown (see, for instance, [40, Remark 2.10]) that if $a_1^{(k)}\rightarrow _{{\mathfrak {B}}} a_1$ and $a_2^{(k)}\rightarrow _{{\mathfrak {B}}} a_2$, then $a_1^{(k)} a_2^{(k)}\rightarrow _{{\mathfrak {B}}} a_1a_2$. An induction argument applied to the sequences $\{a_i^{(k)}\}_k\subset {\mathfrak {B}}$ for $i=1,\ldots ,p$ will provide the next Lemma.

Lemma 2.1

Let ${\mathfrak {B}}$ be a Banach algebra and $a_1,\ldots ,a_p\in {\mathfrak {B}}$. If the sequences $\{a_i^{(k)}\}_k\subset {\mathfrak {B}}$ are such that $a_i^{(k)}\rightarrow _{{\mathfrak {B}}} a_i$ for $i=1,\ldots ,p$, then

$$\begin{aligned} a_1^{(k)}\cdots a_p^{(k)}\rightarrow _{{\mathfrak {B}}} a_1\cdots a_p. \end{aligned}$$

The spectrum $\textrm{sp}(A)$ of an element A of a Banach algebra is the set of $\lambda \in {\mathbb {C}}$ such that $A-\lambda I$ fails to be invertible. The spectral radius $\rho (A)$ is defined as $\rho (A)=\sup _{\lambda \in \textrm{sp}(A)} |\lambda |$.

A $C^*$-algebra is a Banach algebra ${\mathfrak {B}}$ with an involution $A\rightarrow A^*$ such that $(aS+bT)^*={\overline{a}} S^*+{\overline{b}} T^*$; $(ST)^*=T^*S^*$, $(T^*)^*=T$ and $\Vert T^*T\Vert =\Vert T^2\Vert $, for $T,S\in {\mathfrak {B}}$ and $a,b\in {\mathbb {C}}$. Two important examples of $C^*$-algebras are ${\mathcal {B}}(\ell ^2)$ with the adjoint operation and C(K), for $K\subset {\mathbb {R}}$ compact, with the pointwise conjugation. A $C^*$-subalgebra $\mathfrak F\subset {\mathfrak {B}}$ is a closed subalgebra that contains the identity and such that $A\in {\mathfrak {F}}$ implies $A^*\in \mathfrak F$. An element A of a $C^*$-algebra is self-adjoint if $A^*=A$, normal if $A^*A=AA^*$. We need some additional results taken from Proposition 4.1.1, and Theorems 4.1.3 and 4.1.6 of [24].

Lemma 2.2

Let A be an element of a $C^*$-algebra ${\mathfrak {B}}$.

(i)
If A is normal then $\rho (A)=\Vert A\Vert $. Moreover, if A is self-adjoint, then the spectrum $\textrm{sp}(A)$ of A is a compact subset of ${\mathbb {R}}$.
(ii)
If A is self-adjoint, then there exists a unique continuous mapping $f\rightarrow f(A):C(\textrm{sp}(A))\rightarrow {\mathfrak {B}}$ such that f(A) has its elementary meaning when f is a polynomial. Moreover, f(A) is normal, while it is self-adjoint if and only if f takes real values at the spectrum of A.
(iii)
If A is self-adjoint and $f\in C(\textrm{sp}(A))$, then $\Vert f(A)\Vert =\Vert f\Vert _{\infty ,\textrm{sp}(A)}$ and $\textrm{sp}(f(A))=f(\textrm{sp}(A))$.

Lemma 2.2 implies the existence of a continuous functional calculus, that is to define a function of a self-adjoint element A of a $C^*$-algebra when the function is continuous at the spectrum of A.

2.2 Means of operators

The weighted geometric mean of two $n\times n$ matrices A and B, with weight $t\in [0,1]$, is defined as $ A\#_t B= A^{\frac{1}{2}}(A^{-\frac{1}{2}}BA^{-\frac{1}{2}})^tA^{\frac{1}{2}}, $ while the geometric mean is $G(A,B):=A\# B:=A\#_{1/2}B$. Notice that, using continuous functional calculus, the same formula makes sense also for bounded self-adjoint positive definite operators $A,B\in {\mathcal {B}}(\ell ^2)$.

For more than two matrices, several different definitions have been proposed. For the ease of the reader, here we limit the treatment to three positive matrices of the same size, while we refer to a later section for the case of more than three matrices. The ALM mean, proposed by Ando, Li, and Mathias [1], has been introduced as the common limit of the sequences

$$\begin{aligned} A_{k+1}=B_k\# C_k,\quad B_{k+1}=C_k\# A_k,\quad C_{k+1}=A_k\# B_k, \end{aligned}$$

with $A_0=A, B_0=B, C_0=C$. A nice feature of this mean is that the sequences converge with respect to the Thompson metric [41]

$$\begin{aligned} d(A,B)=\log \max \{\rho (A^{-1}B),\rho (B^{-1}A)\}. \end{aligned}$$

It is well known that convergence in the Thompson metric implies convergence in the operator norm induced by the $\ell ^2$ norm, also in the infinite dimensional case. This fact allows one to define the ALM mean of self-adjoint positive definite operators.

A variant of this construction, introduced independently by Bini, Meini, Poloni [11] and Nakamura [35], named NBMP mean, relies on the sequences

$$\begin{aligned} A_{k+1}=A_k\#_{2/3}(B_k\# C_k),~\, B_{k+1}=B_k\#_{2/3}(C_k\# A_k),~\, C_{k+1}=C_k\#_{2/3}(A_k\# B_k), \end{aligned}$$

with $A_0=A$, $B_0=B$, $C_0=C$. The sequences converge to a common limit, different in general from the ALM mean, and with a faster rate than the ALM sequences. The convergence of these sequences in the Thompson metric allows one to extend them to operators. Weighted versions of the ALM and NBMP mean can be given (see Sect. 3 for more details).

A well-recognized geometric mean of matrices is the unique positive definite solution of the matrix equation

$$\begin{aligned} \log (X^{-\frac{1}{2}}AX^{-\frac{1}{2}})+\log (X^{-\frac{1}{2}}BX^{-\frac{1}{2}})+\log (X^{-\frac{1}{2}}CX^{-\frac{1}{2}})=0, \end{aligned}$$

the so-called Karcher mean of A, B and C. This mean can be defined also for positive definite self-adjoint bounded operators, proving that the equation above has a unique positive definite self-adjoint solution [30]. Recently, the Karcher mean of positive definite quasi-Toeplitz matrices has been proved to be a positive definite quasi-Toeplitz matrix in [8], where effective algorithms are introduced and analyzed.

2.3 Quasi-Toeplitz matrices with continuous symbols

A classical result states that T(a) represents a bounded operator on $\ell ^2$ if and only if $a\in L^{\infty }_{2\pi }$ [12, Theorem 1.1]. Moreover, if a is continuous, then $\Vert T(a)\Vert _2=\Vert a\Vert _\infty $.

Toeplitz matrices form a linear subspace of ${\mathcal {B}}(\ell ^2)$, closed by involution since $T(a)^*=T({\overline{a}})$, but not closed under multiplication, that is, by composition of operators in $B(\ell ^2)$, so they are not a $C^*$-subalgebra of $\mathcal B(\ell ^2)$. Nevertheless, there is a nice formula for the product of two Toeplitz matrices.

Proposition 2.3

[12, Propositions 1.10 and 1.11]. Let $a,b\in L_{2\pi }^{\infty }$,

$$\begin{aligned} T(a)T(b)=T(ab)-H(a^-)H(b^+), \end{aligned}$$

(2.1)

where $H(a^-)_{i,j}=(a_{-i-j+1})_{i,j\in {\mathbb {Z}}^+}$ and $H(b^+)=(b_{i+j-1})_{i,j\in {\mathbb {Z}}^+}$. Moreover, if a, b are continuous functions then $H(a^-),H(b^+)$ are compact operators on $\ell ^2$.

A nice feature of equation (2.1) is that it relates the symbol of the product to the product of the symbols and if $a,b\in C_{2\pi }$ then $T(a)T(b)-T(ab)$ belongs to the set ${\mathcal {K}}$ of compact operators on $\ell ^2$ [12].

The smallest $C^*$-subalgebra of the $C^*$-algebra $\mathcal B(\ell ^2)$ containing all Toeplitz operators with continuous symbols is [14]

$$\begin{aligned} {\mathcal {Q}}{\mathcal {T}}=\{T(a)+K:\, a\in C_{2\pi },K \in {\mathcal {K}}\}, \end{aligned}$$

that we call the set of quasi-Toeplitz matrices and it is also known in the literature as Toeplitz algebra. Notice that for $A\in {\mathcal {Q}}{\mathcal {T}}$ there exists unique $a\in C_{2\pi }$ and $K\in {\mathcal {K}}$ such that $A=T(a)+K$, because the intersection between Toeplitz matrices with continuous symbols and compact operators is the zero operator [12, Proposition 1.2]. To define a ${\mathcal {Q}}{\mathcal {T}}$ matrix, we will use often the notation $A=T(a)+K\in {\mathcal {Q}}{\mathcal {T}}$, without saying explicitly that a is a continuous and $2\pi $-periodic function and K is a compact operator.

The following lemma collects some known results from [12, Sections 1.4 and 1.5], [14, Section 1.1] and from [17], which will be useful in the sequel. In particular, it states properties of the spectrum $\text{ sp }(A)$ of A, of the essential spectrum $\mathrm{sp_{ess}}(A)$ defined by $\mathrm{sp_{ess}}(A)=\{\lambda \in {\mathbb {C}}: A-\lambda I \ \hbox {is not Fredholm on}\ {\mathcal {B}}(\ell ^2)\}$, and on the numerical range of A defined as $\textrm{W}(A)=\{x^*Ax:\,\Vert x\Vert _2=1\}$.

Lemma 2.4

The following properties hold:

(i)
T(a) represents a bounded operator on $\ell ^2$ if and only if $a\in L_{2\pi }^{\infty }$; T(a) is self adjoint if and only if a is real valued.

Moreover, if $a\in C_{2\pi }$ then:

(ii)
$\Vert a\Vert _\infty =\Vert T(a)\Vert _2\le \Vert T(a)+K\Vert _2$, for $K\in {\mathcal {K}}$;
(iii)
$\text{ sp }(T(a))=a([0,2\pi ])\cup \{\lambda \in {\mathbb {C}}:\,\text{ wind }(a-\lambda )\ne 0\}$ is compact, where $\mathrm{wind(a)}$ is the winding number of the curve $a({\mathbb {T}})$;
(iv)
$a([0,2\pi ])=\mathrm{sp_{ess}}(T(a))=\mathrm{sp_{ess}}(T(a)+K)\subset \text{ sp }(T(a)+K)$ for $K\in {\mathcal {K}}$;
(v)
$\textrm{sp} (A)$ is contained in the closure of $\textrm{W}(A)$ for $A\in {\mathcal {B}}(\ell ^2)$.

It was proved in [12, Proposition 1.3] that a Toeplitz matrix T(a) is self-adjoint if and only if a is real valued. We prove that if a quasi-Toeplitz matrix $A=T(a)+E$ is positive (definite), then a is nonnegative (strictly positive).

Lemma 2.5

Let $A=T(a)+K\in {\mathcal {Q}}{\mathcal {T}}$. If $A=A^*$ then $T(a)=T(a)^*$ and $K=K^*$. If A is positive, then a is nonnegative. If in addition A is positive definite then a is strictly positive.

Proof

If $A=A^*$, then $T(a)-T(a)^* +K-K^*=0$, and by the uniqueness of the decomposition of a quasi-Toeplitz matrix [12], we have that $K=K^*$ and $T(a)=T(a)^*$. If A is positive definite, there exists $\gamma >0$ such that $x^*Ax\ge \gamma \Vert x\Vert _2>0$ for any $x\ne 0$. That is, the numerical range $\textrm{W}(A)$ of A is formed by real numbers greater than or equal to $\gamma $ so that its closure contains positive values. Since by Lemma 2.4, part (v), the spectrum $\textrm{sp} (A)$ is contained in the closure of $\textrm{W}(A)$, we have that $\lambda >0$ for any $\lambda \in \textrm{sp}(A)$. From Lemma 2.4 part (iv), it follows that $a(t)\in \textrm{sp}(A)$ so that $a(t)>0$. The case where A is positive is treated similarly since $x^*Ax\ge 0$ for any $x\in \ell ^2$ so that the closure of $\textrm{W}(A)$ is formed by nonnegative numbers. $\square $

The fact that ${\mathcal {Q}}{\mathcal {T}}$ is a $C^*$-algebra allows one to use continuous functional calculus, by Lemma 2.2. If A is a self-adjoint quasi-Toeplitz matrix and f is a function continuous on the spectrum of A, then one can define the normal quasi-Toeplitz matrix f(A). In particular, the pth root of a self-adjoint positive definite quasi-Toeplitz matrix turns out to be a quasi-Toeplitz matrix. This fact implies that the sequences generated by the ALM and the NBMP constructions, if the initial values are ${\mathcal {Q}}{\mathcal {T}}$ matrices, are formed by entries belonging to ${\mathcal {Q}}{\mathcal {T}}$. We will prove that also the limit G of these sequences belongs to ${\mathcal {Q}}{\mathcal {T}}$, and that it can be written as $G=T(g)+K_G$, where g(t) is the geometric mean of the symbols associated with the Toeplitz part of the given matrices.

We now prove the following result, that is apparently new and which allows us to define the function f(T(a)) for a self-adjoint Toeplitz matrix T(a) and a continuous function f defined on the spectrum of T(a).

Theorem 2.6

Let $a\in C_{2\pi }$ be a real valued function and f a continuous function on the spectrum of T(a), then

$$\begin{aligned} f(T(a))=T(f\circ a)+K, \end{aligned}$$

(2.2)

where K is a compact operator. If f takes real values on the spectrum of T(a), then K is self-adjoint and $\textrm{sp}(f(T(a)))=\textrm{sp}(T(f\circ a))$.

Proof

Since a is real valued, by Lemma 2.4 the operator T(a) is self-adjoint and since a is continuous, its spectrum $\Sigma $ is a compact set containing the range of a (see Lemma 2.4). Moreover, one can define f(T(a)), for any function f continuous on $\Sigma $, using the continuous functional calculus and, since the function f is continuous in $\Sigma $, by Lemma 2.2, we have $\Vert f(T(a))\Vert _2=\Vert f\Vert _{\infty ,\Sigma }$.

From Proposition 2.3 it follows that if f is a polynomial, then equation (2.2) holds. By using an approximation argument, we prove that (2.2) still holds for any continuous function f. Indeed, f can be approximated uniformly by a sequence of polynomials $p_n$ such that $\Vert f-p_n\Vert _{\infty ,\Sigma }$ tends to 0 (by the Weierstrass theorem) and there exist compact operators $K_k$ such that

$$\begin{aligned} p_k(T(a))=T(p_k\circ a)+K_k. \end{aligned}$$

Since continuous functional calculus preserves $C^*$-algebras, $f(T(a))\in {\mathcal {Q}}{\mathcal {T}}$, that is $f(T(a))=T(b)+K$, what we need to prove is that $b=f\circ a$.

To get the result, in view of the uniqueness of the decomposition of a quasi-Toeplitz matrix, it is sufficient to prove that $f(T(a))-T(f\circ a)$ is a compact operator. We have

$$\begin{aligned} \begin{aligned}&\Vert f(T(a))-T(f\circ a)-K_k\Vert _2 \\&\le \Vert f(T(a))-p_k(T(a))\Vert _2+\Vert T(f\circ a)-T(p_k\circ a)\Vert _2 \le 2\Vert f-p_k\Vert _{\infty ,\Sigma }, \end{aligned}\nonumber \\ \end{aligned}$$

(2.3)

where the last inequality follows by functional calculus, since $f-p_k$ is continuous in the spectrum $\Sigma $ of T(a) and from the property $\Vert T(a)\Vert _2=\Vert a\Vert _{\infty }$ (compare Lemma 2.4). Whence we get

$$\begin{aligned}{} & {} \Vert T(f\circ a-p_k\circ a)\Vert _2 = \Vert (f-p_k)\circ a\Vert _{\infty ,[0,2\pi ]}\\ {}{} & {} =\Vert f-p_k\Vert _{\infty ,a([0,2\pi ])}\le \Vert f-p_k\Vert _{\infty ,\Sigma }. \end{aligned}$$

Inequality (2.3) implies that $K_k$ converges to $f(T(a))-T(f\circ a)$ in the operator norm, and since ${\mathcal {K}}$ is closed, we may conclude that $f(T(a))-T(f\circ a)$ is a compact operator, that is what we wanted to prove.

Regarding the last statement, by Lemma 2.2, part (ii) and Lemma 2.4, we know that f(T(a)) and $T(f\circ a)$ are self-adjoint and thus K is self-adjoint as well, by Lemma 2.2, part (iii), and Lemma 2.4, we get

$$\begin{aligned} \textrm{sp}(f(T(a)))=f(\textrm{sp}(T(a)))=f(a([0,2\pi ]))=\textrm{sp}(T(f\circ a)). \end{aligned}$$

$\square $

By slightly modifying the above proof, we can easily arrive at the following generalization, which shows that $f(T(a)+H)$ is well-defined if $T(a)+H$ is a self-adjoint quasi-Toeplitz matrix and f is a continuous function defined on the spectrum of $T(a)+H$.

Theorem 2.7

Let $a\in C_{2\pi }$ be a real valued function, H a self-adjoint compact operator and f a continuous function on the spectrum of $T(a)+H$, then $f(T(a)+H)=T(f\circ a)+K$, where K is a compact operator. If f takes real values on the spectrum of T(a), then K is self-adjoint.

As an immediate consequence of the above Theorem 2.7 in the following corollary we provide new insight on the structure and the symbol of the weighted geometric mean of quasi-Toeplitz matrices.

Corollary 2.8

Let $A,B\in {\mathcal {Q}}{\mathcal {T}}$ be positive definite operators associated with the continuous symbols a, b, respectively. Then, $a,b>0$ and for $s\in [0,1]$, we have $G=A\#_s B\in {\mathcal {Q}}{\mathcal {T}}$ and the symbol g associated with G is such that $g=a^{1-s}b^s$.

Proof

From Lemma 2.5 we deduce that if A is positive definite then $a,b>0$. By definition, $A\#_s B=A^\frac{1}{2}(A^{-\frac{1}{2}}BA^{-\frac{1}{2}})^s A^{\frac{1}{2}}$ so that, by Theorem 2.7 one has $A^\frac{1}{2}, (A^{-\frac{1}{2}}BA^{-\frac{1}{2}})^s\in {\mathcal {Q}}{\mathcal {T}}$, thus $A\#_s B\in {\mathcal {Q}}{\mathcal {T}}$ and $g=a^{1-s}b^s$. $\square $

Since ${\mathcal {Q}}{\mathcal {T}}$ is a $C^*$-algebra, if a sequence $\{A_k\}_k\subset {\mathcal {Q}}{\mathcal {T}}$ converges to $A\in \mathcal {B}(\ell ^2)$, then it is easy to see that $A\in {\mathcal {Q}}{\mathcal {T}}$. In the following lemma, we prove that the Toeplitz part and the compact part of $A_k$ converge to the Toeplitz part and the compact part of A, respectively.

Lemma 2.9

Let $\{A_k\}_k$ be a sequence of quasi-Toeplitz matrices such that $A_k=T(a_k)+K_k\in {\mathcal {Q}}{\mathcal {T}}$, where $a_k\in C_{2\pi }$, and $K_k\in {\mathcal {K}}$, for $k\in {\mathbb {Z}}^+$. Let $A=T(a)+K\in {\mathcal {Q}}{\mathcal {T}}$ be such that $\lim _k\Vert A-A_k\Vert _2=0$. Then $\lim _k\Vert a-a_k\Vert _\infty =0$ and $\lim _k\Vert K-K_k\Vert _2=0$.

Proof

From the inequality $\Vert T(a)\Vert _2\le \Vert A\Vert _2$ (see Lemma 2.4, part (ii)) and from $\Vert a\Vert _\infty =\Vert T(a)\Vert _2$, it follows that $\Vert a-a_k\Vert _\infty =\Vert T(a-a_k)\Vert _2\le \Vert A-A_k\Vert _2$ tends to 0 and so does $\Vert K-K_k\Vert _2$. $\square $

2.4 Quasi-Toeplitz matrices with symbols in the Wiener algebra

If $a(z)=\sum _{\ell =-\infty }^{+\infty } a_\ell z^\ell $ is a continuous function then $\lim _{n}|a_n|=0$, therefore, for an error bound $\varepsilon >0$ the cardinality of the set $\mathcal {S}_\varepsilon =\{n\in {\mathbb {Z}}:~ |a_n|<\varepsilon \}$ is finite. This fact allows one to numerically approximate the function a(z) to any precision in a finite number of operations.

Classical results of harmonic analysis relate the regularity of a(z) with the decay of the Fourier coefficients to zero. A better regularity implies a faster convergence to zero of $\{a_n\}_n$ and, in practice, this allows one to approximate a(z) by using fewer coefficients.

We will consider Toeplitz matrices associated with functions having an analytic or an absolutely convergent Fourier series. The former situation is ideal since it implies an exponential decay to zero of $\{a_n\}_n$.

We point out that if the function a is just continuous then $\sum _{i=-\infty }^{+\infty }|a_i|$ is not necessarily finite so that $\Vert T(a)\Vert _\infty = \sum _{i=-\infty }^{+\infty }|a_i|$ (compare [13, Theorem 1.14]) is not bounded in general. This is a reason to determine conditions under which a function f(a) for $a\in C_{2\pi }$, as well as the geometric mean of $a_i\in C_{2\pi }$, $i=1,\ldots ,p$, have absolutely summable coefficients or are analytic.

Another important related issue is to find out under which conditions a sequence of matrices $A_k\in \mathcal{Q}\mathcal{T}$ such that $\Vert A_k\Vert _\infty <\infty $ converges in the infinity norm to a limit $A\in \mathcal{Q}\mathcal{T}$ such that $\Vert A\Vert _\infty <\infty $. In this section we provide tools to give an answer to these questions.

The set of functions $a\in L_{2\pi }^1$ with absolutely convergent Fourier series, namely

$$\begin{aligned} {\mathcal {W}}=\left\{ f\in L_{2\pi }^1:\,f=\sum _{k\in {\mathbb {Z}}} a_k e^{{{\mathfrak {i}}}k t},\,\sum _{k\in {\mathbb {Z}}}|a_k|<\infty \right\} , \end{aligned}$$

with the norm $\Vert f\Vert _{_{_{{\mathcal {W}}}}}:=\sum _{k\in {\mathbb {Z}}} |a_k|$, is a Banach algebra, also known as the Wiener algebra [12]. A function $a\in {\mathcal {W}}$ is necessarily continuous, but it may fail to be differentiable, or even Lipschitz continuous.

We consider the convergence of a sequence in three broad classes of functions included in the Wiener algebra. The following introduction of the classes of functions can be found in [2].

The first class is the set of $\alpha $-Hölder continuous functions. A continuous function $\varphi :\Omega \rightarrow {\mathbb {C}}$, with $\Omega $ subset of ${\mathbb {R}}^n$ or ${\mathbb {C}}^n$, is said to be $\alpha $-Hölder continuous on $\Omega $, with $0<\alpha \le 1$, if

$$\begin{aligned}{}[\varphi ]_\alpha := \sup _{{\mathop {x\ne y}\limits ^{x,y\in \Omega }}}\frac{|\varphi (x)-\varphi (y)|}{|x-y|^\alpha } \end{aligned}$$

is finite. We denote by $C_{2\pi }^{0,\alpha }$ the set of $\alpha $-Hölder continuous functions on ${\mathbb {R}}$ with period $2\pi $, that is a Banach algebra with the norm

$$\begin{aligned} \Vert \varphi \Vert _{C^{0,\alpha }}:=\Vert \varphi \Vert _\infty +[\varphi ]_\alpha . \end{aligned}$$

The second class is the set of functions of bounded variation on $[0,2\pi ]$, that is functions $f:[0,2 \pi ]\rightarrow {\mathbb {C}}$, such that

$$\begin{aligned} V(f)=\sup _{{\mathop {x_0=\,0\le x_1\le \cdots \le x_N=2\pi }\limits ^{N=1,2,\ldots }}}\Bigl \{\sum _{i=0}^{N-1}|f(x_{i+1})-f(x_i)|\Bigr \} \end{aligned}$$

is finite.

The last class is the set of absolutely continuous functions on $C_{2\pi }$, that is functions $f:[0,2\pi ]\rightarrow \mathbb {R}$ such that $f(2\pi )=f(0)$ and for any $\varepsilon >0$, there exists $\delta >0$ such that

$$\begin{aligned} \sum _{i=1}^n|f(y_i)-f(x_i)|<\varepsilon , \end{aligned}$$

whenever $\{[x_i,y_i]: i=1,\ldots , n\}$ is a finite collection of mutually disjoint subintervals of $[0,2\pi ]$ with $\sum _{i=1}^n(y_i-x_i)<\delta $.

Relations between ${\mathcal {W}}$ and these three classes are stated in the following summary, which except inequality (2.5) are collected from [3] and [25].

Proposition 2.10

Let $a\in C_{2\pi }$, we have

(i)
If $a\in C_{2\pi }^{0,\alpha }$, with $\alpha \in (1/2,1]$, then $a\in {\mathcal {W}}$. Moreover, there exists a constant $\gamma _a$ such that
$$\begin{aligned} \Vert a\Vert _{_{_{{\mathcal {W}}}}}\le \gamma _a \Vert a\Vert _{C^{0,\alpha }}. \end{aligned}$$
(2.4)
(ii)
If $a\in C_{2\pi }^{0,\alpha }$, with $\alpha \in (0,1]$ and of bounded variation V(a) on $[0,2\pi ]$, then $a\in {\mathcal {W}}$ and there exists a constant $\gamma _b$ such that
$$\begin{aligned} \Vert a\Vert _{_{_{{\mathcal {W}}}}}\le \gamma _b \bigl (|a_0|+V(a)^{\frac{1}{2}}\sum _{\ell =1}^\infty (\pi 2^{-\ell })^{\alpha /2}\bigr ). \end{aligned}$$
(2.5)
(iii)
If $a\in C_{2\pi }$ is absolutely continuous on $[0,2\pi ]$ and $a'\in L_{2\pi }^2$, then $a\in {\mathcal {W}}$ and there exists a constant $\gamma _c$ such that
$$\begin{aligned} \Vert a\Vert _{_{_{{\mathcal {W}}}}}\le \gamma _c\bigl (\Vert a\Vert _{L^1}+\Vert a'\Vert _{L^2}\bigr ). \end{aligned}$$
(2.6)

Proof

Concerning part (i), the statement about $\alpha $-Hölder continuous functions with $\alpha \in (1/2,1]$ is a classical result by Bernstein [3], while the complete proof when $0<\alpha <1$ can be found in [25, p. 33] and the case $\alpha =1$ can be proved analogously.

Concerning part (iii), the statement about the absolute continuous function with derivative in $L_{2\pi }^2$ follows from [25, Theorem I.6.2].

It is left to prove part (ii). If $a\in C_{2\pi }^{0,\alpha }$ with $\alpha \in (0,1]$, and of bounded variation on $[0,2\pi ]$, then by [25, Theorem I.6.4] (see also [48, p. 241]), it follows that $a\in {\mathcal {W}}$. We just need to prove the inequality (2.5).

Denote by $\omega (\delta )=\omega (\delta ,a)=\sup |a(t_1)-a(t_2)|$ for $t_1,t_2\in [0,2\pi ]$ and $|t_1-t_2|\le \delta $. Applying the technique as in [48, p.241-242], we get

$$\begin{aligned} \sum _{k=1}^{\infty }|a_k|\le \frac{1}{2}\sum _{\ell =1}^{\infty }\bigl (\omega (\pi 2^{-\ell })\bigr )^{\frac{1}{2}}V(a)^{\frac{1}{2}} \le \frac{\sqrt{\gamma _1}}{2}\sum _{\ell =1}^{\infty }(\pi 2^{-\ell })^{\frac{\alpha }{2}}V(a)^{\frac{1}{2}}, \end{aligned}$$

(2.7)

where the last inequality holds since $\omega (\delta )\le r_1\delta ^{\alpha }$ for a constant $r_1$ independent of $\delta $ if $a\in C^{0,\alpha }_{2\pi }$. It can be proved analogously that

$$\begin{aligned} \sum _{k=1}^{\infty }|a_{-k}|\le \frac{\sqrt{\gamma _1}}{2}\sum _{\ell =1}^{\infty }(\pi 2^{-\ell })^{\alpha /2}V(a)^{\frac{1}{2}}. \end{aligned}$$

The proof is completed by choosing $\gamma _b=\max \{1,\sqrt{\gamma _1}\}$. $\square $

Note that, in particular, Proposition 2.10 implies that a Lipschitz continuous function with period $2\pi $ belongs to ${\mathcal {W}}$.

We are interested in fractional powers and means of functions in the Wiener algebra. In the case of analytic functions there is an interesting result due to Lévy [32].

Proposition 2.11

Let $a\in {\mathcal {W}}$ and let f be a complex function, analytic in the range of a, then $f\circ a\in {\mathcal {W}}$.

Clearly, if both functions a and f are analytic then $f\circ a$ is analytic. We prove the following simpler result related to Hölder continuous functions $f\circ a$.

Lemma 2.12

Let $a:\Omega \rightarrow {\mathbb {R}}$ be $\alpha $-Hölder continuous and let $f\in C^1(\mathcal I)$ where ${\mathcal {I}}$ is a closed interval containing the range of a. Then $f\circ a$ is $\alpha $-Hölder continuous and

$$\begin{aligned}{}[f\circ a]_\alpha \le \Vert f'\Vert _{\infty ,{\mathcal {I}}} [a]_\alpha . \end{aligned}$$

Moreover $\Vert f\circ a\Vert _{C^{0,\alpha }}\le \gamma (\Vert f\Vert _{\infty ,{\mathcal {I}}}+\Vert f'\Vert _{\infty ,{\mathcal {I}}})$, where $\gamma =\max \{[a]_\alpha ,1\}$.

Proof

If $x,y\in \Omega $ are such that $a(x)\ne a(y)$, then, by the Mean Value Theorem,

$$\begin{aligned} \frac{|f(a(x))-f(a(y))|}{|x-y|^\alpha }= \frac{|f(a(x))-f(a(y))|}{|a(x)-a(y)|} \frac{|a(x)-a(y)|}{|x-y|^\alpha }\le \Vert f'\Vert _{\infty ,{\mathcal {I}}}[a]_\alpha . \end{aligned}$$

The bound holds also when $a(x)=a(y)$ and we have $[f\circ a]_\alpha \le \Vert f'\Vert _{\infty ,{\mathcal {I}}}[a]_\alpha $.

The latter inequality follows from $\Vert f\circ a\Vert _{C^{0,\alpha }}=\Vert f\circ a\Vert _\infty +[f\circ a]_\alpha $. $\square $

As a consequence we have the following.

Corollary 2.13

If $a(t)\in {\mathcal {W}}$ is such that $a(t)>0$ then $a(t)^{\frac{1}{p}}\in {\mathcal {W}}$. If in addition $a(t)\in C_{2\pi }^{0,\alpha }$ for some $\alpha >0$ then $a^{\frac{1}{p}}\in C_{2\pi }^{0,\alpha }$ and $[a^{\frac{1}{p}}]_\alpha \le \frac{1}{p}\frac{[a]_\alpha }{\min a(t)^{1-\frac{1}{p}}}$.

Proof

Since $f(z)=z^{\frac{1}{p}}$ is analytic in the range of a, by Proposition 2.11, $a^{\frac{1}{p}}\in {\mathcal {W}}$. Let $\mathcal I=[\min a(t),\max a(t)]$ be a closed interval of positive numbers enclosing the range of a, then $f\in C^1({\mathcal {I}})$ and we can apply Lemma 2.12, using $\Vert f'\Vert _{\infty ,\mathcal I}=\frac{1}{p\min a(t)^{1-\frac{1}{p}}}$. $\square $

Similarly, we have the following result.

Lemma 2.14

Let $a: [0, 2\pi ]\rightarrow {\mathbb {R}}$ be a real valued function and let $f\in C^1({\mathcal {I}})$ where ${\mathcal {I}}$ is a closed interval containing the range of a.

(i)
If a is absolutely continuous and $a'\in L_{2\pi }^2$, then $f\circ a$ is absolutely continuous and with derivative in $L_{2\pi }^2$. Moreover, $\Vert (f\circ a)'\Vert _{L^2}\le \Vert f'\Vert _{\infty ,{\mathcal {I}}}\Vert a'\Vert _{L^2}$.
(ii)
If a is of bounded variation on $[0,2\pi ]$, then $f\circ a$ is of bounded variation on $[0,2\pi ]$ and, moreover, $V(f\circ a)\le \Vert f'\Vert _{\infty ,{\mathcal {I}}} V(a)$.

Proof

Concerning part (i), it follows from [2, Theorem 5.10, part (d)] that $f\circ a$ is absolutely continuous. Observe that $(f\circ a)'=(f'\circ a)a'$ and

$$\begin{aligned} \Bigl (\int _{0}^{2\pi }|(f\circ a)'|^2 dt\Bigr )^{\frac{1}{2}}\le \Bigl (\int _{0}^{2\pi }\Vert f'\circ a\Vert _{\infty }^2|a'|^2 dt\Bigr )^{\frac{1}{2}}=\Vert f'\Vert _{\infty , {\mathcal {I}}}\Vert a'\Vert _{L^2}<\infty , \end{aligned}$$

which implies that $(f\circ a)' \in L_{2\pi }^2$ and $\Vert (f\circ a)'\Vert _{L^2}\le \Vert f'\Vert _{\infty ,{\mathcal {I}}}\Vert a'\Vert _{L^2}$.

Concerning part (ii), a direct consequence of [2, Theorem 5.10, part (e)] shows that $f\circ a$ is of bounded variation.

Consider a partition $x_0=0\le x_1\le \ldots \le x_\ell =2\pi $, we have

$$\begin{aligned} \begin{aligned}&\sum _{j=0}^{\ell -1}|f(a(x_{j+1})) -f(a(x_j))| \\&=\sum _{\underset{a(x_j)\ne a(x_{j+1})}{j=0}}^{\ell -1}\frac{|f(a(x_{j+1}))-f(a(x_j))|}{|a(x_{j+1})-a(x_j)|} |a(x_{j+1})-a(x_j)|\\&\le \sup _{\underset{x\ne y}{x,y\in {\mathcal {I}}}}\frac{|f(x)-f(y)|}{|x-y|} \sum _{j=0}^{\ell -1} |a(x_{j+1})-a(x_j)|\le \Vert f'\Vert _{\infty ,{\mathcal {I}}}V(a). \end{aligned} \end{aligned}$$

Taking the supremum over all partitions, we get the desired inequality. $\square $

3 Geometric means of ${\mathcal {Q}}{\mathcal {T}}$ matrices

We start this section by recalling a general construction of the ALM mean and the NBMP mean of p $(p\ge 3)$ positive definite operators.

Denote by $G_t(A,B)$, the weighted geometric mean $A\#_{t} B$, with weight $t\in [0,1]$. Given the $(p-1)$-tuple $(s_1,s_2,\ldots , s_{p-1})$ with $s_i\in [0,1]$ and positive definite operators $A_1,\ldots , A_p$, the sequences generated by

$$\begin{aligned} A_i^{(k+1)}=A_i^{(k)}\#_{s_1}G_{s_2,\ldots ,s_{p-1}}(A_1^{(k)},\ldots ,A_{i-1}^{(k)},A_{i+1}^{(k)},\ldots ,A_p^{(k)}), \ i=1,\ldots , p,\nonumber \\ \end{aligned}$$

(3.1)

with $A_i^{(0)}=A_i$, can be recursively defined, and they converge to a common limit $G_{s_1,\ldots ,s_{p-1}}$ [31].

With the choice $(s_1,\ldots ,s_{p-2},s_{p-1})=(1,\ldots ,1,\frac{1}{2})$ one obtains the ALM mean [1] and with the choice $(s_1,s_2,\ldots , s_{p-1})=((p-1)/p, (p-2)/(p-1), \ldots , \frac{1}{2})$ one obtains the NBMP mean. We call the corresponding iterations the ALM iteration and the NBMP iteration, respectively. The latter construction for positive definite matrices can be found in [11, 35], while for positive definite operators the convergence of the sequences in Thompson metric was proved in [35].

A similar inductive construction has been introduced in [29]. Given a probability vector $w=(w_i)\in {\mathbb {R}}^p$, i.e., such that $w_i>0$ and $\sum _{i=1}^p w_i=1$, define the following sequence

$$\begin{aligned} A_i^{(k+1)}=A_i^{(k)}\#_{1-w_i} G_{\widehat{w}^{(i)}}(A_1^{(k)},\ldots ,A_{i-1}^{(k)},A_{i+1}^{(k)},\ldots ,A_p^{(k)}),\quad i=1,\ldots ,p,\nonumber \\ \end{aligned}$$

(3.2)

where $\widehat{w}^{(i)}=\frac{1}{1-w_i}(w_1,\ldots ,w_{i-1},w_{i+1},\ldots ,w_p)$ is again a probability vector, and $ G_{w_1,w_2}(A_1,A_2)=A_1\#_{w_2} A_2$. If $\lim _k A_i^{(k)}$ exists and has the same value for every i, then we denote the common limit by $ G_{w}(A_1,\ldots ,A_p)$ and refer to it as the weighted mean. It is proved in [29] that this limit exists in the Thompson metric. Observe that by choosing $w=\frac{1}{p}(1,1,\ldots ,1)$, the weighted mean coincides with the NBMP mean.

In this section, we show that the ALM mean, the NBMP mean and the weighted mean of positive definite operators $A_1,\ldots , A_p$ such that $A_i=T(a_i)+E_{i}\in \mathcal{Q}\mathcal{T}$ are also ${\mathcal {Q}}{\mathcal {T}}$ matrices. In fact, we prove that the sequences $\{A_i^{(k)}\}_k$ generated by (3.1) converge to a common limit $G\in \mathcal{Q}\mathcal{T}$ with symbol $g=(a_1\ldots a_p)^{\frac{1}{p}}$ and we have $\lim _k\Vert a_i^{(k)}-g\Vert _{\infty }=0$, where $a_i^{(k)}$ is the symbol of $A_i^{(k)}$. In the case where the symbols $a_i\in {{\mathcal {W}}}$, $i=1,\ldots ,p$, we provide sufficient conditions under which $a_i^{(k)}$ converges to g in the Wiener norm.

3.1 ALM mean

The ALM sequences $\{A_i^{(k)}\}_{k=0}^{\infty }$ generated by (3.1) with $(s_1,\ldots ,s_{p-2},s_{p-1})=(1,\ldots ,1,\frac{1}{2})$, converge to a common limit G in the Thompson metric (see [31, Remarks 4.2 and 6.5 and Theorem 4.3]). This implies that $\lim _k\Vert A_i^{(k)}-G\Vert _2=0$ since the topology of the Thompson metric agrees with the relative operator norm topology [41].

We will prove that if the positive definite matrices $A_1,\ldots , A_p$, $p\ge 3$, belong to $\mathcal{Q}\mathcal{T}$ then also the matrices $A_i^{(k)}$ of the ALM sequence generated by (3.1) as well as their limit G belong to ${\mathcal {Q}}{\mathcal {T}}$. Moreover, the symbol associated with the Toeplitz part of G is the uniform limit of the symbols $a_i^{(k)}$ associated with the Toeplitz parts of $A_i^{(k)}$, which in turn are the functions obtained by applying the ALM construction to the symbols $a_i$ associated with the Toeplitz parts of the matrices $A_i$.

Theorem 3.1

Let $A_i=T(a_i)+E_i\in \mathcal{Q}\mathcal{T}$ be positive definite, for $i=1,\ldots , p$, with $p\ge 3$. The matrices $A_i^{(k)}$ generated by (3.1) for the ALM iteration, and the ALM mean $G=G(A_1,\ldots ,A_p)$ of $A_1,\ldots , A_p$, satisfy the following properties:

1.
for any $k\ge 0$ and for $i=1,\ldots , p$, there exist $a_i^{(k)}\in C_{2\pi }$ and $K_i^{(k)}\in {\mathcal {K}}(\ell ^2)$ such that $A_i^{(k)}=T(a_i^{(k)})+K_i^{(k)}$, that is $A_i^{(k)}\in {\mathcal {Q}}{\mathcal {T}}$;
2.
there exist $g\in C_{2\pi }$ and $K_G\in {\mathcal {K}}(\ell ^2)$ such that $G=T(g)+K_G$, that is, $G\in \mathcal{Q}\mathcal{T}$;
3.
$\lim _k\Vert a_i^{(k)}-g\Vert _\infty =0$, $\lim _k\Vert K_i^{(k)}-K_G\Vert _2=0$, for $i=1,\ldots ,p$;
4.
the equation $a_i^{(k+1)} = a_i^{(k)}\#_{s_1}G_{s_2,\ldots ,s_{p-1}}(a_1^{(k)},\ldots ,a_{i-1}^{(k)},a_{i+1}^{(k)},\ldots ,a_p^{(k)})$, is satisfied for $i=1,\ldots ,p$, and for any k.^{Footnote 1}

Proof

It is known from [31] that the sequences $\{A_i^{(k)}\}_{k=0}^{\infty }$ converge to G in the Thompson metric and that $\lim _k\Vert A_i^{(k)}-G\Vert _2=0$. We prove parts 1–4 by induction on p. If $p=3$, then

$$\begin{aligned} \begin{aligned}&A_1^{(k+1)}=A_1^{(k)}\#_{s_1}G(A_2^{(k)},A_3^{(k)}),\\&A_2^{(k+1)}=A_2^{(k)}\#_{s_1}G(A_3^{(k)},A_1^{(k)}),\\&A_3^{(k+1)}=A_3^{(k)}\#_{s_1}G(A_1^{(k)},A_2^{(k)}),\\ \end{aligned} \end{aligned}$$

(3.3)

Recall that if $A,B\in {\mathcal {Q}}{\mathcal {T}}$ then the geometric mean $G(A,B)\in {\mathcal {Q}}{\mathcal {T}}$, and the symbols a, b, g associated with A, B and G(A, B), respectively, are such that $g=G(a,b)$ in view of Corollary 2.8.

Using an induction argument on k, we show part 1 of the theorem, i.e., $A_i^{(k)}\in {\mathcal {Q}}{\mathcal {T}}$ for $i\in \{1,2,3\}$. We have $A_i^{(0)}=A_i\in {\mathcal {Q}}{\mathcal {T}}$, and assuming $A_i^{(k)}\in {\mathcal {Q}}{\mathcal {T}}$, for $i\in \{1,2,3\}$, from (3.3) and Corollary 2.8, we deduce that $A_i^{(k+1)}\in {\mathcal {Q}}{\mathcal {T}}$, for $i\in \{1,2,3\}$. Consequently, since $\lim _k\Vert A_i^{(k)}-G\Vert _2=0$, for $G=G(A_1,A_2,A_3)$, from Lemma 2.9 we deduce part 2, i.e., $G\in {\mathcal {Q}}{\mathcal {T}}$ and that the symbol associated with the Toeplitz part of $A_i^{(k)}$ converges to g uniformly and that the compact part of $A_i^{(k)}$ converges to the compact part of G in norm, i.e., part 3. Finally, since the symbol associated with $A_i^{(0)}$ is $a_i(z)$, we find that the symbol associated with $G(A_i^{(k)},A_j^{(k)})$ is $G(a_i^{(k)},a_j^{(k)})$ by Corollary 2.8. Therefore, from (3.3) and Corollary 2.8 we find that $a_i^{(k+1)} = a_i^{(k)}\#_{s_1}G(a_{i-1}^{(k)},a_{i+1}^{(k)})$ with $a_{0}^{(k)}:=a_3^{(k)}$ and $a_{4}^{(k)}:=a_{1}^{(k)}$. That is, part 4.

For the inductive step on p, assume $p>3$ and follow the same argument to prove that if parts 1–4 hold for the sequence generated by (3.1) starting from $p-1$ matrices $A_i\in {\mathcal {Q}}{\mathcal {T}}$, $i=1,\ldots ,p-1$, then they also hold for the sequence generated by (3.1) starting from p matrices $A_i\in {\mathcal {Q}}{\mathcal {T}}$, $i=1,\ldots ,p$.

To this end, consider equation (3.1) and use induction on k to prove that $A_i^{(k)}\in {\mathcal {Q}}{\mathcal {T}}$ for $i=1,\ldots ,p$. For $k=0$, clearly $A_i^{(0)}=A_i\in {\mathcal {Q}}{\mathcal {T}}$ by assumption. Concerning the inductive step on k, assume that $A_i^{(k)}\in {\mathcal {Q}}{\mathcal {T}}$ for $i=1,\ldots ,p$, and deduce that $A_i^{(k+1)}\in {\mathcal {Q}}{\mathcal {T}}$ for $i=1,\ldots ,p$. By the inductive hypothesis on k we have $A_i^{(k)}\in {\mathcal {Q}}{\mathcal {T}}$ so that by the inductive hypothesis on p, the matrix $G_{s_2,\ldots ,s_{p-1}}(A_1^{(k)},\ldots ,A_{i-1}^{(k)},A_{i+1}^{(k)},\ldots ,A_p^{(k)})=G(A_1^{(k)}\ldots ,A_{i-1}^{(k)},A_{i+1}^{(k)},\ldots ,A_p^{(k)})$ belongs to ${\mathcal {Q}}{\mathcal {T}}$. Therefore, by Corollary 2.8 and from (3.1), the matrix $A_i^{(k+1)}$ is in ${\mathcal {Q}}{\mathcal {T}}$ as well. That is part 1 of the theorem. Moreover, since $\lim _k\Vert A_i^{(k)}-G\Vert _2=0$ for $G=G(A_1,\ldots ,A_p)$, then from Lemma 2.9 we deduce that $G\in {\mathcal {Q}}{\mathcal {T}}$ and that the symbol associated with the Toeplitz part of $A_i^{(k)}$ uniformly converges to the symbol g associated with the Toeplitz part of G, and that the compact part of $A_i^{(k)}$ converges to the compact part of G in norm, i.e., parts 2 and 3 of the theorem.

Concerning part 4, we proceed by induction on p. We have already proved that for $p=3$ the property is satisfied. In order to prove the inductive step on p we proceed by induction on k. For the initial step, i.e., for $k=0$, by Corollary 2.8 and by the inductive assumption on p, from (3.3) we find that, $a_i^{(1)} = a_i^{(0)}\#_{s_1}G_{s_2,\ldots ,s_{p-1}}(a_1^{(0)},\ldots ,a_{i-1}^{(0)},a_{i+1}^{(0)},\ldots ,a_p^{(0)})$. For the induction step on k, assume that part 4 is satisfied for k and prove it for $k+1$. By the inductive hypotheses valid for $p-1$ matrices, we know that

$$\begin{aligned} G_{s_2,\ldots ,s_{p-1}}(A_1^{(k)},\ldots ,A_{i-1}^{(k)},A_{i+1}^{(k)},\ldots ,A_p^{(k)})\in {\mathcal {Q}}{\mathcal {T}}\end{aligned}$$

and that its symbol is $G_{s_2,\ldots ,s_{p-1}}(a_1^{(k)},\ldots ,a_{i-1}^{(k)},a_{i+1}^{(k)},\ldots ,a_p^{(k)})$. Therefore, by Corollary 2.8 and from (3.1), the symbol associated with the Toeplitz part of $A_i^{(k+1)}$ is $a_i^{(k+1)}= a_i^{(k)}\#_{s_1}G_{s_2,\ldots ,s_{p-1}}(a_1^{(k)},\ldots ,a_{i-1}^{(k)},a_{i+1}^{(k)},\ldots ,a_p^{(k)})$. $\square $

As a consequence of the above theorem we will show that the symbol g associated with the Toeplitz part of G is such that $g(z)=(a_1(z)\ldots a_p(z))^{\frac{1}{p}}$, where the symbols $a_i(z)$ take positive values in view of Lemma 2.5 since $A_i$ are positive definite. In order to prove this representation of g(z), consider the sequences $a_i^{(k)}(z)$ defined by the ALM iteration, that is

$$\begin{aligned} \begin{aligned}&a_i^{(k+1)} = a_i^{(k)}\#_{s_1}G_{s_2,\ldots ,s_{p-1}}(a_1^{(k)},\ldots ,a_{i-1}^{(k)},a_{i+1}^{(k)},\ldots ,a_p^{(k)}),\\&a_i^{(0)}(z)=a_i(z), \end{aligned} \end{aligned}$$

(3.4)

for $i=1,\ldots ,p$, $k\ge 0$. It can be easily verified that

$$\begin{aligned} a_i^{(k+1)}(z)=\Big (\prod _{j=1,\, j\ne i}^p a_j^{(k)}(z)\Big )^\frac{1}{p-1},\quad i=1,\ldots ,p,\quad k=0,1,\ldots \end{aligned}$$

(3.5)

We have the following.

Lemma 3.2

Let $a_1(z),\ldots ,a_p(z)$ be continuous nonnegative functions and let $a_i^{(k)}$, for $i=1,\ldots ,p$ and $k=0,1,\ldots $, be the sequences defined by the ALM iteration (3.4). For $k=0,1,\ldots ,$ we have

$$\begin{aligned} a_i^{(k)}=a_i^{n_{k-1}}\prod _{j=1,\, j\ne i}^p a_j^{n_k},\qquad i=1,\ldots ,p, \end{aligned}$$

where $n_k=\frac{1}{p}\bigl (1+\frac{(-1)^{k+1}}{( p-1)^k}\bigr )$.

Proof

We proceed by induction on k. The case $k=0$, follows from $n_0=0$ and $n_{-1}=1$. For $k>0$, from (3.5), we obtain for the sequences $\{n_k\}_k$ the difference equation $(p-1)n_{k+1}=(p-2)n_k+n_{k-1}$, with $n_0=0$ and $n_{-1}=1$, whose solution is $n_k=\frac{1}{p}(1-(-(1/(p-1))^k))$. $\square $

Observe that $\lim _k n_k =\frac{1}{p}$ so that $\lim _k a_i^{(k)}(z)=\left( \prod _{i=1}^p a_i(z)\right) ^\frac{1}{p}$ pointwise as expected. On the other hand, from Theorem 3.1 it follows that convergence is uniform. We may conclude with the following.

Proposition 3.3

If $A_i=T(a_i)+K_i\in {\mathcal {Q}}{\mathcal {T}}$, for $i=1,\ldots ,p,$ are positive definite operators, then the symbol g(t) associated with $G=G(A_1,\ldots , A_p)$ is such that $g(t)=(a_1(t)\cdots a_p(t))^{\frac{1}{p}}$.

Now, consider the case where $A_1, A_2, \ldots , A_p\in {\mathcal {Q}}{\mathcal {T}}$ are such that their associated symbols $a_i\in {{\mathcal {W}}}$ and $E_i\in \mathcal {K}(\ell ^2)$ for $i=1,\ldots ,p$. It is clear that the sequences $\{A_i^{(k)}\}$ generated by the ALM iteration converge to a matrix $G\in \mathcal{Q}\mathcal{T}$ with symbol $g=(a_1\cdots a_p)^{\frac{1}{p}}$. Since ${{\mathcal {W}}}$ is a Banach algebra, we have $g=(a_1\cdots a_p)^{\frac{1}{p}}\in {{\mathcal {W}}}$. Concerning the convergence of the symbols $\{a_i^{(k)}\}$ of the sequences $\{A_i^{(k)}\}$, we know that $\lim _k\Vert a_i^{(k)}-g\Vert _{\infty }=0$, but uniform convergence does not imply that $\lim _k\Vert a_i^{(k)}-g\Vert _{_{_{{\mathcal {W}}}}}=0$ (see [25, p. 34]).

Now we show that under some regularity conditions, the sequences $\{a_i^{(k)}\}_k$ converge to g in Wiener norm, i.e., $\lim _k\Vert a_i^{(k)}-g\Vert _{_{_{{\mathcal {W}}}}}=0$.

Proposition 3.4

Let $a_1,\ldots ,a_p\in C_{2\pi }$ be the symbols of the positive definite matrices $A_1,\ldots , A_p\in {\mathcal {Q}}{\mathcal {T}}$, and let $a_i^{(k)}$, for $i=1,\ldots ,p$ and $k=0,1,2,\ldots ,$ be the sequence obtained by the ALM iteration (3.4). If one of the conditions

(a)
$a_i\in {C_{2\pi }^{0,\alpha }}$, with $\alpha \in (\frac{1}{2},1]$;
(b)
$a_i\in {C_{2\pi }^{0,\alpha }}$, with $\alpha \in (0,1]$ and of bounded variation;
(c)
$a_i$ absolutely continuous and with derivative in $L_{2\pi }^2$;

for $i=1,\ldots ,p$, is fulfilled, then $a_i^{(k)}\in {\mathcal {W}}$, $(a_1\cdots a_p)^{\frac{1}{p}}\in {\mathcal {W}}$ and $a_i^{(k)}\rightarrow _{_{_{{\mathcal {W}}}}}(a_1\cdots a_p)^{\frac{1}{p}}$.

Proof

Observe that, in all three cases, Proposition 2.10 implies that $a_1,\ldots ,a_p\in {\mathcal {W}}$, so that $a_i^{(k)}, a_i^{\frac{1}{p}}$ and $(a_1\cdots a_p)^{\frac{1}{p}}$ belong to $\mathcal W$ by Proposition 2.11. To show $a_i^{(k)}\rightarrow _{_{_{{\mathcal {W}}}}}(a_1\cdots a_p)^{\frac{1}{p}}$, we can see from Lemmas 2.1 and 3.2 that it suffices to show $a_i^{n_k}\rightarrow _{_{_{{\mathcal {W}}}}}a_i^{\frac{1}{p}}$, where $n_k$ is defined in Lemma 3.2 and $\lim _k n_k=\frac{1}{p}$.

With the notation of Lemma 3.2, we have $a_i^{n_k}-a_i^{\frac{1}{p}}=f_k\circ a_i$, where

$$\begin{aligned} f_k(x):=x^{n_k}-x^{\frac{1}{p}}=x^{n_k}(1-x^{(-1/(p-1))^k}) \end{aligned}$$

is a sequence of analytic functions on $(0,\infty )$. Observe that $a_i$, for $i=1,\ldots ,p$, is a strictly positive function by Lemma 2.5, let ${\mathcal {I}}$ be the set of the range of $a_i$, then ${\mathcal {I}}\subset (0,\infty )$ is a closed interval and the sequences $\{f_k\}_k$ and $\{f'_k\}_k$ converge to 0 uniformly on ${\mathcal {I}}$ by [26, Theorem 1.2]. We show that $\Vert a_i^{n_k}-a_i^{\frac{1}{p}}\Vert _{_{_{{\mathcal {W}}}}}=\Vert f_k\circ a_i\Vert _{_{_{{\mathcal {W}}}}}\rightarrow 0$ under the three cases.

In case (a), we can use Lemma 2.12 and Proposition 2.10 again to get

$$\begin{aligned} \Vert f_k\circ a_i\Vert _{_{_{{\mathcal {W}}}}}\le \gamma _a\Vert f_k\circ a_i\Vert _{C^{0,\alpha }}\le \gamma _a\gamma (\Vert f_k\Vert _{\infty ,{\mathcal {I}}}+\Vert f'_k\Vert _{\infty , {\mathcal {I}}})\rightarrow 0. \end{aligned}$$

In case (b), by Lemmas 12 and 14, $f_k\circ a_i\in C_{2\pi }^{0,\alpha }$ and is of bounded variation with $V(f_k\circ a_i)\le \Vert f'_k\Vert _{\infty ,{\mathcal {I}}}V(a_i)\rightarrow 0$.

Set $(f_k\circ a_i)(t):=h_k(t)=\sum _{j\in \mathbb {Z}}h^{(k)}_je^{{{\mathfrak {i}}}jt}$, then $\Vert h_k\Vert _{\infty }=\Vert f_k\Vert _{\infty ,{\mathcal {I}}}\rightarrow 0$. Observe that $|h_0^{(k)}|\le \frac{1}{2\pi }\int _{0}^{2\pi }|h_k(t)|dt\le \Vert h_k\Vert _{\infty }$ so that $h_0^{(k)}\rightarrow 0$ as $k\rightarrow \infty $.

Together with Proposition 2.10 and (2.5), it yields

$$\begin{aligned} \Vert f_k\circ a_i\Vert _{_{_{{\mathcal {W}}}}}\le \gamma _b\Big (|h_0^{(k)}|+V(f_k\circ a_i)^{\frac{1}{2}}\sum _{v=1}^{\infty }(\pi 2^{-v})^{\frac{\alpha }{2}}\Big )\rightarrow 0. \end{aligned}$$

In case (c), we use part (i) of Lemma 2.14 and (2.6) to get

$$\begin{aligned} \begin{aligned} \Vert f_k\circ a_i\Vert _{_{_{{\mathcal {W}}}}}&\le { \gamma _c(\Vert f_k\circ a_i\Vert _{L^1}+\Vert (f_k\circ a_i)'\Vert _{L^2})}\\&\le \gamma _c({\Vert f_k\Vert _{\infty , {\mathcal {I}}}+\Vert f'_k\Vert _{\infty , {\mathcal {I}}}\Vert a_i'\Vert _{L^2})\rightarrow 0.} \end{aligned} \end{aligned}$$

$\square $

3.2 NBMP mean and weighted mean

The analysis performed in the previous section can be repeated here for the NBMP mean. In fact, since the p sequences $\{A_i^{(k)}\}_{k=1,2,\ldots }$, $i=1,\ldots ,p$, generated by (3.1) for the NBMP iteration converge to a common limit G in the Thompson metric [31], they converge in the operator norm. Following an analysis similar to the one of Sect. 3.1, one can see that the NBMP mean of positive definite ${\mathcal {Q}}{\mathcal {T}}$ matrices is a ${\mathcal {Q}}{\mathcal {T}}$ matrix. Moreover, since the NBMP construction applied to scalars converges in just one step, we have that for the symbols $a_i^{(k)}$ obtained this way it holds that $a_i^{(k)}(z)=g(z)$ for $k\ge 1$, $i=1,\ldots ,p$. We may conclude with the following results which can be proved by adapting the proof of Theorem 3.1.

Theorem 3.5

Let $A_i=T(a_i)+E_i\in \mathcal{Q}\mathcal{T}$, for $i=1,\ldots , p$, $p\ge 3$, be positive definite. Then the matrices $A_i^{(k)}$ generated by (3.1) for the NBMP iteration, and the NBMP mean $G=G(A_1,\ldots ,A_p)$ of $A_1,\ldots , A_p$, satisfy the following properties:

1.
$A_i^{(k)}=T(g)+K_{i}^{(k)}\in {\mathcal {Q}}{\mathcal {T}}$, $i=1,\ldots ,p$, for any $k\ge 1$, where $g=(a_1\cdots a_p)^{\frac{1}{p}}$;
2.
$G=T(g)+K_G\in {\mathcal {Q}}{\mathcal {T}}$;
3.
$\lim _k\Vert K_{i}^{(k)}-{K_G}\Vert _2=0$, for $i=1,\ldots ,p$.

A similar argument can be used for the weighted mean: the p sequences $\{A_i^{(k)}\}_{k=1,2,\ldots }$, $i=1,\ldots ,p$, generated by (3.2) converge to their limit G in the Thompson metric [29], and thus then they converge in the operator norm, and as before, the weighted mean of positive definite ${\mathcal {Q}}{\mathcal {T}}$ matrices is a ${\mathcal {Q}}{\mathcal {T}}$ matrix. The symbols $a_i^{(k)}$ obtained with this procedure are such that, for $k>1$, we have $a_i^{(k+1)}(z)=a_i^{(k)}\#_{1-w_i}G_{\widehat{w}^{(i)}}(a_1^{(k)},\ldots ,a_{i-1}^{(k)},a_{i+1}^{(k)},\ldots , a_p^{(k)})$ for $i=1,\ldots ,p$. We can show that $a_i^{(k)}=a_1^{w_1}a_2^{w_2}\cdots a_p^{w_p}=G_w(a_1,\ldots ,a_p)$ for $k\ge 1$ and get the following results which can be proved again by adjusting the proof of Theorem 3.1.

Theorem 3.6

Let $A_i=T(a_i)+E_i\in \mathcal{Q}\mathcal{T}$, for $i=1,\ldots , p$, $p\ge 3$, be positive definite. Let $w=(w_i)\in {\mathbb {R}}^p$ be a probability vector, and $A_i^{(k)}$ be the matrix sequences generated by (3.2) for the weighted iteration. Finally, let $G=G_w(A_1,\ldots ,A_p)$ be the weighted mean of $A_1,\ldots , A_p$. Then we have

1.
$A_i^{(k)}=T(g)+K_{i}^{(k)}\in {\mathcal {Q}}{\mathcal {T}}$, $i=1,\ldots ,p$, for any $k\ge 1$, where $g=a_1^{w_1}a_2^{w_2}\cdots a_p^{w_p}$;
2.
$G=T(g)+K_G\in \mathcal{Q}\mathcal{T}$;
3.
$\lim _k\Vert K_{i}^{(k)}-K_G\Vert _2=0$, for $i=1,\ldots ,p$.

4 Numerical and computational issues

In this section we discuss some issues concerning the effective numerical computation of the geometric mean of $A_1,\ldots , A_p\in \mathcal{Q}\mathcal{T}$. We rely on the CQT-Toolbox [10] for computations in the ${\mathcal {Q}}{\mathcal {T}}$ algebra but in order to compute geometric means, we need to compute some fundamental functions of ${\mathcal {Q}}{\mathcal {T}}$ matrices, namely, the pth root and in particular the square root, that we will discuss in the following. In this section, without loss of generality, we assume that the matrix $A=T(a)+E_A \in {\mathcal {Q}}{\mathcal {T}}$ is such that $0 \le a(z) \le 1$. This condition is satisfied in particular if A is positive and $\Vert A\Vert _2\le 1$ since $\Vert a(z)\Vert _\infty =\Vert T(a)\Vert _2\le \Vert A\Vert _2$.

4.1 Square root

The square root has been implemented in [10] in two different ways relying on the Denman and Beavers algorithm, and on the Cyclic Reduction (CR) algorithm, respectively. While the two algorithms are equivalent to Newton’s method for a matrix A when the initial value commutes with A, in finite arithmetic their behavior differs (see [16, Section 6.3]). In our numerical tests, the CR algorithm provided better numerical results and thus it appears to be better suited for ${\mathcal {Q}}{\mathcal {T}}$ matrices.

The CR algorithm for the square root is defined as follows

$$\begin{aligned} \begin{array}{ll} Y_{k+1}=-Y_k S_k^{-1} Y_k, &{} Y_0= I-A, \\ S_{k+1}=S_k+2Y_{k+1}, &{} S_0=2(I+A), \end{array} \end{aligned}$$

(4.1)

where we assume that all the matrices $S_k$ are invertible. In the finite dimensional case it follows that $\lim _k\frac{1}{4} S_k=A^\frac{1}{2}$, $\lim _kY_k=0$, where convergence holds in any operator norm.

The sequences obtained by the CR algorithm are related to the sequences obtained by the simplified Newton method

$$\begin{aligned} X_{k+1}=\frac{1}{2}(X_k+AX_k^{-1}),\qquad X_0=\frac{1}{2}(I+A). \end{aligned}$$

(4.2)

Indeed, $S_k=4X_k$ and $Y_k=2(X_k-X_{k-1})$, with $X_{-1}=A$ (see [18]).

We define the sequence $\{x_k(x)\}_k$ of rational functions of the variable x, by means of $x_{k+1}(x)=\frac{1}{2}(x_k(x)+xx_k^{-1}(x))$, for $k=0,1,\ldots $, with $x_0(x)=\frac{1}{2}(1+x)$. Since the sequence $x_k(x)$ is obtained by applying the Newton method to the equation $z^2=x$, customary arguments show that for $x\in [0,1]$ the sequence $\{x_k(x)\}_{k=0,1,\ldots }$ is well-defined and monotonically decreasing to $\sqrt{x}$. In view of Dini’s theorem we may conclude that convergence is uniform on [0, 1].

From the identity $X_k=x_k(A)$ we deduce the following.

Corollary 4.1

Let $A=T(a)+K\in {\mathcal {Q}}{\mathcal {T}}$ be self-adjoint and such that a is real valued and $v^TAv\ge 0$ for any $v\in \ell ^2$, $\Vert v\Vert =1$. Then the sequences $S_k$ and $Y_k$ generated by (4.1) are such that $\lim _k\Vert Y_k\Vert _2=0$, $\lim _k\Vert S_k-4A^\frac{1}{2}\Vert _2=0$.

Proof

By scaling A, we can assume that $v^TAv\le 1$. The spectrum $\textrm{sp}(A)$ of A is a compact set contained in [0, 1], in fact it is contained in the closure of the numerical range $\{v^TAv:\,v^Tv=1\}$ and this closure is contained in [0, 1]. Let $\{X_k\}_k$ be the sequence obtained by (4.2) and $x_k(x)$ the corresponding scalar sequence. The function $\varphi _k(x):=x_k(x)-\sqrt{x}$ is continuous in [0, 1] and by Lemma 2.2 we have

$$\begin{aligned} \Vert X_k-A^{\frac{1}{2}}\Vert _2=\Vert \varphi _k(X_k)\Vert _2=\Vert \varphi _k(x)\Vert _{\infty ,\textrm{sp}(A)}. \end{aligned}$$

Since $\textrm{sp}(A)\subset [0,1]$, the latter tends to zero and we have that $\lim _k\Vert X_k-A^{\frac{1}{2}}\Vert _2=0$ and using $S_k=4X_k$ and $Y_k=2(X_k-X_{k-1})$ we complete the proof. $\square $

From the above results it follows that the symbols associated with the matrices in (4.1) uniformly converge to the symbols of their limit. Moreover the compact corrections associated with these matrix sequences converge in the ${\mathcal {B}}(\ell ^2)$ norm to the compact corrections of their limit.

A better convergence of the symbols can be obtained under the assumption of more regularity of a(z) in view of Propositions 2.10 and 2.11.

4.2 pth root

It is well known that if A is an $n\times n$ Hermitian matrix, where n is finite, with eigenvalues in [0, 1] then the sequence generated by Newton’s iteration $ Y_{k+1}=\frac{1}{p} Y_k((p-1)I+Y_k^{-p}A)$, $Y_0=I$, is such that $\lim _k Y_k=A^{\frac{1}{p}}$ [16]. However, this iteration may encounter stability problems when implemented in floating point arithmetic. A stable version is based on the following iteration

$$\begin{aligned} \begin{array}{ll} Y_{k+1}=Y_k(\frac{(p-1)I+M_k}{p}),&{}Y_0=I,\\ M_{k+1}=(\frac{(p-1)I+M_k}{p})^{-p}M_k,&{}M_0=A. \end{array} \end{aligned}$$

(4.3)

We prove that the iteration (4.3) applied to a ${\mathcal {Q}}{\mathcal {T}}$ matrix converges in norm. To this end we follow the same approach used for the square root. More precisely, we introduce the functional sequences

$$\begin{aligned} \begin{array}{ll} y_{k+1}(x)=y_k(x)(\frac{p-1+m_k(x)}{p}),&{}y_0(x)=1,\\ m_{k+1}(x)=(\frac{p-1+m_k(x)}{p})^{-p}m_k(x),&{}m_0(x)=x. \end{array} \end{aligned}$$

(4.4)

so that we have $Y_k=y_k(A)$, $M_k=m_k(A)$. Now we prove the following result.

Proposition 4.2

For any $x\in [0,1]$, for the sequences (4.4) we have

1.
$0\le m_k(x)\le m_{k+1}(x)\le 1$,
2.
$x^\frac{1}{p}\le y_{k+1}(x)\le y_k(x)$,
3.
$\lim _{k}\Vert m_k-1\Vert _{\infty }=0, \lim _k\Vert y_k-x^{\frac{1}{p}}\Vert _{\infty }=0$,

where inequalities are strict if $x\ne 0,1$.

Proof

Concerning Part 1, we prove by induction that $0\le m_k(x)\le 1$. Clearly, since $m_0(x)=x$ we have $0\le m_0(x)\le 1$. For the inductive step, we observe that the function $f(t)=t(\frac{p-1+t}{p})^{-p}$ such that $m_{k+1}(x) = f(m_k(x))$, maps monotonically the interval [0, 1] into itself, so that $0\le m_k(x)\le 1$ implies $0\le m_{k+1}(x)\le 1$. Similarly, for the inequality $m_k(x)\le m_{k+1}(x)$ we have $m_0(x)\le m_1(x)$ and the monotonicity of f(t) implies by induction that $m_k(x)\le m_{k+1}(x)$. Part 2 follows since $(\frac{p-1+m_k(x)}{p})<1$ under the assumption $0\le m_k(x)\le 1$, and we know that the limit of the sequence is $x^\frac{1}{p}$. Finally, Part 3 is proved by applying Dini’s theorem. $\square $

From the identities $Y_k=y_k(A)$ and $M_k=m_k(A)$ we deduce the following.

Corollary 4.3

Let $A=T(a)+E_A\in {\mathcal {Q}}{\mathcal {T}}$ be self-adjoint and such that a is real valued and $v^TAv\ge 0$ for any $v\in \ell ^2$, $v\ne 0$. Then the sequences $Y_k$ and $M_k$ generated by (4.3) are such that $\lim _k\Vert M_k-I\Vert _2=0$, $\lim _k\Vert Y_k-A^\frac{1}{p}\Vert _2=0$.

As in the square root case, the symbols associated with the matrices in (4.3) uniformly converge to the symbols of their limit and the compact corrections associated with these matrix sequences converge in the ${\mathcal {B}}(\ell ^2)$ norm to the compact corrections of their limit.

Remark 4.4

Note that the proofs of Corollaries 4.1 and 4.3 rely uniquely on the fact that the spectrum of a self-adjoint operator is contained in the closure of the numerical range and thus they can be stated on the milder hypothesis that A is a self-adjoint operator in ${\mathcal {B}}(\ell ^2)$.

4.3 Computing the symbol g(z)

Observe that both the symbol of the ALM mean and the NBMP mean is $g(z)=(a_1(z)\ldots a_p(z))^{\frac{1}{p}}$. The application of the iterations of Sect. 3 in the arithmetic of quasi-Toeplitz matrices, provides as a result both the values of g(z) and of the correction K such that $G=T(g)+K$ is the sought geometric mean. However, if only the symbol part of G is needed, then it might be more convenient to compute the coefficients of g(z) by means of the evaluation/interpolation technique.

In this section, when the symbols are such that $a_{i}\in \mathcal W$, $i=1,\ldots ,p$, based on the evaluation/interpolation at the roots of unity, we provide an algorithm for computing the approximation $\widetilde{g}_{j}$, $j=-n+1,\ldots ,n$, to the Fourier coefficients $g_j$ of the symbol $g(z)=(a_1(z)\cdots a_p(z))^{\frac{1}{p}}$.

Let $n>0$ be an integer, set $m=2n$ and let $w_m=\cos \frac{2\pi }{m}+{{\mathfrak {i}}}\sin \frac{2\pi }{m}$ be the principal mth root of 1. There is always a unique Laurent polynomial $\widetilde{g}(z)=\sum _{-n+1}^n\widetilde{g}_jz^j$ such that $g(w_m^{\ell })=\widetilde{g}(w_m^{\ell })$, that is, $\widetilde{g}(z)$ interpolates g(z) at $w_m^{\ell }$, $\ell =-n+1,\ldots ,n$.

The computation of the approximation $\widetilde{g}_j$, $j=-n+1,\ldots ,n$, to the coefficients $g_j$ of g(z) can proceed by first selecting a positive integer n and evaluating $g(w_m^{\ell })$, $\ell =-n+1,\ldots ,n$; and then interpolating g(z) at $w_m^{\ell }$, $\ell =-n+1,\ldots ,n$, by means of the FFT, obtaining the coefficients $\widetilde{g}_j$ of $\widetilde{g}(z)=\sum _{j=-n+1}^n\widetilde{g}_jz^j$. We stop the process if $\widetilde{g}(z)$ is close enough to g(z), otherwise we continue this process by doubling the value of n.

Concerning the accuracy of the approximation, we recall the following result from [9, Theorem 3.8]

$$\begin{aligned} \widetilde{g}_i = g_i +\sum _{k=1}^\infty (g_{i+kn}+g_{i-kn}). \end{aligned}$$

(4.5)

If $g\in {\mathcal {W}}$ then $\lim _n\sum _{|j|\ge n} |g_j|=0$ so that, for a given $\varepsilon >0$ there exists a sufficiently large n such that $\sum _{|j|\ge n} |g_j|\le \varepsilon $. Thus, from (4.5) we deduce that $|\widetilde{g}_i - g_i|\le \varepsilon $. Under additional assumptions on g(t), a guaranteed way for determining n such that the above bound is satisfied, can be easily determined (see [9, p. 57] for further details). In general, we may adopt the following heuristic criterion, to halt the evaluation/interpolation procedure (see also [39]) where the iteration is terminated if

$$\begin{aligned} \sum _{|j|>\lceil \frac{n}{2}\rceil }|\widetilde{g}_{j}|< \sum _{j=-n+1}^n|\widetilde{g}_{j}|\cdot \varepsilon . \end{aligned}$$

(4.6)

We summarize the procedure for computing the coefficients $\widetilde{g}_j$, $j=-n+1,\ldots ,n$, as Algorithm 1. The overall computational cost of this algorithm is $O(n \log n)$ arithmetic operations.

Observe that if $a_i>0$ for $i=1,\ldots ,p$, has only real coefficients, then $a_i(w_m^\ell )=a_i(w_m^{-\ell })$ for $\ell =-n+1, \ldots , n$ and $i=1,\ldots ,p$, then Step 2 of Algorithm 1 can proceed by computing $a_i(w_m^{\ell })$ for $\ell =0,\ldots ,n$ and setting $r_{\ell }=r_{-\ell }$ for $\ell =-1,\ldots , -n+1$.

5 Geometric means of finite ${\mathcal {Q}}{\mathcal {T}}$ matrices

The representation and the arithmetic for ${\mathcal {Q}}{\mathcal {T}}$ matrices, up to a certain extent, can be adapted for handling finite dimensional matrices. Accordingly, we show that iteration (3.1) for computing the mean G can be applied to finite dimensional matrices that can be written as a sum of a Toeplitz matrix and a low-rank matrix correction.

Given a symbol a(z) and $m\in \mathbb {Z}^+$, we denote by $T_m(a)$, $H_m(a^{-})$ and $H_m(a^+)$ the $m\times m$ leading principal submatrices of T(a), $H(a^-)$ and $H(a^+)$, respectively. The following theorem can be seen a finite dimensional version of Proposition 2.3, which implies that our algorithms for geometric means of ${\mathcal {Q}}{\mathcal {T}}$ matrices can be applied also for finite size matrices.

Proposition 5.1

[43] If $a,b\in L^{\infty }$, then

$$\begin{aligned} T_m(a)T_m(b)=T_m(ab)-H_m(a^-)H_m(b^+)-JH_m(a^+)H_m(b^-)J, \end{aligned}$$

where J is the $m\times m$ flip matrix having ones on the anti-diagonal and zeros elsewhere.

If we focus on finite size Toeplitz matrices, whose symbols are Laurent polynomials of the kind $\sum _{j=-r}^r a_jz^j$, where the degree r is small compared to the matrix size m, say, $r\le m/4$, then Proposition 5.1 shows that the product of matrices of this kind can be represented as the sum of a finite Toeplitz matrix and two low-rank matrix corrections with nonzero entries (the support of the correction) located in the upper leftmost and in the lower rightmost corners, respectively. Observe also that if the matrices are Hermitian, then $a^-=a^+$ and $b^-=b^+$ so that the compact correction obtained in the infinite case provides both the corrections in the upper leftmost and the lower rightmost corner.

A similar property holds if the matrices can be written as the sum of a band Toeplitz matrix associated with a Laurent polynomial of degree at most r, and a correction with support located in the two opposite corners, provided that the values of r and of the size s of the support are suitably smaller than m, say, $r,s\le m/4$. Moreover, comparing Proposition 5.1 with Proposition 2.3 one can see that the Toeplitz part $T_m(ab)$ and the correction $H_m(a^-)H_m(b^+)$ coincide with the $m\times m$ leading principal submatrices of the infinite matrices T(ab) and $H(a^-)H(b^+)$, respectively, so that the infinite ${\mathcal {Q}}{\mathcal {T}}$ arithmetic provides the corresponding result of the finite ${\mathcal {Q}}{\mathcal {T}}$ arithmetic. This property holds even in the case where there is an overlap of the supports in the two corners of the corrections, or if the bandwidth r takes large values. In this situation, the implementation of the arithmetic is still possible [10], but it is not efficient from the computational point of view.

This allows us to implement the computation of the sequences $A_i^{(k)}$ of $m\times m$ matrices generated by (3.1) relying on the computation of their infinite counterparts. This implementation leads to an effective computation if the numerical degree of the symbols as well as the size (the maximum between rows and columns) of the supports of the corrections of the matrices $A_i^{(k)}$ remain bounded from above by a constant smaller than or equal to m/4.

6 Numerical experiments

In order to apply the theoretical results of the previous sections, we show by some tests the effectiveness of computing the ALM and the NBMP means of three matrices $A_1,A_2,A_3\in {\mathcal {Q}}{\mathcal {T}}$, and the convenience of computing the associated symbol g with the evaluation/interpolation method.

The algorithms of Sect. 4 have been implemented in MATLAB by following the lines of the corresponding implementations for finite positive definite matrices of the Matrix Means Toolbox (http://bezout.dm.unipi.it/software/mmtoolbox/). They rely on the package CQT-Toolbox of [10] for the storage and arithmetic of ${\mathcal {Q}}{\mathcal {T}}$ matrices. The software can be provided by the authors upon request.

The tests have been run on a cluster with 128GB of RAM and 24 cores. The internal precision of the CQT toolbox has been set to threshold = 1.e-15, while the value of $\varepsilon =\mathtt{10 * threshold=1.e-14}$ has been used to truncate the values of the symbol and of the correction. That is, only the values $g_i$, $i=0,\ldots ,k$ are computed, where k is such that $|g_j|\le \varepsilon \Vert g\Vert _\infty $ for $j>k$. Similarly, for the correction $E_G$, written in the form $E_G=UV^T$, where U and V are thin matrices, we computed only the values $u_{i,j}$, $i=1,\ldots ,m$ and $v_{i,j}$, $i=1,\ldots ,n$, where m, n are such that $|u_{i,j}|<\varepsilon \max _i|u_{i,j}|$ for any $i>m$ and j and $|v_{i,j}|<\varepsilon \max _i|v_{i,j}|$ for any $i>n$ and j, respectively.

The test examples have been constructed by relying on trigonometric symbols in the class , for different values of the coefficients $f_0,f_1,f_2$. A Toeplitz matrix associated with a symbol of this type turns out to be pentadiagonal.

With the choices of $(f_0,f_1,f_2)$ in the set $\{(2,1,0),~ (3,2,1),~ (9,4,4)\}$, the symbol f(t) takes values in the intervals [0, 4], [0, 9], [0, 25], respectively, and 0 belongs to the image of the symbol in all three cases. By perturbing the value of $f_0$ into $f_0+\vartheta $ where $\vartheta >0$, we are able to tune the ratio $\max _t f(t)/\min _t f(t)$ which for finite matrices is related to the condition number of the associated Toeplitz matrix. More specifically, with the choices $\vartheta =1,0.1,0.01$ we have three sets of matrices with increasing condition numbers.

In Table 1, for each value of $\vartheta $ we report the numerical length of the symbol g(z) together with the CPU time needed by the evaluation/interpolation technique to compute the coefficients of g(z). The CPU time needed for this computation is quite negligible and the numerical length of the symbol, as well as the number of interpolation points, grow as the ratio $\min _t g(t)/\max _t g(t)$ gets closer to 0, as expected.

Table 1 Length of the symbol, number of interpolation points and CPU time in seconds for computing the coefficients of g(z)

Full size table

In Table 2, for each value of $\vartheta $ we report the number of iterations and the CPU time needed to compute the ALM mean $G_\textrm{ALM}$ and the NBMP mean $G_\textrm{NBMP}$, together with the numerical size and the numerical rank of the compact correction.

We may observe that the NBMP iteration arrives at numerical convergence in just 3 steps, while the ALM iteration requires 43 steps independently of the condition number of the matrices. This fact reflects the different aysmptotic convergence order of the two iterations for finite size matrices: while the ALM converges linearly, the NBMP converges cubically [11].

As for the symbol, the size and the rank of the correction grows when the matrices to be averaged get more ill conditioned. For this computation, the CPU time needed is much larger than the time needed to compute just the symbol g(t). A closer analysis, performed by using the MATLAB profiler, shows that the most part of time is spent to perform compression operations in the CQT-Toolbox. By compressing a matrix E with respect to a threshold value $\varepsilon $, we mean to find a matrix ${\widetilde{E}}$ of the lowest rank such that $\Vert E-{\widetilde{E}}\Vert \le \varepsilon $.

Table 2 Number of iterations, CPU time in seconds, size and rank of the correction in the computation of the ALM and the NBMP mean

Full size table

In order to give an idea of the structure of the geometric mean, for $\vartheta =1$, in Fig. 1 we show in logarithmic scale the graph of the symbol g(t), i.e., the plot of the pairs $(j,\log _{10}|g_j|)$ for $j\ge 0$, and of the compact corrections $E=(e_{i,j})$, i.e., the plot of the triples $(i,j,\log _{10}|e_{i,j}|)$ for $G_\textrm{ALM}$. While in Fig. 2 we show the graph of the correction of $G_\textrm{NBMP}$ together with the modulus of componentwise difference $G_\textrm{ALM}-G_\textrm{NBMP}$.

We may see that the geometric mean, even though is an infinite dimensional matrix, can be effectively approximated by a finite number of parameters once it is decomposed into its Toeplitz part and its compact correction. From the graphs in Figs. 1 and 2 we may appreciate some rounding errors in the compact corrections. For the ALM mean, errors are located in the components of modulus less than $10^{-20}$, while for the NBMP mean the affected components have modulus less than $10^{-18}$ along an edge of the domain.

Since the ALM and the NBMP means differ only in the compact correction, an interesting remark is that this difference has values of modulus less than $10^{-5}$, see Fig. 2. This reflects also in the infinite dimensional case, the relatively small difference that has been usually observed between the two mean in the finite dimensional case [5].

Table 3 CPU time in seconds, needed to compute the ALM and the NBMP means in the case of finite matrices whose size is 3 times the size of the corresponding compact correction. For comparison, the CPU time needed in the infinite case is written between bracket

Full size table

Finally, we have compared the CPU time needed by the ALM and NBMP iterations applied to ${\mathcal {Q}}{\mathcal {T}}$ matrices and to their finite truncation to size n. The value of n is chosen equal to 3 times the size of the compact correction. This choice is motivated by the goal to keep well separated the compact correction in the top leftmost corner from that in the bottom rightmost corner of the finite size matrix. The CPU time is reported in Table 3. Observe that, while in the case $\vartheta =1$ of well conditioned matrices, the application of the ALM and NBMP iterations is faster for finite size matrices than for ${\mathcal {Q}}{\mathcal {T}}$ matrices, in the slightly more ill conditioned cases the time needed for finite size matrices gets larger than the time taken for ${\mathcal {Q}}{\mathcal {T}}$ matrices. In particular, for $\vartheta =0.01$ the speed-up reached by the computation based on the quasi-Toeplitz technology is roughly 3 for the ALM mean and 9.7 for the NBMP mean.

7 Conclusions and open problems

Some common definitions of geometric means of positive definite matrices, namely the ALM, NBMP and weighted mean, have been extended to the case of infinite matrices which are the sum of a Toeplitz matrix, associated with a continuous symbol, and a compact correction. We have shown that these means still keep the same structure of the input matrices, i.e., they can be written as the sum of a Toeplitz matrix and a compact correction. Moreover, the symbol associated with the Toeplitz part of the mean is the mean of the symbols associated with the averaged matrices. We have given, moreover, conditions under which the geometric mean belongs to ${\mathcal {B}}(\ell ^\infty )$ and under which the ALM sequence of the symbols converges in Wiener norm.

Numerical computations show that the ALM and the NBMP means can be computed also for infinite matrices in the ${\mathcal {Q}}{\mathcal {T}}$ class and that these means can be represented with good precision in terms of a finite number of parameters. Perhaps surprising, these ideas can be useful also when computing means of finite matrices.

An open issue, which we would like to investigate further, is exploiting the availability of the symbol g(t) of the geometric mean G in order to accelerate the computation of the compact correction. In fact, even though g(t) can be computed separately at a much lower cost, our current implementations of the ALM and NBMP iterations do not take advantage of the availability of g(t) and compute simultaneously approximation to the symbol and to the compact correction of the mean. Moreover, the large CPU time needed for computing the correction is mainly spent for low-rank approximations of large matrices. We believe that this part can be much improved by means of randomized techniques both for the task of computing general matrix functions and for computing the geometric mean.

Notes

Here and in the proof of the theorem, we have $s_1=1$ and thus the notation could be simplified, but we prefer to keep $s_1$ because this makes the proof valid for the NBMP and weighted means.

References

Ando, T., Li, C.-K., Mathias, R.: Geometric means. Linear Algebra Appl. 385, 305–334 (2004)
Article MathSciNet MATH Google Scholar
Appell, J., Bana$\acute{s}$, J., Merentes, N.: Bounded Variation and Around. De Gruyter, Berlin, (2013)
Bernstein, M.S.: Sur la convergence absolue des séries trigonométriques. Comptes rendu, t. 158, 1161–1163 (1914)
MATH Google Scholar
Bhatia, R.: Positive definite matrices. Princeton Series in Applied Mathematics. Princeton University Press, Princeton, NJ (2007)
MATH Google Scholar
Bini, D.A., Iannazzo, B.: A note on computing matrix geometric means. Adv. Comput. Math. 35(2–4), 175–192 (2011)
Article MathSciNet MATH Google Scholar
Bini, D.A., Iannazzo, B.: Computing the Karcher mean of symmetric positive definite matrices. Linear Algebra Appl. 438(4), 1700–1710 (2013)
Article MathSciNet MATH Google Scholar
Bini, D.A., Iannazzo, B., Jeuris, B., Vandebril, R.: Geometric means of structured matrices. BIT 54(1), 55–83 (2014)
Article MathSciNet MATH Google Scholar
Bini, D. A., Iannazzo, B., Meng, J.: Algorithms for approximating means of semi-infinite quasi-Toeplitz matrices. International Conference on Geometric Science of Information. GSI 2021:Geometric Science of Information, pages 405–414 (2021)
Bini, D. A., Latouche, G., Meini, B.: Numerical Methods for Structured Markov Chains. Numerical Mathematics and Scientific Computation. Oxford University Press, New York, Oxford Science Publications (2005)
Bini, D.A., Massei, S., Robol, L.: Quasi-Toeplitz matrix arithmetic: a MATLAB toolbox. Numer. Algorithms 81, 741–769 (2019)
Article MathSciNet MATH Google Scholar
Bini, D.A., Meini, B., Poloni, F.: An effective matrix geometric mean satisfying the Ando-Li-Mathias properties. Math. Comp. 79(269), 437–452 (2010)
Article MathSciNet MATH Google Scholar
Böttcher, A., Grudsky, S.M.: Toeplitz matrices, asymptotic linear algebra, and functional analysis. Birkhäuser Verlag, Basel (2000)
Book MATH Google Scholar
Böttcher, A., Grudsky, S.M.: Spectral properties of banded Toeplitz matrices. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (2005)
Book MATH Google Scholar
Böttcher, A., Silbermann, B.: Introduction to large truncated Toeplitz matrices. Universitext. Springer-Verlag, New York (1999)
Book MATH Google Scholar
Fasi, M., Iannazzo, B.: Computing the weighted geometric mean of two large-scale matrices and its inverse times a vector. SIAM J. Matrix Anal. Appl. 39(1), 178–203 (2018)
Article MathSciNet MATH Google Scholar
Higham, N.J.: Functions of matrices: theory and computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA (2008)
Book MATH Google Scholar
Hildebrandt, S.: The closure of the numerical range of an operator as spectral set. Comm. Pure Appl. Math. 17, 415–421 (1964)
Article MathSciNet MATH Google Scholar
Iannazzo, B.: A note on computing the matrix square root. Calcolo 40(4), 273–283 (2003)
Article MathSciNet MATH Google Scholar
Iannazzo, B.: The geometric mean of two matrices from a computational viewpoint. Numer. Linear Algebra Appl. 23(2), 208–229 (2016)
Article MathSciNet MATH Google Scholar
Iannazzo, B., Jeuris, B., Pompili, F.: The Derivative of the Matrix Geometric Mean with an Application to the Nonnegative Decomposition of Tensor Grids. In Structured Matrices in Numerical Linear Algebra, pages 107–128. Springer (2019)
Iannazzo, B., Porcelli, M.: The Riemannian Barzilai-Borwein method with nonmonotone line search and the matrix geometric mean computation. IMA J. Numer. Anal., 38(1):495–517, 04 (2017)
Jeuris, B., Vandebril, R.: The Kähler mean of block-Toeplitz matrices with Toeplitz structured blocks. SIAM J. Matrix Anal. Appl. 37(3), 1151–1175 (2016)
Article MathSciNet MATH Google Scholar
Jeuris, B., Vandebril, R., Vandereycken, B.: A survey and comparison of contemporary algorithms for computing the matrix geometric mean. Electron. Trans. Numer. Anal. 39, 379–402 (2012)
MathSciNet MATH Google Scholar
Kadison, R. V., Ringrose, J. R.: Fundamentals of the Theory of Operator Algebras. Vol. I, volume 100 of Pure and Applied Mathematics. Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers], New York. Elementary theory (1983)
Katznelson, Y.: An introduction to harmonic analysis, 3rd edn. Cambridge University Press, New York (2004)
Book MATH Google Scholar
Lang, S.: Complex analysis, 4th edn. Springer-Verlag, New York (1999)
Book MATH Google Scholar
Lapuyade-Lahorgue, J., Barbaresco, F.: Radar detection using Siegel distance between autoregressive processes, application to HF and X-band radar. In 2008 IEEE Radar Conference, pages 1–6 (2008)
Lawson, J.: Existence and uniqueness of the Karcher mean on unital $C^\ast $-algebras. J. Math. Anal. Appl., 483(2):123625, 16 (2020)
Lawson, J., Lee, H., Lim, Y.: Weighted geometric means. Forum Math. 24(5), 1067–1090 (2012)
Article MathSciNet MATH Google Scholar
Lawson, J., Lim, Y.: Karcher means and Karcher equations of positive definite operators. Trans. Amer. Math. Soc. Ser. B 1, 1–22 (2014)
Article MathSciNet MATH Google Scholar
Lee, H., Lim, Y., Yamazaki, T.: Multi-variable weighted geometric means of positive definite matrices. Linear Algebra Appl. 435(2), 307–322 (2011)
Article MathSciNet MATH Google Scholar
Lévy, P.: Sur la convergence absolue des séries de Fourier. Compositio Math. 1, 1–14 (1935)
MathSciNet MATH Google Scholar
Moakher, M.: A differential geometric approach to the geometric mean of symmetric positive-definite matrices. SIAM J. Matrix Anal. Appl. 26(3), 735–747 (2005)
Article MathSciNet MATH Google Scholar
Moakher, M.: On the averaging of symmetric positive-definite tensors. J. Elasticity 82(3), 273–296 (2006)
Article MathSciNet MATH Google Scholar
Nakamura, N.: Geometric means of positive operators. Kyungpook Math. J. 49(1), 167–181 (2009)
Article MathSciNet MATH Google Scholar
Nobari, E.: A monotone geometric mean for a class of Toeplitz matrices. Linear Algebra Appl. 511, 1–18 (2016)
Article MathSciNet MATH Google Scholar
Nobari, E., Ahmadi Kakavandi, B.: A geometric mean for Toeplitz and Toeplitz-block block-Toeplitz matrices. Linear Algebra Appl., 548:189–202 (2018)
Rathi, Y., Tannenbaum, A., Michailovich, O.: Segmenting images on the tensor manifold. In 2007 IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8 (2007)
Robol, L.: Rational Krylov and ADI iteration for infinite size quasi-Toeplitz matrix equations. Linear Algebra Appl. 604, 210–235 (2020)
Article MathSciNet MATH Google Scholar
Siddiqi, A. H.: Functional Analysis and Applications. Springer (2018)
Thompson, A.C.: On certain contraction mappings in a partially ordered vector space. Proc. Amer. Math. soc. 14, 438–443 (1963)
MathSciNet MATH Google Scholar
Wang, Y., Qiu, S., Ma, X., He, H.: A prototype-based SPD matrix network for domain adaptation EEG emotion recognition. Pattern Recognit. 110, 107626 (2021)
Widom, H.: Asymptotic behavior of block Toeplitz matrices and determinants. II. Advances in Math. 21, 1–29 (1976)
Article MathSciNet MATH Google Scholar
Yang, L., Arnaudon, M., Barbaresco, F.: Geometry of covariance matrices and computation of median. In Bayesian inference and maximum entropy methods in science and engineering, volume 1305 of AIP Conf. Proc., pages 479–486. Amer. Inst. Phys., Melville, NY (2010)
Yger, F., Berar, M., Lotte, F.: Riemannian Approaches in Brain-Computer Interfaces: A Review. IEEE Trans. Neural Syst. Rehabilitation Eng. 25(10), 1753–1762 (2017)
Article Google Scholar
Yuan, X., Huang, W., Absil, P.-A., Gallivan, K.A.: Computing the matrix geometric mean: Riemannian versus Euclidean conditioning, implementation techniques, and a Riemannian BFGS method. Numer. Linear Algebra Appl. 27(5), e2321 (2020)
Zanini, P., Congedo, M., Jutten, C., Said, S., Berthoumieu, Y.: Transfer Learning: A Riemannian Geometry Framework With Applications to Brain-Computer Interfaces. IEEE Trans. Biomed. Eng. 65(5), 1107–1116 (2018)
Article Google Scholar
Zygmund, A.: Trigonometric series, 2nd edn. Cambridge University Press, Cambridge (1959)
MATH Google Scholar

Download references

Acknowledgements

The authors wish to thank the anonymous referee and the editor for providing useful comments that helped to improve this presentation.

Author information

Authors and Affiliations

Dipartimento di Matematica, Università di Pisa, Pisa, Italy
Dario A. Bini
Dipartimento di Matematica e Informatica, Università degli Studi di Perugia, Perugia, Italy
Bruno Iannazzo
School of Mathematical Sciences, Ocean University of China, Qingdao, China
Jie Meng

Authors

Dario A. Bini
View author publications
You can also search for this author in PubMed Google Scholar
Bruno Iannazzo
View author publications
You can also search for this author in PubMed Google Scholar
Jie Meng
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jie Meng.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Communicated by Michiel E. Hochstenbach.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The work of the first two authors was partly supported by INdAM (Istituto Nazionale di Alta Matematica) through a GNCS Project. A part of this research has been done during a visit of the third author to the University of Perugia. The work of the second author was partly supported by the project “Non-Euclidean geometries with applications” funded by the Università degli Studi di Perugia, Fondo Ricerca di Base 2019. The work of the third author was partly supported by the National Natural Science Foundation of China under grant Nos.12201591,12001262.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Bini, D.A., Iannazzo, B. & Meng, J. Geometric means of quasi-Toeplitz matrices. Bit Numer Math 63, 20 (2023). https://doi.org/10.1007/s10543-023-00962-2

Download citation

Received: 08 September 2022
Accepted: 17 February 2023
Published: 08 March 2023
DOI: https://doi.org/10.1007/s10543-023-00962-2

Keywords

Mathematics Subject Classification

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Geometric means of quasi-Toeplitz matrices

Abstract

Similar content being viewed by others

Algorithms for Approximating Means of Semi-infinite Quasi-Toeplitz Matrices

Matrix-Valued Truncated Toeplitz Operators: Unbounded Symbols, Kernels and Equivalence After Extension

On the exponential of semi-infinite quasi-Toeplitz matrices

1 Introduction

1.1 Matrix geometric means

1.2 Theoretical contributions

1.3 Algorithmic contributions

1.4 Paper organization

2 Preliminaries and notations

2.1 Banach algebras and \(C^*\)-algebras

Lemma 2.1

Lemma 2.2

2.2 Means of operators

2.3 Quasi-Toeplitz matrices with continuous symbols

Proposition 2.3

Lemma 2.4

Lemma 2.5

Proof

Theorem 2.6

Proof

Theorem 2.7

Corollary 2.8

Proof

Lemma 2.9

Proof

2.4 Quasi-Toeplitz matrices with symbols in the Wiener algebra

Proposition 2.10

Proof

Proposition 2.11

Lemma 2.12

Proof

Corollary 2.13

Proof

Lemma 2.14

Proof

3 Geometric means of \({\mathcal {Q}}{\mathcal {T}}\) matrices

3.1 ALM mean

Theorem 3.1

Proof

Lemma 3.2

Proof

Proposition 3.3

Proposition 3.4

Proof

3.2 NBMP mean and weighted mean

Theorem 3.5

Theorem 3.6

4 Numerical and computational issues

4.1 Square root

Corollary 4.1

Proof

4.2 pth root

Proposition 4.2

Proof

Corollary 4.3

Remark 4.4

4.3 Computing the symbol g(z)

5 Geometric means of finite \({\mathcal {Q}}{\mathcal {T}}\) matrices

Proposition 5.1

6 Numerical experiments

7 Conclusions and open problems

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification