1 Introduction

We denote by ∥⋅∥ a norm on \(\mathbb {C}^d\) and its associated operator norm on the ring of d × d matrices \(M_d(\mathbb {C})\). For a bounded subset \(S \subset M_d(\mathbb {C})\), we let \(\|S\| := \sup_{s \in S} \|s\|\). The joint spectral radius [3, 10, 15, 27, 29] is defined by:

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} \rho(S):=\lim_{n \to +\infty} \|S^n\|{}^{\frac{1}{n}}\end{array} \end{aligned} $$
(1)

where \(S^n := \{s_1 \cdots s_n : s_i \in S\}\) is the n-fold product set. From the submultiplicativity of the operator norm, it is clear that the limit exists, is independent of the choice of norm, and coincides with the infimum of all \(\|S^n\|{ }^{\frac {1}{n}}\), n ≥ 1. A straightforward consequence is that \(S \mapsto \rho(S)\) is upper-semicontinuous for the Hausdorff topology. Moreover \(\rho(S^k) = \rho(S)^k\) for every \(k\in \mathbb {N}\). It is also clear that \(\rho(gSg^{-1}) = \rho(S)\) for every \(g \in \mathrm {GL}_d(\mathbb {C})\). Rota and Strang [27] observed that ρ(S) is equal to the infimum of ∥S∥ as the norm varies among all possible norms on \(\mathbb {C}^d\). Combined with John’s ellipsoid theorem, this easily yields:

Lemma 1

Given a norm ∥⋅∥ on \(\mathbb {C}^d\) , for any bounded subset \(S \subset M_d(\mathbb {C})\) , we have:

$$\displaystyle \begin{aligned} \begin{array}{rcl}\rho(S) \leq \inf_{g \in \mathrm{GL}_d(\mathbb{C})} \|gSg^{-1}\| \leq d \cdot \rho(S).\end{array} \end{aligned} $$

When S is irreducible (i.e., does not preserve a non-zero proper subspace of \(\mathbb {C}^d\)), it turns out that there is a norm such that ρ(S) = ∥S∥. The existence of such extremal norms will be reviewed in Sect. 2 along with related known facts. It also follows easily from this that ρ(S) = 0 if and only if the subalgebra \(\mathbb {C}[S]\) generated by S is nilpotent.

It turns out that ρ(S) can also be approximated from below by eigenvalues. Let Λ(s) be the largest modulus of an eigenvalue of \(s \in M_d(\mathbb {C})\) and

$$\displaystyle \begin{aligned} \begin{array}{rcl}\Lambda (S):=\max_{s \in S} \Lambda(s).\end{array} \end{aligned} $$

It is clear that Λ(S) ≤ ρ(S) and thus \(\Lambda (S^n)^{\frac {1}{n}} \leq \rho (S)\) for all n. When S is a singleton, the classical Gelfand formula asserts that Λ(s) = ρ({s}). For several matrices, the key fact is as follows:

Theorem 1 (Berger-Wang [3])

$$\displaystyle \begin{aligned} \begin{array}{rcl}\rho(S)=\limsup_{n \to +\infty} \Lambda(S^n)^{\frac{1}{n}}.\end{array} \end{aligned} $$

An immediate consequence is that \(S \mapsto \rho(S)\) is also lower-semicontinuous and hence continuous for the Hausdorff topology. Theorem 1 had been conjectured by Daubechies and Lagarias [10]. Elsner [11] gave a simple proof of it. In this article we will be interested in giving explicit estimates quantifying this convergence. Our first observation is that in fact the following slightly stronger result holds:

Theorem 2

Let \(S\subset M_d(\mathbb {C})\) be a bounded subset with ρ(S) > 0. Then

$$\displaystyle \begin{aligned} \begin{array}{rcl}\limsup_{n \to +\infty} \frac{\Lambda(S^n)}{\rho(S)^n}=1.\end{array} \end{aligned} $$
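For a finite set S, the two-sided bounds \(\Lambda(S^n)^{1/n} \leq \rho(S) \leq \|S^n\|^{1/n}\) behind Theorems 1 and 2 can be explored numerically. The following Python sketch is only an illustration under stated assumptions: it handles 2 × 2 matrices, uses the operator norm induced by the sup norm on \(\mathbb{C}^2\), and enumerates all products by brute force. The helper names (`jsr_bounds` and so on) are hypothetical and do not come from the references.

```python
import cmath


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def spectral_radius2(m):
    """Largest modulus of an eigenvalue of a 2x2 matrix (quadratic formula)."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))


def op_norm_sup(m):
    """Operator norm induced by the sup norm: the maximal absolute row sum."""
    return max(sum(abs(entry) for entry in row) for row in m)


def jsr_bounds(S, n_max):
    """Return (lower, upper) with lower <= rho(S) <= upper,
    using all products of length at most n_max."""
    lower, upper = 0.0, float("inf")
    level = list(S)  # all products of length k, starting with k = 1
    for k in range(1, n_max + 1):
        lower = max(lower, max(spectral_radius2(m) for m in level) ** (1.0 / k))
        upper = min(upper, max(op_norm_sup(m) for m in level) ** (1.0 / k))
        if k < n_max:
            level = [matmul2(s, m) for s in S for m in level]
    return lower, upper


# Illustrative pair: each matrix has Euclidean operator norm equal to the
# golden ratio, and the product A*B has spectral radius equal to its square.
A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))
lower, upper = jsr_bounds([A, B], 6)
```

For this pair the eigenvalue lower bound already reaches the golden ratio at n = 2, while the sup-norm upper bound only approaches it as n grows; the number of products grows exponentially in n, so the sketch is meant for very small n only.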

The question of the speed of convergence in Theorem 1 or 2 is an interesting one and goes back at least to the Lagarias-Wang finiteness conjecture [18], which posited that the limsup should be attained at a certain finite n. This has been disproved by Bousch and Mairesse [7] for 2 × 2 matrices (see also [13, 14, 21]), and Morris (see [22, Thm 2.7]) gave an example with \(S=\{a,b\}\subset \mathrm {SL}_2(\mathbb {R})\). In general, counterexamples are thought to be rare.

Elsner’s proof of Theorem 1 is based on a pigeonhole argument, which we will revisit in this note and can roughly be described as follows under the assumption that S is irreducible: if ρ(S) = 1, then given a unit vector \(x \in \mathbb {C}^d\), we may always find s ∈ S such that sx is also a unit vector (this follows from the existence of a Barabanov norm, see Sect. 2), so iterating this construction, we eventually find a short product \(w = s_n \cdots s_k\) with wy close to \(y = s_{k-1} \cdots s_1 x\), implying that w has an eigenvalue close to 1. This idea also leads to a proof of Theorem 2 and to the following quantitative and explicit version:

Theorem 3

Let \(S\subset M_d(\mathbb {C})\) be a bounded subset with ρ(S) = 1. Set \(n_0(d)=3^d4^{d^2}\) and let ε > 0. If \(n \ge \varepsilon ^{-d^2}n_0(d)\) , then

$$\displaystyle \begin{aligned} \begin{array}{rcl}\max_{k \leq n} \Lambda(S^k) \ge 1-\varepsilon. \end{array} \end{aligned} $$

This yields a polynomial decay of the form \(|\sup _{k\leq n} \Lambda (S^k)-1|=O_{S,d}(n^{-1/d^2})\) in Theorem 2 when ρ(S) = 1. In [20] Morris proved a much stronger super-polynomial upper bound on the speed of convergence: that is, \(|\sup_{k \leq n} \Lambda(S^k) - 1| = O_{A,S}(n^{-A})\) for every A ≥ 1, provided S is finite and ρ(S) = 1. However, the implied constant is not explicit. He also points out that his argument fails when S is infinite.

In this note we will be interested in the d aspect. The bound on n in Theorem 3 is super-exponential in d. If we aim to approximate the joint spectral radius no longer up to a small error, but only up to a constant multiple, we can expect polynomial bounds in d. In this vein, Bochi [5, Theorem B] established the following general inequality:

Theorem 4 (Bochi [5])

There are constants c(d) > 0, N(d) > 0 such that for every bounded set \(S \subset M_d(\mathbb {C})\) we have:

$$\displaystyle \begin{aligned} \begin{array}{rcl}{}\max_{1 \leq k \leq N(d)} \Lambda(S^k)^{\frac{1}{k}} \ge c(d) \cdot \rho(S).\end{array} \end{aligned} $$
(2)

Note that Theorem 1 (but not Theorem 2) follows immediately from Bochi’s inequality: indeed apply the inequality to \(S^n\) and let n tend to infinity. On the other hand, Theorem 3 implies Bochi’s inequality with \(N(d)=3^d8^{d^2}\) and \(c(d)=\frac {1}{2}\), say. We are interested in quantifying the constants c(d) and N(d) in terms of the dimension d. Example 1 (2) below shows that N(d) ≥ d. Bochi’s proof gave \(N(d) = 2^d - 1\), but a non-constructive c(d) obtained via a topological argument involving some geometric invariant theory.

In [8, 2.7, 2.9], another non-constructive proof was given with \(N(d) = d^2\). This proof actually allows one to take \(N(d) = \ell(d)\), the smallest integer k such that for any \(S\subset M_d(\mathbb {C})\) the powers \(S, \ldots, S^k\) linearly span the matrix algebra \(\mathbb {C}[S]\) generated by S. It is immediate that \(\ell(d) \leq d^2\), but in a recent breakthrough, Shitov [28] has proved that \(\ell(d) \leq 2d(\log_2 d + 2)\), greatly improving an earlier bound in \(O(d^{3/2})\) due to Pappacena [24].

In order to motivate our main result and since it is very short, we give now a direct proof of Theorem 4 using the following slight variant of the argument from [8]:

Claim 1

There is c′(d) > 0 such that for every bounded subset S of \(M_d(\mathbb {C})\) with ρ(S) = 1 there is a non-zero idempotent \(p \in M_d(\mathbb {C})\) (i.e., \(p^2 = p\)) such that c′(d)p belongs to the complex convex hull of \(S, \ldots, S^{\ell(d)}\).

Proof

By the complex convex hull Conv(Q) of \(Q \subset M_d(\mathbb {C})\), we mean the set of linear combinations \(\alpha_1 q_1 + \ldots + \alpha_n q_n\) with \(q_i \in Q\) and \(|\alpha_1| + \ldots + |\alpha_n| = 1\). Since the problem is invariant under conjugation, in view of Lemma 1, we may assume that S is confined to a bounded region of \(M_d(\mathbb {C})\), allowing us to pass to a Hausdorff limit of potential counterexamples to the claim. By compactness and upper semi-continuity of the joint spectral radius, we get a bounded subset S with ρ(S) ≥ 1, but such that \(\mathrm{Conv}(S \cup \ldots \cup S^{\ell(d)})\) contains no scalar multiple of an idempotent. In particular, \(\mathbb {C}[S]\) contains no idempotent. By the Artin-Wedderburn theorem, this means that \(\mathbb {C}[S]\) is a nilpotent subalgebra of \(M_d(\mathbb {C})\). In particular, \(S^d = 0\), which is in contradiction with \(\rho(S^d) = \rho(S)^d \geq 1\). □

Proof of Theorem 4

If ρ(S) = 0, there is nothing to prove. Otherwise, rescaling we may assume ρ(S) = 1. If the left-hand side in (2) is at most c (and we may assume c ≤ 1, so \(c^k \leq c\)), then every eigenvalue of an element of \(S^k\), \(k \leq \ell(d)\), has modulus at most c, so the trace of any such element is at most cd in modulus. The trace of the idempotent element p found in Claim 1 is a non-zero integer. So c′(d) ≤ c′(d)|Tr(p)| ≤ cd. So setting c(d) = c′(d)∕d and \(N(d) = \ell(d)\) yields (2) as desired. □

As with Bochi’s original argument, this one does not give any explicit estimate on the constant c(d). It is however possible to “effectivize” the argument just given: this requires effectivizing the proof of Wedderburn’s theorem and, after a fairly painstaking analysis, the details of which we will spare the reader, yields a rather poor lower bound on c(d) of doubly exponential type in d. Another route is to use an idea appearing in the work of Oregon-Reyes [23, Rk. 4.5], which consists in using the effective arithmetic Nullstellensatz by making explicit the implication \(\{\mathrm{Tr}(S^k) = 0 \text{ for all } k = 1, \ldots, \ell(d)\} \Rightarrow \{S^d = 0\}\). This also yields an effective bound on c(d), which is again unfortunately rather poor, at least doubly exponential in d.

The following result, which is the main contribution of this note, gives explicit polynomial bounds on both c(d) and N(d).

Theorem 5

For every bounded set \(S \subset M_d(\mathbb {C})\) , we have:

$$\displaystyle \begin{aligned} \begin{array}{rcl}\max_{1 \leq k \leq 2d^3} \Lambda(S^k)^{\frac{1}{k}} \ge \frac{1}{2^8d^5} \cdot \rho(S).\end{array} \end{aligned} $$

In particular, applying this to \(S^m\) for a suitable integer m, we also have:

$$\displaystyle \begin{aligned} \begin{array}{rcl}\max_{1 \leq k \leq N_2(d)} \Lambda(S^k)^{\frac{1}{k}} \ge \frac{1}{2} \cdot \rho(S),\end{array} \end{aligned} $$

where \(N_2(d) = 2d^3\lceil 8 + 5\log_2 d\rceil\).

By the same trick of replacing S by \(S^m\) for suitable m, the factor \(\frac {1}{2}\) can of course be replaced by any number κ < 1 provided \(N_2(d)\) is replaced by \(N_{\kappa ^{-1}}(d):=2d^3\lceil \log _{\kappa ^{-1}} (2^8d^5)\rceil \). The proof exploits a different kind of pigeonhole argument, where one argues, as in the classical Siegel lemma in number theory, that some non-zero linear combination with small integer coefficients of the iterates \(s_n \cdots s_1 x\) will vanish or be very small. In turn, this forces one of the products to have a spectral radius bounded away from zero.
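The arithmetic behind \(N_2(d)\) and \(N_{\kappa^{-1}}(d)\) is simple: one needs the smallest integer m with \(\kappa^{-m} \geq 2^8 d^5\), so that taking m-th roots after applying Theorem 5 to \(S^m\) leaves a factor of at least κ. The following Python sketch (with hypothetical function names, and exact rational arithmetic chosen here to avoid rounding issues in the ceiling) computes m and the resulting bound.

```python
from fractions import Fraction


def exponent_m(d, kappa=Fraction(1, 2)):
    """Smallest integer m with kappa**(-m) >= 2**8 * d**5,
    i.e. m = ceil(log_{1/kappa}(2^8 d^5))."""
    target = 2 ** 8 * d ** 5
    m, power = 0, Fraction(1)
    while power < target:
        power /= kappa  # power == kappa**(-m) after the update below
        m += 1
    return m


def n_kappa(d, kappa=Fraction(1, 2)):
    """The bound 2 d^3 * ceil(log_{1/kappa}(2^8 d^5)); kappa = 1/2 gives N_2(d)."""
    return 2 * d ** 3 * exponent_m(d, kappa)


# N_2(d) for the first few dimensions.
table = {d: n_kappa(d) for d in range(1, 6)}
```

Passing a `Fraction` for `kappa` keeps the comparison exact; a float would also run but could be off by one at the boundary.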

The following natural questions then arise:

Questions

How sharp is the bound \(d^{3+o(1)}\) on \(N_2(d)\)? We only know that \(N_2(d)\) must be at least d. Is there a polynomial bound on c′(d) in Claim 1 above?

In [5, Theorem A], Bochi proves another inequality, giving this time a lower bound on ρ(S) in terms of the norms of \(S^n\), which, when iterated, gives a speed of convergence for (1); see [16]. Given any norm ∥⋅∥ on \(\mathbb {C}^d\),

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} \|S^d\| \leq C_0(d) \rho(S) \|S\|{}^{d-1}. \end{array} \end{aligned} $$
(3)

While no explicit bound on \(C_0(d)\) was given in [5], his proof gives a super-polynomial bound in \(d^{3d/2}\) (see [16, Section 4]). It turns out that the pigeonhole argument for our Theorem 5 gives a polynomial bound for (3) at the expense of increasing the power of S:

Theorem 6

Let \(S\subset M_d(\mathbb {C})\) be a bounded subset and set \(n_1 = 2d^2\). Then

$$\displaystyle \begin{aligned} \begin{array}{rcl}{}\|S^{n_1}\| \leq 2^7 d^4 \rho(S) \|S\|{}^{n_1-1}.\end{array} \end{aligned} $$
(4)

Iterating (4) yields an explicit estimate quantifying the convergence in (1) improving the bounds obtained in [16, Theorem 1].

Finally we examine what happens when the field \(\mathbb {C}\) is replaced by an arbitrary algebraically closed complete valued field (K, |⋅|). By Ostrowski’s theorem, if K is not \(\mathbb {C}\), it must be non-Archimedean (for instance, \(\mathbb {C}_p\) the completion of the algebraic closure of the field of p-adic numbers \(\mathbb {Q}_p\), or the completion of the field of Laurent series over the algebraic closure of \(\mathbb {F}_p\)). All of the above makes sense of course, and the joint spectral radius is defined in the same way. As it turns out, the analogues of the results above are much simpler for such K, the Lagarias-Wang finiteness conjecture holds in a uniform way, and in fact:

Theorem 7

Let K be an algebraically closed non-Archimedean complete valued field. Consider an ultrametric norm \(\|\cdot\|_0\) on \(K^d\) and a bounded subset S of \(M_d(K)\). Then

$$\displaystyle \begin{aligned} \begin{array}{rcl}{}\max_{1 \leq k \leq \ell(d)} \Lambda(S^k)^{\frac{1}{k}} = \rho(S)=\inf_{g \in \mathrm{GL}_d(K)} \|gSg^{-1}\|{}_0.\end{array} \end{aligned} $$
(5)

Moreover, ρ(S) > 0 if and only if the subalgebra generated by S is not nilpotent, in which case there is an ultrametric norm ∥⋅∥ on \(K^d\) with ∥S∥ = ρ(S).

Recall that \(\ell(d)\) denotes the smallest integer k such that for any field F and any \(S \subset M_d(F)\) the power sets \(S, \ldots, S^k\) linearly span the algebra F[S]. Obviously \(\ell(d) \leq d^2\), and recall that in fact \(\ell(d) \leq 2d\log_2 d + 4d - 4\) by [28].

If K is not algebraically closed, then (5) still holds with K replaced by its algebraic closure \(\overline {K}\) (indeed, the absolute value extends uniquely to \(\overline {K}\) and the completion of \(\overline {K}\) will remain algebraically closed by Kürschák’s theorem [26, 5.J.]).

In the special case when K is a local field and S a compact subgroup of \(\mathrm{GL}_d(K)\), the last assertion of the theorem recovers the well-known Bruhat-Tits fixed point theorem: the norm ∥⋅∥ will be preserved by S and thus be a fixed point in the Bruhat-Tits building of ultrametric norms [12].

Theorem 7 was proved in [8] for \(K=\mathbb {C}_p\). We will give a slightly more direct proof of the general case.

Similarly, the analogue of Theorem 6 reads:

Theorem 8

For any ultrametric norm \(\|\cdot\|_0\) on \(K^d\) and any bounded \(S \subset M_d(K)\), we have:

$$\displaystyle \begin{aligned} \begin{array}{rcl}\|S^d\|{}_0\leq \rho(S) \|S\|{}_0^{d-1}.\end{array} \end{aligned} $$
(6)

2 Extremal Norms and Barabanov Norms

In this section we recall some well-known facts about the joint spectral radius and extremal norms providing complete and self-contained proofs. Most of the material can be found in the first chapters of the book [15]. We then prove Theorem 2.

We begin with the observation of Rota and Strang mentioned in the introduction. Recall that ∥⋅∥ denotes both a norm on \(\mathbb {C}^d\) and its associated operator norm and that for some subset Q (in either \(\mathbb {C}^d\) or \(M_d(\mathbb {C})\)), we set \(\|Q\| := \sup_{q \in Q} \|q\|\).

Lemma 2 (Rota-Strang)

Let \(S\subset M_d(\mathbb {C})\) be a bounded subset.

$$\displaystyle \begin{aligned} \begin{array}{rcl}{}\rho(S)=\inf_{\|\cdot\|} \|S\|,\end{array} \end{aligned} $$
(7)

where the infimum is over all norms on \(\mathbb {C}^d\).

Proof

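Let r > 0 with \(r\rho(S) < 1\) and consider the norm \(v_r(x) := \sum_{n \geq 0} \|S^n x\| r^n\) (the series converges because \(\|S^n\|^{1/n} \to \rho(S)\)). Clearly \(v_r(sx)\leq \frac {1}{r}v_r(x)\) for all s ∈ S. So \(v_r(S) \leq r^{-1}\). Letting \(r^{-1}\) tend to ρ(S) yields the result. □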

Lemma 1 follows immediately by combining Lemma 2 with the following well-known fact:

Lemma 3 (John’s Ellipsoid)

If v is a norm on \(\mathbb {C}^d\) and ∥⋅∥2 the standard hermitian norm, then there is \(g \in \mathrm {GL}_d(\mathbb {C})\) such that for all \(x \in \mathbb {C}^d\)

$$\displaystyle \begin{aligned} \begin{array}{rcl}\|gx\|{}_2 \leq v(x) \leq \sqrt{d} \cdot \|gx\|{}_2.\end{array} \end{aligned} $$

In particular, if w(x) is any other norm, then for some \(h \in \mathrm {GL}_d(\mathbb {C})\)

$$\displaystyle \begin{aligned} \begin{array}{rcl}w(hx) \leq v(x) \leq d \cdot w(hx).\end{array} \end{aligned} $$

Proof

According to John’s ellipsoid theorem (e.g., [1]), every symmetric convex body K in \(\mathbb {R}^k\) contains a unique ellipsoid E of maximal volume and E moreover satisfies \(K \subset \sqrt {d}E\). If K is the ball of radius 1 of the complex norm v in \(\mathbb {C}^d=\mathbb {R}^{2d}\), then the uniqueness implies that the norm associated to E is hermitian, hence of the form \(\|gx\|_2\) for some \(g \in \mathrm {GL}_d(\mathbb {C})\). □

Remark 1

This argument shows that the constant d in Lemma 1 can be replaced by \(\sqrt {d}\) if the norm is ∥⋅∥2. In fact a more subtle argument (see, e.g., [4]) shows that it can be replaced with \(\sqrt {\min \{k,d\}}\) in case S has k elements.

One says that S is irreducible if it does not preserve a non-trivial proper subspace of \(\mathbb {C}^d\). It is said to be product bounded if the semigroup it generates, \(T := \bigcup_{n \geq 1} S^n\), is bounded. The following is also classical (see [2, 3, 11, 29]):

Lemma 4 (Extremal Norms)

Suppose S is irreducible. Then ρ(S) > 0, S∕ρ(S) is product bounded and the infimum in (7) is attained.

Norms realizing the infimum in (7) are called extremal norms.

Proof

By Burnside’s theorem, the subalgebra \(\mathbb {C}[S]\) generated by S is all of \(M_d(\mathbb {C})\). Since \(\mathbb {C}[S]\) is linearly spanned by \(S \cup \ldots \cup S^{d^2}\), we may express each element of the canonical basis \(E_{ij}\) of \(M_d(\mathbb {C})\) as a linear combination of elements from T. Given that \(\mathrm{Tr}(E_{ii}) = 1\), this means that at least one element of T has non-zero trace, which clearly forces ρ(S) > 0. Rescaling, we may assume without loss of generality that ρ(S) = 1. In particular |Tr(t)| ≤ d for all t ∈ T and thus \(|\mathrm{Tr}(tE_{ij})|\) is bounded independently of t ∈ T, which means that T is bounded. Finally, given any norm ∥⋅∥ on \(\mathbb {C}^d\) and setting v(x) := ∥Tx∥, we get a well-defined norm such that v(sx) ≤ v(x) for all s ∈ S. Hence v is an extremal norm. □

The example of a single non-trivial unipotent matrix shows that the infimum in (7) is not attained in general. If S is not irreducible, it can be put in block triangular form in some basis of \(\mathbb {C}^d\). Therefore the following is an immediate consequence of the previous lemma (recall that an algebra N is nilpotent if there is an integer n such that \(N^n = 0\)).

Corollary 1

Let S be a bounded subset of \(M_d(\mathbb {C})\) . Then ρ(S) = 0 if and only if \(\mathbb {C}[S]\) is a nilpotent subalgebra of \(M_d(\mathbb {C})\).

If S is irreducible and ρ(S) = 1, T is bounded, and we may define

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} v(x):=\limsup_{n \to +\infty}\|S^nx\|.\end{array} \end{aligned} $$
(8)

Then v is a norm, because v(x) = 0 for some x ≠ 0 implies \(v(S^n x) = 0\) for all n, which implies by irreducibility that v is identically zero and hence that ρ(S) = 0. In particular:

Lemma 5 (Barabanov Norms)

Let S be an irreducible bounded subset of \(M_d(\mathbb {C})\) , then there is a complex norm v on \(\mathbb {C}^d\) such that for all \(x \in \mathbb {C}^d\),

$$\displaystyle \begin{aligned} \begin{array}{rcl}{} \max_{s \in S} v(sx)=\rho(S) \cdot v(x).\end{array} \end{aligned} $$
(9)

Proof

Indeed we may define v as in (8) for S replaced by S∕ρ(S). □

A norm satisfying (9) is a special kind of extremal norm called a Barabanov norm (see [2, 17, 25, 29]). Such norms are not unique in general (e.g., in Example 1 (4) below any norm ∥⋅∥ on \(\mathbb {C}^d\) with \(\varepsilon\|x\|_2 \leq \|x\| \leq \|x\|_2\) is a Barabanov norm for S), but they can be in some situations [19].

Another object is naturally associated to S when ρ(S) = 1; it is the attractor semigroup [2, 29]

$$\displaystyle \begin{aligned} \begin{array}{rcl}T_\infty:=\bigcap_{n\ge 1} \overline{S^nT}.\end{array} \end{aligned} $$

In other words, this is the set of limit points of finite products \(s_1 \cdots s_n\) whose length n tends to infinity. It is clearly compact, and for every operator norm, it contains an element of norm at least 1. Indeed otherwise we would have \(\|S^n\| < 1\) for some n and thus \(\rho(S^n) < 1\), which is impossible as \(\rho(S^n) = \rho(S)^n = 1\). By construction, the Barabanov norm (8) is also equal to \(v(x)=\max _{t \in T_\infty } \|tx\|\). Furthermore, it is straightforward that \(T_\infty = T_\infty S = S T_\infty\) and \(T_\infty ^2=T_\infty \) and that:

Lemma 6

Suppose S is irreducible with ρ(S) = 1. Then \(T_\infty\) is also irreducible and \(\rho(T_\infty) = 1\).

Proof

For every non-zero \(x \in \mathbb {C}^d\), the linear span \(\langle T_\infty \rangle x\) contains \(\langle T_\infty \rangle S^k x\) for each k and hence \(\langle T_\infty \rangle \mathbb {C}^d\) by irreducibility of S. So this must be 0 or \(\mathbb {C}^d\). The former is impossible, because \(T_\infty \neq \{0\}\) by the above discussion. So \(T_\infty\) is irreducible. Finally by construction \(v(T_\infty) = 1\) and \(T_\infty ^k=T_\infty \) for every k. Hence \(\rho(T_\infty) = 1\). □

We are now in a position to prove Theorem 2.

Lemma 7 (Existence of an Idempotent)

Suppose S is a bounded irreducible subset of \(M_d(\mathbb {C})\) with ρ(S) = 1. Then the attractor semigroup \(T_\infty\) contains a non-zero idempotent.

Proof

Let K be the subset of \(T_\infty\) made of elements with operator norm 1 for the Barabanov norm (8). We have already seen that K is non-empty. If ab has norm one and \(a, b \in T_\infty\), then both a and b have norm one. So \(K \subset K^2\). Starting from some \(t_0 \in K\), we may write \(t_0 = t_1 s_1\) with \(t_1, s_1 \in K\), and then similarly \(t_1 = t_2 s_2\), etc. For each n we have \(t_0 = t_n s_n \cdots s_1\). By compactness of K, there is a subsequence \(n_i\) such that \(s_{n_i}\cdot \ldots \cdot s_1\) converges, say towards k ∈ K. Passing to a further subsequence, we may assume that \(s_{n_{i+1}} \cdot \ldots \cdot s_{n_i+1}\) also converges, say towards u ∈ K. At the limit we have k = uk. But there is a unit vector x such that y := kx has norm 1. Hence y = uy and u has 1 as an eigenvalue. So \(T_\infty\) contains an element u with eigenvalue 1. Now looking at u in Jordan normal form and considering large powers of u, we see that the Jordan blocks with eigenvalue of modulus 1 must be of size 1, because powers of non-trivial unipotents are unbounded. Therefore \(\{u^n\}_{n \geq 1}\) contains an idempotent in its closure. □

Note that \(T_\infty\) may contain 0, so the lemma does not follow from a general result guaranteeing the existence of idempotents in compact semigroups such as the Ellis-Numakura lemma.

Proof of Theorem 2

We first assume that S is irreducible. Rescaling, we may assume that ρ(S) = 1. By Lemma 7, \(T_\infty\) contains a non-zero idempotent. In particular, \(\Lambda(T_\infty) = 1\), which implies what we want. The general case follows from the irreducible one. Indeed if S is not irreducible, it can be put in block triangular form, and if \(S_{ii}\) denotes the i-th diagonal block, then it is straightforward to check (either from the definition or more directly from Theorem 1) that \(\rho(S) = \max_i \rho(S_{ii})\). □

Example 1

The following are examples of irreducible subsets of \(M_d(\mathbb {C})\) with joint spectral radius equal to 1.

  1. \(S = \{E_{ij}\}_{ij}\), the elementary matrices in \(M_d(\mathbb {C})\). Note that S is made of rank 1 elements and \(T_\infty = S \cup \{0\}\).

  2. \(S = \{E_{i,i+1}\}_{1 \leq i<d} \cup \{E_{d1}\}\). Note that \(T = T_\infty = \{0\} \cup \{E_{ij}\}_{ij}\).

  3. \(S=U_d(\mathbb {C}) \cup \{t\}\), where \(U_d(\mathbb {C})\) is the group of unitary matrices and \(t= \operatorname {\mathrm {diag}}(\alpha _1,\ldots ,\alpha _d)\) with \(|\alpha_i| < 1\). Then \(T_\infty = T \cup \{0\}\).

  4. \(S=\{ \operatorname {\mathrm {id}}\} \cup \varepsilon U_d(\mathbb {C})\) for ε < 1. Then \(T_\infty = T \cup \{0\}\).
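Example 1 (2) is the one used in the introduction to see that N(d) ≥ d: a product of k generators is either zero or an elementary matrix \(E_{i,j}\) with j − i ≡ k modulo d, so it can only have a non-zero eigenvalue when d divides k. The following Python sketch checks this by brute force for a small d. It encodes \(E_{ij}\) as the index pair (i, j) and the zero matrix as `None`; this encoding and the function names are illustrative choices, not notation from the text.

```python
from itertools import product


def cyclic_generators(d):
    """The set {E_{i,i+1}} together with E_{d,1}, with 0-based index pairs."""
    return [(i, i + 1) for i in range(d - 1)] + [(d - 1, 0)]


def multiply(a, b):
    """E_{ij} E_{kl} equals E_{il} if j == k and 0 otherwise (None encodes 0)."""
    if a is None or b is None or a[1] != b[0]:
        return None
    return (a[0], b[1])


def max_spectral_radius_of_products(gens, k):
    """Lambda(S^k): equals 1 if some product of length k is a diagonal E_{ii},
    and 0 otherwise, since E_{ij} with i != j is nilpotent."""
    best = 0
    for word in product(gens, repeat=k):
        p = word[0]
        for g in word[1:]:
            p = multiply(p, g)
        if p is not None and p[0] == p[1]:
            best = 1
    return best


d = 4
profile = [max_spectral_radius_of_products(cyclic_generators(d), k)
           for k in range(1, d + 1)]
```

For d = 4 the list `profile` records Λ(S^k) for k = 1, …, 4; only the last entry is non-zero, in line with the claim that no length below d suffices.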

3 Explicit Bounds for Theorem 2

In this section we prove Theorem 3. We need a basic lemma.

Lemma 8

Let ∥⋅∥ be a norm on \(\mathbb {C}^d\). Let \(A \in M_d(\mathbb {C})\) and \(x \in \mathbb {C}^d\) with ∥A∥ ≤ 1 and ∥x∥ = 1. Let ε > 0 and \(\lambda \in \mathbb {C}\) with |λ| ≤ 2. Assume that \(\|Ax - \lambda x\| \leq (\varepsilon|\lambda|)^d\). Then the spectral radius Λ(A) of A satisfies Λ(A) ≥ |λ|(1 − 4ε).

Proof

Writing \(A^k x - \lambda^k x = A^{k-1}(Ax - \lambda x) + \ldots + \lambda^{k-1}(Ax - \lambda x)\) and using that ∥A∥ ≤ 1, we obtain for k ≤ d

$$\displaystyle \begin{aligned} \begin{array}{rcl}\|A^kx-\lambda^kx\|\leq (\varepsilon|\lambda|)^d (1+|\lambda|+\ldots+|\lambda|{}^{k-1})\leq (2\varepsilon|\lambda|)^d.\end{array} \end{aligned} $$

If \(\chi_A(t) = t^d + a_{d-1} t^{d-1} + \ldots + a_0\) is the characteristic polynomial of A, then \(\|\chi _A(A)x-\chi _A(\lambda )x\|\leq \sum |a_k|\|A^k x - \lambda ^k x\|\), and \(\chi_A(A) = 0\) by Cayley-Hamilton. But \(|a_{d-k}|\leq {d \choose k}\), since the eigenvalues of A have modulus at most ∥A∥ ≤ 1, and so \(|\chi_A(\lambda)| \leq 2^d(2\varepsilon|\lambda|)^d\). To prove the claim, we may assume that |λ| ≥ Λ(A). Writing \(\alpha_i\) for the roots of \(\chi_A\), the claim then follows from

$$\displaystyle \begin{aligned} \begin{array}{rcl}||\lambda|-\Lambda(A)|{}^d \leq \prod_1^d ||\lambda| - |\alpha_i|| \leq |\chi_A(\lambda)|.\end{array} \end{aligned} $$

Proof of Theorem 3

We may put S in block triangular form. Since \(\rho(S) = \max_i \rho(S_{ii})\), at least one of the irreducible diagonal blocks \(S_{ii}\) has \(\rho(S_{ii}) = 1\). Hence, without loss of generality, we may assume that S is irreducible. Let v be a Barabanov norm for S as in Lemma 5. Pick a unit vector \(x_0 \in \mathbb {C}^d\) and find recursively \(s_1, s_2, \ldots\) in S such that \(x_n = s_n \cdots s_1 x_0\) satisfies \(v(x_n) = 1\) for all n ≥ 0. Let \(\delta = (\varepsilon/4)^d\). Note that the cardinality of a δ-separated set lying in the unit ball for v is at most \((1+\delta /2)^d/(\delta /2)^d=(1+\frac {2}{\delta })^d \leq n_0(d)\varepsilon ^{-d^2}\), because the v-balls of radius \(\frac {\delta }{2}\) centered at these points are disjoint and contained in the v-ball of radius \(1+\frac {\delta }{2}\) around the origin. By pigeonhole, there are 0 ≤ n < n′ both smaller than \(n_0(d)\varepsilon ^{-d^2}\) such that \(v(x_n - x_{n'})<\delta \). In other words, \(v(Ax_n - x_n) < \delta\), where \(A:=s_{n'}\cdot \ldots \cdot s_{n+1}\). By Lemma 8, it follows that \(\Lambda(A) \geq 1 - 4\delta^{1/d} = 1 - \varepsilon\). □

4 Explicit Bounds for Bochi’s Inequalities

In this section we prove Theorems 5 and 6. We begin with the Siegel-type lemma already mentioned.

Lemma 9 (Siegel-Type Lemma)

Let ∥⋅∥ be a norm on \(\mathbb {C}^d\). Let ε ∈ (0, 1) and \(T,n \in \mathbb {N}\) with \((1 + T)^n > (1 + 2nT\varepsilon^{-1})^d\). Pick vectors \(x_1, \ldots, x_n\) in \(\mathbb {C}^d\) with \(\|x_i\| \leq 1\). Then there are integers \(c_1, \ldots, c_n\), not all zero, such that \(|c_i| \leq T\) for all i and

$$\displaystyle \begin{aligned} \begin{array}{rcl}\|\sum_1^n c_i x_i\| \leq \varepsilon.\end{array} \end{aligned} $$

Proof

Consider the sums \(\sum _1^n d_i x_i\) for integers \(d_i \in [0, T]\). They have norm at most Tn. If all \(\frac {\varepsilon }{2}\)-balls around them were disjoint, then the ball of radius \(Tn+\frac {\varepsilon }{2}\) around the origin would contain at least \((1 + T)^n\) disjoint balls of radius \(\frac {\varepsilon }{2}\). Comparing volumes we would have \((1 + T)^n \leq (Tn + \varepsilon/2)^d/(\varepsilon/2)^d\), contrary to our assumption. Hence, two of these balls, corresponding to \((d_i)_i\) and \((d_i^{\prime })_i\), say, must intersect. Setting \(c_i=d_i^{\prime }-d_i\) we get what we want. □
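The pigeonhole step in this proof can be mimicked by a direct search: enumerate the sums \(\sum d_i x_i\) with integer \(d_i \in [0, T]\) and look for two of them at distance at most ε. The following Python sketch is a brute-force illustration only, with assumptions of its own: it uses the sup norm on coordinate tuples, it does not verify the hypothesis \((1 + T)^n > (1 + 2nT\varepsilon^{-1})^d\) (it simply returns `None` if no pair is found), and the function names are hypothetical.

```python
from itertools import product


def sup_norm(v):
    """Sup norm of a coordinate tuple (works for real or complex entries)."""
    return max(abs(c) for c in v)


def small_integer_combination(vectors, T, eps):
    """Search for integers c_i, not all zero, with |c_i| <= T and
    ||sum c_i x_i|| <= eps, as the difference of two coefficient tuples
    in [0, T]^n whose sums are eps-close. Returns None if none is found."""
    n, dim = len(vectors), len(vectors[0])
    seen = []  # pairs (coefficient tuple, corresponding sum)
    for coeffs in product(range(T + 1), repeat=n):
        total = tuple(
            sum(c * v[j] for c, v in zip(coeffs, vectors)) for j in range(dim)
        )
        for prev_coeffs, prev_total in seen:
            diff = tuple(a - b for a, b in zip(total, prev_total))
            if sup_norm(diff) <= eps:
                # Distinct tuples, so the difference is a non-zero tuple.
                return tuple(c - p for c, p in zip(coeffs, prev_coeffs))
        seen.append((coeffs, total))
    return None


# Three vectors of norm at most 1 in dimension one: 64 sums lie in an
# interval of length 5.1, so two of them are within 0.2 of each other.
xs = [(0.5,), (0.3,), (0.9,)]
c = small_integer_combination(xs, T=3, eps=0.2)
```

The search costs on the order of \((1+T)^{2n}\) comparisons, so it is only usable for tiny parameters; the lemma itself is an existence statement and needs no search.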

Lemma 10

Let ε > 0. Let \(A \in M_d(\mathbb {C})\) be such that \(|\mathrm{Tr}(A^k)| \leq \varepsilon^k\) for k = 1, …, d. Then the spectral radius of A satisfies Λ(A) ≤ 2ε.

Proof

Let \(s_k=\lambda _1^k+\ldots +\lambda _d^k\), where \(\lambda_1, \ldots, \lambda_d\) are the eigenvalues of A. The Newton relations read \(s_k + a_{d-1} s_{k-1} + \ldots + a_{d-k+1} s_1 = -k a_{d-k}\), where \(t^d + a_{d-1} t^{d-1} + \ldots + a_0\) is the characteristic polynomial \(\chi_A\) of A. We deduce from them, by induction on k, that \(|a_{d-k}| \leq \varepsilon^k\) for each k = 1, …, d. If λ is an eigenvalue of A, then \(\chi_A(\lambda) = 0\) and thus

$$\displaystyle \begin{aligned} \begin{array}{rcl}|\lambda|{}^d \leq \varepsilon|\lambda|{}^{d-1}+\ldots+\varepsilon^k|\lambda|{}^{d-k}+\ldots+\varepsilon^d.\end{array} \end{aligned} $$

Setting x = ε∕|λ|, we obtain \(1 \leq x + \ldots + x^d\). But this implies x ≥ 1∕2. □
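The Newton relations used in this proof determine the coefficients \(a_{d-k}\) recursively from the power sums \(s_k\). The following Python sketch (an illustration with hypothetical function names, working directly from a list of eigenvalues rather than from a matrix) carries out the recursion and evaluates the smallest ε for which the hypothesis \(|s_k| \leq \varepsilon^k\) of Lemma 10 holds.

```python
def power_sums(eigenvalues):
    """Return [s_1, ..., s_d] with s_k the sum of the k-th powers."""
    d = len(eigenvalues)
    return [sum(lam ** k for lam in eigenvalues) for k in range(1, d + 1)]


def charpoly_coefficients_from_power_sums(sums):
    """Given [s_1, ..., s_d], return [a_{d-1}, ..., a_0] using
    s_k + a_{d-1} s_{k-1} + ... + a_{d-k+1} s_1 = -k a_{d-k}."""
    coeffs = []  # coeffs[j - 1] holds a_{d-j}
    for k in range(1, len(sums) + 1):
        acc = sums[k - 1]
        for j in range(1, k):
            acc += coeffs[j - 1] * sums[k - j - 1]
        coeffs.append(-acc / k)
    return coeffs


def smallest_epsilon(eigenvalues):
    """Smallest eps with |s_k| <= eps**k for k = 1, ..., d."""
    return max(abs(s) ** (1.0 / k)
               for k, s in enumerate(power_sums(eigenvalues), start=1))


# Eigenvalues 1, 2, 3: the characteristic polynomial is t^3 - 6t^2 + 11t - 6.
coeffs = charpoly_coefficients_from_power_sums(power_sums([1, 2, 3]))

# Eigenvalues 0.1 and -0.1: the trace vanishes, yet Lambda = 0.1 <= 2 * eps.
eps = smallest_epsilon([0.1, -0.1])
```

The second example shows why the lemma needs all of the first d power sums: a single vanishing trace says nothing about the spectral radius.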

Lemma 11

Let \(n \in \mathbb {N}\) and \(S\subset M_d(\mathbb {C})\) be a bounded set such that \(\varepsilon :=\max _{k \leq nd} \Lambda (S^k)^{\frac {1}{k}} \leq 1\). Let Q be the complex convex hull of \(S \cup \ldots \cup S^n\). Then Λ(Q) ≤ 2dε.

Proof

Note that Conv(A)Conv(B) ⊂ Conv(AB) for any two sets \(A,B \subset M_d(\mathbb {C})\). So if a ∈ Q, then \(a^k\) belongs to the convex hull of \(\bigcup_{k \leq i \leq nk} S^i\). In particular

$$\displaystyle \begin{aligned} \begin{array}{rcl}|\mathrm{Tr}(a^k)| \leq d\varepsilon^k \leq (d\varepsilon)^k\end{array} \end{aligned} $$

for each k = 1, …, d. The conclusion now follows from Lemma 10. □

We now prove Theorem 5. Rescaling and triangularizing S if necessary, we may assume without loss of generality that ρ(S) = 1 and that S is irreducible. As in the proof of Theorem 3, take a Barabanov norm ∥⋅∥ for S. Pick a unit vector \(x \in \mathbb {C}^d\) and find \(s_1, \ldots, s_n, \ldots\) in S such that \(\|x_n\| = 1\) for all n, where \(x_n := s_n \cdots s_1 x\). For T and ε > 0 as in Lemma 9, we obtain integers \(c_i\), not all zero, such that \(|c_i| \leq T\) and \(\|\sum _1^n c_i x_i\| \leq \varepsilon \). Let \(i_0\) be the smallest index i with \(c_i \neq 0\) and set \(y=x_{i_0}\). Hence, we may write:

$$\displaystyle \begin{aligned} \begin{array}{rcl}\|c_{i_0}y + \sum_{i>i_0} c_i s_i \cdot \ldots \cdot s_{i_0+1} y \|\leq \varepsilon.\end{array} \end{aligned} $$

In other words:

$$\displaystyle \begin{aligned} \begin{array}{rcl}{}\|\lambda y -Ay\|\leq \frac{\varepsilon}{N},\end{array} \end{aligned} $$
(10)

where \(A:=\frac {1}{N} \sum _{i>i_0} -c_i s_i \cdot \ldots \cdot s_{i_0+1}\), \(\lambda =\frac {c_{i_0}}{N}\) and \(N:=\sum _{i>i_0} |c_i|\). Note that N ≠ 0, because ε < 1 and \(\|x_i\| = 1\) for all i. Note further that ∥A∥ ≤ 1 because ∥s∥ ≤ 1 for all s ∈ S, and that \(|\lambda |\ge \frac {1}{N} \ge \frac {1}{Tn}\), while \(|\lambda |\leq \|Ay\|+\frac {\varepsilon }{N}\leq 1+\frac {\varepsilon }{N}\leq 2\).

We can then apply Lemma 8 to A and λ and get

$$\displaystyle \begin{aligned} \begin{array}{rcl}\Lambda(A) \ge |\lambda|- 4(\frac{\varepsilon}{N})^{\frac{1}{d}} \ge \frac{1}{2N} \ge \frac{1}{2nT}.\end{array} \end{aligned} $$

provided \(4(\frac {\varepsilon }{N})^{\frac {1}{d}} \leq 1/(2N)\). The conditions for Lemma 8 require that |λ| ≤ 2, while those for Lemma 9 require \((1 + T)^n > (1 + 2nT\varepsilon^{-1})^d\). These conditions will be fulfilled if we set \(T = 32d^2\), \(n = 2d^2\), and \(\varepsilon^{-1} = 8^d(nT)^{d-1}\). We conclude that

$$\displaystyle \begin{aligned} \begin{array}{rcl}\Lambda(A) \ge \frac{1}{2^7d^4}.\end{array} \end{aligned} $$

However, A belongs to the convex hull of \(S \cup \ldots \cup S^n\). Therefore Lemma 11 implies that

$$\displaystyle \begin{aligned} \begin{array}{rcl}\max_{k \leq nd} \Lambda(S^k)^{\frac{1}{k}}\ge \frac{1}{2^8d^5}.\end{array} \end{aligned} $$

This yields the first inequality in Theorem 5. The second follows by applying the first to \(S^m\) for \(m = \lceil 8 + 5\log_2 d\rceil\).
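The parameter choice in this proof can be checked with exact integer arithmetic. The following Python sketch assumes the reading \(T = 32d^2\), \(n = 2d^2\), \(\varepsilon^{-1} = 8^d(nT)^{d-1}\) of the parameters, and tests the Lemma 9 condition \((1 + T)^n > (1 + 2nT\varepsilon^{-1})^d\) together with the requirement \(4(\varepsilon/N)^{1/d} \leq 1/(2N)\), which is equivalent to \(\varepsilon^{-1} \geq 8^d N^{d-1}\) and is hardest to satisfy for the largest possible value N = nT. The function names are hypothetical.

```python
def theorem5_parameters(d):
    """Return (T, n, eps_inv) for the assumed parameter choice."""
    T = 32 * d ** 2
    n = 2 * d ** 2
    eps_inv = 8 ** d * (n * T) ** (d - 1)
    return T, n, eps_inv


def siegel_condition_holds(d):
    """Exact check of (1 + T)^n > (1 + 2 n T / eps)^d."""
    T, n, eps_inv = theorem5_parameters(d)
    return (1 + T) ** n > (1 + 2 * n * T * eps_inv) ** d


def lemma8_condition_holds(d):
    """Exact check of 1/eps >= 8^d N^(d-1) in the worst case N = n T."""
    T, n, eps_inv = theorem5_parameters(d)
    return eps_inv >= 8 ** d * (n * T) ** (d - 1)


def eigenvalue_bound_denominator(d):
    """The quantity 2 n T, so that Lambda(A) >= 1 / (2 n T)."""
    T, n, _ = theorem5_parameters(d)
    return 2 * n * T


checks = [(d, siegel_condition_holds(d), lemma8_condition_holds(d))
          for d in range(1, 7)]
```

Python integers have arbitrary precision, so the comparison is exact even though the numbers involved grow very quickly with d.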

Proof of Theorem 6

This is very similar. Suppose ∥S∥ = 1 and let \(\delta =\|S^{n_1}\|\). Pick a unit vector x and \(s_1,\ldots ,s_{n_1}\) such that \(\|s_{n_1}\cdot \ldots \cdot s_1 x\|=\delta \). Arguing as in the above proof of Theorem 5, we get a y with ∥y∥ ≥ δ such that (10) holds. Lemma 8 gives \(\Lambda (A) \ge \frac {1}{2n_1T}\) if ε is chosen so that \(4(\varepsilon\delta^{-1}n_1T)^{1/d} = 1/(2n_1T)\). Then setting \(n_1 = 2d^2\), \(n_1T = M\delta^{-1}\), we see that the condition for Lemma 9 is fulfilled if \(M \geq 2^6d^4\). But ρ(S) ≥ Λ(A) ≥ δ∕2M, proving the claim. □

5 Ultrametric Complete Valued Fields

In this section we consider the analogue of the above for an algebraically closed complete and non-Archimedean valued field K and prove Theorem 7.

Let \(\mathscr {O}:=\{x \in K, |x|\leq 1\}\) be the ring of integers, \(\mathfrak {m}:=\{x \in K, |x|<1\}\) its maximal ideal, and \(\overline {k}=\mathscr {O}/\mathfrak {m}\) the residue field. Recall that the value group of K is dense in \(\mathbb {R}_{>0}\) since K is algebraically closed. By an ultrametric norm on \(K^d\), we mean a map \(\|\cdot \|:K^d \to \mathbb {R}_{\ge 0}\) such that ∥λx∥ = |λ|∥x∥, \(\|x+y\|\leq \max \{\|x\|,\|y\|\}\), and ∥x∥ = 0 if and only if x = 0, for all \(x, y \in K^d\), λ ∈ K.
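To make the axioms concrete, here is a small Python sketch of the simplest ultrametric norm: the maximum of the p-adic absolute values of the coordinates. It works over the rational numbers, which are neither complete nor algebraically closed, so it only illustrates the definition and not the setting of Theorem 7; the function names are hypothetical.

```python
from fractions import Fraction


def padic_abs(x, p):
    """p-adic absolute value |x|_p = p^(-v) of a rational number x,
    where v is the exponent of p in x; |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v


def ultrametric_sup_norm(vec, p):
    """The norm max_i |x_i|_p, for which the standard basis is orthonormal."""
    return max(padic_abs(c, p) for c in vec)


# Strong triangle inequality on a sample pair of vectors, with p = 2.
x = (Fraction(1, 2), Fraction(3))
y = (Fraction(1, 2), Fraction(5))
x_plus_y = tuple(a + b for a, b in zip(x, y))
lhs = ultrametric_sup_norm(x_plus_y, 2)
rhs = max(ultrametric_sup_norm(x, 2), ultrametric_sup_norm(y, 2))
```

In this sample the sum has strictly smaller norm than either summand, which cannot happen for an Archimedean norm unless the summands nearly cancel.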

An orthogonal basis for an ultrametric norm is a basis \((e_i)_1^d\) of \(K^d\) such that \(\|x\|=\max \{ c_i |x_i|\}\) for some positive reals \(c_i\), if \(x = x_1 e_1 + \ldots + x_d e_d\). We say that it is orthonormal if \(c_i = 1\) for all i. If K is locally compact, or just spherically complete [6, 2.4.4], all ultrametric norms admit an orthogonal basis, but in general we only have:

Lemma 12

Let v and w be two ultrametric norms on \(K^d\) and α > 1 a real. Then there is \(g \in \mathrm{GL}_d(K)\) such that w(x) ≤ v(gx) ≤ αw(x) for all \(x \in K^d\).

Proof

This is well-known and follows from the existence [6, 2.6.2 Prop. 3] of almost orthogonal bases for ultrametric norms on \(K^d\) and the density in \(\mathbb {R}^+\) of the value group of K. □

We begin by pointing out that the Rota-Strang observation, Lemma 2, and its proof remain valid in the ultrametric setting. Combined with Lemma 12, this yields the right-hand side of (5). It turns out that the infimum in (7) is realized under some mild conditions (milder than in the complex case):

Lemma 13

Suppose that the value group of K is all of \(\mathbb {R}_{>0}\) and \(S \subset M_d(K)\) is a bounded set. If ρ(S) = 0, then \(S^d = 0\), while if ρ(S) > 0, then there is an ultrametric norm ∥⋅∥ on \(K^d\) with ∥S∥ = ρ(S).

Proof

The first assertion follows from the same argument as in Lemma 4. If ρ(S) > 0, we may rescale and assume that ρ(S) = 1, because we can pick λ ∈ K with |λ| = ρ(S). If S is irreducible, then the proof of Lemma 4 works verbatim and yields the desired norm. In general, we may choose a basis of \(K^d\) for which S is in block triangular form with irreducible blocks and define the norm \(\|x\| = \max _i \|x_i\|_i\), where \(\|\cdot \|_i\) is a norm on the i-th block with \(\rho (S_{ii}) = \|S_{ii}\|_i\), provided \(S_{ii} \neq 0\) and arbitrary otherwise. We may further conjugate S by a block diagonal matrix g, where the i-th block is the scalar matrix \(t^i\) for some t ∈ K with 0 ≠ |t| < 1∕∥S∥. Then, because of the ultrametric property, \(\|gSg^{-1}\| \leq 1\). Thus \(\|g \cdot g^{-1}\|\) is the desired norm. □
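As a hedged illustration of the conjugation step in the proof above (not part of the original argument), take d = 2 with two blocks of size one, ρ(S) = 1, and the norm \(\|x\| = \max \{|x_1|,|x_2|\}\). We assume here that the basis is ordered so that S is lower triangular; this ordering is our assumption, and with the opposite ordering one would use \(t^{-i}\) in place of \(t^{i}\). For s ∈ S and \(g = \mathrm {diag}(t, t^2)\):

$$\displaystyle \begin{aligned} g s g^{-1} = \begin{pmatrix} t & 0 \\ 0 & t^2 \end{pmatrix} \begin{pmatrix} s_{11} & 0 \\ s_{21} & s_{22} \end{pmatrix} \begin{pmatrix} t^{-1} & 0 \\ 0 & t^{-2} \end{pmatrix} = \begin{pmatrix} s_{11} & 0 \\ t\, s_{21} & s_{22} \end{pmatrix}. \end{aligned} $$

The diagonal entries satisfy \(|s_{ii}| \leq \rho (S) = 1\), and \(|t\, s_{21}| \leq |t| \cdot \|S\| < 1\) by the choice of t. Since the operator norm attached to the max-norm is the largest absolute value of an entry, every element of \(gSg^{-1}\) has norm at most 1.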

We now proceed to the proof of Theorem 7. It follows the same idea as in the proof of Claim 1 from the introduction, but we will need to compensate for the lack of compactness, and for the fact that the value group may not be all of \(\mathbb {R}_{>0}\), by using an ultrapower construction. The gist of the proof is in the following lemma:

Lemma 14

Suppose that ∥⋅∥ is an ultrametric norm admitting an orthonormal basis. If \(S \subset M_d(K)\) is such that ∥S∥ = ρ(S) = 1, then

$$\displaystyle \begin{aligned} \begin{array}{rcl}\max_{k \leq \ell(d)} \Lambda(S^k)=1.\end{array} \end{aligned} $$

Proof

Let \((e_i)_1^d\) be the orthonormal basis, i.e., \(\|x\|=\max _1^d|x_i|\) if \(x = x_1 e_1 + \ldots + x_d e_d\). In this basis, \(S \subset M_d(\mathscr {O})\). Consider the convex hull Q of \(S, \ldots , S^{\ell (d)}\), that is, the \(\mathscr {O}\)-module they span. If \(\Lambda (S^k) < 1\) for each \(k = 1, \ldots , \ell (d)\), the characteristic polynomial of a matrix in \(S^k\) will be \(t^d\) modulo \(\mathfrak {m}\). So the image of Q modulo \(\mathfrak {m}\) in \(M_d(\overline {k})\) will consist of nilpotent matrices, and it will be a subalgebra of \(M_d(\overline {k})\) by definition of \(\ell (d)\). By Wedderburn’s theorem, it will therefore be a nilpotent algebra, and we conclude that \(S^d \subset M_d(\mathfrak {m})\). In particular, \(\|S^d\| < 1\), contradicting our assumption that ρ(S) = 1. □
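The reduction step in this proof can be made concrete on small examples. The following sketch is an illustration under assumptions of our own: it replaces \(\mathscr {O}\) by the ordinary integers and \(\mathfrak {m}\) by the multiples of a prime p, and tests whether the characteristic polynomial of an integer matrix reduces to \(t^d\) modulo p. The function names are hypothetical, and the characteristic polynomial is computed with the Faddeev-LeVerrier recursion, a choice made here for self-containment.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def charpoly(A):
    """Coefficients c[0..n] with det(tI - A) = sum_i c[i] * t**i."""
    n = len(A)
    c = [Fraction(0)] * (n + 1)
    c[n] = Fraction(1)
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        # Faddeev-LeVerrier step: M <- A*M + c[n-k+1]*I
        AM = matmul(A, M)
        M = [[AM[i][j] + (c[n - k + 1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = matmul(A, M)
        c[n - k] = -sum(AMk[i][i] for i in range(n)) / k
    return c


def reduces_to_t_power(A, p):
    """True if det(tI - A) is congruent to t**d modulo p (A an integer matrix)."""
    lower = charpoly(A)[:-1]
    return all(ci.denominator == 1 and ci.numerator % p == 0 for ci in lower)


if __name__ == "__main__":
    A = [[3, 1, 0], [0, 6, 1], [9, 0, 3]]  # nilpotent modulo 3
    B = [[1, 3], [0, 3]]                   # has the eigenvalue 1, a 3-adic unit
    print(reduces_to_t_power(A, 3), reduces_to_t_power(B, 3))
```

In the example, A reduces modulo 3 to a strictly upper triangular matrix, while B has an eigenvalue of 3-adic absolute value 1, so its characteristic polynomial does not reduce to a power of t.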

Proof of Theorem 7

Suppose first that the value group of K is all of \(\mathbb {R}_{>0}\) and that all ultrametric norms on \(K^d\) admit an orthonormal basis. Then the theorem follows from the combination of the two previous lemmas by renormalizing S. So to handle the general case, it is enough to show that K can be embedded in another such field with the above properties. Any ultralimit \(\mathbf {K} = \ell ^\infty (K)/\equiv \) of K with respect to some non-principal ultrafilter \(\mathscr {U}\) on \(\mathbb {N}\) will do. Here \(\ell ^\infty (K)\) is the space of bounded sequences in K and \((x_n)_n \equiv (y_n)_n\) if and only if \(\lim _{\mathscr {U}}|x_n-y_n|=0\). Indeed, by the countable saturation property of ultraproducts (e.g., [9, 2.25]), an ultralimit \(\mathbf {K}\) will again be complete and algebraically closed, its value group will be \(\mathbb {R}_{>0}\), and, because of Lemma 12, all norms will admit an orthonormal basis. This shows Theorem 7 in full. □

Proof of Theorem 8

This follows from Bochi’s original argument [5, Theorem A] suitably adapted to the ultrametric setting. First, up to passing to a suitable field extension as in the proof of Theorem 7, we may assume that all norms admit an orthonormal basis. Pick one so that \(\|x\|_0 = \max _i |x_i|\). Then observe the following: for every invertible diagonal matrix a, we have:

$$\displaystyle \begin{aligned} \begin{array}{rcl}{}\|aS^da^{-1}\|{}_0 \leq \|S\|{}_0 \cdot \|aSa^{-1}\|{}_0^{d-1}.\end{array} \end{aligned} $$
(11)

Indeed every matrix entry of an element of \(aS^d a^{-1}\) is a sum of monomials of the form \(a_{i_1} s^{(1)}_{i_1i_2} \cdot \ldots \cdot s^{(d)}_{i_{d}i_{d+1}} a_{i_{d+1}}^{-1}\) for matrices \(s^{(i)} \in S\). We may write it as \(a_{i_1} s^{(1)}_{i_1i_2} a_{i_2}^{-1} a_{i_2} \cdot \ldots \cdot a_{i_{d}}^{-1}a_{i_{d}} s^{(d)}_{i_{d}i_{d+1}} a_{i_{d+1}}^{-1}\), a product of d factors each bounded by \(\|aSa^{-1}\|_0\). However at least one of the d factors is bounded by \(\|S\|_0\), because for at least one j ∈ [1, d], \(|a_{i_j}a_{i_{j+1}}^{-1}|\leq 1\) (two of the d + 1 indices \(i_1,\ldots ,i_{d+1}\) must coincide, so the consecutive ratios between them multiply to 1), proving (11). Now we claim that (11) holds for an arbitrary matrix \(a \in \mathrm {GL}_d(K)\), no longer assumed diagonal. Indeed this follows from the fact that \(\|\cdot \|_0\) is invariant under \(\mathrm {GL}_d(\mathscr {O})\) and that any matrix in \(\mathrm {GL}_d(K)\) can be written as a product \(k_1 a k_2\), with \(k_1, k_2\) in \(\mathrm {GL}_d(\mathscr {O})\) and a diagonal, as can be easily checked using operations on rows and columns as in Gaussian elimination. Finally, the theorem is proved by taking the infimum over all \(a \in \mathrm {GL}_d(K)\) in view of (7). □
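Inequality (11) for diagonal a lends itself to a quick numerical sanity check. The sketch below is an illustration only and makes assumptions beyond the text: it works over the rationals with the p-adic absolute value instead of K, and it uses the fact that the operator norm attached to the max-norm \(\|\cdot \|_0\) is the largest absolute value of a matrix entry. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def padic_abs(x, p):
    """p-adic absolute value of a rational x, as an exact Fraction."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return Fraction(1, p) ** v


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][l] * B[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def op_norm(A, p):
    """Operator norm for the max-norm: largest p-adic absolute value of an entry."""
    return max(padic_abs(e, p) for row in A for e in row)


def conj_diag(A, a):
    """Entries of a*A*a^{-1} for the diagonal matrix with diagonal a."""
    n = len(A)
    return [[a[i] * Fraction(A[i][j]) / a[j] for j in range(n)] for i in range(n)]


def sides_of_11(S, a, p):
    """Return (lhs, rhs) of inequality (11) for a finite set S and diagonal a."""
    a = [Fraction(x) for x in a]
    d = len(a)
    lhs = Fraction(0)
    for word in product(S, repeat=d):  # all products of d elements of S
        P = word[0]
        for s in word[1:]:
            P = matmul(P, s)
        lhs = max(lhs, op_norm(conj_diag(P, a), p))
    norm_S = max(op_norm(s, p) for s in S)
    norm_conj = max(op_norm(conj_diag(s, a), p) for s in S)
    return lhs, norm_S * norm_conj ** (d - 1)


if __name__ == "__main__":
    S = [[[1, Fraction(1, 2)], [4, 0]], [[0, 1], [2, 1]]]
    lhs, rhs = sides_of_11(S, [1, 8], 2)
    print(lhs, rhs, lhs <= rhs)
```

The enumeration of all words of length d is exponential in d, so this is only practical for very small examples.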

Finally we record one last observation.

Proposition 1

If \(S \subset M_d(K)\) is bounded and irreducible, then it admits a Barabanov norm, i.e., an ultrametric norm ∥⋅∥ such that \(\max _{s \in S}\|sx\| = \rho (S)\|x\|\) for all \(x \in K^d\).

Proof

By the proof of Theorem 7, we may embed K into a complete algebraically closed valued field \(\mathbf {K}\) whose value group is all of \(\mathbb {R}_{>0}\). Pick \(\lambda \in \mathbf {K}\) with |λ| = ρ(S). Then Lemma 13 implies that λ ≠ 0 and that the rescaled set \(\lambda ^{-1}S \subset M_d(\mathbf {K})\) is product bounded and admits an extremal norm ∥⋅∥. We may define the Barabanov norm of S by the same formula (8) applied to \(\lambda ^{-1}S\). Irreducibility forces this semi-norm to be a genuine norm. □