1 Introduction

In this paper we study the Fibonacci Hamiltonian. Along with the almost Mathieu operator, this particular operator is the most heavily studied Schrödinger operator, with dozens of mathematics papers and hundreds of physics papers devoted to it. There are several reasons for this extensive interest in the spectral properties of the Fibonacci Hamiltonian. The first and perhaps most important reason is that this operator is a central model in mathematical physics. Namely, it is relevant to the study of the electronic properties of quasicrystals. Quasicrystals are materials that were first discovered by Shechtman in 1982, and this discovery led to a paradigm shift in materials science. In diffraction experiments they produce patterns consisting of sharp bright spots, the so-called Bragg peaks, while at the same time these diffraction patterns display symmetries that conclusively prove that the arrangement of atoms in the sample cannot be periodic. This came as a surprise, as it had been believed that Bragg peaks can only be observed in the diffraction of materials for which the arrangement of atoms is periodic. It therefore took the scientific community a while to digest and accept this discovery, which was finally published in the 1984 paper [93]. Shechtman has received numerous honors and distinctions for this discovery, including the 2011 Nobel Prize in Chemistry.

Since the 1980s mathematicians have studied appropriate models of quasicrystals. Naturally, the choice of these models is guided by the distinctive property of real-life quasicrystals, namely that of having a pure point diffraction which in turn displays symmetries that rule out periodicity. The central examples of the commonly accepted mathematical quasicrystal models are the Fibonacci tilings or sequences in one dimension and the Penrose tilings in two dimensions. Indeed, these examples belong to all classes of models that are typically considered in their respective dimension. In particular, they may be generated both by inflation and by a cut-and-project scheme.

Mathematical quasicrystal models are studied from many perspectives, including dynamical systems, harmonic analysis, spectral theory, discrete geometry, combinatorics, and algebra; compare [5, 7, 82]. The study of electronic or quantum transport in quasicrystals, which is the perspective we take in this paper, naturally leads to the consideration of Schrödinger operators with potentials modeling a quasicrystalline environment. Choosing the environment given by the Fibonacci tiling or sequence, this leads to the discrete one-dimensional Schrödinger operator

$$\begin{aligned}{}[H_{\lambda ,\omega } u](n) = u(n+1) + u(n-1) + \lambda \chi _{[1-\alpha ,1)}(n \alpha + \omega \bmod 1) \, u(n), \end{aligned}$$
(1)

acting in \(\ell ^2({\mathbb {Z}})\), where \(\lambda > 0\) is the coupling constant, \(\alpha = \frac{\sqrt{5}-1}{2}\) is the frequency, and \(\omega \in {\mathbb {T}}= {\mathbb {R}}/ {\mathbb {Z}}\) is the phase. In particular, \(\alpha \) is the inverse of the golden ratio

$$\begin{aligned} \varphi = \frac{\sqrt{5} + 1}{2}. \end{aligned}$$
(2)

Alternatively, the potential can be generated by the Fibonacci substitution \(a \mapsto ab\), \(b \mapsto a\); compare [25, 27, 29]. The operator family (1) is what is usually called the Fibonacci Hamiltonian. It was proposed and initially studied by Kohmoto et al. [65] and by Ostlund et al. [84], prior to the publication of [93], as a quasi-periodic model that can be solved exactly by renormalization group techniques. The relevance to quasicrystals was only established and discussed later. The first papers on the model in the mathematics literature belong to Casdagli [21] and Sütő [100].
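The two descriptions of the potential can be compared on a finite window. The following is a minimal sketch, not taken from the text, under some illustrative assumptions: the phase is \(\omega = 0\), only sites \(n \ge 1\) are considered, and the letters are encoded as \(a = 1\) (potential value \(\lambda \)) and \(b = 0\) (potential value 0). The function names are hypothetical.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # frequency alpha, the inverse of the golden ratio


def rotation_word(n_sites, omega=0.0):
    """Pattern chi_{[1-alpha,1)}(n*alpha + omega mod 1) for n = 1..n_sites."""
    word = []
    for n in range(1, n_sites + 1):
        x = (n * ALPHA + omega) % 1.0
        word.append(1 if x >= 1.0 - ALPHA else 0)
    return word


def substitution_word(n_sites):
    """Prefix of the fixed point of a -> ab, b -> a (assumed encoding a = 1, b = 0)."""
    word = [1]
    while len(word) < n_sites:
        nxt = []
        for letter in word:
            # a -> ab becomes 1 -> 1, 0; b -> a becomes 0 -> 1
            nxt.extend([1, 0] if letter == 1 else [1])
        word = nxt
    return word[:n_sites]


if __name__ == "__main__":
    print(rotation_word(13))      # [1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0]
    print(substitution_word(13))  # [1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0]
```

Multiplying either word by \(\lambda \) gives the potential values on the chosen window. The comparison uses floating-point arithmetic for \(n\alpha \bmod 1\), which is adequate for moderate window sizes but is not a substitute for the exact combinatorial statement.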

The second reason for the extensive interest in this operator family is that it has exciting spectral properties. Independently of the relevance of the model to physics, the Fibonacci Hamiltonian also serves as a paradigm for many spectral phenomena that had been considered exotic prior to the 1980s. For example, it persistently displays Cantor spectrum, zero-measure spectrum, purely singular continuous spectral measures, and anomalous transport. Moreover, the fact that these properties can be rigorously established only adds to the importance of the model. Specifically, it is often difficult to answer questions about the spectrum and the spectral type for a given Schrödinger operator with an aperiodic and non-decaying potential (periodic and decaying potentials are well understood; compare, e.g., [97] and [48]). In many cases one rather resorts to statements about members in families of operators that hold generically or with probability one. The operator families corresponding to the Fibonacci and almost Mathieu cases are special in that quite detailed and difficult questions about these operators can be answered for all members of the respective family. Establishing this has been the objective of many publications in the past three decades focusing on these two operator families; see, for example, the surveys [25, 27–29, 61, 63].

In this paper we show that the spectrum of the Fibonacci Hamiltonian is a dynamically defined Cantor set and that the density of states measure is exact-dimensional; this implies that all standard fractal dimensions coincide in each case. We show that all the gaps of the spectrum allowed by the gap labeling theorem are open for all values of the coupling constant. Also, we consider the optimal Hölder exponent of the integrated density of states, the dimension of the density of states measure, the dimension of the spectrum, and the upper transport exponent, establish strict inequalities between them, and provide the exact large coupling asymptotics of the dimension of the density of states measure (for the other three quantities, the large coupling asymptotics were known before). We also provide the explicit relations between these spectral characteristics and the dynamical properties of the Fibonacci trace map (such as dimensional characteristics of the non-wandering hyperbolic set and its measure of maximal entropy as well as other equilibrium measures, topological entropy, multipliers of periodic orbits). We establish exact identities relating the spectral and dynamical quantities, and show the connection between the spectral quantities and the thermodynamic pressure function. Our results not only improve but complete our understanding of many spectral characteristics and properties of the Fibonacci Hamiltonian. In the rest of the introduction we provide the exact statement of the results and discuss them in the context of previously known facts.

1.1 The spectrum of the Fibonacci Hamiltonian

The spectrum of the Fibonacci Hamiltonian \(H_{\lambda ,\omega }\) is independent of \(\omega \) and may therefore be denoted by \(\Sigma _{\lambda }\). This follows from strong operator convergence and the minimality of an irrational rotation of the circle. It was shown by Sütő in [101] that \(\Sigma _{\lambda }\) is a Cantor set of zero Lebesgue measure for every \(\lambda > 0\). The zero-measure property in turn rules out any absolutely continuous spectrum for \(H_{\lambda ,\omega }\). Complementing this, Damanik and Lenz showed that \(H_{\lambda ,\omega }\) has no eigenvalues [39], and hence for all parameter values, all spectral measures are purely singular continuous. This answers the basic qualitative spectral questions about this operator family.

Our first result shows that actually \(\Sigma _{\lambda }\) is a dynamically defined\(^{1}\) Cantor set, that is, it belongs to a special and heavily studied class of Cantor sets that have strong self-similarity properties (see [85] for the formal definition and a detailed discussion of the properties of dynamically defined Cantor sets).

Theorem 1.1

For every \(\lambda > 0,\) \(\Sigma _{\lambda }\) is a dynamically defined Cantor set. In particular,  for every \(E \in \Sigma _{\lambda }\) and every \(\varepsilon > 0,\) we have

$$\begin{aligned} \dim _H \left( (E-\varepsilon , E+\varepsilon ) \cap \Sigma _{\lambda } \right)= & {} \dim _B \left( (E-\varepsilon , E+\varepsilon ) \cap \Sigma _{\lambda } \right) \\= & {} \dim _H \Sigma _{\lambda } = \dim _B \Sigma _{\lambda }. \end{aligned}$$

Here, \(\dim _H(S)\) (resp., \(\dim _B(S)\)) denotes the Hausdorff (resp., box counting) dimension of the Borel set \(S \subset {\mathbb {R}}\). Stating the identities above contains the implicit assertion that the box counting dimension of the set in question exists.

This result was previously known for \(\lambda \ge 16\) [21] and \(\lambda > 0\) sufficiently small [32]. Knowing that the spectrum is a dynamically defined Cantor set not only establishes the equality of all standard fractal dimensions of the set (and shows that this common dimension is bounded away from zero and one), it also serves as the starting point for further studies. For example, higher-dimensional separable models may be considered and their spectra turn out to be given by the sum of the one-dimensional spectra; compare, for example, [33, 37]. This leads to a study of sums of dynamically defined Cantor sets, which is an extensively investigated problem about which much is known (see, e.g., [57, 83] and references therein).

1.2 Transport exponents

Given that the operator \(H_{\lambda ,\omega }\) has purely singular continuous spectrum for all parameter values, the RAGE Theorem (see, e.g., [89, Theorem XI.115]) suggests that when studying the Schrödinger time evolution for this operator, that is, \(e^{-itH_{\lambda ,\omega }} \psi \) for some initial state \(\psi \in \ell ^2({\mathbb {Z}})\), one should consider time-averaged quantities. For simplicity, let us consider initial states of the form \(\delta _n\), \(n \in {\mathbb {Z}}\). Since a translation in space simply results in an adjustment of the phase, we may without loss of generality focus on the particular case \(\psi = \delta _0\). The time-averaged spreading of \(e^{-itH_{\lambda ,\omega }} \delta _0\) is usually captured on a power-law scale as follows; compare, for example, [47, 70]. For \(p > 0\), consider the \(p\)-th moment of the position operator,

$$\begin{aligned} \langle |X|_{\delta _0}^p \rangle (t) = \sum _{n \in {\mathbb {Z}}} |n|^p | \langle e^{-itH_{\lambda ,\omega }} \delta _0, \delta _n \rangle |^2. \end{aligned}$$

We average in time as follows. If \(f(t)\) is a function of \(t > 0\) and \(T > 0\) is given, we denote the time-averaged function at \(T\) by \(\langle f \rangle (T)\):

$$\begin{aligned} \langle f \rangle (T) = \frac{2}{T} \int _0^{\infty } e^{-2t/T} f(t) \, dt. \end{aligned}$$

Then, the corresponding upper and lower transport exponents \(\tilde{\beta }^+_{\delta _0}(p)\) and \(\tilde{\beta }^-_{\delta _0}(p)\) are given, respectively, by

$$\begin{aligned} \tilde{\beta }^+_{\delta _0}(p)= & {} \limsup _{T \rightarrow \infty } \frac{\log \langle \langle |X|_{\delta _0}^p \rangle \rangle (T) }{p \, \log T},\\ \tilde{\beta }^-_{\delta _0}(p)= & {} \liminf _{T \rightarrow \infty } \frac{\log \langle \langle |X|_{\delta _0}^p \rangle \rangle (T) }{p \, \log T}. \end{aligned}$$
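As a sanity check on these definitions, consider a hypothetical exact power law (an illustrative assumption, not a property claimed for any actual operator): suppose \(\langle |X|_{\delta _0}^p \rangle (t) = t^{p\beta }\) for some \(\beta \in [0,1]\). Substituting \(s = 2t/T\) in the time average gives

$$\begin{aligned} \langle \langle |X|_{\delta _0}^p \rangle \rangle (T)= & {} \frac{2}{T} \int _0^{\infty } e^{-2t/T} \, t^{p\beta } \, dt = \Gamma (p\beta + 1) \left( \frac{T}{2} \right) ^{p\beta }, \\ \frac{\log \langle \langle |X|_{\delta _0}^p \rangle \rangle (T)}{p \, \log T}= & {} \beta + \frac{\log \Gamma (p\beta + 1) - p\beta \log 2}{p \, \log T} \; \longrightarrow \; \beta \quad (T \rightarrow \infty ). \end{aligned}$$

In this idealized situation both exponents equal \(\beta \), so the exponential time averaging changes the prefactor but not the power-law rate.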

The transport exponents \(\tilde{\beta }^\pm _{\delta _0}(p)\) belong to [0, 1] and are non-decreasing in p (see, e.g., [47]), and hence the following limits exist:

$$\begin{aligned} \tilde{\alpha }_l^\pm= & {} \lim _{p \rightarrow 0} \tilde{\beta }^\pm _{\delta _0}(p), \\ \tilde{\alpha }_u^\pm= & {} \lim _{p \rightarrow \infty } \tilde{\beta }^\pm _{\delta _0}(p). \end{aligned}$$

Ballistic transport corresponds to transport exponents being equal to one, diffusive transport corresponds to the value \(\frac{1}{2}\), and vanishing transport exponents correspond to (some weak form of) dynamical localization. In all other cases, transport is called anomalous. The Fibonacci Hamiltonian has long been the primary candidate for a model exhibiting anomalous transport, going back at least to the work of Abe and Hiramoto [1]. Many papers have been devoted to a study of the transport properties of the Fibonacci Hamiltonian; see, for example, [16, 24, 26, 38, 42–46, 62, 68]. For example, it is known that all the transport exponents defined above are strictly positive for all \(\lambda > 0\), \(\omega \in {\mathbb {T}}\); see [38]. On the other hand, upper bounds for all the transport exponents were shown in [45] for \(\lambda > 8\) (see also [16] for a somewhat weaker result). The exact large coupling asymptotics of \(\tilde{\alpha }_u^\pm \) were identified in [46], where it was shown that

$$\begin{aligned} \lim _{\lambda \rightarrow \infty } \tilde{\alpha }_u^\pm \cdot \log \lambda = 2 \log \varphi , \end{aligned}$$
(3)

uniformly in \(\omega \in {\mathbb {T}}\). In particular, the Fibonacci Hamiltonian indeed gives rise to anomalous transport for sufficiently large coupling. The behavior in the weak coupling regime was studied in [36], where it was shown that there is a constant \(c > 0\) such that for \(\lambda > 0\) sufficiently small, we have

$$\begin{aligned} 1 - c\lambda ^2 \le \tilde{\alpha }_u^\pm \le 1, \end{aligned}$$

uniformly in \(\omega \in {\mathbb {T}}\).

While it is of clear interest to identify the asymptotic behavior of \(\tilde{\alpha }_u^\pm \) in the large and small coupling regimes, and in particular show that the asymptotic behavior of \(\tilde{\alpha }_u^+\) coincides with that of \(\tilde{\alpha }_u^-\), the following questions remain. What can we say for a given value of \(\lambda \)? Can we for example give an explicit expression for \(\tilde{\alpha }_u^+\) or \(\tilde{\alpha }_u^-\)? Can we even show that \(\tilde{\alpha }_u^+\) and \(\tilde{\alpha }_u^-\) coincide for the given value of \(\lambda \) (and \(\omega \))?

We will address these questions in this paper. An explicit description of \(\tilde{\alpha }_u^\pm \) will be given in Theorem 1.6 stated later in this introduction (it will require the trace map formalism, which will be recalled in Sect. 1.4). A particular consequence of the description given there is that the desired identity holds:

Theorem 1.2

For every \(\lambda > 0,\) \(\tilde{\alpha }^+_u(\lambda )\) and \(\tilde{\alpha }^-_u(\lambda )\) are equal and independent of \(\omega \in {\mathbb {T}}\).

The interpretation of this statement is that, for all values of the coupling constant and the phase, the fastest part of the wavepacket travels uniformly on a power-law scale. That is, there are no two different sequences of time scales along which the “front of the wavepacket” moves at two different power-law rates. To the best of our knowledge this is the first time this phenomenon has been rigorously established for a model for which \(\tilde{\alpha }^+_u(\lambda )\) and \(\tilde{\alpha }^-_u(\lambda )\) take fractional values. The reason for this is that it is usually very difficult to identify transport exponents exactly (if they take fractional values) and hence most results only establish estimates for them.

The independence of \(\omega \) is also of interest as it confirms what had been expected based on the following intuition. Due to the general ballistic upper bound, the evolution of \(\delta _0\) explores only finite regions of space, up to super-polynomially small tails that do not contribute to the moments of the evolution that are measured via transport exponents, for any bounded time interval. Since the local structure (often called the subwords or the factors) of the potential is the same for all \(\omega \)’s, the quantum state cannot determine the phase \(\omega \) and hence the transport exponents should indeed be independent of it. Nevertheless, this is the first time such a result has been established rigorously in a case where transport exponents take fractional values.

1.3 Density of states measure and gap labeling

Let us recall the definition of the density of states measure and some derived quantities. By the spectral theorem, there are Borel probability measures \(\mu _{\lambda ,\omega }\) on \({\mathbb {R}}\) such that

$$\begin{aligned} \langle \delta _0, g(H_{\lambda ,\omega }) \delta _0 \rangle = \int g(E) \, d\mu _{\lambda ,\omega }(E) \end{aligned}$$

for all bounded measurable functions g. The density of states measure \(\nu _{\lambda }\) is given by the \(\omega \)-average of these measures with respect to Lebesgue measure, that is,

$$\begin{aligned} \int _{\mathbb {T}}\langle \delta _0, g(H_{\lambda ,\omega }) \delta _0 \rangle \, d\omega = \int g(E) \, d\nu _{\lambda }(E) \end{aligned}$$

for all bounded measurable functions g. By general principles, the density of states measure is non-atomic and its topological support is \(\Sigma _{\lambda }\). The fact that \(\Sigma _{\lambda }\) has zero Lebesgue measure therefore implies that \(\nu _{\lambda }\) is singular continuous for every \(\lambda > 0\). The density of states measure can also be obtained by counting the number of eigenvalues per unit volume, in a given energy region, of restrictions of the operator to finite intervals (which explains the terminology). Indeed, for any real \(a < b\),

$$\begin{aligned} \nu _{\lambda }(a,b) = \lim _{L \rightarrow \infty } \frac{1}{L} \# \big \{ \text {eigenvalues of } H_{\lambda ,\omega }|_{[1,L]} \text { that lie in } (a,b) \big \}, \end{aligned}$$

uniformly in \(\omega \); compare [58]. Here, for definiteness, \(H_{\lambda ,\omega }|_{[1,L]}\) is defined with Dirichlet boundary conditions.
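The eigenvalue counting in this formula can be carried out without diagonalizing anything. The following is a minimal sketch, under stated assumptions, that counts the eigenvalues of the Dirichlet restriction below a given energy by the standard Sturm sequence (sign count of the pivots of \(H - E\)). The function names, the window size, and the sample values \(\lambda = 5\), \(E = 2.5\) are illustrative choices and do not come from the text.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # frequency alpha from the text


def potential(n_sites, coupling, omega=0.0):
    """Values coupling * chi_{[1-alpha,1)}(n*alpha + omega mod 1), n = 1..n_sites."""
    return [coupling if (n * ALPHA + omega) % 1.0 >= 1.0 - ALPHA else 0.0
            for n in range(1, n_sites + 1)]


def count_eigenvalues_below(diagonal, energy):
    """Number of eigenvalues < energy of the tridiagonal matrix with the given
    diagonal and all off-diagonal entries equal to 1 (Dirichlet restriction).

    Uses the Sturm sequence: the number of negative pivots in the LDL^T
    factorization of (H - energy) equals the number of eigenvalues below energy.
    """
    count = 0
    pivot = 1.0
    for i, v in enumerate(diagonal):
        pivot = (v - energy) - (1.0 / pivot if i > 0 else 0.0)
        if pivot == 0.0:
            pivot = 1e-12  # illustrative safeguard against an exact zero pivot
        if pivot < 0.0:
            count += 1
    return count


def ids_estimate(n_sites, coupling, energy, omega=0.0):
    """Finite-volume approximation: (number of eigenvalues below energy) / n_sites."""
    diagonal = potential(n_sites, coupling, omega)
    return count_eigenvalues_below(diagonal, energy) / n_sites


if __name__ == "__main__":
    print(ids_estimate(233, 5.0, 2.5))
```

For \(\lambda > 4\) the energy \(E = \lambda /2\) lies outside \([-2,2] \cup [\lambda -2, \lambda +2]\), which contains the spectrum, so the count there simply equals the number of sites with potential value 0, and the estimate is close to \(1 - \alpha \). The sketch counts eigenvalues strictly below \(E\); this agrees with counting eigenvalues \(\le E\) whenever \(E\) is not itself an eigenvalue of the restriction.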

We will be interested in the optimal Hölder exponent \(\gamma _{\lambda }\) of \(\nu _{\lambda }\). That is, \(\gamma _{\lambda }\) is the unique number in [0, 1] such that the following two properties hold.

  1. For any \(\gamma < \gamma _{\lambda }\) and any sufficiently small interval \(I \subset \mathbb {R}\), we have \(\nu _{\lambda }(I) < |I|^{\gamma }\);

  2. For any \(\tilde{\gamma } > \gamma _{\lambda }\) and any \(\varepsilon > 0\), there exists an interval \(I \subset \mathbb {R}\) such that \(|I| < \varepsilon \) and \(\nu _{\lambda }(I) > |I|^{\tilde{\gamma }}\).

The optimal Hölder exponent of the density of states measure is studied for other popular discrete Schrödinger operators in numerous papers; see, for example, [3, 15, 54–56] and references therein. For the Fibonacci case in the regime of small or large coupling, it was studied in [35]. In particular, it was shown that \(\gamma _{\lambda }\rightarrow 1/2\) as \(\lambda \rightarrow 0\) and \(\gamma _{\lambda }\rightarrow 0\) as \(\lambda \rightarrow \infty \) (the explicit rate at which it does so is recalled in Theorem 1.10 below). In all these works only estimates and asymptotics for the optimal Hölder exponent were established. In this paper, we will express the optimal Hölder exponent in the Fibonacci case explicitly, for any value of the coupling constant, in terms of dynamical quantities related to the Fibonacci trace map (the explicit formula is provided in Theorem 1.6 below). For small values of the coupling constant \(\lambda \), this will allow us to give an exact formula for \(\gamma _{\lambda }\) as a function of \(\lambda \); see Corollary 6.2.

The distribution function of the density of states measure is called the integrated density of states and denoted by \(N_{\lambda }\). Thus, for \(E \in {\mathbb {R}}\), we have

$$\begin{aligned} N_{\lambda }(E)= & {} \int \chi _{(-\infty ,E]} \, d\nu _{\lambda }\\= & {} \lim _{L \rightarrow \infty } \frac{1}{L} \# \big \{ \text {eigenvalues of } H_{\lambda ,\omega }|_{[1,L]} \text { that are } \le E \big \}, \end{aligned}$$

uniformly in \(\omega \).

Since \(\Sigma _{\lambda }\) is the topological support of \(\nu _{\lambda }\), it follows that \(N_{\lambda }\) is constant on each gap of \(\Sigma _{\lambda }\), where any connected component of \({\mathbb {R}}{\setminus } \Sigma _{\lambda }\) is called a gap of \(\Sigma _{\lambda }\). This value may be used as the label of the gap in question. The gap labeling theorem (see, e.g., [11, 64]) provides a set that is defined purely in terms of the underlying dynamical system generating the ergodic family of potentials in question (in our case this is either the irrational rotation of the circle by the (inverse of the) golden ratio, or the shift transformation on the subshift generated by the Fibonacci substitution), to which all gap labels must belong. This general gap labeling theorem specializes in the Fibonacci case to the following statement (see, e.g., [12, Eq. (6.7)]):

$$\begin{aligned} \{ N_{\lambda }(E) : E \in {\mathbb {R}}{\setminus } \Sigma _{\lambda } \} \subseteq \{ \{ m \varphi \} : m \in {\mathbb {Z}}\} \cup \{ 1 \} \end{aligned}$$
(4)

for every \(\lambda > 0\). Here \(\{ m \varphi \}\) denotes the fractional part of \(m \varphi \), that is, \(\{ m \varphi \} = m \varphi - \lfloor m \varphi \rfloor \). Notice that the set of possible gap labels is indeed \(\lambda \)-independent and only depends on the value of \(\varphi \) from the underlying circle rotation. Since \(\varphi \) is irrational, the set of gap labels is dense.
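The set on the right-hand side of (4) is easy to tabulate. The following is a small sketch with hypothetical helper names that lists the labels \(\{ m \varphi \}\) for \(|m| \le M\) and measures the largest hole between consecutive labels, which shrinks as \(M\) grows, in line with the density statement. The cutoff \(M\) and the rounding used to merge floating-point duplicates are illustrative assumptions.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio from (2)
ALPHA = PHI - 1               # equals (sqrt(5) - 1) / 2


def frac(x):
    """Fractional part x - floor(x), as in the text."""
    return x - math.floor(x)


def gap_labels(max_m):
    """Sorted labels {m * phi} for |m| <= max_m, together with the extra label 1."""
    labels = {round(frac(m * PHI), 12) for m in range(-max_m, max_m + 1)}
    labels.add(1.0)
    return sorted(labels)


def largest_hole(labels):
    """Largest distance between consecutive labels in a sorted list."""
    return max(b - a for a, b in zip(labels, labels[1:]))


if __name__ == "__main__":
    for max_m in (5, 20, 80):
        print(max_m, largest_hole(gap_labels(max_m)))
```

Since \(\varphi = \alpha + 1\), the labels \(\{ m \varphi \}\) and \(\{ m \alpha \}\) coincide; for example, \(m = -1\) gives the label \(1 - \alpha \) and \(m = 1\) gives \(\alpha \).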

In general, a dense set of possible gap labels is indicative of a Cantor spectrum and hence a common (and attractive) stronger version of proving Cantor spectrum is to show that the operator “has all its gaps open.” For example, the Ten Martini Problem for the almost Mathieu operator is to show Cantor spectrum, while the Dry Ten Martini Problem is to show that all labels correspond to gaps in the spectrum. The former problem has been completely solved [2], while the latter has not yet been completely settled (it remains open for the case of critical coupling and non-Liouville frequency; see [2, 4, 22] and references therein). Indeed, it is in general a hard problem to show that all labels given by the gap labeling theorem correspond to gaps and there are only few results of this kind.

Here we show the stronger (or “dry”) form of Cantor spectrum for the Fibonacci Hamiltonian and establish complete gap labeling:

Theorem 1.3

For every \(\lambda > 0,\) all gaps allowed by the gap labeling theorem are open. That is, 

$$\begin{aligned} \{ N_{\lambda }(E) : E \in {\mathbb {R}}{\setminus } \Sigma _{\lambda } \} = \{ \{ m \varphi \} : m \in {\mathbb {Z}}\} \cup \{ 1 \}. \end{aligned}$$
(5)

Raymond proved (5) for \(\lambda > 4\) [88] and Damanik and Gorodetski proved (5) for \(\lambda > 0\) sufficiently small [33]. In [33] it was also shown that all gaps open linearly as the coupling constant is turned on. It was conjectured in [33] that (5) holds for every \(\lambda > 0\), and Theorem 1.3 proves this conjecture.

Our next result concerns the exact-dimensionality of the density of states measure for every value of the coupling constant.

Theorem 1.4

For every \(\lambda > 0,\) the density of states measure \(\nu _{\lambda }\) is exact-dimensional. Namely,  for every \(\lambda > 0,\) the limit (called the scaling exponent of \(\nu _{\lambda }\) at E)

$$\begin{aligned} \lim _{\varepsilon \downarrow 0} \frac{\log \nu _{\lambda }(E - \varepsilon , E + \varepsilon )}{\log \varepsilon } \end{aligned}$$

\(\nu _{\lambda }\)-almost everywhere exists and is constant (and equal to \(\dim _H\nu _{\lambda }\)). The dimension \(\dim _H\nu _{\lambda }\) is a real-analytic function of \(\lambda \in (0, \infty )\).

Notice that the analyticity of \(\dim _H \Sigma _{\lambda }\) was previously established in [19]. In [34] Damanik and Gorodetski had shown the exact-dimensionality of \(\nu _{\lambda }\) for \(\lambda > 0\) sufficiently small. A particular consequence of the exact-dimensionality of \(\nu _{\lambda }\) is that virtually all the known characteristics of dimension type of the measure coincide. In particular, the following four dimensions associated with the measure \(\nu _{\lambda }\), those most relevant to quantum dynamics, coincide (namely with the almost everywhere value of the limit above):

$$\begin{aligned} \dim _H \nu _{\lambda }= & {} \inf \{ \dim _H(S) : \nu _{\lambda }(S) = 1 \}, \\ \dim _H^- \nu _{\lambda }= & {} \inf \{ \dim _H(S) : \nu _{\lambda }(S) > 0 \}, \\ \dim _P \nu _{\lambda }= & {} \inf \{ \dim _P(S) : \nu _{\lambda }(S) = 1 \}, \\ \dim _P^- \nu _{\lambda }= & {} \inf \{ \dim _P(S) : \nu _{\lambda }(S) > 0 \}. \end{aligned}$$

Here, \(\dim _P(S)\) denotes the packing dimension of the Borel set \(S \subset {\mathbb {R}}\). These four dimensions are called the upper and lower Hausdorff dimension and the upper and lower packing dimension of \(\nu _{\lambda }\), respectively; compare, for example, [52].

1.4 Trace map dynamics and transversality

There is a fundamental connection between the spectral properties of the Fibonacci Hamiltonian and the dynamics of the trace map

$$\begin{aligned} T : \mathbb {R}^3 \rightarrow \mathbb {R}^3,\quad T(x,y,z)=(2xy-z,x,y). \end{aligned}$$
(6)

The function \(G(x,y,z) = x^2+y^2+z^2-2xyz-1\) is invariant\(^{2}\) under the action of T, and hence T preserves the family of cubic surfaces\(^{3}\)

$$\begin{aligned} S_{\lambda } = \left\{ (x,y,z)\in \mathbb {R}^3 : x^2+y^2+z^2-2xyz=1+ \frac{\lambda ^2}{4} \right\} . \end{aligned}$$
(7)

It is therefore natural to consider the restriction \(T_{\lambda }\) of the trace map T to the invariant surface \(S_{\lambda }\). That is, \(T_{\lambda }:S_{\lambda } \rightarrow S_{\lambda }\), \(T_{\lambda }=T|_{S_{\lambda }}\). We denote by \(\Lambda _{\lambda }\) the set of points in \(S_{\lambda }\) whose full orbits under \(T_{\lambda }\) are bounded (it is known that \(\Lambda _{\lambda }\) is equal to the non-wandering set of \(T_{\lambda }\); see, e.g., Lemma 4.3 in [80]).
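For the reader's convenience, the invariance of \(G\) under \(T\) can be checked by a direct expansion (a routine computation spelled out here for illustration):

$$\begin{aligned} G(T(x,y,z))= & {} (2xy-z)^2 + x^2 + y^2 - 2(2xy-z)xy - 1 \\= & {} 4x^2y^2 - 4xyz + z^2 + x^2 + y^2 - 4x^2y^2 + 2xyz - 1 \\= & {} x^2 + y^2 + z^2 - 2xyz - 1 = G(x,y,z). \end{aligned}$$

In particular, each level set of \(G\), and hence each surface \(S_{\lambda }\), is mapped into itself by \(T\).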

Denote by \(\ell _{\lambda }\) the line

$$\begin{aligned} \ell _{\lambda } = \left\{ \left( \frac{E-\lambda }{2}, \frac{E}{2}, 1 \right) : E \in {\mathbb {R}}\right\} . \end{aligned}$$
(8)

It is easy to check that \(\ell _{\lambda } \subset S_{\lambda }\). The key to the fundamental connection between the spectral properties of the Fibonacci Hamiltonian and the dynamics of the trace map is the following result of Sütő [100]. An energy \(E \in {\mathbb {R}}\) belongs to the spectrum \(\Sigma _{\lambda }\) of the Fibonacci Hamiltonian if and only if the positive semiorbit of the point \((\frac{E-\lambda }{2}, \frac{E}{2}, 1)\) under iterates of the trace map T is bounded. This connection shows that spectral properties of the Fibonacci Hamiltonian can be studied via an analysis of the dynamics of the trace map.
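The criterion above suggests a simple numerical experiment. The following is a heuristic sketch with hypothetical function names; the iteration cap and the escape threshold are arbitrary illustrative choices. It iterates the trace map from the point of \(\ell _{\lambda }\) corresponding to an energy \(E\) and reports whether the orbit leaves a large box. A positive answer indicates that \(E\) is not in \(\Sigma _{\lambda }\); a negative answer proves nothing, since the spectrum is a zero-measure Cantor set and floating-point orbits near it are unstable.

```python
def trace_map(point):
    """The trace map T(x, y, z) = (2xy - z, x, y) from (6)."""
    x, y, z = point
    return (2.0 * x * y - z, x, y)


def initial_point(energy, coupling):
    """Point of the line ell_lambda in (8) that corresponds to the energy E."""
    return ((energy - coupling) / 2.0, energy / 2.0, 1.0)


def surface_defect(point, coupling):
    """x^2 + y^2 + z^2 - 2xyz - (1 + lambda^2/4); vanishes exactly on S_lambda."""
    x, y, z = point
    return x * x + y * y + z * z - 2.0 * x * y * z - 1.0 - coupling ** 2 / 4.0


def escapes(energy, coupling, max_iter=60, bound=1e6):
    """Heuristic test: True if some coordinate exceeds `bound` within max_iter steps.

    Both max_iter and bound are illustrative assumptions, not values from the text.
    """
    point = initial_point(energy, coupling)
    for _ in range(max_iter):
        if max(abs(c) for c in point) > bound:
            return True
        point = trace_map(point)
    return False


if __name__ == "__main__":
    print(surface_defect(initial_point(1.3, 5.0), 5.0))  # close to zero
    print(escapes(2.5, 5.0))
    print(escapes(100.0, 5.0))
```

The first printed value illustrates \(\ell _{\lambda } \subset S_{\lambda }\): on the line one has \(x - y = -\lambda /2\) and \(z = 1\), so \(x^2+y^2+z^2-2xyz = (x-y)^2 + 1 = 1 + \lambda ^2/4\).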

Another very important ingredient is the following. For every \(\lambda > 0\), \(\Lambda _{\lambda }\) is a locally maximal compact transitive hyperbolic set of \(T_{\lambda } : S_{\lambda } \rightarrow S_{\lambda }\); see [19, 21, 32]. This fact allows one to use powerful tools from hyperbolic dynamics in exploring the connection between the operator and the trace map. Actually, this realization is the driving force behind all of the recent advances (roughly those dating back to 2008, starting with [30]). To fully exploit this, one needs that the stable manifolds of points in \(\Lambda _{\lambda }\) intersect the line of initial conditions, \(\ell _{\lambda }\), transversally. This crucial fact was known for \(\lambda \) sufficiently large [21] or sufficiently small [32], but open in the intermediate regime. As a consequence, many of the recent results could only be shown in the regimes of small and large coupling.

Theorem 1.5

For every \(\lambda > 0,\) \(\ell _{\lambda }\) intersects \(W^s(\Lambda _{\lambda })\) transversally.

Here, \(\ell _{\lambda }\) denotes the line of initial conditions given in (8) and \(W^s(\Lambda _{\lambda })\) denotes the collection of stable manifolds of points in the locally maximal compact transitive hyperbolic set \(\Lambda _{\lambda }\) of \(T_{\lambda } : S_{\lambda } \rightarrow S_{\lambda }\).

Theorems 1.1, 1.3, and 1.4 are consequences of Theorem 1.5. In fact, each of the statements in Theorems 1.1, 1.3, and 1.4 was previously known for \(\lambda > 0\) sufficiently small [32–34]; more precisely, these statements were shown in each case to hold for all values of the coupling constant between zero and the specific value where a breakdown of transversality first occurs (or \(\infty \) if no such value exists). Since transversality is easily seen to hold for \(\lambda > 0\) sufficiently small [32], one could derive the desired statements unconditionally in the small coupling regime. For this reason, proving the absence of a breakdown of transversality had been one of the major goals in the study of the Fibonacci Hamiltonian, and Theorem 1.5 finally accomplishes this goal.

It is interesting to note that the proof of Theorem 1.5 is not a straightforward construction of an invariant cone field but rather uses the fact that the trace map is polynomial as well as spectral arguments (the fact that \(\Sigma _{\lambda }\) does not have isolated points).

1.5 Connections between spectral characteristics and dynamical quantities

Recall that we are primarily interested in the following four quantities associated with the Fibonacci Hamiltonian: the upper transport exponents \(\tilde{\alpha }^\pm _u(\lambda )\), the dimension of the spectrum \(\dim _H \Sigma _{\lambda }\), the dimension of the density of states measure \(\dim _H \nu _{\lambda }\), and the optimal Hölder exponent of the integrated density of states \(\gamma _{\lambda }\). Our next main result establishes explicit identities connecting the four spectral/quantum dynamical quantities of interest with dynamical quantities associated with the trace map. In this theorem, \(\mu _{\lambda ,{\mathrm {max}}}\) denotes the measure of maximal entropy of \(T_{\lambda }|_{\Lambda _{\lambda }}\) and \(\mu _{\lambda }\) denotes the equilibrium measure of \(T_{\lambda }|_{\Lambda _{\lambda }}\) that corresponds to the potential \(- \dim _H \Sigma _{\lambda } \cdot \log \Vert DT_{\lambda }|_{E^u}\Vert \). By \({\mathrm {Lyap}}^u(p)\) we will denote the unstable (positive) Lyapunov exponent of the periodic point p, and by \({\mathrm {Lyap}}^u \mu _{\lambda }\) (or \({\mathrm {Lyap}}^u \mu _{\lambda ,{\mathrm {max}}}\)) we will denote the unstable Lyapunov exponent of the ergodic invariant measure \(\mu _{\lambda }\) (respectively, \(\mu _{\lambda ,{\mathrm {max}}}\)). Recall from (2) that \(\varphi \) denotes the golden ratio.

Theorem 1.6

For every \(\lambda > 0,\) we have

$$\begin{aligned} \tilde{\alpha }^\pm _u(\lambda )= & {} \frac{\log \varphi }{\inf _{p \in Per(T_{\lambda })} {\mathrm {Lyap}}^u(p)}, \end{aligned}$$
(9)
$$\begin{aligned} \dim _H \Sigma _{\lambda }= & {} \frac{h_{\mu _{\lambda }}}{{\mathrm {Lyap}}^u \mu _{\lambda }}, \end{aligned}$$
(10)
$$\begin{aligned} \dim _H \nu _{\lambda }= & {} \dim _H \mu _{\lambda ,{\mathrm {max}}} = \frac{h_{\mathrm {top}}(T_{\lambda })}{{\mathrm {Lyap}}^u \mu _{\lambda ,{\mathrm {max}}}} = \frac{\log \varphi }{{\mathrm {Lyap}}^u \mu _{\lambda ,{\mathrm {max}}}}, \end{aligned}$$
(11)
$$\begin{aligned} \gamma _{\lambda }= & {} \frac{\log \varphi }{\sup _{p \in Per(T_{\lambda })} {\mathrm {Lyap}}^u(p)}. \end{aligned}$$
(12)

As mentioned earlier, Theorem 1.2 is a direct consequence of (9). Another consequence of (9) is that we can derive explicit lower bounds for \(\tilde{\alpha }^\pm _u(\lambda )\) by simply estimating \(\inf _{p \in Per(T_{\lambda })} {\mathrm {Lyap}}^u(p)\) from above using specific choices of periodic points. By the same token, these specific choices of periodic points will also lead to upper bounds for \(\gamma _{\lambda }\) due to (12). For example, this leads to the following pair of explicit lower and upper bounds (the period p of the periodic point leading to this bound is given in parentheses).

Corollary 1.7

For every \(\lambda > 0,\) we have

$$\begin{aligned} \gamma _{\lambda }\le & {} \frac{4 \log \varphi }{\log ( 4 \lambda ^2 + \sqrt{16 \lambda ^4 + 56 \lambda ^2 + 45} + 7 ) - \log 2} \le \tilde{\alpha }^\pm _u(\lambda )\quad (p = 4) \end{aligned}$$
(13)
$$\begin{aligned} \gamma _{\lambda }\le & {} \frac{6 \log \varphi }{\log ( \lambda ^4 + \sqrt{( \lambda ^4 + 8 \lambda ^2 + 18 )^2 - 4} + 8 \lambda ^2 + 18 ) - \log 2} \le \tilde{\alpha }^\pm _u(\lambda )\quad (p = 6) \end{aligned}$$
(14)

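The graphs of these two functions are shown in Fig. 1. Since both expressions bound \(\tilde{\alpha }^\pm _u(\lambda )\) from below and \(\gamma _{\lambda }\) from above, the larger one is the better bound for \(\tilde{\alpha }^\pm _u(\lambda )\) and the smaller one is the better bound for \(\gamma _{\lambda }\). We see that for \(\tilde{\alpha }^\pm _u(\lambda )\), (14) is better for small \(\lambda \), while (13) is better for large \(\lambda \), whereas the opposite is true for \(\gamma _{\lambda }\).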

Fig. 1 The two bounds for \(\tilde{\alpha }^\pm _u(\lambda )\) and \(\gamma _{\lambda }\) from Corollary 1.7
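As a numerical companion to Corollary 1.7, the following minimal sketch evaluates the two explicit expressions appearing in (13) and (14). The function names `bound_13` and `bound_14` are ours and purely illustrative; nothing beyond the two formulas is taken from the text.

```python
import math

PHI = (1 + math.sqrt(5)) / 2      # golden ratio
LOG_PHI = math.log(PHI)


def bound_13(lam):
    """Middle expression of (13), the one with numerator 4 log(phi)."""
    s = lam * lam
    inner = 4 * s + math.sqrt(16 * s * s + 56 * s + 45) + 7
    return 4 * LOG_PHI / (math.log(inner) - math.log(2))


def bound_14(lam):
    """Middle expression of (14), the one with numerator 6 log(phi)."""
    s = lam * lam
    a = s * s + 8 * s + 18
    inner = a + math.sqrt(a * a - 4)
    return 6 * LOG_PHI / (math.log(inner) - math.log(2))


if __name__ == "__main__":
    # Tabulate both expressions at a few (arbitrarily chosen) couplings.
    for lam in (0.0, 0.5, 1.0, 2.0, 5.0, 10.0):
        print(f"lambda = {lam:5.1f}  (13): {bound_13(lam):.6f}  (14): {bound_14(lam):.6f}")
```

Both expressions equal 1 at \(\lambda = 0\), and tabulating them at a few couplings shows where one overtakes the other.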

In fact, a better upper bound for \(\gamma _{\lambda }\) can be derived via a different family of periodic points (of period two). The associated Lyapunov exponents can also be given explicitly; for the corresponding expression, see Corollary 6.2. The upper bounds resulting from the Lyapunov exponents of the families of period two (left implicit here) and period six (given above) are given in Fig. 2.

Fig. 2 Upper bounds for \(\gamma _{\lambda }\) via periodic points of period 2 and 6
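The period-six expression can be cross-checked numerically. The sketch below rests on two assumptions that are not restated in this section: that the trace map has the form \(T(x,y,z) = (2xy - z, x, y)\) (the form consistent with the recursion \(x_{k+1} = 2 x_k x_{k-1} - x_{k-2}\) recalled in Sect. 2), and that the point \((0, 0, a)\) with \(a = \sqrt{1 + \lambda ^2/4}\) lies on a six-cycle of T inside \(S_{\lambda }\). All function names are hypothetical. The code multiplies the Jacobians along the cycle, extracts the expanding multiplier \(\mu \), and compares \(6 \log \varphi / \log \mu \) with the expression in (14).

```python
import math

LOG_PHI = math.log((1 + math.sqrt(5)) / 2)


def trace_map(p):
    """Assumed form of the trace map: T(x, y, z) = (2xy - z, x, y)."""
    x, y, z = p
    return (2 * x * y - z, x, y)


def jacobian(p):
    """Derivative of the assumed trace map at the point p."""
    x, y, _ = p
    return [[2 * y, 2 * x, -1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def six_cycle_bound(lam):
    """Return (end point after six steps, 6 log(phi) / log(mu))."""
    a = math.sqrt(1 + lam * lam / 4)
    p = (0.0, 0.0, a)
    d = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for _ in range(6):
        d = matmul(jacobian(p), d)   # chain rule: newest factor on the left
        p = trace_map(p)
    # Assumption: DT^6 has eigenvalues 1, mu, 1/mu, so trace - 1 = mu + 1/mu.
    s = d[0][0] + d[1][1] + d[2][2] - 1.0
    mu = (s + math.sqrt(s * s - 4)) / 2
    return p, 6 * LOG_PHI / math.log(mu)


def bound_14(lam):
    """Middle expression of (14), for comparison."""
    s = lam * lam
    a = s * s + 8 * s + 18
    return 6 * LOG_PHI / (math.log(a + math.sqrt(a * a - 4)) - math.log(2))


if __name__ == "__main__":
    for lam in (0.5, 1.0, 3.0):
        print(lam, six_cycle_bound(lam)[1], bound_14(lam))
```

Under these assumptions the two columns agree to rounding error, which reflects the identity \(\mu + \mu ^{-1} = 16 a^4 + 2 = \lambda ^4 + 8 \lambda ^2 + 18\) along this cycle.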

1.6 Thermodynamical formalism and relations between spectral characteristics

By general principles, we have

$$\begin{aligned} \gamma _{\lambda } \le \dim _H \nu _{\lambda } \le \dim _H \Sigma _{\lambda }. \end{aligned}$$

This is obvious since \(\Sigma _{\lambda }\) supports the measure \(\nu _{\lambda }\), and the almost everywhere scaling exponent of \(\nu _{\lambda }\) is at least as big as one that works at every point. On the other hand, there is no inequality that relates \(\tilde{\alpha }^\pm _u(\lambda )\) to one of the other three quantities, which holds for general operators.Footnote 4 The following theorem shows that for the Fibonacci Hamiltonian and every value of the coupling constant, the four quantities satisfy strict inequalities.

Theorem 1.8

For every \(\lambda > 0,\) we have

$$\begin{aligned} \gamma _{\lambda } < \dim _H \nu _{\lambda } < \dim _H \Sigma _{\lambda } < \tilde{\alpha }^\pm _u(\lambda ). \end{aligned}$$
(15)

The particular inequality \(\dim _H \nu _{\lambda } < \dim _H \Sigma _{\lambda }\) in (15) establishes a conjecture of Barry Simon, which was made based on an analogy with work of Makarov and Volberg [75, 76, 102]; see [34] for a more detailed discussion. This inequality was shown in [34] for \(\lambda > 0\) sufficiently small, and hence the conjecture had been partially established there. Our result here settles it in the generality in which it was stated.Footnote 5

Theorem 1.8 places \(\tilde{\alpha }^\pm _u(\lambda )\) in relation to the other three quantities. As mentioned above, it is in general not clear how it relates to them, and in this case it turns out to be strictly larger. This also realizes the hope expressed in [34] that phase-averaged spectral measures bound phase-averaged transport from below, which is a result known for the critical almost Mathieu operator due to work of Bellissard, Guarneri and Schulz-Baldes [13], but which was not known for the Fibonacci Hamiltonian. Indeed, the density of states measure is the phase-average of the \(\delta _0\)-spectral measures and the transport exponents are phase-independent by Theorem 1.2, and hence are equal to their phase average. Through the particular inequality \(\dim _H \nu _{\lambda } < \tilde{\alpha }^\pm _u(\lambda )\) in (15), we therefore establish here the analogue of the Bellissard–Guarneri–Schulz-Baldes result for the almost Mathieu operator in the case of the Fibonacci Hamiltonian.

Moreover, the inequality

$$\begin{aligned} \dim _H \Sigma _{\lambda } < \tilde{\alpha }^\pm _u(\lambda ) \end{aligned}$$
(16)

in (15) is related to a question of Yoram Last. He asked in [70] whether in general \(\dim _H \Sigma _{\lambda }\) bounds \(\tilde{\alpha }^\pm _u(\lambda )\) from above and conjectured that the answer is no. The inequality (16) confirms this. This realization is not new. It was shown in [46] (resp., [36]) that (16) holds for \(\lambda > 0\) sufficiently large (resp., for \(\lambda > 0\) sufficiently small). What we add here is that it holds for all \(\lambda > 0\).

The identities in Theorem 1.6 are instrumental in our proof of Theorem 1.8. Indeed, once the identities (9)–(12) are established, Theorem 1.8 can be proved using the thermodynamic formalism, which we will describe next. Define \(\phi : \Lambda _{\lambda } \rightarrow {\mathbb {R}}\) by \(\phi (x) = -\log \Vert DT_{\lambda } (x)|_{E^u}\Vert \) and consider the pressure function (sometimes called the Bowen function) \(P : t \mapsto P(t\phi )\), where \(P(\psi )\) is the topological pressure.Footnote 6 This function has been heavily studied; the next statement summarizes some known results; compare [17, 67, 86, 91, 103, 104].

Proposition 1.9

Suppose that \(\sigma _A : \Sigma _A \rightarrow \Sigma _A\) is a topological Markov chain defined by a transitive \(0-1\) matrix A, and \(\phi : \Sigma _A \rightarrow {\mathbb {R}}\) is a Hölder continuous function. Then, the following statements hold.

  1. Variational principle: \(P(t\phi ) = \sup _{\mu \in \mathfrak {M}} \{ h_\mu + t \int \phi \, d\mu \}\).

  2. For every \(t \in {\mathbb {R}},\) there exists a unique \(\sigma _A\)-invariant Borel probability measure \(\mu _t\) (the equilibrium state) such that \(P(t\phi ) = h_{\mu _t} + t \int \phi \, d\mu _t\).

  3. \(P(t\phi )\) is a real analytic function of t.

  4. If \(\phi \) is cohomological to a constant, then \(P(t\phi )\) is a linear function; if \(\phi \) is not cohomological to a constant, then \(P(t\phi )\) is strictly convex and decreasing.

  5. For every \(t_0 \in {\mathbb {R}},\) the line \(h_{\mu _{t_0}} + t \int \phi \, d\mu _{t_0}\) is tangent to the graph of the function \(P(t\phi )\) at the point \((t_0, P(t_0\phi ))\).

  6. Denote by \(\mathfrak {M}\) the space of \(\sigma _A\)-invariant Borel probability measures. The following limits exist:

    $$\begin{aligned} \lim _{t \rightarrow \infty } \int \phi \, d\mu _t = \sup _{\mu \in \mathfrak {M}} \int \phi \, d\mu ,\quad \lim _{t \rightarrow -\infty } \int \phi \, d\mu _t = \inf _{\mu \in \mathfrak {M}} \int \phi \, d\mu . \end{aligned}$$

    The graph of the function \(t \mapsto P(t\phi )\) lies strictly above each of the lines \(t\cdot \sup _{\mu \in \mathfrak {M}} \int \phi \, d\mu \) and \(t\cdot \inf _{\mu \in \mathfrak {M}} \int \phi \, d\mu \).

Now let us return to our case where \(\sigma _A : \Sigma _A \rightarrow \Sigma _A\) is conjugate to \(T_{\lambda }|_{\Lambda _{\lambda }}\) and the potential is given by \(\phi (x) = -\log \Vert DT_{\lambda } (x)|_{E^u}\Vert \) (suppressing the conjugacy). In Sect. 7 we prove that this potential is not cohomological to a constant. Write \(P(t)\) for \(P(t\phi )\). For any \(t \in {\mathbb {R}}\), consider the tangent line to the graph of P(t) at the point \((t, P(t))\); by item 5 of Proposition 1.9, this is the line \(s \mapsto h_{\mu _t} + s \int \phi \, d\mu _t\). Since P(t) is decreasing, there exists exactly one point of intersection of the tangent line with the t-axis, at the point \(t_0 = -\frac{h_{\mu _t}}{\int \phi \, d\mu _t} = \frac{h_{\mu _t}}{{\mathrm {Lyap}}^u \mu _t} = \dim _H \mu _t\). The last equality here is due to [77]. In particular, \(\dim _H \mu _{\lambda ,{\mathrm {max}}} = \dim _H \nu _{\lambda }\) is given by the point of intersection of the tangent line to the graph of P(t) at the point \((0, h_{\mathrm {top}}(T_{\lambda }))\) with the t-axis. Also, due to Theorem 1.6 the line \(h_{\mathrm {top}} (T_{\lambda }) + t \cdot \inf _{\mu \in \mathfrak {M}} \int \phi \, d\mu \) intersects the t-axis at the point \(\gamma _{\lambda }\), and the line \(h_{\mathrm {top}} (T_{\lambda }) + t \cdot \sup _{\mu \in \mathfrak {M}} \int \phi \, d\mu \) intersects the t-axis at the point \(\tilde{\alpha }^\pm _u(\lambda )\). Finally, due to [79], the graph of P(t) intersects the t-axis at the point \(\dim _H\Sigma _{\lambda }\). These observations are illustrated in Fig. 3 and explain where the strict inequalities in Theorem 1.8 come from once it is shown that \(\phi \) is not cohomological to a constant (we do that in Sect. 7).

Fig. 3 Pressure function and spectral characteristics of the Fibonacci Hamiltonian
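The geometric picture behind the four intercepts can be tried out on a toy model. The sketch below does not involve the trace map: it assumes a full shift on two symbols with a potential \(\phi \) that depends only on the first symbol and takes the values \(-\log a\) and \(-\log b\), for which \(P(t\phi ) = \log (a^{-t} + b^{-t})\). The numbers \(a = 3\), \(b = 9\) and all function names are hypothetical choices made for illustration only.

```python
import math


def pressure(t, a=3.0, b=9.0):
    """Toy pressure function P(t*phi) = log(a**-t + b**-t)."""
    return math.log(a ** (-t) + b ** (-t))


def pressure_zero(a=3.0, b=9.0, lo=0.0, hi=10.0, tol=1e-14):
    """Bisection for the unique zero of the decreasing map t -> P(t*phi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pressure(mid, a, b) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def intercepts(a=3.0, b=9.0):
    """Intersections with the t-axis of the four objects discussed in the text."""
    h_top = math.log(2.0)                      # P(0): entropy of the full 2-shift
    sup_phi, inf_phi = -math.log(a), -math.log(b)
    # Integral of phi against the measure of maximal entropy (Bernoulli 1/2, 1/2);
    # this is the slope of the tangent line at t = 0.
    slope_at_zero = 0.5 * (sup_phi + inf_phi)
    return {
        "inf_line": -h_top / inf_phi,          # line h_top + t * inf(int phi)
        "tangent_at_0": -h_top / slope_at_zero,
        "pressure_zero": pressure_zero(a, b),  # zero of the pressure function
        "sup_line": -h_top / sup_phi,          # line h_top + t * sup(int phi)
    }


if __name__ == "__main__":
    for name, value in intercepts().items():
        print(f"{name:14s} {value:.6f}")
```

In this toy model the four intercepts come out strictly increasing in the order listed, mirroring the mechanism by which strict convexity of the pressure function produces strict inequalities of the type in (15).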

1.7 Large coupling asymptotics

For each of the four quantities in question, the large coupling asymptotics are given in the following theorem.

Theorem 1.10

We have

$$\begin{aligned} \lim _{\lambda \rightarrow \infty } \tilde{\alpha }^\pm _u(\lambda ) \cdot \log \lambda= & {} 2 \, \log \varphi , \end{aligned}$$
(17)
$$\begin{aligned} \lim _{\lambda \rightarrow \infty } \dim _H \Sigma _{\lambda } \cdot \log \lambda= & {} \log (1 + \sqrt{2}) \approx 1.83156 \, \log \varphi , \end{aligned}$$
(18)
$$\begin{aligned} \lim _{\lambda \rightarrow \infty } \dim _H \nu _{\lambda } \cdot \log \lambda= & {} \frac{5 + \sqrt{5}}{4} \log \varphi \approx 1.80902 \, \log \varphi , \end{aligned}$$
(19)
$$\begin{aligned} \lim _{\lambda \rightarrow \infty } \gamma _{\lambda } \cdot \log \lambda= & {} 1.5 \, \log \varphi . \end{aligned}$$
(20)

Only (19) is new here; the other results are stated for completeness and comparison purposes. Indeed, (17) was shown in [45, 46], (18) was shown in [30], and (20) was shown in [35]. Thus, our proof of (19) in this paper completes our understanding of the large coupling asymptotics of the four quantities of interest.
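The four limiting constants can be compared at a glance by expressing each of them in units of \(\log \varphi \). The short sketch below simply evaluates the right-hand sides of (17)–(20); the dictionary keys are our own labels.

```python
import math

LOG_PHI = math.log((1 + math.sqrt(5)) / 2)

# Limits of (quantity * log(lambda)) as lambda -> infinity, copied from (17)-(20).
LIMITS = {
    "gamma": 1.5 * LOG_PHI,                        # (20)
    "dim_nu": (5 + math.sqrt(5)) / 4 * LOG_PHI,    # (19)
    "dim_Sigma": math.log(1 + math.sqrt(2)),       # (18)
    "alpha": 2 * LOG_PHI,                          # (17)
}


def in_units_of_log_phi():
    """Each limiting constant divided by log(phi)."""
    return {name: value / LOG_PHI for name, value in LIMITS.items()}


if __name__ == "__main__":
    for name, value in in_units_of_log_phi().items():
        print(f"{name:10s} {value:.5f}")
```

The constants appear in the same order as the quantities in (15), so the strict inequalities of Theorem 1.8 persist at the level of the leading large coupling asymptotics.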

Our results open the door for numerous extensions and generalizations. We briefly discuss some of them in Sect. 8.

2 Preliminaries

For the convenience of the reader we recall in this section briefly how the trace map formalism arises naturally in the study of the Fibonacci Hamiltonian via the transfer matrix formalism and the self-similar structure of the potential, and how this gives rise to dynamical descriptions of spectral quantities. A reader familiar with the recent papers on the Fibonacci Hamiltonian may skip this section.

Consider the operator family \(H_{\lambda ,\omega }\) defined in (1) and recall that \(\alpha = \frac{\sqrt{5}-1}{2}\). The spectral analysis of one-dimensional Schrödinger operators is often carried out through the analysis of the solutions of the associated difference equation

$$\begin{aligned} u(n+1) + u(n-1) + \lambda \chi _{[1-\alpha ,1)}(n \alpha + \omega \!\!\!\! \mod 1) u(n) = E u(n) \end{aligned}$$
(21)

for \(E \in {\mathbb {C}}\).

While (21) looks like the eigenvalue equation for the operator H, we emphasize that the solutions of (21) we consider do not have to belong to \(\ell ^2({\mathbb {Z}})\). Thus, for each \(E \in {\mathbb {C}}\), the solutions of (21) form a two-dimensional vector space. Indeed, as soon as we fix two consecutive values of u, the whole solution is completely determined by (21). For example, suppose we fix u(0) and u(1), then any u(n) is obtained by solving the difference equation “from the origin to n.” This can be formalized using transfer matrices as follows. If we set

$$\begin{aligned} T(m;E) = \begin{pmatrix} E - \lambda \chi _{[1-\alpha ,1)}(m \alpha + \omega \!\!\!\! \mod 1) &{}\quad -1 \\ 1 &{}\quad 0 \end{pmatrix} \end{aligned}$$

and

$$\begin{aligned} A(n;E) = {\left\{ \begin{array}{ll} T(n;E) \times \cdots \times T(1;E) &{} n \ge 1 \\ I &{} n = 0 \\ T(n+1;E)^{-1} \times \cdots \times T(0;E)^{-1} &{} n \le -1, \end{array}\right. } \end{aligned}$$

then u solves (21) for every \(n \in {\mathbb {Z}}\) if and only if

$$\begin{aligned} \begin{pmatrix} u(n+1) \\ u(n) \end{pmatrix} = A(n;E) \begin{pmatrix} u(1) \\ u(0) \end{pmatrix} \end{aligned}$$

for every \(n \in {\mathbb {Z}}\).
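The transfer matrix formalism above can be sketched in a few lines. The code below builds the potential from (21), forms \(A(n;E)\) exactly as defined above, and compares it with a direct solution of the difference equation. The function names, the sample parameters and the choice \(\omega = 0.3\) are assumptions made for this illustration.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # the frequency alpha from the text


def potential(n, lam, omega):
    """lam * chi_[1 - alpha, 1)(n * alpha + omega mod 1)."""
    frac = (n * ALPHA + omega) % 1.0
    return lam if frac >= 1 - ALPHA else 0.0


def transfer(m, energy, lam, omega):
    """One-step transfer matrix T(m; E)."""
    return [[energy - potential(m, lam, omega), -1.0], [1.0, 0.0]]


def inverse(mat):
    """Inverse of a 2x2 matrix of determinant one."""
    (a, b), (c, d) = mat
    return [[d, -b], [-c, a]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def cocycle(n, energy, lam, omega):
    """A(n; E): T(n)...T(1) for n >= 1, I for n = 0, T(n+1)^-1...T(0)^-1 for n <= -1."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    if n >= 1:
        for m in range(1, n + 1):
            result = matmul(transfer(m, energy, lam, omega), result)
    else:
        for m in range(0, n, -1):          # m = 0, -1, ..., n + 1
            result = matmul(inverse(transfer(m, energy, lam, omega)), result)
    return result


def solve(u0, u1, energy, lam, omega, radius):
    """Solve (21) directly from u(0), u(1), for -radius - 1 <= n <= radius + 1."""
    u = {0: u0, 1: u1}
    for n in range(1, radius + 1):         # forward sweep
        u[n + 1] = (energy - potential(n, lam, omega)) * u[n] - u[n - 1]
    for n in range(0, -radius - 1, -1):    # backward sweep
        u[n - 1] = (energy - potential(n, lam, omega)) * u[n] - u[n + 1]
    return u


if __name__ == "__main__":
    u = solve(0.3, -1.2, 0.5, 1.0, 0.3, 5)
    a = cocycle(5, 0.5, 1.0, 0.3)
    print(u[6], a[0][0] * (-1.2) + a[0][1] * 0.3)   # the two numbers should agree
```

Note that the solution is generated from arbitrary initial data \((u(0), u(1))\) and is not required to be square-summable, in line with the remark above.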

Consider for a moment the special case \(\omega = 0\). With the Fibonacci numbers \(\{ F_k \}\) given by \(F_0 = F_1 = 1\), \(F_{k + 1} = F_k + F_{k-1}\), \(k \ge 1\), the matrices \(M_k = A(F_k;E)\), \(k \ge 1\), obey a remarkable recursion [100]:

$$\begin{aligned} M_{k+1} = M_{k-1} M_k. \end{aligned}$$
(22)

This recursion holds initially for \(k \ge 2\), but notice that one can invert it in order to define \(M_k\) for \(k < 1\) via the recursion, so that (22) will then hold for arbitrary \(k \in {\mathbb {Z}}\). For our purposes it suffices to compute the following matrices:

$$\begin{aligned} M_{-1} = \begin{pmatrix} 1 &{}\quad -\lambda \\ 0 &{}\quad 1 \end{pmatrix}, \quad M_{0} = \begin{pmatrix} E &{}\quad -1 \\ 1 &{}\quad 0 \end{pmatrix},\quad M_{1} = \begin{pmatrix} E - \lambda &{}\quad -1 \\ 1 &{}\quad 0 \end{pmatrix}. \end{aligned}$$
(23)
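The recursion (22) and the matrices in (23) can be checked numerically. In the sketch below, \(M_k\) is computed as \(A(F_k;E)\) for \(k \ge 1\) and taken from (23) for \(k = 0, -1\); the helper names, the sample energy and the sample coupling are hypothetical.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2


def transfer(m, energy, lam):
    """T(m; E) for omega = 0."""
    frac = (m * ALPHA) % 1.0
    v = lam if frac >= 1 - ALPHA else 0.0
    return [[energy - v, -1.0], [1.0, 0.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def fibonacci(k):
    """F_k with F_0 = F_1 = 1."""
    a, b = 1, 1
    for _ in range(k):
        a, b = b, a + b
    return a


def m_matrix(k, energy, lam):
    """M_k = A(F_k; E) for k >= 1; the matrices from (23) for k = 0 and k = -1."""
    if k == 0:
        return [[energy, -1.0], [1.0, 0.0]]
    if k == -1:
        return [[1.0, -lam], [0.0, 1.0]]
    result = [[1.0, 0.0], [0.0, 1.0]]
    for m in range(1, fibonacci(k) + 1):
        result = matmul(transfer(m, energy, lam), result)
    return result


def recursion_defect(k, energy, lam):
    """Relative size of M_{k+1} - M_{k-1} M_k (zero up to rounding if (22) holds)."""
    lhs = m_matrix(k + 1, energy, lam)
    rhs = matmul(m_matrix(k - 1, energy, lam), m_matrix(k, energy, lam))
    scale = 1.0 + max(abs(x) for row in lhs for x in row)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2)) / scale


if __name__ == "__main__":
    for k in range(0, 9):
        print(k, recursion_defect(k, 0.5, 1.0))
```

The cases \(k = 0\) and \(k = 1\) test the extended matrices \(M_{-1}\) and \(M_0\) from (23), while \(k \ge 2\) tests the recursion on genuine transfer matrices.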

The matrix \(M_k\) acts as the transfer matrix from 0 to \(F_k\), and it is defined via the values the potential takes on \(\{1, \ldots , F_k\}\). Imagining repeating this block of length \(F_k\) periodically in both directions, we would obtain an \(F_k\)-periodic potential. Floquet theory shows that the spectrum \(\sigma _k\) of this periodic Schrödinger operator is given by

$$\begin{aligned} \sigma _k = \{ E : \mathrm {Tr} (M_k) \in [-2,2] \}. \end{aligned}$$

The map \(E \mapsto \mathrm {Tr} (M_k)\) is called the discriminant of the given periodic operator, and this identity shows that the discriminant determines the spectrum completely.

In other words, if we define \(x_k = \frac{1}{2} \mathrm {Tr} M_k\), then \(\sigma _k\) is the preimage of the interval \([-1,1]\) under the polynomial \(x_k\). These periodic Schrödinger operators serve as periodic approximations of the Fibonacci Hamiltonian, and it is a natural question how the spectrum \(\Sigma _{\lambda }\) of the latter operator may be related to the periodic spectra \(\sigma _k\). It turns out that [100]

$$\begin{aligned} \Sigma _{\lambda } = \bigcap _{k \ge 1} \left( \sigma _k \cup \sigma _{k+1} \right) . \end{aligned}$$
(24)

The identity (24) is obtained as follows. Via the Cayley–Hamilton Theorem, it follows from (22) that the \(x_k\)’s obey the following recursion:

$$\begin{aligned} x_{k+1} = 2 x_k x_{k-1} - x_{k-2}. \end{aligned}$$
(25)
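For the reader's convenience, here is one way to carry out this step; it is a standard computation, written out under the sole assumption that each \(M_k\) has determinant one (which holds because every factor \(T(m;E)\) does). For a \(2 \times 2\) matrix M with \(\det M = 1\), the Cayley–Hamilton Theorem gives \(M + M^{-1} = \mathrm {Tr}(M) \, I\). Applying this to \(M_{k-1}\) and using (22) twice, in the forms \(M_{k+1} = M_{k-1} M_k\) and \(M_k M_{k-1}^{-1} = M_{k-2}\), we find

$$\begin{aligned} \mathrm {Tr}(M_{k+1}) &= \mathrm {Tr}(M_{k-1} M_k) = \mathrm {Tr}(M_{k-1}) \, \mathrm {Tr}(M_k) - \mathrm {Tr}(M_{k-1}^{-1} M_k) \\ &= \mathrm {Tr}(M_{k-1}) \, \mathrm {Tr}(M_k) - \mathrm {Tr}(M_{k-2}), \end{aligned}$$

and dividing by two yields the recursion for \(x_k = \frac{1}{2} \mathrm {Tr} M_k\).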

We see from (23) that

$$\begin{aligned} x_{-1} = 1,\quad x_0 = \frac{E}{2},\quad x_1 = \frac{E - \lambda }{2}. \end{aligned}$$
(26)

This shows how the trace map arises naturally from the self-similar structure of the potential.Footnote 7 Namely, with the map T from (6), we have

$$\begin{aligned} T(x_k, x_{k-1}, x_{k-2}) = (x_{k+1}, x_k, x_{k-1}), \end{aligned}$$

due to (25). In particular, appealing to (26), we see that the iteration of T on the initial point \((\frac{E - \lambda }{2}, \frac{E}{2}, 1)\) generates the sequence \(\{x_k\}\). Recall from (8) that we denote the line of initial conditions by \(\ell _{\lambda }\). That definition is obviously motivated by the observations above.

All these quantities depend on the coupling constant \(\lambda \) and the energy E. The question whether a given energy E belongs to the spectrum \(\Sigma _{\lambda } = \sigma (H_{\lambda ,\omega })\) was shown in [100] to be equivalent to the boundedness of the sequence \(\{ x_k \}_{k \ge -1}\), which by the correspondence above is in turn equivalent to the boundedness of the forward orbit of \((\frac{E - \lambda }{2}, \frac{E}{2}, 1)\) under iterations of T. Thus, a spectral question can be connected to a dynamical question. In fact, all of the spectral quantities associated with the Fibonacci Hamiltonian that were discussed in the previous section can be connected to some dynamical quantity associated with the dynamical system generated by the trace map T. This correspondence is fundamental to the detailed quantitative analysis of the Fibonacci Hamiltonian we perform in this paper.
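The spectral-to-dynamical dictionary suggests a simple numerical test. By (24), an energy lies outside \(\Sigma _{\lambda }\) exactly when \(|x_k| > 1\) and \(|x_{k+1}| > 1\) for some \(k \ge 1\), so truncating the intersection at a finite depth gives an outer approximation of the spectrum. The sketch below implements this with the recursion (25) and the initial data (26); the function name and the default depth are hypothetical, and since floating-point iteration of the trace map is unstable, only moderate depths are meaningful.

```python
def in_outer_approximation(energy, lam, depth=20):
    """Return True if E lies in sigma_k ∪ sigma_{k+1} for every k = 1, ..., depth."""
    # Initial data (26): x_{-1}, x_0, x_1.
    x_km2, x_km1, x_k = 1.0, energy / 2, (energy - lam) / 2
    for _ in range(depth):
        x_next = 2 * x_k * x_km1 - x_km2            # recursion (25)
        if abs(x_k) > 1 and abs(x_next) > 1:         # E is in neither sigma_k nor sigma_{k+1}
            return False
        x_km2, x_km1, x_k = x_km1, x_k, x_next
    return True


if __name__ == "__main__":
    lam = 1.0
    # Scan a grid of energies and report those surviving the depth-12 test.
    grid = [-3 + 0.01 * j for j in range(700)]
    survivors = [e for e in grid if in_outer_approximation(e, lam, depth=12)]
    print(len(survivors), "of", len(grid), "grid energies survive")
```

A return value of `False` certifies, up to rounding, that the energy is not in the spectrum, whereas `True` only says that no escape was detected within the chosen depth.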

This connection can be taken one step further. As was pointed out in the introduction, T leaves each of the surfaces \(S_{\lambda }\) in (7) invariant, and may therefore be restricted to it; we denote this restriction by \(T_{\lambda }\). It is easy to check that \(\ell _{\lambda } \subset S_{\lambda }\). We denote by \(\Lambda _{\lambda }\) the set of points in \(S_{\lambda }\) whose full orbits under \(T_{\lambda }\) are bounded.
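The inclusion \(\ell _{\lambda } \subset S_{\lambda }\) can be verified directly. We assume here, as our reading of (7) and (8), which are not restated in this section, that \(S_{\lambda }\) is the set of real solutions of \(x^2 + y^2 + z^2 - 2xyz - 1 = \frac{\lambda ^2}{4}\) and that \(\ell _{\lambda }\) consists of the points \((\frac{E - \lambda }{2}, \frac{E}{2}, 1)\), \(E \in {\mathbb {R}}\). Substituting such a point gives

$$\begin{aligned} \frac{(E - \lambda )^2}{4} + \frac{E^2}{4} + 1 - 2 \cdot \frac{E - \lambda }{2} \cdot \frac{E}{2} \cdot 1 - 1 = \frac{(E - \lambda )^2 - 2 E (E - \lambda ) + E^2}{4} = \frac{\lambda ^2}{4}, \end{aligned}$$

since the numerator is \(((E - \lambda ) - E)^2\).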

Recall that an invariant closed set \(\Lambda \) of a diffeomorphism \(f : M \rightarrow M\) is hyperbolic if there exists a splitting of the tangent space \(T_xM=E^s_x\oplus E^u_x\) at every point \(x\in \Lambda \) such that this splitting is invariant under Df, the differential Df exponentially contracts vectors from the stable subspaces \(\{E^s_x\}\), and the differential of the inverse, \(Df^{-1}\), exponentially contracts vectors from the unstable subspaces \(\{E^u_x\}\). A hyperbolic set \(\Lambda \) of a diffeomorphism \(f : M \rightarrow M\) is locally maximal if there exists a neighborhood U of \(\Lambda \) such that

$$\begin{aligned} \Lambda =\bigcap _{n\in \mathbb {Z}}f^n(U). \end{aligned}$$

It is known that for \(\lambda > 0\), \(\Lambda _{\lambda }\) is a locally maximal hyperbolic set of \(T_{\lambda } : S_{\lambda } \rightarrow S_{\lambda }\); see [19, 21, 32].

As discussed above, an energy \(E \in \mathbb {R}\) belongs to the spectrum \(\Sigma _{\lambda }\) of the Fibonacci Hamiltonian if and only if the positive semiorbit of the point \((\frac{E-\lambda }{2}, \frac{E}{2}, 1)\) under iterates of the trace map T is bounded. Thus, while the dynamical characterization above does not force \((\frac{E - \lambda }{2}, \frac{E}{2}, 1)\) to belong to \(\Lambda _{\lambda }\), as only the forward orbit needs to be bounded, it actually does force it to be forward-asymptotic to an orbit on \(\Lambda _{\lambda }\). More precisely, \(E \in \Sigma _{\lambda }\) if and only if \((\frac{E - \lambda }{2}, \frac{E}{2}, 1)\) belongs to the stable manifold of a point in \(\Lambda _{\lambda }\). Via the identification of E and its associated point on \(\ell _{\lambda }\), the stable manifolds therefore “transport” information from \(\Lambda _{\lambda }\) to the energy axis. This identifies the spectrum dynamically, and this also identifies the density of states measure as a suitable push-forward of a natural dynamical measure associated with \(T_{\lambda }|_{\Lambda _{\lambda }}\) [34].

3 Transversality

It is known that the stable manifolds of points in \(\Lambda _{\lambda }\) intersect the line \(\ell _{\lambda }\) transversally if \(\lambda > 0\) is sufficiently small [32] or if \(\lambda \ge 16\) [21]. It is also known, based on [10], that if tangential intersections occur in the intermediate regime, they cannot occur at more than finitely many points. This, however, is not sufficient to state uniformly for all values of the coupling constant some of the results that are known to hold in the small and the large coupling regimes. The purpose of this section is to prove that transversality holds for all values of the coupling constant, and some of the immediate consequences; namely, we prove Theorem 1.5, and its consequences—Theorems 1.1, 1.3, and 1.4.

Proof of Theorem 1.5

In what follows, given a curve \(\eta : K\rightarrow K^n\), with \(K = {\mathbb {R}}\) or \(K = {\mathbb {C}}\) and \(n\in {\mathbb N}\), \(\eta ^*\) denotes the image of the curve.

As we have already mentioned, transversality is known for all \(\lambda > 0\) sufficiently small. Let us now assume that \(\lambda _0 > 0\) is such that for all \(\lambda \in (0, \lambda _0)\), transversality holds, while at \(\lambda _0\), \(\ell _{\lambda _0}\cap W^s(\Lambda _{\lambda _0})\) contains tangential intersections. From [10] it is known that such tangencies must be isolated; by compactness of \(\ell _{\lambda }\cap W^s(\Lambda _{\lambda })\), there is at most a finite number of such tangencies.

Let p be a point of such a tangency and let U be an open neighborhood of p in \(S_{\lambda _0}\) such that all the points of \(\ell _{\lambda _0}\cap W^s(\Lambda _{\lambda _0})\cap U\) except p are transversal. Notice that for each \(\lambda \), the sets \(W^s(\Lambda _{\lambda })\) and \(\ell _{\lambda }\) lie on the surface \(S_{\lambda }\).

Now we want to complexify this picture, including the coupling constant \(\lambda \). Let us consider the complexified surfaces \(\hat{S}_{\lambda }\), \(\lambda \in \mathcal {U}\), where \(\mathcal {U}\) is a small neighborhood of \(\lambda _0\) in \(\mathbb {C}\) (notice that here we do not restrict \(\lambda \) only to real values, as we did before). That is,

$$\begin{aligned} \hat{S}_{\lambda }\overset{\mathrm {def}}{=}\left\{ (x,y,z)\in {\mathbb {C}}^3: x^2 + y^2 + z^2 - 2xyz - 1 = \frac{\lambda ^2}{4}\right\} . \end{aligned}$$
(27)

Given a real \(\lambda \) and the real surface \(S_{\lambda }\) as before, let us write \(\hat{S}_{\lambda }\) for the complexification of \(S_{\lambda }\), namely (27).

The point p of tangency between \(\ell _{\lambda _0}\) and \(W^s(\Lambda _{\lambda _0})\) can now be considered as a point in \(\mathbb {C}^3\). By the complex-analytic version of the implicit function theorem, there exists a family of biholomorphisms \(\pi (\cdot , \lambda ): \hat{S}_{\lambda }\rightarrow \hat{S}_{\lambda _0}\) in a neighborhood of p in \({\mathbb {C}}^3\) such that \(\pi (\cdot ,\lambda _0)\) is the identity, \(\pi \) depends holomorphically on \(\lambda \), and for all real \(\lambda \), \(\pi (\cdot , \lambda )\) maps the real part of \(\hat{S}_{\lambda }\), namely \(S_{\lambda }\), to \(S_{\lambda _0}\).

Now let O be an open neighborhood of p in \({\mathbb {R}}^3\) (recall that \(p\in S_{\lambda _0}\) is an assumed point of tangency between \(\ell _{\lambda _0}\) and \(W^s(\Lambda _{\lambda _0})\)), and let \(U_{\lambda } = S_{\lambda } \cap O\). With \(\lambda \) real, \(W^s(\Lambda _{\lambda })\cap U_{\lambda }\) is smoothly projected into \(U_{\lambda _0}\) by \(\pi (\cdot , \lambda )\). Let us denote the resulting laminations by \(\mathcal {F}_{\lambda }\), and the lamination \(W^s(\Lambda _{\lambda _0})\cap U_{\lambda _0}\) by \(\mathcal {F}_{\lambda _0}\). By abuse of notation, let us denote the projection of \(\ell _{\lambda }\) via \(\pi (\cdot , \lambda )\) by the same symbol, \(\ell _{\lambda }\).

Notice that the laminations \(\mathcal {F}_{\lambda }\) consist of real-analytic leaves (see [9, Section 5]), and can be included into a \(C^{1 + \epsilon }\) invariant foliation. Let \(\kappa \) be a parameter in the space of leaves of this foliation, such that the leaves of \(\mathcal {F}_{\lambda }\) depend continuously on \(\kappa \) in the \(C^2\) topology. Moreover, each leaf of \(\mathcal {F}_{\lambda }\) has a canonical continuation in \(\lambda \) that depends holomorphically on \(\lambda \) (for further details, see [18, Section 2]).

Let us denote by \(\phi _{\lambda }^{(\kappa )}\) the leaves of \(\mathcal {F}_{\lambda }\). By \(\phi _{\lambda _0}^{(\kappa _0)}\in \mathcal {F}_{\lambda _0}\) we denote the leaf that admits the tangency with \(\ell _{\lambda _0}\) at p. We will verify that the laminations \(\mathcal {F}_{\lambda }\) satisfy the following properties.

(i):

The leaves \(\phi _{\lambda }^{(\kappa )}\) as well as \(\ell _{\lambda }\) admit holomorphic continuations, \(\hat{\phi }_{\lambda }^{(\kappa )}\) and \(\hat{\ell }_{\lambda }\), respectively, in such a way that for all \(\lambda \), all intersections between \(\hat{\ell }_{\lambda }\) and \(\hat{\phi }_{\lambda }^{(\kappa )}\) are real.

(ii):

For every \(\lambda \) in a neighborhood of \(\lambda _0\), the lamination \(\mathcal {F}_{\lambda }\) is locally homeomorphic to a product of an interval by a Cantor set.

(iii):

Let \(\gamma \) be a transversal to the lamination \(\mathcal {F}_{\lambda }\). For all \(\kappa _1 \ne \kappa _2\), there exists \(\Delta = \Delta (\kappa _1,\kappa _2) > 0\) such that for all \(\lambda \) sufficiently close to \(\lambda _0\) and the leaves \(\phi ^{(\kappa _1)}_{\lambda }, \phi ^{(\kappa _2)}_{\lambda }\) in \(\mathcal {F}_{\lambda }\), the distance along \(\gamma \) between \(\gamma \cap \phi ^{(\kappa _1)}_{\lambda }\) and \(\gamma \cap \phi ^{(\kappa _2)}_{\lambda }\) is not smaller than \(\Delta \).

Verification of (i)

The curves \(\ell _{\lambda }\) are complexified in a natural way (i.e. first complexify the original line of initial conditions, \(\ell _{\lambda }\), before applying the projection \(\pi (\cdot , \lambda )\), and then project). As for the leaves of the foliation: it is known that stable manifolds admit a suitable complexification as complex submanifolds of the complexified invariant surface \(\hat{S}_{\lambda }\) (see [9]).

To verify that all intersections should be real, we appeal to the argument given by Sütő in [100]: an energy E belongs to the spectrum if and only if the forward orbit of the corresponding point on the line of initial conditions is bounded under the trace map. In fact, we only need the implication one way: boundedness of the forward orbit implies inclusion in the spectrum. Sütő considered only real values for the energy (the parameter of the line of initial conditions), since the spectrum is real. But the same argument applies verbatim to complex-valued energies.

On the other hand, a point has a bounded forward orbit if and only if it belongs to a stable manifold of \(\Lambda _{\lambda }\) (this is known and has been used since Casdagli’s work [21]; an explicit proof was given in [40, Corollary 2.5], see also [80]). Since the spectrum is real and the (complexified) line of initial conditions maps \({\mathbb {R}}\) into the real subspace of the invariant surface, all intersection points must be real. Now use the fact that \(\pi \) preserves the real subspace.\(\square \)

Verification of (ii)

This holds since the nonwandering set \(\Lambda _{\lambda }\) for the trace map restricted to \(S_{\lambda }\), \(\lambda \) real and positive, is a hyperbolic horseshoe (see [19, 21, 32]). \(\square \)

Verification of (iii)

This follows from a compactness argument (the lamination depends continuously on \(\lambda \); restrict \(\lambda \) to some compact interval around \(\lambda _0\), and note that, by definition, no two distinct leaves of the lamination \(\mathcal {F}_{\lambda _0}\) intersect). \(\square \)

We will need the following simple lemma, stated here without proof, that could be derived from, for example, [59, Theorem 1.14].

Lemma 3.1

Suppose that \(\phi , \ell : {\mathbb {R}}\rightarrow {\mathbb {R}}^2\) are real-analytic, admitting complex-analytic continuations \(\hat{\phi }, \hat{\ell }: {\mathbb {C}}\rightarrow {\mathbb {C}}^2,\) such that \(\hat{\phi }\) and \(\hat{\ell }\) are injective immersions. Suppose further that at some \(q\in \hat{\phi }^*\cap \hat{\ell }^*,\) the curve \(\hat{\phi }^*\) is tangent to \(\hat{\ell }^*\) and this tangency is isolated. Then there exists an open neighborhood U of q in \({\mathbb {C}}^2\) and a biholomorphism \(\zeta : U\rightarrow \mathbb {D}^2=\mathbb {D}\times \mathbb {D}\) with \(\mathbb {D}\) being the unit disc centered at the origin in \({\mathbb {C}},\) with the following properties:

  1. \(\zeta (q) = (0,0)\).

  2. \(\zeta ({\mathfrak {R}e}(U)) \subseteq {\mathfrak {R}e}(\mathbb {D}^2)\).

  3. \(\zeta \) maps the connected component of \(\hat{\ell }^*\cap U\) that contains q onto \(\mathbb {D}{\times \{0\}}\).

  4. There exists a holomorphic function \(f:\mathbb {D}\rightarrow {\mathbb {C}},\) such that the image of the connected component of \(\hat{\phi }^*\cap U\) that contains q is the graph of f.

We will consider separately the case when the tangency at p is quadratic, and when it is of higher order.

Lemma 3.2

If the tangency at p between \(\hat{\phi }^{(\kappa _0)}_{\lambda _0}\) and \(\hat{\ell }_{\lambda _0}\) is of order greater than two, then there exists \(\lambda \in (0, \lambda _0)\) such that \(\ell _{\lambda }\) contains a point of tangency with some leaf of the lamination \(\mathcal {F}_{\lambda }\).

Proof

Let us assume that the tangency at p between \(\hat{\phi }^{(\kappa _0)}_{\lambda _0}\) and \(\hat{\ell }_{\lambda _0}\) is of order \(k > 2\). Let \(\zeta _0\) be a rectifying biholomorphism as in Lemma 3.1. Then in a neighborhood of p, \(\zeta _0\) maps \(\hat{\ell }_{\lambda _0}^*\) onto \(\mathbb {D}\times \{0\}\) and \(\zeta _0(\hat{\phi }_{\lambda _0}^{(\kappa _0)})\) is the graph of a holomorphic function over \(\mathbb {D}\). Let us denote this function by \(\hat{f}^{(\kappa _0)}_{\lambda _0}\) and its restriction onto \({\mathbb {R}}\) by \(f^{(\kappa _0)}_{\lambda _0}\). Then \(f_{\lambda _0}^{(\kappa _0)}\) has a root of multiplicity k at the origin.

Now by holomorphic dependence on \(\lambda \) of each leaf of the family \(\mathcal {F}_{\lambda }\), as well as of \(\hat{\ell }_{\lambda }\), we can construct a family of rectifying biholomorphisms \(\left\{ \zeta _{\lambda }\right\} \) depending holomorphically on \(\lambda \) for every \(\lambda \) sufficiently close to \(\lambda _0\) with the same properties as \(\zeta _0\) (less the tangency).

Thus \(\zeta _{\lambda }(\mathcal {F}_{\lambda })\) gives a family of laminations in a neighborhood of the origin in \({\mathbb {R}}^2\), such that for each leaf \(\phi _{\lambda }^{(\kappa )}\in \zeta _{\lambda }(\mathcal {F}_{\lambda })\), its complexification \(\hat{\phi }_{\lambda }^{(\kappa )}\) is given as the graph of an analytic function \(\hat{f}_{\lambda }^{(\kappa )}\) over \(\mathbb {D}\). Notice that the resulting functions \(f_{\lambda }^{(\kappa )}\) depend analytically on \(\lambda \) and continuously on \(\kappa \) in the \(C^2\) topology. In particular, \(\hat{f}^{(\kappa _0)}_{\lambda }\rightarrow \hat{f}^{(\kappa _0)}_{\lambda _0}\) uniformly as \(\lambda \rightarrow \lambda _0\).

Hurwitz’s theorem implies that for every \(\lambda \) sufficiently close to \(\lambda _0\), \(\hat{f}_{\lambda }^{(\kappa _0)}\) has precisely \(k > 2\) zeros (counting multiplicity) in a neighborhood of the origin, and these zeros approach the origin as \(\lambda \rightarrow \lambda _0\). Since, by our standing assumptions, for \(\lambda < \lambda _0\) these zeros form transverse intersections, they all must be simple. Thus when \(\lambda < \lambda _0\), there are precisely \(k > 2\) distinct zeros of \(\hat{f}_{\lambda }^{(\kappa _0)}\) in a neighborhood of the origin which approach the origin as \(\lambda \nearrow \lambda _0\). Furthermore, due to property (i) above, these zeros are all real.

The biholomorphism \(\zeta _{\lambda }\) maps \({\mathbb {R}}^2\) to \({\mathbb {R}}^2\), so we obtain a family of analytic functions \(f_{\lambda }^{(\kappa )}: J\rightarrow {\mathbb {R}}\) over an open interval \(J\subset {\mathbb {R}}\) with \(0\in J\). Since \(k\ge 3\), for all \(\lambda < \lambda _0\) and sufficiently close to \(\lambda _0\), there exist two nondegenerate compact intervals, \(J_{\lambda }^{(1)}\) and \(J_{\lambda }^{(2)}\) with disjoint interiors, such that the endpoints of each are given by zeros of \(f_{\lambda }^{(\kappa _0)}\), on the interior of \(J_{\lambda }^{(1)}\), \(f_{\lambda }^{(\kappa _0)} < 0\) and on the interior of \(J_{\lambda }^{(2)}\), \(f_{\lambda }^{(\kappa _0)} > 0\), and \(|{J_{\lambda }^{(i)}}|\rightarrow 0\) as \(\lambda \nearrow \lambda _0\), \(i = 1, 2\).

By continuity, the derivative of \(f_{\lambda }^{(\kappa _0)}\) is bounded uniformly in \(\lambda \) on the interval J. In particular it follows that if \(M_{\lambda }^{(i)}\) denotes the maximum of \(|{f_{\lambda }^{(\kappa _0)}}|\) over \(J_{\lambda }^{(i)}\), then \(M_{\lambda }^{(i)}\rightarrow 0\) as \(\lambda \nearrow \lambda _0\).

Now fix some \(\lambda ' < \lambda _0\) sufficiently close to \(\lambda _0\) as above. There exists \(\alpha \) among the parameters \(\kappa \) such that if \(f_{\lambda }^{(\alpha )}\) is the continuation of \(f_{\lambda _0}^{(\alpha )}\), then we have the following. By assumption (iii), there exists \(\Delta _0 > 0\) such that for all \(\lambda \in [\lambda ', \lambda _0]\), we have \(\vert f_{\lambda }^{(\kappa _0)}-f_{\lambda }^{(\alpha )}\vert > \Delta _0\) on the interval \(J_{\lambda }^{(i)}\). Furthermore, there exists \(i\in \left\{ 1, 2\right\} \) such that for all \(\lambda \in [\lambda ', \lambda _0)\), either \(f_{\lambda }^{(\kappa _0)}\) is negative on the interior of \(J_{\lambda }^{(i)}\) and \(f_{\lambda }^{(\alpha )} > f_{\lambda }^{(\kappa _0)}\) on \(J_{\lambda }^{(i)}\), or \(f_{\lambda }^{(\kappa _0)}\) is positive on the interior of \(J_{\lambda }^{(i)}\) and \(f_{\lambda }^{(\alpha )} < f_{\lambda }^{(\kappa _0)}\) on \(J_{\lambda }^{(i)}\). Let us consider the latter case, the former being completely similar.

It follows that there exist \(\lambda _1< \lambda _2 \in (\lambda ', \lambda _0)\) such that for all \(\lambda \in [\lambda ', \lambda _1)\), the maximum of \(f_{\lambda }^{(\alpha )}\) over \(J_{\lambda }^{(i)}\) is positive and for all \(\lambda \in (\lambda _2, \lambda _0)\), the maximum of \(f_{\lambda }^{(\alpha )}\) over \(J_{\lambda }^{(i)}\) is negative. As a result, there exists \(\lambda \in (\lambda ', \lambda _0)\) such that \(f_{\lambda }^{(\alpha )}\) is nonpositive on \(J_{\lambda }^{(i)}\) and has a zero \(q\in J_{\lambda }^{(i)}\), and hence q is a point of tangency (see Fig. 4). \(\square \)

Fig. 4 (a) Some \(\lambda \in [\lambda ', \lambda _1)\); (b) some \(\lambda \in (\lambda _2, \lambda _0)\)

Fig. 5 (a) \(\lambda = \lambda _0\), (b) \(\lambda < \lambda _0\), (c) \(\lambda = \lambda _0\), (d) \(\lambda < \lambda _0\), (e) \(\lambda = \lambda _0\), (f) \(\lambda < \lambda _0\), (g) \(\lambda = \lambda _0\), (h) \(\lambda < \lambda _0\)

We can now apply Lemma 3.2 to conclude that the tangency at p must either be quadratic, or for some \(\lambda < \lambda _0\), \(\ell _{\lambda }\) intersects \(\mathcal {F}_{\lambda }\) tangentially at some point. On the other hand, by assumption, tangencies cannot occur for \(\lambda < \lambda _0\). Thus the tangency at p must be quadratic.

Assume that we have a quadratic tangency at p between \(\ell _{\lambda _0}\) and some leaf \(\phi _{\lambda _0}\) of the Cantor lamination \(W^s(\Lambda _{\lambda _0})\). Assume for a moment that the leaf \(\phi _{\lambda _0}\) is not a boundary of the lamination. Since for \(\lambda < \lambda _0\) the tangency unfolds, it could either unfold as shown in Fig. 5a, b, or as in c, d; in either case, arbitrarily close to \(\phi _{\lambda _0}\) there exists a leaf of the foliation such that for some \(\lambda < \lambda _0\), this leaf intersects the line tangentially.

Assume now that \(\phi _{\lambda _0}\) is a boundary of the lamination. Since there are no isolated points in the spectrum, in this case we either have an unfolding shown in Fig. 5e, f or g, h.

If the tangency unfolds as shown in Fig. 5e, f, then as before, arbitrarily close to \(\phi _{\lambda _0}\) there exists a leaf such that for some \(\lambda < \lambda _0\), the intersection of the line with this leaf is tangential.

Now suppose that the tangency unfolds as shown in Fig. 5g, h. In this case the interval along \(\ell _{\lambda }\) bounded by the intersection points \(\phi _{\lambda }\cap \ell _{\lambda }\), as shown in the picture, corresponds to a gap in the spectrum. We know that for all sufficiently small couplings \(\lambda \), all the gaps allowed by the gap labeling theorem are open (see the discussion preceding the statement of Theorem 1.3). On the other hand, by assumption, for all \(\lambda < \lambda _0\), the line \(\ell _{\lambda }\) intersects the stable lamination transversally; this allows for continuation of the open gaps from the small coupling regime to all \(\lambda < \lambda _0\) with all gaps remaining open; see [33, Theorem 4.3]. In particular, this also guarantees that for any gap in the spectrum at \(\lambda < \lambda _0\), its two boundary points correspond to the intersection of \(\ell _{\lambda }\) with two stable manifolds of two distinct periodic points (for further details on gap opening, see [33, Section 3]). Thus an intersection of \(\ell _{\lambda }\) for \(\lambda < \lambda _0\) with a stable manifold cannot form a gap, precluding the unfolding of a tangency as shown in Fig. 5g, h.

This shows that the tangency at p cannot be quadratic. Together with Lemma 3.2, this proves Theorem 1.5. \(\square \)

Proof of Theorem 1.1

Given Theorem 1.5, the result follows from [32, Corollary 2] and its proof. \(\square \)

Proof of Theorem 1.3

The result is a consequence of Theorem 1.5 and [33, Theorem 4.3]. \(\square \)

Proof of Theorem 1.4

The assertion of the theorem can be obtained from Theorem 1.5 and [34, Theorem 1.1]; compare the discussion in Remark (e) on [34, p. 978] of the role of \(\lambda _0\) in the formulation of [34, Theorem 1.1]. The analyticity of \(\dim _H \nu _{\lambda }\) follows from [87] combined with Theorem 1.5. \(\square \)

4 Transport exponents

In this section we prove the identity (9). We begin by establishing some results about the dynamics of the trace map.

Proposition 4.1

For every \(\lambda > 0,\) all unstable manifolds of \(T_{\lambda } : S_{\lambda } \rightarrow S_{\lambda }\) are transversal to the circle \(C_{\lambda } := \{ z = 0 \} \cap S_{\lambda }\).

Proof

We know that for every \(\lambda > 0\) and every \(k \in {\mathbb {Z}}_+\), the curve \(T^k_{\lambda }(\ell _{\lambda })\) has the following properties:

  1. \(T^k_{\lambda }(\ell _{\lambda })\) is transversal to the plane \(\{ z = c \}\) for any \(c \in (-1, 1)\);

  2. If we consider \(\ell _{\lambda }\) as a complex line in \(\mathbb {C}^3\), then \(T^k_{\lambda }(\ell _{\lambda })\cap \{z=0\}\) consists of \(F_{k-1}\) points, and all of them are in the real subspace.

Indeed, both statements follow from standard results in Floquet theory. Namely, the z-component of \(T^k_{\lambda }(\ell _{\lambda }(E))\) is, as a function of \(E \in {\mathbb {C}}\), equal to one-half times the discriminant of a discrete Schrödinger operator with a periodic potential of period \(F_{k-1}\) (see, e.g., [100]; cf. Footnote 8). Thus, the values of E for which \(T^k_{\lambda }(\ell _{\lambda }(E)) \in \{z=c\}\) are precisely the values for which one-half the discriminant takes on the value c. If \(c \in (-1, 1)\), then due to, for example, [97, Theorem 5.4.2], there are precisely \(F_{k-1}\) of them, say \(E_1, \ldots , E_{F_{k-1}}\), and for every \(j \in \{ 1, \ldots , F_{k-1} \}\), \(E_j\) is real, and the derivative of the discriminant at \(E_j\) is non-zero.

It is known that all unstable manifolds of \(T_{\lambda } : S_{\lambda } \rightarrow S_{\lambda }\) are transversal to \(C_{\lambda }\) if \(\lambda \) is sufficiently small. Indeed, this is true for \(\lambda = 0\) and extends to small values of \(\lambda \) by continuity. Suppose that Proposition 4.1 does not hold and denote by \(\lambda ^* > 0\) the smallest value of the coupling constant such that one of the unstable manifolds of \(T_{\lambda ^*}\) has a tangency with \(C_{\lambda ^*}\). Notice that this tangency cannot be quadratic. Namely, due to Theorem 1.5 the line \(\ell _{\lambda ^*}\) is transversal to the stable manifolds of \(T_{\lambda ^*}\), and therefore for any sufficiently large \(k \in {\mathbb {Z}}_+\), the curve \(T^k_{\lambda ^*}(\ell _{\lambda ^*})\) contains an arc that is \(C^2\)-close to an arc of the unstable manifold near the point of tangency. But in this case this arc would have a point of quadratic tangency with a plane \(\{z = \varepsilon \}\) for some small \(\varepsilon \), and this contradicts the properties of the curve \(T^k_{\lambda ^*}(\ell _{\lambda ^*})\) above.

Therefore the tangency between \(T^k_{\lambda ^*}(\ell _{\lambda ^*})\) and \(C_{\lambda ^*}\) must be of order \(m > 2\). There exists a (complex) neighborhood \(U \subset S_{\lambda ^*}\) of the point of tangency and a biholomorphic change of coordinates \(F : U \rightarrow \mathbb {D} \times \mathbb {D}\), where \(\mathbb {D}\) is a unit disc in \(\mathbb {C}\), such that \(F(U \cap \{z=0\}) = \mathbb {D} \times \{0\}\), the point of tangency is mapped into 0, and the arc of the unstable manifold in U is mapped into the graph of a holomorphic function \(g : \mathbb {D} \rightarrow \mathbb {C}\) such that \(g(0) = 0\) is a zero of order \(m > 2\). A holomorphic version of the Inclination Lemma (which follows, for example, from the graph transform construction from [59, Lemma 7.5]) implies that for each sufficiently large \(k \in {\mathbb {Z}}_+\), there is a connected component of the intersection \(T^k_{\lambda ^*}(\ell _{\lambda ^*})\cap U\) such that its image under F is a graph of a holomorphic function \(f_k : \mathbb {D} \rightarrow \mathbb {C}\) and \(f_k \rightrightarrows g\). Due to the Hurwitz Theorem, for all large \(k \in {\mathbb {Z}}_+\), the function \(f_k\) must have \(m > 2\) zeros in \(\mathbb {D}\), and due to the properties of \(T^k_{\lambda ^*}(\ell _{\lambda ^*})\), all these zeros must be simple and real. But this once again leads to the existence of a tangency between the curve \(T^k_{\lambda ^*}(\ell _{\lambda ^*})\) and the plane \(\{z = \varepsilon \}\) for some small \(\varepsilon \) (due to the same arguments that were used in the proof of Lemma 3.2 above), which is a contradiction. \(\square \)

For \(E \in {\mathbb {C}}\) and \(k \in {\mathbb {Z}}\), define \(x_k(E)\) by

$$\begin{aligned} T^k\left( \frac{E-\lambda }{2}, \frac{E}{2}, 1 \right) = T^k (\ell _{\lambda }(E)) = \left( x_{k+1}(E), x_k(E), x_{k-1}(E) \right) . \end{aligned}$$

Then, for \(k \ge 0\), \(x_k\) is a polynomial of degree \(F_k\), where \(F_0 = F_1 = 1\), \(F_{k+1} = F_k + F_{k-1}\), \(k \ge 1\). For \(\delta > 0\), set

$$\begin{aligned} \sigma _k^\delta = \{ E \in {\mathbb {C}}: |x_k(E)| \le 1 + \delta \}. \end{aligned}$$
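The definition of \(x_k\) can be made concrete with a short computation. The sketch below assumes the standard form \(T(x,y,z) = (2xy - z, x, y)\) of the Fibonacci trace map, which is not restated in this section; under that assumption the displayed formula gives the recursion \(x_{k+1} = 2 x_k x_{k-1} - x_{k-2}\) with \(x_{-1} = 1\), \(x_0 = E/2\), \(x_1 = (E-\lambda )/2\). The helper names and the sample coupling \(\lambda = 1\) are hypothetical choices made for illustration only.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_sub(a, b):
    """Subtract polynomial b from a and strip vanishing leading coefficients."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    out = [x - y for x, y in zip(a, b)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def trace_polynomials(lam, k_max):
    """Return [x_0, ..., x_{k_max}] as exact coefficient lists in E.

    Assumes the trace map T(x, y, z) = (2xy - z, x, y), i.e. the recursion
    x_{k+1} = 2 x_k x_{k-1} - x_{k-2}.
    """
    lam = Fraction(lam)
    x_prev2 = [Fraction(1)]                  # x_{-1} = 1
    x_prev1 = [Fraction(0), Fraction(1, 2)]  # x_0 = E / 2
    x_curr = [-lam / 2, Fraction(1, 2)]      # x_1 = (E - lam) / 2
    polys = [x_prev1, x_curr]
    for _ in range(2, k_max + 1):
        nxt = poly_sub([2 * c for c in poly_mul(x_curr, x_prev1)], x_prev2)
        x_prev2, x_prev1, x_curr = x_prev1, x_curr, nxt
        polys.append(nxt)
    return polys[: k_max + 1]

def fibonacci(k_max):
    """Fibonacci numbers with the convention F_0 = F_1 = 1 used in the text."""
    F = [1, 1]
    while len(F) <= k_max:
        F.append(F[-1] + F[-2])
    return F[: k_max + 1]

polys = trace_polynomials(lam=1, k_max=8)
degrees = [len(p) - 1 for p in polys]
print(degrees)  # [1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Exact rational arithmetic is used so that the degree count cannot be spoiled by rounding; the leading coefficient of every \(x_k\) is \(1/2\) under the assumed recursion.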

Lemma 4.2

For every \(\lambda > 0,\) there exists \(\delta (\lambda ) > 0\) such that for every \(\delta \in [0,\delta (\lambda ))\) and every \(k \ge 0,\) \(\sigma _k^\delta \) has precisely \(F_k\) connected components. Denote these connected components by \(B_k^{(j)}(\delta ),\) \(j = 1, \ldots , F_k\). Each \(B_k^{(j)}(\delta )\) is symmetric about the real line, intersects \({\mathbb {R}}\) in a compact non-degenerate interval, and contains precisely one \(E_k^{(j)} \in {\mathbb {R}}\) such that \(x_k(E_k^{(j)}) = 0\).

Remark 4.3

  (a) We will choose a consistent labeling, namely the one which ensures that \(B_k^{(j)}(\delta ) \cap {\mathbb {R}}\) lies to the left of \(B_k^{(j')}(\delta ) \cap {\mathbb {R}}\) if \(j < j'\). In particular, we have \(E_k^{(1)} < E_k^{(2)} < \cdots < E_k^{(F_k)}\).

  (b) Clearly, the zero \(E_k^{(j)}\) does not depend on \(\delta \in [0,\delta (\lambda ))\).

Proof of Lemma 4.2

Since the coefficients of the polynomial \(x_k\) are real, we have \(x_k(\bar{E}) = \overline{{x_k(E)}}\), and hence in particular \(|{x_k(\bar{E})}| = |x_k(E)|\). This shows that \(\sigma _k^\delta \), and hence each of its connected components, is symmetric about the real line.

Recall that the free spectrum \(\Sigma _0\) is equal to the interval \([-2,2]\), which corresponds to the line segment

$$\begin{aligned} \ell _0^b = \left\{ \left( \frac{E}{2}, \frac{E}{2}, 1 \right) : E \in [-2,2] \right\} \subset \ell _0. \end{aligned}$$

To study the evolution of \(\ell _0^b\) under the trace map, let us recall the following. The surface

$$\begin{aligned} {\mathbf {S}} = S_0 \cap \{ (x,y,z)\in \mathbb {R}^3 : |x|\le 1, |y|\le 1, |z|\le 1\} \end{aligned}$$

is homeomorphic to the two-dimensional real sphere, invariant under T, smooth everywhere except at the four points \(P_1=(1,1,1)\), \(P_2=(-1,-1,1)\), \(P_3=(1,-1,-1)\), and \(P_4=(-1,1,-1)\), where \(\mathbf {S}\) has conic singularities, and the trace map T restricted to \(\mathbf {S}\) is a factor of the hyperbolic automorphism of \({\mathbb {T}}^2 = {\mathbb {R}}^2 / {\mathbb {Z}}^2\) given by

$$\begin{aligned} \mathcal {A}(\theta _1, \theta _2) = (\theta _1 + \theta _2, \theta _1)\ ({\mathrm{mod}}\ 1). \end{aligned}$$
(28)

The semi-conjugacy is given by the map

$$\begin{aligned} F: (\theta _1, \theta _2) \mapsto (\cos 2\pi (\theta _1 + \theta _2), \cos 2\pi \theta _1, \cos 2\pi \theta _2). \end{aligned}$$
(29)

The map \(\mathcal {A}\) is hyperbolic, and is given by the matrix \(A = \begin{pmatrix} 1 &{}\quad 1 \\ 1 &{}\quad 0 \end{pmatrix}\).
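The semi-conjugacy relation \(F \circ \mathcal {A} = T \circ F\) can be checked numerically on a grid of torus points. The sketch below again assumes the standard trace map \(T(x,y,z) = (2xy - z, x, y)\) and the usual description of \(S_0\) as the zero set of \(x^2 + y^2 + z^2 - 2xyz - 1\); neither formula is restated in this section, so both are assumptions of the example, as are the function names.

```python
import math

def trace_map(x, y, z):
    # Assumed standard form of the Fibonacci trace map.
    return (2 * x * y - z, x, y)

def torus_map(t1, t2):
    # The automorphism (28): (theta1, theta2) -> (theta1 + theta2, theta1) mod 1.
    return ((t1 + t2) % 1.0, t1 % 1.0)

def semi_conjugacy(t1, t2):
    # The map (29).
    return (math.cos(2 * math.pi * (t1 + t2)),
            math.cos(2 * math.pi * t1),
            math.cos(2 * math.pi * t2))

def invariant(x, y, z):
    # Assumed invariant whose zero level set is the surface S_0.
    return x * x + y * y + z * z - 2 * x * y * z - 1

def max_conjugacy_defect(samples):
    """Largest coordinate-wise gap between F(A(theta)) and T(F(theta))."""
    worst = 0.0
    for t1, t2 in samples:
        lhs = semi_conjugacy(*torus_map(t1, t2))
        rhs = trace_map(*semi_conjugacy(t1, t2))
        worst = max(worst, max(abs(a - b) for a, b in zip(lhs, rhs)))
    return worst

samples = [(i / 17.0, j / 23.0) for i in range(17) for j in range(23)]
defect = max_conjugacy_defect(samples)
surface_defect = max(abs(invariant(*semi_conjugacy(t1, t2))) for t1, t2 in samples)
print(defect, surface_defect)  # both should be at the level of rounding error
```

The first coordinate agrees because \(2\cos (a+b)\cos a - \cos b = \cos (2a + b)\); the other two coordinates agree by inspection.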

From the explicit form (29) of the semi-conjugacy F, we see that

$$\begin{aligned} \tilde{\ell }_0^b = \left\{ (\theta _1, \theta _2) : \theta _2 = 0, \; \theta _1 \in \left[ 0,\tfrac{1}{2} \right] \right\} \subset {\mathbb {T}}^2 \end{aligned}$$

is mapped by F onto \(\ell ^b_0\). Since \(T^k (\ell _0^b) = F(A^k (\tilde{\ell }_0^b))\) and \(A^k (\tilde{\ell }_0^b)\) is the line segment from

$$\begin{aligned} A^k \begin{pmatrix} 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \end{aligned}$$

to

$$\begin{aligned} A^k \begin{pmatrix} \frac{1}{2} \\ 0 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} F_k \\ F_{k-1} \end{pmatrix} \end{aligned}$$

(modulo \({\mathbb {Z}}^2\)), we see that \(T^k (\ell _0^b)\) wraps \(F_k/2\) times around \(\mathbf {S}\). Now turn on \(\lambda \). Since the surfaces \(S_{\lambda }\) and the lines of initial conditions \(\ell _{\lambda }\) change continuously, \(T^k (\ell _{\lambda }^b)\) still wraps \(F_k/2\) times around the central part of \(S_{\lambda }\). Here, \(\ell _{\lambda }^b\) is the line segment on \(\ell _{\lambda }\) that corresponds to the convex hull of \(\Sigma _{\lambda }\) via the map \(E \mapsto ( \frac{E-\lambda }{2}, \frac{E}{2}, 1)\). Moreover, the extremal values reached during each turn-around (of the second coordinate, say) are now at least \(1 + \frac{\lambda ^2}{4}\) in absolute value. This implies that (again considering the second coordinate, say, which determines \(x_k(E)\)) the value of \(x_k(E)\) runs at least from \(-1 - \frac{\lambda ^2}{4}\) to \(1 + \frac{\lambda ^2}{4}\) and vice versa. In particular, for every \(\delta \in (0,\frac{\lambda ^2}{4})\), the preimage of \([-1-\delta ,1+\delta ]\) under \(x_k\) consists of precisely \(F_k\) compact mutually disjoint intervals. This shows that \(\sigma _k^\delta \cap {\mathbb {R}}\) has exactly \(F_k\) connected components, each of which contains precisely one zero of \(x_k\). Let us denote these \(F_k\) real zeros of \(x_k\) by \(E_k^{(1)} < E_k^{(2)} < \cdots < E_k^{(F_k)}\).
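The count of real components of \(\sigma _k^\delta \) can be explored numerically for small k. The following is a rough sketch, not part of the proof: it evaluates \(x_k\) pointwise through the recursion \(x_{j+1} = 2x_j x_{j-1} - x_{j-2}\) (which assumes the trace map \(T(x,y,z) = (2xy-z, x, y)\)) and counts maximal runs of grid points with \(|x_k(E)| \le 1 + \delta \). The window \([-4, 5]\), the grid size, and the values \(\lambda = 1\), \(\delta = 0.05\) are hypothetical choices; a grid that is too coarse would merge or miss components.

```python
def x_k(E, lam, k):
    """Evaluate x_k(E) by the assumed recursion x_{j+1} = 2 x_j x_{j-1} - x_{j-2}."""
    a, b, c = 1.0, E / 2.0, (E - lam) / 2.0  # x_{-1}, x_0, x_1
    if k == 0:
        return b
    for _ in range(k - 1):
        a, b, c = b, c, 2.0 * c * b - a
    return c

def count_real_components(lam, k, delta, lo=-4.0, hi=5.0, n_grid=50_000):
    """Count maximal runs of grid points in [lo, hi] with |x_k(E)| <= 1 + delta."""
    count, inside = 0, False
    for i in range(n_grid + 1):
        E = lo + (hi - lo) * i / n_grid
        now_inside = abs(x_k(E, lam, k)) <= 1.0 + delta
        if now_inside and not inside:
            count += 1  # entered a new component
        inside = now_inside
    return count

# Component counts for k = 0, ..., 5; by the argument above one expects
# F_k = 1, 1, 2, 3, 5, 8 as long as delta is small and the grid is fine enough.
counts = [count_real_components(lam=1.0, k=k, delta=0.05) for k in range(6)]
print(counts)
```

For \(k = 2\) and \(\lambda = 1\) the count can be confirmed by hand: \(x_2(E) = (E^2 - E - 2)/2\) has its minimum \(-9/8\) at \(E = 1/2\), so the set \(\{|x_2| \le 1.05\}\) splits into two intervals.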

Let us argue that each \(E_k^{(j)}\) is also the only zero of \(x_k\) in the complex connected component \(B_k^{(j)}(\delta )\) of \(\sigma _k^\delta \) that contains the real connected component containing \(E_k^{(j)}\). Suppose this fails, so that \(B_k^{(j)}(\delta )\) contains another zero of \(x_k\), and hence another connected component of \(\sigma _k^{\delta } \cap {\mathbb {R}}\). Since \(\sigma _k^{\delta }\) is symmetric with respect to the reflection about the real axis, the boundary of \(B_k^{(j)}(\delta )\), on which \(x_k\) has constant modulus \(1 + \delta \), then contains a closed curve that bounds a bounded region containing points at which \(x_k\) has modulus strictly larger than \(1 + \delta \) (e.g., points on the real line strictly between the two connected components of \(\sigma _k^{\delta } \cap {\mathbb {R}}\) in question). This contradicts the maximum modulus principle. It follows that \(\sigma _k^\delta \), too, has precisely \(F_k\) connected components, each of which contains precisely one root of \(x_k\), which is real. \(\square \)

Proposition 4.4

For every \(\lambda > 0\) and every \(\varepsilon > 0,\) there exists \(k_0 \in {\mathbb {Z}}_+\) such that for every \(k > k_0\) and every \(p \in \Lambda _{\lambda },\) there exists \(E_k \in {\mathbb {R}}\) such that \(x_k(E_k) = 0\) and

$$\begin{aligned} \frac{1}{k}\log \Vert DT^k(p)|_{E^u_p}\Vert -\varepsilon \le \frac{1}{k}\log |x'_k(E_k)|\le \frac{1}{k}\log \Vert DT^k(p)|_{E^u_p}\Vert +\varepsilon . \end{aligned}$$

Proof

Denote as before \(C_{\lambda } = \{ z = 0 \} \cap S_{\lambda }\). Fix a small \(\delta > 0\). Then there exists \(k' \in {\mathbb {Z}}_+\) such that \(T^{k'}(W_\delta ^u(p)) \cap C_{\lambda } \ne \emptyset \) and \(T^{-k'}(W_\delta ^s(p)) \cap \ell _{\lambda } \ne \emptyset \) for any \(p \in \Lambda _{\lambda }\). Choose any \(p \in \Lambda _{\lambda }\) and pick any point \(\tilde{p} \in T^{-k'}(W_\delta ^s(p))\cap \ell _{\lambda }\). Let \(\tau \subset \ell _{\lambda }\) be an interval that contains \(\tilde{p}\) and such that \(T^{k'}(\tau )\) is a connected component of \(T^{k'}(\ell _{\lambda })\cap U_\delta (p)\). We will denote \(\tau _{k'} = T^{k'}(\tau )\) and \(\tau _{k'+n} = U_\delta (T^n(p)) \cap T(\tau _{k'+n-1})\) for \(n \ge 1\).

Let k be sufficiently large and set \(n = k-2k'\). Then \(T^{k'}(\tau _{k'+n})\) must intersect \(C_{\lambda }\). Take any point \(p^{**} \in C_{\lambda } \cap T^{k'}(\tau _{k'+n})\). Then \(p^* := T^{-k}(p^{**}) \in \ell _{\lambda }\), so \(p^* = \ell _{\lambda }(E_k)\) for some \(E_k \in {\mathbb {R}}\). Let us estimate \(\log |x'_k(E_k)|\). Since \(W^u(\Lambda _{\lambda })\) is transversal to \(C_{\lambda }\) by Proposition 4.1, and \(T^{k'}(\tau _{k'+n})\) is \(C^{1}\)-close to \(T^{k'}(W_\delta ^u(T^n(p)))\),

$$\begin{aligned} \left| \log |x'_k(E_k)| - \log \Vert DT^k(p^*)|_{\ell _{\lambda }}\Vert \right| < C_1, \end{aligned}$$

where \(C_1\) is some constant independent of k. On the other hand,

$$\begin{aligned} \left| \log \Vert DT^k(p^*)|_{\ell _{\lambda }}\Vert - \log \Vert DT^n(T^{k'}(p^*))|_{\tau _{k'}}\Vert \right| < C_2, \end{aligned}$$

where \(C_2\) is also independent of k. Using [66, Proposition 6.4.16] and the fact that \(\tau _{k'+j}\) is \(C^{1}\)-close to \(W^u_\delta (T^j(\tau _{k'}))\) (see [81]) we conclude that

$$\begin{aligned} \left| \log \Vert DT^n(T^{k'}(p^*))|_{\tau _{k'}}\Vert - \log \Vert DT^n(p)|_{E_p^u}\Vert \right| < C_3, \end{aligned}$$

also with a k-independent constant \(C_3\). But this implies that for large enough \(k=n+2k'\), we have

$$\begin{aligned} \left| \frac{1}{k} \log |x'_k(E_k)| - \frac{1}{k} \log \Vert DT^k(p)|_{E_p^u}\Vert \right| < \varepsilon . \end{aligned}$$

\(\square \)

Proposition 4.5

For every \(\lambda > 0\) and every \(\varepsilon > 0,\) there exists \(k_0 \in \mathbb {N}\) such that for every \(k > k_0\) and every \(E_k \in {\mathbb {R}}\) with \(x_k(E_k) = 0,\) one can find \(p \in \Lambda _{\lambda }\) such that

$$\begin{aligned} \frac{1}{k} \log \Vert DT^k(p)|_{E^u_p}\Vert - \varepsilon \le \frac{1}{k} \log |x'_k(E_k)| \le \frac{1}{k} \log \Vert DT^k(p)|_{E^u_p}\Vert + \varepsilon . \end{aligned}$$

Proof

Let us choose a ball \(B_{\lambda } \subset {\mathbb {R}}^3\) of sufficiently large radius so that \(\Lambda _{\lambda } \subset B_{\lambda }, W^s(\Lambda _{\lambda }) \cap \ell _{\lambda } \subset B_{\lambda }\), and \(C_{\lambda } := \{ z = 0 \} \cap S_{\lambda } \subset B_{\lambda }\). There exist a neighborhood \(U(\Lambda _{\lambda })\) and \(k_1 \in \mathbb {N}\) such that

  1. if \(x \in B_{\lambda }\) and \(\mathcal {O}^+(x)\cap U(\Lambda _{\lambda }) = \emptyset \), then \(T^n(x) \not \in B_{\lambda }\) for all \(n > k_1\);

  2. if \(x \in U(\Lambda _{\lambda })\) and \(T(x) \not \in U(\Lambda _{\lambda })\), then \(\mathcal {O}^+(T(x)) \cap U(\Lambda _{\lambda }) = \emptyset \);

  3. \(U(\Lambda _{\lambda })\) is contained in the \(\delta \)-neighborhood of \(\Lambda _{\lambda }\), where \(\delta \) is small enough that Anosov Closing Lemma type arguments (more specifically, Proposition 6.4.16 from [66]) can be applied.

Such a neighborhood \(U(\Lambda _{\lambda })\) can be constructed by taking a union of open rectangles around elements of a Markov partition for \(\Lambda _{\lambda }\) so that the usual properties of Markov partitions that allow one to use coding can be applied. Slightly abusing terminology we will refer to those rectangles as the elements of a Markov partition. This will ensure that property (2) holds. Property (1) holds for sufficiently large \(k_1\) since \(\Lambda _{\lambda }\) is the set of bounded orbits of the map \(T_{\lambda }\). Indeed, \(\overline{B_{\lambda }}\backslash U(\Lambda _{\lambda })\) is compact, and if (1) does not hold, one can find a sequence of points in \(\overline{B_{\lambda }}\backslash U(\Lambda _{\lambda })\) whose long finite orbits (both positive and negative) are also in that set. Any limit point would have to have a bounded orbit, but this is a contradiction since \(\Lambda _{\lambda }\) is the set of bounded orbits of the map \(T_{\lambda }\).

Let \(k'\in {\mathbb {Z}}_+\) be such that \(\bigcap _{-k' \le n \le k'} T^n(B_{\lambda } \cap S_{\lambda }) \subset U(\Lambda _{\lambda })\). In this case if \(x_k(E_k) = 0\) for \(k \gg \max (k_1,k')\), then \(T^{k'}(\ell _{\lambda }(E_k)) \in U(\Lambda _{\lambda })\), and also \(T^n(\ell _{\lambda }(E_k)) \in U(\Lambda _{\lambda })\) for \(n= k'+1, \ldots , k-k_1\). By the choice of \(B_{\lambda }\), we have \(C_{\lambda }\subset B_{\lambda }\), and since \(x_k(E_k)=0\), we also have \(T^k(\ell _{\lambda }(E_k))\in C_{\lambda }\subset B_{\lambda }\). Set \(P = T^{k'}(\ell _{\lambda }(E_k))\); then \(T^i(P) \in U(\Lambda _{\lambda })\) for all \(0 \le i \le k-k_1-k'\). Let \(\bar{p} \in \Lambda _{\lambda }\) be any point that has the same symbolic dynamics as P over the finite time interval of length \(k-k_1-k'\). In other words, \(\bar{p}\) is such that \(T^i(\bar{p})\) and \(T^i(P)\) belong to the same element of the Markov partition of \(\Lambda _{\lambda }\) for \(i=0, 1, \ldots , k-k'-k_1\). In this case \({\mathrm {dist}} (T^i(\bar{p}), T^i(P)) \le \delta \) for \(i=0, 1, \ldots , k-k'-k_1\). This implies (see Proposition 6.4.16 from [66]) that in fact \({\mathrm {dist}} (T^j(\bar{p}), T^j(P)) \le C \rho ^{\min (j, m-j)}\delta \) for some \(\rho < 1\), where \(m = k-k'-k_1\) and \(0 \le j \le m\). Distortion estimates now imply that

$$\begin{aligned} \left| \log \Vert DT^m(\bar{p})|_{E^u_{\bar{p}}}\Vert - \log \Vert DT^m(P)|_{T^{k'+k_1}(\ell _{\lambda })}\Vert \right| \le C, \end{aligned}$$

where the constant C is independent of m. Take \(p = T^{-k'}(\bar{p})\). Then, for some \(C'\) independent of m, we have

$$\begin{aligned}&\left| \log \Vert DT^k({p})|_{E^u_{p}}\Vert -\log \Vert DT^m(\bar{p})|_{E^u_{\bar{p}}}\Vert \right| \\&\quad +\left| \log \Vert DT^k(\ell _{\lambda }(E_k))|_{\ell _{\lambda }}\Vert -\log \Vert DT^m(P)|_{T^{k'+k_1}(\ell _{\lambda })}\Vert \right| \le C' \end{aligned}$$

and hence

$$\begin{aligned} \left| \frac{1}{k} \log \Vert DT^k({p})|_{E^u_{p}}\Vert - \frac{1}{k} \log \Vert DT^k(\ell _{\lambda }(E_k))|_{\ell _{\lambda }}\Vert \right| \le \frac{(C+C')}{k} \le \varepsilon \end{aligned}$$

if k is sufficiently large. Together with the fact that

$$\begin{aligned} \left| \log |x'_k(E_k)| - \log \Vert DT^k(\ell _{\lambda }(E_k))|_{\ell _{\lambda }}\Vert \right| < C_1 \end{aligned}$$

with \(C_1\) independent of k, this proves Proposition 4.5. \(\square \)

Lemma 4.6

We have

$$\begin{aligned} \lim _{k \rightarrow \infty } \frac{1}{k} \inf _{p \in \Lambda _{\lambda }} \log \Vert DT^k(p)|_{E^u_p}\Vert = \inf _{p \in \Lambda _{\lambda }} {\mathrm {Lyap}}^u(p) = \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p). \end{aligned}$$

That is, the limit on the left-hand side exists and equals the other two expressions.

Proof

Notice that we certainly have

$$\begin{aligned} \liminf _{k \rightarrow \infty } \frac{1}{k} \inf _{p \in \Lambda _{\lambda }} \log \Vert DT^k(p)|_{E^u_p}\Vert \le \inf _{p \in \Lambda _{\lambda }} {\mathrm {Lyap}}^u(p) \le \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p). \end{aligned}$$

Let us show that

$$\begin{aligned} A := \liminf _{k \rightarrow \infty } \frac{1}{k} \inf _{p \in \Lambda _{\lambda }} \log \Vert DT^k(p)|_{E^u_p}\Vert \ge \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p). \end{aligned}$$
(30)

Fix an arbitrarily small \(\varepsilon >0\). There exist \(k_j \rightarrow \infty \) and \(p_j \in \Lambda _{\lambda }\) such that

$$\begin{aligned} \frac{1}{k_j} \log \Vert DT^{k_j}(p_j)|_{E^u_{p_j}}\Vert \le A + \varepsilon . \end{aligned}$$

The specification property (see, for example, [66, Theorem 18.3.9]) implies that for any \(\delta > 0\), we can find a sequence of periodic orbits \(\{q_j\}\) such that

  1. 1.

    \(T^{k_j+M}(q_j)=q_j\), where \(M\in \mathbb {N}\) is independent of \(j\in \mathbb {N}\);

  2. 2.

    \({\mathrm {dist}} (T^i(q_j), T^i(p_j)) \le \delta \) for \(i=0, \ldots , k_j-1\).

Now the quantitative version of the Anosov Closing Lemma (see, e.g., [66, Proposition 6.4.16]) implies that in fact for some \(\rho < 1\),

$$\begin{aligned} {\mathrm {dist}} (T^i(q_j), T^i(p_j)) \le C \rho ^{\min (i, k_j-i)} \delta . \end{aligned}$$

The stable and unstable distributions of a two dimensional horseshoe are \(C^1\), see [66, Corollary 19.1.11]. Now smoothness of the unstable bundle \(\{E^u_x\}_{x\in \Lambda _{\lambda }}\) allows us to use standard distortion estimates and hence to deduce that

$$\begin{aligned} \log \Vert DT^{k_j}(q_j)|_{E^u_{q_j}}\Vert \le \log \Vert DT^{k_j}(p_j)|_{E^u_{p_j}}\Vert + C', \end{aligned}$$

where the constant \(C'\) is independent of j. Hence for large enough \(k_j\), we have

$$\begin{aligned} {\mathrm {Lyap}}^u(q_j)= & {} \frac{1}{k_j+M} \log \Vert DT^{k_j+M}(q_j)|_{E^u_{q_j}}\Vert \\\le & {} \frac{1}{k_j} \log \Vert DT^{k_j}(p_j)|_{E^u_{p_j}}\Vert +\varepsilon + \frac{C'}{k_j} \le A + 3 \varepsilon . \end{aligned}$$

This implies that

$$\begin{aligned} \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p) \le A + 3 \varepsilon , \end{aligned}$$

and since \(\varepsilon > 0\) can be chosen arbitrarily small, we have

$$\begin{aligned} \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p) \le A. \end{aligned}$$

This completes the proof of the inequality (30).

Now we need to show that

$$\begin{aligned} B := \limsup _{k \rightarrow \infty } \frac{1}{k} \inf _{p \in \Lambda _{\lambda }} \log \Vert DT^k(p)|_{E^u_p}\Vert \le \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p). \end{aligned}$$
(31)

Once again, fix an arbitrarily small \(\varepsilon > 0\). Take a periodic point \(p_0 \in Per(\Lambda _{\lambda })\), \(T^m(p_0) = p_0\), such that

$$\begin{aligned} {\mathrm {Lyap}}^u(p_0) \le \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p) + \varepsilon . \end{aligned}$$

For all sufficiently large k, we have

$$\begin{aligned} \frac{1}{k} \log \Vert DT^k(p_0)|_{E^u_{p_0}}\Vert \le {\mathrm {Lyap}}^u(p_0) + \varepsilon \le \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p) + 2 \varepsilon , \end{aligned}$$

hence

$$\begin{aligned} \frac{1}{k} \inf _{p \in \Lambda _{\lambda }} \log \Vert DT^k(p)|_{E^u_p}\Vert\le & {} \frac{1}{k} \log \Vert DT^k(p_0)|_{E^u_{p_0}}\Vert \\\le & {} {\mathrm {Lyap}}^u(p_0)+ \varepsilon \le \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p) + 2 \varepsilon . \end{aligned}$$

Therefore

$$\begin{aligned} B = \limsup _{k \rightarrow \infty } \frac{1}{k} \inf _{p \in \Lambda _{\lambda }} \log \Vert DT^k(p)|_{E^u_p}\Vert \le \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p) + 2 \varepsilon , \end{aligned}$$

and since \(\varepsilon > 0\) is arbitrary, we have \(B \le \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p)\). Together with (30) this completes the proof of Lemma 4.6. \(\square \)

As a direct corollary of Propositions 4.4 and 4.5 and Lemma 4.6 we get the following statement:

Proposition 4.7

We have

$$\begin{aligned} \lim _{k \rightarrow \infty } \frac{1}{k} \log \min _{j=1, \ldots , F_k} \left| x'_k(E_k^{(j)}) \right| = \inf _{p \in Per(\Lambda _{\lambda })} {\mathrm {Lyap}}^u(p). \end{aligned}$$

That is, the limit on the left-hand side exists and is equal to the right-hand side.
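For small k, the quantity \(\frac{1}{k} \log \min _j |x_k'(E_k^{(j)})|\) on the left-hand side can be evaluated directly. The sketch below is illustrative only and says nothing about the limit: it assumes the recursion \(x_{j+1} = 2x_j x_{j-1} - x_{j-2}\) (from the standard trace map \(T(x,y,z) = (2xy-z,x,y)\)), differentiates it term by term, and locates the real zeros by a grid scan followed by bisection. The names `x_and_dx`, `real_zeros`, `finite_k_rate`, the window \([-4,5]\), and the coupling \(\lambda = 1\) are hypothetical choices.

```python
import math

def x_and_dx(E, lam, k):
    """Return (x_k(E), x_k'(E)) using the assumed trace-map recursion."""
    xa, xb, xc = 1.0, E / 2.0, (E - lam) / 2.0  # x_{-1}, x_0, x_1
    da, db, dc = 0.0, 0.5, 0.5                  # their derivatives in E
    if k == 0:
        return xb, db
    for _ in range(k - 1):
        xn = 2.0 * xc * xb - xa
        dn = 2.0 * (dc * xb + xc * db) - da     # product rule
        xa, xb, xc = xb, xc, xn
        da, db, dc = db, dc, dn
    return xc, dc

def real_zeros(lam, k, lo=-4.0, hi=5.0, n_grid=90_000):
    """Real zeros of x_k in [lo, hi]: grid scan plus bisection (simple zeros only)."""
    f = lambda E: x_and_dx(E, lam, k)[0]
    zeros, E_prev, f_prev = [], lo, f(lo)
    for i in range(1, n_grid + 1):
        E = lo + (hi - lo) * i / n_grid
        f_cur = f(E)
        if f_prev == 0.0:                       # grid point is an exact zero
            zeros.append(E_prev)
        elif f_cur != 0.0 and f_prev * f_cur < 0.0:
            a, b, fa = E_prev, E, f_prev
            for _ in range(60):
                m = 0.5 * (a + b)
                fm = f(m)
                if fm == 0.0:
                    a = b = m
                    break
                if fa * fm < 0.0:
                    b = m
                else:
                    a, fa = m, fm
            zeros.append(0.5 * (a + b))
        E_prev, f_prev = E, f_cur
    if f_prev == 0.0:
        zeros.append(E_prev)
    return zeros

def finite_k_rate(lam, k):
    """(1/k) * log of the smallest |x_k'| over the real zeros found."""
    slopes = [abs(x_and_dx(E, lam, k)[1]) for E in real_zeros(lam, k)]
    return math.log(min(slopes)) / k

for k in range(2, 6):
    print(k, len(real_zeros(1.0, k)), finite_k_rate(1.0, k))
```

For \(k = 2\), \(\lambda = 1\) the zeros are \(E = -1\) and \(E = 2\) with \(|x_2'| = 3/2\) at both, so the printed rate for that row is \(\tfrac{1}{2}\log \tfrac{3}{2}\).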

Recall that we considered above the sets \(\sigma _k^\delta \) and their connected components \(B_k^{(j)}(\delta )\). Define further

$$\begin{aligned} r_k^{(j)}(\delta )= & {} \sup \{ r > 0 : B(E_k^{(j)}, r) \subseteq B_k^{(j)}(\delta ) \}, \quad r_k(\delta ) = \max _{j = 1,\ldots ,F_k} r_k^{(j)}(\delta ), \\ R_k^{(j)}(\delta )= & {} \inf \{ R > 0 : B(E_k^{(j)}, R) \supseteq B_k^{(j)}(\delta ) \},\quad R_k(\delta ) = \max _{j = 1,\ldots ,F_k} R_k^{(j)}(\delta ). \end{aligned}$$

The identity (9) will follow from Proposition 4.7 and the following proposition.

Proposition 4.8

  (a) For every \(\lambda > 0\) and \(\delta \in (0,\delta (\lambda )),\) we have

    $$\begin{aligned} \tilde{\alpha }_u^- \ge \frac{\log \varphi }{\limsup _{k \rightarrow \infty } \frac{1}{k} \log \frac{1}{r_k(\delta )}} \end{aligned}$$
    (32)

    and

    $$\begin{aligned} \tilde{\alpha }_u^+ \le \frac{\log \varphi }{\liminf _{k \rightarrow \infty } \frac{1}{k} \log \frac{1}{R_k(\delta )}}. \end{aligned}$$
    (33)
  (b) For every \(\lambda > 0\) and \(\delta \in (0,\delta (\lambda )/2),\) we have

    $$\begin{aligned} \frac{1}{R_k(\delta )} \ge \frac{\delta ^2}{(2 + \delta )(2 + 2\delta )^2} \left( \min _j |x_k'(E_k^{(j)})| \right) \end{aligned}$$
    (34)

    and

    $$\begin{aligned} \frac{1}{r_k(\delta )} \le \frac{(4 + 3\delta )^2}{(2 + \delta )(2 + 2 \delta )^2} \left( \min _j |x_k'(E_k^{(j)})| \right) \end{aligned}$$
    (35)

    for every \(k \ge 0\).

  (c) For \(\lambda > 0\) and \(\delta \in (0,\delta (\lambda )/2),\) we have

    $$\begin{aligned} \tilde{\alpha }_u^- \ge \frac{\log \varphi }{\limsup _{k \rightarrow \infty } \frac{1}{k} \log \left( \min _{j = 1,\ldots ,F_k} \left| x_k'(E_k^{(j)}) \right| \right) }. \end{aligned}$$

    and

    $$\begin{aligned} \tilde{\alpha }_u^+ \le \frac{\log \varphi }{\liminf _{k \rightarrow \infty } \frac{1}{k} \log \left( \min _{j = 1,\ldots ,F_k} \left| x_k'(E_k^{(j)}) \right| \right) }. \end{aligned}$$

Proof

(a) The strategy of proving (32) and (33) is inspired by [36, 45, 46]. The Parseval identity implies (see, e.g., [68, Lemma 3.2])

$$\begin{aligned} 2\pi \int _0^{\infty } e^{-2t/T} | \langle \delta _n, e^{-itH} \delta _0 \rangle |^2 \, dt = \int _{-\infty }^\infty \left| \langle \delta _n, (H - E - \tfrac{i}{T})^{-1} \delta _0 \rangle \right| ^2 \, \textit{dE},\nonumber \\ \end{aligned}$$
(36)

and hence for the time averaged outside probabilities, defined by

$$\begin{aligned} \langle P(N,\cdot ) \rangle (T) = \frac{2}{T} \int _0^{\infty } e^{-2t/T} \sum _{|n| \ge N} | \langle \delta _n, e^{-itH} \delta _0 \rangle |^2 \, dt, \end{aligned}$$
(37)

we have

$$\begin{aligned} \langle P(N,\cdot ) \rangle (T) = \frac{1}{\pi T} \sum _{|n| \ge N} \int _{-\infty }^\infty \left| \langle \delta _n, (H - E - \tfrac{i}{T})^{-1} \delta _0 \rangle \right| ^2 \, \textit{dE}. \end{aligned}$$
(38)

The right-hand side of (38) may be studied by means of transfer matrices at complex energies, which are defined as follows. For \(z \in {\mathbb {C}}\), \(n \in {\mathbb {Z}}\), we set

$$\begin{aligned} M(n;\omega ,z) = {\left\{ \begin{array}{ll} T(n;\omega ,z) \cdots T(1;\omega ,z) &{} n \ge 1, \\ T(n;\omega ,z)^{-1} \cdots T(-1;\omega ,z)^{-1} &{} n \le -1, \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} T(\ell ;\omega ,z) = \begin{pmatrix} z - \lambda \chi _{[1-\alpha ,1)}(\ell \alpha + \omega \!\!\!\! \mod 1) &{}\quad -1 \\ 1 &{}\quad 0 \end{pmatrix}. \end{aligned}$$

The following statement follows from [46, Proposition 2]: For every \(\lambda , \delta > 0\), there are constants \(C,\xi \) such that for every k, every \(z \in \sigma _k^\delta \), and every \(\omega \in {\mathbb {T}}\), we have

$$\begin{aligned} \Vert M(n;\omega ,z) \Vert \le C |n|^\xi \end{aligned}$$
(39)

for \(1 \le |n| \le F_k\). Combining ideas from the proof of [46, Proposition 2] and the proof of [33, Theorem 5.1], one can show the following for the exponent \(\xi \) in (39). If we denote the largest root of the polynomial \(x^3 - (2+\lambda ) x - 1\) by \(a_{\lambda }\) (note that for small \(\lambda > 0\), we have \(a_{\lambda } \approx \varphi + c\lambda \) with a suitable constant c), then for any

$$\begin{aligned} \xi > 2 \frac{\log [(5 + 2\lambda )^{1/2} (3 + \lambda ) a_{\lambda }]}{\log \varphi }, \end{aligned}$$
(40)

there is a constant C such that (39) holds for \(z \in \sigma _k^\delta \) and \(\omega \in {\mathbb {T}}\).
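To get a feeling for the size of the threshold in (40), one can evaluate it numerically. The following sketch finds \(a_{\lambda }\) by bisection on \([1, 3+\lambda ]\), where the cubic \(x^3 - (2+\lambda )x - 1\) changes sign exactly once for \(\lambda \ge 0\), and then evaluates the right-hand side of (40). The function names and the sample couplings are hypothetical; the numbers it prints illustrate the formula and are not results from the text.

```python
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio, the value of a_lambda at lambda = 0

def largest_root(lam):
    """Largest real root a_lambda of x^3 - (2 + lam) x - 1, for lam >= 0."""
    f = lambda x: x ** 3 - (2.0 + lam) * x - 1.0
    lo, hi = 1.0, 3.0 + lam      # f(lo) < 0 < f(hi) for lam >= 0
    for _ in range(200):         # fixed iteration count, so the loop always ends
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def xi_threshold(lam):
    """Right-hand side of (40): 2 log[(5 + 2 lam)^(1/2) (3 + lam) a_lam] / log(phi)."""
    a = largest_root(lam)
    return 2.0 * math.log(math.sqrt(5.0 + 2.0 * lam) * (3.0 + lam) * a) / math.log(PHI)

for lam in (0.0, 0.1, 1.0):
    print(lam, largest_root(lam), xi_threshold(lam))
```

At \(\lambda = 0\) the cubic factors as \((x+1)(x^2 - x - 1)\), so the computed root should agree with \(\varphi \) up to rounding, which gives a quick sanity check on the bisection.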

Let us now consider \(\lambda > 0\), \(\delta \in (0,\delta (\lambda ))\), and \(\varepsilon > 0\). Consider the value of \(j \in \{1, \ldots , F_k\}\) with \(r_k^{(j)}(\delta ) = r_k(\delta )\). By definition, \(E_k^{(j)}\) is the only zero of \(x_k\) in \(B_k^{(j)}(\delta )\).

For \(\rho > 0\) arbitrary, consider

$$\begin{aligned} s = \frac{\limsup _{k \rightarrow \infty } \frac{1}{k} \log \frac{1}{r_k(\delta )}}{\log \varphi } + \rho . \end{aligned}$$
(41)

Clearly, s is strictly positive. By definition of s, for suitably chosen \(C_\delta > 0\), we have

$$\begin{aligned} C_\delta F_k^{s} \ge \frac{2}{r_k(\delta )} \end{aligned}$$
(42)

for every \(k \ge 0\).

Take \(N = F_k\) and consider \(T \ge C_\delta N^{s}\) (which in turn implies \(T \ge \frac{2}{r_k(\delta )}\) by (42)). Due to the Parseval formula (36), we can bound the time-averaged outside probabilities from below as follows,

$$\begin{aligned}&\langle P(N,\cdot ) \rangle (T) \nonumber \\&\quad \gtrsim \frac{1}{T} \int _{\mathbb {R}}\left( \max \left\{ \Vert M(N;\omega ,E+i/T)\Vert , \Vert M(-N;\omega ,E+i/T)\Vert \right\} \right) ^{-2} \, { dE}.\nonumber \\ \end{aligned}$$
(43)

See, for example, the proof of [43, Theorem 1] for an explicit derivation of (43) from (36).

To bound the integral from below, we integrate only over those \(E \in (E_k^{(j)}-r_k(\delta ), E_k^{(j)}+r_k(\delta ))\) for which \(E+i/T \in B(E_k^{(j)}, r_k(\delta )) \subset B_k^{(j)}(\delta )\). Since \(\frac{1}{T} \le \frac{r_k(\delta )}{2}\), the length of such an interval \(I_k\) is larger than \(cr_k(\delta )\) for some suitable \(c > 0\). For \(E \in I_k\), we have

$$\begin{aligned} \Vert M(N;\omega , E+i/T)\Vert \lesssim N^{\xi } \lesssim T^{\frac{\xi }{s}}. \end{aligned}$$

Therefore, (43) together with (39) gives

$$\begin{aligned} \langle P(N,\cdot ) \rangle (T) \gtrsim \frac{r_k}{T} \, T^{-\frac{2\xi }{s}} \gtrsim T^{-2-\frac{2\xi }{s}}, \end{aligned}$$
(44)

where \(N = F_k\), \(T \ge C_\delta N^s\), for any \(k \ge k_0\).

Now let us take any sufficiently large T and choose k maximal with \(C_\delta F_k^s \le T\). Then,

$$\begin{aligned} C_\delta F_k^s \le T < C_\delta F_{k+1}^s \le 2^s C_\delta F_k^s. \end{aligned}$$

It follows from (44) that

$$\begin{aligned} \left\langle P \left( \tfrac{1}{2 C_\delta ^{1/s}} T^{\frac{1}{s}},\cdot \right) \right\rangle (T) \ge \langle P(F_k,\cdot ) \rangle (T) \gtrsim T^{-2-\frac{2\xi }{s}} \end{aligned}$$

for all sufficiently large T. It follows from the definition of \(\tilde{\beta }^-(p)\) and \(\tilde{\alpha }_u^-\) that

$$\begin{aligned} \tilde{\beta }_{\delta _0}^-(p) \ge \frac{1}{s} - \frac{2}{p}\left( 1 + \frac{\xi }{s} \right) \end{aligned}$$

and

$$\begin{aligned} \tilde{\alpha }_u^- \ge \frac{1}{s} = \left( \frac{\limsup _{k \rightarrow \infty } \frac{1}{k} \log \frac{1}{r_k(\delta )}}{\log \varphi } + \rho \right) ^{-1}, \end{aligned}$$

by (41). Since \(\rho > 0\) can be taken arbitrarily small, this proves (32).

Let us recall [45, Lemma 4]: Given any \(\delta > 0\) and \(E \in {\mathbb {C}}\), a necessary and sufficient condition for \(\{x_k(E)\}_{k \ge -1}\) to be unbounded is that

$$\begin{aligned} |x_{K-1}(E)| \le 1 + \delta , \quad |x_{K}(E)| > 1 + \delta , \quad |x_{K+1}(E)| > 1 + \delta \end{aligned}$$
(45)

for some \(K \ge 0\). This K is unique. Moreover, in this case we have

$$\begin{aligned} |x_{K+k}(E)| \ge (1 + \delta )^{F_k}\quad \text {for } k \ge 0. \end{aligned}$$
(46)

By definition of \(R_k(\delta )\), we have

$$\begin{aligned} \sigma _k^\delta \subseteq \{ z \in {\mathbb {C}}: |\mathrm {Im} \ z| \le R_k(\delta ) \}. \end{aligned}$$

We set

$$\begin{aligned} s' = \frac{\liminf _{k \rightarrow \infty } \frac{1}{k} \log \frac{1}{R_k(\delta )}}{\log \varphi } - \rho ' \end{aligned}$$
(47)

for \(\rho ' > 0\) small enough so that \(s' > 0\) (Proposition 4.7 shows that it is possible to find such a \(\rho '\) since the right-hand side in that proposition is positive as \(\Lambda _{\lambda }\) is a hyperbolic set), and then choose some suitable \(C_\delta ' > 0\), so that we have

$$\begin{aligned} R_k(\delta ) < C_\delta ' F_k^{- s'}, \end{aligned}$$

for every \(k \ge 0\). In particular,

$$\begin{aligned} \sigma _k^\delta \cup \sigma _{k+1}^\delta \subseteq \{ z \in {\mathbb {C}}: |\mathrm {Im} \ z| < C_\delta ' F_k^{- s'} \}. \end{aligned}$$
(48)

For each \(\varepsilon = \mathrm {Im} \ z > 0\), one obtains lower bounds on \(|x_k(E+i\varepsilon )|\) which are uniform for \(E \in [-K,K] \subseteq {\mathbb {R}}\). Namely, given \(\varepsilon > 0\), choose k minimal with the property \(C_\delta ' F_k^{- s'} < \varepsilon \). By (48), we infer that \(|x_k(E+i\varepsilon )| > 1 + \delta \) and \(|x_{k+1}(E+i\varepsilon )| > 1 + \delta \). Since \(|x_{-1}(E+i\varepsilon )| = 1 \le 1 + \delta \), we must have the situation of [45, Lemma 4] (as recalled above) for some \(K \le k\). In particular, for \(k' > k\), (46) shows that

$$\begin{aligned} |x_{k'}(E+i\varepsilon )| \ge (1 + \delta )^{F_{k'-k}}. \end{aligned}$$

This motivates the following definitions. Fix some small \(\delta > 0\). For \(T > 1\), denote by k(T) the unique integer with

$$\begin{aligned} \frac{F_{k(T) - 1}^{s'}}{C_\delta '} \le T < \frac{F_{k(T)}^{s'}}{C_\delta '} \end{aligned}$$

and let

$$\begin{aligned} N(T) = F_{k(T) + \lfloor \sqrt{k(T)} \rfloor }. \end{aligned}$$

Thus, for every \(\tilde{\nu } > 0\), there is a constant \(C_{\tilde{\nu }} > 0\) such that

$$\begin{aligned} N(T) \le C_{\tilde{\nu }} T^\frac{1}{s'} T^{\tilde{\nu }}. \end{aligned}$$
(49)

It follows from [45, Theorem 7] and the argument above that

$$\begin{aligned} \langle P(N(T),\cdot ) \rangle (T)\lesssim & {} \exp (-c N(T))\\&+\, T^3 \int _{-K}^K \left( \max _{3 \le n \le N(T)} \left\| M \left( n; \omega , E+\tfrac{i}{T} \right) \right\| ^2 \right) ^{-1} { dE} \\\lesssim & {} \exp (-c N(T)) + T^3 (1 + \delta )^{-2 F_{\lfloor \sqrt{k(T)} \rfloor }}. \end{aligned}$$

(We can estimate the norm on the left half-line in a completely analogous way.) From this bound, we see that \(\langle P(N(T),\cdot ) \rangle (T)\) goes to zero faster than any inverse power of T. Therefore we can apply [45, Theorem 1] and obtain from (49) that

$$\begin{aligned} \tilde{\alpha }_u^+ \le \frac{1}{s'} + \tilde{\nu } = \left( \frac{\liminf _{k \rightarrow \infty } \frac{1}{k} \log \frac{1}{R_k(\delta )}}{\log \varphi } - \rho ' \right) ^{-1} + \tilde{\nu }. \end{aligned}$$

Since we can take \(\rho ' > 0\) and \(\tilde{\nu } > 0\) arbitrarily small, (33) follows.

(b) Let \(\lambda > 0\) and choose \(\delta \in (0,\delta (\lambda )/2)\). Fix k and j, and consider the connected component \(B_k^{(j)}(2\delta )\) of \(\sigma _k^{2\delta }\). Since \(B_k^{(j)}(2\delta )\) contains exactly one zero of \(x_k\), it follows from the maximum modulus principle and Rouché’s Theorem that

$$\begin{aligned} x_k : {\mathrm {int}}(B_k^{(j)}(2\delta )) \rightarrow B(0,1 + 2\delta ) \end{aligned}$$

is univalent, and hence

$$\begin{aligned} x_k^{-1} : B(0,1 + 2\delta ) \rightarrow {\mathrm {int}}(B_k^{(j)}(2\delta )) \end{aligned}$$

is well-defined and univalent as well. Consequently, the following mapping is a Schlicht function:

$$\begin{aligned} F : B(0,1) \rightarrow {\mathbb {C}}, \quad F(z) = \frac{x_k^{-1} ((1 + 2\delta )z) - E_k^{(j)}}{(1 + 2\delta ) [(x_k^{-1})'(0)]}. \end{aligned}$$

That is, F is a univalent function on B(0, 1) with \(F(0) = 0\) and \(F'(0) = 1\).

The Koebe Distortion Theorem (see [23, Theorem 7.9]) implies that

$$\begin{aligned} \frac{|z|}{(1 + |z|)^2} \le |F(z)| \le \frac{|z|}{(1 - |z|)^2} \quad \text {for}\quad |z| < 1. \end{aligned}$$
(50)

Evaluate the bound (50) on the circle \(|z| = \frac{1 + \delta }{1 + 2\delta }\). For such z, we obtain

$$\begin{aligned} \frac{(1 + \delta )(1 + 2\delta )}{(2 + 3 \delta )^2} \le |F(z)| \le \frac{(1 + \delta )(1 + 2\delta )}{\delta ^2}. \end{aligned}$$

By definition of F this means that

$$\begin{aligned} | x_k^{-1} ((1 + 2\delta )z) - E_k^{(j)} | \le \frac{(1 + \delta )(1 + 2\delta )}{\delta ^2} (1 + 2\delta ) |(x_k^{-1})'(0)| \end{aligned}$$

and

$$\begin{aligned} | x_k^{-1} ((1 + 2\delta )z) - E_k^{(j)} | \ge \frac{(1 + \delta )(1 + 2\delta )}{(2 + 3\delta )^2} (1 + 2\delta ) |(x_k^{-1})'(0)| \end{aligned}$$

for all z with \(|z| = \frac{1 + \delta }{1 + 2\delta }\). In other words, if \(|z| = 1 + \delta \), then

$$\begin{aligned} | x_k^{-1} (z) - E_k^{(j)} | \le \frac{(1 + \delta )(1 + 2\delta )^2}{\delta ^2} |(x_k^{-1})'(0)| \end{aligned}$$
(51)

and

$$\begin{aligned} | x_k^{-1} (z) - E_k^{(j)} | \ge \frac{(1 + \delta )(1 + 2\delta )^2}{(2 + 3\delta )^2} |(x_k^{-1})'(0)|. \end{aligned}$$
(52)

Note that as z runs through the circle of radius \(1 + \delta \) around zero, the point \(x_k^{-1} (z)\) runs through the entire boundary of \(B_k^{(j)}(\delta )\). Thus, since \(|(x_k^{-1})'(0)| = |x_k'(E_k^{(j)})|^{-1}\), (51) and (52) yield

$$\begin{aligned}&B \left( E_k^{(j)}, \frac{(1 + \delta )(1 + 2 \delta )^2}{(2 + 3\delta )^2} |x_k'(E_k^{(j)})|^{-1} \right) \\&\quad \subseteq B_k^{(j)}(\delta ) \subseteq B \left( E_k^{(j)}, \left( \frac{(1 + \delta )(1 + 2\delta )}{\delta } \right) ^2 |x_k'(E_k^{(j)})|^{-1} \right) . \end{aligned}$$

In particular, it follows that

$$\begin{aligned}&\frac{(1 + \delta )(1 + 2 \delta )^2}{(2 + 3\delta )^2} |x_k'(E_k^{(j)})|^{-1}\\&\quad \le r_k^{(j)}(\delta ) \le R_k^{(j)}(\delta ) \le \left( \frac{(1 + \delta )(1 + 2\delta )}{\delta } \right) ^2 |x_k'(E_k^{(j)})|^{-1}. \end{aligned}$$

Thus,

$$\begin{aligned} \left( \frac{\delta }{(1 + \delta )(1 + 2\delta )} \right) ^2 |x_k'(E_k^{(j)})|\le & {} \frac{1}{R_k^{(j)}(\delta )} \le \frac{1}{r_k^{(j)}(\delta )}\\\le & {} \frac{(2 + 3\delta )^2}{(1 + \delta )(1 + 2 \delta )^2} |x_k'(E_k^{(j)})|, \end{aligned}$$

which in turn implies

$$\begin{aligned} \left( \frac{\delta }{(1 + \delta )(1 + 2\delta )} \right) ^2 \left( \min _j |x_k'(E_k^{(j)})| \right)\le & {} \frac{1}{R_k(\delta )} \le \frac{1}{r_k(\delta )}\\\le & {} \frac{(2 + 3\delta )^2}{(1 + \delta )(1 + 2 \delta )^2} \left( \min _j |x_k'(E_k^{(j)})| \right) . \end{aligned}$$

This shows (34)–(35).
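The constants appearing in part (b) come from evaluating the Koebe bounds (50) at \(|z| = \frac{1+\delta }{1+2\delta }\) and multiplying by \(1 + 2\delta \). The sketch below redoes this arithmetic with exact rationals; the function names are hypothetical and serve only this check.

```python
from fractions import Fraction


def koebe_bounds(r):
    """Lower and upper Koebe bounds for |F(z)| on the circle |z| = r < 1."""
    return r / (1 + r) ** 2, r / (1 - r) ** 2


def ball_constants(delta):
    """Constants c, C with c / |x_k'| <= r_k^(j) <= R_k^(j) <= C / |x_k'|."""
    d = Fraction(delta)
    lower = (1 + d) * (1 + 2 * d) ** 2 / (2 + 3 * d) ** 2
    upper = ((1 + d) * (1 + 2 * d) / d) ** 2
    return lower, upper


if __name__ == "__main__":
    d = Fraction(1, 10)
    r = (1 + d) / (1 + 2 * d)
    lo, hi = koebe_bounds(r)
    c, C = ball_constants(d)
    # The inner radius constant is exactly the rescaled Koebe lower bound.
    print(lo * (1 + 2 * d) == c)
    # The outer radius constant exceeds the rescaled Koebe upper bound
    # by the harmless factor (1 + delta).
    print(hi * (1 + 2 * d) * (1 + d) == C)
```

Note that the outer constant used in the inclusion is slightly larger than what (51) gives; this only weakens the inclusion and does not affect the exponential rates.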

(c) The estimates in this part follow immediately from the estimates in parts (a) and (b). This concludes the proof. \(\square \)

Proof of (9) in Theorem 1.6

The identity is a direct consequence of Propositions 4.7 and 4.8. \(\square \)

5 The density of states measure

In this section we discuss the density of states measure \(\nu _{\lambda }\). Specifically, we establish the identity (11) and the large coupling asymptotics (19).

The identity (11) was established in [34] for \(\lambda > 0\) sufficiently small. An inspection of the proof given there shows that all that is needed to extend the identity to all \(\lambda > 0\) is the transversality statement provided by Theorem 1.5. Thus, given that Theorem 1.5 has now been established, the identity (11) for all \(\lambda > 0\) follows as an immediate consequence.

Proving (19) will require significantly more work. We begin with the following alternative identity for \(\dim _H \nu _{\lambda }\), which we can prove for \(\lambda \) sufficiently large. Recall that each connected component of \(\sigma _k\) contains precisely one zero of \(x_k\), denoted by \(E_k^{(i)}\), \(1 \le i \le F_k\).

Proposition 5.1

For every \(\lambda >0\) we have

$$\begin{aligned} \dim _H \nu _{\lambda } = \frac{\log \varphi }{\lim _{k \rightarrow \infty } \frac{1}{kF_k} \log \left( \prod _{i = 1}^{F_k} \left| x_k'(E_k^{(i)}) \right| \right) }. \end{aligned}$$
(53)

Proof

Due to (11), we need to show that

$$\begin{aligned} {\mathrm {Lyap}}^u \left( \mu _{\lambda ,{\mathrm {max}}} \right) = \lim _{k \rightarrow \infty } \frac{1}{kF_k} \log \left( \prod _{i=1}^{F_k} \left| x_k'(E_k^{(i)}) \right| \right) , \end{aligned}$$

which is equivalent to

$$\begin{aligned} {\mathrm {Lyap}}^u \left( \mu _{\lambda ,{\mathrm {max}}} \right) = \lim _{k \rightarrow \infty } \frac{1}{k F_{k-1}} \log \left( \prod _{i=1}^{F_{k-1}} \left| x_{k-1}'(E_{k-1}^{(i)}) \right| \right) . \end{aligned}$$

Recall that \(T^{k}_{\lambda }(\ell _{\lambda }(E)) = (x_{k+1}(E), x_k(E), x_{k-1}(E))\), and hence the z-component of \(T^{k}_{\lambda }(\ell _{\lambda }(E))\) is \(x_{k-1}(E)\). Let \(l_i \in \ell _{\lambda }\) be the points such that \(\ell _{\lambda }(E_{k-1}^{(i)}) = l_i\). Due to the transversality of \(T^{k}_{\lambda }(\ell _{\lambda })\) to the plane \(\{z=0\}\), which holds uniformly in k and follows from Proposition 4.1 combined with the Inclination Lemma, we have \(C^{-1} \Vert DT^{k}_{\lambda }(\overline{v_i})\Vert \le | x_{k-1}'(E_{k-1}^{(i)}) | \le C \Vert DT^{k}_{\lambda }(\overline{v_i})\Vert \) for some uniform \(C > 1\), where \(\overline{v_i}\) is a unit vector tangent to \(\ell _{\lambda }\) at the point \(l_i\). Therefore the statement can be reduced to the claim that

$$\begin{aligned} {\mathrm {Lyap}}^u \left( \mu _{\lambda ,{\mathrm {max}}} \right) = \lim _{k \rightarrow \infty } \frac{1}{kF_{k-1}} \log \left( \prod _{\{ l_i \in \ell _{\lambda } : T^{k}_{\lambda }(l_i) \in \{z=0\} \}} \left\| DT^{k}_{\lambda }(\overline{v_i}) \right\| \right) .\quad \quad \end{aligned}$$
(54)

We will need the following statement from hyperbolic dynamics.

Lemma 5.2

Let \(f : M^2 \rightarrow M^2\) be a \(C^2\)-diffeomorphism, let \(\Lambda \) with \(f(\Lambda ) = \Lambda \) be a topologically mixing, locally maximal, totally disconnected hyperbolic set, and let \(U=U(\Lambda )\) be an open set such that \(\bigcap _{n \in \mathbb {Z}}f^n(U(\Lambda ))=\Lambda \). Let \(\gamma _1, \gamma _2\subset U\) be \(C^1\)-smooth curves such that \(\gamma _1\) is transversal to \(W^s(\Lambda ),\) and \(\gamma _2\) is transversal to \(W^u(\Lambda )\). For each \(k\in \mathbb {N}\) denote by \(\{l_i\}_{i=1, \ldots , N_k}\subset \gamma _1\) the set \(f^{-k}(f^k(\gamma _1)\cap \gamma _2)\). Then

$$\begin{aligned} {\mathrm {Lyap}}^u(\mu _{\mathrm {max}}) = \lim _{k \rightarrow \infty } \frac{1}{kN_k} \log \left( \prod _{\{l_i \in \gamma _1 : f^k(l_i) \in \gamma _2 \}} \left| Df^k(\overline{v_i}) \right| \right) , \end{aligned}$$
(55)

where \(\mu _{\mathrm {max}}\) is the measure of maximal entropy for \(f|_{\Lambda } : \Lambda \rightarrow \Lambda ,\) and \(\overline{v}_i\) is a unit vector tangent to \(\gamma _1\) at the point \(l_i\).

Proof

First of all, let us notice that if \(\gamma _1\) is represented as a disjoint union of curves \(\gamma _1'\) and \(\gamma _1''\), and (55) holds for both \(\gamma _1'\) and \(\gamma _1''\), then it also holds for the initial curve \(\gamma _1\). Indeed, this just follows from the fact that if \(\{a_n\}\), \(\{b_n\}\), \(\{x_n\}\), and \(\{y_n\}\) are sequences of positive numbers such that \(\frac{a_n}{b_n}\rightarrow c\) and \(\frac{x_n}{y_n}\rightarrow c\) then \(\frac{a_n+x_n}{b_n+y_n}\rightarrow c\). The same statement (due to the same argument) holds for the curve \(\gamma _2\).

Next, let us notice that if (55) holds for some \(\gamma _1\), then it also holds for \(f(\gamma _1)\cap U\) (and vice versa). Indeed, \(N_k(\gamma _1)=N_{k-1}(f(\gamma _1)\cap U)\), and the expression \(\log (\prod _{\{ l_i \in \gamma _1 : f^k(l_i) \in \gamma _2 \}} | Df^k(\overline{v_i})|)\) differs from the expression \(\log (\prod _{\{ l_i \in f(\gamma _1) : f^{k-1}(l_i) \in \gamma _2 \}} |Df^{k-1}(\overline{v_i})|)\) by no more than \({\mathrm {const}} \cdot N_k(\gamma _1)\). Combining these two observations, we see that it is enough to prove (55) for the case when \(\gamma _1\) is a curve that is \(C^1\)-close to a piece of unstable manifold of \(\Lambda \) inside a rectangle of a Markov partition, and \(\gamma _2\) is a curve that is \(C^1\)-close to a piece of stable manifold of \(\Lambda \) inside a rectangle of a Markov partition.

Moreover, we can further reduce the statement to the case when \(\gamma _1\) is a piece of an unstable manifold in some element of Markov partition, and \(\gamma _2\) is a piece of a stable manifold in some element of Markov partition. Indeed, let us consider \(C^1\)-invariant stable and unstable foliations in \(U(\Lambda )\) that include the stable and unstable laminations \(W^s(\Lambda )\) and \(W^u(\Lambda )\) and the curves \(\gamma _2\) and \(\gamma _1\), respectively. For the existence of these foliations, see [103]. Since the diffeomorphism f is \(C^2\)-smooth, its differential is \(C^1\), and hence the restriction of the differential of f to the unit tangent bundle over a leaf of the unstable foliation is also \(C^1\)-smooth. The exponential instability of orbits near a hyperbolic set (see Proposition 6.4.16 from [66]) now implies that if (55) holds when \(\gamma _1\) and \(\gamma _2\) are pieces of unstable and stable manifolds, respectively, then it also holds for the initial curves \(\gamma _1\) and \(\gamma _2\), which are sufficiently \(C^1\)-close to such pieces.

From now on we can assume that \(\gamma _1\) is a piece of an unstable manifold in some element of Markov partition, and \(\gamma _2\) is a piece of a stable manifold in some element of Markov partition. The restriction \(f|_{\Lambda }\) is conjugate to a topological Markov shift \(\sigma _A : \Sigma _A \rightarrow \Sigma _A\) with some transitive \(0-1\) matrix A of size \(N \times N\).

Lemma 5.3

Let \(\sigma _A : \Sigma _A \rightarrow \Sigma _A\) be a transitive topological Markov chain over the alphabet \(\{1, \ldots , N\},\) and denote by \(\nu _P\) the measure of maximal entropy (Parry measure). Fix any \(\omega ', \omega '' \in \{1, 2, \ldots , N\},\) and admissible sequences

\(\underline{\ldots \text {sequence}_1}\,\, \omega '\)—infinite to the left,  and

\(\omega ''\, \underline{\text {sequence}_2\ldots }\)—infinite to the right.

We assume that at least one of the two one-sided sequences is not eventually periodic.

For each \(k \in \mathbb {N},\) consider the collection \(X_k\) of all the sequences from \(\Sigma _A\) of the form

$$\begin{aligned} \underline{\ldots \text {sequence}_1}\,\, \underbrace{\mathop {\omega '}\limits ^{*} \ldots \ldots \omega ''}_{k+1}\, \underline{\text {sequence}_2\ldots } \end{aligned}$$

(where \(*\) indicates the origin) and set \(S_k = \bigcup _{j=0}^{k-1} \sigma _A^j(X_k)\). Then

$$\begin{aligned} \nu _k := \frac{1}{\# S_k} \sum _{x \in S_k} \delta _x \rightarrow \nu _P\quad \text {as} \ k\rightarrow \infty . \end{aligned}$$

Notice that Lemma 5.3 immediately implies (55) in the case when \(\gamma _1\) and \(\gamma _2\) are pieces of unstable and stable manifolds, respectively. Indeed, we can assume without loss of generality that \(\gamma _1, \gamma _2\) do not contain any periodic points (otherwise we deform them slightly). Let \(H : \Sigma _A \rightarrow \Lambda \) be the conjugacy between \(\sigma _A\) and \(f|_{\Lambda }\). Then \(H_* (\nu _P) = \mu _{\mathrm {max}}\), and if \(\phi : \Lambda \rightarrow {\mathbb {R}}\) is a continuous function, then

$$\begin{aligned} \int \phi \, d\left( H_*(\nu _k)\right) = \frac{1}{\# S_k} \sum _{x \in S_k} \phi (H(x)) \rightarrow \int \phi \, d\mu _{max}, \end{aligned}$$

and hence for \(\phi (x) = \log |Df_x(\bar{v}_x)|\), where \(\bar{v}_x\) is a unit vector tangent to a leaf of the unstable foliation at the point x, we have

$$\begin{aligned} \int \phi \, d\left( H_*(\nu _k)\right)= & {} \frac{1}{kN_k} \log \left( \prod _{\{l_i \in \gamma _1 : f^k(l_i) \in \gamma _2\}} \left| Df^k(\overline{v_i}) \right| \right) \\\rightarrow & {} \int \phi \, d\mu _{max} \\= & {} {\mathrm {Lyap}}^u(\mu _{max}) \end{aligned}$$

as \(k\rightarrow \infty \), and therefore (55) holds.

Proof of Lemma 5.3

First of all, let us recall the construction of the Parry measure \(\nu _P\). Due to the Perron–Frobenius Theorem, the matrix \(A = (A_{ij})\) has only one eigenvector \(\bar{v} = (v_1, \ldots , v_N)\) with positive entries, up to multiplication by a positive number. The eigenvalue \(\lambda > 1\) that corresponds to \(\bar{v}\) is larger than the absolute value of any other eigenvalue of A. Denote by \(\bar{u}=(u_1, \ldots , u_N)\) the eigenvector of the transposed matrix \(A^T\) that corresponds to the eigenvalue \(\lambda \). Without loss of generality we can normalize \(\bar{v}\) and \(\bar{u}\) in such a way that \(v_1 u_1 + v_2 u_2 + \cdots +v_N u_N = 1\). The Parry measure is the Markov measure with the stationary probability vector \(\bar{p} = (p_1, \ldots , p_N)\), \(p_i = v_iu_i\), and the transition matrix \((p_{ij})\), \(p_{ij} = \frac{A_{ij} v_j}{\lambda v_i}\). An equivalent way to introduce the Parry measure is to define it on a cylinder \(C = \{\omega \in \Sigma _A : \omega _0 = i_0, \ldots , \omega _n = i_n\}\) by

$$\begin{aligned} \nu _P(C)=\left\{ \begin{array}{ll} 0 &{} \quad \hbox {if } i_0\cdots i_n \hbox { is not an admissible sequence;} \\ \frac{u_{i_0}v_{i_n}}{\lambda ^n} &{} \quad \hbox {if } i_0\cdots i_n \hbox { is admissible.} \end{array} \right. \end{aligned}$$
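The equivalence of the two descriptions of the Parry measure can be illustrated on a small example. The sketch below is a minimal numerical check, not part of the argument: it uses power iteration (an assumed, simple way to obtain the Perron data for a primitive matrix), symbols are indexed from 0 rather than 1, and the transition probabilities are taken as \(p_{ij} = A_{ij} v_j / (\lambda v_i)\), with \(\bar{v}\) the right eigenvector.

```python
def perron(A, iters=400):
    """Perron eigenvalue and positive right eigenvector by power iteration.

    Assumes A is a primitive nonnegative matrix; the vector is normalized
    so that its entries sum to one.
    """
    n = len(A)
    v = [1.0 / n] * n
    lam = 1.0
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)              # valid because sum(v) == 1
        v = [x / lam for x in w]
    return lam, v


def parry(A):
    """Return (lam, u, v, p, P) with sum_i u_i v_i = 1."""
    n = len(A)
    lam, v = perron(A)
    At = [[A[j][i] for j in range(n)] for i in range(n)]
    _, u = perron(At)
    scale = sum(u[i] * v[i] for i in range(n))
    u = [x / scale for x in u]
    p = [u[i] * v[i] for i in range(n)]
    P = [[A[i][j] * v[j] / (lam * v[i]) for j in range(n)] for i in range(n)]
    return lam, u, v, p, P


def cylinder_measure(A, word):
    """Parry measure of the cylinder given by an index word (i_0, ..., i_n)."""
    lam, u, v, _, _ = parry(A)
    if any(A[a][b] == 0 for a, b in zip(word, word[1:])):
        return 0.0                # non-admissible word
    return u[word[0]] * v[word[-1]] / lam ** (len(word) - 1)
```

For the golden mean shift `A = [[1, 1], [1, 0]]`, the value `cylinder_measure(A, (0, 1, 0))` should agree with the Markov product `p[0] * P[0][1] * P[1][0]` up to rounding.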

We need to show that for any continuous function \(\phi : \Sigma _A \rightarrow \mathbb {R}\), we have

$$\begin{aligned} \int \phi \, d\nu _k = \frac{1}{\# S_k} \sum _{x \in S_k} \phi (x) \rightarrow \int \phi \, d \nu _P\quad \text {as} \quad k\rightarrow \infty . \end{aligned}$$
(56)

It is enough to establish this convergence for functions of the form \(\phi _C = \chi _C\), where \(C = \{ \omega \in \Sigma _A : \omega _r = i_r, \omega _{r+1} = i_{r+1}, \ldots , \omega _s = {i_s}\}\) for some \(r < s\) and \(i_j \in \{1, 2, \ldots , N\}\) since the linear combinations of these functions are dense in \(C(\Sigma _A)\).

Lemma 5.4

Consider a topological Markov chain \(\sigma _A : \Sigma _A \rightarrow \Sigma _A\) and fix some finite admissible sequence \([i_0, i_1, \ldots , i_t],\) \(i_j \in \{1, \ldots , N\},\) and \(\omega ', \omega '' \in \{1, \ldots , N\}\). For a given \(k \in \mathbb {N},\) consider the collection of all admissible sequences \(\omega _0, \ldots , \omega _k\) of length \(k+1\) such that \(\omega _0 = \omega '\) and \(\omega _k = \omega '',\) and denote by \(I_{[i_0, i_1, \ldots , i_t]}(\omega ', \omega '')\) the number of times the string \([i_0, i_1, \ldots , i_t]\) can be encountered in these sequences (counting different encounters in the same sequence as separate). If we denote \(A^k = (A^{(k)}_{ij}),\) then

$$\begin{aligned} \frac{I_{[i_0, i_1, \ldots , i_t]}(\omega ', \omega '')}{kA_{\omega '\omega ''}^{(k)}} \rightarrow \frac{u_{i_0}v_{i_t}}{\lambda ^t}\quad \text {as} \ k \rightarrow \infty . \end{aligned}$$

Proof

Let us take \(k \gg t\) and represent

$$\begin{aligned} I_{[i_0, i_1, \ldots , i_t]}(\omega ', \omega '') = I_{[i_0, i_1, \ldots , i_t]}^{\mathrm {bound}}(\omega ', \omega '')+I_{[i_0, i_1, \ldots , i_t]}^{{\mathrm {int}}}(\omega ', \omega ''), \end{aligned}$$

where \(I_{[i_0, i_1, \ldots , i_t]}^{\mathrm {bound}}(\omega ', \omega '')\) is the number of encounters of \([i_0, i_1, \ldots , i_t]\) starting in the beginning or in the tail part of length \([\ln k]\) of the sequences \(\omega _0, \ldots , \omega _k\), and \(I_{[i_0, i_1, \ldots , i_t]}^{{\mathrm {int}}}(\omega ', \omega '')\) is the number of encounters of \([i_0, i_1, \ldots , i_t]\) starting in the middle part (of length \(k-2[\ln k]\)) of these sequences. A rough estimate on \(I_{[i_0, i_1, \ldots , i_t]}^{\mathrm {bound}}(\omega ', \omega '')\) gives

$$\begin{aligned} I_{[i_0, i_1, \ldots , i_t]}^{\mathrm {bound}}(\omega ', \omega '') \le C k^{\ln N} \ln k, \end{aligned}$$

and hence in

$$\begin{aligned} \frac{I_{[i_0, i_1, \ldots , i_t]}(\omega ', \omega '')}{kA_{\omega '\omega ''}^{(k)}} = \frac{I_{[i_0, i_1, \ldots , i_t]}^{\mathrm {bound}}(\omega ', \omega '')}{kA_{\omega '\omega ''}^{(k)}} + \frac{I_{[i_0, i_1, \ldots , i_t]}^{{\mathrm {int}}}(\omega ', \omega '')}{kA_{\omega '\omega ''}^{(k)}}, \end{aligned}$$

we have \(\frac{I_{[i_0, i_1, \ldots , i_t]}^{\mathrm {bound}}(\omega ', \omega '')}{kA_{\omega '\omega ''}^{(k)}}\rightarrow 0\) as \(k\rightarrow \infty \).

For a given l between \([\ln k]\) and \(k-[\ln k]\), denote by \(I^l\) the number of admissible sequences \(\omega _0, \ldots , \omega _k\) such that \([\omega _l\omega _{l+1}\cdots \omega _{l+t}]=[i_0i_1\cdots i_t]\). We have

$$\begin{aligned} \frac{I^l}{A^{(k)}_{\omega '\omega ''}}= & {} \sum _{\mathop {[\omega _l\omega _{l+1}\cdots \omega _{l+t}]=[i_0i_1\cdots i_t]}\limits ^{\omega _0 = \omega ', \omega _k = \omega '',}} \frac{A_{\omega _0\omega _1} A_{\omega _1\omega _2} \cdots A_{\omega _{k-1}\omega _k}}{A^{(k)}_{\omega '\omega ''}} \\= & {} (A_{i_0i_1}A_{i_1i_2}\cdots A_{i_{t-1}i_t})\frac{A^{(l)}_{\omega 'i_0}A^{(k-l-t)}_{i_t\omega ''}}{A^{(k)}_{\omega '\omega ''}}. \end{aligned}$$

Notice that \(A_{i_0 i_1} A_{i_1 i_2} \cdots A_{i_{t-1} i_t} = 1\) since \(i_0 i_1 \cdots i_t\) is an admissible sequence. We also know that \(\lim _{k \rightarrow \infty } A^{(k)}_{i j} \lambda ^{-k} = u_j v_i\) (see, e.g., [103, Theorem 0.17]). Since there are only finitely many pairs \((i, j)\), the limit here is uniform in \((i, j)\), and therefore we have

$$\begin{aligned} \frac{I^l}{A^{(k)}_{\omega ' \omega ''}} = \frac{\left( A^{(l)}_{\omega 'i_0} \lambda ^{-l} \right) \left( A^{(k-l-t)}_{i_t \omega ''} \lambda ^{-(k-l-t)} \right) }{A^{(k)}_{\omega ' \omega ''} \lambda ^{-k} \lambda ^t} \approx \frac{u_{i_0} v_{\omega '} \cdot u_{\omega ''} v_{i_t}}{u_{\omega ''} v_{\omega '} \lambda ^t} = \frac{u_{i_0} v_{i_t}}{\lambda ^t}, \end{aligned}$$

uniformly for large k, and hence

$$\begin{aligned} \frac{I_{[i_0, i_1, \ldots , i_t]}^{{\mathrm {int}}}(\omega ', \omega '')}{k A_{\omega '\omega ''}^{(k)}} = \frac{1}{k} \sum _{l = [\ln k]}^{k - [\ln k]} \frac{I^l}{A_{\omega ' \omega ''}^{(k)}} \rightarrow \frac{u_{i_0} v_{i_t}}{\lambda ^t}\quad \text {as}\quad k \rightarrow \infty . \end{aligned}$$

This proves Lemma 5.4. \(\square \)
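The key counting identity in the proof above, namely that the number \(I^l\) of admissible sequences from \(\omega '\) to \(\omega ''\) carrying the word \([i_0 \cdots i_t]\) at position l equals \(A^{(l)}_{\omega ' i_0} A^{(k-l-t)}_{i_t \omega ''}\), can be checked by brute force for small k. The sketch below does this; the function names are hypothetical and symbols are indexed from 0.

```python
from itertools import product


def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(A, k):
    """Integer matrix power A^k, with A^0 the identity."""
    n = len(A)
    M = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        M = mat_mul(M, A)
    return M


def count_brute(A, k, w1, w2, word, l):
    """Count admissible (omega_0..omega_k) from w1 to w2 with `word` at position l."""
    n, t = len(A), len(word) - 1
    total = 0
    for seq in product(range(n), repeat=k + 1):
        if seq[0] != w1 or seq[k] != w2:
            continue
        if any(A[a][b] == 0 for a, b in zip(seq, seq[1:])):
            continue
        if seq[l:l + t + 1] == tuple(word):
            total += 1
    return total


def count_formula(A, k, w1, w2, word, l):
    """Same count via matrix powers; `word` is assumed to be admissible."""
    t = len(word) - 1
    return mat_pow(A, l)[w1][word[0]] * mat_pow(A, k - l - t)[word[-1]][w2]
```

The enumeration is exponential in k, so this is only meant for very small examples such as the golden mean shift with k around 10.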

Notice that Lemma 5.4 implies (56) for the function \(\phi _C\). Indeed, if \(\ln k\gg \max (|s|, |r|)\), then

$$\begin{aligned} I_{[i_r, i_{r+1}, \ldots , i_s]}^{{\mathrm {int}}}(\omega ', \omega '') \le \sum _{x \in S_k} \phi _C(x) \le I_{[i_r, i_{r+1}, \ldots , i_s]}^{{\mathrm {int}}}(\omega ', \omega '') + C k^{\ln N} \ln k, \end{aligned}$$

and (56) follows since \(\# S_k = k A_{\omega '\omega ''}^{(k)}\) by the assumption that at least one of the one-sided sequences (\(\underline{\ldots \text {sequence}_1}\,\, \omega '\), \(\omega ''\, \underline{\text {sequence}_2\ldots }\)) is not eventually periodic. This proves Lemma 5.3. \(\square \)

This concludes the proof of Lemma 5.2. \(\square \)

Now (54) follows directly from Lemma 5.2, and this proves Proposition 5.1. \(\square \)

For \(\lambda \) sufficiently large, the modulus of \(x_k'(E_k^{(i)})\) may be estimated with the help of [30, Lemmas 5 & 6]. Namely, if m denotes the number of spectra \(\sigma _j\), \(1 \le j \le k-1\), \(E_k^{(i)}\) belongs to, then

$$\begin{aligned} S_l(\lambda )^m \le |x_k'(E_k^{(i)})| \le S_u(\lambda )^m, \end{aligned}$$
(57)

where

$$\begin{aligned} S_l(\lambda ) = \frac{1}{2} \left( (\lambda - 4) + \sqrt{(\lambda - 4)^2 - 12} \right) \quad \text {and} \quad S_u(\lambda ) = 2\lambda + 22.\quad \end{aligned}$$
(58)

Here, the first inequality in (57) requires \(\lambda \ge 8\) and the second requires \(\lambda > 4\).

Through the end of this section let us assume that \(\lambda > 4\). In this case the Fricke–Vogt invariant implies that

$$\begin{aligned} \sigma _k \cap \sigma _{k+1} \cap \sigma _{k+2} = \emptyset . \end{aligned}$$
(59)

The identity (59) is the basis for work done by Raymond [88]. Following [68], we call a band \(I_k \subset \sigma _k\) a “type A band” if \(I_k \subset \sigma _{k-1}\) (and hence \(I_k \cap (\sigma _{k+1} \cup \sigma _{k-2}) = \emptyset \)). We call a band \(I_k \subset \sigma _k\) a “type B band” if \(I_k \subset \sigma _{k-2}\) (and therefore \(I_k \cap \sigma _{k-1} = \emptyset \)). Then we have the following result (Lemma 5.3 of [68], essentially Lemma 6.1 of [88]).

Lemma 5.5

For every \(\lambda > 4\) and every \(k \ge 1,\)

(a) Every type A band \(I_k \subset \sigma _k\) contains exactly one type B band \(I_{k+2} \subset \sigma _{k+2},\) and no other bands from \(\sigma _{k+1},\) \(\sigma _{k+2}\).

(b) Every type B band \(I_k \subset \sigma _k\) contains exactly one type A band \(I_{k+1} \subset \sigma _{k+1}\) and two type B bands from \(\sigma _{k+2},\) positioned around \(I_{k+1}\).

We denote by \(a_k\) the number of bands of type A in \(\sigma _k\) and by \(b_k\) the number of bands of type B in \(\sigma _k\). By Raymond’s work, it follows immediately that \(a_k + b_k = F_k\) for every k. In fact, we have the following result, which follows from Lemma 5.5 by an easy induction.

Lemma 5.6

The constants \(\{a_k\}\) and \(\{b_k\}\) obey the relations

$$\begin{aligned} a_k = b_{k-1}, \quad b_k = a_{k-2} + 2b_{k-2} \end{aligned}$$
(60)

with initial values \(a_0 = 1,\) \(a_1 = 0,\) \(b_0 = 0,\) and \(b_1 = 1\). Consequently,  for \(k \ge 2,\)

$$\begin{aligned} a_k = b_{k-1} = F_{k-2}. \end{aligned}$$
(61)
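The recursion (60) and its consequence (61) are easy to confirm by direct computation. In the sketch below the Fibonacci numbers are normalized by \(F_0 = F_1 = 1\); this convention is an assumption inferred from the requirement \(a_k + b_k = F_k\) together with the stated initial values, and the function names are hypothetical.

```python
def fib(k):
    """F_k with F_0 = F_1 = 1 (assumed normalization)."""
    x, y = 1, 1
    for _ in range(k):
        x, y = y, x + y
    return x


def band_counts(kmax):
    """Lists a[k], b[k] for 0 <= k <= kmax from the recursion (60)."""
    a = [1, 0]
    b = [0, 1]
    for k in range(2, kmax + 1):
        a.append(b[k - 1])
        b.append(a[k - 2] + 2 * b[k - 2])
    return a, b


if __name__ == "__main__":
    a, b = band_counts(12)
    for k in range(2, 13):
        # Columns: k, a_k, b_k, F_k
        print(k, a[k], b[k], fib(k))
```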

Let us also denote by \(a_{k,m}\) the number of bands b of type A in \(\sigma _k\) with \(\# \{ 0\le j < k : b \cap \sigma _j \not = \emptyset \} = m\) and by \(b_{k,m}\) the number of bands b of type B in \(\sigma _k\) with \(\# \{ 0 \le j < k : b \cap \sigma _j \not = \emptyset \} = m\). Then, [30, Lemma 4] reads as follows:

Lemma 5.7

We have

$$\begin{aligned} a_{k,m} = b_{k-1,m-1}, \quad b_{k,m} = a_{k-2,m-1} + 2b_{k-2,m-1} \end{aligned}$$
(62)

with initial values \(a_{0,m} = 0\) for \(m > 0,\) \(a_{0,0} = 1,\) \(a_{1,m} = 0\) for \(m \ge 0,\) \(b_{0,m} = 0\) for \(m \ge 0,\) \(b_{1,m} = 0\) for \(m > 0,\) and \(b_{1,0} = 1\). Consequently, 

$$\begin{aligned} a_{k,m} = b_{k-1,m-1} = {\left\{ \begin{array}{ll} 2^{2k - 3m - 1} \frac{m}{k-m} { k - m \atopwithdelims ()2m - k } &{}\quad \text {when } \lceil \tfrac{k}{2} \rceil \le m \le \lfloor \tfrac{2k}{3} \rfloor ; \\ 0 &{}\quad \text {otherwise.}\quad \end{array}\right. }\qquad \end{aligned}$$
(63)

In fact, for our purposes here the recursion (62) will be sufficient, and we won’t make use of the explicit solution (63). Verifying the recursion (62) using the definition and Lemma 5.5 is straightforward.
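Although the explicit solution (63) is not needed below, it can be compared against the recursion (62) as a sanity check. The following sketch builds the tables \(a_{k,m}\), \(b_{k,m}\) from (62) and the stated initial values, and evaluates the closed form with exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def refined_counts(kmax):
    """Tables a[k][m], b[k][m] (dicts keyed by m) from the recursion (62)."""
    a = [{0: 1}, {}]      # a_{0,0} = 1, a_{1,m} = 0
    b = [{}, {0: 1}]      # b_{0,m} = 0, b_{1,0} = 1
    for k in range(2, kmax + 1):
        a.append({m + 1: c for m, c in b[k - 1].items()})
        bk = {}
        for m, c in a[k - 2].items():
            bk[m + 1] = bk.get(m + 1, 0) + c
        for m, c in b[k - 2].items():
            bk[m + 1] = bk.get(m + 1, 0) + 2 * c
        b.append(bk)
    return a, b


def closed_form_a(k, m):
    """Right-hand side of (63) for k >= 1, as an exact fraction."""
    if not ((k + 1) // 2 <= m <= (2 * k) // 3):
        return Fraction(0)
    return (Fraction(2) ** (2 * k - 3 * m - 1)
            * Fraction(m, k - m)
            * comb(k - m, 2 * m - k))


if __name__ == "__main__":
    a, _ = refined_counts(10)
    for k in range(2, 11):
        # Nonzero a_{k,m} from the recursion, next to the closed form.
        print(k, sorted(a[k].items()),
              [(m, closed_form_a(k, m)) for m in sorted(a[k])])
```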

Set

$$\begin{aligned} A_k = \sum _m m a_{k,m}, \quad B_k = \sum _m m b_{k,m}, \quad \text {and} \quad C_k = A_k + B_k. \end{aligned}$$

Lemma 5.8

We have

$$\begin{aligned} A_k= & {} B_{k-1} + F_{k-2}, \end{aligned}$$
(64)
$$\begin{aligned} B_k= & {} A_{k-2} + 2 B_{k-2} + F_{k-1}, \end{aligned}$$
(65)
$$\begin{aligned} C_k= & {} C_{k-1} + C_{k-2} + 2F_{k-2}. \end{aligned}$$
(66)

Proof

We have

$$\begin{aligned} A_k= & {} \sum _m m a_{k,m} \\= & {} \sum _m m b_{k-1,m-1} \\= & {} \sum _m (m - 1 + 1) b_{k-1,m-1} \\= & {} B_{k-1} + b_{k-1} \\= & {} B_{k-1} + F_{k-2}. \end{aligned}$$

Here we used (62) in the second step, (61) in the fifth step, and the definitions in the other steps. This establishes (64).

Similarly, we have

$$\begin{aligned} B_k= & {} \sum _m m b_{k,m} \\= & {} \sum _m m (a_{k-2,m-1} + 2b_{k-2,m-1}) \\= & {} \sum _m (m - 1 + 1) a_{k-2,m-1} + 2 \sum _m (m - 1 + 1) b_{k-2,m-1} \\= & {} A_{k-2} + a_{k-2} + 2 B_{k-2} + 2 b_{k-2} \\= & {} A_{k-2} + F_{k-4} + 2 B_{k-2} + 2 F_{k-3} \\= & {} A_{k-2} + 2 B_{k-2} + F_{k-1}. \end{aligned}$$

Here we used (62) in the second step, (61) in the fifth step, the Fibonacci number recursion twice in the sixth step, and the definitions in the other steps. This establishes (65).

Finally, we have

$$\begin{aligned} C_k= & {} A_k + B_k \\= & {} B_{k-1} + F_{k-2} + A_{k-2} + 2 B_{k-2} + F_{k-1} \\= & {} B_{k-1} + C_{k-2} + B_{k-2} + F_{k} \\= & {} C_{k-1} - A_{k-1} + C_{k-2} + B_{k-2} + F_{k} \\= & {} C_{k-1} - F_{k-3} + C_{k-2} + F_{k} \\= & {} C_{k-1} + C_{k-2} + 2F_{k-2}. \end{aligned}$$

Here we used (64) and (65) in the second step, the definition and the Fibonacci number recursion in the third step, the definition in the fourth step, (64) in the fifth step, and the Fibonacci number recursion twice in the sixth step. This establishes (66). \(\square \)
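The identities (64)–(66) can also be confirmed by computing \(A_k\), \(B_k\), and \(C_k\) directly from their definitions. The sketch below does so using the recursion (62); it assumes the normalization \(F_0 = F_1 = 1\) (inferred from \(a_k + b_k = F_k\) and the initial values) and uses hypothetical helper names.

```python
def fib(k):
    """F_k with F_0 = F_1 = 1 (assumed normalization)."""
    x, y = 1, 1
    for _ in range(k):
        x, y = y, x + y
    return x


def refined_counts(kmax):
    """Tables a[k][m], b[k][m] (dicts keyed by m) from the recursion (62)."""
    a = [{0: 1}, {}]
    b = [{}, {0: 1}]
    for k in range(2, kmax + 1):
        a.append({m + 1: c for m, c in b[k - 1].items()})
        bk = {}
        for m, c in a[k - 2].items():
            bk[m + 1] = bk.get(m + 1, 0) + c
        for m, c in b[k - 2].items():
            bk[m + 1] = bk.get(m + 1, 0) + 2 * c
        b.append(bk)
    return a, b


def weighted_sums(kmax):
    """Lists A_k = sum_m m a_{k,m}, B_k = sum_m m b_{k,m}, C_k = A_k + B_k."""
    a, b = refined_counts(kmax)
    A = [sum(m * c for m, c in row.items()) for row in a]
    B = [sum(m * c for m, c in row.items()) for row in b]
    C = [x + y for x, y in zip(A, B)]
    return A, B, C


if __name__ == "__main__":
    A, B, C = weighted_sums(10)
    for k in range(4, 11):
        # Each line prints the two sides of (66).
        print(k, C[k], C[k - 1] + C[k - 2] + 2 * fib(k - 2))
```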

Proposition 5.9

We have

$$\begin{aligned} \lim _{k \rightarrow \infty } \frac{C_k}{k F_k} = \frac{4}{5 + \sqrt{5}}. \end{aligned}$$
(67)

In particular, 

$$\begin{aligned} \frac{\log \varphi }{\lim _{k \rightarrow \infty } \frac{C_k}{k F_k}} = \frac{5 + \sqrt{5}}{4} \log \varphi \approx 1.80902 \log \varphi . \end{aligned}$$
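Before turning to the proof, the limit (67) can be observed numerically by iterating the recursion (66). The sketch below assumes the normalization \(F_0 = F_1 = 1\) and the starting values \(C_0 = C_1 = 0\), which follow from the initial values in Lemma 5.7 since only \(m = 0\) contributes there; the function names are hypothetical. Convergence is slow, of order \(1/k\), so this is an illustration and not a substitute for the argument.

```python
import math

BETA = 4 / (5 + math.sqrt(5))  # claimed limit of C_k / (k F_k)


def fib_list(kmax):
    """[F_0, ..., F_kmax] with F_0 = F_1 = 1 (assumed normalization)."""
    F = [1, 1]
    for k in range(2, kmax + 1):
        F.append(F[k - 1] + F[k - 2])
    return F


def c_list(kmax):
    """[C_0, ..., C_kmax] from the recursion (66), starting at C_0 = C_1 = 0."""
    F = fib_list(kmax)
    C = [0, 0]
    for k in range(2, kmax + 1):
        C.append(C[k - 1] + C[k - 2] + 2 * F[k - 2])
    return C


def ratio(k):
    """C_k / (k F_k), computed from exact integers."""
    return c_list(k)[k] / (k * fib_list(k)[k])


if __name__ == "__main__":
    for k in (10, 100, 1000):
        print(k, ratio(k), BETA)
    # Reciprocal of the limit, i.e. the factor multiplying log(phi).
    print((5 + math.sqrt(5)) / 4)
```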

Proof

Set

$$\begin{aligned} \beta = \frac{4}{5 + \sqrt{5}} = \frac{2}{\varphi + 2} \end{aligned}$$

and \(R_k = C_k - \beta k F_k\). Then,

$$\begin{aligned} R_k \!-\! R_{k-1} \!-\! R_{k-2}= & {} C_k \!-\! C_{k-1} \!-\! C_{k-2} \!-\! \beta k F_k \!+\! \beta (k\!-\!1) F_{k-1} \!+\! \beta (k\!-\!2) F_{k-2} \\= & {} 2F_{k-2} - \beta F_{k-1} - 2 \beta F_{k-2} \\= & {} 2F_{k-2} \left( 1 - \frac{\beta }{2} \frac{F_{k-1}}{F_{k-2}} - \beta \right) \\= & {} 2F_{k-2} \left( 1 - \frac{\beta }{2} \left( \varphi + \left( \frac{F_{k-1}}{F_{k-2}} - \varphi \right) \right) - \beta \right) \\= & {} 2F_{k-2} \frac{\beta }{2} \left( \varphi - \frac{F_{k-1}}{F_{k-2}} \right) \\= & {} \beta \left( F_{k-2} \varphi - F_{k-1} \right) \end{aligned}$$

For the fifth step, note that

$$\begin{aligned} 1 - \frac{1}{\varphi + 2} \varphi - \frac{2}{\varphi + 2} = 0. \end{aligned}$$

By a standard estimate from the theory of continued fractions, this shows that

$$\begin{aligned} |R_k - R_{k-1} - R_{k-2}| < \frac{\beta }{F_{k-1}}. \end{aligned}$$
(68)

Set \(C = \max \{ |R_1|, |R_2| \}\) and apply (68) repeatedly to obtain

$$\begin{aligned} |R_1|\le & {} C \\ |R_2|\le & {} C \\ |R_3|< & {} 2C + \frac{\beta }{F_{2}} \\ |R_4|< & {} 3C + \frac{\beta }{F_{2}} + \frac{\beta }{F_{3}} \\ |R_5|< & {} 5C + 2\frac{\beta }{F_{2}} + \frac{\beta }{F_{3}} + \frac{\beta }{F_{4}} \\ |R_6|< & {} 8C + 3\frac{\beta }{F_{2}} + 2 \frac{\beta }{F_{3}} + \frac{\beta }{F_{4}} + \frac{\beta }{F_{5}} \\&\; \; \vdots \\ |R_k|< & {} F_{k-1} C + F_{k-2} \frac{\beta }{F_{2}} + F_{k-3} \frac{\beta }{F_{3}} + F_{k-4} \frac{\beta }{F_{4}} + \cdots + F_0 \frac{\beta }{F_{k}} \\= & {} F_k \left( \frac{F_{k-1}}{F_k} C + \frac{F_{k-2}}{F_k} \frac{\beta }{F_{2}} + \frac{F_{k-3}}{F_k} \frac{\beta }{F_{3}} + \frac{F_{k-4}}{F_k} \frac{\beta }{F_{4}} + \cdots + \frac{F_0}{F_k} \frac{\beta }{F_{k}} \right) . \end{aligned}$$

This implies \(|R_k| = O(F_k)\), and in particular

$$\begin{aligned} \lim _{k \rightarrow \infty } \frac{R_k}{k F_k} = 0. \end{aligned}$$

In view of \(R_k = C_k - \beta k F_k\), this establishes (67) and concludes the proof of the proposition. \(\square \)
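As a numerical illustration of (67), one can iterate the recursion (66) with exact integer arithmetic and compare \(C_k/(kF_k)\) with \(\beta \). The sketch below assumes hypothetical seeds for \(C_k\) and \(F_k\) (they are not taken from the paper); by the argument above, the limit does not depend on them, and the error decays only like \(1/k\).

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2
BETA = 4 / (5 + sqrt(5))  # the limit claimed in (67)


def ratio_C_over_kF(n, c0=0, c1=1, f0=1, f1=2):
    """Return C_n / (n F_n) for n >= 2, with C_k generated by (66).

    All four seeds are assumptions chosen for illustration.
    """
    f_prev2, f_prev1 = f0, f1  # F_{k-2}, F_{k-1}
    c_prev2, c_prev1 = c0, c1  # C_{k-2}, C_{k-1}
    for _ in range(2, n + 1):
        f_k = f_prev1 + f_prev2
        c_k = c_prev1 + c_prev2 + 2 * f_prev2  # recursion (66)
        f_prev2, f_prev1 = f_prev1, f_k
        c_prev2, c_prev1 = c_prev1, c_k
    # Python integers are exact, so only this final division is rounded.
    return c_prev1 / (n * f_prev1)


def error_at(n):
    """Distance between C_n / (n F_n) and beta."""
    return abs(ratio_C_over_kF(n) - BETA)
```

Evaluating `error_at(n)` for increasing `n` shows the slow convergence that is consistent with \(|R_k| = O(F_k)\).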

We are now in a position to prove (19). This result will be an easy consequence of Proposition 5.1, the estimates (57), and Proposition 5.9.

Proof of (19)

By (53), we have

$$\begin{aligned} \dim _H \nu _{\lambda } = \frac{\log \varphi }{\lim _{k \rightarrow \infty } \frac{1}{kF_k} \log \left( \prod _{i = 1}^{F_k} \left| x_k'(E_k^{(i)}) \right| \right) } \end{aligned}$$

for \(\lambda \ge 16\). By (57), we have

$$\begin{aligned} S_l(\lambda )^{m(E_k^{(i)})} \le |x_k'(E_k^{(i)})| \le S_u(\lambda )^{m(E_k^{(i)})}, \end{aligned}$$

where \(m(E_k^{(i)})\) denotes the number of spectra \(\sigma _j\), \(1 \le j \le k-1\), to which \(E_k^{(i)}\) belongs, and \(S_l(\lambda )\), \(S_u(\lambda )\) are given in (58). Thus,

$$\begin{aligned} \log \left( \prod _{i = 1}^{F_k} \left| x_k'(E_k^{(i)}) \right| \right) = \sum _{i = 1}^{F_k} \log \left| x_k'(E_k^{(i)}) \right| = \sum _{m = \lceil \frac{k}{2} \rceil }^{\lfloor \frac{2k}{3} \rfloor } \sum _{m(E_k^{(i)}) = m} \log \left| x_k'(E_k^{(i)}) \right| , \end{aligned}$$

and hence

$$\begin{aligned} \lim _{k \rightarrow \infty } \frac{1}{kF_k} \log \left( \prod _{i = 1}^{F_k} \left| x_k'(E_k^{(i)}) \right| \right) \in \left[ \frac{4}{5 + \sqrt{5}} \log S_l(\lambda ), \frac{4}{5 + \sqrt{5}} \log S_u(\lambda ) \right] \end{aligned}$$

by (67) in Proposition 5.9. We obtain

$$\begin{aligned} \lim _{\lambda \rightarrow \infty } \dim _H \nu _{\lambda } \cdot \log \lambda= & {} \lim _{\lambda \rightarrow \infty } \frac{\log \varphi \cdot \log \lambda }{\lim _{k \rightarrow \infty } \frac{1}{kF_k} \log \left( \prod _{i = 1}^{F_k} \left| x_k'(E_k^{(i)}) \right| \right) }\\= & {} \frac{5 + \sqrt{5}}{4} \log \varphi , \end{aligned}$$

which concludes the proof. \(\square \)

6 The optimal Hölder exponent

In this section we provide an explicit expression for the optimal Hölder exponent of the integrated density of states for the Fibonacci Hamiltonian.

Theorem 6.1

Let \(T : M^2 \rightarrow M^2\) be a \(C^{1+\alpha }\)-diffeomorphism with a (topologically) zero-dimensional basic set \(\Lambda ,\) and \(\mu _{max}\) be the measure of maximal entropy for \(T|_{\Lambda }\). Let \(L \subset M^2\) be a smooth curve transversal to \(W^s(\Lambda )\) with parametrization \(L : \mathbb {R} \rightarrow M^2\) such that \(L \cap W^s(\Lambda )\) is compact. Let R be an element of a Markov partition for \(\Lambda ,\) and let \(\pi : \Lambda \cap R \rightarrow L\) be a continuous projection along the stable manifolds. Set \(\nu = L^{-1} \circ \pi (\mu _{max}|_R),\) and denote by \(\gamma \) the optimal Hölder exponent of \(\nu \). Then,

$$\begin{aligned} \gamma = \frac{h_{{\mathrm {top}}} (T|_{\Lambda })}{\sup _{p \in Per(T|_{\Lambda })} {\mathrm {Lyap}}^u(p)}. \end{aligned}$$

In other words, we have the following:

  1. For any \(\gamma _0 < \gamma \) and any sufficiently small interval \(I \subset \mathbb {R},\) we have \(\nu (I) < |I|^{\gamma _0};\)

  2. For any \(\gamma _1 > \gamma \) and any \(\varepsilon > 0,\) there exists an interval \(I \subset \mathbb {R}\) such that \(|I| < \varepsilon \) and \(\nu (I) > |I|^{\gamma _1}\).

Proof

Fix any \(\gamma _0 \in (0,\gamma )\) and suppose that \(I = [E_0,E_1] \subset {\mathbb {R}}\) is sufficiently small (we will determine the appropriate smallness condition later). Without loss of generality we can assume that \(L(E_0), L(E_1) \in W^s(\Lambda )\) (otherwise we can decrease the size of I without changing its measure).

Consider the rectangle \(R_I = \pi ^{-1}(L(I)) \subset R\). Then, \(\mu _{\mathrm {max}}(R_I) = \nu (I)\). Let \(N \in {\mathbb {Z}}_+\) be the smallest value such that \(T^N(R_I) \cap \Lambda \) is not a subset of just one element of the Markov partition (and hence has size of order one). We claim that

$$\begin{aligned} C^{-1} \le \left| \frac{\mu _{\mathrm {max}}(R_I)}{e^{-N h_{\mathrm {top}}(T|_\Lambda )}} \right| \le C \end{aligned}$$

with C uniform for all sufficiently small I. Indeed, consider the topological Markov chain \(\sigma _A : \Sigma _A \rightarrow \Sigma _A\) conjugate to \(T|_\Lambda : \Lambda \rightarrow \Lambda \). Then,

$$\begin{aligned} \mu _{\mathrm {max}}(R_I) = \lim _{M \rightarrow \infty } \frac{\# (\mathrm {Fix} (T^M) \cap R_I)}{\# \mathrm {Fix} (T^M)}. \end{aligned}$$

But for large M and N, we have

$$\begin{aligned} \# \mathrm {Fix} (T^M) = \mathrm {Tr} (A^M)= e^{M h_{\mathrm {top}}(T|_\Lambda )}(1+o(1)), \end{aligned}$$

since the largest eigenvalue of A is equal to \(e^{h_{\mathrm {top}}(T|_{\Lambda })}\). At the same time, the number of periodic orbits of period M with a prescribed initial segment of length \(N < M\) is given by

$$\begin{aligned} \quad \# (\mathrm {Fix} (T^M) \cap R_I) = e^{(M-N) h_{\mathrm {top}}(T|_\Lambda )} \cdot O(1), \end{aligned}$$

where O(1) is bounded from above and away from zero uniformly in all \(1 \ll N \ll M\), and hence \(\mu _{\mathrm {max}}(R_I) = e^{-N h_{\mathrm {top}}(T|_\Lambda )} \cdot O(1)\).

On the other hand, \(|I| = E_1 - E_0\) is of order of the width of \(R_I\). Pick any point \(p \in \Lambda \cap R_I\) and consider \(W^u_\mathrm {loc}(p) \cap R_I\). Since a holonomy map along stable manifolds is \(C^1\), we have

$$\begin{aligned} |I| = | W^u_\mathrm {loc}(p) \cap R_I |\cdot O(1). \end{aligned}$$

The usual distortion argument shows also that

$$\begin{aligned} | W^u_\mathrm {loc}(p) \cap R_I |= \frac{1}{\Vert DT^N(p)|_{E^u}\Vert } \cdot O(1), \end{aligned}$$

and hence for any \(\varepsilon > 0\), \(\varepsilon < \gamma - \gamma _0\),

$$\begin{aligned} \frac{\log \nu (I)}{\log |I|}= & {} \frac{-N h_{\mathrm {top}}(T|_\Lambda )+O(1)}{-\log \Vert DT^N(p)\Vert +O(1)} \\> & {} \frac{h_{\mathrm {top}}(T|_\Lambda )}{{\mathrm {Lyap}}^u(p)} -\varepsilon \\\ge & {} \frac{h_{\mathrm {top}}(T|_\Lambda )}{\sup _p {\mathrm {Lyap}}^u(p)} - \varepsilon \\= & {} \gamma - \varepsilon >\gamma _0 \end{aligned}$$

if N is large enough (which can be guaranteed by choosing sufficiently small |I|). Therefore \(\nu (I) < |I|^{\gamma _0}\).

Let us now take an arbitrary \(\gamma _1 > \gamma \). There exists a periodic point \(q \in R\) such that

$$\begin{aligned} \frac{h_{\mathrm {top}}(T|_\Lambda )}{{\mathrm {Lyap}}^u(q)} < \gamma _1. \end{aligned}$$

For a given \(\varepsilon \in (0, \gamma _1 - \gamma )\), consider a narrow rectangle \(R_I \subset R\), \(I \subset {\mathbb {R}}\), \(R_I = \pi ^{-1}(L(I))\), such that \(|I| < \varepsilon \) and \(q \in R_I\). If \(N \in {\mathbb {Z}}_+\) is the smallest number such that \(T^N(R_I)\) is not contained in a single element of the Markov partition, then

$$\begin{aligned} |I| = \frac{O(1)}{\Vert DT^N(q)|_{E^u_q}\Vert } \end{aligned}$$

and

$$\begin{aligned} \nu (I) = \mu _{\mathrm {max}}(R_I) = e^{-N h_{\mathrm {top}}(T|_\Lambda )}\cdot O(1). \end{aligned}$$

Hence,

$$\begin{aligned} \frac{\log \nu (I)}{\log |I|} = \frac{-N h_{\mathrm {top}}(T|_\Lambda )+O(1)}{-N \left( \frac{1}{N} \log \Vert DT^N(q)|_{E^u_q}\Vert \right) +O(1)} \le \frac{h_{\mathrm {top}} (T|_\Lambda )}{{\mathrm {Lyap}}^u(q)} + \varepsilon < \gamma _1, \end{aligned}$$

and therefore \(\nu (I) > |I|^{\gamma _1}\). \(\square \)
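The counting step in the proof rests on the asymptotics \(\# \mathrm {Fix}(T^M) = \mathrm {Tr}(A^M) = e^{M h_{\mathrm {top}}}(1+o(1))\) for the transition matrix A of the conjugate topological Markov chain. The following sketch illustrates this growth for a toy example. The matrix used here (the golden mean shift) is a hypothetical stand-in and is not claimed to be the transition matrix of any Markov partition considered in this paper.

```python
from math import log, sqrt


def mat_mul(X, Y):
    """Multiply two square integer matrices given as lists of rows."""
    n = len(X)
    return [
        [sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def trace_of_power(A, M):
    """Return Tr(A^M), the number of fixed points of the M-th shift iterate."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(M):
        result = mat_mul(result, A)
    return sum(result[i][i] for i in range(n))


# Hypothetical example: 0-1 matrix of the golden mean shift.
GOLDEN_MEAN_SHIFT = [[1, 1], [1, 0]]
# Its largest eigenvalue is the golden ratio, so h_top = log(golden ratio).
H_TOP = log((1 + sqrt(5)) / 2)


def entropy_estimate(A, M):
    """Approximate the topological entropy by (1/M) log Tr(A^M)."""
    return log(trace_of_power(A, M)) / M
```

For this toy matrix, `entropy_estimate(GOLDEN_MEAN_SHIFT, M)` approaches `H_TOP` as `M` grows, which is the mechanism behind the bound on \(\mu _{\mathrm {max}}(R_I)\).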

Proof of (12)

The theorem follows as a special case from Theorem 6.1 since the density of states measure arises from the measure of maximal entropy for the trace map in the way required for Theorem 6.1 to be applicable. This was shown in [34] for small values of the coupling constant \(\lambda \), and due to Theorem 1.5 the same holds for all \(\lambda > 0\). \(\square \)

For small values of the coupling constant, \(\sup _{p\in Per(T|_{\Lambda })}{\mathrm {Lyap}}^u(p)\) is attained at the periodic points (of period 2 and 6) born from the singularities of the Cayley cubic, and therefore it can be calculated explicitly (see, e.g., the proof of [35, Lemma 3.3]). Hence we get the following corollary.

Corollary 6.2

For \(\lambda > 0\) sufficiently small, we have

$$\begin{aligned} \gamma _{\lambda }=\frac{2\log \left( \frac{\sqrt{5}+1}{2}\right) }{\log \left( {{\frac{\sqrt{2} \sqrt{256 I^2+16 \left( 2 A + 3 \sqrt{2} B + \sqrt{2} A B + 35 \right) I + 22 A + 75 \sqrt{2} B + 21 \sqrt{2} A B + 250} + 16 I + A + 2 \sqrt{2} B + \sqrt{2} A B + 23}{2 A + 2 \sqrt{2} B - 2}}}\right) }, \end{aligned}$$

where \(I = \frac{\lambda ^2}{4},\) \(A = A(I)=\sqrt{16 I + 25},\) and \(B = B(I) = \sqrt{8 I - \sqrt{16 I + 25} + 5}\).

These periodic points of period 2 lead to the curve in Fig. 2 that is labeled as “Period two”.
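The closed-form expression in Corollary 6.2 is easy to mistype, so a direct numerical transcription can be useful. The sketch below evaluates the formula as stated. The function name is hypothetical, and the corollary only asserts the formula for sufficiently small \(\lambda > 0\), so values returned for larger \(\lambda \) should not be interpreted as the Hölder exponent.

```python
from math import log, sqrt


def holder_exponent_small_coupling(lam):
    """Evaluate the right-hand side of the formula in Corollary 6.2.

    The corollary covers only sufficiently small lam > 0; lam = 0 is
    included here as the limiting case.
    """
    I = lam**2 / 4
    A = sqrt(16 * I + 25)
    # The radicand is nonnegative for I >= 0; max() only guards against rounding.
    B = sqrt(max(8 * I - A + 5, 0.0))
    s2 = sqrt(2)
    radicand = (
        256 * I**2
        + 16 * (2 * A + 3 * s2 * B + s2 * A * B + 35) * I
        + 22 * A
        + 75 * s2 * B
        + 21 * s2 * A * B
        + 250
    )
    numerator = s2 * sqrt(radicand) + 16 * I + A + 2 * s2 * B + s2 * A * B + 23
    denominator = 2 * A + 2 * s2 * B - 2
    golden_ratio = (sqrt(5) + 1) / 2
    return 2 * log(golden_ratio) / log(numerator / denominator)
```

At \(\lambda = 0\) the argument of the logarithm in the denominator reduces to \((\sqrt{720} + 28)/8 = \varphi ^4\), so the expression evaluates to \(1/2\).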

7 Strict inequalities between spectral characteristics

In this section we prove Theorem 1.8, that is, we establish the strict inequalities in (15). They will be a consequence of the following general result.

Proposition 7.1

Suppose that \(\sigma _A : \Sigma _A \rightarrow \Sigma _A\) is a topological Markov chain defined by a transitive \(0-1\) matrix A (i.e., some power of A has only positive entries), and \(\phi : \Sigma _A \rightarrow {\mathbb {R}}\) is a Hölder continuous function. If \(\phi \) is not cohomologous to a constant (in other words, there are periodic orbits with different values of averages of \(\phi \) over those orbits), then

$$\begin{aligned}&\inf _{p \in Per(\sigma _A)} \left( \frac{1}{\pi (p)} \sum _{i=0}^{\pi (p)-1} \phi (\sigma _A^i(p))\right) = \inf _{\mu \in \mathfrak {M}} \int \phi \, d\mu < \int \phi \, d\mu _{\mathrm {max}}\nonumber \\&\quad < \sup _{\mu \in \mathfrak {M}} \int \phi \, d\mu = \sup _{p \in Per(\sigma _A)} \left( \frac{1}{\pi (p)} \sum _{i=0}^{\pi (p)-1} \phi (\sigma _A^i(p)) \right) , \end{aligned}$$
(69)

where \(\mu _{\mathrm {max}}\) is the measure of maximal entropy, \(\mathfrak {M}\) is the space of all \(\sigma _A\)-invariant Borel probability measures, and \(\pi (p)\) is the period of a periodic point p.

Proof

First of all, due to Sigmund’s Theorem [94], the ergodic measures supported on periodic orbits are (weak-*) dense in \(\mathfrak {M}\), which implies the equalities in (69). In order to show the strict inequalities in (69), we apply Proposition 1.9 in the case where \(\sigma _A : \Sigma _A \rightarrow \Sigma _A\) is conjugate to \(T_{\lambda }|_{\Lambda _{\lambda }}\) and the potential is given by \(\phi = -\log \Vert DT_{\lambda }|_{E^u}\Vert \).

Since by (5) the line \(h_{{\mathrm {top}}} (\sigma _A) + t \int \phi \, d\mu _{\mathrm {max}}\) is tangent to the graph of \(P(t\phi )\) at \((0, h_{{\mathrm {top}}} (\sigma _A))\), the strict convexity, which follows from (4), together with (6) implies Proposition 7.1. \(\square \)

In order to prove Theorem 1.8, we will show that Proposition 7.1 applies to the case at hand. This amounts to proving that there are periodic orbits in \(\Lambda _{\lambda }\) with different values of the averaged unstable multipliers [66, Proposition 20.3.10]. An averaged unstable multiplier of a periodic point p of period n is defined to be the nth root of the largest (in absolute value) eigenvalue of the differential \(DT^n|_p\). Henceforth, we shall write simply multiplier for averaged unstable multiplier.

Proposition 7.2

For every \(\lambda > 0,\) there exist two periodic points (not necessarily of the same period) in \(\Lambda _{\lambda },\) such that their corresponding multipliers are distinct.

This result is known for all \(\lambda > 0\) sufficiently close to zero. Indeed, in that case one computes the multipliers for the period-six periodic point \(p = (0, 0, a)\), with suitable \(a\in {\mathbb {R}}\) such that \(p\in S_0\), and for the fixed point \(q = (1,1,1)\) explicitly. These multipliers are distinct, and a perturbation argument shows that for all \(\lambda > 0\) sufficiently small, there exist a period-two periodic point and a period-six periodic point with different multipliers; see [34] for details.

Proof of Proposition 7.2

In [90] Baake and Roberts calculated a few periodic orbits, among which are two families of periodic points of period two and four, respectively. These are given, respectively, by

$$\begin{aligned} P_a\overset{\mathrm {def}}{=}\left( a, \frac{a}{2a - 1}, a\right) \end{aligned}$$

and

$$\begin{aligned} Q_b \overset{\mathrm {def}}{=}\left( -\frac{1}{2}, b, -\frac{1}{2}\right) . \end{aligned}$$

Here, \(a, b \in {\mathbb {R}}\). On each level surface \(S_{\lambda }\), we can find points from these families. Namely, a and b simply need to be chosen in such a way that

$$\begin{aligned} I(P_a) = I(Q_b) = \frac{\lambda ^2}{4}, \end{aligned}$$
(70)

where I is the Fricke–Vogt invariant. Note that \(I(P_1) = I(Q_1) = 0\) and that \(\lim _{a \rightarrow \infty } I(P_a) = \infty \) and \(\lim _{b \rightarrow \infty } I(Q_b) = \infty \). Thus, by continuity, for each \(\lambda \ge 0\), it is possible to find \(a,b \in [1,\infty )\) so that (70) holds.

We claim that for every \(\lambda \ge 0\), the multipliers of \(P_a\) and \(Q_b\) on \(S_{\lambda }\) are different. The proposition obviously follows from this claim.

Assume that this claim fails. Then there exist \(a,b \in [1,\infty )\) so that \(I(P_a) = I(Q_b) \ge 0\) and the multipliers of \(P_a\) and \(Q_b\) coincide. The identity \(I(P_a) = I(Q_b)\) implies that

$$\begin{aligned} 2a^2 + \frac{a^2}{(2a-1)^2} - \frac{2a^3}{2a-1} = \frac{1}{2} + b^2 - \frac{b}{2}. \end{aligned}$$
(71)

On the other hand, it was shown by Baake and Roberts that the unstable eigenvalue of \(DT^2|_{P_a}\) is a root of the equation

$$\begin{aligned} \mu ^2 - \frac{8a^2 - 2a + 1}{2a-1} \mu + 1 = 0, \end{aligned}$$
(72)

while the unstable eigenvalue of \(DT^4|_{Q_b}\) is a root of the equation

$$\begin{aligned} \mu ^2 - (8(1 - 2b)b + 1) \mu + 1 = 0; \end{aligned}$$
(73)

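see [90, p. 850]. By Vieta’s formulas, the squares of the roots of the equation \(\mu ^2 + v \mu + 1 = 0\) are the roots of the equation \(\mu ^2 - (v^2 - 2) \mu + 1 = 0\). Moreover, for \(a, b \in [1,\infty )\) the roots of (72) are positive, while the roots of (73) are negative, and the multipliers are defined in terms of absolute values. Thus, if the two multipliers in question coincide, it follows from (72) and (73) that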

$$\begin{aligned} \left( \frac{8a^2 - 2a + 1}{2a-1} \right) ^2 - 2 = -8(1-2b)b-1, \end{aligned}$$

or, equivalently,

$$\begin{aligned} \left( \frac{8a^2 - 2a + 1}{2a-1} \right) ^2 - 2 = 16 \left( b^2 - \frac{b}{2} \right) - 1. \end{aligned}$$
(74)

It follows from (71) and (74) that

$$\begin{aligned} \frac{(8a^2 - 2a + 1)^2}{(2a-1)^2} + 7 = 16 \frac{4a^4 - 6a^3 + 3a^2}{(2a-1)^2}, \end{aligned}$$

which in turn implies that

$$\begin{aligned} 8a^3 - 4a + 1 = 0. \end{aligned}$$

Write \(P(a) = 8a^3 - 4a + 1\). The critical points of this cubic polynomial are \(\pm \frac{1}{\sqrt{6}}\), both of which lie to the left of 1. Thus, P is strictly increasing on \([1,\infty )\). Since \(P(1) = 5\), P does not vanish for any \(a \in [1,\infty )\); contradiction. This shows that the two multipliers cannot be equal, and the claim follows. \(\square \)
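The algebra in this proof can be double-checked with exact rational arithmetic. The sketch below makes two assumptions that are not stated in this section: that the trace map is \(T(x,y,z) = (2xy - z, x, y)\) and that the Fricke–Vogt invariant is \(I(x,y,z) = x^2 + y^2 + z^2 - 2xyz - 1\). Under these assumptions it checks the periods of \(P_a\) and \(Q_b\), the values \(I(P_1) = I(Q_1) = 0\), and the elimination step leading to \(8a^3 - 4a + 1 = 0\). All function names are hypothetical.

```python
from fractions import Fraction


def trace_map(p):
    """Assumed trace map T(x, y, z) = (2xy - z, x, y)."""
    x, y, z = p
    return (2 * x * y - z, x, y)


def fricke_vogt(p):
    """Assumed invariant I(x, y, z) = x^2 + y^2 + z^2 - 2xyz - 1."""
    x, y, z = p
    return x * x + y * y + z * z - 2 * x * y * z - 1


def point_P(a):
    """The point P_a = (a, a / (2a - 1), a); pass a Fraction for exactness."""
    return (a, a / (2 * a - 1), a)


def point_Q(b):
    """The point Q_b = (-1/2, b, -1/2)."""
    return (Fraction(-1, 2), b, Fraction(-1, 2))


def iterate(p, n):
    """Apply the trace map n times."""
    for _ in range(n):
        p = trace_map(p)
    return p


def elimination_residual(a):
    """Difference of the two sides of the identity derived from (71) and (74),
    after multiplying through by (2a - 1)^2."""
    lhs = (8 * a**2 - 2 * a + 1) ** 2 + 7 * (2 * a - 1) ** 2
    rhs = 16 * (4 * a**4 - 6 * a**3 + 3 * a**2)
    return lhs - rhs


def cubic(a):
    """The polynomial 8a^3 - 4a + 1 from the end of the proof."""
    return 8 * a**3 - 4 * a + 1
```

With these definitions, `elimination_residual(a)` equals `8 * cubic(a)` for every rational `a`, which is the step from the displayed identity to the cubic equation.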

Remark 7.3

The two families of periodic points of period two and four, respectively, used in the proof of Proposition 7.2 are the ones that lead to the curves in Figs. 1 and 2 which are labeled with period two and four, respectively.

Proof of Theorem 1.8

The strict inequalities \(\gamma _{\lambda } < \dim _H \nu _{\lambda } < \tilde{\alpha }^\pm _u(\lambda )\) follow directly from Theorem 1.6 and Propositions 1.9 and 7.2. The inequalities \(\dim _H \nu _{\lambda } < \dim _H \Sigma _{\lambda } < \tilde{\alpha }^\pm _u(\lambda )\) follow from the strict convexity of the pressure function \(P(t\phi )\) with \(\phi = -\log \Vert DT_{\lambda }|_{E^u}\Vert \), the fact that \(\dim _H \Sigma _{\lambda }\) is the only zero of \(P(t\phi )\) (see [79]), and the expression for \(\tilde{\alpha }^\pm _u(\lambda )\) from Theorem 1.6. \(\quad \square \)

8 Extensions and generalizations

While we have focused up to this point on the classical Fibonacci Hamiltonian, much of what we do extends either partly or fully to other types of operators. We also strongly believe that the results presented here provide insight into, and an opportunity to approach, other more complicated models. In this section we briefly address some of these extensions and generalizations.

  • The Off-Diagonal Model. In the present paper we consider the Fibonacci Hamiltonian in the form (1), which is usually called the diagonal model and which is the one most popular in the mathematics literature. In the physics literature the so-called off-diagonal model is usually considered. The spectral properties of the off-diagonal operator as well as the relation to the dynamics of the Fibonacci trace map are not any different from the diagonal one; see the appendix in [33] for a detailed discussion of the off-diagonal model. All the results presented in this paper for the diagonal model also hold for the off-diagonal one.

  • Potentials Generated by Primitive Invertible Substitutions. Discrete Schrödinger operators with potentials generated by primitive invertible substitutions have spectral properties that are very similar to the spectral properties of the Fibonacci Hamiltonian. For some of the spectral properties this was justified in [53, 80]. We expect that all the qualitative statements (i.e., all the statements mentioned in the introduction except Corollary 1.7 and Theorem 1.10) of this paper can be generalized to this case also. As for the large coupling asymptotics, the calculations can be more complicated, but can likely be carried out for particular potentials (given the results obtained in [72–74, 78]).

  • Sturmian Potentials. Sturmian potentials are natural generalizations of the Fibonacci potential. Namely, one simply replaces the specific value of \(\alpha \) in (1) by a general irrational \(\alpha \in (0,1)\). It is known that the spectrum of a discrete Schrödinger operator with a Sturmian potential is a Cantor set of zero measure [14], but in most cases this Cantor set will not be dynamically defined [72, 73]. Nevertheless, there is a dynamical presentation of the spectrum in this case as well [14, 74, 88], and it would be interesting to see whether the dynamical approach can add something to the recent results in [72–74, 78] that were obtained via the periodic approximation technique.

  • Jacobi Matrices. In general, discrete Schrödinger operators form a particular case of the operators given by Jacobi matrices. In the case where the coefficients of a Jacobi matrix are modulated by the Fibonacci sequence, their spectral properties were studied in [105]. Interestingly enough, the spectrum in this case does not have to be dynamically defined. Nevertheless, the relation to the dynamics of the Fibonacci trace map allows one to give a detailed description of the spectrum in this case, at least in some regimes, and our results can be used to provide a complete description throughout the entire parameter space. For another model with similar dynamical description see [106].

  • CMV Matrices. CMV matrices are the unitary analog of Jacobi matrices. That is, they are canonical models of unitary operators (just as Jacobi matrices are canonical models of self-adjoint operators) and they arise naturally in the study of orthogonal polynomials on the unit circle (while Jacobi matrices arise in the study of orthogonal polynomials on the real line); compare [95, 96]. In addition, CMV matrices have been effectively used to study quantum walks and the Ising model in one dimension; see [20, 41]. Choosing the coefficients defining a CMV matrix according to the Fibonacci sequence one obtains an interesting model that can be studied using the trace map formalism as well. Several results for this model were obtained in [40, 41], both from the perspective of orthogonal polynomials and the perspective of quantum walks and the Ising model. The results and tools developed in the present paper will allow one to take the analysis of the CMV case further.

  • Continuum Models. Continuum Schrödinger operators with Fibonacci-type potentials were considered in [6, 31, 69, 71]. In this case there are many models (depending on the choice of single-site potentials). The trace map description of the spectrum is also available in this case (see [31]), and hence it is reasonable to expect that our results can be used.

  • Higher-Dimensional Separable Models. Understanding the spectral properties of the operators associated with the standard two- and three-dimensional quasicrystal models is a major problem in the field which is currently out of reach. One of the ways to get some insight into the problem is to consider operators with separable potentials; for example, the Square (and Cubic) Fibonacci Hamiltonian and the labyrinth model [49–51, 98, 99]. In these models the spectrum of the higher-dimensional operator is the sum (or product) of the spectra of the one-dimensional ones. Since studying the sum of dynamically defined Cantor sets is a classical problem which has been extensively studied (see, for example, [57, 83] and references therein), we expect that the current results will be instrumental in understanding the spectral properties of separable models. There is recent work on these models that relies on the one-dimensional results in the small and large coupling regimes [33, 37], and the results of this paper will pave the way for a study of separable models that does not rely on the small and large coupling theory.