1 Introduction

This paper is motivated by an attempt to understand the action on the 2-torus \({\mathbb {T}}^2 := {\mathbb {R}}^2/{\mathbb {Z}}^2\) generated by the diffeomorphisms

$$\begin{aligned} g_0(x,y)=(2x+y,x+y),\quad g_1(x,y)=(x+\rho ,y),\quad g_2(x,y)=(x,y+\rho ),\quad \rho \in {\mathbb {R}}. \end{aligned}$$

The map \(g_0\) is a hyperbolic linear automorphism, and \(g_1,g_2\) are translations. They satisfy the group relations

$$\begin{aligned} g_0g_1=g_1^2g_2g_0,\quad g_0g_2=g_1g_2g_0,\quad g_1g_2=g_2g_1, \end{aligned}$$

and no other relations if \(\rho \) is irrational. Broadly stated, our aim is to classify all diffeomorphisms \(g_0,g_1,g_2\) satisfying these relations and no others.
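The three relations can be checked mechanically. The following Python sketch is only an illustration: the helper names (`wrap`, `make_generators`, `compose`, `relations_hold`) are hypothetical and do not come from the paper. It uses exact rational arithmetic with a rational \(\rho \), which suffices here because the relations are identities valid for every \(\rho \); it does not address the absence of further relations, which requires \(\rho \) irrational.

```python
from fractions import Fraction as F

def wrap(p):
    """Reduce a point of R^2 modulo Z^2."""
    return tuple(c % 1 for c in p)

def make_generators(rho):
    """Return the maps g_0, g_1, g_2 on the 2-torus for a given rho."""
    g0 = lambda p: wrap((2 * p[0] + p[1], p[0] + p[1]))
    g1 = lambda p: wrap((p[0] + rho, p[1]))
    g2 = lambda p: wrap((p[0], p[1] + rho))
    return g0, g1, g2

def compose(*maps):
    """compose(f, g, h)(p) = f(g(h(p)))."""
    def composed(p):
        for m in reversed(maps):
            p = m(p)
        return p
    return composed

def relations_hold(rho, points):
    """Check g0 g1 = g1^2 g2 g0, g0 g2 = g1 g2 g0, g1 g2 = g2 g1 at the given points."""
    g0, g1, g2 = make_generators(rho)
    checks = [
        (compose(g0, g1), compose(g1, g1, g2, g0)),
        (compose(g0, g2), compose(g1, g2, g0)),
        (compose(g1, g2), compose(g2, g1)),
    ]
    return all(lhs(p) == rhs(p) for lhs, rhs in checks for p in points)

# Arbitrary sample points and an arbitrary rational rho (illustrative choices).
sample_points = [(F(1, 7), F(3, 11)), (F(5, 13), F(2, 9)), (F(0), F(1, 2))]
print(relations_hold(F(3, 10), sample_points))
```

A check at finitely many points is evidence rather than a proof; the identities themselves follow by direct substitution.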

To place the problem in a more general context, in this paper, we establish rigidity properties of certain solvable group actions on the torus \({\mathbb {T}}^N ={\mathbb {R}}^N/{\mathbb {Z}}^N\), for \(N > 1\). The solvable groups \(\Gamma \) considered here are the finitely presented, torsion-free, abelian-by-cyclic (ABC) groups, which admit a short exact sequence

$$\begin{aligned} 0 \rightarrow {\mathbb {Z}}^d \rightarrow \Gamma \rightarrow {\mathbb {Z}}\rightarrow 0. \end{aligned}$$

All such groups are of the form \(\Gamma = \Gamma _{B}\), where \(B = (b_{ij}) \) is an integer valued, \(d\times d\) matrix with \(\det ( B)\ne 0\), and

$$\begin{aligned}&\Gamma _{ B}= {\mathbb {Z}}\ltimes {\mathbb {Z}}^d = \left\langle g_0,g_1,\ldots ,g_d\ |\ g_0g_i=\left( \prod _{j=1}^d g_j^{b_{ji}}\right) g_0,\nonumber \right. \\&\left. \quad [g_i,g_j]=1,\quad i,j=1,2,\ldots ,d\right\rangle . \end{aligned}$$
(1.1)

ABC groups have been studied intensively in geometric group theory, as they present the first case in the open problem to classify finitely generated, non-nilpotent solvable groups up to quasi-isometry. The classification problem for ABC groups has been solved in [FM1] (the non-polycyclic case, \(|\det B| > 1\)) and [EFW1, EFW2] (the polycyclic case, \(|\det B| = 1\)), where the authors also revealed close connections between the geometry of these groups and dynamics [FM2, EF]. Here we consider actions of polycyclic ABC groups.

A \(C^r\) action \(\alpha \) of a finitely generated group \(\Gamma \) with generators \(g_1,\ldots ,g_k\) on a closed manifold M is a homomorphism \(\alpha :\Gamma \rightarrow \mathrm {Diff}^r(M)\), where \(\mathrm {Diff}^r(M)\) denotes the group of orientation-preserving, \(C^r\) diffeomorphisms of M. The action is determined completely by \(\alpha (g_1),\ldots ,\alpha (g_k)\). The polycyclic ABC groups admit natural affine actions on tori, as follows.

Up to rearranging the standard basis for \({\mathbb {R}}^d\), every matrix \( B\in \mathrm { SL}(d; {\mathbb {Z}})\) can be written in the form

$$\begin{aligned} B = \begin{pmatrix}{\bar{B}} &{}\quad 0\\ 0 &{}\quad I_{d-N} \end{pmatrix}, \end{aligned}$$

for some \(N\le d\), where \({\bar{B}}=({{\bar{b}}}_{ij})\in \mathrm { SL}(N; {\mathbb {Z}})\), and \(I_{d-N}\) is the \((d-N)\times (d-N)\) identity matrix, chosen to be maximal. In this paper, we restrict our attention to the cases where \(d = KN\), for some \(K\ge 1\). Then

$$\begin{aligned} \Gamma _{ B}&= \Gamma _{{\bar{B}},K}:=\displaystyle \left\langle g_0,g_{i,k},\ i=1,\ldots ,N,\ k=1,\ldots ,K\ |\ \ [g_{i,k},g_{j,\ell }]=1,\right. \nonumber \\&\quad \left. g_0g_{i,k}=\left( \prod _{j=1}^N g_{j,k}^{{{\bar{b}}}_{ji}}\right) g_0,\quad i,j=1,\ldots ,N,\ k,\ell =1,\ldots ,K\right\rangle . \end{aligned}$$
(1.2)

Note that \(\Gamma _{{\bar{B}}} = \Gamma _{{\bar{B}}, 1}\).

In the affine actions of \(\Gamma _{{\bar{B}},K}\) we consider, the element \(g_0\) acts on \({\mathbb {T}}^N\) by the automorphism \(x\mapsto {{\bar{A}}}x\,\, (\mathrm {mod }\, \,{\mathbb {Z}}^N)\) induced by \({{\bar{A}}}\in \mathrm {SL}(N,{\mathbb {Z}})\), and the elements \(g_{i,k}, i=1,\ldots ,N,\ k=1,\ldots ,K\) act as translations \(x\mapsto x + \rho _{i,k}\,\, (\mathrm {mod } \,{\mathbb {Z}}^N)\), where \(\rho _{i,k}\in {\mathbb {R}}^N\). Thus if we denote this action by \(\alpha =\alpha _{{{\bar{B}}}, K} :\Gamma _{{{\bar{B}}},K} \rightarrow \mathrm {Diff}^r({\mathbb {T}}^N) \), we have

$$\begin{aligned} \alpha (g_0)(x) = {{\bar{A}}} x,\quad \hbox {and } \alpha (g_{i,k})(x) = x + \rho _{i,k}. \end{aligned}$$

The group relations in \(\Gamma _{{{\bar{B}}}, K}\) restrict the possible values of \(\rho _{i,k}\); we describe these restrictions precisely in the next subsection. We will see that for a typical \({{\bar{A}}}\), the affine actions define a finite dimensional space of distinct (i.e. nonconjugate) actions on the torus.

Given such a group \(\Gamma _{{{\bar{B}}}, K}\) with the associated affine action \({\bar{\alpha }} \) we investigate whether there exist other actions \(\alpha :\Gamma _{{{\bar{B}}},K} \rightarrow \mathrm {Diff}^r({\mathbb {T}}^N)\) that are homotopic to \({\bar{\alpha }} \) but not conjugate to \({\bar{\alpha }} \) in the group \(\mathrm {Diff}^r({\mathbb {T}}^N)\). If there are no such actions, or if such actions are proscribed in some manner, then the group is colloquially said to be rigid (a much more precise definition is given below).

The main rigidity results of this paper can be grouped into two classes: local and global. Loosely speaking, local rigidity results concern those \(C^r\) actions that are \(C^r\) perturbations of the affine action, and global results concern actions where \(C^r\) closeness to the affine action is not assumed (although other restrictions might be present).

We obtain local rigidity results for the actions of \(\Gamma _{{{\bar{B}}}} = \Gamma _{{{\bar{B}}}, 1}\), which imply similar results for \(\Gamma _{{{\bar{B}}}, K},\ K\ge 1\). To each action \(\alpha \) on \({\mathbb {T}}^N\) sufficiently \(C^r\) close to an affine action for some large r, we associate an \(N\times N\) rotation matrix \(\varvec{\rho }(\alpha )\). Under suitable hypotheses on \(\varvec{\rho }(\alpha )\), if the columns of this matrix satisfy a simultaneous Diophantine condition, then \(\alpha \) is smoothly conjugate to the affine action with rotation matrix \(\varvec{\rho }(\alpha )\). The fact that the action \(\alpha \) is a smooth perturbation of an affine action is crucial.

More generally, for each affine action \({\bar{\alpha }}\) of \(\Gamma _{{{\bar{B}}}, K}\) there are K rotation matrices \(\varvec{\rho }_1({\bar{\alpha }}), \ldots ,\varvec{\rho }_K({\bar{\alpha }})\). In the section on global rigidity, we consider actions \(\alpha \) of \(\Gamma _{{{\bar{B}}},K}\) for which \(g_0\) acts as an Anosov diffeomorphism, but the rotation matrices \(\varvec{\rho }_i(\alpha )\) are not a priori well-defined. Under relatively weak additional assumptions on the action, we obtain that the collection of \(\varvec{\rho }_i(\alpha )\) can be defined and forms a complete invariant of the action, up to topological conjugacy. We then establish conditions under which this topological conjugacy is smooth. In particular, if K is sufficiently large (depending on the spectrum of \({{\bar{A}}}\) and the Anosov element \(\alpha (g_0)\)), then for almost every set of rotation matrices \(\varvec{\rho }_1(\alpha ), \ldots ,\varvec{\rho }_K(\alpha )\), the conjugacy is smooth.

Before stating these results, we describe precisely the space of affine actions of \(\Gamma _{{{\bar{B}}}, K}\) we consider.

1.1 The affine actions of \(\Gamma _{{{\bar{B}}}, K}\)

The ABC group relations restrict the rotation vectors as described in the following proposition, which can be verified directly from the group relations (1.2).

Proposition 1.1

Let \({{\bar{A}}},{{\bar{B}}}\in \mathrm {SL}(N,{\mathbb {Z}})\), and suppose that \(\varvec{\rho }_1, \varvec{\rho }_2,\ldots , \varvec{\rho }_K \) are real-valued, \(N\times N\) matrices such that each \(\varvec{\rho }= \varvec{\rho }_i\) satisfies:

$$\begin{aligned} {{\bar{A}}}\varvec{\rho }= \varvec{\rho }{{\bar{B}}} \mathrm {\ mod\ }\ {\mathbb {Z}}^{N\times N}. \end{aligned}$$
(1.3)

Denote by \(\rho _{i,j}\) the j-th column of \(\varvec{\rho }_i\). Then the affine maps

$$\begin{aligned} {\bar{\alpha }}(g_0)(x) := {{\bar{A}}}x \,\,(\mathrm {mod } \,{\mathbb {Z}}^N),\quad \mathrm {and}\quad {\bar{\alpha }}(g_{i,j})(x) := x +\rho _{i,j} \,\,(\mathrm {mod } \,{\mathbb {Z}}^N) \end{aligned}$$

define an action \({\bar{\alpha }} = {\bar{\alpha }}_K({{\bar{A}}}, \varvec{\rho }) :\Gamma _{{{\bar{B}}}, K}\rightarrow \mathrm {SL}(N,{\mathbb {Z}})\ltimes {\mathbb {R}}^N \) on \({\mathbb {T}}^N\).

Conversely, if \(\alpha :\Gamma _{{{\bar{B}}},K}\rightarrow \mathrm {SL}(N,{\mathbb {Z}})\ltimes {\mathbb {R}}^N \) is an action on \({\mathbb {T}}^N\) with

$$\begin{aligned} \alpha (g_0)(x) = {{\bar{A}}}x \,\,(\mathrm {mod } \,{\mathbb {Z}}^N),\quad \mathrm {and}\quad \alpha (g_{i,j})(x) = x + \beta _{i,j} \,\,(\mathrm {mod } \,{\mathbb {Z}}^N), \end{aligned}$$

for some vectors \(\beta _{i,j}\in {\mathbb {R}}^N\), then for each \(i = 1,\ldots , K\), the matrix \(\varvec{\rho }_i\) whose columns are formed by the \(\beta _{i,j}\) satisfies (1.3).

We will give more details about the set of rotation vectors satisfying (1.3) in Proposition B.1, Lemma B.2 and Theorem B.4 in Appendix B.

We further focus on the case \(K=1\). For \({{\bar{A}}},{{\bar{B}}}\in \mathrm {SL}(N,{\mathbb {Z}})\) and \(\varvec{\rho }\in M_N({\mathbb {T}})\), where \(M_N({\mathbb {T}})\) denotes \(N\times N\) matrices with entries in \({\mathbb {T}}\), we denote by \({\bar{\alpha }}({{\bar{A}}}, \varvec{\rho })\) the action \({\bar{\alpha }}\) on \({\mathbb {T}}^N\) defined in Proposition 1.1. Let

$$\begin{aligned} \mathrm {Aff}(\Gamma _{{{\bar{B}}}},{{\bar{A}}}):=\{{\bar{\alpha }}({{\bar{A}}},\varvec{\rho }): \varvec{\rho }\mathrm {\ satisfies\ }(1.3)\}. \end{aligned}$$

The next proposition describes faithful affine actions.

Proposition 1.2

The action \({\bar{\alpha }}({{\bar{A}}}, \varvec{\rho })\in \mathrm {Aff}(\Gamma _{{{\bar{B}}}},{{\bar{A}}})\) is faithful if and only if \({{\bar{A}}}\) is not of finite order, and the column vectors \(\rho _1,\ldots ,\rho _N\) of \(\varvec{\rho }\) are linearly independent over \({\mathbb {Z}}\); that is, if there exists \((p_1,\ldots ,p_N)\in {\mathbb {Z}}^N\) with \(\sum _{i=1}^N p_i\rho _i=0 \mod {\mathbb {Z}}^N\), then \(p_1=\cdots =p_N=0\).

For \({{\bar{A}}} \in \mathrm {SL}(N,{\mathbb {Z}})\) not of finite order, we thus define the set of faithful affine actions by

$$\begin{aligned} \mathrm {Aff}_\star (\Gamma _{{{\bar{B}}}},{{\bar{A}}}):=\{{\bar{\alpha }}({{\bar{A}}},\varvec{\rho })\in \mathrm {Aff}(\Gamma _{{{\bar{B}}}},{{\bar{A}}}) : {\bar{\alpha }}({{\bar{A}}},\varvec{\rho }) \mathrm {\ is\ faithful}\}. \end{aligned}$$

Proposition 1.2 reduces the problem of finding faithful actions to solving equation (1.3) for solutions \(\varvec{\rho }\) whose columns are linearly independent over \({\mathbb {Z}}\). We will provide more details in Appendix B. It may happen that the set \(\mathrm {Aff}_\star (\Gamma _{{{\bar{B}}}},{{\bar{A}}})\) is empty, for instance, when \({{\bar{A}}},{{\bar{B}}}\in \mathrm {SL}(2,{\mathbb {Z}})\) with \(\mathrm {tr}\,{{\bar{A}}}\ne \mathrm {tr}\,{{\bar{B}}}\) (cf. Proposition B.1(1)). It is useful to keep in mind the following special case, in which faithful actions are abundant. Suppose \({{\bar{A}}}={{\bar{B}}}\in \mathrm {SL}(N,{\mathbb {Z}})\) has simple spectrum. Then every \(\varvec{\rho }\) of the form \(\varvec{\rho }=\sum _{i=1}^{N} a_i {{\bar{A}}}^{i-1}\) for \(a=(a_1,\ldots ,a_N)\in {\mathbb {R}}^N\) satisfies (1.3) (cf. Lemma B.2). The action is not faithful when a is a rational vector and is faithful when a is a Diophantine vector (cf. Lemma B.3).

Proposition 1.3

Given \({{\bar{A}}}\in \mathrm {SL}(N,{\mathbb {Z}})\), two actions \({\bar{\alpha }}_1, {\bar{\alpha }}_2 \in \mathrm {Aff}_\star (\Gamma _{{{\bar{B}}}},{{\bar{A}}})\) are conjugate by a homeomorphism homotopic to the identity if and only if \({\bar{\alpha }}_1={\bar{\alpha }}_2.\)

Thus actions in \(\mathrm {Aff}_\star (\Gamma _{{{\bar{B}}}},{{\bar{A}}})\) may not be locally rigid even among affine actions. Returning to our original example, let

$$\begin{aligned} {{\bar{A}}}= \left( \begin{array}{cc} 2&{}\quad 1\\ 1&{}\quad 1 \end{array} \right) ,\quad {{\bar{B}}}={{\bar{A}}} \end{aligned}$$
(1.4)

and

$$\begin{aligned} \Gamma _{{{\bar{A}}}} = \langle g_0, g_1, g_2\ |\ g_0g_1g_0^{-1} = g_1^2g_2,\quad g_0g_2g_0^{-1} = g_1g_2,\quad [g_1,g_2] =1\rangle . \end{aligned}$$

Fix \(a_0, a_1 \in {\mathbb {T}}^1\), and let \(\varvec{\rho }(a_0, a_1):= \begin{pmatrix} a_0 + 2 a_1 &{}\quad a_1 \\ a_1 &{}\quad a_0 + a_1\end{pmatrix}\). Then

$$\begin{aligned} \{{\bar{\alpha }}({{\bar{A}}}, \varvec{\rho }(a_0,a_1)): a_0,a_1\in {\mathbb {T}}^1\} \end{aligned}$$

defines a 2-parameter family of non-conjugate actions on \({\mathbb {T}}^2\); in the next subsection we explain that these are all such affine actions.
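As a sanity check, condition (1.3) can be tested directly for this family. The Python sketch below is illustrative only; the names `matmul`, `rho` and `satisfies_1_3` are hypothetical helpers, and the parameters \(a_0, a_1\) are represented by rational lifts so that the arithmetic is exact. Since \(\varvec{\rho }(a_0,a_1)=a_0\,\mathrm {Id}+a_1{{\bar{A}}}\) commutes with \({{\bar{A}}}\), the check succeeds for every choice of parameters.

```python
from fractions import Fraction as F

A_BAR = ((2, 1), (1, 1))  # the matrix in (1.4); in this example B_bar = A_bar

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def rho(a0, a1):
    """The matrix rho(a0, a1) = a0 * Id + a1 * A_bar."""
    return ((a0 + 2 * a1, a1), (a1, a0 + a1))

def satisfies_1_3(A, B, R):
    """True if every entry of A*R - R*B is an integer, i.e. (1.3) holds."""
    left, right = matmul(A, R), matmul(R, B)
    return all(
        (left[i][j] - right[i][j]) % 1 == 0
        for i in range(2)
        for j in range(2)
    )

# Arbitrary rational lifts of a0, a1 (illustrative choices).
print(satisfies_1_3(A_BAR, A_BAR, rho(F(1, 3), F(2, 7))))
# A matrix outside the family, for contrast.
print(satisfies_1_3(A_BAR, A_BAR, ((F(1, 3), 0), (0, 0))))
```

The second matrix fails (1.3) because it does not commute with \({{\bar{A}}}\) modulo integer matrices.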

Thus the matrix \(\varvec{\rho }\) is a complete invariant of the faithful affine representations \({\bar{\alpha }}({{\bar{A}}},\varvec{\rho })\) of \(\Gamma _{{{\bar{B}}}}\). The columns of \(\varvec{\rho }\) are rotation vectors of the corresponding translations. We will show that these rotation vectors, and hence the invariant \(\varvec{\rho }\), extend continuously to a neighborhood of the affine representations in such a way that \(\varvec{\rho }\) gives a complete invariant under smooth conjugacy, under the hypothesis that the columns of \(\varvec{\rho }\) satisfy a simultaneous Diophantine condition.

Further properties of the affine representations are discussed in Appendix C, which also contains the proofs of the results in this section.

1.2 Local rigidity of \(\Gamma _{{{\bar{B}}},K}\) actions

An action \(\alpha :\Gamma \rightarrow \mathrm {Diff}^r(M)\) is \(C^{r,k,\ell }\) locally rigid if any sufficiently \(C^k\) small \(C^r\) perturbation \({\tilde{\alpha }}\) is \(C^\ell \) conjugate to \(\alpha \), i.e., there exists a diffeomorphism h of M, \(C^\ell \) close to the identity, that conjugates \({\tilde{\alpha }}\) to \(\alpha \): \(h\circ \alpha (g) = {\tilde{\alpha }}(g)\circ h\) for all \(g\in \Gamma \). The paper of Fisher [Fi] contains background and an excellent overview of the local rigidity problem for general group actions.

Local rigidity results for solvable group actions are relatively rare. In [DK], Damjanovic and Katok proved \(C^{\infty ,k,\infty },\) for some large k, local rigidity for \({\mathbb {Z}}^n\ (n\ge 2)\) (abelian) higher rank partially hyperbolic actions by toral automorphisms, by introducing a new KAM iterative scheme. In [HSW] and [W1], the authors proved local rigidity for higher rank ergodic nilpotent actions by toral automorphisms on \({\mathbb {T}}^N\), for any even \(N \ge 6\). Burslem and Wilkinson in [BW] studied the solvable Baumslag-Solitar groups

$$\begin{aligned} BS(1, n) := \langle a, b\ |\ aba^{-1} = b^n\rangle ,\quad n \ge 2, \end{aligned}$$

acting on \({\mathbb {T}}^1\) and obtained a classification of such actions and a global rigidity result in the analytic setting. Asaoka in [A1, A2] studied the local rigidity of the action on \({\mathbb {T}}^N\) or \({\mathbb {S}}^N\) of non-polycyclic abelian-by-cyclic groups, where the cyclic factor is uniformly expanding. In a recent paper [W2], Zhenqi Wang proves local rigidity for certain solvable Lie group actions (without any constraints), which can be considered as a continuous-time counterpart of our actions in the sense that they are solvable and there are only two chambers.

Unless assumptions are made on the action (or the manifold), solvable group actions are typically not locally rigid but can enjoy a form of partial local rigidity: that is, local rigidity subject to constraints that certain invariants be preserved. The simplest example occurs in dimension 1, where the rotation number of a single \(C^2\) circle diffeomorphism supplies a complete topological invariant, provided that it is irrational, and a complete smooth invariant, provided it satisfies a Diophantine condition. This result extends to actions of higher rank abelian groups on \({\mathbb {T}}^1\), under a simultaneous Diophantine assumption on the rotation numbers of the generators of the action [M]. In fact, these results are not just local in nature but apply to all diffeomorphisms of the circle [FK].

For higher dimensional tori, even local rigidity results of this type are scarce, one problem being the lack of invariants analogous to the rotation number. One result in this direction is by Damjanovic and Fayad [DF], who proved local rigidity of ergodic affine \({\mathbb {Z}}^k\) actions on the torus that have a rank-one factor in their linear part, under certain Diophantine conditions.

Definition 1.1

A collection of vectors \(v_1,\ldots , v_m\in {\mathbb {R}}^N\) is simultaneously Diophantine if there exist \(\tau > 0\) and \(C > 0\) such that

$$\begin{aligned} \max _{1\le i\le m} |\langle v_i, n\rangle | \ge \frac{C}{ \Vert n\Vert ^{\tau }},\quad \ \forall \ n \in {\mathbb {Z}}^N{\setminus }\{0\}. \end{aligned}$$
(1.5)

We denote by \(\mathrm {SDC}(C,\tau )\) the set of \((v_1,\ldots ,v_m)\) satisfying (1.5).

For example, the matrix \(\rho \,\mathrm {Id}_N\) is simultaneously Diophantine if \(\rho \) is a Diophantine number. It is known that for any fixed \(\tau >N-1\), the simultaneously Diophantine vectors

$$\begin{aligned} \bigcup _{C>0} \mathrm {SDC}(C,\tau ) \end{aligned}$$

form a full Lebesgue measure subset of \({\mathbb {T}}^{N\times m}\) ([P]).
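Condition (1.5) can be explored numerically over a finite range of n. The Python sketch below is a heuristic illustration, not a proof, and rests on two labeled assumptions that are not stated in the text: \(|\langle v_i,n\rangle |\) is read as the distance from \(\langle v_i,n\rangle \) to the nearest integer (the natural reading for vectors on the torus, and the one under which the example \(\rho \,\mathrm {Id}_N\) is meaningful), and \(\Vert n\Vert \) is taken to be the sup norm. The function name `sdc_constant` is hypothetical.

```python
import math
from itertools import product

def dist_to_integers(t):
    """Distance from the real number t to the nearest integer."""
    return abs(t - round(t))

def sdc_constant(vectors, tau, radius):
    """Minimum over 0 < ||n||_inf <= radius of
    max_i dist(<v_i, n>, Z) * ||n||_inf ** tau.
    Assumption: sup norm and distance-to-Z reading of (1.5)."""
    dim = len(vectors[0])
    best = math.inf
    for n in product(range(-radius, radius + 1), repeat=dim):
        norm = max(abs(c) for c in n)
        if norm == 0:
            continue  # (1.5) excludes n = 0
        worst = max(
            dist_to_integers(sum(vc * nc for vc, nc in zip(v, n)))
            for v in vectors
        )
        best = min(best, worst * norm ** tau)
    return best

golden = (math.sqrt(5) - 1) / 2
columns = [(golden, 0.0), (0.0, golden)]  # columns of golden * Id_2
print(sdc_constant(columns, tau=1.0, radius=50))
# Rational columns fail: the minimum drops to zero.
print(sdc_constant([(0.5, 0.0), (0.0, 0.5)], tau=1.0, radius=5))
```

A value bounded away from zero as `radius` grows is consistent with membership in \(\mathrm {SDC}(C,\tau )\) for that value of C, while a value of zero at some finite radius rules it out.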

Definition 1.2

Given a homeomorphism \(f: {\mathbb {T}}^N\rightarrow {\mathbb {T}}^N\) homotopic to the identity and preserving a probability measure \(\mu \), the vector

$$\begin{aligned} \rho _\mu (f) :=\int _{{\mathbb {T}}^N} ({\tilde{f}}(x)- x)\, d\mu ,\ \mathrm {mod}\ {\mathbb {Z}}^N, \end{aligned}$$
(1.6)

where \({\tilde{f}}:{\mathbb {R}}^N\rightarrow {\mathbb {R}}^N\) is any lift of f, is independent of the choice of lift \({\tilde{f}}\). We call \(\rho _\mu (f)\) the rotation vector of f with respect to \(\mu \).

Our main local rigidity result is:

Theorem 1.4

For any \({{\bar{A}}},{{\bar{B}}} \in \mathrm {SL}(N,{\mathbb {Z}})\) and any \(C,\tau >0\), there exist \(\varepsilon >0\) and \(\ell \in {\mathbb {N}}\) such that for any \(\varvec{\rho }\in \mathrm {SDC}(C,\tau )\) satisfying (1.3) the following holds. Let \(\alpha : \Gamma _{{{\bar{B}}}}\rightarrow \mathrm {Diff}^\infty ({\mathbb {T}}^N)\) be any representation satisfying

  1. (1)

    \(\alpha (g_0)\) is homotopic to \({\bar{\alpha }}({{\bar{A}}},\varvec{\rho })(g_0)={{\bar{A}}}\);

  2. (2)

    \(\max _{1\le i\le N}\Vert \alpha (g_i)-{\bar{\alpha }}({{\bar{A}}},\varvec{\rho })(g_i)\Vert _{C^\ell }<\varepsilon \);

  3. (3)

    there exist \(\alpha (g_i)\)-invariant probability measures \(\mu _i\), \(i=1,\ldots ,N\), such that the matrix formed by the rotation vectors \((\rho _{\mu _1}(\alpha (g_1)), \ldots ,\rho _{\mu _N}(\alpha (g_N)))\) is equal to \(\varvec{\rho }\).

Then there exists a \(C^\infty \) diffeomorphism h that is \(C^1\) close to the identity such that \(h\circ \alpha = {\bar{\alpha }}\circ h\). Moreover, the measure \(\mu = h^{-1}_*\mathrm {Leb}\), where \(\mathrm {Leb}\) is Haar measure on \({\mathbb {T}}^N\), is the unique \(\alpha \)-invariant measure and thus satisfies \(\rho _\mu (\alpha (g_i))=\rho _{\mu _i}(\alpha (g_i)) = \rho _i,\ i=1,\ldots ,N.\)

We remark that there is no assumption on \({{\bar{A}}}\) and \({{\bar{B}}}\) other than being in \(\mathrm {SL}(N,{\mathbb {Z}})\), and no smallness or hyperbolicity assumption on \(\alpha (g_0)\). The reason is that in the proof we first apply the KAM scheme to the \({\mathbb {Z}}^N\) part of the action to find the conjugacy h, and then prove that h also conjugates \(\alpha (g_0)\) to \({{\bar{A}}}\) using the group relation and the ergodicity of the \({\mathbb {Z}}^N\) action (cf. Sect. 2). On the other hand, the assumption that \(\varvec{\rho }\in \mathrm {SDC}(C,\tau )\) satisfies (1.3) puts some constraints on the choice of \({{\bar{A}}}\) and \({{\bar{B}}}\), since it may happen that the affine action is not even faithful for some choices of \({{\bar{A}}}\) and \({{\bar{B}}}\).

We will prove in Appendix B that the simultaneous Diophantine condition is actually satisfied by a large class of matrices \(\varvec{\rho }\) and \({{\bar{A}}},{{\bar{B}}}\) satisfying (1.3). One special case is when \({{\bar{A}}}={{\bar{B}}}\in \mathrm {SL}(N,{\mathbb {Z}})\) has simple spectrum and \(\varvec{\rho }=\sum _{i=1}^N a_i {{\bar{A}}}^{i-1}\), \(a_i\in {\mathbb {R}}\), \(i=1,\ldots ,N\). The columns of the matrix \(\varvec{\rho }\) are simultaneously Diophantine if the nonvanishing \(a_i\)’s form a Diophantine vector (cf. Lemma B.3).

Remark 1.1

We remark that the faithfulness (guaranteed by the Diophantine condition) of the action is necessary for smooth conjugacy. For instance, consider \(\rho = 1/2\) in (1.4) and \(\alpha (g_i)={\bar{\alpha }}(g_i),\ i=1,2\), and for any \(\varepsilon >0\),

$$\begin{aligned}\alpha (g_0)\left[ \begin{array}{c} x\\ y \end{array} \right] =\left[ \begin{array}{cc} 2&{}\quad 1\\ 1&{}\quad 1 \end{array} \right] \left[ \begin{array}{c} x\\ y \end{array} \right] +\varepsilon \left[ \begin{array}{c} \sin (4\pi x)\\ \sin (4\pi x) \end{array} \right] . \end{aligned}$$

One can verify that this gives rise to a \(\Gamma _{{{\bar{A}}}}\) action. We will see in Theorem 1.5 that for sufficiently small \(\varepsilon ,\) there exists a bi-Hölder conjugacy h satisfying \(h\circ \alpha ={\bar{\alpha }}\circ h\). However, the conjugacy h is not \(C^1\). Indeed, 0 is a fixed point for both \(\alpha (g_0)\) and \({{\bar{A}}}\). The derivative \(D_0\alpha (g_0)={{\bar{A}}}+4\pi \varepsilon \left[ \begin{array}{cc} 1&{}\quad 0\\ 1&{}\quad 0\end{array} \right] \) has determinant 1 but a different trace from \({{\bar{A}}}\) for \(\varepsilon \ne 0\), so it is not conjugate to \({{\bar{A}}}\). A \(C^1\) conjugacy would conjugate \(D_0\alpha (g_0)\) to the derivative of \({{\bar{A}}}\) at the fixed point h(0), which is \({{\bar{A}}}\) itself, so no such conjugacy exists.
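The two computational claims in this remark, that the perturbed maps still satisfy the group relations and that the derivative at the origin has determinant 1 but trace different from 3, can be checked numerically. The Python sketch below is an illustration under stated assumptions: the value `EPS = 0.01` and the sample points are arbitrary choices, and all function names are hypothetical. Floating-point arithmetic is used, so equalities are tested up to a small tolerance on the torus.

```python
import math

EPS = 0.01  # arbitrary small perturbation size (illustrative)

def wrap(p):
    """Reduce a point of R^2 modulo Z^2."""
    return (p[0] % 1.0, p[1] % 1.0)

def g0(p):
    """The perturbed map alpha(g_0) from the remark."""
    x, y = p
    bump = EPS * math.sin(4 * math.pi * x)
    return wrap((2 * x + y + bump, x + y + bump))

def g1(p):
    return wrap((p[0] + 0.5, p[1]))  # translation with rho = 1/2

def g2(p):
    return wrap((p[0], p[1] + 0.5))

def torus_dist(p, q):
    """Sup-norm distance on the torus R^2 / Z^2."""
    return max(min(abs(a - b), 1 - abs(a - b)) for a, b in zip(p, q))

def relations_hold(points, tol=1e-9):
    """Check g0 g1 = g1^2 g2 g0, g0 g2 = g1 g2 g0, g1 g2 = g2 g1 at the points."""
    for p in points:
        if torus_dist(g0(g1(p)), g1(g1(g2(g0(p))))) > tol:
            return False
        if torus_dist(g0(g2(p)), g1(g2(g0(p)))) > tol:
            return False
        if torus_dist(g1(g2(p)), g2(g1(p))) > tol:
            return False
    return True

def derivative_at_origin(eps):
    """Jacobian of the perturbed map at (0, 0), computed by hand."""
    s = 4 * math.pi * eps
    return ((2 + s, 1.0), (1 + s, 1.0))

D = derivative_at_origin(EPS)
print(relations_hold([(0.1, 0.2), (0.37, 0.81), (0.5, 0.25)]))
print(D[0][0] * D[1][1] - D[0][1] * D[1][0], D[0][0] + D[1][1])
```

The relations survive because \(\sin (4\pi x)\) has period 1/2 in x, which matches the translation by \(\rho =1/2\).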

Since there are still faithful affine actions which do not satisfy the SDC condition, for which our KAM scheme does not work, it would be interesting to explore the local rigidity of those actions. In the 2D case, [HX] gives the topological classification for all such actions and smooth classification for faithful actions by volume preserving diffeomorphisms.

1.3 Global rigidity

The proof of the above local rigidity theorem is an application of the KAM techniques for \({\mathbb {Z}}^N\) actions initiated by Moser [M] in the context of \({\mathbb {Z}}^N\) actions by circle diffeomorphisms. The KAM technique is essentially perturbative. It is natural to ask if our solvable group action is rigid in the nonperturbative sense, i.e. whether it is globally rigid. A class of actions of a group, not necessarily close to algebraic actions, is called globally rigid if any action from this class is conjugate to an algebraic one. There is a nonperturbative global rigidity theory for circle maps known as Herman–Yoccoz theory. For abelian group actions by circle diffeomorphisms, the global version of Moser’s theorem was proved by Fayad and Khanin [FK]. These global rigidity results rely on the Denjoy theorem stating that a \(C^2\) circle diffeomorphism with irrational rotation number is topologically conjugate to the irrational rotation by the rotation number.

In the higher dimensional case, there is no corresponding Herman-Yoccoz theory for diffeomorphisms of \({\mathbb {T}}^N\) isotopic to rotations. The reason is that rotation vectors are not well-defined in general. Even when rotation vectors are uniquely defined, they are not the complete invariants for conjugacy analogous to rotation numbers for circle maps. In particular, the obvious analogue of the topological conjugacy given by the Denjoy theorem does not exist for diffeomorphisms of \({\mathbb {T}}^N,\ N>1.\)

On the other hand, by a theorem of Franks (Theorem 3.1 below), Anosov diffeomorphisms of \({\mathbb {T}}^N\) are topologically conjugate to toral automorphisms. A diffeomorphism \(f:M\rightarrow M\) is called Anosov if there exist constants C and \(0<\lambda <1\) and for each \(x\in M\) a splitting of the tangent space \(T_xM=E^s(x)\oplus E^u(x)\) such that for every \(x\in M\), we have

  • \(D_xf E^s(x)=E^s(f(x))\) and \(D_xf E^u(x)=E^u(f(x))\),

  • \(\Vert D_xf^n v\Vert \le C \lambda ^n\Vert v\Vert \) for \(v\in E^s(x){\setminus } \{0\}\) and \(n\ge 0,\) and \(\Vert D_xf^n v\Vert \le C \lambda ^{-n}\Vert v\Vert \) for \(v\in E^u(x){\setminus } \{0\}\) and \(n\le 0.\)

As the starting point of a global rigidity result for our \(\Gamma _{{{\bar{A}}}}\) action, we assume that \(\alpha (g_0)\) acts by an Anosov diffeomorphism homotopic to \({{\bar{A}}}\). With the topological conjugacy at hand, the next question is whether the topological conjugacy given by Franks’s theorem also linearizes the abelian subgroup action. The new problem that arises is that for toral diffeomorphisms homotopic to the identity the rotation vector is in general not well-defined, and it only makes sense to talk about the rotation set. When there is more than one vector in the rotation set, the diffeomorphism cannot be conjugate to a translation.

1.3.1 Topological conjugacy

The case \(N=2\) admits a fairly complete understanding of the topological picture of ABC actions. In particular, the next result classifies the ABC group actions on \({\mathbb {T}}^2\) up to topological conjugacy when \(g_0\) acts by an Anosov diffeomorphism and the \(g_i\), for \(i\ge 1\), are not too far from translations, in a sense that we make precise.

Theorem 1.5

Let \({{\bar{A}}},{{\bar{B}}}\in \mathrm {SL}(2,{\mathbb {Z}})\) be linear Anosov and \(\alpha :\Gamma _{{{\bar{B}}}}\rightarrow \mathrm {Diff}^r({\mathbb {T}}^2)\), \(r>1\), be a representation satisfying

  1. (1)

    \(\alpha (g_0)\) is Anosov and homotopic to \({{\bar{A}}}\);

  2. (2)

    the sub-action generated by \(\alpha (g_1),\ldots ,\alpha (g_N) \) has sub-linear oscillation (see Definition 1.3 below) in the case \(\mathrm {tr}\,{{\bar{A}}}=\mathrm {tr}\,{{\bar{B}}}\), and c-slow oscillation in the case \(\mathrm {tr}\,{{\bar{A}}}\ne \mathrm {tr}\,{{\bar{B}}}\), where c is as in Remark 1.3.

Then there exist \(\varvec{\rho }\) satisfying (1.3) and a unique bi-Hölder homeomorphism \(h:{\mathbb {T}}^2\rightarrow {\mathbb {T}}^2\) homotopic to the identity satisfying

$$\begin{aligned} h\circ \alpha ={\bar{\alpha }}({{\bar{A}}},\varvec{\rho })\circ h.\end{aligned}$$

Remark 1.2

In this theorem faithfulness of the action is not necessary, since we do not need the rotation vectors of \({\bar{\alpha }}(g_i)\) to be irrational.

The assumption on the sub-linear oscillation is removed in [HX] by introducing an Anosov foliation Tits’ alternative. Here we give the statement and refer the reader to [HX] for the proof.

Theorem 1.6

[HX]. Suppose that \(\alpha : \Gamma _{{{\bar{B}}}} \rightarrow \mathrm {Diff}({\mathbb {T}}^2)\) is such that:

  1. (1)

    \({{\bar{B}}} \in \mathrm {SL}(2,{\mathbb {Z}})\) is an Anosov linear map (i.e. \({{\bar{B}}}\) has eigenvalues of norm different from one).

  2. (2)

    The diffeomorphism \(\alpha (g_0)\) is Anosov and homotopic to \({{\bar{A}}}\).

Then \(\alpha \) is topologically conjugate to an affine action of \(\Gamma _{{{\bar{B}}}}\) as in Proposition 1.1 up to finite index. More concretely, there exist a finite index subgroup \(\Gamma ' <\Gamma _{{{\bar{B}}}}\) and \(h \in \mathrm {Homeo}({\mathbb {T}}^2)\) such that \(h\alpha (\Gamma ') h^{-1}\) is an affine action.

With the notion of c-slow oscillation, we also obtain the following result for general N.

Theorem 1.7

Suppose \(N>2\). Given hyperbolic matrices \({{\bar{A}}},{{\bar{B}}}\in \mathrm {SL}(N,{\mathbb {Z}})\), there exists \(0\le c<1\) such that the following holds. Let \(\alpha :\Gamma _{{{\bar{B}}}}\rightarrow \mathrm {Diff}^r({\mathbb {T}}^N)\), \(r>1\), be a representation satisfying

  1. (1)

    \(\alpha (g_0)\) is Anosov and homotopic to \({{\bar{A}}}\),

  2. (2)

    the sub-action generated by \(\alpha (g_1),\ldots ,\alpha (g_N)\) has c-slow oscillation (see Definition 1.3 below).

Then there exist \(\varvec{\rho }\) satisfying (1.3) and a unique bi-Hölder homeomorphism \(h:{\mathbb {T}}^N\rightarrow {\mathbb {T}}^N\) homotopic to the identity with

$$\begin{aligned} h\circ \alpha ={\bar{\alpha }}({{\bar{A}}},\varvec{\rho })\circ h. \end{aligned}$$

Remark 1.3

The constant c in Theorem 1.7 can be made explicit as follows. Suppose \({{\bar{A}}}\) has eigenvalues \(\lambda _1^u,\ldots ,\lambda ^u_k\) and \(\lambda _1^s,\ldots ,\lambda ^s_\ell \), \(k\ge 1,\ \ell \ge 1\), \(k+\ell =N\), (complex eigenvalues and repeated eigenvalues are allowed), ordered as follows

$$\begin{aligned} |\lambda ^s_\ell |\le \cdots \le |\lambda _1^s|<1<|\lambda _1^u|\le \cdots \le |\lambda _k^u|. \end{aligned}$$
(1.7)

We introduce similar quantities for \({{\bar{B}}}\)

$$\begin{aligned} |\mu ^s_{\ell '}|\le \cdots \le |\mu _1^s|<1<|\mu _1^u|\le \cdots \le |\mu _{k'}^u|,\quad \ell '+k'=N. \end{aligned}$$
(1.8)

Then c can be chosen to be any number satisfying

$$\begin{aligned} 0\le c<\min \left\{ \frac{\ln |\lambda _1^u|}{\ln |\mu _{k'}^u|}, \frac{\ln |\lambda _1^s|}{\ln |\mu _{\ell '}^s|}\right\} .\end{aligned}$$
(1.9)
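For \(N=2\) the right-hand side of (1.9) is easy to evaluate, since a hyperbolic matrix in \(\mathrm {SL}(2,{\mathbb {Z}})\) has exactly one unstable and one stable eigenvalue, of reciprocal moduli, determined by the trace. The Python sketch below is an illustration restricted to this case; the function names and the second example matrix are hypothetical choices, not taken from the paper.

```python
import math

def unstable_eigenvalue_modulus(M):
    """|lambda^u| for a hyperbolic M in SL(2, Z), i.e. det = 1 and |trace| > 2."""
    t = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det != 1 or abs(t) <= 2:
        raise ValueError("expected a hyperbolic matrix in SL(2, Z)")
    # Eigenvalues are (t +/- sqrt(t^2 - 4)) / 2; take the one of larger modulus.
    return (abs(t) + math.sqrt(t * t - 4)) / 2

def c_upper_bound(A, B):
    """Right-hand side of (1.9) for 2x2 matrices (k = l = k' = l' = 1)."""
    lam_u = unstable_eigenvalue_modulus(A)
    mu_u = unstable_eigenvalue_modulus(B)
    unstable_ratio = math.log(lam_u) / math.log(mu_u)
    # The stable eigenvalues have moduli 1/lam_u and 1/mu_u.
    stable_ratio = math.log(1 / lam_u) / math.log(1 / mu_u)
    return min(unstable_ratio, stable_ratio)

A_BAR = ((2, 1), (1, 1))   # trace 3
B_EXAMPLE = ((3, 1), (2, 1))  # trace 4, determinant 1 (illustrative choice)
print(c_upper_bound(A_BAR, A_BAR))
print(c_upper_bound(A_BAR, B_EXAMPLE))
```

When the two matrices coincide the bound equals 1, so every \(c\in [0,1)\) is admissible; when \({{\bar{B}}}\) expands more strongly than \({{\bar{A}}}\) the admissible range shrinks.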

We next introduce the concepts of sublinear oscillation and c-slow oscillation. Let \(T\in \mathrm {Diff}_0({\mathbb {T}}^N)\) and let \({\tilde{T}}:{\mathbb {R}}^N\rightarrow {\mathbb {R}}^N\) be a lift of T. Denote by \(\pi _i\) the projection to the i-th component of a vector in \({\mathbb {R}}^N\). Define the oscillation \(\mathrm {Osc}({\tilde{T}})\) of \({\tilde{T}}\) by:

$$\begin{aligned} \mathrm {Osc}({\tilde{T}}):=\max _{x,i}\{\pi _i({\tilde{T}}(x)-x)\}-\min _{x,i}\{\pi _i({\tilde{T}}(x)-x)\}. \end{aligned}$$

It is easy to see that Osc is independent of the choice of the lift. We define Osc\((T)=\)Osc\(({\tilde{T}})\).

Definition 1.3

  1. (1)

    For given \(c\in [0,1)\), we say that the abelian group action \(\beta :{\mathbb {Z}}^N\rightarrow \mathrm {Diff}_0^r({\mathbb {T}}^N)\) is of c-slow oscillation if

    $$\begin{aligned} \limsup _{\Vert p\Vert \rightarrow \infty }\frac{\mathrm {Osc}(\beta (p))}{\Vert p\Vert ^c}<\infty . \end{aligned}$$

  2. (2)

    We say the action \(\beta \) is of bounded oscillation if it is of 0-slow oscillation.

  3. (3)

    We say the action \(\beta \) has sublinear oscillation if

    $$\begin{aligned} \limsup _{\Vert p\Vert \rightarrow \infty }\frac{\mathrm {Osc}(\beta (p))}{\Vert p\Vert }=0. \end{aligned}$$

Let us motivate the definition of c-slow oscillation. In the circle map case, the existence and uniqueness of the rotation number relies crucially on the fact that the graph in \({\mathbb {R}}^2\) of every lifted orbit stays within distance 1 of a straight line, and the rotation number is simply the slope of the line. This fact is also important in the study of the Euler class and bounded cohomology for groups acting on the circle [Gh]. We say a diffeomorphism \(f:{\mathbb {T}}^N\rightarrow {\mathbb {T}}^N\) is of bounded deviation if there exist \(\rho \in {\mathbb {T}}^N\) and a constant \(C>0\), such that

$$\begin{aligned} \Vert {\tilde{f}}^n(x)-x-n\rho \Vert _{C^0}\le C,\quad \forall \ n\in {\mathbb {Z}}. \end{aligned}$$

Being of bounded deviation implies that each orbit of \({\tilde{f}}\) stays within bounded distance of the line \({\mathbb {R}}\rho \). The concept of bounded deviation was first introduced by Morse, who called it of class A, in the case of geodesic flows on surfaces of genus greater than 1 [Mo]. It was later shown by Hedlund that globally minimizing geodesics for an arbitrary smooth metric on \({\mathbb {T}}^2\) are also of bounded deviation [He]. A generalization to Gromov hyperbolic spaces can be found in [BBI]. In the one-dimensional case, all circle maps are of bounded deviation, from which follows immediately the existence of the rotation number.

Being of bounded deviation does not however guarantee the existence of a conjugacy to a rigid translation. In the one dimensional case, a circle map with irrational rotation number is only known to be semi-conjugate to a rotation. Denjoy’s counter-example shows that the semi-conjugacy cannot be improved to a conjugacy without further assumptions. In the two dimensional case, it is known [Ja] that for a conservative pseudo-rotation of bounded oscillation, the rotation vector being totally irrational is equivalent to the existence of a semi-conjugacy to the rigid translation. Examples of diffeomorphisms on \({\mathbb {T}}^2\) of bounded deviation can be found in [MS], which are higher dimensional generalizations of Denjoy’s examples on \({\mathbb {T}}^1\).

It is easy to see that bounded deviation implies c-slow oscillation with \(c=0\). Sublinear oscillation occurs in first passage percolation (see Section 4.2 of [ADH]) where paths minimizing a cost defined for random walks on \({\mathbb {Z}}^2\) have c-slow oscillation with a power law \(c\le 3/4\) and conjecturally \(c=2/3\).

1.3.2 Smooth conjugacy

The conjugacy h in Theorems 1.5 and 1.7 is only known to be Hölder. It is natural to ask if we can improve the regularity. In hyperbolic dynamics, there is a periodic data rigidity theory for Anosov diffeomorphisms, which implies in the two-dimensional case that if the regularity of h is known to be \(C^1\), then h is in fact as smooth as the Anosov diffeomorphism \(\alpha (g_0)\) (see Theorem 5.7 below).

So the problem is now to find sufficient conditions for our action to ensure that the conjugacy h is \(C^1\). The invariant foliation structure given by the Anosov diffeomorphism enables us to generalize the Herman-Yoccoz theory for circle maps to the higher dimensional setting.

To obtain higher regularity of the conjugacy, we consider a slightly different class of ABC groups \(\Gamma _{{{\bar{B}}},K}\) for some \(K\ge 1.\)

We introduce the following condition:

$$\begin{aligned} (\star )\quad \varvec{\rho }\ \text {rationally generates}\ {\mathbb {T}}^N, \end{aligned}$$

meaning: the set \(\left\{ \sum _{i=1}^N p_i\varvec{\rho }_i\mathrm {\ mod\ }{\mathbb {Z}}^N\ |\ (p_1,\ldots ,p_N)\in {\mathbb {Z}}^N\right\} \) is dense in \({\mathbb {T}}^N\), where \(\varvec{\rho }_i\) denotes the ith column of \(\varvec{\rho }\).

Theorem 1.8

Let \({{\bar{A}}},{{\bar{B}}}\in \mathrm {SL}(2,{\mathbb {Z}})\) with \(\mathrm {tr}\,{{\bar{A}}}=\mathrm {tr}\,{{\bar{B}}}\). Given an Anosov diffeomorphism \(A:{\mathbb {T}}^2\rightarrow {\mathbb {T}}^2\) homotopic to \({{\bar{A}}}\in \mathrm {SL}(2,{\mathbb {Z}})\), there is a \(C^1\) open set \({\mathcal {O}}\) of Anosov diffeomorphisms containing A, and a number \(K_0\) such that for any integer \(K\ge K_0\), there exists a full measure set \({\mathcal {R}}_{2,K}\subset ({\mathbb {T}}^2)^K\) such that the following holds.

Let \(\alpha :\Gamma _{{{\bar{B}}},K}\rightarrow \mathrm {Diff}^r({\mathbb {T}}^2)\) be a representation satisfying:

  1. (1)

    \(\alpha (g_0)\in {\mathcal {O}}\),

  2. (2)

    the sub-action generated by \(\alpha (g_{1,1}),\alpha (g_{2,1})\) has sub-linear oscillation, and assume in addition that \(\varvec{\rho }\) given by Theorem 1.5 satisfies \((\star )\).

  3. (3)

    for some \(i:\{1,\ldots ,K\}\rightarrow \{1,2\}\), the rotation vectors \((\rho _{i(1),1},\ldots , \rho _{i(K),K})\) lie in \({\mathcal {R}}_{2,K}\), where \(\rho _{j,k}\) is the rotation vector of \(\alpha (g_{j,k})\) with respect to an invariant probability measure \(\mu _{j,k}\), \(j=1,2\) and \(k=1,\ldots ,K\).

Then there exists a unique \(C^{r-\varepsilon }\) conjugacy h conjugating the action \(\alpha \) to an affine action for \(\varepsilon \) arbitrarily small.

Theorem 1.9

Given a hyperbolic \({{\bar{A}}}\in \mathrm {SL}(N,{\mathbb {Z}})\), \(N>2\), with simple real spectrum, there exist a \(C^1\) neighborhood \({\mathcal {O}}\) of \({{\bar{A}}}\), a number \(0\le c<1\) and a number \(K_0\), such that for any integer \(K>K_0\), there exists a full measure set \({\mathcal {R}}_{N,K}\subset ({\mathbb {T}}^N)^K\) such that the following holds.

Let \(\alpha :\Gamma _{{{\bar{B}}},K}\rightarrow \mathrm {Diff}^r({\mathbb {T}}^N)\) be a representation satisfying

  1. (1)

    \(\alpha (g_0)\in {\mathcal {O}}\);

  2. (2)

    the sub-action generated by \(\alpha (g_{1,1}),\ldots , \alpha (g_{N,1})\) has c-slow oscillation and assume in addition that \(\varvec{\rho }\) given by Theorem 1.7 satisfies \((\star )\);

  3. (3)

    for some \(i:\{1,\ldots ,K\}\rightarrow \{1,\ldots ,N\}\), the rotation vectors \((\rho _{i(1),1},\ldots , \rho _{i(K),K})\) lie in \({\mathcal {R}}_{N,K}\), where \(\rho _{j,k}\) is the rotation vector of \(\alpha (g_{j,k})\) with respect to an invariant probability measure \(\mu _{j,k}\), \(j=1,\ldots ,N\) and \(k=1,\ldots ,K\).

Then there is a unique \(C^{1,\nu }\) conjugacy h conjugating \(\alpha \) to an affine action for some \(\nu >0\).

In dimension 3, the regularity of the conjugacy can be improved by applying the work of Gogolev in [G2] (Theorem 5.8 below).

Corollary 1.10

Under the same assumptions as Theorem 1.9, suppose in addition that \(N=3\) and \(r>3\). Then the conjugacy \(h\in C^{r-3-\varepsilon },\) for arbitrarily small \(\varepsilon .\) Moreover, there exists a \(\kappa \in {\mathbb {Z}}\) such that if \(r\notin (\kappa ,\kappa +3)\), then \(h\in C^{r-\varepsilon }.\)

Further relaxation of the assumptions of Theorem 1.9 and Corollary 1.10 is possible, and we discuss this in Sect. 6.3. In particular, in many cases the condition on the \(C^1\) closeness of A to \({{\bar{A}}}\) can be relaxed.

For the \(N>2\) case, the elliptic dynamics techniques in two dimensions carry over completely. However, there are two new obstructions that come from the hyperbolic dynamics. On the one hand, a conjugacy between two Anosov diffeomorphisms sends (un)stable leaves to (un)stable leaves. On the other hand, the affine foliations parallel to the eigenspaces of \({{\bar{A}}}\) might not be sent to A-invariant foliations with smooth leaves. Adding to the difficulty is the fact that the regularity of the weakest stable and unstable distributions is low (only Hölder in general). These issues present an obstacle to developing a theory of periodic data rigidity as strong as in the two-dimensional setting. The most general result [G1, GKS] in this direction for \(N>2\) states that if A and \({{\bar{A}}}\) are \(C^1\) close and have the same periodic data, then the conjugacy h is \(C^{1+}\) (i.e. Dh and \(Dh^{-1}\) are Hölder).

The paper is organized as follows. We prove the local rigidity Theorem 1.4 in Sect. 2. All the remaining sections are devoted to the proof of the global rigidity results. In Sect. 3, we prove that there is a common conjugacy (Theorems 1.5 and 1.7). In Sect. 4, we prepare techniques from elliptic dynamics and hyperbolic dynamics. In Sect. 5, we state and prove the main propositions needed for the proof of Theorems 1.8 and 1.9. In Sect. 6, we prove the main Theorems 1.8 and 1.9. In Appendix A, we give the proof of the number theoretic result Theorem 5.3. In Sect. 6.3, we prove the results about affine actions stated in Sect. 1.1.

2 Local Rigidity: Proofs

In this section, we prove Theorem 1.4. Here is a sketch. Given a representation \(\alpha :\Gamma _{{{\bar{B}}}}\rightarrow \mathrm {Diff}^\infty ({\mathbb {T}}^N)\) with \((\rho _{\mu _1}(\alpha (g_1)),\ldots ,\rho _{\mu _N}(\alpha (g_N)))=\varvec{\rho }\in \mathrm {SDC}(C,\tau )\), where \(\mu _i\) is an invariant probability measure of \(\alpha (g_i)\), we can proceed as in [M] using the KAM method to show that the abelian subgroup action can be smoothly conjugated to rigid translations. Using the group relation, we can further show that this conjugacy also conjugates the diffeomorphism \(\alpha (g_0)\) to a linear one.

The following proposition is proved by the standard KAM iteration procedure.

Proposition 2.1

(KAM for abelian group actions). Given \(C,\tau >0\), there exist \(\ell \ge 1\) and \(\varepsilon _0>0\) such that the following holds.

Let \(T_1,\ldots , T_m\in \mathrm {Diff}_0^\infty ({\mathbb {T}}^N)\) be commuting diffeomorphisms with \(m>1\). Suppose there exist \(T_k\)-invariant measures \(\mu _k\) such that the rotation vectors \(\rho _{\mu _k}(T_k),\ k=1,\ldots ,m,\) satisfy the simultaneous Diophantine condition with constants \(C,\tau \), and

$$\begin{aligned} \max _{1\le k\le m}\Vert T_k-\mathrm {id}-\rho _{\mu _k}(T_k)\Vert _{C^\ell } < \varepsilon _0. \end{aligned}$$

Then there exists a \(C^\infty \) diffeomorphism h that is \(C^1\) close to the identity such that

$$\begin{aligned} h\circ T_k(x)= h(x)+\rho _{\mu _k}(T_k),\quad x\in {\mathbb {T}}^N,\quad k=1,\ldots ,m. \end{aligned}$$

Moreover the invariant measure \(\mu = h^{-1}_*\mathrm {Leb}\), where \(\mathrm {Leb}\) is Haar measure on \({\mathbb {T}}^N\), satisfies \(\rho _\mu (T_k)=\rho _{\mu _k}(T_k)\), \(k=1,\ldots ,m\).

Proof

The proof of this proposition is essentially the same as that of Moser [M]. A proof was sketched by F. Rodriguez-Hertz in the case of \(m=2, N = 2\) (see Theorem 6.5 of [R1]). It is not difficult to adapt the proof to the case \(N > 2, m\ge 2\). The only complexity in the case \(N>1\) is caused by the fact that the rotation vector is in general not uniquely defined. Here we give a sketch of the KAM iteration procedure to explain how to incorporate the rotation vector and invariant measure.

Given \(T_1,\ldots ,T_m\), we want to find a conjugacy h as stated. The strategy of the KAM iteration scheme is to find a sequence \(\{h^{(n)}\},n\ge 0\) of diffeomorphisms such that \({\mathscr {H}}_n \rightarrow h\) in the \(C^1\) topology as \(n\rightarrow \infty \), where \({\mathscr {H}}_n =h^{(n)}\circ \cdots \circ h^{(1)}\). This limit h is a priori only \(C^1\), but a standard argument then shows that h is smooth. Let \(h^{(0)} = id\), and for \(k=1,\ldots , m\), let \(T_k^{(n)}(x):={\mathscr {H}}_n T_k {\mathscr {H}}_n^{-1}(x)\), for \(n\ge 0 \). Let \(R_k^{(n)}:{\mathbb {T}}^N\rightarrow {\mathbb {R}}^N\) be defined by the equation \(T_k^{(n)}(x) = x+\rho _{\mu _k}(T_k)+R_k^{(n)}(x)\).

We show that for each k, the sequence \(R_{k}^{(n)}\) converges to zero in the \(C^1\) topology as \(n\rightarrow \infty \). When \(T_k^{(n-1)}\) is known from the previous step, each \(h^{(n)}\) is found by solving the linearization of the equation \(h^{(n)}T_k^{(n-1)}= h^{(n)}+\rho _{\mu _k}(T_k)\). Since the equation is not solved exactly in each step, the conjugated map \(T_k^{(n)}=h^{(n)}T_k^{(n-1)}(h^{(n)})^{-1}\) is not yet the translation \(x\mapsto x+\rho _{\mu _k}(T_k)\) but is closer to it than \(T_k^{(n-1)}\) is. The standard KAM method consists mainly of two ideas: the solvability of each linearized equation under the Diophantine condition up to some loss of derivatives, and the convergence of the procedure due to the quadratic smallness of \(R_k^{(n+1)}\) compared with \(R^{(n)}_k\). The key observation of [M] that will also be important here is that the commutativity enables us to solve for one \(h^{(n)}\) simultaneously for all \(k=1,\ldots ,m\) assuming the SDC.

Step 1: the cohomological equation and commutativity.

Write \(T_k(x)=x+\rho _k+R_k(x)\) for \(k=1,\ldots ,m\), where \(\rho _k=\rho _{\mu _k}(T_k)\) is the rotation vector of \(T_k\) with respect to the given measure \(\mu _k\). For the sake of iteration later, we will also label \(T_k=T_k^{(0)},R_k=R_k^{(0)}\) and \(\mu _k=\mu _k^{(0)}\). The vector \(\rho _k\) will be kept constant independent of the super-script.

The conjugacy equation \(h T_k=h+\rho _k\) gives

$$\begin{aligned} x+\rho _k+R^{(0)}_k(x)+ H(x+\rho _k+R_k^{(0)}(x))=x+H(x)+\rho _{k},\ \mathrm {where\ } h(x)=x+H(x), \end{aligned}$$

whose linearization is

$$\begin{aligned} H(x+\rho _k)-H(x)=-R^{(0)}_k(x). \end{aligned}$$
(2.1)

Taking Fourier expansions \(H(x)=\sum _{n\in {\mathbb {Z}}^N} {\hat{H}}_n e^{2\pi i \langle n,x\rangle }\) and \(R^{(0)}_k(x)=\sum _{n\in {\mathbb {Z}}^N} {\hat{R}}^{(0)}_{k,n} e^{2\pi i \langle n,x\rangle }\), we get for \(n\ne 0\)

$$\begin{aligned} {\hat{H}}_n(e^{2\pi i\langle \rho _k, n\rangle }-1)=-{\hat{R}}^{(0)}_{k,n}. \end{aligned}$$
(2.2)

The commutativity condition \(T_kT_j=T_jT_k\) gives

$$\begin{aligned}&x+\rho _j+R^{(0)}_j(x)+\rho _k+R^{(0)}_k(x+\rho _j+R^{(0)}_j(x))\\&\quad =x+\rho _k+R^{(0)}_k(x)+\rho _j+R^{(0)}_j(x+\rho _k+R^{(0)}_k(x)), \end{aligned}$$

whose linearization is

$$\begin{aligned} R^{(0)}_k(x+\rho _j)-R^{(0)}_k(x)=R^{(0)}_j(x+\rho _k)-R^{(0)}_j(x). \end{aligned}$$
(2.3)

In terms of Fourier coefficients,

$$\begin{aligned} {\hat{R}}^{(0)}_{k,n}(e^{i2\pi \langle \rho _j,n\rangle }-1)={\hat{R}}^{(0)}_{j,n}(e^{i2\pi \langle \rho _k,n\rangle }-1),\quad n\in {\mathbb {Z}}^N{\setminus }\{0\}. \end{aligned}$$
(2.4)

The key point is that the commutativity equation (2.4) implies that the solution of the cohomological equation (2.2) for some k also solves the same equation for all the other \(j\ne k\).

Step 2: the Fourier cut-off and solving the cohomological equation.

We next show how to solve the cohomological equation (2.2). By the simultaneous Diophantine condition, there exists \(C>0\) such that for each \(n\in {\mathbb {Z}}^{N}{\setminus }\{0\}\), there exists \(k=k(n)\in \{1,\ldots ,m\}\) such that \(|\langle \rho _{k}, n\rangle |\ge \frac{C}{\Vert n\Vert ^\tau }\) where \(\rho _k=\rho _{\mu _k}(T_k)\) is the rotation vector.

We take a Fourier cutoff so that we can control higher order derivatives via lower order derivatives. For \(J^{(0)}\in {\mathbb {N}}\), we solve for \({\hat{H}}_n\) with \(0<|n|<J^{(0)}\) so that we have \(\Vert {{\bar{H}}}^{(1)}\Vert _{C^{\ell +\tau }}\le C(J^{(0)})^{\tau }\Vert {{\bar{H}}}^{(1)}\Vert _{C^{\ell }}\), where \({{\bar{H}}}^{(1)}(x) := \sum _{0<|n|<J^{(0)}} {\hat{H}}_n e^{2\pi i \langle n,x\rangle }\). Solving (2.2) for \(k=k(n)\), we get

$$\begin{aligned} {\hat{H}}_n=-(e^{2\pi i\langle \rho _{k(n)}, n\rangle }-1)^{-1}{\hat{R}}^{(0)}_{k(n),n},\ \mathrm {for\ all\ }0<|n|<J^{(0)}. \end{aligned}$$

Denoting \(h^{(1)}(x)=x+{{\bar{H}}}^{(1)}(x)\), we get the estimate \(\Vert {{\bar{H}}}^{(1)}\Vert _{C^{\ell }}\le C\Vert R^{(0)}\Vert _{C^{\ell +\tau }}\) by the SDC.

From equation (2.2) and (2.4), we get that \({{\bar{H}}}^{(1)}\) solves the following equation, for all k:

$$\begin{aligned} {{\bar{H}}}^{(1)}(x+\rho _k)-{{\bar{H}}}^{(1)}(x)=-\Pi _{J^{(0)}} R_k^{(0)}+{\hat{R}}_{k,0}^{(0)}, \end{aligned}$$
(2.5)

where \(\Pi _{J^{(0)}}\) denotes the projection to Fourier modes with \(|n|<{J^{(0)}}\), and the constant \({\hat{R}}_{k,0}^{(0)}\) is the 0th Fourier coefficient of \(R_k^{(0)}\).
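Steps 1 and 2 can be sketched numerically. The Python fragment below is a minimal illustration under simplifying assumptions, not the scheme of [M]: it works on the circle (\(N=1\)) with \(m=2\), stores trigonometric polynomials as dictionaries of Fourier coefficients, and builds the remainders \(R_k\) from a known \(H\) so that the relation (2.4) holds exactly. The rotation numbers, the coefficients and the function names are hypothetical.

```python
import cmath
import math

def divisor(n, rho):
    """The factor e^{2 pi i n rho} - 1 appearing in (2.2), written for N = 1."""
    return cmath.exp(2j * math.pi * n * rho) - 1.0

def solve_truncated(R_hats, rhos, cutoff):
    """Solve (2.2) for 0 < |n| < cutoff.

    R_hats[k] maps a frequency n to the Fourier coefficient of R_k.
    For each n we divide by the divisor of largest modulus among the rho_k,
    mimicking the choice k = k(n) allowed by the simultaneous Diophantine condition.
    """
    H_hat = {}
    for n in range(-cutoff + 1, cutoff):
        if n == 0:
            continue
        k = max(range(len(rhos)), key=lambda j: abs(divisor(n, rhos[j])))
        H_hat[n] = -R_hats[k].get(n, 0.0) / divisor(n, rhos[k])
    return H_hat

# Hypothetical data: two rotation numbers and a real trigonometric polynomial H_true.
rhos = [math.sqrt(2) - 1, math.sqrt(3) - 1]
H_true = {1: 0.01, -1: 0.01, 3: 0.002j, -3: -0.002j}

# Build R_k from H_true via (2.1); such R_k automatically satisfy (2.4).
R_hats = [{n: -c * divisor(n, rho) for n, c in H_true.items()} for rho in rhos]

H_hat = solve_truncated(R_hats, rhos, cutoff=8)

# Residual of (2.2) for every k and every retained frequency n.
residual = max(
    abs(H_hat[n] * divisor(n, rho) + R_k.get(n, 0.0))
    for rho, R_k in zip(rhos, R_hats)
    for n in H_hat
)
print(f"largest residual of (2.2) over all k and n: {residual:.2e}")
```

Although each coefficient is computed from a single index \(k(n)\), the residual is at rounding level for both values of k, which is the content of the remark following (2.4).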

Step 3: the iteration.

Further introduce, for \(k=1,\ldots ,m,\)

$$\begin{aligned} T^{(1)}_k=h^{(1)}T^{(0)}_k(h^{(1)})^{-1}=x+\rho _k+R_k^{(1)},\quad \hbox {and }\mu _k^{(1)}=h^{(1)}_*\mu _k^{(0)}, \end{aligned}$$

where \(R_k^{(1)}\) is defined as follows. Expanding the expression \(h^{(1)}T^{(0)}_k=T^{(1)}_kh^{(1)}\), we get for all k:

$$\begin{aligned} x+ \rho _k+R_k^{(0)}+ {{\bar{H}}}^{(1)}(x+\rho _k+R_k^{(0)})=x+{{\bar{H}}}^{(1)}(x)+\rho _k+R_k^{(1)}\circ h^{(1)}. \end{aligned}$$

Comparing with (2.5), we obtain for all k:

$$\begin{aligned} R_k^{(1)}= & {} ({{\bar{H}}}^{(1)}(x+\rho _k+R_k^{(0)})-{{\bar{H}}}^{(1)}(x+\rho _k))\circ (h^{(1)})^{-1}\nonumber \\&+(R_k^{(0)}-\Pi _{J^{(0)}} R_k^{(0)})\circ (h^{(1)})^{-1}+{\hat{R}}_{k,0}^{(0)}. \end{aligned}$$
(2.6)

Since the conjugation by \(h^{(1)}\) does not change the rotation vector, we have

$$\begin{aligned} \rho _k= & {} \rho _{\mu _k}(T_k)=\rho _{\mu _k^{(0)}}(T_k^{(0)})\\= & {} \int _{{\mathbb {T}}^N} \big ({\tilde{T}}_k^{(0)}(x)-x\big )\,d\mu ^{(0)}_k=\int _{{\mathbb {T}}^N} \big ({\tilde{T}}_k^{(1)}(x)-x\big )\,d\mu ^{(1)}_k=\rho _{\mu _k^{(1)}}(T_k^{(1)}); \end{aligned}$$

from the equation \(\rho _k=\int {\tilde{T}}_k^{(1)}(x)-x\,d \mu ^{(1)}_k\), we get \(\int R_k^{(1)}\,d\mu _k^{(1)}=0\), so that the jth component \(j=1,\ldots ,N\) of \(R_{k}^{(1)}\) vanishes at some point \(x_j\). We see from (2.6) that \({\hat{R}}_{k,0}^{(0)}\) is bounded by the \(C^0\) norm of the first two terms on the RHS. The remainder \(R_k^{(1)}\) consists of the quadratically small error discarded when deriving (2.1), as well as the higher Fourier modes with \(|n|\ge J^{(0)}\) in \(R_k^{(0)}\). We thus obtain from (2.6) that

$$\begin{aligned} \Vert R_k^{(1)}\Vert _{C^1}\le C\Vert {{\bar{H}}}^{(1)}\Vert _{C^{2}}\Vert R_k^{(0)}\Vert _{C^1}+C\Vert (R_k^{(0)}-\Pi _{J^{(0)}} R_k^{(0)})\Vert _{C^{1}}. \end{aligned}$$
(2.7)

We set \(\varepsilon ^{(1)}=\max _k \Vert R_k^{(1)}\Vert _{C^1}\).

The standard KAM method in [M] then applies by repeating the above procedure for infinitely many steps, during which we shall let \(J^{(n)}\rightarrow \infty \), \(\varepsilon ^{(n)}\rightarrow 0\). The loss of derivative in (2.7) is handled in the standard way using the quadratic smallness on the RHS of (2.7). Higher order derivative estimates are obtained by interpolation between the \(C^1\) estimate in (2.7) and the \(C^{\ell +\tau }\) estimate due to the Fourier cut-off for some large \(\ell \). In the limit, we get the conjugacy h in the statement. Since a collection of translations satisfying the simultaneous Diophantine condition is uniquely ergodic on the torus, we get that the common invariant measure \(\mu \) must equal \(h^{-1}_*\mathrm {Leb}\).

We next prove the final statement on the rotation vectors. On the one hand, we have by the conjugacy that \(\rho _{\mu _k}(T_k)=\rho _{h_*\mu _k}({{\bar{T}}}_k)\) and \(\rho _{\mathrm {Leb}}({{\bar{T}}}_k)=\rho _{\mu }(T_k).\) On the other hand, since we have \({{\bar{T}}}_k(x)=x+\rho _{\mu _k}(T_k)\), we get \(\rho _{\mathrm {Leb}}({{\bar{T}}}_k)=\rho _{\mu _k}(T_k)\) by definition of rotation vectors. This completes the proof. \(\quad \square \)

Now we are ready to prove local rigidity.

Proof of Theorem 1.4 (local rigidity)

Let \(T_i=\alpha (g_i)\) and let

$$\begin{aligned} {{\bar{T}}}_i={\bar{\alpha }}({{\bar{A}}},\varvec{\rho })(g_i): x\mapsto x+\rho _{\mu _i}(T_i),\,\; i=1,\ldots ,N. \end{aligned}$$

Using the commutativity of the \(T_j\) and the simultaneous Diophantine condition, we apply Proposition 2.1 to construct h that simultaneously conjugates \(T_i\) to \({{\bar{T}}}_i\):

$$\begin{aligned} h\circ T_i = {{\bar{T}}}_i \circ h, \,\; i=1,\ldots ,N. \end{aligned}$$

We then compose with \(h^{-1}\) on the right and h on the left on both sides of the group relation \(AT_i = (\prod _{j=1}^N T_j^{b_{ji}})A\) to get

$$\begin{aligned} h Ah^{-1} {{\bar{T}}}_i=(\prod _{j=1}^N {{\bar{T}}}_j^{b_{ji}})hAh^{-1}; \end{aligned}$$

in other words,

$$\begin{aligned} hAh^{-1}(x+\rho _i)=hAh^{-1}(x)+\sum _{j=1}^N b_{ji}\rho _j,\mathrm {\ mod\ }{\mathbb {Z}}^N,\quad i=1,\ldots ,N. \end{aligned}$$
(2.8)

We introduce the function \(F(x) = hAh^{-1}(x)-{{\bar{A}}}x\) defined from \({\mathbb {T}}^N\) to \({\mathbb {T}}^N\). We can choose a homotopy connecting h to the identity under which F is homotopic to \(A-{{\bar{A}}}\). Since A is homotopic to \({{\bar{A}}}\), the image of \(A-{{\bar{A}}}\) is homotopic to a point. Therefore we can treat F as a continuous function from \({\mathbb {T}}^N\) to \({\mathbb {R}}^N\). Combined with (1.3), equation (2.8) then gives \(F(x + \rho _i) = F(x),\ \mathrm {mod}\ {\mathbb {Z}}^N,\; i=1,\ldots ,N.\) Continuity of F implies that \(F(x + \rho _i) - F(x)\) is a constant integer vector. We may choose \(n\in {\mathbb {Z}}\) such that \(n\rho _i\) mod \({\mathbb {Z}}^N\) is arbitrarily close to zero, and so by the continuity of F, this constant integer vector has to be zero. We thus obtain that \( F(x + \rho _i) = F(x)\). The Diophantine property of the vectors \(\rho _1,\ldots ,\rho _N\) implies that the action generated by the \({{\bar{T}}}_i\) on \({\mathbb {T}}^N\) is ergodic with respect to \(\hbox {Leb}\). Since the function F(x) is invariant, there is a vector \(F_0\in {\mathbb {R}}^N\) such that \(F(x)=hAh^{-1}(x)-{{\bar{A}}}x=F_0\) almost everywhere (but in fact everywhere, since F is continuous).

To kill this constant vector \(F_0\), we introduce the translation \(t(x) = x + (\mathrm {id}- {{\bar{A}}})^{-1}F_0\). It is easy to check that t conjugates \({{\bar{A}}} x + F_0\) and \({{\bar{A}}}x\), i.e. \({{\bar{A}}}t(x) + F _0= t({{\bar{A}}} x)\). Composing the above h with t, we get the conjugacy in the statement of the theorem.\(\quad \square \)
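The conjugation identity invoked in the last step can be verified in one line. The check uses only that \({{\bar{A}}}\) is hyperbolic, so that 1 is not an eigenvalue and \(\mathrm {id}-{{\bar{A}}}\) is invertible; the symbol \(v=(\mathrm {id}-{{\bar{A}}})^{-1}F_0\) is a shorthand introduced here for this purpose only. Since \(F_0=(\mathrm {id}-{{\bar{A}}})v\) and \(t(x)=x+v\), we have

$$\begin{aligned} {{\bar{A}}}t(x)+F_0 = {{\bar{A}}}x+{{\bar{A}}}v+(\mathrm {id}-{{\bar{A}}})v = {{\bar{A}}}x+v = t({{\bar{A}}}x). \end{aligned}$$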

3 The Existence of the Common Conjugacy

In this section, we prove Theorem 1.7. We will use the following result of Franks [Fr].

Theorem 3.1

If \(A : {\mathbb {T}}^N\rightarrow {\mathbb {T}}^N\) is an Anosov diffeomorphism, then A is topologically conjugate to a hyperbolic toral automorphism induced by \(A_*:H_1({\mathbb {T}}^N,{\mathbb {Z}})\rightarrow H_1({\mathbb {T}}^N,{\mathbb {Z}})\).

This result has been generalized to the infranilmanifold case by Manning. It is also known ([KH] Theorem 19.1.2) that the conjugacy h is bi-Hölder; i.e. both h and \(h^{-1}\) are Hölder continuous.

Proof of Theorem 1.7

Suppose we are given an action \(\alpha :\Gamma _{{{\bar{B}}}}\rightarrow \mathrm {Diff}^r({\mathbb {T}}^N)\) such that \(\alpha (g_0)=A\) is Anosov and homotopic to \({{\bar{A}}}\), and \(\alpha (g_i)= T_i\), \(i=1,\ldots ,N\) has c-slow oscillation, where c satisfies (1.9). For \(p=(p_1,\ldots ,p_N)\in {\mathbb {Z}}^N\), we use the notation \(T^p\) to denote \(\prod _{i=1}^N T_i^{p_i}\). By Theorem 3.1, there is a homeomorphism h such that \(hAh^{-1} = {{\bar{A}}}\). Let \(R_i:=h T_ih^{-1}\), for \(i = 1,\ldots , N\). We will show that \(R_i(x) = x + \rho _i\) where \(\rho _i\) is the rotation vector of \(T_i\). We lift h to \({\tilde{h}}:{\mathbb {R}}^N\rightarrow {\mathbb {R}}^N\) and decompose

$$\begin{aligned} {\tilde{h}}(x) = x + g(x),\quad {\tilde{h}}^{-1}(x) = x + g_-(x),\quad {\tilde{T}}^p(x) = x+\Delta T_p(x), \end{aligned}$$

for \(p\in {\mathbb {Z}}^N\) where g(x), \(g_-(x)\) and \(\Delta T_p(x)\) are \({\mathbb {Z}}^N\)-periodic.

For \(p\in {\mathbb {Z}}^N\) and \(x\in {\mathbb {R}}^N\), we have

$$\begin{aligned} \begin{aligned} {\tilde{R}}^p(x)&= {\tilde{h}} {\tilde{T}}^p{\tilde{h}}^{-1}(x) \\&= {\tilde{T}}^p{\tilde{h}}^{-1}(x) + g({\tilde{T}}^p{\tilde{h}}^{-1}(x))\\&={\tilde{h}}^{-1}(x) + \Delta T_p({\tilde{h}}^{-1}(x)) + g({\tilde{T}}^p{\tilde{h}}^{-1}(x))\\&= x +\Delta T_p({\tilde{h}}^{-1}(x)) + g_-(x) + g({\tilde{T}}^p{\tilde{h}}^{-1}(x)). \end{aligned} \end{aligned}$$

Since both \(g_-\) and g are uniformly bounded, it follows that if \(\{T^p\}\) has c-slow oscillation, then so does \(\{R^p\}\). From the group relation, we obtain for \(p\in {\mathbb {Z}}^N\) and \(n\in {\mathbb {Z}}\),

$$\begin{aligned} {{\bar{A}}}^n{\tilde{R}}^p(x)= & {} {\tilde{R}}^{({{\bar{B}}}^t)^np} {{\bar{A}}}^n(x)+Q_{p,n},\quad {{\bar{A}}}^n({\tilde{R}}^p(x)- x) \nonumber \\= & {} ({\tilde{R}}^{({{\bar{B}}}^t)^np} - \mathrm {id}){{\bar{A}}}^n(x)+Q_{p,n}, \end{aligned}$$
(3.1)

where \(Q_{p,n}\) is an integer vector in \({\mathbb {Z}}^N\) depending on p, n and the choice of the lifts.

For each \({\tilde{R}}^p,\ p \in {\mathbb {Z}}^N\), we take the Fourier expansion \({\tilde{R}}^p(x)- x = \sum _{k\in {\mathbb {Z}}^N} {\hat{R}}_k(p)e^{2\pi i\langle k,x\rangle }\), where the coefficient for \(k\ne 0\) is

$$\begin{aligned} {\hat{R}}_k(p) =\int _{{\mathbb {T}}^N} ({\tilde{R}}^p(x)-x)e^{-2\pi i\langle k,x\rangle }\, dx. \end{aligned}$$

The condition that \(\{R^p\}\) has c-slow oscillation implies that there exist C, P such that when \(\Vert p\Vert \ge P\), we have \(\Vert {\hat{R}}_k(p)\Vert \le C\Vert p\Vert ^c\), uniformly for all \(k\ne 0\). From equation (3.1) we obtain that for all \(k\in {\mathbb {Z}}^N{\setminus }\{0\}\),

$$\begin{aligned} {\hat{R}}_k(p) = {{\bar{A}}}^{-n}{\hat{R}}_{({{\bar{A}}}^t)^{-n}k}(({{\bar{B}}}^t)^np). \end{aligned}$$
(3.2)

We next consider the splitting of \({\mathbb {R}}^N\) into \({\bar{{\mathcal {W}}}}^u(0)\oplus {\bar{{\mathcal {W}}}}^s(0)\), the direct sum decomposition into unstable and stable eigenspaces of \({{\bar{A}}}\). Each \({\hat{R}}_k(p)\) is a vector, so we write \({\hat{R}}_k(p)=({\hat{R}}_k(p))^u+({\hat{R}}_k(p))^s\) where \(({\hat{R}}_k(p))^{u,s}\in {\bar{{\mathcal {W}}}}^{u,s}(0)\). Applying \({{\bar{A}}}^{n}\) we get \({{\bar{A}}}^{n}{\hat{R}}_k(p)={{\bar{A}}}^{n}({\hat{R}}_k(p))^u+{{\bar{A}}}^{n}({\hat{R}}_k(p))^s\) with the estimate \(\Vert {{\bar{A}}}^{n}({\hat{R}}_k(p))^u\Vert \ge |\lambda _1^u|^n\Vert ({\hat{R}}_k(p))^u\Vert . \) Applying this decomposition to equation (3.2), we obtain the following estimate for \(\Vert ({{\bar{B}}}^t)^np\Vert \ge P\):

$$\begin{aligned} \Vert ({\hat{R}}_k(p))^u\Vert \le \frac{1}{|\lambda _1^u|^{n}} \Vert {\hat{R}}_{({{\bar{A}}}^t)^{-n}k}(({{\bar{B}}}^t)^{n}p)\Vert \le C\frac{\Vert ({{\bar{B}}}^t)^np\Vert ^c}{|\lambda _1^u|^{n}}\le C\Vert p\Vert ^c\left( \frac{|\mu ^u_{k'}|^c}{|\lambda _1^u|}\right) ^n\rightarrow 0 \end{aligned}$$

as \(n\rightarrow \infty \), if \(c< \frac{\ln |\lambda _1^u|}{\ln |\mu _{k'}^u|}\). Similarly, letting \(n\rightarrow -\infty \) and projecting to \({\bar{{\mathcal {W}}}}^s(0)\) in the above argument, we get that the projection of \({\hat{R}}_k(p)\) to \({\bar{{\mathcal {W}}}}^s(0)\) is also 0. Therefore \({\hat{R}}_k(p) = 0\) for all \(k\ne 0\). This implies that each \(R^p(x)- x,\ p \in {\mathbb {Z}}^N\), is a constant. Since a conjugacy does not change the rotation vector, we have \(R_i(x) = x + \rho _i\), where \(\rho _i\) is the rotation vector of \(T_i\), \(i=1,\ldots ,N\). Consequently \(R^p(x) = x + \varvec{\rho }p\), \(p\in {\mathbb {Z}}^N\). This completes the proof. \(\quad \square \)
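The exponent condition at the end of the proof is easy to evaluate in dimension two. The Python sketch below is an illustration under stated assumptions rather than part of the argument: it assumes \(N=2\), that \(\lambda _1^u\) is the unstable eigenvalue of \({{\bar{A}}}\), and that \(\mu ^u_{k'}\) is the eigenvalue of largest modulus of \({{\bar{B}}}\), and it describes both matrices by their traces only. The function names and the sample traces are hypothetical.

```python
import math

def unstable_eigenvalue(trace):
    """Largest-modulus eigenvalue of a hyperbolic matrix in SL(2, Z) with the given trace."""
    if abs(trace) <= 2:
        raise ValueError("a matrix in SL(2, Z) is hyperbolic only if |trace| > 2")
    root = math.sqrt(trace * trace - 4)
    # The roots of x^2 - trace * x + 1 are (trace +/- root) / 2; take the larger modulus.
    return (trace + math.copysign(root, trace)) / 2.0

def oscillation_threshold(trace_A, trace_B):
    """The bound ln|lambda^u| / ln|mu^u| on the exponent c (assumed reading of the symbols)."""
    return math.log(abs(unstable_eigenvalue(trace_A))) / math.log(abs(unstable_eigenvalue(trace_B)))

def decay_factor(trace_A, trace_B, c):
    """The ratio |mu^u|^c / |lambda^u| whose n-th power appears in the final estimate."""
    return abs(unstable_eigenvalue(trace_B)) ** c / abs(unstable_eigenvalue(trace_A))

print(oscillation_threshold(3, 3))   # equal traces
print(decay_factor(3, 3, 0.5))       # below 1, so the n-th powers tend to zero
print(oscillation_threshold(3, 7))   # unequal traces, used only to exercise the formula
```

With equal traces, as in Theorem 1.8, the threshold equals 1, which is consistent with the sub-linear oscillation hypothesis there.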

4 Preliminaries: Elliptic and Hyperbolic Dynamics

In this section, we explain and develop techniques from elliptic and hyperbolic dynamics that we will use to prove our main results. We first introduce the framework of Herman–Yoccoz–Katznelson–Ornstein for obtaining regularity of the conjugacy of circle maps and generalize it to abelian group actions on \({\mathbb {T}}^N\). Next, we state facts about Anosov diffeomorphisms, including the invariant foliation structure and its regularity properties.

4.1 Elliptic dynamics: the framework of Herman–Yoccoz–Katznelson–Ornstein

In this section, we generalize to abelian group actions the framework of Herman–Yoccoz theory for circle maps after Katznelson-Ornstein.

Definition 4.1

Let \({\mathcal {F}}\) be a continuous foliation of \({\mathbb {T}}^N\) by one-dimensional uniformly \(C^1\) leaves \({\mathcal {F}}(x),\ x\in {\mathbb {T}}^N\), and let \(k\ge 1\).

  1. (1)

    We denote by \({\mathcal {H}}^k = {\mathcal {H}}^k({\mathbb {T}}^N)\) the group of \(C^k\) diffeomorphisms on \({\mathbb {T}}^N\), \(k\in {\mathbb {N}}\).

  2. (2)

    We denote by \({\mathcal {H}}^k_{\mathcal {F}}\) the subgroup of diffeomorphisms in \({\mathcal {H}}^k\) preserving the foliation \({\mathcal {F}}\); i.e., \(f {\mathcal {F}}(x)={\mathcal {F}}(f(x)),\ \forall \ f\in {\mathcal {H}}^k_{\mathcal {F}}\) and \(\forall \ x\in {\mathbb {T}}^N\).

  3. (3)

    The \(C^k\) norm \(\Vert \cdot \Vert _{C^k({\mathcal {F}})}\) on \(C^k({\mathbb {T}}^N, {\mathbb {R}}^{N})\) along the foliation \({\mathcal {F}}\) is defined as follows. For \(\varphi \in C^k({\mathbb {T}}^N, {\mathbb {R}}^{N})\), let

    $$\begin{aligned} \Vert \varphi \Vert _{C^k({\mathcal {F}})}:=\sum _{i=0}^k\sup _x\Vert D_x^i \left( \varphi \vert _{{\mathcal {F}}}\right) \Vert , \end{aligned}$$

    where the norm inside the summand on the right hand side is the operator norm induced by the Euclidean metric restricted to the leaves of \({\mathcal {F}}\).

4.1.1 Generalization of the framework of Herman after Katznelson–Ornstein

The following statement about circle maps was known to Herman [H]:

Suppose that \(f\in {\mathcal {H}}^k({\mathbb {T}}^1)\) takes the form \(f(x)=h^{-1}(h(x)+\rho )\), where \(h:{\mathbb {T}}^1\rightarrow {\mathbb {T}}^1\) is a homeomorphism and \(\rho \notin {\mathbb {Q}}\). Then \(h \in {\mathcal {H}}^k\) if and only if the iterates \(\{ f^j\}_{j\in {\mathbb {Z}}}\) are uniformly bounded in \({\mathcal {H}}^k\).

Following Katznelson–Ornstein [KO], we generalize this statement to abelian subgroup actions.

Definition 4.2

A collection of m vectors \(\rho _1,\ldots ,\rho _m\in {\mathbb {T}}^N\) is said to rationally generate \({\mathbb {T}}^N\) if \(\{\sum _{i=1}^m p_i\rho _i,\ p_i\in {\mathbb {Z}},\ i=1,\ldots ,m\}\) is dense in \({\mathbb {T}}^N\).

Given commuting diffeomorphisms \(T_1,\ldots , T_m\) on \({\mathbb {T}}^N\) and \(p=(p_1,\ldots ,p_m)\in {\mathbb {Z}}^m\), we will use the abbreviation \(T^p=\prod _{i=1}^m T_i^{p_i}\).

Proposition 4.1

Suppose that for some \(k>0\), the maps \(T_i\in {\mathcal {H}}^k({\mathbb {T}}^N),\ i=1,\ldots ,m\), commute. Suppose also that there exists a homeomorphism \(h: \ {\mathbb {T}}^N\rightarrow {\mathbb {T}}^N\) such that \(T_i = h^{-1}{{\bar{T}}}_ih\), where \({{\bar{T}}}_i(x) = x + \rho _i, \mathrm {mod\ } {\mathbb {Z}}^N,\ i = 1, 2,\ldots ,m\), and \(\rho _1,\ldots ,\rho _m\) rationally generate \({\mathbb {T}}^N\). Fix a lift \({\tilde{h}}:{\mathbb {R}}^N\rightarrow {\mathbb {R}}^N\) of h.

Then the following equality holds for all x:

$$\begin{aligned} {\tilde{h}}(x)= \mathrm {const}. + \lim _{n\rightarrow \infty } \frac{1}{(2n + 1)^{m}} \sum _{\Vert p\Vert _{\ell ^\infty }\le n}\left( \tilde{T}^p(x)-\varvec{\rho }p\right) , \end{aligned}$$
(4.1)

where \(p = (p_1,\ldots , p_m)\in {\mathbb {Z}}^m\), \(\varvec{\rho }= (\rho _1,\ldots ,\rho _m)\in {\mathbb {T}}^{N\times m}\) and \({\tilde{T}}^p\) is the lift of \(T^p\) satisfying \({\tilde{T}}^p(0)={\tilde{h}}^{-1}({\tilde{h}}(0)+\varvec{\rho }p)\).

Proof of Proposition 4.1

From \(T^p = h^{-1}{{\bar{T}}}^{p}h\), we get \( T^p(x) = h^{-1}( h(x) + \varvec{\rho }p)\). We next fix the lift \({\tilde{T}}^p\) of \(T^p\) that satisfies \({\tilde{T}}^p(0)={\tilde{h}}^{-1}({\tilde{h}}(0)+\varvec{\rho }p)\) to obtain

$$\begin{aligned} {\tilde{T}}^p(x)-\varvec{\rho }p-{\tilde{h}}(x) = ({\tilde{h}}^{-1}-\mathrm {id})\circ ({\tilde{h}}(x) + \varvec{\rho }p). \end{aligned}$$

Averaging over all \(p\in {\mathbb {Z}}^m\) with \(\Vert p\Vert _{\ell ^\infty }\le n\), and letting \(n\rightarrow \infty \), we get

$$\begin{aligned} {\tilde{h}}(x)= -\int _{{\mathbb {T}}^N}({\tilde{h}}^{-1}(x)- x) dx + \lim _{n\rightarrow \infty }\frac{1}{ (2n + 1)^{m}} \sum _{\Vert p\Vert _{\ell ^\infty }\le n}\left( \tilde{T}^p(x)-\varvec{\rho }p\right) , \end{aligned}$$

where, to get the integral, we use the fact that the affine action of \({\mathbb {Z}}^m\) via the rigid translations \({{\bar{T}}}_i,\ i=1,\ldots ,m\) is ergodic with respect to Lebesgue, combined with a version of the Birkhoff ergodic theorem for abelian group actions (cf. Theorem 1.1 of [L]). \(\quad \square \)
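The averaging formula (4.1) can be tested numerically in the simplest case. The following Python sketch assumes \(m=N=1\), an explicit conjugacy \({\tilde{h}}(x)=x+\frac{a}{2\pi }\sin (2\pi x)\) and a golden-mean rotation number; these choices, the constants and the function names are hypothetical and serve only to illustrate the statement.

```python
import math

A = 0.3                              # size of the perturbation in h (hypothetical)
RHO = (math.sqrt(5.0) - 1.0) / 2.0   # an irrational rotation number (hypothetical)

def h_lift(x):
    """Lift of the conjugacy: h(x) = x + (A / 2 pi) sin(2 pi x), increasing since |A| < 1."""
    return x + A / (2.0 * math.pi) * math.sin(2.0 * math.pi * x)

def h_lift_inverse(y, iterations=60):
    """Invert h_lift by iterating the contraction x -> y - (A / 2 pi) sin(2 pi x)."""
    x = y
    for _ in range(iterations):
        x = y - A / (2.0 * math.pi) * math.sin(2.0 * math.pi * x)
    return x

def T_lift_power(x, p):
    """Lift of T^p, normalized as in Proposition 4.1: h^{-1}(h(x) + rho * p)."""
    return h_lift_inverse(h_lift(x) + RHO * p)

def averaged_orbit(x, n):
    """The average in (4.1) for m = N = 1: mean of T^p(x) - rho * p over |p| <= n."""
    total = sum(T_lift_power(x, p) - RHO * p for p in range(-n, n + 1))
    return total / (2 * n + 1)

for x in (0.1, 0.37):
    print(x, averaged_orbit(x, 500), h_lift(x))
```

The two printed values in each row should agree up to a constant that does not depend on x. In this particular example the constant vanishes, because \({\tilde{h}}^{-1}-\mathrm {id}\) is odd and periodic and therefore has zero mean.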

Corollary 4.2

Let the abelian group \({\mathcal {A}}=\{T^p:p\in {\mathbb {Z}}^m\}\ (<{\mathcal {H}}^k({\mathbb {T}}^N))\) and the conjugacy h be as in Proposition 4.1.

  1. (1)

    Let \({\bar{{\mathcal {F}}}} = \{{\bar{{\mathcal {F}}}}(x),\ x\in {\mathbb {T}}^N\}\) be an affine foliation of \({\mathbb {T}}^N\) by parallel lines. Let \({\mathcal {F}}\) be the (topological) foliation of \({\mathbb {T}}^N\) whose leaves are \({\mathcal {F}}(x) = h^{-1}({\bar{{\mathcal {F}}}}(h(x)))\), \(x\in {\mathbb {T}}^N\).

  2. (2)

    Assume the leaves \({\mathcal {F}}(x)\) of the foliation \({\mathcal {F}}\) are uniformly \(C^1\). Note that this implies that \({\mathcal {A}}< {\mathcal {H}}^k_{\mathcal {F}}({\mathbb {T}}^N)\).

If the set \(\{T^p(x)-x:p\in {\mathbb {Z}}^m\} \subset C^k({\mathbb {T}}^N,{\mathbb {R}}^N)\) is precompact in the \(\Vert \cdot \Vert _{C^k({\mathcal {F}})}\) norm, then h is uniformly \(C^k\) along the leaves of \({\mathcal {F}}\). Moreover, in the case of \(k=1\), we also have that \(h^{-1}\) is uniformly \(C^1\) along the leaves of \({\bar{{\mathcal {F}}}}\).

The proof of Corollary 4.2 is given in Sect. 4.1.2.

Given a continuous increasing function \(\psi (x):{\mathbb {R}}_{\ge 0}\rightarrow {\mathbb {R}}_{\ge 0}\) with \(\psi (0)= 0\), we say that a function \(f: (X,d)\rightarrow (X',d')\) between two metric spaces has modulus of continuity \(\psi \) at a point \(x_0\in X\), if there exists a constant \(C>0\) such that

$$\begin{aligned} d'(f(x_0),f(y))\le C\psi (d(x_0,y)), \end{aligned}$$

for any \(y \in X\) sufficiently close to \(x_0\).

Proposition 4.3

Let the abelian group \({\mathcal {A}}\), the conjugacy h, and the foliations \({\mathcal {F}}\), \({\bar{{\mathcal {F}}}}\) be as in Corollary 4.2. Assume that \(\{T^p(x)-x:p\in {\mathbb {Z}}^m\} \subset C^1({\mathbb {T}}^N,{\mathbb {R}}^N)\) is uniformly bounded in the \(\Vert \cdot \Vert _{C^1({\mathcal {F}})}\) norm and that the mapping

$$\begin{aligned} \varvec{\rho }p\ (\mathrm {mod\ } {\mathbb {Z}}^N) \mapsto \Vert T^p(x)-x\Vert _{C^1({\mathcal {F}})} \end{aligned}$$

has modulus of continuity \(\psi \) at \(\varvec{\rho }p=0\). Then both \(\Vert D\left( h\vert _{{\mathcal {F}}}\right) \Vert \) and \(\Vert D\left( h^{-1}\vert _{{\bar{{\mathcal {F}}}}}\right) \Vert \) have modulus of continuity \(\psi \) with respect to the Euclidean metric.

The proof of Proposition 4.3 is given in Sect. 4.1.3.

4.1.2 Proof of Corollary 4.2

We only prove the case of \(k=1\). A similar argument gives the continuity of higher derivatives.

We denote the nth Birkhoff average on the right-hand side of (4.1) by \({\tilde{S}}_n\), so (4.1) can be rephrased as \({\tilde{h}}=\lim _{n\rightarrow \infty } {\tilde{S}}_n\) up to an additive constant. Since \({\mathcal {A}}\) is assumed to be precompact in \({\mathcal {H}}_{\mathcal {F}}^1\) and the pointwise convergence is given by (4.1), we have that \(\{D{\tilde{S}}_n|_{\mathcal {F}}: n\ge 1\}\) is precompact in the \(C^0\) operator norm, by Theorem 5.35 of [AB], which states that the closed convex hull of a compact set is compact in a completely metrizable locally convex space. This shows that h is differentiable along \({\mathcal {F}}\) and any subsequential limit of \(\{D{\tilde{S}}_n|_{\mathcal {F}}: n\ge 1\}\) is \(Dh|_{\mathcal {F}}\).

We have proved that \(Dh|_{\mathcal {F}}\) is continuous. To show that \(Dh^{-1}|_{{\bar{{\mathcal {F}}}}}\) is also continuous, by the implicit function theorem, it is enough to show that \(\Vert D_xh|_{{\mathcal {F}}}\Vert \) is bounded away from zero. Fix \(\varepsilon > 0\) such that \(\Vert D_{x_0}h|_{\mathcal {F}}\Vert <\varepsilon \) for some \(x_0\). Then the same inequality holds in a small neighborhood \(B(x_0)\) of \(x_0\). By the ergodicity of \(T^p\), there exist finitely many \(p_i\), \(i=1,\ldots ,n\) such that \(\cup _{i=1}^nT^{p_i}(B(x_0))={\mathbb {T}}^N\). This, combined with the equation

$$\begin{aligned} D_{T^p(x)}h |_{{\mathcal {F}}}\cdot D_xT^p|_{\mathcal {F}}=D_xh|_{\mathcal {F}}, \end{aligned}$$

implies that there exists a constant C independent of p such that

$$\begin{aligned} \Vert D_xh|_{\mathcal {F}}\Vert <C\varepsilon , \end{aligned}$$
(4.2)

for all \(x\in {\mathbb {T}}^N\).
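To spell out how the constant in (4.2) can arise (a sketch; the point z and the explicit formula for C are introduced here only for illustration): any \(x\in {\mathbb {T}}^N\) can be written as \(x=T^{p_i}(z)\) with \(z\in B(x_0)\) for some i, and the displayed equation at the point z with \(p=p_i\) gives

$$\begin{aligned} D_xh|_{\mathcal {F}}=D_zh|_{\mathcal {F}}\cdot \left( D_zT^{p_i}|_{\mathcal {F}}\right) ^{-1},\qquad \Vert D_xh|_{\mathcal {F}}\Vert <\varepsilon \cdot \max _{1\le i\le n}\sup _{z\in {\mathbb {T}}^N}\left\| \left( D_zT^{p_i}|_{\mathcal {F}}\right) ^{-1}\right\| =:C\varepsilon . \end{aligned}$$

In this form C depends only on the finitely many \(p_i\) fixed above, and not on x.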

Consider a leaf \({\mathcal {F}}(x)\) and \(x'\in {\mathcal {F}}(x)\). We lift the leaf to the universal cover and consider the image of the segment between x and \(x'\) under \({\tilde{h}}\), i.e. the line segment between \({\tilde{h}}(x)\) and \({\tilde{h}}(x')\). Since \({\tilde{h}}(x)=x+g(x)\) where g(x) is \({\mathbb {Z}}^N\) periodic, choosing x and \(x'\) far apart on \({\mathcal {F}}(x)\) we can make \(\Vert {\tilde{h}}(x')-{\tilde{h}}(x)\Vert \ge 1.\) Fix such a choice of \(x,x'\). There is a \(C^1\) curve \(\gamma _{x,x'}\subset {\mathcal {F}}(x)\) connecting x to \(x'\) with length bounded by a constant \(C_{x,x'}\). The image \(h(\gamma _{x,x'})\) is a \(C^1\) curve connecting h(x) and \(h(x')\) with length larger than 1. Inequality (4.2) implies that

$$\begin{aligned} 1\le \Vert {\tilde{h}}(x)-{\tilde{h}}(x')\Vert \le \int _{\gamma _{x,x'}}\Vert Dh|_{\mathcal {F}}\Vert \le \varepsilon C_{x,x'}. \end{aligned}$$

This implies that \(\varepsilon \ge C_{x,x'}^{-1}\), and so \(\Vert Dh|_{\mathcal {F}}\Vert \) is uniformly bounded below, completing the proof. \(\quad \square \)

4.1.3 Proof of Proposition 4.3

We first introduce a translation-invariant distance \(d_{\mathcal {F}}^1\) on \({\mathcal {A}}\) that is equivalent to the \(C^1\) norm as follows (cf. [K]). Let \(B_1\) be the set of \(\phi \in C({\mathbb {T}}^N,{\mathbb {R}})\) with \(\Vert \phi \Vert _{C^1({\mathcal {F}})}=1\). For \(f,g\in {\mathcal {H}}^k_{\mathcal {F}}\), we set \(d^1_{\mathcal {F}}(f,g):=d^1_{\mathcal {F}}(f g^{-1},\mathrm {id})\), where \(d^1_{\mathcal {F}}(f):=d^1_{\mathcal {F}}(f,\mathrm {id})\) is given by

$$\begin{aligned} d^1_{\mathcal {F}}(f):=\log (\max \{\Phi (f),\Phi (f^{-1})\})+\sup _x\Vert f(x)-x\Vert , \end{aligned}$$

where \(\Phi (f):=\sup _{\phi \in B_1}\Vert \phi \circ f\Vert _{C^1({\mathcal {F}})}.\) To verify the triangle inequality, we note that

$$\begin{aligned} \begin{aligned} \Phi (fg)&=\sup _{\phi \in B_1}\Vert \phi \circ (fg)\Vert _{C^1({\mathcal {F}})}=\sup _{\phi \in B_1}\left\| \frac{1}{\Vert \phi \circ f\Vert _{C^1({\mathcal {F}})}}\phi \circ (fg)\right\| _{C^1({\mathcal {F}})}\cdot \Vert \phi \circ f\Vert _{C^1({\mathcal {F}})}\\&\le \sup _{\psi \in B_1} \Vert \psi \circ g\Vert _{C^1({\mathcal {F}})}\sup _{\phi \in B_1} \Vert \phi \circ f\Vert _{C^1({\mathcal {F}})}=\Phi (f)\Phi (g). \end{aligned} \end{aligned}$$
(4.3)

The chain rule implies that \(d_{\mathcal {F}}^1\) is equivalent to the \(C^1\) distance \(\sup _x \Vert f(x)-x\Vert _{C^1({\mathcal {F}})}\).
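For completeness, here is one way in which (4.3) yields the triangle inequality; this is a sketch, and the shorthand \(M(f):=\max \{\Phi (f),\Phi (f^{-1})\}\) is introduced only here. Applying (4.3) to fg and to \((fg)^{-1}=g^{-1}f^{-1}\) gives \(M(fg)\le M(f)M(g)\), and therefore

$$\begin{aligned} d^1_{\mathcal {F}}(fg)&=\log M(fg)+\sup _x\Vert f(g(x))-x\Vert \\&\le \log M(f)+\log M(g)+\sup _x\Vert f(g(x))-g(x)\Vert +\sup _x\Vert g(x)-x\Vert \\&=d^1_{\mathcal {F}}(f)+d^1_{\mathcal {F}}(g), \end{aligned}$$

where the last equality uses that g is a bijection of \({\mathbb {T}}^N\). Writing \(fk^{-1}=(fg^{-1})(gk^{-1})\) then gives \(d^1_{\mathcal {F}}(f,k)\le d^1_{\mathcal {F}}(f,g)+d^1_{\mathcal {F}}(g,k)\).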

By the assumption on the modulus of continuity \(\psi \), the map from \(\varvec{\rho }p\in {\mathbb {T}}^N\) to \(C^1({\mathcal {F}})\) via \(\varvec{\rho }p\mapsto T^p\) is continuous in the \(C^1\) norm at the point \(\varvec{\rho }p=0\). By the translation invariance of the distance \(d^1_{\mathcal {F}}\), it is continuous at every point \(\varvec{\rho }p\) mod \({\mathbb {Z}}^N\). From the compactness of \({\mathbb {T}}^N\) we obtain that \(\{ T^p(x),\ p\in {\mathbb {Z}}^m\}\) is precompact in \({\mathcal {H}}^1_{\mathcal {F}}\). It then follows from Corollary 4.2 that the functions \(Dh|_{\mathcal {F}}\) and \(Dh^{-1}|_{{\bar{{\mathcal {F}}}}}\) are continuous. Differentiating the expression \(T^p(x)= h^{-1}( h(x)+\varvec{\rho }p)\) along the leaf \({\mathcal {F}}(x)\), we get that

$$\begin{aligned} D_xT^p|_{\mathcal {F}}-\mathrm {id}|_{\mathcal {F}}=\left( D_{(h(x) + \varvec{\rho }p)}h^{-1}|_{{\bar{{\mathcal {F}}}}}-D_{h(x)}h^{-1}|_{{\bar{{\mathcal {F}}}}}\right) \cdot D_xh|_{{\mathcal {F}}}. \end{aligned}$$

Since the LHS satisfies the modulus of continuity \(\psi \) by assumption, i.e.

$$\begin{aligned} \Vert D_xT^p|_{\mathcal {F}}-\mathrm {id}|_{\mathcal {F}}\Vert _{C^0}\le C \psi (\Vert \varvec{\rho }p\Vert ), \end{aligned}$$

for all p with \(\Vert \varvec{\rho }p\Vert \) small, we get

$$\begin{aligned} \Vert D_{(h(x) + \varvec{\rho }p)}h^{-1}|_{{\bar{{\mathcal {F}}}}}-D_{h(x)}h^{-1}|_{{\bar{{\mathcal {F}}}}} \Vert _{C^0}\le C \psi (\Vert \varvec{\rho }p\Vert )(\min _x\Vert D_xh|_{\mathcal {F}}\Vert )^{-1}. \end{aligned}$$

Hence \(Dh^{-1}|_{{\bar{{\mathcal {F}}}}}\) has modulus of continuity \(\psi \). To get the same modulus of continuity for \(Dh|_{{\mathcal {F}}}\), we use \(Dh|_{{\mathcal {F}}}\cdot Dh^{-1}|_{{\bar{{\mathcal {F}}}}}=\mathrm {id}|_{{\bar{{\mathcal {F}}}}}\). \(\quad \square \)
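The last step can be made explicit as follows (a sketch; the shorthand \(a(x):=D_xh|_{{\mathcal {F}}}\) and \(b({\bar{x}}):=D_{{\bar{x}}}h^{-1}|_{{\bar{{\mathcal {F}}}}}\) is introduced only here). Since \(b(h(x))\,a(x)=\mathrm {id}|_{{\mathcal {F}}}\), \(a(x)\,b(h(x))=\mathrm {id}|_{{\bar{{\mathcal {F}}}}}\) and \(h(T^p(x))=h(x)+\varvec{\rho }p\), we have

$$\begin{aligned} a(T^p(x))-a(x)&=a(T^p(x))\left( b(h(x))-b(h(x)+\varvec{\rho }p)\right) a(x),\\ \Vert a(T^p(x))-a(x)\Vert&\le \left( \max _x\Vert a(x)\Vert \right) ^2\Vert b(h(x)+\varvec{\rho }p)-b(h(x))\Vert . \end{aligned}$$

Thus the estimate for \(Dh^{-1}|_{{\bar{{\mathcal {F}}}}}\) transfers to \(Dh|_{{\mathcal {F}}}\), with the constant multiplied by \((\max _x\Vert a(x)\Vert )^2\), which is finite because \(Dh|_{{\mathcal {F}}}\) is continuous on the compact torus.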

4.2 Hyperbolic dynamics: invariant foliations of Anosov diffeomorphisms

In this section, we recall some results from hyperbolic dynamics. Our statements concern the unstable objects \({\mathcal {W}}^u\) and \(E^u\); the stable analogues also hold.

Definition 4.3

A \(C^1\) diffeomorphism \(A:{\mathbb {T}}^N\rightarrow {\mathbb {T}}^N\) is an Anosov diffeomorphism with simple Mather spectrum if there exists a DA-invariant splitting of the tangent space

$$\begin{aligned} T_x{\mathbb {T}}^N=E^s_\ell (x)\oplus \cdots \oplus E^s_1(x)\oplus E^u_1(x)\oplus \cdots \oplus E^u_k(x),\quad k+\ell =N,\ k,\ell \ge 1 \end{aligned}$$

and numbers

$$\begin{aligned} {\underline{\mu }}_\ell ^s\le {\bar{\mu }}_\ell ^s<\cdots<{\underline{\mu }}_1^s\le {\bar{\mu }}_1^s<1<{\underline{\mu }}_1^u\le {\bar{\mu }}_1^u<\cdots <{\underline{\mu }}_k^u\le {\bar{\mu }}_k^u \end{aligned}$$

such that for some constant \(C>1\),

$$\begin{aligned} \frac{1}{C}({\underline{\mu }}_i^{u,s})^n\le \frac{\Vert DA^n v\Vert }{\Vert v\Vert }\le C({\bar{\mu }}_i^{u,s})^n,\quad \forall v\in E^{u,s}_i{\setminus } \{0\}, \end{aligned}$$

where \(i=1,\ldots ,\ell \) for s and \(i=1,\ldots ,k\) for u.

The next result is classical (see [HPS]).

Proposition 4.4

For any \(C^r,\ r > 1\) Anosov diffeomorphism \(A : {\mathbb {T}}^N\rightarrow {\mathbb {T}}^N\) with simple Mather spectrum, the strong invariant distribution \(E^u_{ i\le } := E^u_{i}\oplus \ldots \oplus E^u_{k}\) is uniquely integrable, tangent to a foliation \({\mathcal {W}}^u_{i\le }\) of \({\mathbb {T}}^N\) whose leaf \({\mathcal {W}}^u_{i\le }(x)\) passing through x is \(C^r, x \in {\mathbb {T}}^N\). This gives rise to a flag of strong unstable foliations

$$\begin{aligned} {\mathcal {W}}^u_k(x)\subset {\mathcal {W}}^{u}_{(k-1)\le }(x) \subset \cdots \subset {\mathcal {W}}^u_{2\le }(x) \subset {\mathcal {W}}^u_{1\le }(x):= {\mathcal {W}}^u(x),\quad x\in {\mathbb {T}}^N, \end{aligned}$$

where each of the inclusions is proper and \({\mathcal {W}}^u_{i\le }\) sub-foliates \({\mathcal {W}}^u_{(i-1)\le }\) with \(C^r\) leaves for \(i = 2,\ldots , k\).

It is known that simple Mather spectrum is an open property in the \(C^1\) topology. In particular, if \({{\bar{A}}}\) is a toral automorphism with simple real spectrum, then an Anosov diffeomorphism that is \(C^1\) close to \({{\bar{A}}}\) has simple Mather spectrum.

Proposition 4.5

(Hölder regularity of the invariant distribution, Theorem 19.1.6 of [KH]). For each i, the distribution \(E^u_i (x)\) is Hölder in the base point x.

The Hölder exponent depends only on the expansion and contraction rates \({\bar{\mu }}_i^{u,s}\) and \({\underline{\mu }}_i^{u,s}\).

We denote the weak unstable bundles for A by \(E^u_{\le i}(x):=E^u_1(x)\oplus E^u_2(x)\oplus \cdots \oplus E^u_i(x),\) and those of \({{\bar{A}}}\) by \({{\bar{E}}}^u_{\le i}(x):={{\bar{E}}}^u_1(x)\oplus {{\bar{E}}}^u_2(x)\oplus \cdots \oplus {{\bar{E}}}^u_i(x),\) \(i = 1,\ldots ,k\), \(x\in {\mathbb {T}}^N\). Denote the unstable foliation of A by \({\mathcal {W}}^u\) and that of \({{\bar{A}}}\) by \({\bar{{\mathcal {W}}}}^u\).

Proposition 4.6

(Lemmas 6.1–6.3 of [G1]). Let A be a \(C^r\) Anosov diffeomorphism that is \(C^1\) close to a linear toral automorphism \({{\bar{A}}}\) with simple real spectrum, and let h be the bi-Hölder conjugacy given by Theorem 3.1 with \(h \circ A = {{\bar{A}}} \circ h\). Then

  1. (1)

    h preserves the unstable foliation: \(h({\mathcal {W}}^u(x)) = {\bar{{\mathcal {W}}}}^u(h(x))\), for all \(x\in {\mathbb {T}}^N\);

  2. (2)

    each weak unstable distribution \(E^u_{\le i}\) is uniquely integrable, tangent to a foliation \({\mathcal {W}}^u_{\le i}\) of \({\mathbb {T}}^N\), whose leaf \({\mathcal {W}}^u_{\le i}(x)\) passing through \(x\in {\mathbb {T}}^N\) is \(C^{1+}\);

  3. (3)

    each distribution \(E^u_{i,j}:= E^u _{\ge i} \cap E^u_{\le j},\ i \le j\), is uniquely integrable, tangent to a foliation with \(C^{1+}\) leaves;

  4. (4)

    h preserves the weak unstable foliations: \(h({\mathcal {W}}^u_{\le i}(x)) = {\bar{{\mathcal {W}}}}^u_{\le i}(h(x))\), for \(i = 1,\ldots , k\) and all \(x\in {\mathbb {T}}^N\).

We remark that the item (1) does not require any \(C^1\) closeness of A to \({{\bar{A}}}\), and it holds under the same assumption as Theorem 3.1.

From Proposition 4.6 we obtain a flag of weak foliations

$$\begin{aligned} {\mathcal {W}}^u_1 \subset {\mathcal {W}}^u_{\le 2} \subset \cdots \subset {\mathcal {W}}^u_{\le k-1}\subset {\mathcal {W}}^u_{\le k} := {\mathcal {W}}^u, \end{aligned}$$

where each of the inclusions is proper and \({\mathcal {W}}^u_{\le (i-1)}\) sub-foliates \({\mathcal {W}}^u_{\le i}\) with \(C^{1+}\) leaves for \(i = 2, \ldots , k\). This flag is preserved by the conjugacy h.

When the weak distributions \(E^u_i\) are known to be uniquely integrable, we have the following proposition, which is proved by a standard graph transform technique.

Proposition 4.7

  1. (1)

    Each weak unstable leaf \({\mathcal {W}}^u_{\le i}\) in item (2) of Proposition 4.6 is subfoliated by \({\mathcal {W}}^u_i\), whose leaves are uniformly \(C^r\).

  2. (2)

    The weakest unstable leaf \({\mathcal {W}}^u_1(x)\) is \(C^{1+}\), and its tangent distribution \(E_1^u(x)\) is Hölder.

5 Elliptic Regularity with the Help of an Invariant Distribution

In this section, we show how to get regularity of the conjugacy h by combining elliptic dynamics within invariant distributions with hyperbolic dynamics.

5.1 Elliptic dynamics within invariant distributions

We start with a few preparatory lemmas. Recall that we use the abbreviation \(T^p=\prod _{i=1}^m T_i^{p_i},\ p=(p_1,\ldots ,p_m)\in {\mathbb {Z}}^m\) where \(T_i,\ i=1,\ldots ,m\) are commuting diffeomorphisms.

Lemma 5.1

Let \({\mathcal {F}}\) be an orientable foliation of \({\mathbb {T}}^N\) with uniformly \(C^1\) leaves of dimension one. Suppose a homeomorphism h conjugates the abelian group \(\{T^p,\ p\in {\mathbb {Z}}^m\}<\mathrm {Diff}_0({\mathbb {T}}^N)\) to translations \(\{{{\bar{T}}}^p(x)=x+\varvec{\rho }p,\ p\in {\mathbb {Z}}^m\}\) and sends the foliation \({\mathcal {F}}\) to an affine foliation \({\bar{{\mathcal {F}}}}\). Denote by E(x) the one-dimensional distribution that is tangent to the leaf \({\mathcal {F}}(x)\). Then the distribution E(x) is invariant under \(DT^p\); that is,

$$\begin{aligned} D_xT^p\left( E(x)\right) = E(T^p(x)), \end{aligned}$$

for all \(p\in {\mathbb {Z}}^m\) and \(x\in {\mathbb {T}}^N\).

Proof

The straight line foliation \({\bar{{\mathcal {F}}}}\) is invariant under translations, so after the conjugation the foliation \({\mathcal {F}}\) is also invariant under \(\{T^p,\ p\in {\mathbb {Z}}^m\}.\) The lemma follows directly by differentiating the equation \(T^p {\mathcal {F}}(x) = {\mathcal {F}}(T^p(x))\) along the leaves. \(\quad \square \)

Lemma 5.2

Suppose the conjugacy h in the previous Lemma 5.1 is bi-Hölder. Then for each \(p\in {\mathbb {Z}}^m{\setminus }\{0\}\), all the Lyapunov exponents of \(T^p\) with respect to any invariant probability measure are zero.

Proof

Suppose there is an invariant measure \(\mu \) with at least one nonzero exponent. Without loss of generality, assume that this exponent is negative. Pesin theory implies that through \(\mu \)-a.e. x, there are local stable manifolds, which are smoothly embedded disks on which the iterates \(T^{np}\), \(n\ge 1\), contract distances at an exponential rate. Thus for two points \(y, y'\) on the same local stable manifold of a point x, the distance \(\Vert T^{np}(y)- T^{np} (y')\Vert \) converges to zero exponentially fast as \(n\rightarrow \infty \).

On the other hand, using the bi-Hölder conjugacy h we have

$$\begin{aligned} \Vert T^{np}(y)- T^{np}(y')\Vert= & {} \Vert h^{-1}(n\varvec{\rho }p + h(y)) - h^{-1}(n\varvec{\rho }p + h(y'))\Vert \\\ge & {} \mathrm {const.}\Vert h(y)- h(y')\Vert ^\eta , \end{aligned}$$

where \(\eta \) is the Hölder exponent of \(h^{-1}\), which gives a contradiction.\(\quad \square \)

5.2 A quantitative Kronecker theorem

We will need the following number theoretic result, whose proof is postponed to the Appendix.

Theorem 5.3

Let \(N,K\in {\mathbb {N}}\) be given. Then there exists a full measure set \({\mathcal {O}}\) in the set \({\mathcal {M}}_{N\times K}({\mathbb {T}})\) of \(N\times K\) matrices such that for all \(M\in {\mathcal {O}}\), the following holds.

For any small \(\epsilon >0\), there exists a constant C such that for any \(y \in {\mathbb {T}}^N\) and any \(n\in {\mathbb {N}}\) there exist \(q\in {\mathbb {Z}}^N\), \(p\in {\mathbb {Z}}^K\) satisfying \(\Vert p\Vert < n\), such that the following inequality holds

$$\begin{aligned} \Vert Mp-q-y\Vert \le C n^{-\frac{K}{N}+\epsilon }. \end{aligned}$$

This theorem is a quantitative version of the classical Kronecker approximation theorem. When \(K=1\), this is Dirichlet's classical simultaneous Diophantine approximation theorem, in which we can set \(\epsilon =0\). The \(N=1\) case was proved in [K].

This theorem inspires the following definition.

Definition 5.1

Suppose the m vectors \(\rho _1,\ldots ,\rho _m\in {\mathbb {T}}^N\) rationally generate \({\mathbb {T}}^N\), and consider the set of finite linear combinations

$$\begin{aligned} S:=S(\rho _1,\ldots ,\rho _m)=\left\{ \sum _{i=1}^m p_i\rho _i\ \mathrm {mod\ }{\mathbb {Z}}^N\ |\ p_i\in {\mathbb {Z}},\quad i=1,\ldots ,m\right\} . \end{aligned}$$
(5.1)

For each element \(\gamma \in S\), we denote by \(\Vert \gamma \Vert _w\) the word length \(\Vert \gamma \Vert _w:=\Vert p\Vert _{\ell _1},\) where \(p = (p_1,\ldots , p_m)\in {\mathbb {Z}}^m\) and by \(\Vert \gamma \Vert \) the closest Euclidean distance of \(\gamma \) (mod \({\mathbb {Z}}^N)\) to zero.

We say that S has dimension d if there exists a constant c such that for any \(x\in {\mathbb {T}}^N\) and any \(\ell >0\) there exists a point \(\gamma \in S\) satisfying

$$\begin{aligned} \Vert \gamma \Vert _w\le \ell ,\quad \Vert \gamma -x\Vert \le c\ell ^{-d}. \end{aligned}$$

Theorem 5.3 implies that for almost every choice of vector tuple \(\rho _1,\ldots , \rho _m \in {\mathbb {T}}^N\), the set S formed by linear combinations as above has dimension \(m/N -\epsilon \) for all \(\epsilon > 0\) small.
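This implication can be read off as follows (a sketch; the matrix M with columns \(\rho _1,\ldots ,\rho _m\) and the choice of the Euclidean norm for \(\Vert p\Vert \) are assumptions made here for illustration). Take \(K=m\) and \(M=(\rho _1,\ldots ,\rho _m)\) in Theorem 5.3, so that \(Mp-q\) represents \(\gamma =\sum _i p_i\rho _i\) mod \({\mathbb {Z}}^N\). Since \(\Vert p\Vert _{\ell _1}\le \sqrt{m}\,\Vert p\Vert \), a vector p with \(\Vert p\Vert <n\) has word length at most \(\ell :=\sqrt{m}\,n\), and

$$\begin{aligned} \Vert \gamma -y\Vert \le Cn^{-\frac{m}{N}+\epsilon }=C\,m^{\frac{1}{2}\left( \frac{m}{N}-\epsilon \right) }\,\ell ^{-\left( \frac{m}{N}-\epsilon \right) }. \end{aligned}$$

This is the inequality of Definition 5.1 with \(d=m/N-\epsilon \) and \(c=C\,m^{(m/N-\epsilon )/2}\) for \(\ell \) of the form \(\sqrt{m}\,n\); other values of \(\ell \) only change the constant c. If \(\Vert p\Vert \) in Theorem 5.3 is already the \(\ell _1\) norm, the factor \(\sqrt{m}\) is not needed.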

5.3 Organization of the proofs of Theorem 1.8 and Theorem 1.9

To prove Theorems 1.8 and 1.9, we just need to improve the regularity of the conjugacies obtained in Theorems 1.5 and 1.7, respectively. We carry this out in the following propositions.

The first proposition chooses the \(K_0\) in Theorems 1.8 and  1.9.

Proposition 5.4

Given \(\eta \in (0,1)\) and \(d>2/\eta ^2\), there exists \(K_0\) such that the following holds: for all \(K>K_0\), there exists a full measure set \({\mathcal {R}}_{N,K}\subset ({\mathbb {T}}^N)^K\) such that the set S generated by any tuple of vectors \((\rho _{1},\ldots , \rho _{K})\) lying in \({\mathcal {R}}_{N,K}\) is dense on \({\mathbb {T}}^N\) and has dimension d.

Proof of Proposition 5.4

To satisfy the inequality \(2/d<\eta ^2\), we choose \(K_0> 2N/\eta ^2\). Applying Theorem 5.3, we get a full measure set in \(({\mathbb {T}}^N)^K,\ K>K_0,\) each point of which generates a set S of dimension \(d=K/N-\epsilon \) satisfying \(2/d<\eta ^2\), where \(\epsilon \) is arbitrarily small. Next, removing further a zero measure set to guarantee that the vectors rationally generate \({\mathbb {T}}^N\), we get the full measure set \({\mathcal {R}}_{N,K}\) as claimed. \(\quad \square \)
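The arithmetic behind the choice of \(K_0\) is the following chain (the numerical values afterwards are a hypothetical illustration only):

$$\begin{aligned} K>K_0>\frac{2N}{\eta ^2}\ \Longrightarrow \ \frac{K}{N}>\frac{2}{\eta ^2}\ \Longrightarrow \ d=\frac{K}{N}-\epsilon >\frac{2}{\eta ^2}\ \ \mathrm {for}\ \ 0<\epsilon <\frac{K}{N}-\frac{2}{\eta ^2}\ \Longrightarrow \ \frac{2}{d}<\eta ^2. \end{aligned}$$

For example, if \(N=2\) and \(\eta =1/2\), then \(2N/\eta ^2=16\) and \(2/\eta ^2=8\); taking \(K_0=17\) and \(K=18\) gives \(K/N=9>8\), so any \(\epsilon <1\) is admissible.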

The next proposition gives the choice of \(\eta \) in Proposition 5.4, and will give the \(C^{1+}\) regularity of h along the one-dimensional leaves of a foliation after applying Corollary 4.2 and Proposition 4.3.

Proposition 5.5

Suppose

  1. (1)

    the abelian group \({\mathcal {A}}(<{\mathcal {H}}^r)\) is generated by

    $$\begin{aligned} \{ T_{i, j}\ |\ \ i = 1,\ldots , N,\ j = 1,\ldots , K,\ T_{i,j}T_{i',j'}=T_{i',j'}T_{i,j}\}; \end{aligned}$$
  2. (2)

    there is an \(\eta \)-bi-Hölder conjugacy h such that \(T_{i, j}(x) = h^{-1}(h(x) +\rho _{i, j})\);

  3. (3)

    there is a \(\{T_{i,j}\}\)-invariant foliation \({\mathcal {F}}\) into one-dimensional \(C^1\) leaves \({\mathcal {F}}(x)\) with tangential distributions \(E(x),\ x\in {\mathbb {T}}^N\) that is \(\eta \)-Hölder in x. Denote by v(x) a unit vector field tangent to \({\mathcal {F}}(x)\), \(x\in {\mathbb {T}}^N\);

  4. (4)

    the set S generated by the rotation vectors \(\rho _{i, j}\) has dimension d, with \(2/d < \eta ^2\).

For \(\gamma \in S\), we write

$$\begin{aligned} T_\gamma :=\prod _{j=1}^K\prod _{i=1}^N T_{i,j}^{q_{i,j}}, \end{aligned}$$

where \(q_{i, j}\in {\mathbb {Z}}\) are the coefficients in the linear combination of \(\gamma \), i.e. \(\gamma =\sum _{j}\sum _i q_{i,j}\rho _{i,j}\).

Then for all \(\gamma \in S\) with \(\Vert \gamma \Vert \) small enough, we have

$$\begin{aligned} |\Vert D_xT_\gamma v(x)\Vert _{C^0}- 1| \le \mathrm {const.}\Vert \gamma \Vert ^\nu \end{aligned}$$

where \(\nu \le \eta ^2 - 2/d \).

We defer the proof to Sect. 5.4. We next cite the following well-known theorem of Journé.

Theorem 5.6

[J]. Suppose \({\mathcal {F}}^1\), \({\mathcal {F}}^2\) are two transverse continuous foliations of a manifold M with uniformly \(C^{n,\nu }\) leaves. Suppose that a continuous function \(u : M \rightarrow {\mathbb {R}}\) is uniformly \(C^{n,\nu }\) when restricted to each local leaf \({\mathcal {F}}^1_\varepsilon (x), {\mathcal {F}}^2_\varepsilon (x),\ x \in M\). Then u is \(C^{n,\nu }\) on M.

In the 2-dimensional case, we apply Proposition 5.5 and Proposition 4.3 to get that h is \(C^{1+}\) along the stable and unstable foliations of the Anosov diffeomorphism A. Applying Theorem 5.6, we get that h is \(C^{1+}\) on \({\mathbb {T}}^2\).

An application of the next result completes the proof of Theorem 1.8. More details of the proof of Theorem 1.8 will be given in Sect. 6.1.

Theorem 5.7

[LMM, Ll]. Suppose f and g are two \(C^r,\ r > 1,\) Anosov diffeomorphisms of \({\mathbb {T}}^2\) that are topologically conjugated by h, i.e. \(f \circ h = h\circ g\). Suppose the periodic data of f and g coincide, namely, \(D_{h(x)}f^q\) is conjugate to \(D_xg^q\) at every q-periodic point x of g for all \(q\in {\mathbb {Z}}\). Then \(h \in C^{r-\varepsilon }\) for \(\varepsilon \) arbitrarily small.

The proof of Theorem 1.9 in the \(N>2\) case follows from the same general strategy. However, there is some more work needed to show that the conjugacy h sends the one-dimensional leaves \(W_i^{u,s}\) to the straight lines parallel to the eigenvectors of \({{\bar{A}}}\). We will give the proof of the \(C^{1+}\) regularity of h in Sect. 6.2.

In dimension three, we get improved regularity (Corollary 1.10) by applying the following result of Gogolev in [G2].

Theorem 5.8

(Addendum 1.2 of [G2]). Suppose \({{\bar{A}}}\in \mathrm {SL}(3,{\mathbb {Z}})\) has simple real spectrum and \(A:{\mathbb {T}}^3\rightarrow {\mathbb {T}}^3\) is a \(C^r\), \(r>3\), diffeomorphism that is \(C^1\) close to \({{\bar{A}}}\). Suppose also that \({{\bar{A}}}\) and A have the same periodic data. Then there exists \(h:{\mathbb {T}}^3\rightarrow {\mathbb {T}}^3\) in \(C^{r-3-\varepsilon }\) with \(h\circ A={{\bar{A}}}\circ h.\) Furthermore, there exists \(\kappa \in {\mathbb {Z}}\) such that if \(r\notin (\kappa ,\kappa +3)\), then \(h\in C^{r-\varepsilon }\), where \(\varepsilon \) is arbitrarily small.

5.4 Elliptic regularity in the presence of invariant distributions

In this section, we prove Proposition 5.5.

Proof of Proposition 5.5

Let \(T_\gamma \) and v(x) be as in the statement, and let \(\mu \) be any ergodic measure of \(T_\gamma \). We get from Lemma 5.2 that for \(\mu \)-a.e. x

$$\begin{aligned} \lim _k\frac{1}{k}\log \Vert D_x(T_\gamma )^k(x)v(x)\Vert =\int \log \Vert D_xT_\gamma (x)v(x)\Vert \,d\mu =0. \end{aligned}$$
(5.2)

This shows that \(\log \Vert D_xT_\gamma (x)v(x)\Vert \) vanishes at some point on \({\mathbb {T}}^N\).

To simplify notation, we reindex the \(T_{i, j}\) appearing in \(T_\gamma \) by \(T_1,\ldots ,T_{\Vert \gamma \Vert _w}\), and write \(T_\gamma =\prod _{i(\gamma )=1}^{\Vert \gamma \Vert _w}T_{i(\gamma )}\). (By the commutativity of the \(T_{i, j}\), the ordering of the pairs (i, j) appearing in \(i(\gamma )\) does not matter.) We also write \(T_{i(\gamma )}x = x_{i(\gamma )}\).

Consider the \(\eta \)-Hölder function \(\ell _{i(\gamma )}(x) :=\log \Vert D_xT_{i(\gamma )+1}v(x)\Vert \). Invariance of the distribution E implies that

$$\begin{aligned} \begin{aligned} \log \Vert D_xT_\gamma v(x)\Vert&=\log \left\| D_x \left( \prod _{i(\gamma )=1}^{\Vert \gamma \Vert _w}T_{i(\gamma )}\right) v(x)\right\| =\sum _{i(\gamma )=1}^{\Vert \gamma \Vert _w}\log \Vert D_{x_{i(\gamma )}}T_{i(\gamma )+1}v(x_{i(\gamma )})\Vert \\&=\sum _{i(\gamma )=1}^{\Vert \gamma \Vert _w}\ell _{i(\gamma )}(x_{i(\gamma )}). \end{aligned} \end{aligned}$$

To prove the proposition, it suffices to restrict attention to a neighborhood of \(\gamma =0\). We consider a dyadic decomposition of a small neighborhood of 0 by

$$\begin{aligned} D_m =\{ \gamma \in S\ |\ c2^{-d(m+1)/2} < \Vert \gamma \Vert \le c2^{-dm/2}\}, \end{aligned}$$

where c is the constant in Definition 5.1. Next, for \(D_m\), we introduce a \(c2^{-dm}\)-net by defining

$$\begin{aligned} S_m:= \{ \gamma \in D_m\ |\ \Vert \gamma \Vert _w\le 2^{m}\}. \end{aligned}$$

The remaining proof is split into two steps. In the first step, we prove the following

Claim 1

For any \(\gamma \in S_m\), we have

$$\begin{aligned} |\log \Vert D_xT_\gamma v(x)\Vert |\le \mathrm {const.}\Vert \gamma \Vert ^{\eta ^2-2/d},\ \forall \ x\in {\mathbb {T}}^N. \end{aligned}$$

Proof of Claim 1

First, by (5.2), for any given \(\gamma \in S_m\), there exists \(y\in {\mathbb {T}}^N\) such that \(\log \Vert D_yT_\gamma v(y)\Vert =0\). Next, it follows from the definition of the dimension of the set S that there exists \( \delta \in S\) with

$$\begin{aligned} \Vert \delta \Vert _w\le \Vert \gamma \Vert _w,\quad \Vert \delta +{{\bar{y}}}-{{\bar{x}}}\Vert \le c\Vert \gamma \Vert _w^{-d},\ {{\bar{x}}}=h(x),\ {{\bar{y}}}=h(y). \end{aligned}$$

We denote \(y_\delta =T_\delta y\) and \(y_\gamma =T_\gamma y.\)

Since h is bi-Hölder, we have for all \(i(\gamma )=0,1,\ldots ,\Vert \gamma \Vert _w-1\) and \(i(\delta )=0,1,\ldots ,\Vert \delta \Vert _w-1\), the following estimates

$$\begin{aligned} \Vert x_{i(\gamma )}-(y_\delta )_{i(\gamma )}\Vert \le \mathrm {const.}\Vert \gamma \Vert _w^{-d\eta },\; \hbox {and }\,\Vert y_{i(\delta )}-(y_\gamma )_{i(\delta )}\Vert \le \mathrm {const.}\Vert \gamma \Vert ^{\eta }. \end{aligned}$$

Next we estimate \(\log \Vert D_xT_\gamma v(x)\Vert \) as follows

$$\begin{aligned}&|\log \Vert D_xT_\gamma v(x)\Vert |\nonumber \\&\quad =|\log \Vert D_xT_\gamma v(x)\Vert -\log \Vert D_y(T_\gamma T_\delta ) v(y)\Vert +\log \Vert D_y(T_\delta T_\gamma ) v(y)\Vert |\nonumber \\&\quad =|\log \Vert D_xT_\gamma v(x)\Vert -\log \Vert D_{T_\delta y} T_\gamma v(T_\delta y)\Vert \nonumber \\&\qquad -\log \Vert D_y T_\delta v(y)\Vert +\log \Vert D_{T_\gamma y} T_\delta v(T_\gamma y)\Vert +\log \Vert D_yT_\gamma v(y)\Vert |\nonumber \\&\quad =|\log \Vert D_xT_\gamma v(x)\Vert -\log \Vert D_{ T_\delta y}T_\gamma v(y_\delta )\Vert \nonumber \\&\qquad -\log \Vert D_yT_\delta v(y)\Vert +\log \Vert D_{T_\gamma y} T_\delta v(y_\gamma )\Vert |\nonumber \\&\quad =\Big |\sum _{i(\gamma )=1}^{\Vert \gamma \Vert _w-1}(\ell _{i(\gamma )}(x_{i(\gamma )})-\ell _{i(\gamma )}((y_\delta )_{i(\gamma )}))+\sum _{i(\delta )=1}^{\Vert \delta \Vert _w-1}(\ell _{i(\delta )}(y_{i(\delta )})-\ell _{i(\delta )}((y_\gamma )_{i(\delta )}))\Big |\nonumber \\&\quad \le \mathrm {const.}(\Vert \gamma \Vert _w\cdot \Vert \gamma \Vert _w^{-\eta ^2 d}+\Vert \delta \Vert _w\Vert \gamma \Vert ^{\eta ^2})\nonumber \\&\quad \le \mathrm {const.}(2^{m(1-\eta ^2 d)}+2^m\Vert \gamma \Vert ^{\eta ^2})\nonumber \\&\quad \le \mathrm {const.}(\Vert \gamma \Vert ^{2(\eta ^2-1/d)}+\Vert \gamma \Vert ^{\eta ^2-2/d})\nonumber \\&\quad \le \mathrm {const.}\Vert \gamma \Vert ^{\eta ^2-2/d}. \end{aligned}$$
(5.3)

\(\square \)
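The last two inequalities of (5.3) rest on a comparison between \(2^m\) and \(\Vert \gamma \Vert \) for \(\gamma \in S_m\subset D_m\); the following is a sketch of that arithmetic, with constants depending only on c, d and \(\eta \). From the definition of \(D_m\),

$$\begin{aligned} 2^{m}\le \left( \frac{c}{\Vert \gamma \Vert }\right) ^{2/d},\qquad 2^{-m}<2\left( \frac{\Vert \gamma \Vert }{c}\right) ^{2/d}, \end{aligned}$$

so that, using \(\eta ^2d>2\),

$$\begin{aligned} 2^m\Vert \gamma \Vert ^{\eta ^2}\le c^{2/d}\,\Vert \gamma \Vert ^{\eta ^2-2/d},\qquad 2^{m(1-\eta ^2d)}=\left( 2^{-m}\right) ^{\eta ^2d-1}\le \mathrm {const.}\Vert \gamma \Vert ^{\frac{2}{d}(\eta ^2d-1)}=\mathrm {const.}\Vert \gamma \Vert ^{2(\eta ^2-1/d)}. \end{aligned}$$

Since \(2(\eta ^2-1/d)-(\eta ^2-2/d)=\eta ^2\ge 0\), the first term is dominated by the second whenever \(\Vert \gamma \Vert \le 1\).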

In the second step, we prove the following.

Claim 2

Suppose for any \(\gamma _m\in S_m\), we have \(\Vert \log \Vert D_xT_{\gamma _m}v(x)\Vert \Vert _{C^0}\le \mathrm {const.}\Vert \gamma _m\Vert ^{\nu }\), for some \(\nu >0\) and all x. Then for any \(\gamma \in S\), we have \(\Vert \log \Vert D_xT_\gamma v(x)\Vert \Vert _{C^0}\le \mathrm {const.}\Vert \gamma \Vert ^{\nu }\).

Proof of Claim 2

By the definition of \(D_m\) and \(S_m\) and Definition 5.1, we get that each annulus \(D_m\) in the dyadic decomposition is covered by at least \(O(2^{dNm/2})\) balls of radius \(c2^{-dm}\) centered at points in \(S_m\).

We claim that for any \(\gamma \in S\) with small norm \(\Vert \gamma \Vert \), there exists a finite number \(\kappa (\gamma )\) and \(\{ \gamma _{m_k},\ k=1,2,\ldots ,\kappa (\gamma )\}\) satisfying \(\gamma _{m_k}\in D_{m_k},\ \mathrm {and\ } m_{k+1}\ge 2m_k\) and \(\gamma =\sum _{k=1}^{\kappa (\gamma )}\gamma _{m_k}.\)

The algorithm is as follows. First find m such that \(\gamma \in D_m\). Denote this m by \(m_1\) and find \(\gamma _{m_1}\in S_{m_1}\) that is closest to \(\gamma \). The closest distance is bounded by \(c 2^{-dm_1}\). Next consider the vector \(\gamma -\gamma _{m_1}\) and apply the above procedure to it in place of \(\gamma \). We see that \(\gamma -\gamma _{m_1}\in D_{m_2}\) for some \(m_2\ge 2m_1\). This procedure terminates after finitely many steps since \(\gamma \in S\) is a finite integer linear combination of the rotation vectors \(\rho _i,\ i=1,\ldots , m\).

Next, let \(x_{m_i}=\prod _{j=i}^{\kappa (\gamma )}T_{\gamma _{m_j}}(x)\) and \(x_{m_{\kappa (\gamma )+1}}=x\). Then

$$\begin{aligned} \begin{aligned} |\log \Vert D_xT_\gamma v(x)\Vert |&=|\log \Vert D_x\prod _{i}T_{\gamma _{m_i}} v(x)\Vert |\\&\le \sum _i|\log \Vert D_{x_{m_{i+1}}}T_{\gamma _{m_i}} v(x_{m_{i+1}})\Vert |\\&\le \mathrm {const.}\sum _{i=1}^{\kappa (\gamma )}\Vert \gamma _{m_i}\Vert ^{\nu }. \end{aligned} \end{aligned}$$

By the construction of \(D_m\) and \(S_m\), for all \(\gamma \in S\), we have that \(\frac{1}{2}\Vert \gamma _{m_1}\Vert \le \Vert \gamma \Vert \le 2\Vert \gamma _{m_1}\Vert \), and \(\Vert \gamma _{m_k}\Vert \) decays exponentially with uniform exponential rate. This gives that \( |\log \Vert D_xT_\gamma v(x)\Vert |\le \mathrm {const.}\Vert \gamma \Vert ^\nu \) for every \(\gamma \in S\) close to zero. \(\quad \square \)
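The summation in the last step can be sketched as follows (the ratio r is notation introduced only here, and we assume \(m_1\ge 1\)). Since \(m_{k+1}\ge 2m_k\), we have \(m_k\ge 2^{k-1}m_1\ge k\,m_1\), and \(\Vert \gamma _{m_k}\Vert \le c2^{-dm_k/2}\) because \(\gamma _{m_k}\in D_{m_k}\). Writing \(r:=2^{-d\nu m_1/2}\le 2^{-d\nu /2}<1\), we get

$$\begin{aligned} \sum _{k=1}^{\kappa (\gamma )}\Vert \gamma _{m_k}\Vert ^{\nu }\le c^{\nu }\sum _{k\ge 1}r^k=\frac{c^{\nu }\,r}{1-r}\le \frac{\left( c\,2^{-dm_1/2}\right) ^{\nu }}{1-2^{-d\nu /2}}\le \mathrm {const.}\Vert \gamma _{m_1}\Vert ^{\nu }\le \mathrm {const.}\Vert \gamma \Vert ^{\nu }, \end{aligned}$$

where the next-to-last inequality uses the lower bound \(c2^{-dm_1/2}<2^{d/2}\Vert \gamma _{m_1}\Vert \) from the definition of \(D_{m_1}\), and the last one uses \(\Vert \gamma _{m_1}\Vert \le 2\Vert \gamma \Vert \).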

This completes the proof of Proposition 5.5. \(\quad \square \)

6 Proof of the Theorems

In this section, we prove Theorems 1.8 and 1.9.

6.1 Proof of Theorem 1.8

Proof of Theorem 1.8

We first explain how to choose \(K_0\) and the open set \({\mathcal {O}}\) in the statement of Theorem 1.8. We choose \({\mathcal {O}}\) to be a \(C^1\) neighborhood of \({{\bar{A}}}\) in the set of Anosov diffeomorphisms with simple spectrum.

By Proposition 5.4, in order to determine \(K_0\), it is enough to determine \(\eta \). Given \({{\bar{A}}}\) and an Anosov diffeomorphism \(A:{\mathbb {T}}^2\rightarrow {\mathbb {T}}^2\) homotopic to \({{\bar{A}}}\), Theorem 3.1 provides a bi-Hölder map h such that \(h\circ A={{\bar{A}}}\circ h\). The Hölder regularity of the conjugacy h depends on both the spectrum of \({{\bar{A}}}\) and the Mather spectrum of A ([KH], Theorem 19.1.2), and the Hölder regularity of the invariant distribution \(E_i(x)\) of the Anosov diffeomorphism A depends on the Mather spectrum of A. We choose \(\eta \) to be the minimum of these Hölder exponents.

For \(K>K_0\), Proposition 5.4 supplies a full measure set \({\mathcal {R}}_{2,K}\) in \({\mathbb {T}}^{N\times K}\). Given \(i:\{1,\ldots ,K\}\rightarrow \{1,2\}\), if the rotation vectors of \((\rho _{i(1),1},\ldots , \rho _{i(K),K})\) lie in \({\mathcal {R}}_{2,K}\), then the set S generated by the set of all rotation vectors \(\{\rho _{i,j},\ i=1,2,\ j=1,\ldots ,K\}\) has dimension \(d\in (K/2, K).\) For \(K>K_0\), we have \(2/d< \eta ^2\) by Proposition 5.4. Moreover S is dense on \({\mathbb {T}}^N\).

Consider now an action \(\alpha :\Gamma _{{{\bar{B}}},K}\rightarrow \mathrm {Diff}^r({\mathbb {T}}^2)\) with \(\alpha (g_0)=A:{\mathbb {T}}^2\rightarrow {\mathbb {T}}^2\) Anosov, and \(\alpha (g_{i,k}), \ i=1,2,\ k=1,\ldots ,K\) generating an abelian subgroup action \(({\mathbb {Z}}^2)^K\rightarrow \mathrm {Diff}^r({\mathbb {T}}^2)\). As in the hypotheses of the theorem, assume that the subgroup generated by \(\alpha (g_{1,1})\) and \(\alpha (g_{2,1})\) has sublinear oscillation. Then applying Theorem 1.5 to the \(\Gamma _{{{\bar{A}}}}\) action generated by \(\alpha (g_0)\), \(\alpha (g_{1,1})\) and \(\alpha (g_{2,1})\), we get a bi-Hölder map h linearizing the \(\Gamma _{{{\bar{B}}}}\) action.

We show that the conjugacy h given by Theorem 1.5 also linearizes the whole \(\Gamma _{{{\bar{B}}},K}\) action \(\alpha .\) Indeed, for any diffeomorphism f that commutes with \(\alpha (g_{1,1}),\alpha (g_{2,1})\), we have

$$\begin{aligned} h f h^{-1}(x + \rho _{i,1}) = h f h^{-1}(x) + \rho _{i,1}, \end{aligned}$$

for \(i=1,2\). Since the rotation vectors \(\rho _{i,1},\ i=1,2,\) rationally generate \({\mathbb {T}}^N\), by taking Fourier expansions, we get that \(h f h^{-1}\) is a rigid rotation by a constant vector that is the rotation vector of f. Thus h conjugates the whole action \(\alpha \) to an affine action by rigid translations.
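The Fourier argument can be sketched as follows, under the assumption (made here for illustration) that \(F:=hfh^{-1}\) is homotopic to the identity, so that \(u(x):=F(x)-x\) is continuous and \({\mathbb {Z}}^N\)-periodic. The displayed commutation relation says \(u(x+\rho _{i,1})=u(x)\), and comparing Fourier coefficients \({\hat{u}}(k)\) gives

$$\begin{aligned} {\hat{u}}(k)\left( e^{2\pi \sqrt{-1}\langle k,\rho _{i,1}\rangle }-1\right) =0,\quad k\in {\mathbb {Z}}^N,\ i=1,2. \end{aligned}$$

If the vectors \(\rho _{i,1}\) rationally generate \({\mathbb {T}}^N\), understood here as the statement that for every \(k\ne 0\) there is an i with \(\langle k,\rho _{i,1}\rangle \notin {\mathbb {Z}}\), then \({\hat{u}}(k)=0\) for all \(k\ne 0\). Hence u is constant and \(F(x)=x+{\hat{u}}(0)\) is a translation.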

We next apply Proposition 5.5, Corollary 4.2 and Proposition 4.3 to get that the conjugacy h is \(C^{1+}\) along the stable and unstable leaves of the Anosov diffeomorphism A. By Theorem 5.6, we get that h is \(C^{1+}\) on \({\mathbb {T}}^2\) and finally by Theorem 5.7, we get that h is \(C^{r-\varepsilon }\), for \(\varepsilon \) sufficiently small. \(\quad \square \)

6.2 Proof of Theorem 1.9, the N-dimensional case

The main difficulty in generalizing the above argument to the N-dimensional case is that it is in general unknown whether the one-dimensional distributions \(E_i^u\) (or \(E^s_i\)) that are invariant under DA are also invariant under \(DT_\gamma \). It is only known, by Proposition 4.6 (4) and Lemma 5.1, that the weakest unstable and stable distributions \(E^u_1\) and \(E^s_1\) are invariant under \(DT_\gamma \).

We cite the following lemma from [GKS].

Lemma 6.1

(Proposition 2.4 of [GKS]). Let A, \({{\bar{A}}}\) and h be as in Proposition 4.6. Suppose that h is \(C^{1+}\) along \({\mathcal {W}}^u_{\le i}\) and that \(h({\mathcal {W}}^u_j (x)) ={\bar{{\mathcal {W}}}}^u_j(h(x))\) for \(1\le j\le i\). Then

$$\begin{aligned} h({\mathcal {W}}^u_{i+1}(x)) ={\bar{{\mathcal {W}}}}^u_{i+1}(h(x)),\ x \in {\mathbb {T}}^N. \end{aligned}$$

Using this lemma, we now prove that \(h \in C^{1+}\) in the general case \(N >2\).

Proof of Theorem 1.9

The proof follows the strategy of the proof of Theorem 1.8, with small modifications to handle the higher dimension.

We first choose \(K_0\) and the open set \({\mathcal {O}}\) of Anosov diffeomorphisms. Since \({{\bar{A}}}\) is assumed to have simple spectrum, it has a \(C^1\) small neighborhood in which the Anosov diffeomorphisms have simple Mather spectrum. We choose such a neighborhood and denote it by \({\mathcal {O}}\). We will choose \(K_0\) to satisfy \(2/d<\eta ^2\) using Proposition 5.4, where d is the dimension of the set S generated by the rotation vectors \(\rho _{i,j}\) and \(\eta \) is a lower bound on the Hölder exponent of the conjugacy h and all the distributions \(E^{u,s}_i\), for all Anosov diffeomorphisms in \({\mathcal {O}}\).

Proposition 5.4 then gives a full measure set \({\mathcal {R}}_{N,K}\subset {\mathbb {T}}^{N\times K}\). We obtain a bi-Hölder conjugacy h that linearizes the whole action \(\alpha :\Gamma _{{{\bar{B}}},K}\rightarrow \mathrm {Diff}^r({\mathbb {T}}^N)\) by applying Theorem 1.7 and the argument in the proof of Theorem 1.8.

It remains to improve the regularity of h to \(C^{1+}\). To start, Proposition 4.6 (4) implies that the weakest leaves are preserved: \(h({\mathcal {W}}^u_1 (x)) ={\bar{{\mathcal {W}}}}^u_1(h(x))\), for all x. Next, we apply Lemma 5.1 to get that the weakest distribution \(E^u_1\) is invariant under the abelian group action generated by \(\alpha (g_{i,k}),\ i=1,\ldots ,N,\ k=1,\ldots ,K\). Applying Proposition 5.5, Corollary 4.2 and Proposition 4.3, we conclude that h is \(C^{1+}\) along the weakest leaves \({\mathcal {W}}^u_1(x)\). Thus the assumptions of Lemma 6.1 are satisfied with \(i=1\), and we conclude that the second weakest leaves are preserved: \(h({\mathcal {W}}^u_2 (x)) ={\bar{{\mathcal {W}}}}^u_2(h(x))\). We next apply Lemma 5.1, Proposition 5.5, Corollary 4.2 and Proposition 4.3 to conclude that h is \(C^{1+}\) along \({\mathcal {W}}^u_{2}\). By Journé’s theorem 5.6, we get that h is \(C^{1+}\) along the leaves \({\mathcal {W}}^u_{\le 2}\).
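Schematically, the step carried out above for \(i=1\) has the following general form. This display only reorganizes the results already cited, following the pattern of the case \(i=1\), and introduces no new hypothesis.

$$\begin{aligned}&h \text { is } C^{1+} \text { along } {\mathcal {W}}^u_{\le i} \text { and } h({\mathcal {W}}^u_j (x)) ={\bar{{\mathcal {W}}}}^u_j(h(x)),\ 1\le j\le i\\&\quad \Longrightarrow \ h({\mathcal {W}}^u_{i+1}(x)) ={\bar{{\mathcal {W}}}}^u_{i+1}(h(x))\quad \text {(Lemma 6.1)}\\&\quad \Longrightarrow \ h \text { is } C^{1+} \text { along } {\mathcal {W}}^u_{i+1}\quad \text {(Lemma 5.1, Proposition 5.5, Corollary 4.2, Proposition 4.3)}\\&\quad \Longrightarrow \ h \text { is } C^{1+} \text { along } {\mathcal {W}}^u_{\le i+1}\quad \text {(Theorem 5.6)}. \end{aligned}$$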

Applying Lemma 6.1 inductively in i, we conclude h is \(C^{1+}\) along the unstable foliation \({\mathcal {W}}^u\). Similarly, we prove that h is \(C^{1+}\) along \({\mathcal {W}}^s\). Then by Journé’s theorem 5.6, we have that \(h \in C^{1+}\).\(\quad \square \)

6.3 Alternative assumptions

In this section, we discuss possible alternative assumptions for Theorem 1.9. Our technique developed in Sect. 4 relies on the existence of foliations by one-dimensional leaves that are invariant under the abelian group action. In our proofs, the foliations are provided by the Anosov diffeomorphism. The invariance of these foliations under the abelian group action follows from the existence of a common conjugacy h. In other words, we need that the leaves (straight lines) of the invariant foliation of the toral automorphism \({{\bar{A}}}\) are mapped to the leaves of the invariant foliations of A by the conjugacy \(h^{-1}\) (Proposition 4.6). This is true when \(N=2\), or in higher dimensions when we assume that A is \(C^1\) close to \({{\bar{A}}}\). There are also circumstances under which Proposition 4.6 can be proved without the \(C^1\) smallness assumption. We mention here two main cases.

In [G1], the author considers an Anosov diffeomorphism A homotopic to a linear map \({{\bar{A}}}\), with simple Mather spectrum and the property that each connected component of the Mather spectrum contains exactly one eigenvalue of \({{\bar{A}}}\). Moreover, it is assumed that the invariant distributions \(E^{u,s}_i\) form angles less than \(\pi /2\) with the corresponding affine distributions \({{\bar{E}}}^{u,s}_i\) for the linear map \({{\bar{A}}}\). (This assumption guarantees a certain quasi-isometric property of \({\mathcal {W}}^s\) and \({\mathcal {W}}^u\).) Under these assumptions, the conclusions of Proposition 4.6 hold [G1].

In [FPS], a similar result is shown assuming that A is isotopic to \({{\bar{A}}}\) along a path of Anosov diffeomorphisms with simple Mather spectrum.