1 Introduction

1.1 General panorama

The dynamics of (non-)solvable groups of germs of diffeomorphisms around a fixed point is an important subject that has been studied by many authors in connection with foliations and differential equations. There is, however, a natural group-theoretical aspect of this study that is of great interest. In this direction, the classification of solvable groups of diffeomorphisms in dimension 1 has been completed, at least in high regularity: see [7, 17] for the real-analytic case and [31] for the \(C^2\) case; see also [3, 32] for the piecewise-affine case. (For the higher-dimensional case, see [1, 23].)

In the \(C^1\) context, this issue was indirectly addressed by Cantwell and Conlon [9]. Indeed, although they were interested in problems concerning the smoothing of some codimension-1 foliations, they dealt with a particular one for which the holonomy pseudo-group turns out to be the Baumslag–Solitar group. In concrete terms, they proved that a certain natural (non-affine) action of BS(1, 2) on the closed interval is non-smoothable. Later, using the results of topological classification of general actions of BS(1, 2) on the interval contained in [35], the whole\(^{1}\) picture was completed in [19]: every \(C^1\) action of BS(1, n) on the closed interval with no global fixed point inside is semiconjugate to the standard affine action.

Cantwell–Conlon’s proof uses exponential growth of the orbit of certain intervals to yield a contradiction (such a behaviour is impossible close to a parabolic fixed point). This clever argument was later used in [28] to give a counter-example to the converse of the Thurston stability theorem: there exists a finitely-generated, locally indicable\(^{2}\) group with no faithful action by \(C^1\) diffeomorphisms of the interval. (See also [8].) As we will see, the relation with Thurston’s stability arises not only at the level of results. Indeed, although Cantwell–Conlon’s argument is very different, an arsenal of techniques close to Thurston’s that may be applied in this context and related ones (see e.g. [23]) was independently developed in [4] (see also [5]). The aim of this work is to put together all these ideas (and to introduce new ones) to get a quite complete picture of all possible \(C^1\) actions of a very large class of solvable groups, namely the Abelian-by-cyclic ones. We will show that these actions are rigid provided the cyclic factor acts hyperbolically on the Abelian subgroup, and that this rigidity disappears in the non-hyperbolic case.

The idea of relating a certain notion of hyperbolicity (or at least, of growth of orbits) to \(C^1\) rigidity phenomena for group actions on 1-dimensional spaces has been proposed, though not fully developed, by many authors. This is explicitly mentioned in [28], while it is implicit in the examples of [36]. More evidence is provided by the examples in [10, 13, 30] relying on the original constructions of Pixton [33] and Tsuboi [39]. All these works suggest that actions with orbits of (uniformly bounded) subexponential growth should always be \(C^1\)-smoothable\(^{3}\) (compare [9, Conjecture 2.3]) and realizable in any neighborhood of the identity/rotations [25]. Despite this evidence and the results presented here, a complete understanding of all rigidity phenomena arising in this context remains far from being reached. More generally, the full picture of groups of homeomorphisms that can/cannot act faithfully by \(C^1\) diffeomorphisms remains obscure. A particular case that is challenging from both the dynamical and the group-theoretical viewpoints can be summarized in the following

Question 1.1

What are the subgroups of the group of piecewise affine homeomorphisms of the circle/interval that are topologically conjugate to groups of \(C^1\) diffeomorphisms?

For simplicity, in this work, all actions are assumed to be by orientation-preserving maps.

1.2 Statements of results

Given a matrix \(A=(\alpha _{i,j})\in M_d(\mathbb Z)\cap GL_d(\mathbb {R})\), \(d \ge 1\), let us consider the meta-Abelian group \(G_A\) with presentation

$$\begin{aligned} G_A := \big \langle a, b_1,\ldots , b_d\mid b_i\,b_j=b_j\, b_i,\, \;\;ab_ia^{-1}= b_1^{\alpha _{1,i}} \cdots b_d^{\alpha _{d,i}} \big \rangle . \end{aligned}$$
(1)

It is known that every finitely-presented, torsion-free, Abelian-by-cyclic group has this form [2] (see also [14]).
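
For orientation, here is the simplest instance of this presentation (a standard special case, spelled out here as an illustration). Taking \(d = 1\) and \(A = (n)\) for a nonzero integer n, the commutation relations in (1) are vacuous and a single conjugation relation remains, so that \(G_A\) is the Baumslag–Solitar group:

$$\begin{aligned} G_{(n)} = \big \langle a, b \mid a b a^{-1} = b^{n} \big \rangle = BS(1,n). \end{aligned}$$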

It is quite clear that \(M_d(\mathbb Z) \cap GL_d(\mathbb {R}) \!\subset \! GL_d(\mathbb Q)\). In particular, the group \(G_A\) above is isomorphic to a subgroup of \(\mathbb Z\ltimes _A \mathbb Q^d\). In a slightly more general way, from now on we consider \(A\in GL_d(\mathbb Q)\) and H an \(A^{\pm 1}\)-invariant subgroup of \(\mathbb Q^d\) with \(rank_{\mathbb Q}(H)=d\) (recall that \(rank_\mathbb Q(H)\), the \(\mathbb {Q}\)-rank of H, is the smallest \(d'\) such that H embeds into \(\mathbb {Q}^{d'}\)), and we let \(G=\mathbb Z\ltimes _A H\).

Proposition 1.2

Suppose that the matrix \(A \in GL_d (\mathbb Q)\) is \(\mathbb Q\)-irreducible and that the \(\mathbb Q\)-rank of \(H \subset \mathbb Q^d\) equals d. Then \(\mathbb Z\ltimes _A H\) has a faithful affine action on \(\mathbb {R}\) if and only if A has a positive real eigenvalue.

Next, we assume that A has all its eigenvalues of norm \(\not =1\). Our main result is the following

Theorem 1.3

Assume \(A \in GL_d(\mathbb Q)\) has no eigenvalue of norm 1, and let G be a subgroup of \(\mathbb Z\ltimes _A \mathbb Q^d\) of the form \(G = \mathbb Z\ltimes _A H\), where \(rank_{\mathbb Q}(H)=d\). Then every representation of G into \(\mathrm {Diff}^1_+([0,1])\) whose image group admits no global fixed point in (0, 1) is topologically conjugate to a representation into the affine group.

For the proof of Theorem 1.3, let us begin by considering an action of a general group G as above by homeomorphisms of [0, 1]. We have the next generalization of [35, §4.1]:

Lemma 1.4

Let G be a group as in Theorem 1.3. Assume that G acts by homeomorphisms of the closed interval with no global fixed point in (0, 1). Then either there exists \(b \in H\) fixing no point in (0, 1), in which case the action of G is semiconjugate to that of an affine group, or H has a global fixed point in (0, 1), in which case the element \(a\in G\) acts without fixed points inside (0, 1).

By virtue of this lemma, the proof of Theorem 1.3 reduces to the next two propositions.

Proposition 1.5

Let G be a group as in Theorem 1.3. Assume that G acts by homeomorphisms of [0, 1] with no global fixed point in (0, 1). If the subgroup H acts nontrivially but has a global fixed point inside (0, 1), then the action of G cannot be by \(C^1\) diffeomorphisms.

Proposition 1.6

Let G be a group as in Theorem 1.3. Then every action of G by \(C^1\) diffeomorphisms of [0, 1] with no global fixed point in (0, 1) and having non-Abelian image is minimal on (0, 1).

The structure theorem for actions is complemented by a result of rigidity for the multipliers of the group elements mapping into homotheties. More precisely, we prove

Theorem 1.7

Let \(G = \mathbb Z\ltimes _A H\) be a group as in Theorem 1.3, with \(a \in G\) being the generator of \(\mathbb {Z}\) (whose action on H is given by A). Assume that G acts by \(C^1\) diffeomorphisms of [0, 1] with no global fixed point in (0, 1) and that the image group is non-Abelian. Then the derivative of a at the interior fixed point coincides with the ratio of the homothety to which a is mapped under the homomorphism of G into the affine group given by Theorem 1.3. More generally, for each \(k \ne 0\) and all \(b \in H\), the derivative of \(a^k b\) at its interior fixed point equals the kth power of the ratio of that homothety.

Besides several consequences of the preceding theorem given in the next section, there is an elementary one of particular interest. Namely, if we consider actions as in Theorem 1.3 but allowing the possibility of global fixed points in (0, 1), then only finitely many components of the complement of the set of these points are such that the action restricted to them has non-Abelian image. Otherwise, the element a would admit a sequence of hyperbolic fixed points, all of them with the same multiplier, converging to a parabolic fixed point, which is absurd.

Another consequence of the previous results concerns centralizers. For simplicity, we just give a statement involving the Baumslag–Solitar group, yet a more general version certainly holds for groups as in Theorem 1.3.

Proposition 1.8

The centralizer inside \(\mathrm {Diff}_+^1([0,1])\) of a subgroup G isomorphic to \(BS(1,2) := \langle g,h \!: ghg^{-1} = h^2 \rangle \) is contained in the group of diffeomorphisms having support in the complement of the support of h. In particular, if G has no global fixed point in the interior, then its centralizer is trivial.

Indeed, let I be a closed interval in [0, 1] restricted to which the action of h has no global fixed point in the interior. We need to show that every element f of the centralizer of G fixes I and acts trivially on it. To do this, first notice that, by Theorem 1.3, the group G fixes I, and its action on it is topologically conjugate to that of an affine group. Therefore, if f fixes I, then by commutativity it must fix the unique fixed point of g inside I. Again by commutativity, the set of fixed points of f is G-invariant, and since the G-orbits inside I are dense, we conclude that f acts trivially on I. If f does not fix I, then it moves it into a disjoint interval, so that \(I, f(I), f^2 (I), \ldots \) are infinitely many pairwise disjoint intervals restricted to which the G-action has non-Abelian image, which was shown to be impossible just after the statement of Theorem 1.7.

Theorem 1.7 could lead one to think that the topological conjugacy to the affine action is actually smooth at the interior.\(^{4}\) (Compare [37].) Nevertheless, a standard application of the Anosov–Katok technique leads to \(C^1\) (faithful) actions for which this is not the case. As we will see, in higher regularity, the rigidity holds: if \(r \ge 2\), then for every faithful action by \(C^r\) diffeomorphisms with no interior global fixed point, the conjugacy is a \(C^r\) diffeomorphism at the interior. It seems to be an interesting problem to try to extend this rigidity to the class \(C^{1+\tau }\). Another interesting problem is to construct actions by \(C^1\) diffeomorphisms that are conjugate to actions of non-Abelian affine groups though they are non-ergodic with respect to the Lebesgue measure. (Compare [22].)

The hyperbolicity assumption for the matrix A is crucial for the validity of Theorem 1.3. Indeed, Abelian groups of diffeomorphisms acting nonfreely (as those constructed in [39]) provide easy counter-examples with all eigenvalues equal to 1. A more delicate construction leads to the following result.

Theorem 1.9

Let \(A \in GL_d (\mathbb {Q})\) be non-hyperbolic and \(\mathbb Q\)-irreducible. Then \(G := \mathbb Z\ltimes _A \mathbb Q^d\) admits a faithful action by \(C^1\) diffeomorphisms of the closed interval that is not semiconjugate to an affine one, though it has no global fixed point in (0, 1).

This work closes with some extensions of our main theorem to actions by \(C^1\) diffeomorphisms of the circle. Roughly, we rule out Denjoy-like actions in class \(C^1\) for the groups G above, though such actions may arise in the continuous category (and also in the Lipschitz one; see [27, Proposition 2.3.15]). In particular, we have:

Theorem 1.10

Let G be a group as in Theorem 1.3. Assume that G acts by \(C^1\) diffeomorphisms of the circle with non-Abelian image. Then the action admits a finite orbit.

This theorem clarifies the whole picture. Up to a finite-index subgroup \(G_0\), the action has global fixed points. The group \(G_0\) can still be presented in the form \(\mathbb Z\ltimes _{A^k} H_0\) for a certain \(k \ge 1\); as \(A^k\) is hyperbolic, an application of Theorem 1.3 to the restriction of the action of \(G_0\) to intervals between global fixed points shows that these restrictions are conjugate to affine actions. Thus, roughly, G is a finite (cyclic) extension of a subgroup of a product of affine groups acting on intervals with pairwise disjoint interior. Moreover, only finitely many of these affine groups can be non-Abelian. (Otherwise, by Theorem 1.7, there would be accumulation of hyperbolic fixed points of \(a^k\) with the same multiplier towards a parabolic fixed point.)

To conclude, let us mention that the examples provided by Theorem 1.9 can be adapted to the case of the circle. More precisely, if \(A \in GL_d(\mathbb Q)\) is non-hyperbolic and \(\mathbb Q\)-irreducible, then \(G := \mathbb Z\ltimes _A \mathbb Q^d\) admits a faithful action by \(C^1\) circle diffeomorphisms having no finite orbit.

1.3 Some comments and complementary results/examples

Although the results presented so far only concern certain solvable groups, they lead to relevant results for other classes of groups. Let us start with an almost direct consequence of Theorem 1.7. For any pair of positive integers m, n, let BS(1, m; 1, n) be the group defined by

$$\begin{aligned} BS(1,m;1,n) := \big \langle a,b,c \mid aba^{-1}=b^m \!, aca^{-1}=c^n \big \rangle = BS(1,m) *_{\langle a \rangle } BS(1,n). \end{aligned}$$

In other words, the subgroups generated by a, b and by a, c are isomorphic to BS(1, m) and BS(1, n), respectively, and no other relation is assumed.

Notice that every element \(\omega \in BS(1,m;1,n)\) can be written in a unique way as \(\omega = a^k \omega _0\), where \(k\!\in \!\mathbb Z\) and \(\omega _0\) belongs to the subgroup generated by b, c and their roots. One easily deduces that BS(1, m; 1, n) is locally indicable, hence it admits a faithful action by homeomorphisms of the interval (see the second footnote on page 1). However, it is easy to give a more explicit embedding of BS(1, m; 1, n) into \(\mathrm {Homeo}_+([0,1])\). Indeed, start by associating to a a homeomorphism f without fixed points in (0, 1). Then choose a fundamental domain I of f and homeomorphisms \(g_0,h_0\) supported on I and generating a rank-2 free group. Finally, extend \(g_0\) and \(h_0\) into homeomorphisms g, h of [0, 1] so that \(fgf^{-1}=g^m\) and \(fhf^{-1}=h^n\) hold. Then the action of BS(1, m; 1, n) defined by associating g to b and h to c is faithful.

In what concerns smooth actions of BS(1, m; 1, n) on the interval, we have:

Theorem 1.11

Let m, n be distinct positive integers. Given a representation of BS(1, m; 1, n) into \(\mathrm {Diff}^1_+([0,1])\), let us denote by f, g, h the images of a, b, c, respectively. Then, the interiors of the supports of g and h are disjoint. In particular, g and h commute, hence the action is not faithful.

Proof

The supports of g and h consist of unions of segments bounded by successive non-repelling fixed points of f; in particular, any two of these segments either coincide or have disjoint interior. If one of these segments is contained in the support of g (resp., h), then Theorem 1.7 asserts that its interior contains a unique hyperbolically-repelling fixed point of f with derivative equal to m (resp., n). Since \(m\ne n\), the open segments in the supports of g and h must be disjoint. \(\square \)

Remark 1.12

Theorem 1.11 admits straightforward generalizations replacing the Baumslag–Solitar groups BS(1, m) and BS(1, n) by groups associated (as in Theorem 1.3) to matrices A and B that are hyperbolically expanding (i.e. with every eigenvalue of modulus \(>1\)) and have distinct eigenvalues.

Below we give two other results in the same spirit. The first of these is new, whereas the second is already known though our methods provide a new and somewhat more conceptual proof. More sophisticated examples will be treated elsewhere.

Let us consider the group \(G_{\lambda ,\lambda '}\) generated by the transformations of the real line

$$\begin{aligned} c \!: x\mapsto x+1, \quad b \!: x \mapsto \lambda x, \quad a \!: x \mapsto sgn(x) |x|^{\lambda '}, \end{aligned}$$

where \(\lambda , \lambda '\) are positive numbers. These groups are known to be non-solvable for certain parameters \(\lambda '\). Indeed, if \(\lambda '\) is a prime number, then the elements a and c generate a free group (see [11]).

Theorem 1.13

For all integers m, n larger than 1, the group \(G_{m,n}\) does not embed into the group of \(C^1\) diffeomorphisms of the closed interval.

Proof

Assume that \(G_{m,n}\) can be realized as a group of \(C^1\) diffeomorphisms of [0, 1]. Then Theorem 1.3 applies to both subgroups \(\langle b,c \rangle \) and \(\langle a,b \rangle \) (which are isomorphic to BS(1, m) and BS(1, n), respectively). Let us consider a maximal open subinterval \(I = ( x_0,x_1 )\) that is invariant under c and where the dynamics of c has no fixed point. The relation \(bcb^{-1} = c^m\) shows that the action of b on I is nontrivial. Proposition 1.5 then easily implies that b preserves I, and by Theorem 1.3, the restriction of the action of \(\langle b,c \rangle \) to I is conjugate to an affine action. Let y be the fixed point of b inside I. As before, the relation \(aba^{-1} = b^n\) forces a to fix all points \(x_0,y,x_1\); moreover, the actions of \(\langle a,b \rangle \) on both intervals \((x_0,y)\) and \((y,x_1)\) are conjugate to affine actions. Finally, notice that the relation \(aba^{-1} = b^n\) forces the derivative of b to be equal to 1 at y. However, this contradicts Theorem 1.7 when applied to \(\langle b,c \rangle \). \(\square \)
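
The two relations used in this proof, \(bcb^{-1} = c^m\) and \(aba^{-1} = b^n\), can be checked numerically on the generators of \(G_{m,n}\) (that is, \(\lambda = m\) and \(\lambda ' = n\)). The following is only a minimal sanity check in floating-point arithmetic; the function names and the sample points are illustrative choices and do not come from the text.

```python
import math

def a(x, n):
    """The map x -> sgn(x) * |x|**n."""
    return math.copysign(abs(x) ** n, x)

def a_inv(x, n):
    """Inverse of a: x -> sgn(x) * |x|**(1/n)."""
    return math.copysign(abs(x) ** (1.0 / n), x)

def b(x, m):
    """The homothety x -> m * x."""
    return m * x

def b_inv(x, m):
    return x / m

def c(x):
    """The translation x -> x + 1."""
    return x + 1.0

def iterate(f, times, x):
    """Apply f to x the given number of times."""
    for _ in range(times):
        x = f(x)
    return x

def relations_hold(m, n, samples=(-3.5, -0.25, 0.5, 2.0)):
    """Check b c b^{-1} = c^m and a b a^{-1} = b^n at the sample points."""
    for x in samples:
        lhs1 = b(c(b_inv(x, m)), m)                # b c b^{-1} (x)
        rhs1 = iterate(c, m, x)                    # c^m (x)
        lhs2 = a(b(a_inv(x, n), m), n)             # a b a^{-1} (x)
        rhs2 = iterate(lambda y: b(y, m), n, x)    # b^n (x)
        ok1 = math.isclose(lhs1, rhs1, rel_tol=1e-9, abs_tol=1e-9)
        ok2 = math.isclose(lhs2, rhs2, rel_tol=1e-9, abs_tol=1e-9)
        if not (ok1 and ok2):
            return False
    return True
```

For example, `relations_hold(2, 3)` evaluates both sides of each relation at four sample points. Agreement at finitely many points is of course not a proof; the identities themselves follow from the one-line computations \(m(x/m + 1) = x + m\) and \((m |x|^{1/n})^n = m^n |x|\).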

As another application of our results, we give an alternative proof of a theorem from [28]:

Theorem 1.14

If \(\Gamma \) is a non-solvable subgroup of \(SL_2(\mathbb {R})\), then \(\Gamma \ltimes \mathbb Z^2\) does not embed into \(\mathrm {Diff}_+^1 ([0,1])\).

Proof

Since \(\Gamma \) is non-solvable, it must contain two hyperbolic elements A, B generating a free group. Theorem 1.3 applied to \(\mathbb Z\ltimes _A \mathbb Z^2 \subset \Gamma \ltimes \mathbb Z^2\) implies that the action restricted to \(\langle A,\mathbb Z^2 \rangle \) is topologically conjugate to an affine action with dense translation part on each connected component I fixed by \(\langle A,\mathbb Z^2 \rangle \) and containing no point that is globally fixed by this subgroup. As B normalizes \(\mathbb Z^2\), it has to be affine in the coordinates induced by this translation part. As a consequence, the action of \(\Gamma \ltimes \mathbb Z^2\) is that of an affine group on each interval I as above. We thus conclude that the action factors through that of a solvable group, hence it is unfaithful. \(\square \)

Remark 1.15

It is not hard to extend the previous proof to show that \(\Gamma \ltimes \mathbb Z^2\) does not embed into the group of \(C^1\) diffeomorphisms of either the open interval or the circle. (Compare [28, §4.2] and [28, §4.3], respectively.)

Remark 1.16

All groups discussed in this section are locally indicable. We thus get different infinite families of finitely-generated, locally-indicable groups with no faithful actions by \(C^1\) diffeomorphisms of the closed interval. The existence of such groups was first established in [28]; the examples considered therein correspond to those of Theorem 1.14.

2 On affine actions

Before passing to proofs, let us make an important remark. So far, in all statements concerning groups of the form \(\mathbb {Z} \rtimes _A H\), the \(\mathbb Q\)-rank of H is assumed to be maximal. Therefore, we may fix a \(\mathbb Q\)-basis \(b_1,\ldots ,b_d\) of \(\mathbb Q^d\) made of elements in H. More importantly, since all conditions to be imposed on A (if any) are invariant under conjugacy, up to changing A by a conjugate matrix in \(GL_d (\mathbb Q)\), we may assume that \(b_1,\ldots ,b_d\) is the canonical basis of \(\mathbb Q^d\). This assumption will be made in order to simplify specific computations.

In this section, we prove Proposition 1.2. To simplify, vectors \((t_1,\ldots ,t_d) \in \mathbb {R}^d\) will be written horizontally, though they must be viewed as vertical ones. We begin with

Proposition 2.1

Given \(A \in GL_d(\mathbb Q)\), let G be a subgroup of \(\mathbb Z\ltimes _A \mathbb Q^d\) of the form \(\mathbb Z\ltimes _A H\), where H contains the canonical basis \(\{ b_1, \ldots , b_d\}\) of \(\mathbb Q^d\) (so that, in particular, \(rank_{\mathbb Q}(H) = d\)).

(i) If \((t_1,\ldots ,t_d)\in \mathbb {R}^d\) is an eigenvector of the transpose \(A^T\) with eigenvalue \(\lambda \in \mathbb {R}_+\!\setminus \!\{1\}\), then there exists a homomorphism \(\psi :G\rightarrow \mathrm {Aff}_+(\mathbb {R})\) with non-Abelian image defined by \(\psi (b_i) := T_{t_i}\) and \(\psi (a) := M_\lambda \), where \(T_t\) and \(M_{\lambda }\) stand for the translation by an amplitude t and the multiplication by a factor \(\lambda \), respectively. This homomorphism is injective if and only if \(\{t_1,\ldots ,t_d\}\) is a \(\mathbb Q\)-linearly-independent subset of \(\mathbb {R}\).

(ii) Every homomorphism \(\psi \!: G \rightarrow \mathrm {Aff}_+(\mathbb {R})\) with non-Abelian image is conjugate to one as those described in (i).

Proof

The first claim of item (i) is obvious. For the other claim, notice that injectivity of \(\psi \) is equivalent to injectivity of its restriction to H. We let a be the generator of the \(\mathbb Z\)-factor of G. Assume there is a nontrivial element \(b \in H\) mapping into the trivial translation. This element writes as \(b = b_1^{\beta _1} \cdots b_d^{\beta _d} \in H\) for certain \(\beta _1,\ldots ,\beta _d\) in \(\mathbb Q\) that are not all equal to zero. Then we have \(\sum _i \beta _i t_i = 0\), which implies that the \(t_i\)’s are linearly dependent over \(\mathbb Q\). Conversely, assume \(\sum _i \beta _i t_i = 0\) holds for certain rational numbers \(\beta _i\) that are not all equal to zero. Up to multiplying them by a common integer factor, we may assume that such a relation holds with all \(\beta _i\)’s integer. Then \(b := b_1^{\beta _1} \cdots b_d^{\beta _d}\) is a nontrivial element of H mapping into the trivial translation under \(\psi \).

For (ii), suppose \(\psi :G \rightarrow \mathrm {Aff}_+(\mathbb {R})\) is a homomorphism with non-Abelian image. Then we have

$$\begin{aligned} \{ id \} \subsetneq \psi ([G,G]) \subseteq [\mathrm {Aff}_+(\mathbb {R}),\mathrm {Aff}_+(\mathbb {R})] = \{T_t,\ t \in \mathbb {R}\}.\end{aligned}$$

Fix \(b\in [G,G]\) such that \(\psi (b)\) is a nontrivial translation. As \(b\in H\), we have that \(\psi (b)\) commutes with every element in \(\psi (H)\). Therefore, \(\psi (H)\) is a subgroup of the group of translations.

Let \(t_1,\ldots ,t_d\) in \(\mathbb {R}\) be such that \(\psi (b_i)=T_{t_i}\). As \(\psi (G)\) is non-Abelian, we have \(\psi (a)=T_t M_\lambda \) for certain \(\lambda \ne 1\) and \(t \in \mathbb {R}\). We may actually suppose that \(t=0\) just by conjugating \(\psi \) by \(T_{\frac{t}{\lambda -1}}\). Then, if we choose \(k \in \mathbb {N}\) so that all \(k \alpha _{i,j}\) belong to \(\mathbb Z\), for each \(i \! \in \! \{1,\ldots , d\}\) we have

$$\begin{aligned} T_{\lambda k t_i} = \psi (a)\psi (b_i^k)\psi (a)^{-1} = \psi \big ( (a b_i a^{-1} )^k \big ) = \psi \left( b_1^{k\alpha _{1,i}} \cdots b_d^{k\alpha _{d,i}}\right) = T_{k \alpha _{1,i}t_1+\cdots + k \alpha _{d,i}t_d}. \end{aligned}$$

Thus, \(\lambda t_i=\alpha _{1,i} t_1+\cdots +\alpha _{d,i} t_d\), which implies that \((t_1,\ldots ,t_d)\) is an eigenvector of \(A^T\) with eigenvalue \(\lambda \). \(\square \)
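
The eigenvector condition of item (i) can be tested numerically on a small example. The sketch below stores an affine map \(x \mapsto sx + t\) as the pair `(s, t)` and checks that \(\psi (a)\psi (b_i)\psi (a)^{-1}\) is the translation by \(\alpha _{1,i}t_1+\cdots +\alpha _{d,i}t_d\). The matrix `A`, the eigenvalue `lam` and the vector `t` are illustrative assumptions chosen for this check (a symmetric hyperbolic matrix in \(GL_2(\mathbb Z)\) with characteristic polynomial \(\lambda ^2 - 3\lambda + 1\)); they do not come from the text.

```python
import math

# An affine map x -> s*x + t is stored as the pair (s, t).
def compose(f, g):
    """Return the composition f o g."""
    return (f[0] * g[0], f[0] * g[1] + f[1])

def inverse(f):
    """Return the inverse of the affine map f (requires f[0] != 0)."""
    return (1.0 / f[0], -f[1] / f[0])

def conjugate(f, g):
    """Return f o g o f^{-1}."""
    return compose(compose(f, g), inverse(f))

# Hypothetical example data: A = A^T, so an eigenvector of A^T is one of A.
A = [[2, 1], [1, 1]]
lam = (3 + math.sqrt(5)) / 2       # root of lam^2 - 3*lam + 1 = 0
t = [1.0, lam - 2.0]               # satisfies A^T t = lam * t

def check_relations(A, lam, t):
    """Check psi(a) psi(b_i) psi(a)^{-1} = T_{sum_j A[j][i] * t[j]} for all i."""
    d = len(t)
    psi_a = (lam, 0.0)                       # multiplication M_lam
    for i in range(d):
        psi_bi = (1.0, t[i])                 # translation T_{t_i}
        lhs = conjugate(psi_a, psi_bi)       # equals T_{lam * t_i}
        shift = sum(A[j][i] * t[j] for j in range(d))
        same_slope = math.isclose(lhs[0], 1.0)
        same_shift = math.isclose(lhs[1], shift, abs_tol=1e-12)
        if not (same_slope and same_shift):
            return False
    return True
```

With these data `check_relations(A, lam, t)` succeeds, whereas replacing `t` by a vector that is not an eigenvector of \(A^T\), such as `[1.0, 1.0]`, makes the check fail, in agreement with item (ii).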

Remark 2.2

The preceding proposition implies in particular that if \(A^T\) has no real eigenvalue, then there is no representation of G in \(\mathrm {Aff}_+(\mathbb {R})\) with non-Abelian image. As a consequence, due to Theorem 1.3, if moreover the eigenvalues of \(A^T\) all have modulus different from 1, then every representation of G in \(\mathrm {Diff}_+^1([0,1])\) has Abelian image.

As an example, given positive integers m, n, let A be the matrix

$$\begin{aligned} A=A_{m,n}:=\left( \begin{array}{cc} m &{} \quad -n\\ n &{} \quad m \end{array}\right) . \end{aligned}$$

Then the group \(G(m,n) := \mathbb Z\ltimes _A \mathbb Q^2\) has no injective representation into \(\mathrm {Diff}^1_+([0,1])\). Notice that each of these groups G(m, n) is locally indicable. Hence, this produces still another infinite family of finitely-generated, locally-indicable groups with no faithful action by \(C^1\) diffeomorphisms of the closed interval. (Compare Remark 1.16.)
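
The hypotheses of Remark 2.2 for \(A_{m,n}\) can be checked by a direct computation of the characteristic polynomial (a routine verification, included here for convenience):

$$\begin{aligned} \det (A_{m,n} - \lambda I) = (m-\lambda )^2 + n^2, \qquad \text {so} \qquad \lambda = m \pm i\, n, \quad |\lambda | = \sqrt{m^2+n^2} \ge \sqrt{2} > 1. \end{aligned}$$

Since \(n \ge 1\), neither eigenvalue is real, and neither has modulus 1; the same holds for \(A_{m,n}^T\), which has the same eigenvalues. Hence every representation of G(m, n) into \(\mathrm {Diff}^1_+([0,1])\) has Abelian image, and as G(m, n) is non-Abelian, no such representation is injective.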

In view of the discussion above, the proof of Proposition 1.2 is completed by the following

Lemma 2.3

Suppose that the matrix \(A \in GL_d(\mathbb Q)\) is \(\mathbb Q\)-irreducible. If \(\lambda \in \mathbb {R}\) is an eigenvalue of \(A^T\) and \(v := (t_1,\ldots ,t_d)\in \mathbb {R}^d\) is such that \(A^T v = \lambda v\), then \(\{t_1,\ldots ,t_d\}\) is a \(\mathbb Q\)-linearly-independent subset of \(\mathbb {R}\).

Proof

If v is an eigenvector of \(A^T\), then the subspace \(v^\perp \subseteq \mathbb {R}^d\) is invariant under A. Since A is \(\mathbb Q\)-irreducible, we have \(v^\perp \cap \mathbb Q^d = \{ 0 \}\). Therefore, if \(v := (t_1,\ldots ,t_d)\) and \(\beta _1,\ldots ,\beta _d\) in \(\mathbb Q\) verify \(\beta _1 t_1 + \cdots + \beta _d t_d=0\), then we have \(\beta _1=\cdots =\beta _d=0\). \(\square \)
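
The two steps of this argument can be spelled out as follows (a short verification, in which \(\langle \cdot ,\cdot \rangle \) denotes the standard inner product of \(\mathbb {R}^d\), a notation introduced only for this computation). For the invariance of \(v^\perp \), we have

$$\begin{aligned} \langle A w, v \rangle = \langle w, A^T v \rangle = \lambda \langle w, v \rangle = 0 \quad \text {for all} \quad w \in v^\perp . \end{aligned}$$

For the trivial intersection, if there were a nonzero \(w \in v^\perp \cap \mathbb Q^d\), then the \(\mathbb Q\)-span of \(\{A^k w : k \ge 0\}\) would be a nonzero A-invariant subspace of \(\mathbb Q^d\) contained in \(v^\perp \). It would be proper, because \(v \ne 0\) implies that \(v^\perp \) does not contain all of \(\mathbb Q^d\), and this would contradict the \(\mathbb Q\)-irreducibility of A. Finally, a relation \(\beta _1 t_1 + \cdots + \beta _d t_d = 0\) says precisely that \((\beta _1,\ldots ,\beta _d)\) belongs to \(v^\perp \).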

3 On continuous actions on the interval

In this section, we deal with actions by homeomorphisms. The proof below was given in [35] for the Baumslag–Solitar group BS(1, 2). As we next see, the argument can be adapted to the group G.

Proof of Lemma 1.4

Assume that H has a global fixed point x. Then the a-orbit of x is made up of global fixed points of H. Therefore, this a-orbit has to accumulate at both 0 and 1, which means that a has no fixed point in (0, 1). Indeed, otherwise this a-orbit would accumulate at a common fixed point \(x_* \in (0,1)\) of both H and a, which would mean that \(x_*\) is G-invariant, contrary to our assumption.

We next show that if G acts by homeomorphisms of [0, 1] in such a way that H admits no global fixed point on (0, 1), then the action is semiconjugate to that of an affine group. To do this, we let \(N\subseteq H\) be the set of elements having a fixed point inside (0, 1). As H is Abelian, N is easily seen to be a subgroup. We claim that N is strictly contained in H. Indeed, let \(\{ b_1,\ldots , b_d \} \subset H\) be a \(\mathbb {Q}\)-basis of \(\mathbb {Q} \otimes H\). We affirm that one of these generators \(b_i\) has no fixed point. Indeed, if \(b_1\) has no fixed point, then we are done. Otherwise, let \(x_1\) be a fixed point of \(b_1\). If \(b_2\) has no fixed point, then we are done. Otherwise, let \(x_2\) be a fixed point of \(b_2\). If \(x_2\) is fixed by \(b_1\), then we have found a common fixed point \(x_* := x_2\). If not, then either \(b_1 (x_2)\) or \(b_1^{-1} (x_2)\) is closer to \(x_1\) than \(x_2\). In the former case, let \(x_* := \lim _{n \rightarrow \infty } b_1^n (x_2)\), and in the latter case, let \(x_*:= \lim _{n \rightarrow -\infty } b_1^n (x_2)\). By commutativity, we have that \(x_*\) is fixed by both \(b_1\) and \(b_2\) in each case. Now repeating this argument finitely many times, we will detect a common fixed point for all the elements \(b_i\) provided each of them has a fixed point. Nevertheless, this common fixed point will be fixed by all rational powers of the generators, hence by all elements in H, which is contrary to our assumption.

We now claim that there is an H-invariant infinite measure \(\nu \) on (0, 1) that is finite on compact subsets. (Actually, this follows from [27, Proposition 2.2.48], but we repeat the argument here because it is so simple.) Indeed, let \(h \in H\) be an element having no fixed point in (0, 1). Then \(H / \langle h \rangle \) naturally acts on the space \((0,1) / \! \! \sim \), where the equivalence relation \(\sim \) corresponds to being in the same orbit under h. As h is fixed-point free, \((0,1) / \!\! \sim \) is topologically a circle. Since \(H / \langle h \rangle \) is Abelian, its action on this topological circle preserves a probability measure. This measure lifts to a measure \(\nu \) on (0, 1) that is finite on compact sets and is invariant under H.

We next claim that \(\nu \) has no atoms, and that it is unique up to scalar multiple. Indeed, by [34] (see also [31, §2.4.5]), this holds whenever H / N is not isomorphic to \(\mathbb Z\), and this is the case here because N is \(A^T\)-invariant and \(A^T\) has no eigenvalue of modulus 1.

Now, as H is normal in G, we have that \(a_*(\nu )\) is also invariant by H. (Recall that, by definition, \(a_*(\mu ) (S):=\mu (a^{-1}(S))\).) By uniqueness, this implies that \(a_*(\nu )=\lambda \nu \) for some \(\lambda \in \mathbb {R}_{>0}\). More generally, for every \(g \in G\), there exists \(\lambda _g \in \mathbb {R}_{>0}\) such that \(g_*(\nu ) = \lambda _g \nu \). The map \(g \mapsto \lambda _g\) is a homomorphism from G into \(\mathbb {R}^*_{>0}\). It is then easy to check that the map \(\psi \! :G\rightarrow \mathrm {Aff}_+(\mathbb {R})\), \(g\mapsto \psi _g\), defined by

$$\begin{aligned} \psi _g(x) := \frac{1}{\lambda _g}x + \nu \big ( [1/2,g(1/2)] \big ) \end{aligned}$$

is a representation, where \(\nu ([q,p]) := -\nu ([p,q])\) for \(q > p\). Indeed, for \(f,g\in G\), we have

$$\begin{aligned} \psi _{fg}(x)= & {} \frac{1}{\lambda _f}\left( \frac{1}{\lambda _g} x + \nu \big ( [f^{-1}(1/2),g(1/2)] \big ) \right) \\= & {} \frac{1}{\lambda _f}\left( \frac{1}{\lambda _g} x +\nu \big ( [1/2,g(1/2)] \big ) + \nu \big ( [f^{-1}(1/2),1/2] \big ) \right) \\= & {} \frac{1}{\lambda _f} \left( \frac{1}{\lambda _g}x+\nu \big ( [1/2,g(1/2)] \big ) \right) + \nu \big ( [1/2,f(1/2)] \big )\\= & {} \psi _f\circ \psi _g(x).\end{aligned}$$

Moreover, the map \(F \!:(0,1)\rightarrow \mathbb {R}\) defined by \(F(x):=\nu ([1/2,x])\) semiconjugates the action of G with \(\psi \). Indeed,

$$\begin{aligned} F (g(x))= & {} \nu \big ( [1/2,g(x)] \big )\\= & {} \nu \big ( [g(1/2),g(x)] \big ) +\nu \big ( [1/2,g(1/2)] \big )\\= & {} g^{-1}_*\nu \big ( [1/2,x] \big ) + \nu \big ( [1/2,g(1/2)] \big )\\= & {} \frac{1}{\lambda _g} F(x) + \nu \big ( [1/2,g(1/2)] \big )\\= & {} \psi _g \big ( F(x) \big ), \end{aligned}$$

as desired. \(\square \)

In the statement of Lemma 1.4, the semiconjugacy is not necessarily a conjugacy. This easily follows by applying a Denjoy-like technique replacing the orbit of a single point by that of a wandering interval. See also Theorem 4.8 below, where this procedure is carried out smoothly on the open interval.

4 On \(C^1\) actions on the interval

4.1 All actions are semiconjugate to affine ones

In this section, we prove Proposition 1.5. Suppose for a contradiction that, for an action of G by \(C^1\) diffeomorphisms of [0, 1] without global fixed points in (0, 1), the subgroup H acts nontrivially on (0, 1) but has a global fixed point inside. For each \(x\in (0,1)\) which is not fixed by H, let us denote by \(I_x\) the maximal interval containing x such that H has no fixed point inside. Since G has no global fixed point in (0, 1) and H is normal in G, we must have

$$\begin{aligned} a^n(I_x)\cap I_x =\emptyset \quad \text {for all} \quad n\not = 0. \end{aligned}$$

In particular, \(I_x\) is contained in (0, 1). Moreover, a has no fixed point in (0, 1), and up to replacing it by its inverse, we may suppose that \(a(z) > z\) for all \(z \in (0,1)\). Notice that the set of fixed points of H is invariant under a, hence it accumulates at both endpoints of [0, 1].

The rough idea now is, for a point x not fixed by H as above, to apply \(a^{-1}\) iteratively at x and examine the behavior of an appropriately defined displacement vector (see (2) below). Our first lemmata (4.2 to 4.4) build the groundwork needed to show that the direction of this vector nearly converges along a subsequence (Lemma 4.5). That A is hyperbolic then implies that, along this subsequence, the magnitude of the vector is uniformly expanded (Lemma 4.7), giving a contradiction.

To implement the strategy above, we first recall a useful tool that arises in this context, namely, there is an H-invariant infinite measure \(\mu _x\) supported on \(I_x\) which is finite on compact subsets. This measure is not unique, but independently of the choice, we can define the translation number homomorphism \(\tau _{\mu _x} \!: H\rightarrow \mathbb {R}\) by \(\tau _{\mu _x}(h) := \mu _x([z,h(z)))\). (This value is independent of \(z \in I_x\).) The kernel \(K_x\) of this homomorphism coincides with the set of elements of H having fixed points inside \(I_x\); see [27, Section 2.2.5] for all of this.

From now on, as explained at the beginning of Sect. 2, we let \(\{b_1,\ldots ,b_d\}\) be the canonical basis of \(\mathbb Q^d\), which (with no loss of generality) we assume to be contained in H. Although unnatural, this choice equips \(\mathbb {R}\otimes H\) with an inner product, which leads to the following crucial notion.

Definition 4.1

For every \(I_x\) as above, we define the translation vector \(\vec {\tau }_{\mu _x} \in \mathbb {R}^d\) as the unit vector (with respect to the max norm) pointing in the direction \((t_1,\ldots , t_d)\), where \(t_i := \mu _x ([z, b_i(z)))\).

In the sequel, we will denote \(\vec {\tau }_{\mu _x}\) simply by \(\vec {\tau }_x\). We have

Lemma 4.2

The directions of \(\vec {\tau }_{a^{-1}(x)}\) and \(A^T \vec {\tau }_x\) coincide.

Proof

Since \(a^*(\mu _x)\) (recall that \(a^{*} (\nu ) (S) := \nu (a(S))\)) is an H-invariant measure on \(I_{a^{-1}(x)}\), by definition, the \(i^{th}\) entry of \(\vec {\tau }_{a^{-1}(x)}\) coincides, up to a positive normalizing factor, with \(a^{*} (\mu _x) \big ( [a^{-1}(x),b_i (a^{-1}(x))) \big ).\) Thus, up to this factor, this entry equals

$$\begin{aligned} \mu _x \big ( [x, ab_ia^{-1}(x)) \big )= & {} \tau _{\mu _x} (a b_i a^{-1})\\= & {} \frac{1}{k} \tau _{\mu _x} (a b_i^k a^{-1})\\= & {} \frac{1}{k} \tau _{\mu _x} \left( b_1^{k \alpha _{1,i}} b_2^{k \alpha _{2,i}} \cdots b_d^{k \alpha _{d,i}}\right) . \end{aligned}$$

Choosing k so that all \(k \alpha _{i,j}\) belong to \(\mathbb {Z}\), this yields

$$\begin{aligned} \mu _x \big ( [x, ab_ia^{-1}(x)) \big ) = \frac{1}{k} \sum _{j = 1}^{d} k \alpha _{j,i} \tau _{\mu _x} (b_j) = \sum _{j=1}^d \alpha _{j,i} \tau _{\mu _x} (b_j), \end{aligned}$$

as desired. \(\square \)

We now state our main tool to deal with \(C^1\) diffeomorphisms. Roughly, it says that diffeomorphisms that are close enough to the identity in the \(C^1\) topology behave like translations under composition. For each \(\delta > 0\), we denote by \(U_{\delta }(id)\) the neighborhood of the identity in \(\mathrm {Diff}_+^1([0,1])\) given by

$$\begin{aligned} U_{\delta }(id) := \Big \{ f \in \mathrm {Diff}_+^1([0,1]) \!: \sup _{z \in [0,1]} \big | Df(z) - 1 \big | < \delta \Big \}. \end{aligned}$$

Proposition 4.3

[4] For each \(\eta >0\) and all \(k\in \mathbb N\), there exists a neighborhood U of the identity in \(\mathrm {Diff}^1_+([0,1])\) such that for all \(f_1,\ldots , f_k\) in U, all \(\epsilon _1,\ldots , \epsilon _k\) in \(\{-1,1\}\) and all \(x\in [0,1]\), we have

$$\begin{aligned} \Big | \big [ f_k^{\epsilon _k}\circ \cdots \circ f_1^{\epsilon _1}(x) - x \big ] - \sum _i \epsilon _i(f_i(x)-x) \Big | \le \eta \max _j \big \{ |f_j(x)-x| \big \}. \end{aligned}$$

Proof

First of all, observe that if \(g\in \mathrm {Diff}^1_+([0,1])\) satisfies \(|Dg(z)-1|<\lambda \) for all \(z\in [0,1]\), then for all \(x, y \in [0,1]\),

$$\begin{aligned} \big | (g(x)-x)-(g(y)-y) \big | < \lambda |x-y|. \end{aligned}$$

Next, notice that for every \(f_i \in U_{\delta }(id)\) and all \(x \in [0,1]\), there exists \(y \in [0,1]\) such that

$$\begin{aligned} \big | (f_i^{-1}(x) - x) - (x - f_i(x)) \big |= & {} \big | (f_i(x) - x) - (f_i(f_i^{-1}(x)) - f_i^{-1}(x)) \big |\\= & {} \big | Df_i(y) - 1 \big | \cdot \big | x - f_i^{-1}(x) \big |\\\le & {} \delta |x - f_i^{-1}(x)|. \end{aligned}$$

Using this, it is not hard to see that we may assume that \(\epsilon _i=1\) for all i.

We proceed by induction on k. The case \(k=1\) is trivial. Suppose the statement holds up to \(k-1\), and choose \(\delta >0\) so that it applies to any \(k-1\) diffeomorphisms in the neighborhood \(U=U_{\delta }(id)\) for the constant \(\eta /2\). We may suppose \(\delta \) is small enough to verify \(\delta (k-1 +\eta /2)<\eta /2\). Now take \(f_1,\ldots , f_k\) in \(U_{\delta }(id)\) and \(x\in [0,1]\). Then the value of the expression

$$\begin{aligned} \Big | f_k\circ \cdots \circ f_1 (x)-x -\displaystyle \sum _{i=1}^k (f_i(x)-x) \Big | \end{aligned}$$

is smaller than or equal to

$$\begin{aligned} \Big | f_k\circ \cdots \circ f_1 (x)- f_{k-1}\circ \cdots \circ f_1 (x) -(f_k(x)-x) \Big | + \Big | f_{k-1}\circ \cdots \circ f_1 (x)-x \!-\! \displaystyle \sum _{i=1}^{k-1} (f_i(x)-x) \Big |. \end{aligned}$$

Now notice that, by the inductive hypothesis, the second term in the sum above is bounded from above by \(\eta /2 \max _j|f_j(x)-x|\). Moreover, the observation at the beginning of the proof and the inductive hypothesis yield

$$\begin{aligned} \big | f_k(f_{k-1}\circ \cdots \circ f_1 (x))- f_{k-1}\circ \cdots \circ f_1 (x)-(f_k(x)-x) \big | \le \delta \big | f_{k-1}\circ \cdots \circ f_1 (x)-x \big |\\ \le \delta \left( \sum _{i=1}^{k-1}|f_i(x)-x|+\varepsilon \right) , \end{aligned}$$

with \(\varepsilon < \eta /2 \max _j|f_j(x)-x|\). By the choice of \(\delta \), the last expression is bounded from above by \(\eta /2 \max _j|f_j(x)-x|\), thus finishing the proof. \(\square \)
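The estimate of Proposition 4.3 can be probed numerically. The following is only a sanity check under stated assumptions, not part of the argument: the test maps \(f_c(x) = x + c\,x(1-x)\) and the helper names `make_map`, `additivity_defect` and `worst_defect` are hypothetical choices made for illustration. For \(|c| < 1\), each \(f_c\) is a diffeomorphism of [0, 1] with \(|Df_c - 1| \le |c|\), so small coefficients correspond to a small neighborhood \(U_{\delta }(id)\).

```python
# Numerical sanity check of the estimate in Proposition 4.3 (illustrative only).
# The maps f_c(x) = x + c*x*(1-x) are hypothetical test maps, not taken from the text;
# for |c| < 1 they are C^1 diffeomorphisms of [0,1] with |Df_c - 1| <= |c|.

def make_map(c):
    return lambda x: x + c * x * (1.0 - x)

def additivity_defect(coeffs, x):
    """Return |(f_k o ... o f_1)(x) - x - sum_i (f_i(x) - x)| / max_j |f_j(x) - x|."""
    maps = [make_map(c) for c in coeffs]
    y = x
    for f in maps:              # apply f_1 first, f_k last
        y = f(y)
    displacements = [f(x) - x for f in maps]
    return abs((y - x) - sum(displacements)) / max(abs(d) for d in displacements)

def worst_defect(coeffs, samples=999):
    # Interior sample points only: at 0 and 1 all displacements vanish.
    return max(additivity_defect(coeffs, i / (samples + 1))
               for i in range(1, samples + 1))
```

Calling `worst_defect` on coefficient lists of decreasing size gives an empirical value of the constant \(\eta \) for the chosen maps, which should shrink together with the coefficients.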

We next deduce some consequences from this proposition. To do this, first recall that the set of global fixed points of H accumulates at both endpoints of [0, 1]. Hence, given an element \(b \in H\), for each \(\delta > 0\), there is \(\sigma _1 > 0\) which is fixed by H and such that b restricted to \([0,\sigma _1]\) belongs to the \(U_{\delta }(id)\)-neighborhood of the identity in \(\mathrm {Diff}^1_+([0,\sigma _1])\) (the latter group is being identified with \(\mathrm {Diff}^1_+([0,1])\) just by rescaling the interval). Similarly, there is \(\sigma _2 > 0\) such that \(1 - \sigma _2\) is fixed by H and b restricted to \([1-\sigma _2,1]\) belongs to the \(U_{\delta }(id)\)-neighborhood of the identity in \(\mathrm {Diff}^1_+([1-\sigma _2,1])\).

For \(x \in [0,1]\), let us consider the displacement vector \(\triangle (x)\) defined by

$$\begin{aligned} \triangle (x) := \big ( b_1(x)-x, \ldots , b_d(x)-x \big ) \in \mathbb {R}^d, \end{aligned}$$
(2)

and let us denote by \(\Vert \triangle (x) \Vert \) its max norm. Notice that \(\Vert \triangle (x) \Vert \le 1\) for all \(x\in [0,1]\).

Lemma 4.4

For all \(r > 0\), there exists \(\sigma >0\) such that

$$\begin{aligned} \triangle (a^{-1}(x))=Da^{-1}(0) \, A^T \triangle (x)+\epsilon (x) \quad \text {for all} \quad x\in (0,\sigma ) \end{aligned}$$

and

$$\begin{aligned} \triangle (a(x))=Da(1) \,(A^T)^{-1}\triangle (x)+\hat{\epsilon }(x) \quad \text {for all} \quad x\in (1-\sigma ,1), \end{aligned}$$

where \(\Vert \epsilon (x) \Vert \le r \big ( \Vert \triangle (x) \Vert + \Vert \triangle (a^{-1}(x)) \Vert \big )\) and \(\Vert \hat{\epsilon }(x)\Vert \le r \big ( \Vert \triangle (x) \Vert + \Vert \triangle (a(x)) \Vert \big )\).

Proof

Both assertions being analogous, we will prove only the first one. Let \(q \in \mathbb {N}\) be such that \(\beta _{i,j} := q\alpha _{i,j}\) is an integer for each i, j. Let U be a neighborhood of the identity in \(\mathrm {Diff}^1_+([0,1])\) for which Proposition 4.3 holds for \(\eta > 0\) and \(k \in \mathbb {N}\) defined as

$$\begin{aligned} \eta := \frac{r}{2D^2 + \max _i \sum _j | \alpha _{j,i} |} \quad \text{ and } \quad k := \max \Big \{ \max _i \Big \{ \sum _j |\beta _{j,i}| \Big \}, q \Big \}, \end{aligned}$$

where \(D := \sup _z \max \{ Da (z), Da^{-1}(z) \}\). Let \(\sigma > 0\) be fixed by H such that the (renormalized) restrictions to \([0,\sigma ]\) of all the maps \(b_j\) and \(ab_ja^{-1}\), as well as their inverses, belong to U, and such that \(|Da^{-1}(z) - Da^{-1}(0)| \le \eta \) holds for all \(z \in [0,\sigma ]\). Then, by Proposition 4.3,

$$\begin{aligned} a b_i^q a^{-1} (x) - x = (a b_i a^{-1})^q (x) - x = q ( ab_i a^{-1}(x) - x) + r_{i,1} (x) \end{aligned}$$

and

$$\begin{aligned} b_1^{\beta _{1,i}} \cdots b_d^{\beta _{d,i}} (x) - x = \sum _j \beta _{j,i} (b_j(x) - x) + r_{i,2}(x), \end{aligned}$$

where

$$\begin{aligned} | r_{i,1} (x) | \le \eta \big | ab_ia^{-1}(x) - x \big | \quad \text{ and } \quad | r_{i,2}(x) | \le \eta \max _j \big \{ | b_j(x) - x | \big \} = \eta \Vert \triangle (x) \Vert . \end{aligned}$$

Since

$$\begin{aligned} a b_i^q a^{-1} = b_1^{\beta _{1,i}} \cdots b_d^{\beta _{d,i}}, \end{aligned}$$

we conclude that

$$\begin{aligned} q ( ab_i a^{-1}(x) - x) + r_{i,1} (x) = \sum _j \beta _{j,i} (b_j(x) - x) + r_{i,2}(x), \end{aligned}$$

hence

$$\begin{aligned} ab_i a^{-1}(x) - x = \sum _j \alpha _{j,i} (b_j(x) - x) + \frac{r_{i,2}(x) - r_{i,1}(x)}{q}.\end{aligned}$$
(3)

The i-th entry of the vector \(\triangle (a^{-1}(x))\) is

$$\begin{aligned} b_ia^{-1}(x)-a^{-1}(x) = a^{-1}ab_ia^{-1}(x)-a^{-1}(x) = Da^{-1}(z_i) \big ( ab_ia^{-1}(x) - x \big ), \end{aligned}$$

where the last equality holds for a certain point \(z_i\in I_x\). By (3) above, for \(x \in (0,\sigma )\), this expression equals

$$\begin{aligned} Da^{-1}(z_i) \sum _j \alpha _{j,i} \big ( b_j(x)-x \big ) \end{aligned}$$

up to an error \(\tilde{\epsilon }_i(x)\) satisfying

$$\begin{aligned} |\tilde{\epsilon }_i(x)| \le Da^{-1} (z_i) \cdot \frac{| r_{i,1} (x) | + | r_{i,2} (x) |}{q} \le 2D \eta \max \big \{ \Vert \triangle (x) \Vert , |ab_ia^{-1}(x)-x| \big \}. \end{aligned}$$

Since

$$\begin{aligned} ab_ia^{-1}(x) - x = a(b_ia^{-1}(x)) - a(a^{-1}(x)) = Da (z_i') \big ( b_i (a^{-1}(x)) - a^{-1}(x) \big ) \end{aligned}$$

for a certain \(z_i' \in I_{a^{-1}(x)}\), we have

$$\begin{aligned} |\tilde{\epsilon }_i(x)| \le 2D \eta \max \big \{ \Vert \triangle (x) \Vert , D \Vert \triangle (a^{-1}(x)) \Vert \big \} \le 2D^2 \eta \max \big \{ \Vert \triangle (x) \Vert , \Vert \triangle (a^{-1}(x)) \Vert \big \}. \end{aligned}$$

Moreover, by the choice of \(\sigma \), the value of \(Da^{-1}(z_i) \sum _j \alpha _{j,i} (b_j(x)-x)\) equals

$$\begin{aligned} Da^{-1}(0) \sum _j \alpha _{j,i} (b_j(x)-x) \end{aligned}$$

up to an error bounded from above by

$$\begin{aligned} \eta \Big | \sum _j \alpha _{j,i} (b_j(x)-x) \Big | \le \eta \Vert \triangle (x) \Vert \sum _j |\alpha _{j,i}|. \end{aligned}$$

Summarizing, \(b_ia^{-1}(x)-a^{-1}(x)\) coincides with \(Da^{-1}(0) \sum _j \alpha _{j,i} \big ( b_j(x) - x \big )\) up to an error \(\varepsilon _i(x)\) satisfying

$$\begin{aligned} |\varepsilon _i(x)| \le 2 D^2 \eta \max \big \{ \Vert \triangle (x) \Vert , \Vert \triangle (a^{-1}(x)) \Vert \big \} + \eta \Vert \triangle (x) \Vert \sum _j |\alpha _{j,i}|. \end{aligned}$$

By the choice of \(\eta \), the last expression is smaller than or equal to \(r \big ( \Vert \triangle (x) \Vert + \Vert \triangle (a^{-1}(x)) \Vert \big )\), which finishes the proof. \(\square \)

Before stating our next lemma, we observe that Lemma 4.2 and the compactness of the unit sphere \(S^{d-1} \subset \mathbb {R}^d\) imply that for each point \(x_0\) not fixed by H, the vectors \(\vec {\tau }_{a^{-n}(x_0)}\) (resp., \(\vec {\tau }_{a^{n}(x_0)}\)) accumulate at some \(\vec {\tau } \in S^{d-1}\) (resp., \(\vec {\tau }_*\)) as \(n \rightarrow \infty \). For each \(n \in \mathbb {Z}\), we let \(x_n := a^{-n} (x_0)\), and we choose a sequence of positive integers \(n_k\) such that \(\vec {\tau }_{x_{n_k}}\rightarrow \vec {\tau }\) and \(\vec {\tau }_{x_{-n_k}}\rightarrow \vec {\tau }_*\) as \(k\rightarrow \infty \).

Lemma 4.5

For every \(\eta >0\), there exists K such that \(k \ge K\) implies

$$\begin{aligned} \frac{\triangle (x_{n_k})}{\Vert \triangle (x_{n_k})\Vert }=\vec {\tau }+\epsilon (k) \quad \text{ and } \quad \frac{\triangle (x_{-n_k})}{\Vert \triangle (x_{-n_k})\Vert }=\vec {\tau }_*+\epsilon _* (k), \end{aligned}$$

where \(\Vert \epsilon (k) \Vert \le \eta \) and \(\Vert \epsilon _*(k) \Vert \le \eta \).

Proof

Let \(H_1\) be the subgroup of H generated by \(b_1,\ldots ,b_d\). Up to passing to a subsequence if necessary, there is \(b_* \in \{b_1,\ldots , b_d\}\) such that for all k,

$$\begin{aligned} | b_*(x_{n_k})-x_{n_k} | = \max _i \big \{ | b_i(x_{n_k})-x_{n_k} | \big \}. \end{aligned}$$

Then the functions \(\psi _k: H_1 \rightarrow \mathbb {R}\) defined by

$$\begin{aligned} \psi _k (b) = \frac{b(x_{n_k}) - x_{n_k}}{b_*(x_{n_k})-x_{n_k}} \end{aligned}$$

converge as \(k \rightarrow \infty \) to a homomorphism \(\psi \!: H_1 \rightarrow \mathbb {R}\) which is normalized, in the sense that \(\max _i |\psi (b_i)| = 1\). Indeed, this is the content of Thurston’s stability theorem [38] (which in turn can be easily deduced from Proposition 4.3).

The vectors \(\vec {\tau }_k := \vec {\tau }_{x_{n_k}}\) and \(\vec {\tau }\) naturally induce normalized homomorphisms from H into \(\mathbb {R}\), namely the normalized translation number homomorphisms and their limit homomorphism. We still denote them by \(\vec {\tau }_k\) and \(\vec {\tau }\), respectively. For these homomorphisms and any b, c in H, the inequality \(\vec {\tau } (b) < \vec {\tau } (c)\) implies \(\vec {\tau }_k (b) < \vec {\tau }_k (c)\) for k larger than a certain \(K_0\), which implies \(b (z) < c (z)\) for all \(z \in I_{x_{n_{k}}}\) and all \(k > K_0\). By evaluating at \(z = x_{n_k}\), this yields \(\psi _k (b) < \psi _k (c)\) for \(k > K_0\). Passing to the limit, we finally obtain \(\psi (b) \le \psi (c)\). As a consequence, there must exist a constant \(\lambda \) for which \(\vec {\tau } = \lambda \psi \). Nevertheless, since both homomorphisms are normalized (and point in the same direction), we must have \(\lambda = 1\), which yields the convergence of \(\triangle (x_{n_k}) / \Vert \triangle (x_{n_k})\Vert \) towards \(\vec {\tau }\). The second convergence is proved in an analogous way. \(\square \)

Henceforth, and in many other parts of this work, we will use a trick due to Muller and Tsuboi that allows reducing to the case where all group elements are tangent to the identity at the endpoints. This is achieved after conjugacy by an appropriate homeomorphism that is smooth in the interior and flattens the germs at the endpoints. In concrete terms, we have:

Lemma 4.6

[24, 40] Let us consider the germ at the origin of the local (non-differentiable) homeomorphism \(\varphi (x) := sgn(x) \exp (-1/|x|)\). Then for every germ of \(C^1\) diffeomorphism f (resp. vector field \(\mathcal {X}\)) at the origin, the germ of the conjugate \(\varphi ^{-1} \circ f \circ \varphi \) (resp., push-forward \(\varphi _* (\mathcal {X})\)) is differentiable and flat in a neighborhood of the origin.

We should stress, however, that although this lemma simplifies many computations, in what follows it may be avoided by just noticing that, as Da is continuous, the element a behaves like an affine map close to each endpoint.

Recall that \(\mathbb {R}^d\) decomposes as \(E^s\oplus E^u\), where \(E^s\) (resp. \(E^u\)) stands for the stable (unstable) subspace of \(A^T\). We denote by \(\pi _s\) and \(\pi _u\) the projections onto \(E^s\) and \(E^u\), respectively. We let \(\Vert \cdot \Vert _*\) be the natural norm on \(\mathbb {R}^d\) associated to this direct-sum structure, namely,

$$\begin{aligned} \Vert v \Vert _* := \max \{\Vert \pi _s (v) \Vert , \Vert \pi _u (v) \Vert \}. \end{aligned}$$

Lemma 4.7

For any neighborhood \(V \subset S^{d-1}_*\) of \(E^u \cap S^{d-1}_*\) in the unit sphere \(S^{d-1}_* \subset \mathbb {R}^d\) (with the norm \(\Vert \cdot \Vert _*\)), there is \(\sigma >0\) such that for all \(x \in (0,\sigma )\) not fixed by H,

$$\begin{aligned} \frac{\triangle (x)}{\Vert \triangle (x)\Vert _*}\in V \quad \Longrightarrow \quad \frac{\triangle (a^{-1}(x))}{\Vert \triangle (a^{-1}(x)) \Vert _*} \in V. \end{aligned}$$

Moreover, if V is small enough, then there exists \(\kappa > 1\) such that

$$\begin{aligned} \frac{\triangle (x)}{\Vert \triangle (x) \Vert _*} \in V \quad \Longrightarrow \quad \Vert \triangle (a^{-1}(x)) \Vert _*\ge \kappa \Vert \triangle (x)\Vert _* . \end{aligned}$$

Proof

For the first statement, we need to show that for every prescribed positive \(\varepsilon < 1\), for points x close to the origin and not fixed by H, we have

$$\begin{aligned} \frac{\Vert \pi _s \triangle (a^{-1}(x))\Vert }{\Vert \pi _u \triangle (a^{-1}(x)) \Vert }< \varepsilon \quad \text{ provided } \quad \frac{\Vert \pi _s \triangle (x) \Vert }{\Vert \pi _u \triangle (x) \Vert } < \varepsilon . \end{aligned}$$

Let \(\lambda > 1\) (resp., \(\lambda ' < 1\)) be such that the norms of nonzero vectors in \(E^u\) (resp., \(E^s\)) are expanded by at least \(\lambda \) (resp., contracted by at least \(\lambda '\)) under the action of \(A^T\). Choose a positive \(r < \varepsilon \) small enough so that

$$\begin{aligned} \frac{1 + r}{1 - r} \left[ \frac{\varepsilon \lambda '}{\lambda - 2r} + \frac{2r}{\lambda - 2r} \right] \le \frac{\varepsilon - \frac{r}{1 - r}}{1 + \frac{\varepsilon r}{1 + r}}. \end{aligned}$$
(4)

Consider a point x not fixed by H lying in the interval \((0,\sigma )\) given by Lemma 4.4. Then from

$$\begin{aligned} \Vert \pi _s \triangle (a^{-1}(x)) \Vert\le & {} \Vert \pi _s A^T \triangle (x)\Vert + r \big ( \Vert \triangle (x) \Vert + \Vert \triangle (a^{-1}(x)) \Vert \big ) \\\le & {} \lambda ' \Vert \pi _s \triangle (x) \Vert + r \Vert \pi _s \triangle (x)\Vert + r \Vert \pi _u \triangle (x)\Vert + r \Vert \pi _s \triangle (a^{-1}(x))\Vert \\&+\, r \Vert \pi _u \triangle (a^{-1}(x))\Vert \end{aligned}$$

we obtain

$$\begin{aligned} \Vert \pi _s \triangle (a^{-1}(x)) \Vert \le \frac{1}{1-r} \left[ \lambda ' \Vert \pi _s \triangle (x) \Vert + 2r \Vert \pi _u \triangle (x)\Vert + r \Vert \pi _u \triangle (a^{-1}(x))\Vert \right] . \end{aligned}$$
(5)

Similarly, from

$$\begin{aligned} \Vert \pi _u \triangle (a^{-1}(x)) \Vert\ge & {} \Vert \pi _u A^T \triangle (x)\Vert - r \big ( \Vert \triangle (x) \Vert + \Vert \triangle (a^{-1}(x)) \Vert \big ) \\\ge & {} \lambda \Vert \pi _u \triangle (x) \Vert - 2r \Vert \pi _u \triangle (x) \Vert - r \Vert \pi _u \triangle (a^{-1}(x)) \Vert - r \Vert \pi _s \triangle (a^{-1}(x)) \Vert , \\ \end{aligned}$$

we obtain

$$\begin{aligned} \Vert \pi _u \triangle (a^{-1}(x)) \Vert \ge \frac{1}{1+r} \left[ (\lambda - 2r) \Vert \pi _u \triangle (x) \Vert - r \Vert \pi _s \triangle (a^{-1}(x)) \Vert \right] . \end{aligned}$$
(6)

Thus, letting \(\alpha := \Vert \pi _s \triangle (a^{-1}(x)) \Vert \), \(\beta := \Vert \pi _u \triangle (a^{-1}(x)) \Vert \),

$$\begin{aligned} A := \frac{1}{1-r} \left[ \lambda ' \Vert \pi _s \triangle (x) \Vert + 2r \Vert \pi _u \triangle (x)\Vert \right] \quad \text{ and } \quad B := \frac{(\lambda - 2r) \Vert \pi _u \triangle (x) \Vert }{1+r}, \end{aligned}$$

we have that (5) and (6) translate into

$$\begin{aligned} \alpha - \frac{\beta r}{1-r} \le A \qquad \text{ and } \qquad \beta + \frac{\alpha r}{1 + r} \ge B, \end{aligned}$$

and hence,

$$\begin{aligned} \frac{\frac{\alpha }{\beta } - \frac{r}{1-r}}{1 + \frac{\alpha }{\beta } \cdot \frac{r}{1+r}} \le \frac{A}{B}. \end{aligned}$$

But

$$\begin{aligned} \frac{A}{B} = \frac{1+r}{1-r} \left[ \frac{\lambda ' }{\lambda - 2r} \frac{\Vert \pi _s \triangle (x) \Vert }{ \Vert \pi _u \triangle (x) \Vert } + \frac{2r}{\lambda - 2r} \right] < \frac{1+r}{1-r} \left[ \frac{\varepsilon \lambda ' }{\lambda - 2r} + \frac{2r}{\lambda - 2r} \right] . \end{aligned}$$

Therefore, by the choice of r (see (4)), and the fact that \( x\mapsto \frac{x-c}{1+x d}\) is increasing in x for positive c, d, we obtain that \(\alpha / \beta < \varepsilon \), which shows the first assertion of the lemma.

To conclude the proof, notice that by the estimate (6) above,

$$\begin{aligned} \Big ( 1 + \frac{r}{1+r} \Big ) \Vert \triangle (a^{-1}(x)) \Vert _*= & {} \Big ( 1 + \frac{r}{1+r} \Big ) \Vert \pi _u \triangle (a^{-1}(x)) \Vert \\\ge & {} \frac{\lambda - 2r}{1 + r} \Vert \pi _u \triangle (x) \Vert = \frac{\lambda - 2 r}{1 + r} \Vert \triangle (x) \Vert _*, \end{aligned}$$

which shows the second assertion of the lemma for \(\kappa := (\lambda - 2r) / (1 + 2 r)\) and r small enough. \(\square \)
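The mechanism behind Lemma 4.7 is easiest to see in a purely linear toy model, where the error terms of Lemma 4.4 are dropped. The sketch below makes several assumptions that are not in the text: the matrix `M` is a hypothetical example of a hyperbolic \(A^T\), the eigen-data are hard-coded for that matrix, and the helper names are ours. It computes the ratio \(\Vert \pi _s v \Vert / \Vert \pi _u v \Vert \) and the norm \(\Vert v \Vert _*\) before and after applying `M`.

```python
import math

# Toy linear model of Lemma 4.7 (illustrative; M is a hypothetical example of a
# hyperbolic A^T, and the C^1 error terms of Lemma 4.4 are ignored).
M = ((2.0, 1.0), (1.0, 1.0))
LAM_U = (3.0 + math.sqrt(5.0)) / 2.0   # unstable eigenvalue of M, > 1
LAM_S = (3.0 - math.sqrt(5.0)) / 2.0   # stable eigenvalue of M, < 1
E_U = (1.0, LAM_U - 2.0)               # eigenvector of M for LAM_U
E_S = (1.0, LAM_S - 2.0)               # eigenvector of M for LAM_S

def apply(v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def max_norm(v):
    return max(abs(v[0]), abs(v[1]))

def components(v):
    """Coordinates (s, u) with v = s*E_S + u*E_U (Cramer's rule)."""
    det = E_S[0] * E_U[1] - E_S[1] * E_U[0]
    s = (v[0] * E_U[1] - v[1] * E_U[0]) / det
    u = (E_S[0] * v[1] - E_S[1] * v[0]) / det
    return s, u

def star_norm(v):
    """max(||pi_s v||, ||pi_u v||), computed with the max norm on each summand."""
    s, u = components(v)
    return max(abs(s) * max_norm(E_S), abs(u) * max_norm(E_U))

def cone_ratio(v):
    """||pi_s v|| / ||pi_u v||; small values mean v is close to E^u."""
    s, u = components(v)
    return (abs(s) * max_norm(E_S)) / (abs(u) * max_norm(E_U))
```

In this model, a vector with small `cone_ratio` keeps a small ratio after applying `M`, and its `star_norm` is multiplied by the unstable eigenvalue. Lemma 4.7 says that both features survive the perturbation by the error terms once r is small.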

Now we can easily finish the proof of Proposition 1.5. To do this, choose a point \(x_0\in (0,1)\) that is not fixed by H. We need to consider two cases:

Case 1 \(\vec {\tau }_{x_0} \notin E^s\)

In this case, we first observe that Lemma 4.2 implies that any accumulation point of \(\vec {\tau }_{a^{-n}(x_0)}\) (in particular, \(\vec {\tau }\)) must belong to \(E^u\). Let V be a small neighborhood of \(E^u\cap S^{d-1}_*\) in \(S^{d-1}_*\) so that both statements of Lemma 4.7 hold. Then, by Lemma 4.5, the vector \(\triangle (x_{n_k}) / \Vert \triangle (x_{n_k}) \Vert _*\) belongs to V for all k larger than or equal to a certain K, which we may choose so that \(x_{n_K}\) lies in the interval \((0,\sigma )\) given by Lemma 4.7. This allows applying Lemma 4.7 inductively starting from \(x_{n_K}\), thus showing that for all \(n \ge 0\),

$$\begin{aligned} 1\ge \Vert \triangle (x_{n_K+n}) \Vert _* \ge \kappa ^n \Vert \triangle (x_{n_K}) \Vert _*. \end{aligned}$$

Letting n go to infinity, this yields a contradiction.

Case 2 \(\vec {\tau }_{x_0}\in E^s\).

In this case, Lemma 4.2 yields \(\vec {\tau }_* \in E^s\). We then proceed as above but on a neighborhood of 1 working with a instead of \(a^{-1}\) and with \((A^T)^{-1}\) instead of \(A^T\). Details are left to the reader. (This requires for instance an analog of Lemma 4.7 for the dynamics close to 1.)

4.2 Minimality of affine-like actions

In this section, we begin by showing Proposition 1.6. Let \(\phi : G \rightarrow \mathrm {Diff}^1_+([0,1])\) be a representation with non-Abelian image. We know from Proposition 1.5 that \(\phi \) is semiconjugate to a representation \(\psi \!: G \rightarrow \mathrm {Aff}_+([0,1])\) in the affine group. The elements in the commutator subgroup \([\psi (G),\psi (G)]\) are translations. In what follows, we will assume that the right endpoint is topologically attracting for \(\psi (a)\), hence \(\psi (a)\) is conjugate to a homothety \(x\mapsto \lambda x\) with \(\lambda >1\) (the other case is analogous). Up to replacing a by a positive power, we may assume that \(\lambda \ge 2\). We fix \(b \!\in \! H\) such that \(\psi (b)\) is a non-trivial translation. Up to replacing b by its inverse and conjugating \(\psi \) by an appropriate homothety, we may assume that \(\psi (b) = T_1\). We consider a finite system of generators of G that contains both a and b.

Suppose for a contradiction that \(\phi (G)\) does not act minimally. Then there is an interval I that is wandering for the action of \([\phi (G),\phi (G)]\). As before, we may assume that \(D\phi (c)(1)=1\) for all \(c\in G\). Fix \(\epsilon > 0\) such that \( (1-\epsilon )^3 > 1/2\), and let \(\delta >0\) be such that

$$\begin{aligned} 1-\epsilon \le D \phi (c) (x) \le 1+\epsilon \quad \text{ for } \text{ each } c \in \{ a^{\pm 1}, b \} \text{ and } \text{ all } x\in [1-\delta , 1]. \end{aligned}$$
(7)

Clearly, we may assume that \(I\subset [1-\delta ,1]\).

Notice that \(\psi (a^{-k} b a^k) = T_{\lambda ^{-k}}\) for all \(k \in \mathbb {Z}\). We consider the following family of translations

$$\begin{aligned} h_{(\varepsilon _i)} := (a^{-n}b^{\varepsilon _n} a^{n}) \cdots (a^{-2} b^{\varepsilon _2} a^{2}) (a^{-1} b^{\varepsilon _1} a), \end{aligned}$$

where \((\varepsilon _i) = (\varepsilon _1, \ldots ,\varepsilon _n) \in \{0,1\}^n\). These satisfy the following properties:

(i) We have that \((\varepsilon _i) \not = (\tilde{\varepsilon }_i)\) implies \(h_{(\varepsilon _i)} \not = h_{(\tilde{\varepsilon }_i)}\): this easily follows from the fact that \(\lambda \ge 2\).

(ii) We have \(\phi (h_{(\varepsilon _i)})(1-\delta ) \ge 1-\delta \): this follows from the fact that \(\phi (b)\) attracts towards 1 and that \(\varepsilon _i \ge 0\) for all i.

(iii) The element \(h_{(\varepsilon _i)} = a^{-n} (b^{\varepsilon _n}a) \cdots (b^{\varepsilon _2} a) (b^{\varepsilon _1} a)\) belongs to the ball of radius 3n in G. In particular, due to (7) and the preceding claim, we have \(D \phi (h_{(\varepsilon _i)})(x) \ge (1-\epsilon )^{3n}\) for all \(x\in [1-\delta ,1]\).

Since for each \(c \in G\) there exists \(x_I \in I\) for which \(|c(I)| = Dc(x_I)|I|\) (where \(| \cdot |\) stands for the length of the corresponding interval), putting together the assertions above we conclude

$$\begin{aligned} 1 \ge \sum _{(\varepsilon _i)} \big | h_{(\varepsilon _i)} (I) \big | \ge 2^n (1-\epsilon )^{3n} |I| > 1, \end{aligned}$$

where the last inequality holds for n large enough. This contradiction finishes the proof of Proposition 1.6.
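The final counting step only uses that \(2(1-\epsilon )^3 > 1\), so that \(2^n (1-\epsilon )^{3n} |I|\) eventually exceeds 1. As a small illustration, the following sketch computes the first n for which this happens; the function name `smallest_contradicting_n` and the sample values of \(\epsilon \) and \(|I|\) used to call it are hypothetical and not taken from the text.

```python
def smallest_contradicting_n(eps, length):
    """Smallest n >= 1 with 2**n * (1 - eps)**(3*n) * length > 1.

    Requires (1 - eps)**3 > 1/2 and 0 < length <= 1, as in the proof above.
    """
    rate = 2.0 * (1.0 - eps) ** 3          # growth factor per step, must exceed 1
    if not (rate > 1.0 and 0.0 < length <= 1.0):
        raise ValueError("need (1 - eps)^3 > 1/2 and 0 < length <= 1")
    n = 1
    while rate ** n * length <= 1.0:       # lower bound for the total length still <= 1
        n += 1
    return n
```

The loop terminates because `rate > 1`; the returned n grows like \(\log (1/|I|) / \log \big ( 2(1-\epsilon )^3 \big )\), which makes explicit how a shorter wandering interval only delays the contradiction.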

It should be emphasized that Proposition 1.6 is no longer true for \(C^1\) (even real-analytic) actions on the real line (equivalently, on the open interval). Indeed, this issue was indirectly addressed by Ghys and Sergiescu in [18, section III], as we next state and explain.

Theorem 4.8

[18] The Baumslag–Solitar group \(BS(1,2):= \big \langle a,b\mid aba^{-1}=b^2 \big \rangle \) embeds into \(\mathrm {Diff}_+^1(\mathbb {R})\) via an action that is semiconjugate, but not conjugate, to the canonical affine action and such that the element \(a \!\in \! BS(1,2)\) acts with two fixed points.

Recall that BS(1, 2) is isomorphic to the group of order-preserving affine bijections of \(\mathbb Q_2\), where \(\mathbb Q_2\) is the group of dyadic rationals. Hence, every element in BS(1, 2) may be thought of as a pair \(\big ( 2^n,\frac{p}{2^q} \big )\), which is identified with the affine map

$$\begin{aligned} \left( 2^n,\frac{p}{2^q} \right) \!: x\mapsto 2^nx+\frac{p}{2^q}. \end{aligned}$$

Notice that \(\mathbb Q_2\) corresponds to the subgroup of translations inside BS(1, 2).

Next, following [18], we consider a homeomorphism \(f:\mathbb {R}\rightarrow \mathbb {R}\) satisfying the following properties:

(I) For every \(x\in \mathbb {R}\), we have \(f(x+1)=f(x)+2\).

(II) \(f(0)=0\).

Lemma 4.9

[18] The map \(\theta _f:\frac{p}{2^q} \in \mathbb Q_2\rightarrow f^{-q}T_pf^q \in \mathrm {Homeo}_+(\mathbb {R})\) is a well-defined homomorphism.

Lemma 4.10

[18] The map \(\big ( 2^n,\frac{p}{2^q} \big ) \in BS(1,2) \rightarrow \theta _f(\frac{p}{2^q}) \circ f^n \in \mathrm {Homeo}_+(\mathbb {R})\) is a group homomorphism.

The homomorphism provided by the last lemma above will still be denoted by \(\theta _f\). Notice that \(\theta _f(a)=f\).
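To make conditions (I) and (II) concrete, here is a numerical sketch with one explicit map. The choice \(f(x) = 2x - 0.3 \sin (2\pi x)\) is our own hypothetical example and is not taken from [18]; it is increasing (its derivative is at least \(2 - 0.6\pi > 0\)), satisfies (I) and (II), and has fixed points other than 0. The inverse is computed by bisection, and `theta_half` stands for \(\theta _f(1/2) = f^{-1} T_1 f\).

```python
import math

# Hypothetical example of a map satisfying conditions (I) and (II); not taken from [18].
C = -0.3

def f(x):
    return 2.0 * x + C * math.sin(2.0 * math.pi * x)

def f_inv(y, iterations=80):
    """Inverse of f by bisection; f is increasing and |f(x) - 2x| <= 0.3 < 1."""
    lo, hi = y / 2.0 - 1.0, y / 2.0 + 1.0   # f(lo) < y < f(hi)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def conjugated_translation(y):
    """(f o T_1 o f^{-1})(y); condition (I) forces this to equal y + 2."""
    return f(f_inv(y) + 1.0)

def theta_half(x):
    """theta_f(1/2) = f^{-1} T_1 f, which should square to the translation T_1."""
    return f_inv(f(x) + 1.0)
```

Evaluating `conjugated_translation(y) - y` and `theta_half(theta_half(x)) - x` at a few points is a quick way to see the relation \(aba^{-1} = b^2\) and the well-definedness in Lemma 4.9 at work, up to floating-point error.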

Next, for \(1\le r\le \infty , \omega \), we impose a third condition on f:

(III\(_r\)) The map f is of class \(C^r\).

We have

Lemma 4.11

[18] The image \(\theta _f(BS(1,2))\) is a subgroup of \(\mathrm {Diff}^r_+(\mathbb {R})\).

We end with

Lemma 4.12

[18] Suppose that the function f has at least two fixed points. Then \(\theta _f(BS(1,2))\) has an exceptional minimal set (i.e. a minimal invariant closed set locally homeomorphic to the Cantor set).

To close this section, we point out that a similar construction can be carried out for all Baumslag–Solitar groups \(BS(1,n) := \big \langle a,b \mid aba^{-1} = b^n \big \rangle \). Roughly, we just need to replace condition (I) by:

(I)\(_n\) For every \(x \in \mathbb {R}\), we have \(f(x+1) = f(x)+n\).

4.3 Rigidity of multipliers

We start by dealing with the Baumslag–Solitar group BS(1, 2). Let us consider a faithful action of this group by \(C^1\) diffeomorphisms of the closed interval. We know that such an action must be topologically conjugate to an affine action, hence to the standard affine action given by \(a \!: x \mapsto 2x\) and \(b \! : x \mapsto x+1\). (It is not hard to check that all faithful affine actions of BS(1, 2) are conjugate inside \(\mathrm {Aff}(\mathbb {R})\).) Let \(\varphi : (0,1) \rightarrow \mathbb {R}\) denote the topological conjugacy. Our goal is to show

Proposition 4.13

The derivative of a at the interior fixed point equals 2.

Proof

For the proof, we let \(I := \varphi ^{-1} ([0,1])\). Notice that for all positive integers n, N, the intervals

$$\begin{aligned} (a^{-n} b^{\varepsilon _n} a^n) \cdots (a^{-2} b^{\varepsilon _2} a^2) (a^{-1}b^{\varepsilon _1}a) b^N a^{-n}(I), \quad \varepsilon _i \in \{0,1\}, \end{aligned}$$

have pairwise disjoint interiors. Indeed, these intervals are mapped by \(\varphi \) into the dyadic intervals of length \(1/2^n\) contained in \([N,N+1]\). For simplicity, we assume below that both a and b have derivative 1 at the endpoints. (As before, this may be achieved via the Muller–Tsuboi trick; cf. Lemma 4.6.)

Assume first that \(Da (x_0) < 2\), where \(x_0\) is the interior fixed point of a. Then there are \(C > 0\) and \(\varepsilon > 0\) such that for all \(n \ge 1\),

$$\begin{aligned} \big | a^{-n}(I) \big | \ge C \left( \frac{1}{2} + \varepsilon \right) ^n |I|. \end{aligned}$$

Fix \(\delta > 0\) such that

$$\begin{aligned} \big ( 1-\delta \big )^3 \left( \frac{1}{2} + \varepsilon \right) > 1/2. \end{aligned}$$
(8)

Let \(\sigma > 0\) be small enough so that

$$\begin{aligned} Da(x) \ge 1-\delta , \quad Da^{-1}(x) \ge 1-\delta \quad \text{ and } \quad Db(x) \ge 1-\delta \quad \text{ for } \text{ all } \quad x \in [1-\sigma ,1]. \end{aligned}$$

Finally, let \(N \ge 1\) be such that \(b^N(x_0) \ge 1 - \sigma \). Similarly to the proof of Proposition 1.6, for such N and all \(n \ge 1\), we have for all choices \(\varepsilon _i \in \{0,1\}\),

$$\begin{aligned} \big | (a^{-n} b^{\varepsilon _n} a^{n}) \cdots (a^{-2} b^{\varepsilon _2} a^{2}) (a^{-1}b^{\varepsilon _1}a) b^N a^{-n}(I) \big | \ge (1-\delta )^{3n} DC\left( \frac{1}{2} +\varepsilon \right) ^n \big | I \big |, \end{aligned}$$

where \(D := \min _x Db^N (x)\). As there are \(2^n\) of these intervals, we have

$$\begin{aligned} 1 \ge 2^n (1-\delta )^{3n} DC \left( \frac{1}{2}+\varepsilon \right) ^n |I|, \end{aligned}$$

which is impossible for a large-enough n due to (8).

Assume next that \(Da (x_0) > 2\). Then there are \(C' > 0\) and \(\varepsilon ' > 0\) such that for all \(n \ge 1\),

$$\begin{aligned} \big | a^{-n}(I) \big | \le C' \left( \frac{1}{2} - \varepsilon ' \right) ^n. \end{aligned}$$

Fix \(\delta ' > 0\) such that

$$\begin{aligned} \big ( 1+\delta ' \big )^3 \left( \frac{1}{2} - \varepsilon ' \right) < 1/2. \end{aligned}$$
(9)

Let \(\sigma ' > 0\) be small enough so that

$$\begin{aligned} Da(x) \le 1+\delta ', \quad Da^{-1}(x) \le 1+\delta ' \quad \text{ and } \quad Db(x) \le 1+\delta ' \quad \text{ for } \text{ all } \quad x \in [1-\sigma ',1]. \end{aligned}$$

Finally, let \(N' \ge 1\) be such that \(b^{N'}(x_0) \ge 1 - \sigma '\). Proceeding as before, we see that for such \(N'\) and all \(n \ge 1\), we have for all choices \(\varepsilon _i \in \{0,1\}\),

$$\begin{aligned} \big | (a^{-n} b^{\varepsilon _n} a^{n}) \cdots (a^{-2} b^{\varepsilon _2} a^{2}) (a^{-1}b^{\varepsilon _1}a) b^{N'} a^{-n}(I) \big | \le (1+\delta ')^{3n} D'C'\Big (\frac{1}{2} -\varepsilon ' \Big )^n \big | I \big |, \end{aligned}$$

where \(D' := \max _x Db^{N'} (x)\). However, the involved intervals cover \(I_{N'} := b^{N'} (I) = \varphi ^{-1} \big ( [N',N'+1] \big )\). Thus,

$$\begin{aligned} |I_{N'}| \le 2^n (1+\delta ')^{3n} D'C'\Big (\frac{1}{2} -\varepsilon ' \Big )^n \big | I \big |, \end{aligned}$$

which is again impossible for a large-enough n due to (9). \(\square \)
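The combinatorial fact used at the start of the proof, namely that the \(2^n\) intervals correspond under \(\varphi \) to the dyadic intervals of length \(1/2^n\) in \([N,N+1]\), can be verified exactly in the standard affine model \(a \!: x \mapsto 2x\), \(b \!: x \mapsto x+1\). The sketch below uses exact rational arithmetic; the representation of affine maps as pairs and the helper names are our own illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Affine maps x -> s*x + t are stored as pairs (s, t) with exact rational entries.
A = (Fraction(2), Fraction(0))      # a: x -> 2x
B = (Fraction(1), Fraction(1))      # b: x -> x + 1

def compose(g, h):
    """Return the composition g o h."""
    return (g[0] * h[0], g[0] * h[1] + g[1])

def inverse(g):
    return (1 / g[0], -g[1] / g[0])

def power(g, n):
    """g^n for any integer n (n = 0 gives the identity)."""
    base = g if n >= 0 else inverse(g)
    result = (Fraction(1), Fraction(0))
    for _ in range(abs(n)):
        result = compose(base, result)
    return result

def image_intervals(n, N):
    """Images of [0,1] under (a^-n b^e_n a^n)...(a^-1 b^e_1 a) b^N a^-n, e_i in {0,1}."""
    out = []
    for eps in product((0, 1), repeat=n):
        g = compose(power(B, N), power(A, -n))
        for k, e in enumerate(eps, start=1):
            conj = compose(power(A, -k), compose(power(B, e), power(A, k)))
            g = compose(conj, g)          # the factor of index k acts after those of smaller index
        out.append((g[1], g[0] + g[1]))   # image of [0,1] under x -> s*x + t, s > 0
    return sorted(out)
```

For instance, `image_intervals(3, 2)` lists eight intervals of length 1/8 that tile [2, 3]. The proof transports this tiling to the \(C^1\) action through \(\varphi \) and compares it with the derivative estimates.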

Remark 4.14

The action of the Baumslag–Solitar group by \(C^1\) diffeomorphisms of the real line constructed in the preceding section can be easily modified into a minimal one for which the derivative of a at the fixed point equals 1. Roughly, we just need to ask for the map f along the construction to have a single fixed point, with derivative 1 at this point. This shows that Theorem 1.7 is no longer true for actions by \(C^1\) diffeomorphisms of the open interval.

The preceding proposition corresponds to a particular case of Theorem 1.7 but illustrates the technique pretty well. Below we give the proof of the general case along the same ideas. First, as A is supposed to be hyperbolic, we know that the action of G is topologically conjugate to an affine one. Moreover, Proposition 2.1 completely describes such an action: up to a topological conjugacy \(\varphi \), it is given by correspondences \(a \mapsto M_\lambda \) and \(h_i \mapsto T_{t_i}\), where \((t_1,\ldots ,t_d)\) is an eigenvector of A with eigenvalue \(\lambda \). Up to conjugacy in \(\mathrm {Aff} (\mathbb {R})\), we may assume that one of the \(t_i\) equals 1, hence \(b:= b_i\) is sent into \(T_t := T_1\).

Next, we proceed as above, but with a little care. Notice that changing a by an integer power if necessary, we may assume that \(\lambda \ge 2\).

Assume first that \(Da (x_0) < \lambda \), where \(x_0\) is the interior fixed point of a. Then there are \(C > 0\) and \(\varepsilon > 0\) such that for all \(n \ge 1\),

$$\begin{aligned} \big | a^{-n}(I) \big | \ge C \left( \frac{1}{\lambda } + \varepsilon \right) ^n. \end{aligned}$$

Fix \(\delta > 0\) such that \((1-\delta )^3 (\frac{1}{\lambda } + \varepsilon ) > \frac{1}{\lambda }\). Let \(\sigma > 0\) be small so that \(Da(x) \ge 1-\delta \), \(Da^{-1}(x) \ge 1-\delta \) and \(Db(x) \ge 1-\delta \) hold for all \(x \in [1-\sigma ,1]\). Finally, let \(N \ge 1\) be such that \(b^N(x_0) \ge 1 - \sigma \). Given \(n \ge 1\), we consider for all choices \(\varepsilon _i \in \{0,1,\ldots ,[\lambda ]\}\), the intervals \((a^{-n} b^{\varepsilon _n} a^{n}) \cdots (a^{-2} b^{\varepsilon _2} a^{2}) (a^{-1}b^{\varepsilon _1}a) b^N a^{-n}(I)\), where I is the preimage of [0, 1] under the topological conjugacy into the affine action. As before, we have for each such choice

$$\begin{aligned} \big | (a^{-n} b^{\varepsilon _n} a^{n}) \cdots (a^{-2} b^{\varepsilon _2} a^{2}) (a^{-1}b^{\varepsilon _1}a) b^N a^{-n}(I) \big | \ge (1-\delta )^{3n} DC\left( \frac{1}{\lambda } + \varepsilon \right) ^n \big | I \big |, \end{aligned}$$

where \(D := \min _x Db^N (x)\). These intervals do not necessarily have pairwise disjoint interiors, but their union covers I with multiplicity at most 2. As there are \(\big ( [\lambda ]+1 \big )^n\) of these intervals, we have

$$\begin{aligned} 2 \ge \big ( [\lambda ]+1 \big )^n (1-\delta )^{3n} DC \left( \frac{1}{\lambda }+\varepsilon \right) ^n |I|, \end{aligned}$$

which is impossible for large-enough n.
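
To make the contradiction explicit, one may combine the choice of \(\delta \) with the trivial bound \([\lambda ]+1 > \lambda \). The following display is only a rewriting of the constants already fixed above, with no new assumption:

$$\begin{aligned} \big ( [\lambda ]+1 \big ) (1-\delta )^{3} \left( \frac{1}{\lambda }+\varepsilon \right) > \frac{[\lambda ]+1}{\lambda } > 1. \end{aligned}$$

Hence the right-hand side of the last inequality equals \(DC|I|\) times the n-th power of a constant strictly larger than 1, so it grows exponentially in n and eventually exceeds 2.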

Assume next that \(Da (x_0) > \lambda \). Then there are \(C '> 0\) and \(\varepsilon ' > 0\) such that for all \(n \ge 1\),

$$\begin{aligned} \big | a^{-n}(I) \big | \le C' \left( \frac{1}{\lambda } - \varepsilon ' \right) ^n. \end{aligned}$$

Fix \(\delta ' > 0\) such that \((1+ \delta ')^3 (\frac{1}{\lambda } - \varepsilon ') < \frac{1}{\lambda }\). Let \(\sigma ' > 0\) be small enough so that \(Da(x) \le 1+\delta '\), \(Da^{-1}(x) \le 1+\delta '\) and \(Db(x) \le 1+\delta '\) hold for all \(x \in [1-\sigma ',1]\). Finally, let \(N' \ge 1\) be such that \(b^{N'}(x_0) \ge 1 - \sigma '\). As before, given \(n \ge 1\), for all choices \(\varepsilon _i \in \{0,1,\ldots ,[\lambda ]\}\), we have

$$\begin{aligned} \big | (a^{-n} b^{\varepsilon _n} a^{n}) \cdots (a^{-2} b^{\varepsilon _2} a^{2}) (a^{-1}b^{\varepsilon _1}a) b^{N'} a^{-n}(I) \big | \le (1+\delta ')^{3n} D'C'\left( \frac{1}{\lambda } - \varepsilon ' \right) ^n \big | I \big |, \end{aligned}$$

where \(D' := \max _x Db^{N'} (x)\). These intervals cover \(I_{N'} := b^{N'}(I)\) for each \(n \ge 1\). As there are \(([\lambda ]+1)^n\) of these intervals, we have

$$\begin{aligned} |I_{N'}| \le \big ( [\lambda ]+1 \big )^n (1+\delta ')^{3n} D'C' \left( \frac{1}{\lambda }-\varepsilon ' \right) ^n |I|. \end{aligned}$$

Although this is not enough to conclude, we notice that we may replace a by \(a^k\) along the preceding computations, now yielding

$$\begin{aligned} |I_{N'}| \le \big ( [\lambda ^k]+1 \big )^n (1+\delta ')^{3n} D'C' \left( \frac{1}{\lambda }-\varepsilon ' \right) ^{kn} |I|. \end{aligned}$$

Choosing k large enough so that

$$\begin{aligned} \big ( [\lambda ^k] + 1 \big ) \left( \frac{1}{\lambda } - \varepsilon ' \right) ^k (1 + \delta ')^3 < 1 \end{aligned}$$

and then letting n go to infinity, this gives the desired contradiction.
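
Such a k exists for an elementary reason, which we spell out as a sketch. We use that \(0< \frac{1}{\lambda } - \varepsilon '\), which is implicit in the upper bound for \(|a^{-n}(I)|\), and that \(\lambda \ge 2\):

$$\begin{aligned} \big ( [\lambda ^k] + 1 \big ) \left( \frac{1}{\lambda } - \varepsilon ' \right) ^k \le \big ( \lambda ^k + 1 \big ) \left( \frac{1}{\lambda } - \varepsilon ' \right) ^k = \big ( 1 + \lambda ^{-k} \big ) \big ( 1 - \lambda \varepsilon ' \big )^k \le 2 \big ( 1 - \lambda \varepsilon ' \big )^k. \end{aligned}$$

Since \(0< 1 - \lambda \varepsilon ' < 1\), the right-hand side tends to 0 as k grows, so it suffices to take k with \(2 (1-\lambda \varepsilon ')^k (1+\delta ')^3 < 1\).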

We have hence shown that \(Da (x_0) = \lambda \). To show that the derivative of \(a^k b\) at the interior fixed point equals \(\lambda ^k\) for each \(k \ne 0\) and all \(b \in H\), just notice that the associated affine action can be conjugated in \(\mathrm {Aff}(\mathbb {R})\) so that \(a^k b\) is mapped into \(M_{\lambda ^k}\). Knowing this, we may proceed in the very same way as above.

4.4 On the smoothness of conjugacies

As we announced in the Introduction, actions by \(C^1\) diffeomorphisms are rarely rigid in what concerns the regularity of conjugacies. In the context of non-Abelian affine actions of finitely-generated groups, this is actually never the case, as it is shown by the next

Proposition 4.15

Let G be a finitely-generated group of the form \(\mathbb Z\ltimes _A H\), where \(A \in GL_d(\mathbb Q)\) and \(rank_{\mathbb Q}(H) = d\). Then, every faithful action of G by \(C^1\) diffeomorphisms of [0, 1] can be approximated in the \(C^1\) topology by actions by \(C^1\) diffeomorphisms that are topologically conjugate to it, but for which no Lipschitz conjugacy exists.

The proof of this proposition follows by a straightforward application of the Anosov–Katok method. The reader is referred to [15, 16] for a general panorama on this, yet the construction we give below is self-contained. It is to be noticed that the only dynamical properties that we need for the proposition above to hold are that G has a free orbit and its centralizer inside the group of homeomorphisms of the line is trivial. Thus, the proposition applies in many more situations. We start with an elementary lemma.

Lemma 4.16

There exists a family of \(C^{\infty }\) diffeomorphisms \(\varphi _{I}^D \!: I \rightarrow I\) between closed intervals I, where \(D \ge 1\), that are infinitely tangent to the identity at the endpoints, satisfy \(\varphi _{I}^D (m) = m\) and \(\sup _{x \in I} D \varphi _I^D (x) = D \varphi _{I}^D (m) = D\) for the midpoint m of I, and such that given \(\varepsilon > 0\) and \(\bar{D} > 1\), there exists \(\delta > 0\) such that for all \(y \in I\) and all \(D,D'\) in \([1,\bar{D}]\) satisfying \(1 - \delta \le D/D' \le 1+\delta \), we have

$$\begin{aligned} 1 - \varepsilon< \frac{D \varphi _{I}^D (y)}{D \varphi _{I}^{D'}(y)} < 1 + \varepsilon . \end{aligned}$$

Proof

Let \(\chi \) be a vector field on \([-1,1]\) which is infinitely flat at the endpoints, strictly negative (resp. positive) on \((-1,0)\) (resp. \((0,1)\)), and satisfies \(\chi (x) = x \frac{\partial }{ \partial x}\) in a neighborhood of the origin. Then let \(\varphi _{[-1,1]}^D\) be the flow associated to \(\chi \) up to time \(T := \log (D)\), so that \(D\varphi _{[-1,1]}^D (0) = D\). Finally, for an interval I as in the statement, define \(\varphi _{I}^D\) to be the affine conjugate of \(\varphi _{[-1,1]}^D\). Then the family \(\{ \varphi _{I}^D \}\) satisfies all the desired properties, as the reader may easily check. \(\square \)
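
For the reader's convenience, here is a sketch of the two computations behind this check. We write \(\phi ^t\) for the time-t map of the flow of \(\chi \) (this notation is ours), so that \(\varphi _{[-1,1]}^D = \phi ^{\log D}\). Near the origin the flow solves \(\dot{x} = x\), and the flow property relates different values of D:

$$\begin{aligned} \phi ^t (x) = e^t x \ \text{ for } x \text{ close to } 0, \qquad D \phi ^{\log D} (0) = e^{\log D} = D, \qquad \frac{D \varphi _{[-1,1]}^{D} (y)}{D \varphi _{[-1,1]}^{D'} (y)} = D \phi ^{\log (D/D')} \big ( \varphi _{[-1,1]}^{D'} (y) \big ). \end{aligned}$$

The last quantity is uniformly close to 1 when \(D/D'\) is close to 1, because \(\phi ^t\) converges to the identity in the \(C^1\) topology as t goes to 0. Affine conjugacies change neither the derivative at the midpoint nor these ratios, which gives the estimate on every interval I.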

Proof of Proposition 4.15

Start with any finitely-generated group G of \(C^1\) diffeomorphisms of [0, 1] whose action at the interior is topologically conjugate to a non-Abelian affine group. We will show that there exist homeomorphisms \(\varphi \!: [0,1] \rightarrow [0,1]\) which are not Lipschitz that conjugate G into groups of \(C^1\) diffeomorphisms that are arbitrarily close to G. This is enough to prove the proposition since \(\varphi \) is the unique conjugacy between G and \(\varphi G\varphi ^{-1}\). To see this, let \(\psi : [0,1]\rightarrow [0,1]\) be an arbitrary conjugacy between G and \(\varphi G \varphi ^{-1}\), that is, \(\psi g \psi ^{-1}=\varphi g \varphi ^{-1}\) holds for all \(g\in G\). Then, \(\varphi ^{-1} \psi g=g\varphi ^{-1} \psi \) for all \(g \in G\). But since G is conjugated to a non-Abelian affine group, it must have trivial centralizer. Therefore, \(\varphi ^{-1} \psi = id\), thus \(\psi =\varphi \).

Let \(\eta > 0\), and fix a point \(x_0\) in (0, 1) having a free orbit by G, and fix also a finite generating set of G. We will inductively construct a sequence of diffeomorphisms \(\varphi _k\) of [0, 1], all fixing \(x_0\), in such a way that \(\tilde{\varphi }_k := \varphi _1 \circ \cdots \circ \varphi _k\) satisfies:

  (i) \(\big \Vert \tilde{\varphi }_{k+1} - \tilde{\varphi }_k \big \Vert _{C^0} \le \frac{1}{2^k}\),

  (ii) \(\big \Vert \tilde{\varphi }_{k+1} \circ c \circ \tilde{\varphi }_{k+1}^{-1} - \tilde{\varphi }_{k} \circ c \circ \tilde{\varphi }_{k}^{-1} \big \Vert _{C^1} \le \frac{\eta }{2^k}\) for each generator c,

  (iii) \(x_0\) is fixed by \(\varphi _k\),

  (iv) \(D \varphi _{k} (x_0) > k / \min _y D \tilde{\varphi }_{k-1} (y)\), and if we denote by \(J_{k}\) the connected component of the set \(\big \{x \mid D \varphi _{k} (x) > k/\min _y D \tilde{\varphi }_{k-1} (y) \big \}\) containing \(x_0\), then the support of \(\varphi _{k+1}\) has measure \(< |J_k|/2\).

Assume for a while that we have performed this construction, and let us complete the proof. By (i), we have that the sequence \((\tilde{\varphi }_k)\) converges to a homeomorphism \(\tilde{\varphi }_{\infty }\). By (ii), the sequence of the actions conjugated by \(\tilde{\varphi }_k\) converges in the \(C^1\) topology to the action conjugated by \(\tilde{\varphi }_{\infty }\), and this action is \(C^1\) close to the initial one. Due to (iii), each \(\tilde{\varphi }_k\) fixes \(x_0\), hence the same holds for \(\tilde{\varphi }_{\infty }\). Finally, by (iv),

$$\begin{aligned} D \tilde{\varphi }_{k} (x_0) = D \tilde{\varphi }_{k-1} (\varphi _{k} (x_0)) \cdot D \varphi _{k} (x_0) \ge \min _{y} D \tilde{\varphi }_{k-1} (y) \cdot D \varphi _k (x_0) > k. \end{aligned}$$

Besides, the derivative of \(\tilde{\varphi }_k\) is larger than k on certain intervals that remain disjoint from the supports of \(\varphi _{k+1},\varphi _{k+2}, \ldots \). As a consequence, the limit homeomorphism \(\tilde{\varphi }_{\infty }\) is not Lipschitz. Because of the uniqueness of the conjugacy previously discussed, this implies that G and \(\tilde{\varphi }_{\infty } G \tilde{\varphi }_{\infty }^{-1}\) cannot be conjugated by any Lipschitz homeomorphism.

To conclude the proof, we proceed to the construction of the sequence \(\varphi _k\). The idea is to inductively make \(\varphi _{k+1}\) almost commute with the action of G conjugated by \(\tilde{\varphi }_k\) along a very small neighborhood of a large but finite part of the orbit of \(x_0\). More precisely, let us number the points in the G-orbit of \(x_0\) as \(x_0,x_1,\ldots \) so that \(x_i\) is no farther from the origin than \(x_j\) in the corresponding Schreier graph whenever \(i \le j\). (Since \(x_0\) has free orbit, this Schreier graph is actually isomorphic to the Cayley graph of G.) We let \(d_i\) be the graph distance between \(x_i\) and the origin. (Hence, \(d_i \le d_j\) for \(i \le j\).) Denote \(x_i^k := \tilde{\varphi }_{k-1} (x_i)\), where by definition \(x_i^1=x_i\). Assume that, for some positive integer \(\ell _k\), the support of \(\varphi _k\) consists of a collection of disjoint intervals \(I_0^k = \tilde{\varphi }_{k-1}^{-1} (J_0^k), \ldots , I_{\ell _k}^k = \tilde{\varphi }_{k-1}^{-1} (J_{\ell _k}^k)\), so that each \(x_i^k\) lies in the interior of \(J_i^k\) and \(x_0\) is the midpoint of \(I_0^k\) and is fixed by \(\tilde{\varphi }_k\) (hence \(x_0^k = x_0\)). Then \(\varphi _{k+1}\) will be defined so that its support consists of a collection of intervals \(I_0^{k+1} = \tilde{\varphi }_k^{-1} (J_0^{k+1}),\ldots , I_{\ell _{k+1}}^{k+1} = \tilde{\varphi }_k^{-1} (J_{\ell _{k+1}}^{k+1})\), where each \(J_j^{k+1}\) contains \(x_j^{k+1} := \tilde{\varphi }_k (x_j)\) in its interior and, for \(j \le \ell _k\), is a subset of (but much smaller than) \(J_j^k\), with \(x_0\) being the midpoint of \(I_0^{k+1}\). (Notice that \(\ell _{k+1}\) is to be chosen.) Moreover, we will ask that \(\varphi _{k+1}\) fixes \(x_0\); see the figure below. Besides, if c is a generator sending \(x_i\) into \(x_j\) for some \(0 \le i \le j \le \ell _{k+1}\), then we will ask that \(c (I_i^{k+1}) = I_j^{k+1}\).
If the intervals \(I_i^{k+1}\) are chosen small enough, condition (i) above will certainly hold. The second half of condition (iv) will be also ensured by this property. The most involved issue concerns condition (ii).

figure a

To properly define the diffeomorphism \(\varphi _{k+1}\), let

$$\begin{aligned} \bar{D} := \frac{k+2}{\min _y D \tilde{\varphi }_k (y)} \quad \text{ and } \quad \varepsilon := \frac{\eta }{M_k 2^{k+1}}, \end{aligned}$$
(10)

where

$$\begin{aligned} M_k := \frac{\sup _y D \tilde{\varphi }_k (y)}{\inf _y D \tilde{\varphi }_k (y)} \cdot \sup _y Dc (y). \end{aligned}$$
(11)

Fix \(\delta > 0\) provided by Lemma 4.16 applied to \(\bar{D}\) and \(\varepsilon \) above. Let d be a large-enough integer so that

$$\begin{aligned} 1 < \bar{D}^{1/d} \le 1 + \delta . \end{aligned}$$

Let \(\ell _{k+1}\) be such that all points at distance \(\le d\) to the origin in the orbit graph appear in \(x_0,\ldots ,x_{\ell _{k+1}}\). For instance, we can (and we will) take \(\ell _{k+1}\) as being the cardinality of the ball B(d) of radius d centered at the origin in G. Then let \(\varphi _{k+1}\) be the diffeomorphism whose restriction to \(I_i^{k+1}\) coincides with \(\varphi _{I_i^{k+1}}^{D}\), where \(D = D(i) = \bar{D}^{1-\frac{d_i}{d}}\), and which is the identity outside these intervals. We claim that this choice accomplishes our needs provided the lengths of the intervals \(I_{i}^{k+1}\) are very small.
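
To see why the exponents \(1 - \frac{d_i}{d}\) are adapted to Lemma 4.16, here is a short sketch of the relevant computation. If a generator c sends \(x_i\) into \(x_j\) and both points lie in B(d), then \(|d_i - d_j| \le 1\), since these points are adjacent in the orbit graph. Hence

$$\begin{aligned} \frac{D(i)}{D(j)} = \bar{D}^{\frac{d_j - d_i}{d}} \in \big [ \bar{D}^{-1/d}, \bar{D}^{1/d} \big ] \subset \Big [ \frac{1}{1+\delta }, 1+\delta \Big ] \subset [1-\delta , 1+\delta ]. \end{aligned}$$

Moreover, each D(i) lies in \([1,\bar{D}]\), with \(D(0) = \bar{D}\) and \(D(i) = 1\) whenever \(d_i = d\). Thus the hypotheses of the lemma hold for every pair of intervals matched by a generator.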

To show this, notice that the first half of condition (iv) directly follows from the construction. Indeed, since \(x_0\) is forced to be the midpoint of \(I_0^{k+1}\), the definition of \(\bar{D}\) yields

$$\begin{aligned} D \tilde{\varphi }_{k+1} (x_0) = D (\tilde{\varphi }_k \circ \varphi _{k+1}) (x_0) = D \tilde{\varphi }_k (\varphi _{k+1}(x_0)) \cdot \bar{D} \ge k+2 > k+1. \end{aligned}$$

To deal with property (ii), notice that

$$\begin{aligned} D (\tilde{\varphi }_k c \tilde{\varphi }^{-1}_k)(x) = \frac{D \tilde{\varphi }_k (c \tilde{\varphi }^{-1}_k (x))}{D \tilde{\varphi }_k (\tilde{\varphi }^{-1}_k (x))} Dc (\tilde{\varphi }^{-1}_k (x)) \end{aligned}$$

and

$$\begin{aligned} D (\tilde{\varphi }_{k+1} c \tilde{\varphi }^{-1}_{k+1})(x) = \frac{D \tilde{\varphi }_k (\varphi _{k+1} c \varphi _{k+1}^{-1} \tilde{\varphi }^{-1}_k (x))}{D \tilde{\varphi }_k (\tilde{\varphi }^{-1}_k (x))} \frac{D \varphi _{k+1} (c \varphi _{k+1}^{-1} \tilde{\varphi }_k^{-1} (x))}{D \varphi _{k+1} (\varphi _{k+1}^{-1} \tilde{\varphi }_k^{-1}(x))} Dc (\varphi _{k+1}^{-1} \tilde{\varphi }^{-1}_k (x)). \end{aligned}$$

Using the continuity of \(D \tilde{\varphi }_k\) and Dc, and choosing \(I_i^{k+1}\) sufficiently small, we may ensure that

$$\begin{aligned} \left| \frac{D \tilde{\varphi }_k (c \tilde{\varphi }^{-1}_k (x))}{D \tilde{\varphi }_k (\tilde{\varphi }^{-1}_k (x))} Dc (\tilde{\varphi }^{-1}_k (x)) - \frac{D \tilde{\varphi }_k (\varphi _{k+1} c \varphi _{k+1}^{-1} \tilde{\varphi }^{-1}_k (x))}{D \tilde{\varphi }_k (\tilde{\varphi }^{-1}_k (x))} Dc \big ( \varphi _{k+1}^{-1} \tilde{\varphi }^{-1}_k (x) \big ) \right| \le \frac{\eta }{2^{k+1}}. \end{aligned}$$

We are hence left to check that the factor

$$\begin{aligned} \frac{D \varphi _{k+1} (c \varphi _{k+1}^{-1} \tilde{\varphi }_k^{-1} (x))}{D \varphi _{k+1} (\varphi _{k+1}^{-1} \tilde{\varphi }_k^{-1}(x))} \end{aligned}$$
(12)

can be made close to 1 so that it lies between \(1 - \varepsilon \) and \(1+\varepsilon \). Indeed, via a triangle inequality and using (10) and (11), such an estimate would imply the desired upper bound

$$\begin{aligned} \big \Vert D ( \tilde{\varphi }_{k+1} \circ c \circ \tilde{\varphi }_{k+1}^{-1}) - D( \tilde{\varphi }_{k} \circ c \circ \tilde{\varphi }_{k}^{-1}) \big \Vert _{C^0} \le \frac{\eta }{2^k}. \end{aligned}$$

To show the claimed estimate, let \(y := \varphi _{k+1}^{-1} \tilde{\varphi }_k^{-1}(x)\). If neither y nor c(y) belongs to any of the intervals \(I_i^{k+1},\) then \(\varphi _{k+1}\) is the identity in neighborhoods of y and c(y), thus the expression (12) obviously equals 1. If y belongs to some \(I_i^{k+1}\) but c(y) does not belong to any of these intervals, then \(x_i\) belongs to B(d) but \(c(x_i)\) does not, hence the distance of \(x_i\) to \(x_0\) in the orbit graph must be equal to d. Therefore, due to the definition of D(i) as \(\bar{D}^{1-\frac{d_i}{d}}\), we have that \(\varphi _{k+1}\) is still the identity in a neighborhood of y, hence the expression (12) still equals 1. The same occurs whenever \(c(x_i)\) lies in B(d) but \(x_i\) does not.

Finally, assume that y lies in \(I_i^{k+1}\) and \(c(x_i) = x_j\) is such that \(d_i \le d_j \le \ell _{k+1}\), hence \(c (I_i^{k+1}) = I_j^{k+1}\). (The case \(d_j \le d_i \le \ell _{k+1}\) is analogous.) Let \(\bar{c}\) be the affine map sending \(I_i^{k+1}\) into \(I_j^{k+1}\). Then the expression (12) equals

$$\begin{aligned} \frac{D \varphi _{k+1} (c(y))}{D \varphi _{k+1}(y)} = \frac{D \varphi _{I_j^{k+1}}^{D(j)}(\bar{c}(y))}{D \varphi _{I_i^{k+1}}^{D(i)}(y)} \cdot \frac{D \varphi _{I_j^{k+1}}^{D(j)} (c(y))}{D \varphi _{I_j^{k+1}}^{D(j)}(\bar{c} (y))}. \end{aligned}$$

Now, as \(\bar{c}\) is affine on \(I_i^{k+1}\) and sends this interval into \(I_j^{k+1}\), it also transforms by conjugacy the map \(\varphi _{I_i^{k+1}}^D\) into \(\varphi _{I_j^{k+1}}^{D}\). Since our choice of constants allows applying Lemma 4.16, this yields that the first factor in the expression above strictly lies between \(1-\varepsilon \) and \(1+ \varepsilon \). We claim that the second factor can be made arbitrarily close to 1 by choosing \(I_i^{k+1}\) small enough, thus yielding condition (ii) of the Proposition. To see this, notice that, if we denote the affine map sending [0, 1] into \(I_j^{k+1}\) by \(\psi \), then using the equality

$$\begin{aligned} \varphi _{I_{j}^{k+1}}^{D(j)} = \psi \circ \varphi _{[0,1]}^{D(j)} \circ \psi ^{-1} \end{aligned}$$

we obtain

$$\begin{aligned} \frac{D \varphi _{I_j^{k+1}}^{D(j)} \big ( c(y) \big ) }{D \varphi _{I_j^{k+1}}^{D(j)} \big ( \bar{c} (y) \big ) } = \frac{D \varphi _{[0,1]}^{D(j)} \big ( \psi ^{-1}( c(y) ) \big ) }{D \varphi _{[0,1]}^{D(j)} \big ( \psi ^{-1}( \bar{c} (y) ) \big )}. \end{aligned}$$

Now, because of the uniform upper bound \(D(j) \le \bar{D}\), we are reduced to showing that, by choosing \(I_i^{k+1}\) small enough, the points \(\psi ^{-1} (c(y))\) and \(\psi ^{-1} (\bar{c} (y))\) become very close. To check this, letting \(\bar{y}\) be the left endpoint of \(I_{i}^{k+1}\), we have (recall that both \(D\psi \) and \(D \bar{c}\) are constant)

$$\begin{aligned} \left| \psi ^{-1} (c(y)) - \psi ^{-1} (\bar{c} (y)) \right|= & {} D \psi ^{-1} \left| \int _{\bar{y}}^y \left[ Dc (t) - D\bar{c} \right] dt \right| \\\le & {} \frac{1}{|I_j^{k+1}| } \int _{\bar{y}}^y \max _{s \in I_i^{k+1}} \big | Dc(s) - D\bar{c} \big | \, dt \\\le & {} \frac{| I_i^{k+1} |}{| I_j^{k+1} |} \max _{t \in I_i^{k+1}} \big | Dc(t) - D \bar{c} \big | \\\le & {} \frac{1}{\min _s Dc(s)} \max _{t \in I_i^{k+1}} \big | Dc(t) - D\bar{c} \big |, \end{aligned}$$

and the last expression can be made arbitrarily small by choosing \(|I_i^{k+1}|\) very small due to the continuity of Dc and the Mean Value Theorem. \(\square \)

Next, we deal with the \(C^r\) case, where \(r \ge 2\).

Proposition 4.17

Let G be a group of the form \(\mathbb Z\ltimes _A H\), where \(A \in GL_d(\mathbb Q)\) has no eigenvalue of norm 1 and \(rank_{\mathbb Q}(H) = d\). Then for all \(r \ge 2\), every faithful action of G by \(C^r\) diffeomorphisms of [0, 1] with no global fixed point in (0, 1) is conjugate to an affine action by a homeomorphism that restricted to (0, 1) is a \(C^r\) diffeomorphism.

Proof

We know from Theorem 1.3 that the action is conjugate to an affine action via a homeomorphism \(\varphi \). The image of H is a subgroup of the group of translations which is necessarily dense; otherwise, H would have rank 1 and \(A^2\) would stabilize it pointwise, thus contradicting hyperbolicity. Fix a nontrivial element \(g \in H\). As g is assumed to be \(C^r\), \(r\ge 2\), and has no fixed point in (0, 1), Szekeres’ theorem implies that the restrictions of g to [0, 1) and (0, 1] are the time-one maps of the flows of vector fields \(\mathcal {X}_-\) and \(\mathcal {X}_+\), respectively, that are \(C^1\) on their domains and \(C^{r-1}\) at the interior. Furthermore, Kopell’s lemma implies that the \(C^1\) centralizer of g is contained in the intersection of the flows of \(\mathcal {X}_-\) and \(\mathcal {X}_+\). Therefore, the flows coincide for a dense subset of times, hence \(\mathcal {X}_- = \mathcal {X}_+\) on (0, 1). We denote this vector field by \(\mathcal {X}\). (See [27, §4.1.3] for the details.)

The homeomorphism \(\varphi \) must send this flow into that of the translations. Since \(\mathcal {X}\) is of class \(C^{r-1}\) on (0, 1), we have that \(\varphi \) is a \(C^{r-1}\) diffeomorphism of (0, 1). To see that \(\varphi \) is actually a \(C^r\) diffeomorphism, we use Theorem 1.7, which says that the interior fixed point \(x_0\) of the element a is hyperbolic. Indeed, this implies that \(\varphi \) is a \(C^1\) diffeomorphism that conjugates two germs of hyperbolic diffeomorphisms. By a well-known application of (the sharp version of) Sternberg’s linearization theorem, such a diffeomorphism has to be of class \(C^r\) in a neighborhood of \(x_0\) (see [27, Corollary 3.6.3]). Since the action is minimal on (0, 1) due to Proposition 1.6, this easily implies that \(\varphi \) is of class \(C^r\) on the whole open interval. \(\square \)

Remark 4.18

Let G be a group of the form \(\mathbb Z\ltimes _A H\), where \(A \in GL_d(\mathbb Q)\) has no eigenvalue of norm 1 and \(rank_{\mathbb Q}(H) = d\). Assume that G acts by \(C^r\) diffeomorphisms of [0, 1] with no global fixed point in (0, 1) and that \(r \ge 1\). Then every bi-Lipschitz conjugacy of G into an affine group is \(C^r\) on (0, 1). Indeed, this essentially follows (with minor modifications) from [27, §3.6].

5 Examples involving non-hyperbolic matrices

We next consider the situation where \(A \in GL_d(\mathbb Q)\) has some eigenvalues of modulus \(=1\) and some others of modulus \(\ne 1\). Our goal is to prove Theorem 1.9, according to which the group \(\mathbb Z\ltimes _A \mathbb Q^d\) has an action by \(C^1\) diffeomorphisms of the closed interval that is not semiconjugate to an affine action provided A is irreducible. In particular, this is the case for the matrix

$$\begin{aligned} A := \left( \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 0 &{} 0 &{} 0 &{} -1\\ 1 &{} 0 &{} 0 &{} -4\\ 0 &{} 1 &{} 0 &{} -4\\ 0 &{} 0 &{} 1 &{} -4 \end{array} \right) \in SL_4 (\mathbb Z). \end{aligned}$$

Indeed, A has characteristic polynomial \(p(x)=x^4+4x^3+4x^2+4x+1=p_1(x) p_2(x)\), where \(p_1 (x):=x^2+(2+\sqrt{2})x+1\) and \(p_2 (x):=x^2+(2-\sqrt{2})x+1\). Notice that p(x) has no rational root, nor does it decompose into two polynomials of degree two with rational coefficients; hence, it is irreducible over \(\mathbb Q\). Moreover, the roots \(\lambda \) and \(1/\lambda \) of \(p_1\) are real numbers of modulus different from 1, while the roots w, \(\overline{w}\) of \(p_2\) are complex numbers of modulus 1, where

$$\begin{aligned} w = \frac{\sqrt{2}-2+i\sqrt{4\sqrt{2}-2}}{2}, \qquad \lambda = \frac{-\sqrt{2}-2 + \sqrt{4\sqrt{2}+2}}{2}. \end{aligned}$$
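
These claims can be checked numerically. The following is a minimal sketch in plain Python; the helper names `poly_mul` and `poly_eval` are ours and purely illustrative. It verifies the factorization \(p = p_1 p_2\), that \(\lambda \), \(1/\lambda \), w and \(\overline{w}\) are roots of p with the stated moduli, and that the only candidates for rational roots, namely \(\pm 1\), are not roots. It does not by itself prove the absence of a rational quadratic factor.

```python
import math


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (highest degree first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_eval(p, x):
    """Evaluate a polynomial (highest degree first) at x by Horner's rule."""
    acc = 0
    for c in p:
        acc = acc * x + c
    return acc


r2 = math.sqrt(2)
p = [1, 4, 4, 4, 1]          # x^4 + 4x^3 + 4x^2 + 4x + 1
p1 = [1, 2 + r2, 1]          # x^2 + (2 + sqrt 2) x + 1
p2 = [1, 2 - r2, 1]          # x^2 + (2 - sqrt 2) x + 1
product = poly_mul(p1, p2)   # should reproduce the coefficients of p

# The roots given in the text.
lam = (-r2 - 2 + math.sqrt(4 * r2 + 2)) / 2
w = complex((r2 - 2) / 2, math.sqrt(4 * r2 - 2) / 2)
roots = [lam, 1 / lam, w, w.conjugate()]

residuals = [abs(poly_eval(p, z)) for z in roots]  # |p(root)|, expected near 0
moduli = [abs(z) for z in roots]

# By the rational root theorem, a rational root of p must be +1 or -1.
rational_root_values = [poly_eval(p, 1), poly_eval(p, -1)]

print("p1 * p2 =", product)
print("residuals:", residuals)
print("moduli:", moduli)
print("p(1), p(-1):", rational_root_values)
```

Floating-point arithmetic is used throughout, so the residuals are only expected to vanish up to rounding errors.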

Given any \(A \in GL_d(\mathbb Q)\), we begin by constructing an action of \(G := \mathbb Z\ltimes _A \mathbb Q^d\) by homeomorphisms of the interval that is not semiconjugate to an affine action. To do this, we consider a decomposition \([0,1] = \overline{\bigcup _{k \in \mathbb {Z}} I_k}\), where the \(I_k\)’s are open intervals disposed on [0, 1] in an ordered way and such that the right endpoint of \(I_k\) coincides with the left endpoint of \(I_{k+1}\), for all \(k \in \mathbb Z\). Let f be a homeomorphism of [0, 1] sending each \(I_{k}\) into \(I_{k+1}\). For each \((t_1,\ldots ,t_d)\!\in \!\mathbb {Q}^d\) and \(k \in \mathbb Z\), denote

$$\begin{aligned} (t_{1,k},\ldots ,t_{d,k}) := A^k (t_1,\ldots ,t_d). \end{aligned}$$

Let \(\xi ^t\) be a nontrivial topological flow on \(I_0\). Next, fix \((s_1,\ldots ,s_d) \!\in \! \mathbb {R}^d\), and for each \((t_1,\ldots ,t_d)~\in ~\mathbb Q^d\), define \(g:=g_{(t_1,\ldots ,t_d)}\) on \(I_0\) by \(g|_{I_0} := \xi ^{\sum _i s_i t_i}\). Extend g to the whole interval by letting

$$\begin{aligned} g \big |_{I_{-k}} = f^{-k} \circ \xi ^{\sum _i s_i t_{i,k}} \big |_{I_0} \circ f^{k}. \end{aligned}$$
(13)

It is not hard to see that the correspondences \(a \mapsto f,\) \((t_1,\ldots ,t_d) \mapsto g_{(t_1,\ldots ,t_d)},\) define a representation of G, where a stands for the generator of the \(\mathbb Z\)-factor of G.

Lemma 5.1

If A is \(\mathbb Q\)-irreducible and \((s_1,\ldots ,s_d)\) is nonzero, then the action constructed above is faithful.

Proof

Denote by \(b_1,\ldots ,b_d\) the canonical basis of \(H:=\mathbb Q^d\). We need to show that for a given nontrivial \(b := b_1^{t_1} \cdots b_d^{t_d} \in H\), the associated map \(g := g_{(t_1, \ldots , t_d)}\) acts nontrivially on [0, 1]. Assume otherwise. Then according to (13), for all \(k \in \mathbb {Z}\),

$$\begin{aligned} 0 = \sum _i s_i t_{i,k} = \big \langle (s_1,\ldots ,s_d), A^k (t_1,\ldots ,t_d) \big \rangle . \end{aligned}$$

As a consequence, the \(\mathbb Q\)-span of \( A^k (t_1,\ldots ,t_d), k \in \mathbb {Z}\), is an A-invariant \(\mathbb Q\)-subspace orthogonal to \((s_1,\ldots ,s_d)\). However, as A is \(\mathbb Q\)-irreducible and \((s_1,\ldots ,s_d)\) is nonzero, the only possibility is \((t_1,\ldots ,t_d) = 0\), which implies that b is the trivial element in H. \(\square \)

Assume next that A is not hyperbolic. Associated to the transpose matrix \(A^T\), there is a decomposition \(\mathbb {R}^d = E^s \oplus E^u \oplus E^c\) into stable, unstable, and central subspaces, respectively. The space \(E^c\) necessarily contains a subspace \(E^c_*\) of dimension 1 or 2 that is completely invariant under \(A^T\) and such that for each nontrivial vector therein, all vectors in its orbit under \(A^T\) have the same norm. Our goal is to prove

Proposition 5.2

If \((s_1,\ldots ,s_d)\) belongs to \(E^c_*\), then the action above is \(C^1\) smoothable.

This will follow almost directly from the next

Proposition 5.3

The map f and the subintervals \(I_k\) of the preceding construction can be taken so that f is a \(C^1\) diffeomorphism that commutes with a \(C^1\) vector field whose support in (0, 1) is nontrivial and contained in the union of the interior of the \(I_k\)’s.

Using f and the vector field above, we may perform the construction taking \(\xi ^t\) as being the flow associated to it. Indeed, since the vector field is \(C^1\) on the whole interval, equation (13) implies that for a given \((t_1,\ldots ,t_d)\), the corresponding \(g_{(t_1,\ldots ,t_d)}\) is a \(C^1\) diffeomorphism provided the expressions \(\sum _i s_i t_{i,k}\) remain uniformly bounded on k. However, as \((s_1,\ldots ,s_d)\) belongs to \(E^c_*\), this is always the case, because

$$\begin{aligned} \sum _i s_i t_{i,k} = \big \langle (s_1,\ldots ,s_d), A^k (t_1,\ldots ,t_d) \big \rangle = \big \langle (A^T)^k (s_1,\ldots ,s_d), (t_1,\ldots ,t_d) \big \rangle \end{aligned}$$

and \(\{ (A^T)^k (s_1,\ldots ,s_d), k \in \mathbb {Z}\}\) is a bounded subset of \(\mathbb {R}^d\).
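
This boundedness can be illustrated on the matrix A displayed at the beginning of this section. The sketch below is ours and rests on the following assumption, which follows from the companion form of A: the vector \((1,w,w^2,w^3)\) is an eigenvector of \(A^T\) with eigenvalue w, so its real part spans, together with its imaginary part, a central invariant plane. We compare the forward orbit of this real part with that of a generic vector. Only a few iterates are computed, since rounding errors are amplified by the unstable eigenvalue \(1/\lambda \).

```python
import math

# Companion matrix A from the text, rows listed top to bottom.
A = [
    [0, 0, 0, -1],
    [1, 0, 0, -4],
    [0, 1, 0, -4],
    [0, 0, 1, -4],
]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


def apply(m, v):
    """Return the matrix-vector product m v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))


def max_orbit_norm(m, v, steps):
    """Largest norm among v, m v, ..., m^steps v."""
    best = norm(v)
    for _ in range(steps):
        v = apply(m, v)
        best = max(best, norm(v))
    return best


# w is the root of p_2 of modulus 1 given in the text.
r2 = math.sqrt(2)
w = complex((r2 - 2) / 2, math.sqrt(4 * r2 - 2) / 2)

# Assumption (ours): Re(1, w, w^2, w^3) lies in a central invariant plane of A^T.
s_central = [(w ** j).real for j in range(4)]
s_generic = [1.0, 0.0, 0.0, 0.0]  # a vector with an unstable component

AT = transpose(A)
STEPS = 15  # kept small on purpose: rounding errors grow like |1/lambda|^k
central_bound = max_orbit_norm(AT, s_central, STEPS)
generic_bound = max_orbit_norm(AT, s_generic, STEPS)

print("central orbit, largest norm:", central_bound)
print("generic orbit, largest norm:", generic_bound)
```

Each coordinate of \((A^T)^k\) applied to the central vector is the real part of a power of w, hence bounded by 1 in absolute value, whereas the generic orbit grows exponentially. Only forward iterates are tested here; the backward orbit behaves in the same way.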

To conclude the proof of Theorem 1.9, we need to show Proposition 5.3. Although at this point we could refer to the classical construction of Pixton [33], we prefer to give a simpler argument that decomposes into two elementary parts given by the next lemmas.

Lemma 5.4

There exists a vector field \(\mathcal {X}_0\) on [0, 1] with compact support in (0, 1) and a sequence \((\varphi _k)\) of \(C^{\infty }\) diffeomorphisms of [0, 1] with compact support inside (0, 1) that converges to the identity in the \(C^1\) topology and such that the diffeomorphisms \(\tilde{\varphi }_k := \varphi _k \circ \cdots \circ \varphi _1\) satisfy \((\tilde{\varphi }_k)_* (\mathcal {X}_0) = t_k \mathcal {X}_0\) for a certain sequence \((t_k)\) of positive numbers converging to zero.

Proof

Start with the flow of translations on the real line and the corresponding (constant) vector field. Any two positive times of this flow are smoothly conjugate by appropriate affine transformations. Now, map the real line into the interval by a projective map. This yields the desired vector field and diffeomorphisms, except that the supports are not contained in (0, 1). To achieve this, just start by performing the Muller–Tsuboi trick (cf. Lemma 4.6) in order to make everything flat at the endpoints, then extend everything trivially in both directions by slightly enlarging the interval, and finally renormalize the resulting interval into [0, 1]. \(\square \)
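
The conjugacy invoked in the first lines of this proof is the following elementary identity, written here as a sketch with \(M_c (x) := cx\) and \(T_t (x) := x+t\) for \(c > 0\):

$$\begin{aligned} M_c \circ T_t \circ M_c^{-1} = T_{ct}, \qquad \text{ equivalently } \qquad (M_c)_* \Big ( \frac{\partial }{\partial x} \Big ) = c \, \frac{\partial }{\partial x}. \end{aligned}$$

For instance (this particular choice is ours and only illustrative), the factors \(c_k := \frac{k}{k+1}\) tend to 1, while the products \(t_k := c_1 \cdots c_k = \frac{1}{k+1}\) tend to 0.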

Given a diffeomorphism \(\varphi \) of (resp., vector field \(\mathcal {X}\) on) an interval I, we denote by \(\varphi ^{\vee }\) (resp., \(\mathcal {X}^{\vee }\)) the diffeomorphism of (resp., vector field on) [0, 1] obtained after conjugacy (resp., push forward) by the unique affine map sending I into [0, 1]. Proposition 5.3 is a direct consequence of the next.

Lemma 5.5

There exists a \(C^1\) diffeomorphism f of [0, 1] fixing only the endpoints (with the origin as a repelling fixed point) as well as a \(C^1\) vector field \(\mathcal {Y}\) on [0, 1] such that \(f_* (\mathcal {Y}) = \mathcal {Y}\) and so that for a certain \(x_0 \in (0,1)\), we have \((\mathcal {Y}|_{[x_0,f(x_0)]})^{\vee } = \mathcal {X}_0\).

Proof

Start with a \(C^{\infty }\) diffeomorphism g of [0, 1] that has no fixed point at the interior, and has the origin as a repelling fixed point. Fix any \(x_0 \in (0,1)\), and let \(\mathcal {Z}\) be a vector field on \([x_0,g(x_0)]\) such that \(\mathcal {Z}^{\vee } = \mathcal {X}_0\). A moment’s reflexion shows that this construction can be performed so that g is affine close to each endpoint.

For each \(k \in \mathbb Z\), let \(I_k := g^k ( [x_0,g(x_0)] )\). Let \(\varphi _k^{\wedge }\) be a diffeomorphism of \(I_k\) into itself such that \((\varphi _k^{\wedge })^{\vee } = \varphi _k\). Now let f be defined by letting \(f \big |_{I_{|k|}} := \varphi _{|k|}^{\wedge } \circ g\big |_{I_{|k|}}\). Extend \(\mathcal {Z}\) to the whole interval [0, 1] by making it commute with g. Finally, define \(\mathcal {Y}\) by letting \(\mathcal {Y} \big |_{I_{|k|}} := t_{|k|} \mathcal {Z} \big |_{I_{|k|}}\) for every \(k \in \mathbb Z\). One easily checks that f and \(\mathcal {Y}\) satisfy the desired properties. \(\square \)

To close this section, we remark that similar ideas lead to faithful actions by \(C^1\) circle diffeomorphisms without finite orbits for the groups considered here. Indeed, it suffices to consider f as being a Denjoy counter-example and then proceed as before along the intervals \(I_k := f^{k}(I)\), where I is a connected component of the complement of the exceptional minimal set of f. We leave the details of this construction to the reader.

6 Actions on the circle

Recall the next folklore (and elementary) result: For every group of circle homeomorphisms, one of the next three possibilities holds:

  (i) there is a finite orbit,

  (ii) all orbits are dense,

  (iii) there is a unique minimal invariant closed set that is homeomorphic to the Cantor set. (This is usually called an exceptional minimal set.)

Moreover, a result of Margulis states that in case of a minimal action, either the group is Abelian and conjugate to a group of rotations, or it contains free subgroups on two generators. (See [27, Chapter 2] for all of this.)

Assume next that a non-Abelian, solvable group acts faithfully by circle homeomorphisms. By the preceding discussion, such an action cannot be minimal. As we next show, it can admit an exceptional minimal set. For concreteness, we consider the group \(G := \mathbb Z\ltimes _A \mathbb Q^d\), with \(A \in GL_d (\mathbb Q)\). Start with a Denjoy counter-example \(g \in \mathrm {Homeo}_+(\mathrm {S}^1)\), that is, a circle homeomorphism of irrational rotation number that is not minimal. Let \(\Lambda \) be the exceptional minimal set of g. Let I be one of the connected components of \(\mathrm {S}^1 {\setminus } \Lambda \), and for each \(n \in \mathbb Z\), denote \(I_n := g^n (I)\). Consider any representation \(\phi _I \!: \mathbb Q^d \rightarrow \mathrm {Homeo}(I)\). (Such an action can be taken faithful just by integrating a topological flow up to rationally independent times and associating the resulting maps to the generators of \(\mathbb Q^d\).) Then extend \(\phi _I\) into \(\phi : G \rightarrow \mathrm {Homeo}_+(\mathrm {S}^1)\) on the one hand by letting \(\phi (a) := g\), and on the other hand, for each \(h \in \mathbb Q^d\), letting the restriction of \(\phi (h)\) to \(\mathrm {S}^1 {\setminus } \bigcup _n I_n\) be trivial, and setting \(\phi (h)|_{I_{n}} = g^{n} \circ \phi _I (A^{-n} (h) ) \circ g^{-n}\) for each \(n \in \mathbb Z\). It is easy to check that \(\phi \) is faithful. Part of the content of Theorem 1.10 is that in case A is hyperbolic, such an action cannot be by \(C^1\) diffeomorphisms. (Compare [19], where Cantwell–Conlon’s argument is used to prove this for the case of the Baumslag–Solitar group.)

We next proceed to the proof of Theorem 1.10. Let G again denote a subgroup of \(\mathbb Z\ltimes _A \mathbb Q^d\) of the form \(\mathbb Z\ltimes _A H\), with \(\mathrm {rank}_{\mathbb Q}(H) = d\) and \(A \in GL_d(\mathbb Z)\). Assume with no loss of generality that the canonical basis \(\{b_1,\ldots ,b_d\}\) of \(\mathbb Q^d\) is contained in H (see Sect. 2), and denote by a the generator of the cyclic factor (induced by A). We start with the following lemma.

Lemma 6.1

Suppose A has no eigenvalue equal to 1. Then for every representation of G into \(\mathrm {Homeo}_+(\mathrm {S}^1)\), the set \(\bigcap \mathrm {Per}(b_i)\) of common periodic points of the \(b_i\)’s is nonempty and G-invariant.

Proof

Let \(\rho _i\in \mathbb {R}/\mathbb Z\) be the rotation number of \(b_i\). Since H is Abelian and \(a b_i a^{-1}= b_1^{\alpha _{1,i}} \cdots b_d^{\alpha _{d,i}}\), we have

$$\begin{aligned} \rho _i = \alpha _{1,i} \rho _1 + \cdots + \alpha _{d,i} \rho _d \quad (mod \ \mathbb Z). \end{aligned}$$

If we denote \(v := (\rho _1,\ldots ,\rho _d)\), this yields \(A^T v=v\) \((mod \ \mathbb Z^d)\). Hence, \(v\in (A^T-I)^{-1}(\mathbb Z^d)\subseteq \mathbb Q^d\). Therefore, all the rotation numbers \(\rho _i\) are rational, thus all the \(b_i\)’s have periodic points. Next, notice that for every family of commuting circle homeomorphisms each of which has a fixed point, there must be common fixed points. Indeed, they all necessarily fix the points in the support of a common invariant probability measure. To show the invariance of \(\bigcap \mathrm {Per}(b_i)\), notice that H-invariance is obvious by commutativity. Next, let x be fixed by \(b_1^{k_1}, \ldots , b_d^{k_d}\). Take \(q \in \mathbb N\) such that \(q\alpha _{i,j}\) is an integer for all i, j. Then

$$\begin{aligned} a b_i^{q k_i} a^{-1} (x) = b_1^{k_iq\alpha _{1,i}} \cdots b_d^{k_iq\alpha _{d,i}} (x) = x, \end{aligned}$$

hence \(b_i^{qk_i}a^{-1}(x) = a^{-1}(x)\). We thus conclude that \(a^{-1}(x)\) is a common periodic point of the \(b_i\)’s, as desired. \(\square \)
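The rationality step in this proof can be checked by exact arithmetic. The following is a minimal sketch, not part of the original argument: the matrix `A`, the integer vector `m`, and the helper names `solve_rational`, `rotation_vector` and `residual` are all hypothetical choices made for illustration. It solves \((A^T-I)v = m\) over \(\mathbb Q\) and confirms that \(A^T v - v\) is an integer vector.

```python
from fractions import Fraction


def solve_rational(M, rhs):
    """Solve M v = rhs exactly over Q by Gauss-Jordan elimination.

    Raises StopIteration if M is singular (no pivot can be found).
    """
    d = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(d):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, d) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear the column in every other row.
        for r in range(d):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][d] for r in range(d)]


def rotation_vector(A, m):
    """Return the rational vector v with (A^T - I) v = m."""
    d = len(A)
    # Entry (i, j) of A^T - I is A[j][i] minus the Kronecker delta.
    M = [[Fraction(A[j][i]) - (1 if i == j else 0) for j in range(d)]
         for i in range(d)]
    return solve_rational(M, m)


def residual(A, v):
    """Return A^T v - v, which must be an integer vector."""
    d = len(A)
    return [sum(Fraction(A[j][i]) * v[j] for j in range(d)) - v[i]
            for i in range(d)]


# Hypothetical example: eigenvalues 3 +/- sqrt(2), so 1 is not an eigenvalue.
A = [[3, 1], [2, 3]]
v = rotation_vector(A, [1, 0])
```

For this choice, `v` equals \((1, -1/2)\), so the candidate rotation numbers are \(0\) and \(1/2\) modulo \(\mathbb Z\). If A had 1 as an eigenvalue, the elimination would find no pivot, which mirrors the role of the hypothesis in the lemma.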

Lemma 6.2

If a has periodic points, then there exists a finite orbit for G.

Proof

If a has periodic points, then every probability measure \(\mu \) that is invariant by a must be supported at these points. Since G is solvable (hence amenable), such a \(\mu \) can be taken invariant by the whole group. The points in the support of this measure must have a finite orbit. \(\square \)

Summarizing, for every faithful action of G by circle homeomorphisms, the nonexistence of a finite orbit implies that a admits an exceptional minimal set, say \(\Lambda \). In what follows, we will show that this last possibility cannot arise for representations into \(\mathrm {Diff}^1_+(S^1)\) with non-Abelian image.

As the set \(\bigcap \mathrm {Per}(b_i)\) is invariant under a, closed, and nonempty, we must have \(\Lambda \subseteq \bigcap \mathrm {Per}(b_i)\). Replacing each \(b_i\) by \(b_i^k\) for some \(k\in \mathbb N\), we may assume that the periodic points of the \(b_i\)’s are actually fixed. (Observe that the map sending \(b_i\) to \(b_i^k\) and fixing a is an automorphism of G.) Given a point x in the complement of \(\bigcap \mathrm {Fix}(b_i)\) (which is nonempty due to the hypothesis), denote by \(I_{x}\) the connected component of the complement of \(\bigcap \mathrm {Fix}(b_i)\) containing x. Then there is an H-invariant measure \(\mu _x\) supported on \(I_{x}\), with an associated translation vector \(\tau _{x}\); moreover, Lemma 4.2 still holds in this context.

If I is any connected component of the complement of \(\bigcap \mathrm {Fix}(b_i)\), then there are points \(z_1,\ldots ,z_d\) in I such that \(Db_i (z_i)=1\). Therefore, for every \(\varepsilon >0\), there exists \(\delta >0\) such that if \(|I|<\delta \), then \(1 - \varepsilon \le Db_i(z) \le 1+\varepsilon \) holds for all \(z \in I\) and all \(i \in \{1,\ldots ,d\}\). By decreasing \(\delta \) if necessary, we may also assume that

$$\begin{aligned} 1-\varepsilon \le \frac{Da(y)}{Da(z)} \le 1+\varepsilon \quad \text{for all } y,z \text{ at distance } \mathrm {dist}(z,y) \le \delta . \end{aligned}$$
(14)

As \(I_x\) is a wandering interval for a, there exists \(k_0\in \mathbb N\) such that \(|a^k(I_x)|<\delta \) and \(|a^{-k}(I_x)|<\delta \) for all \(k\ge k_0\). Together with (14), this allows us to show the following analogue of Lemma 4.4 for the translation vectors \(\triangle (x) := \big ( b_1(x)-x,\ldots ,b_d(x)-x \big )\).

Lemma 6.3

For every \(\eta >0\), there exists \(k_0 \in \mathbb N\) such that if we denote by \(y_k\) the left endpoint of \(a^k(I)\) and we let \(\epsilon ,\hat{\epsilon }\) be defined by

$$\begin{aligned} \triangle (a^{-1}(x))=Da^{-1}(y_{-k}) \,A^T\triangle (x)+\epsilon (x), \quad x\in I_{a^{-k}(x_0)} \end{aligned}$$

and

$$\begin{aligned} \triangle (a(x))=Da(y_k) \,(A^T)^{-1}\triangle (x)+\hat{\epsilon }(x), \quad x\in I_{a^{k}(x_0)}, \end{aligned}$$

then \(\Vert \epsilon (x) \Vert \le \eta \big ( \Vert \triangle (x) \Vert + \Vert \triangle (a^{-1}(x)) \Vert \big )\) and \(\Vert \hat{\epsilon }(x) \Vert \le \eta \big ( \Vert \triangle (x) \Vert + \Vert \triangle (a(x)) \Vert \big )\) hold for all \(k \ge k_0\).

Again, the normalized translation vectors \(\vec {\tau }_{a^{-n}(x_0)}\) (resp., \(\vec {\tau }_{a^{n}(x_0)}\)) accumulate at some \(\vec {\tau } \in S^{d-1}\) (resp., \(\vec {\tau }_*\)) as \(n \rightarrow \infty \). For each \(n \in \mathbb {Z}\), we let \(x_n := a^{-n} (x_0)\), and we choose a sequence of positive integers \(n_k\) such that \(\vec {\tau }_{x_{n_k}}\rightarrow \vec {\tau }\) and \(\vec {\tau }_{x_{-n_k}}\rightarrow \vec {\tau }_*\) as \(k\rightarrow \infty \). With this notation, Lemma 4.5 remains true.

Finally, Lemma 4.7 is easily adapted to this case:

Lemma 6.4

For any neighborhood \(V \subset S^{d-1}_*\) of \(E^u \cap S^{d-1}_*\) in the unit sphere \(S^{d-1}_* \subset \mathbb {R}^d\) (with the norm \(\Vert \cdot \Vert _*\)), there is \(K_0 \in \mathbb N\) such that for all \(k \ge K_0\) and all \(x \in a^{-k}(I_{x_0})\) not fixed by H,

$$\begin{aligned} \frac{\triangle (x)}{\Vert \triangle (x)\Vert _*}\in V \implies \frac{\triangle (a^{-1}(x))}{\Vert \triangle (a^{-1}(x)) \Vert _*}\in V. \end{aligned}$$

Moreover, if V is small enough, then there exists \(\kappa > 1\) such that

$$\begin{aligned} \frac{\triangle (x)}{\Vert \triangle (x) \Vert _*} \in V \implies \Vert \triangle (a^{-1}(x)) \Vert _*\ge \kappa \, D a^{-1}(y_{-k}) \Vert \triangle (x)\Vert _* . \end{aligned}$$
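The linear mechanism behind this lemma can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions: it uses a hypothetical symmetric hyperbolic matrix `M` in the role of \(A^T\), the Euclidean norm in place of \(\Vert \cdot \Vert _*\), and it ignores both the error term \(\epsilon (x)\) and the factor \(Da^{-1}(y_{-k})\). It checks that a cone around the unstable direction is mapped into itself and that vectors in the cone are expanded by a definite factor.

```python
import math


def apply(M, v):
    """Matrix-vector product for lists of floats."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def norm(v):
    """Euclidean norm (stand-in for the adapted norm of the text)."""
    return math.sqrt(sum(x * x for x in v))


def angle_to_line(v, u):
    """Angle between v and the line spanned by the unit vector u."""
    c = abs(sum(x * y for x, y in zip(v, u))) / norm(v)
    return math.acos(min(1.0, c))  # clamp to guard against rounding


# Hypothetical hyperbolic matrix with eigenvalues (3 +/- sqrt(5)) / 2.
M = [[2.0, 1.0], [1.0, 1.0]]

# Unit eigenvector for the expanding eigenvalue (3 + sqrt(5)) / 2.
golden = (1 + math.sqrt(5)) / 2
scale = math.hypot(golden, 1.0)
u = [golden / scale, 1.0 / scale]


def cone_step(v):
    """Return (angle before, angle after, norm growth) for one application of M."""
    w = apply(M, v)
    return angle_to_line(v, u), angle_to_line(w, u), norm(w) / norm(v)


# Sample unit vectors inside the cone of half-angle 0.3 around u.
t0 = math.atan2(u[1], u[0])
samples = [[math.cos(t0 + t), math.sin(t0 + t)] for t in (-0.3, -0.15, 0.15, 0.3)]
results = [cone_step(v) for v in samples]
```

In each sample the angle to the unstable line decreases and the norm grows by a factor larger than 2, which plays the role of \(\kappa \) in this toy setting. In the lemma itself, the condition \(k \ge K_0\) is what makes the nonlinear error small enough for this linear picture to persist.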

Now, we may conclude as in the proof of Proposition 1.5 up to a small detail. Namely, suppose \(\vec \tau _{x_0}\notin E^s\). Then \(\vec \tau \in E^u\). Using Lemmas 4.5 and 6.4, we get for \(k\ge K_0\) and all \(n\in \mathbb N\),

$$\begin{aligned} \Vert \triangle (x_{n+k}) \Vert _* \ge \kappa ^n Da^{-n}(y_{-k}) \Vert \triangle (x_k) \Vert _*. \end{aligned}$$

Now, using the fact that the growth of \(Da^n\) is uniformly sub-exponential (see Footnote 5), we get a contradiction as n goes to infinity. In the case where \(\vec \tau _{x_0}\in E^s\), we have \(\vec \tau _{x_0} \notin E^u\), and we may proceed as before using \(a^{-1}\) instead of a.
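To spell out the contradiction in the first case, here is a sketch under stated assumptions. The uniform sub-exponential growth is taken in the form \(Da^{-n}(y) \ge C_\lambda ^{-1} \lambda ^{-n}\) for every \(\lambda > 1\), all \(n \in \mathbb N\) and all \(y \in \mathrm {S}^1\), and \(c_*\) denotes an upper bound for \(\Vert \triangle \Vert _*\) on the circle, which exists because each coordinate \(b_i(x)-x\) is bounded by the length of the circle. The constants \(C_\lambda \) and \(c_*\) are notation introduced here and do not appear in the original argument. Choosing \(1< \lambda < \kappa \), the previous inequality would give

$$\begin{aligned} c_* \ge \Vert \triangle (x_{n+k}) \Vert _* \ge \kappa ^n Da^{-n}(y_{-k}) \Vert \triangle (x_k) \Vert _* \ge C_\lambda ^{-1} \Big ( \frac{\kappa }{\lambda } \Big )^n \Vert \triangle (x_k) \Vert _* . \end{aligned}$$

Since \(x_k\) is not fixed by H, we have \(\triangle (x_k) \ne 0\), so the right-hand side is unbounded in n while the left-hand side is not.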

This closes the proof of the absence of an exceptional minimal set, hence of the existence of a finite orbit for G.